Skip to content

S3

Questions

Q: What is Durability ? What is Availability ? A: Availability measures how readily available a service is. Durability is used to measure the likelihood of data loss. More on this blog

Q: When MFA, only the root can delete objects ? is this guy for real ? A: What does "Min storage duration" in Storage class ? > From Amazon Docs: > Note: The S3 Standard-IA and S3 One Zone-IA storage classes are suitable for objects larger than 128 KB that you plan to store for at least 30 days. If an object is less than 128 KB, Amazon S3 charges you for 128 KB. If you delete an object before the end of the 30-day minimum storage duration period, you are charged for 30 days.

Q: What is "Object Lock" ? A: > Store objects using a write-once-read-many (WORM) model to help you prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely. Learn more

Q: What IAM role should we use to create replication and replication jobs ? A: Apparently This is a though one, read more here

Q: After answering the previous question, set up a replication on AWS console to see if I get everything right

Intro

data storage architecture that manages data as objects, as opposed to other storage architectures like file systems and block storage. Stores unlimited amount of data without worry of underlying storage infrastructure.

S3 Object

  • S3 Object contain data which can go from 0 Bytes to 5 Terabytes and has: Key, Value, Version ID and Metadata
  • S3 Buckets hold objects. Buckets can also have folders which in trun hold objects
  • S3 is a universal namespace so bucket names mush be unique (think like having a domain name)

Classes

AZ: Availabilty Zone - Standard: Fast! 99.99% Availability, 11 9's Durability, 3 AZs - Standard Infrequently Accessed (IA): Still fast! Cheaper if you access files < once a month. Additional retreival fee. 50% less than standard (reduced availabilty) - One Zone IA: Still Fast! 1 AZ (99.5%). Cheaper than Standard IA by 20% less (reduced durability) data could get destroyed. Retreival fee is applied. - Glacier: long term cold storage. Retreival could take minutes to hours. Very cheap - Glacier Deep Archive: Lowest cost. Retreival is 12 hours. - Intelligent tiering: Uses ML to analyze your object usage and determine the appropriate stoage class. DAta is moved to the most cost-effective access tier, without any performance impact or added overhead.

![[s3-storage-class-comparison-new.png]]

Security

  • All buckets are private by default
  • Logging per request in another bucket even in another account if desired
  • ACL are a legacy feature, simpler but not recommended
  • Bucket policies json document that define a complex rule access

Encryption

  • Trafic is encrypted via SSL/TLS by default
  • Server side encryption (SSE):
    • SSE-AES S3 handles the key, uses AES-256 algo
    • SSE-KMS Envelope encryption, AWS KMS and you manage the keys
    • SSE-C Customer provided key
  • Client side encryption: you encrypt the files before uploading them to S3

Data Consistency

  • For new objects (PUTS): Read After Write consistency, you are able to read immediately after writing
  • Overwite (PUTS) or Delete objects (DELETES): Eventual Consistency, it may return an old copy if you read immeditely, it may take time for S3 to replicate versions to AZs, you need to generally wait a few seconds before reading.

Cross-Region Replication

  • When enabled, any object that is uplaoded will be automatically replicated to another region(s)
  • Versionning must be turned on on both the source and the destination buckets.
  • We can have CCR replicate to another AWS account

Versionning

  • Once enabled it cannot be disabled for already versionned files, it can only suspended
  • Fully integrates with S3 lifecycle rules
  • When working with versionning (does that mean when it's enabled?), multi factor authentication can provide extra protection against deletion of data

Lifecycle Management

  • Automate the process of moving files to diffrent Storage classes or deleting files altogether
  • Can be used together with versionning and can be applied to bothe current and previous versions

Example: [Standard] --> (After 7 days) --> [Glacier] --> (After 356 days) --> [Delete permanetly]

Transfer acceleration

  • Fast and secure transfer between s3 and users
  • Users use a distinct URL for an Edge LOcation
  • As data arrives to the Edge Location it is automatically routes to s3 over a specially optimized network path called Amazon Backbone Network

Preseiged URL

Temporarly access to private objects, only from CLI and SDK, have an expiration time

MFA Delete

  • CLI must be used to turn on MFA
  • Versionning must be turned on for the bucket
  • Only the bucket owner root can delete objects from the bucket !!! Is this guy for real !!!, said here: https://youtu.be/Ia-UEYYR44s?t=1898 and here: https://youtu.be/Ia-UEYYR44s?t=3669
  • Ensure objects are not deleted by accident

Follow along: Messing arround with AWS

  • By default we can't make objects public, we have to, first, uncheck the "Block all public" checkbox first in the permission tab of the bucket (not the object)
  • Files uploaded before versioning is enabled will have "null" versions
  • new versions don't inherit the public access
  • When we delete an object, we actionaly delete the last version.
  • When an object is encrypted, we still can access it, it just mean it's encrypted in the servers
# List all buckets
aws s3 ls

# List all folders and objects in bucket
aws s3 ls s3://example

# Download
aws s3 cp s3://example/folder/file.png

# Upload
aws s3 cp /Downloads/file.png s3://example/folder/file.png

# Create preseigned URL
aws s3 presign s3://example --exprires-in 300 # seconds
  • Minimum value for transition in lifecycle managment is 30 days
  • We can change the storage class and the object ownership (with an account id) when creating Cross Region replication
  • S3 succeful upload return 200 HTTP

Getting hands dirty

  • Policies are tied to buckets while ACL could be applied to indivual objects
  • When we try to set Permissions for an object before Uploading it, we get this message:

    This bucket has the bucket owner enforced setting applied for Object Ownership. Use bucket policies to control access. Learn more

  • User defined Metadata keys should be prefixed with x-amz-meta

  • Bucket policies could be used to require encrypted uploads > If your bucket policy requires encrypted uploads, you must specify an encryption key or your upload will fail.

  • When uploading objects we can configure a checksum function for additional data integrity validation

  • There is another Storage class than the ones listed above: Glacier Instant Retreival and the formelly called Glacier is now called Glacier Flexible Retreival

Server side encryption

  • Enabling SSE for a bucket doesn't afface already uploaded objects
  • There are only 2 options for server side encryption: SSE-S3 and SSE-KMS but there is this note: > To upload an object with a customer-provided encryption key (SSE-C), use the AWS CLI, AWS SDK, or Amazon S3 REST API.

    and according to this guide SSE-C properties are specified in request headers (HTTP ?): - x-amz-server-side​-encryption​-customer-algorithm - x-amz-server-side​-encryption​-customer-key - x-amz-server-side​-encryption​-customer-key-MD5

  • Edit server-side encryption for an object > - This action creates a new version of the object with updated settings and a new last-modified date. > - Objects copied with customer-provided encryption keys (SSE-C) will fail to be copied using the S3 console. To copy objects encrypted with SSE-C, use the AWS CLI, AWS SDK, or the Amazon S3 REST API. > - If this bucket uses the bucket owner enforced setting for S3 Object Ownership, object ACLs will not be copied. > Learn more

Access points

https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-points.html

Amazon S3 access points simplify data access for any AWS service or customer application that stores data in S3. Access points are named network endpoints that are attached to buckets that you can use to perform S3 object operations, such as GetObject and PutObject. Each access point has distinct permissions and network controls that S3 applies for any request that is made through that access point. Each access point enforces a customized access point policy that works in conjunction with the bucket policy that is attached to the underlying bucket. You can configure any access point to accept requests only from a virtual private cloud (VPC) to restrict Amazon S3 data access to a private network. You can also configure custom block public access settings for each access point.