AWS Services

Amazon S3

Understand S3 as AWS object storage, including buckets, objects, durability, security, storage classes, lifecycle, replication, and exam traps.

foundation | 7 min read | Updated 2026-05-31 | Cloud | Certification | Security | Cost
Object Storage | Bucket | Object Key | Versioning | Storage Class | Lifecycle Policy | Bucket Policy | Replication

After this, you will understand

S3 appears in storage, security, data lake, backup, static website, logging, and disaster recovery questions because it is the default durable object store in AWS.

Plain version

S3 stores objects inside buckets and is designed for durable, scalable storage accessed through APIs.

Decision pressure

Learners treat S3 like a normal filesystem and miss object keys, policies, storage classes, lifecycle, and regional behavior.

Exam-ready model

Use S3 for durable object storage, then choose access controls, encryption, lifecycle, replication, and storage class based on the data's use pattern.

Think before reading

Why is S3 usually a better fit than EC2 disk storage for uploaded files?
S3 is managed, durable, scalable object storage, while EC2 disks tie data to compute placement and require more operational design.

Study path

Read these in order

Start with the mechanics, then move into the patterns that explain why the system is shaped this way.

  1. S3 Lifecycle And Storage Classes (aws-services)
  2. S3 Replication (aws-services)

Concepts Covered

  • Buckets and objects
  • Object keys
  • Regional storage
  • Durability and availability
  • Versioning
  • Bucket policies and Block Public Access
  • Encryption
  • Storage classes
  • Lifecycle rules
  • Replication and static websites

1. Plain-English Mental Model

Amazon Simple Storage Service, or S3, is AWS object storage.

Object storage means you store complete objects, not mounted blocks or traditional files. An object is data plus metadata. It lives in a bucket and has a key, which is the object's name or path-like identifier.

The simple model is:

bucket -> object key -> object data and metadata

S3 is regional. You create a bucket in a Region, then store objects in it. You do not manage servers, disks, RAID, or file systems. AWS handles the storage platform. You design access, encryption, lifecycle, versioning, replication, and data organization.

S3 is one of the most important services for SAA-C03 because it appears in many architectures: static websites, user uploads, backups, logs, data lakes, analytics, cross-Region replication, event-driven processing, and disaster recovery.

2. Why This Service Exists

Applications need a place to put durable data that is not tied to one server.

If a user uploads a profile image, storing it on one EC2 instance creates problems. What happens when the instance is replaced? How do other instances read the file? How do you scale storage? How do you back it up? How do you serve it globally?

S3 solves this by making storage independent from compute. Applications call S3 APIs to put, get, list, copy, or delete objects. S3 scales automatically and is designed for very high durability.

This changes architecture. Instead of treating the app server as the owner of files, the app server becomes a client of durable object storage. That enables stateless compute, easier scaling, cleaner backups, and integrations with analytics or event processing.

3. The Naive Approach And Where It Breaks

The naive approach is to store application files on the local disk of an EC2 instance.

It breaks when you add a second instance, replace an instance, deploy across Availability Zones, or need long-term retention. Local disk is not a shared durable object store. An EBS volume persists block data, but it lives in a single AZ and is attached to instances in that AZ, so it still does not give you object storage that any authorized client can reach through an API.

Another naive approach is to make an S3 bucket public because a website or users need to read objects. That can expose far more data than intended. A better design uses CloudFront, origin access controls, signed URLs, narrow bucket policies, or application-mediated access depending on the use case.

S3 is easy to start with, which is why security and lifecycle mistakes are common. Good S3 architecture is not "put stuff in a bucket." It is deciding who can access which objects, how long data should live, which storage class fits the access pattern, and what happens if data is deleted or overwritten.

4. Core Primitives

A bucket is the top-level container for objects. Bucket names must be globally unique within an AWS partition, across all accounts and Regions in that partition. A bucket belongs to one Region.

An object has a key, data, metadata, and optionally versions. Keys can look like paths, such as users/123/avatar.png, but S3 is not a traditional hierarchical filesystem.
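The point that keys only look like paths can be made concrete with a toy model. The sketch below is an in-memory stand-in, not an S3 client: `bucket` is a plain dictionary and `list_keys` is a hypothetical helper that loosely mimics how a list request with a prefix and a delimiter groups keys into folder-like common prefixes.

```python
# Toy model of a bucket: a flat mapping from key to object bytes.
# Nothing here talks to AWS; it only illustrates key semantics.
bucket = {
    "users/123/avatar.png": b"...",
    "users/123/resume.pdf": b"...",
    "users/456/avatar.png": b"...",
    "readme.txt": b"...",
}


def list_keys(objects, prefix="", delimiter="/"):
    """Return (keys, common_prefixes) one 'level' below prefix."""
    keys, common_prefixes = [], set()
    for key in sorted(objects):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Group everything up to the next delimiter, like a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common_prefixes)


top_keys, top_prefixes = list_keys(bucket)
user_keys, user_prefixes = list_keys(bucket, prefix="users/123/")
```

There are no directories in the model. The "folders" are computed at list time from the key strings, which is why renaming a "folder" in object storage means rewriting every object under that prefix.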

Versioning keeps multiple versions of objects. It helps protect against accidental overwrite or delete, but it can increase cost because old versions still occupy storage.
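A small in-memory model can show both halves of that trade-off. This is a sketch under simplifying assumptions, not how S3 is implemented: `VersionedBucket` and its version IDs are invented for illustration, and a delete is modeled as adding a delete marker on top of the existing versions.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not an AWS client)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data or None)
        self._ids = itertools.count(1)

    def put(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        # A plain delete adds a delete marker; older versions remain stored.
        return self.put(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is not None:
            for vid, data in versions:
                if vid == version_id:
                    return data
            raise KeyError(version_id)
        if not versions or versions[-1][1] is None:
            raise KeyError(key)        # missing, or latest is a delete marker
        return versions[-1][1]

    def stored_bytes(self, key):
        # Every retained version still occupies storage.
        return sum(len(d) for _, d in self._versions.get(key, []) if d is not None)


b = VersionedBucket()
first = b.put("report.csv", b"old contents")
b.put("report.csv", b"new")
b.delete("report.csv")
recovered = b.get("report.csv", version_id=first)
```

After the delete, a plain `get` fails, yet the first version is still recoverable by ID and both versions still count toward stored bytes.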

Storage classes tune cost and access behavior. S3 Standard is for frequently accessed data. Standard-IA and One Zone-IA reduce storage cost for infrequently accessed data. Glacier classes are for archival access with different retrieval times. Intelligent-Tiering can move objects between access tiers based on usage.

Lifecycle rules transition or expire objects over time. Replication copies objects to another bucket, often in another Region or account.

5. Architecture Use Cases

Use S3 for user uploads, static assets, application logs, backups, data lake storage, report exports, media files, machine learning datasets, analytics staging, disaster recovery copies, and static website hosting.

A common web app design is:

browser -> app -> S3 bucket
browser -> CloudFront -> S3 origin

The application controls upload authorization. CloudFront caches read-heavy content close to users. S3 stores the durable object.

A logging architecture may send CloudTrail, load balancer logs, VPC Flow Logs, and application logs into centralized S3 buckets. Lifecycle rules can move older data to cheaper storage classes.

A disaster recovery design may use Cross-Region Replication for critical objects, paired with versioning and object lock when immutability is required.

7. Security Model

S3 security has several layers.

IAM identity policies can allow principals to call S3 actions. Bucket policies are resource-based policies attached to a bucket; they can grant or deny access to the bucket and the objects in it, and they support cross-account access. Access control lists exist but should usually be avoided unless a legacy use case requires them.

S3 Block Public Access is an important guardrail. It helps prevent accidental public exposure through policies or ACLs.
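Block Public Access is made up of four independent settings. The sketch below writes them as a Python dictionary in the shape that boto3's `put_public_access_block` accepts for its `PublicAccessBlockConfiguration` argument. The `is_fully_blocked` helper is a hypothetical check added for illustration, and nothing here calls AWS.

```python
# The four Block Public Access settings.
BLOCK_ALL_PUBLIC_ACCESS = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # ignore any existing public ACLs
    "BlockPublicPolicy": True,      # reject bucket policies that grant public access
    "RestrictPublicBuckets": True,  # restrict access to buckets that have public policies
}


def is_fully_blocked(config):
    """True only when every one of the four settings is enabled."""
    return all(config.get(name, False) for name in BLOCK_ALL_PUBLIC_ACCESS)


partial = dict(BLOCK_ALL_PUBLIC_ACCESS, BlockPublicPolicy=False)
```

A check like this could run in a compliance script, since a configuration with only some settings enabled still leaves one path to public exposure open.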

Encryption can be server-side with S3-managed keys (SSE-S3), AWS KMS keys (SSE-KMS), or customer-provided keys (SSE-C). Client-side encryption is also possible but shifts more responsibility to the application.
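At the API level, the server-side options show up as request parameters on the upload. The sketch below builds the keyword arguments a boto3 `put_object` call would take for S3-managed keys or KMS keys. The `mode` labels, bucket name, and key ARN are placeholders invented for this example, and customer-provided keys are left out to keep it short.

```python
def encryption_args(mode, kms_key_id=None):
    """Build the encryption-related keyword arguments for a put_object call.

    mode is a label local to this sketch: "sse-s3" or "sse-kms".
    """
    if mode == "sse-s3":
        # S3-managed keys.
        return {"ServerSideEncryption": "AES256"}
    if mode == "sse-kms":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            # Without a key ID, S3 uses the AWS managed key for S3.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    raise ValueError(f"unsupported mode: {mode}")


put_kwargs = {
    "Bucket": "example-uploads-bucket",  # hypothetical bucket name
    "Key": "users/123/avatar.png",
    "Body": b"...",
    # Placeholder key ARN, not a real key.
    **encryption_args(
        "sse-kms",
        kms_key_id="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    ),
}
```

With a real client the dictionary would be passed as `s3.put_object(**put_kwargs)`. Using a KMS key means callers also need permission to use that key, which is a common exam detail.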

Pre-signed URLs can grant temporary access to specific objects without making the bucket public.
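As a rough sketch, generating such a URL is a single client call. The function below expects an object that behaves like `boto3.client("s3")` and uses its `generate_presigned_url` method. `FakeS3Client` is a stand-in invented here so the example runs without credentials; it does not produce a real signature, and the bucket and key names are placeholders.

```python
def make_download_url(s3_client, bucket, key, expires_in=300):
    """Return a time-limited GET URL for one object.

    s3_client is expected to behave like boto3.client("s3").
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds
    )


class FakeS3Client:
    """Stand-in so the sketch runs offline; not a real signer."""

    def generate_presigned_url(self, ClientMethod, Params=None, ExpiresIn=3600):
        return (
            f"https://{Params['Bucket']}.s3.amazonaws.com/{Params['Key']}"
            f"?X-Fake-Method={ClientMethod}&X-Fake-Expires={ExpiresIn}"
        )


url = make_download_url(FakeS3Client(), "example-private-bucket", "users/123/avatar.png")
```

The URL carries the permissions of whoever signed it, so a short expiry and a narrowly scoped signing role keep the blast radius small.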

VPC endpoints can keep S3 traffic private from a VPC and can be combined with bucket policy conditions.
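One way to express that combination is a bucket policy that denies requests unless they arrive through a specific VPC endpoint, using the `aws:SourceVpce` condition key. The sketch below only builds the policy document. The bucket name and endpoint ID are placeholders, and `vpce_only_policy` is a helper made up for this example.

```python
import json


def vpce_only_policy(bucket, vpc_endpoint_id):
    """Bucket policy that denies S3 actions unless they arrive via one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }


policy = vpce_only_policy("example-private-bucket", "vpce-0123456789abcdef0")
policy_json = json.dumps(policy, indent=2)
```

A broad deny like this also blocks console and administrative access from outside the endpoint, so in practice it is usually scoped with care before being attached.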

Logging, CloudTrail data events, Access Analyzer, and Macie can help with visibility and sensitive data discovery.

8. Reliability And Resilience

S3 is designed for very high durability by storing object data redundantly across at least three Availability Zones within a Region, except for storage classes that intentionally use one Availability Zone such as One Zone-IA.

Durability protects against data loss. Availability is about whether the service can serve requests at a given time. Storage classes have different availability characteristics, so read the class requirements rather than assuming all classes behave the same.
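The difference is easier to see with numbers. The sketch below uses the design targets AWS has published for S3 (eleven nines of durability, and 99.99%, 99.9%, and 99.5% availability for Standard, Standard-IA, and One Zone-IA). Treat those figures as assumptions to verify against current AWS documentation. They are design targets rather than guarantees.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60

# Design targets as published by AWS; assumptions for this sketch.
DURABILITY_ANNUAL_LOSS = Fraction(1, 10**11)   # 1 - 99.999999999%
AVAILABILITY = {
    "STANDARD": Fraction(9999, 10000),     # 99.99%
    "STANDARD_IA": Fraction(999, 1000),    # 99.9%
    "ONEZONE_IA": Fraction(995, 1000),     # 99.5%
}


def expected_objects_lost_per_year(object_count):
    """Average number of objects lost per year at the durability target."""
    return object_count * DURABILITY_ANNUAL_LOSS


def allowed_downtime_minutes(storage_class):
    """Minutes per year the class could be unavailable at its availability target."""
    return float((1 - AVAILABILITY[storage_class]) * MINUTES_PER_YEAR)


loss = expected_objects_lost_per_year(10_000_000)
downtime = {name: allowed_downtime_minutes(name) for name in AVAILABILITY}
```

With ten million objects, the expected loss works out to one object every ten thousand years, while the availability targets differ by class from under an hour to almost two days of potential unavailability per year.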

Versioning can protect against accidental overwrite or delete. MFA Delete can add protection for some operations but has operational friction. Object Lock can support write-once-read-many retention for compliance needs.

Replication can copy objects to another bucket in another Region or account. Replication does not replace backup strategy by itself, but it is a common resilience and compliance tool.

9. Performance And Scaling

S3 scales horizontally behind the service API. You do not provision bucket capacity.

Performance design still matters. Use multipart upload for large objects. Use byte-range GETs for parallel reads when appropriate. Put CloudFront in front of S3 for globally distributed content. Avoid designs that require listing massive prefixes for latency-sensitive paths.
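Multipart upload and byte-range GETs both come down to splitting an object into byte ranges. The sketch below plans those ranges under the documented S3 limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts. The 8 MiB default and the doubling strategy are arbitrary choices for illustration, and no data is transferred.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB      # S3 minimum for every part except the last
MAX_PARTS = 10_000           # S3 maximum number of parts per upload


def plan_parts(object_size, target_part_size=8 * MIB):
    """Return inclusive (start, end) byte offsets for each part."""
    part_size = max(target_part_size, MIN_PART_SIZE)
    # Grow the part size if the object would otherwise need too many parts.
    while (object_size + part_size - 1) // part_size > MAX_PARTS:
        part_size *= 2
    return [
        (start, min(start + part_size, object_size) - 1)
        for start in range(0, object_size, part_size)
    ]


def range_header(start, end):
    """HTTP Range header value for a byte-range GET of one chunk."""
    return f"bytes={start}-{end}"


parts = plan_parts(20 * MIB)
headers = [range_header(s, e) for s, e in parts]
```

Each range can then be uploaded or fetched by a separate worker, and a failed part can be retried on its own instead of restarting the whole transfer.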

S3 is object storage, not a low-latency block device. It is not a replacement for EBS attached to an EC2 instance or for a database with query semantics.

Event notifications can invoke Lambda or publish to SQS or SNS when objects are created, deleted, or otherwise changed. This is useful for image processing, ingestion pipelines, and asynchronous workflows.
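A minimal sketch of the receiving side, assuming the notification is delivered directly to a Lambda function: the handler reads the bucket name and object key from each record. The sample event is trimmed to the fields used here and its bucket and key are made up. Object keys arrive URL-encoded in the notification, so the handler decodes them.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Lambda-style entry point: collect (bucket, key) for each created object."""
    created = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the notification, so decode before use.
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append((bucket, key))
    return created


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-uploads-bucket"},
                "object": {"key": "users/123/my+photo%281%29.png", "size": 1024},
            },
        }
    ]
}
result = handler(sample_event)
```

A real handler would go on to fetch the object and process it. Writing results to a different bucket or prefix avoids the function triggering itself in a loop.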

10. Cost Model

S3 cost includes storage, requests, retrievals for some classes, data transfer, replication, inventory, analytics, and optional features.

The cheapest storage class is not always cheapest overall. Archival classes can have retrieval costs and delays. Infrequent access classes can charge retrieval fees. Intelligent-Tiering has monitoring and automation charges but can reduce manual lifecycle mistakes for unknown access patterns.
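A back-of-the-envelope comparison shows why. The prices below are HYPOTHETICAL placeholders chosen for illustration, not current AWS rates, and the model ignores request charges, minimum storage durations, and minimum billable object sizes. The tier names are labels local to this sketch.

```python
# HYPOTHETICAL prices for illustration only; real prices vary by Region and change.
PRICES = {
    # tier: $/GB-month stored, $/GB retrieved
    "frequent": {"storage": 0.023, "retrieval": 0.00},
    "infrequent": {"storage": 0.0125, "retrieval": 0.01},
}


def monthly_cost(tier, stored_gb, retrieved_gb):
    """Storage plus retrieval cost for one month in the given tier."""
    p = PRICES[tier]
    return stored_gb * p["storage"] + retrieved_gb * p["retrieval"]


def cheaper_tier(stored_gb, retrieved_gb):
    """Pick the tier with the lower monthly cost for this access pattern."""
    return min(PRICES, key=lambda tier: monthly_cost(tier, stored_gb, retrieved_gb))


cold = cheaper_tier(stored_gb=1000, retrieved_gb=50)     # rarely read
hot = cheaper_tier(stored_gb=1000, retrieved_gb=2000)    # read twice over each month
```

With these placeholder numbers, the infrequent tier wins for the rarely read dataset and loses once the data is read heavily, because retrieval fees outweigh the storage savings.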

Lifecycle policies are the main cost control tool. Logs might move from Standard to Standard-IA, then Glacier, then expire. Temporary uploads might expire quickly. Old object versions should be managed deliberately.
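The log example could be written as a lifecycle configuration like the sketch below, in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, prefix, and day thresholds are illustrative assumptions rather than recommendations, and `check_rule` is a hypothetical sanity check, not an AWS validator.

```python
LOG_LIFECYCLE = {
    "Rules": [
        {
            "ID": "age-out-logs",              # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},     # hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
            # Expire old versions too, so versioning does not grow cost forever.
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}


def check_rule(rule):
    """Check that transitions are in increasing day order and expiry comes last."""
    days = [t["Days"] for t in rule.get("Transitions", [])]
    ordered = all(a < b for a, b in zip(days, days[1:]))
    expiry = rule.get("Expiration", {}).get("Days")
    return ordered and (expiry is None or not days or expiry > days[-1])


rules_ok = all(check_rule(r) for r in LOG_LIFECYCLE["Rules"])
```

The noncurrent-version rule is what keeps versioned buckets from accumulating old copies indefinitely.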

Replication stores a second copy of each replicated object in the destination bucket, so storage cost roughly doubles for that data, and it adds request and data transfer costs.

12. SAA-C03 Exam Signals

"Durable storage for user-uploaded objects" points to S3.

"Static website or static assets with global low latency" often points to S3 plus CloudFront.

"Archive rarely accessed data for lowest cost" points to Glacier storage classes, with retrieval time considered.

"Unknown or changing access patterns" can point to S3 Intelligent-Tiering.

"Prevent accidental public access" points to Block Public Access, bucket policies, and IAM controls.

"Temporary access to private objects" points to pre-signed URLs or CloudFront signed URLs/cookies.

"Replicate objects to another Region" points to Cross-Region Replication.

13. Common Exam Traps

Do not use S3 as a mounted POSIX filesystem for EC2. Use EFS for shared file access when that is the requirement.

Do not use S3 when the application needs low-latency block storage. Use EBS for block storage.

Do not make a whole bucket public just to serve a few objects. Use CloudFront, signed access, or scoped policies.

Do not choose Glacier classes when immediate retrieval is required unless the specific Glacier option supports the needed retrieval behavior.

Do not forget versioned old objects in cost calculations.

Do not assume replication is retroactive for objects created before replication was configured unless a separate batch replication process is used.

S3 becomes much clearer after IAM Foundations, because bucket policies and identity policies often work together.

Next, study Amazon RDS and Amazon DynamoDB to contrast object storage with relational and NoSQL databases.
