AWS Scenarios

Backup vs Replication Recovery Design

Compare backups, snapshots, replication, read replicas, cross-Region copies, immutability, point-in-time recovery, and recovery objectives for AWS architecture decisions.

Intermediate | 5 min read | Updated 2026-06-03 | Cloud, Certification, Reliability, Operations

Backup, Snapshot, Replication, Point-In-Time Recovery, Cross-Region Copy, Cross-Account Copy, Backup Vault Lock, RPO And RTO

After this, you will understand

This comparison prevents a classic architecture mistake: using replication when the real requirement is recoverability, or using backups when the real requirement is low-lag continuity.

Plain version

Backups preserve recovery points; replication keeps another copy current. They solve different failure modes and often need to be combined.

Decision pressure

Teams replicate accidental deletes, store backups in the same account, or assume a read replica is a protected historical restore point.

Exam-ready model

Separate restore history, live continuity, account isolation, Region isolation, and immutability before selecting AWS Backup, snapshots, PITR, replication, or read replicas.

Think before reading: Why is replication not a full backup strategy by itself?

Replication can copy bad changes quickly, while backups preserve older recovery points that let you restore before the bad change happened.

Concepts Covered

Backups and snapshots
Replication
Point-in-time recovery
Cross-Region recovery
Cross-account recovery
Backup Vault Lock
Read replicas
Corruption and deletion risk
Restore testing
SAA-C03 decision traps

1. Situation

A production workload stores data in S3, RDS, DynamoDB, and EBS volumes. The business asks for protection against accidental deletion, ransomware, Region disruption, and database failure.

The team asks whether it should use backups or replication.

That question is too broad. Backups and replication answer different problems:

backup = recover an earlier point in time
replication = maintain another copy somewhere else

Many serious designs use both, because live continuity and historical recovery are not the same thing.
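The difference can be illustrated with a toy model. The sketch below is not AWS code: `Store` is a hypothetical in-memory stand-in for any primary data store, with a replica that mirrors every change and a list of recovery points captured on demand.

```python
from copy import deepcopy


class Store:
    """Hypothetical in-memory store; not an AWS API."""

    def __init__(self):
        self.data = {}
        self.replica = {}          # kept current by replication
        self.recovery_points = []  # (label, snapshot) pairs kept by backups

    def write(self, key, value):
        self.data[key] = value
        self.replica[key] = value  # replication copies the change

    def delete(self, key):
        self.data.pop(key, None)
        self.replica.pop(key, None)  # replication copies the delete too

    def backup(self, label):
        # A backup preserves the state as it was at this moment.
        self.recovery_points.append((label, deepcopy(self.data)))

    def restore(self, label):
        for name, snapshot in self.recovery_points:
            if name == label:
                return deepcopy(snapshot)
        raise KeyError(label)


store = Store()
store.write("order-1", "paid")
store.backup("nightly")
store.delete("order-1")  # accidental delete

print("order-1" in store.replica)           # False: the replica lost it as well
print(store.restore("nightly")["order-1"])  # paid: the recovery point still has it
```

The replica is useful if the primary disappears, but only the recovery point can bring back a record that was deleted everywhere.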

2. Naive Design

The naive design enables replication and assumes the data is protected.

Replication helps if the primary location becomes unavailable. But if a bad application deployment deletes objects, corrupts records, or encrypts files, replication may copy that bad state to the destination.

Another naive design keeps local snapshots in the same account and Region, then calls that disaster recovery. That may not help if account access is compromised, the Region has a major disruption, or privileged users delete recovery points.

A third mistake is having backups with no tested restore process. Backup success is not the same as restore success.

3. What Breaks

Recoverability breaks when there are no historical points. A replica that is current can be current with the wrong data.

Isolation breaks when the same principals can delete production and backup copies.

Regional recovery breaks when backups never leave the Region.

Compliance breaks when retention is not enforced or privileged users can shorten retention periods or delete recovery points early.

Application recovery breaks when the data restore is possible but infrastructure, DNS, secrets, and dependencies are not ready.

4. AWS Architecture

Use AWS Backup for centralized backup plans across supported services. Use backup vaults, lifecycle policies, cross-Region copies, cross-account copies, and vault access policies based on requirements.
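As a rough sketch, a backup plan with a copy to a second vault can be described as a plain dictionary shaped like the `BackupPlan` argument of the AWS Backup `CreateBackupPlan` API. The plan name, rule name, vault names, ARN, schedule, and retention values below are all illustrative assumptions, not values from this scenario.

```python
def build_backup_plan(primary_vault, dr_vault_arn,
                      retention_days=35, copy_retention_days=90):
    """Return a dict shaped like the BackupPlan argument of CreateBackupPlan.

    All names and numbers are placeholders chosen for illustration.
    """
    return {
        "BackupPlanName": "daily-with-dr-copy",  # hypothetical name
        "Rules": [
            {
                "RuleName": "daily",
                "TargetBackupVaultName": primary_vault,
                "ScheduleExpression": "cron(0 5 ? * * *)",  # 05:00 UTC daily
                "Lifecycle": {"DeleteAfterDays": retention_days},
                # Copy each recovery point to a vault in another
                # Region and/or account, with its own retention.
                "CopyActions": [
                    {
                        "DestinationBackupVaultArn": dr_vault_arn,
                        "Lifecycle": {"DeleteAfterDays": copy_retention_days},
                    }
                ],
            }
        ],
    }


plan = build_backup_plan(
    primary_vault="prod-vault",  # hypothetical vault name
    dr_vault_arn="arn:aws:backup:us-west-2:111122223333:backup-vault:dr-vault",
)
print(plan["Rules"][0]["CopyActions"][0]["DestinationBackupVaultArn"])
```

With boto3 installed and credentials configured, a dictionary like this would typically be passed as `boto3.client("backup").create_backup_plan(BackupPlan=plan)`. Resources still have to be assigned to the plan separately.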

Use AWS Backup Vault Lock when backups require stronger protection against early deletion or retention changes.
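A minimal sketch of the inputs to a Vault Lock request follows. The keys mirror the parameters of the AWS Backup `PutBackupVaultLockConfiguration` API; the helper name, the vault name, and the specific day counts are assumptions. As the API documents it, omitting `ChangeableForDays` leaves the lock in governance mode, while setting it (minimum 3 days) starts a grace period after which the lock becomes immutable in compliance mode.

```python
def vault_lock_request(vault_name, min_days, max_days, changeable_for_days=None):
    """Build keyword arguments for a Vault Lock configuration request.

    Hypothetical helper: it only validates and assembles the parameters.
    """
    if min_days < 1 or max_days < min_days:
        raise ValueError("need 1 <= min_days <= max_days")
    request = {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_days,
        "MaxRetentionDays": max_days,
    }
    if changeable_for_days is not None:
        if changeable_for_days < 3:
            raise ValueError("grace period must be at least 3 days")
        # After this many days the lock can no longer be changed or removed.
        request["ChangeableForDays"] = changeable_for_days
    return request


governance = vault_lock_request("dr-vault", min_days=30, max_days=365)
compliance = vault_lock_request("dr-vault", 30, 365, changeable_for_days=7)
print("ChangeableForDays" in governance)  # False
print(compliance["ChangeableForDays"])    # 7
```

Testing the retention bounds in governance mode first is a cautious choice, because a compliance-mode lock cannot be undone once the grace period ends.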

Use S3 versioning and replication when object-level continuity or geographic copies are required. Use S3 Batch Replication for existing objects when live replication was not already configured.
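For illustration, an S3 replication rule can be sketched as a dictionary shaped like the `ReplicationConfiguration` argument of the S3 `PutBucketReplication` API. The role ARN, bucket ARN, and rule ID are placeholders. Replication requires versioning on both the source and destination buckets, and whether delete markers are replicated is a per-rule choice.

```python
def replication_config(role_arn, dest_bucket_arn, replicate_delete_markers=False):
    """Return a dict shaped like S3's ReplicationConfiguration.

    Hypothetical helper with placeholder identifiers.
    """
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-all",     # hypothetical rule ID
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: every object
                "DeleteMarkerReplication": {
                    "Status": "Enabled" if replicate_delete_markers else "Disabled"
                },
                "Destination": {"Bucket": dest_bucket_arn},
            }
        ],
    }


config = replication_config(
    role_arn="arn:aws:iam::111122223333:role/s3-replication",  # placeholder
    dest_bucket_arn="arn:aws:s3:::example-dr-bucket",          # placeholder
)
print(config["Rules"][0]["DeleteMarkerReplication"]["Status"])  # Disabled
```

With boto3, a dictionary like this would typically be passed as `put_bucket_replication(Bucket=..., ReplicationConfiguration=config)`. Leaving delete-marker replication disabled keeps a simple delete on the source from hiding the object in the destination, which is one reason versioning and replication are usually discussed together.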

Use database-specific features where they match the workload: RDS automated backups, snapshots, read replicas, Multi-AZ, Aurora replicas, Aurora Global Database, and DynamoDB point-in-time recovery or global tables.

Use EBS snapshots and AMIs for EC2 recovery, but remember that compute recovery also requires launch templates, IAM roles, networking, and user data.

5. Request Or Data Flow

A normal application write lands in the primary data store.

For backup, the system captures recovery points according to a plan. Those recovery points may be copied to another account or Region.

For replication, the system copies new or changed data to a destination, usually asynchronously. The replica may support reads, standby recovery, analytics, or regional access.

During a bad deployment, the team may need a point before the error:

identify blast radius
choose restore point
restore to isolated environment
validate data
promote or merge carefully
resume traffic
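The "choose restore point" step above can be sketched as picking the newest recovery point taken strictly before the bad change began. The function name and timestamps are hypothetical; in practice the list would come from a backup service or a PITR window.

```python
from datetime import datetime, timezone


def choose_restore_point(recovery_points, incident_start):
    """Return the latest recovery point before incident_start, or None."""
    candidates = [point for point in recovery_points if point < incident_start]
    return max(candidates) if candidates else None


utc = timezone.utc
points = [
    datetime(2026, 6, 1, 5, 0, tzinfo=utc),
    datetime(2026, 6, 2, 5, 0, tzinfo=utc),
    datetime(2026, 6, 3, 5, 0, tzinfo=utc),  # taken after the bad deployment
]
incident = datetime(2026, 6, 2, 14, 30, tzinfo=utc)

print(choose_restore_point(points, incident))  # 2026-06-02 05:00:00+00:00
```

The newest recovery point is skipped because it already contains the bad state, which is why keeping several recovery points matters.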

During a Region disruption, the team may instead need a destination copy that is already close to current and can serve traffic.

6. Security Controls

Separate backup administration from workload administration. Cross-account copies help keep recovery points away from the account being protected.

Use vault policies and least-privilege restore roles. Backup operators do not always need broad production admin access.

Use KMS carefully. Cross-account and cross-Region restore can fail if key policies do not allow the right principals or if the key is unavailable.

Use immutability where the threat model includes malicious deletion or ransomware.

Monitor backup job failures, copy job failures, vault policy changes, KMS key changes, and restore events.

7. Resilience Controls

Keep multiple recovery points. One good backup is not enough if it was taken after corruption began.

Use cross-Region copies when regional recovery is required.

Use cross-account copies when account compromise or privilege separation matters.

Test restores into a separate environment. This catches missing permissions, incorrect assumptions, incompatible versions, quota limits, and runbook gaps.

Track replication lag. A replica that is minutes behind may be acceptable for some systems and unacceptable for others.
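A small sketch of that check: compare observed lag samples against the recovery point objective. The function name, sample values, and 60-second RPO are assumptions; real samples would come from a service metric for replication lag.

```python
def rpo_report(lag_samples_seconds, rpo_seconds):
    """Summarize replication lag samples against a recovery point objective."""
    breaches = [lag for lag in lag_samples_seconds if lag > rpo_seconds]
    return {
        "worst_lag_seconds": max(lag_samples_seconds),
        "breach_count": len(breaches),
        "within_rpo": not breaches,
    }


samples = [4, 7, 95, 12, 310]  # hypothetical lag measurements in seconds
print(rpo_report(samples, rpo_seconds=60))
# {'worst_lag_seconds': 310, 'breach_count': 2, 'within_rpo': False}
```

Judging by the worst sample rather than the average is deliberate: a failover during a lag spike loses the data written during that spike.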

8. Performance Controls

Backups and snapshots can affect performance depending on service behavior, schedule, and workload. Follow each service's guidance on backup windows and snapshot behavior rather than assuming snapshots have no operational cost.

Replication consumes network, write capacity, database resources, or service-specific throughput. DynamoDB global tables, Aurora Global Database, S3 replication, and read replicas have different mechanics.

Read replicas can improve read scaling, but they do not replace backups. They track the current state of the source, including bad writes.

Restore performance matters. A backup that takes hours to restore may not meet a low RTO.
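A back-of-the-envelope estimate makes this concrete. The sketch below assumes restore time is data size divided by sustained throughput, plus a fixed overhead for provisioning and validation. The throughput and overhead figures are illustrative guesses, not service guarantees; measured restore tests are the reliable source.

```python
def estimated_restore_hours(size_gib, throughput_mib_per_s, overhead_minutes=0):
    """Rough restore duration: transfer time plus fixed overhead, in hours."""
    transfer_seconds = (size_gib * 1024) / throughput_mib_per_s
    return (transfer_seconds + overhead_minutes * 60) / 3600


def meets_rto(size_gib, throughput_mib_per_s, rto_hours, overhead_minutes=0):
    """True if the estimated restore fits inside the recovery time objective."""
    return estimated_restore_hours(
        size_gib, throughput_mib_per_s, overhead_minutes
    ) <= rto_hours


# 500 GiB at an assumed 100 MiB/s with 30 minutes of overhead: about 1.9 hours.
hours = estimated_restore_hours(500, 100, overhead_minutes=30)
print(round(hours, 2))
print(meets_rto(500, 100, rto_hours=1, overhead_minutes=30))  # False
```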

9. Cost Controls

Backups cost storage, copy, and retrieval. Long retention across accounts and Regions can add up.
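To see how retention and copies multiply, here is a rough storage model. It assumes one full backup plus incremental daily changes retained for the whole window, repeated for each copy location. The sizes, change rate, and the price per GB-month are hypothetical inputs, not AWS pricing.

```python
def estimated_backup_storage_gb(full_size_gb, daily_change_gb,
                                retention_days, copies=1):
    """Approximate retained backup storage across all copy locations."""
    if retention_days < 1 or copies < 1:
        raise ValueError("retention_days and copies must be at least 1")
    per_copy = full_size_gb + daily_change_gb * (retention_days - 1)
    return per_copy * copies


def monthly_storage_cost(storage_gb, price_per_gb_month):
    """Storage cost only; copy, transfer, and retrieval charges are excluded."""
    return round(storage_gb * price_per_gb_month, 2)


# 1000 GB dataset, 20 GB/day of change, 35-day retention, kept in 3 places
# (source vault, cross-Region copy, cross-account copy).
total_gb = estimated_backup_storage_gb(1000, 20, retention_days=35, copies=3)
print(total_gb)  # 5040
print(monthly_storage_cost(total_gb, price_per_gb_month=0.05))  # price is hypothetical
```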

Replication costs include storage in the destination, request costs, data transfer, service-specific replication charges, and sometimes increased write capacity.

Vault Lock can create durable retention obligations. Be careful with very long or indefinite retention: immutability also makes mistakes persistent, and locked recovery points keep incurring storage cost until they expire.

Use lifecycle policies for backups and replicated objects based on restore needs.

Do not pay for active-active replication when the business only requires daily restore.

10. Exam Variants

"Recover from accidental deletion last night" points to backups, versioning, snapshots, or PITR.

"Keep a copy in another Region for compliance" can point to S3 CRR, backup copy, or service-specific replication.

"Protect backups from deletion by administrators" points to cross-account copies, vault policies, and Backup Vault Lock.

"Scale read traffic" points to read replicas, not backup.

"Minimize downtime during database instance failure in one AZ" points to Multi-AZ, not cross-Region backup.

"Recover to any second in the last retention window" points to point-in-time recovery where supported.

11. Common Traps

Do not call a read replica a backup.

Do not call replication a corruption recovery plan by itself.

Do not store every backup only in the same account and Region.

Do not forget KMS key permissions during restore.

Do not design retention without considering deletion and cost implications.

Do not skip restore testing.

Review AWS Backup, S3 Replication, Amazon RDS, Amazon DynamoDB, and Amazon Aurora.
