Read Replicas

Copies of a primary database that serve read traffic so systems can scale reads and reduce pressure on the primary.

Concepts Covered

  • Primary database
  • Read replica
  • Replication lag
  • Read scaling
  • Read-after-write consistency
  • Replica routing
  • Failover
  • Stale reads

Definition

A read replica is a copy of a primary database that receives changes from the primary and serves read queries.

The primary handles writes:

INSERT, UPDATE, DELETE

Replicas handle reads:

SELECT

This lets a system spread read traffic across multiple database nodes while keeping one primary place where writes are accepted.
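
The split described above can be sketched as a small statement router. This is a minimal illustration, not a production design: the `Router` class, the node names, and the round-robin choice are all assumptions made for the example, and it deliberately treats anything other than a plain SELECT as a write. Statements such as SELECT ... FOR UPDATE would need extra handling that is not shown here.

```python
import itertools


class Router:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._reads = itertools.cycle(replicas or [primary])

    def target_for(self, sql):
        tokens = sql.split()
        verb = tokens[0].upper() if tokens else ""
        if verb == "SELECT":
            # Reads are spread across replicas in round-robin order.
            return next(self._reads)
        # INSERT, UPDATE, DELETE and anything unrecognized go to the primary.
        return self.primary


router = Router("primary", ["replica-1", "replica-2"])
print(router.target_for("INSERT INTO links (short_code) VALUES ('abc123')"))
print(router.target_for("SELECT url FROM links WHERE short_code = 'abc123'"))
```

Sending unknown statements to the primary is the conservative default: a misrouted read costs some primary capacity, while a misrouted write fails against a read-only replica.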

The Pain That Forces Read Replicas

Many products are read-heavy. A URL shortener may create a small number of links but serve a huge number of redirects. A social app may receive far more feed reads than post writes.

If all reads and writes hit the same primary database, read traffic can steal resources from writes:

many SELECT queries
  -> CPU and memory pressure
  -> connection pool saturation
  -> writes wait longer
  -> replication and background jobs fall behind
  -> user-facing latency rises

Read replicas exist because the primary database should not always be responsible for every read in the system.

Mental Model

The primary is the authority. Replicas are followers.

When a write commits on the primary, the change must be copied to each replica. That copying takes time. Sometimes it is milliseconds. Sometimes, during load or network problems, it can be seconds or minutes.

That delay is replication lag.

write to primary at 12:00:00
replica receives it at 12:00:02

During those two seconds, a read from the replica can return stale data.
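
The timeline above can be worked through in a few lines. This sketch assumes we know when a change committed on the primary and when the replica applied it; the function names and the calendar date are made up for the example, and real databases expose lag through their own monitoring views rather than through code like this.

```python
from datetime import datetime, timedelta


def replication_lag(committed_at, applied_at):
    """Time between the commit on the primary and the apply on the replica."""
    return applied_at - committed_at


def is_stale_read(read_at, committed_at, applied_at):
    """True if a replica read at read_at would miss the committed change."""
    return committed_at <= read_at < applied_at


# Arbitrary date; only the times of day come from the example above.
committed = datetime(2024, 1, 1, 12, 0, 0)
applied = datetime(2024, 1, 1, 12, 0, 2)

print(replication_lag(committed, applied))
print(is_stale_read(committed + timedelta(seconds=1), committed, applied))
print(is_stale_read(committed + timedelta(seconds=3), committed, applied))
```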

Example: Read-After-Write Surprise

A user creates a short link:

1. POST /links writes short_code abc123 to primary.
2. API returns success.
3. User immediately opens /abc123.
4. Redirect service reads from replica.
5. Replica has not received abc123 yet.
6. User sees 404.

The database did not lose the row. The system routed the read to a replica that had not caught up.

This is why read scaling creates a consistency problem. More read capacity comes with the risk of stale reads.
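
The sequence above can be reproduced with a toy in-memory model. Everything here is a stand-in invented for the illustration: `ToyCluster` uses two dictionaries in place of real databases, and `replicate()` is called by hand to represent the moment the replica catches up.

```python
class ToyCluster:
    """In-memory stand-in for one primary and one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # committed on the primary, not yet applied on the replica

    def create_link(self, code, url):
        # Steps 1-2: the write commits on the primary and the API reports success.
        self.primary[code] = url
        self._pending.append((code, url))
        return 201

    def replicate(self):
        # The replica catches up by applying pending changes in commit order.
        for code, url in self._pending:
            self.replica[code] = url
        self._pending.clear()

    def redirect(self, code):
        # Step 4: the redirect service reads only from the replica.
        url = self.replica.get(code)
        return (302, url) if url is not None else (404, None)


cluster = ToyCluster()
cluster.create_link("abc123", "https://example.com/long")
print(cluster.redirect("abc123"))  # replica has not caught up yet
cluster.replicate()
print(cluster.redirect("abc123"))  # replica now has the row
```

The row exists on the primary the whole time; only the read path changes what the user sees.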

Common Routing Strategies

Different reads can tolerate different freshness.

Strategy                              | How it works                                         | Good for
--------------------------------------+------------------------------------------------------+-------------------------------
Read from primary after write         | Recent user actions go to primary for a short window | Avoiding immediate stale reads
Read from replicas for public traffic | High-volume reads use replicas                       | Scaling read-heavy endpoints
Lag-aware routing                     | Avoid replicas that are too far behind               | Reducing stale results
Region-local reads                    | Users read from nearby replicas                      | Lower latency

For a URL shortener, redirects for old links may safely use replicas. Redirects immediately after link creation may need primary reads or cache warming.
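
Two of these strategies, reading from the primary after a write and lag-aware routing, can be combined in one small chooser. This is a sketch under stated assumptions: the `FreshnessRouter` class, the 5-second pin window, the 1-second lag threshold, and the idea that the caller supplies measured lag per replica are all hypothetical choices for the example, not recommended values.

```python
import time


class FreshnessRouter:
    """Hypothetical chooser: pin recent writers to the primary, skip lagging replicas."""

    def __init__(self, pin_seconds=5.0, max_lag_seconds=1.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.max_lag_seconds = max_lag_seconds
        self.clock = clock
        self._last_write = {}  # user_id -> time of that user's most recent write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def choose(self, user_id, replica_lag):
        """replica_lag maps replica name -> measured lag in seconds."""
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            # The user wrote recently, so they must see their own write.
            return "primary"
        fresh = {name: lag for name, lag in replica_lag.items()
                 if lag <= self.max_lag_seconds}
        if not fresh:
            # Every replica is too far behind; fall back to the primary.
            return "primary"
        # Prefer the replica that is closest to the primary.
        return min(fresh, key=fresh.get)


router = FreshnessRouter()
router.record_write("user-1")
print(router.choose("user-1", {"replica-1": 0.1}))
print(router.choose("user-2", {"replica-1": 3.0, "replica-2": 0.2}))
```

Falling back to the primary when every replica is lagging protects freshness but moves read load back onto the primary, so a real system would likely cap or shed that traffic.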

What Read Replicas Guarantee

Read replicas can improve read throughput and reduce primary load.

They can help with:

  • read-heavy endpoints
  • reporting queries
  • geographic latency reduction
  • isolating expensive reads from writes
  • operational redundancy

They do not automatically guarantee:

  • fresh reads
  • no replication lag
  • safe failover
  • lower write latency
  • correctness for workflows that require immediate consistency

Failure Modes

Important failure modes:

  • Replica lag returns stale data.
  • A replica goes down and read traffic overloads the remaining replicas.
  • Expensive analytics queries starve user-facing reads.
  • Failover promotes a replica that is missing recent writes.
  • Application code accidentally sends writes to a read-only replica.
  • Replica and primary queries share one connection pool, so slow replica reads hold connections that primary traffic needs.

Read replicas make read capacity easier to add, but they make correctness more subtle.
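
The replica-loss failure mode is easy to check with arithmetic. The sketch below assumes reads are spread evenly and that each replica has a known safe capacity; the function names and the traffic numbers are invented for the example.

```python
def per_replica_load(total_read_qps, replica_count):
    """Read load on each replica, assuming an even spread."""
    if replica_count <= 0:
        raise ValueError("replica_count must be positive")
    return total_read_qps / replica_count


def survives_loss(total_read_qps, replica_count, capacity_qps, lost=1):
    """True if the remaining replicas can absorb the traffic after losing `lost` nodes."""
    remaining = replica_count - lost
    if remaining <= 0:
        return False
    return per_replica_load(total_read_qps, remaining) <= capacity_qps


# 9000 reads/s over 3 replicas is 3000 each; losing one pushes the rest to 4500.
print(per_replica_load(9000, 3))
print(survives_loss(9000, 3, capacity_qps=4000))
print(survives_loss(9000, 4, capacity_qps=4000))
```

With these made-up numbers, three replicas run comfortably until one fails, while four replicas leave enough headroom to lose one.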

Operational Reality

Production systems track:

  • replication lag
  • replica CPU and memory
  • query latency per replica
  • failed replication events
  • read traffic distribution
  • primary write latency
  • failover time
  • stale read incidents

The design question is not "should reads go to replicas?" The real question is: which reads can tolerate stale data, and which reads must see the newest committed state?
