Read Replicas

Copies of a primary database that serve read traffic so systems can scale reads and reduce pressure on the write database.

Foundation | 4 min read | Updated 2026-05-14 | Capacity, Reliability, Operations, Tradeoffs

Replication Lag, Read Scaling, Primary Database, Read-After-Write Consistency, Stale Reads

After this, you will understand

Where read replicas appear in production systems, what problem forces them, and how to reason about the tradeoffs.

Naive mental model

Treat the idea as a definition to memorize.

Production pressure

Real systems force you to deal with replication lag, read scaling, and the load on a single primary database.

Better reasoning

Use the concept to decide what the system guarantees, what it risks, and what it costs to operate.

Think before reading

Where would read replicas appear in a real production system, and what failure or bottleneck would they help you reason about?

As you read, look for the pressure that creates the idea first. The mechanics matter more once the reason is clear.

Concepts Covered

Primary database
Read replica
Replication lag
Read scaling
Read-after-write consistency
Replica routing
Failover
Stale reads

Definition

A read replica is a copy of a primary database that receives changes from the primary and serves read queries.

The primary handles writes:

INSERT, UPDATE, DELETE

Replicas handle reads:

SELECT

This lets a system spread read traffic across multiple database nodes while keeping one primary place where writes are accepted.
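The split above can be sketched as a small routing function. This is a minimal illustration, not a production router: the function name `route_statement` and the return labels are hypothetical, and it only looks at the first keyword of the statement.

```python
# Hypothetical sketch: decide which node should receive a SQL statement.
READ_VERBS = {"SELECT"}


def route_statement(sql):
    """Return 'replica' for plain SELECTs and 'primary' for everything else."""
    words = sql.split()
    verb = words[0].upper() if words else ""
    # Writes (INSERT, UPDATE, DELETE) and anything unrecognized go to the
    # primary, because sending a write to a read-only replica would fail.
    return "replica" if verb in READ_VERBS else "primary"


targets = [route_statement(s) for s in (
    "SELECT url FROM links WHERE short_code = 'abc123'",
    "INSERT INTO links (short_code, url) VALUES ('abc123', 'https://example.com')",
)]
```

Defaulting unknown statements to the primary is the conservative choice here. A real router would also need to keep reads that belong to a write transaction on the primary.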

The Pain That Forces Read Replicas

Many products are read-heavy. A URL shortener may create a small number of links but serve a huge number of redirects. A social app may receive far more feed reads than post writes.

If all reads and writes hit the same primary database, read traffic can steal resources from writes:

many SELECT queries
  -> CPU and memory pressure
  -> connection pool saturation
  -> writes wait longer
  -> replication and background jobs fall behind
  -> user-facing latency rises

Read replicas exist because the primary database should not have to serve every read in the system.

Mental Model

The primary is the authority. Replicas are followers.

When a write commits on the primary, the change must be copied to each replica. That copying takes time. Sometimes it is milliseconds. Sometimes, during load or network problems, it can be seconds or minutes.

That delay is replication lag.

write to primary at 12:00:00
replica receives it at 12:00:02

During those two seconds, a read from the replica can return stale data.
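The timeline above can be worked through in a few lines. This is a sketch under stated assumptions: the calendar date is arbitrary, and the function names are hypothetical rather than taken from any database client.

```python
from datetime import datetime


def replication_lag_seconds(committed_at, applied_at):
    """Lag for one change: time between commit on the primary and apply on the replica."""
    return (applied_at - committed_at).total_seconds()


def replica_has_write(read_at, applied_at):
    """A replica read sees the write only once the replica has applied it."""
    return read_at >= applied_at


committed = datetime(2026, 1, 1, 12, 0, 0)  # write commits on the primary
applied = datetime(2026, 1, 1, 12, 0, 2)    # replica applies the change
lag = replication_lag_seconds(committed, applied)

# A read at 12:00:01 lands inside the lag window and misses the write.
stale_read = not replica_has_write(datetime(2026, 1, 1, 12, 0, 1), applied)
```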

Example: Read-After-Write Surprise

A user creates a short link:

1. POST /links writes short_code abc123 to primary.
2. API returns success.
3. User immediately opens /abc123.
4. Redirect service reads from replica.
5. Replica has not received abc123 yet.
6. User sees 404.

The database did not lose the row. The system routed the read to a replica that had not caught up.

This is why read scaling creates a consistency problem. More read capacity comes with the risk of stale reads.
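The sequence above can be reproduced with a toy in-memory model. Everything here is an assumption made for illustration: the `Primary` and `Replica` classes, the ordered change log, and the `redirect` helper are hypothetical stand-ins, not how any particular database implements replication.

```python
class Primary:
    def __init__(self):
        self.rows = {}
        self.log = []  # ordered change log that replicas copy from

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))


class Replica:
    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self, primary):
        # Apply every change the replica has not seen yet, in order.
        for key, value in primary.log[self.applied:]:
            self.rows[key] = value
        self.applied = len(primary.log)


def redirect(store, short_code):
    """Return (302, url) if the code exists on this node, else (404, None)."""
    url = store.rows.get(short_code)
    return (302, url) if url is not None else (404, None)


primary, replica = Primary(), Replica()
primary.write("abc123", "https://example.com/long")  # POST /links succeeds
before = redirect(replica, "abc123")   # replica has not applied the write yet
replica.catch_up(primary)              # replication delivers the change
after = redirect(replica, "abc123")    # the same read now succeeds
```

The row exists on the primary the whole time. Only the node that served the read differs between the failing and the succeeding request.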

Common Routing Strategies

Different reads can tolerate different freshness.

Strategy	How it works	Good for
Read from primary after write	Recent user actions go to primary for a short window	Avoiding immediate stale reads
Read from replicas for public traffic	High-volume reads use replicas	Scaling read-heavy endpoints
Lag-aware routing	Avoid replicas that are too far behind	Reducing stale results
Region-local reads	Users read from nearby replicas	Lower latency

For a URL shortener, redirects for old links may safely use replicas. Redirects immediately after link creation may need primary reads or cache warming.
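Two of the strategies in the table, reading from the primary after a write and lag-aware routing, can be combined in one small router. This is a sketch only: the `ReadRouter` class, the 5-second pin window, and the 1-second lag limit are hypothetical values chosen for illustration, and the lag figures are assumed to come from whatever monitoring the system already has.

```python
import time


class ReadRouter:
    def __init__(self, pin_seconds=5.0, max_lag_seconds=1.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds          # how long a writer stays on the primary
        self.max_lag_seconds = max_lag_seconds  # replicas lagging more than this are skipped
        self.clock = clock
        self.last_write = {}                    # user_id -> time of most recent write

    def record_write(self, user_id):
        self.last_write[user_id] = self.clock()

    def choose(self, user_id, replica_lag):
        """replica_lag maps replica name -> current lag in seconds."""
        wrote_at = self.last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            return "primary"  # recent writer: avoid an immediate stale read
        fresh = {name: lag for name, lag in replica_lag.items()
                 if lag <= self.max_lag_seconds}
        if not fresh:
            return "primary"  # every replica is too far behind
        return min(fresh, key=fresh.get)  # least-lagged acceptable replica


router = ReadRouter()
router.record_write("user-1")
choice = router.choose("user-1", {"replica-a": 0.2, "replica-b": 3.0})
```

Picking the least-lagged replica keeps the example short. A real deployment would more likely balance load across all replicas that are fresh enough.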

What Read Replicas Guarantee

Read replicas can improve read throughput and reduce primary load.

They can help with:

read-heavy endpoints
reporting queries
geographic latency reduction
isolating expensive reads from writes
operational redundancy

They do not automatically guarantee:

fresh reads
no replication lag
safe failover
lower write latency
correctness for workflows that require immediate consistency

Failure Modes

Important failure modes:

Replica lag returns stale data.
A replica goes down and read traffic overloads the remaining replicas.
Expensive analytics queries starve user-facing reads.
Failover promotes a replica that is missing recent writes.
Application code accidentally sends writes to a read-only replica.
Connection pools are not separated, so slow replica reads still affect primary traffic.

Read replicas make read capacity easier to add, but they make correctness more subtle.
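The second failure mode, where losing one replica overloads the rest, comes down to simple arithmetic. The sketch below assumes reads are spread evenly across healthy replicas. The traffic and capacity numbers are made up for illustration, and the function names are hypothetical.

```python
def load_per_replica(total_read_qps, healthy_replicas):
    """Read load each replica carries if traffic is spread evenly."""
    if healthy_replicas <= 0:
        raise ValueError("no healthy replicas to serve reads")
    return total_read_qps / healthy_replicas


def survives_one_failure(total_read_qps, replicas, capacity_qps):
    """True if the remaining replicas stay within capacity after losing one."""
    if replicas < 2:
        return False
    return load_per_replica(total_read_qps, replicas - 1) <= capacity_qps


# Assumed numbers: 9000 reads/s, each replica can handle 4000 reads/s.
normal = load_per_replica(9000, 3)             # load with all three replicas healthy
degraded = load_per_replica(9000, 2)           # load after one replica goes down
ok_with_3 = survives_one_failure(9000, 3, 4000)
ok_with_4 = survives_one_failure(9000, 4, 4000)
```

With these assumed numbers, three replicas look comfortable in normal operation but cannot absorb the loss of one, while four can.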

Operational Reality

Production systems track:

replication lag
replica CPU and memory
query latency per replica
failed replication events
read traffic distribution
primary write latency
failover time
stale read incidents

The design question is not "should reads go to replicas?" The real question is: which reads can tolerate stale data, and which reads must see the newest committed state?
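One way to make that question concrete is to write down, per read, how much staleness it tolerates. The sketch below is an assumption-heavy illustration: the read names, the tolerance values, and the `target_for` helper are all hypothetical, and the current lag is assumed to come from the replication lag metric tracked above.

```python
# Hypothetical policy: maximum staleness, in seconds, each read can tolerate.
# A tolerance of 0.0 means the read must see the newest committed state.
FRESHNESS_POLICY = {
    "redirect_existing_link": 30.0,
    "link_analytics_report": 300.0,
    "redirect_after_create": 0.0,
}


def target_for(read_name, replica_lag_seconds):
    """Send a read to a replica only if its tolerance covers the current lag."""
    # Reads missing from the policy default to 0.0, which means the primary.
    tolerated = FRESHNESS_POLICY.get(read_name, 0.0)
    if tolerated > 0.0 and replica_lag_seconds <= tolerated:
        return "replica"
    return "primary"


decisions = {
    name: target_for(name, replica_lag_seconds=2.0)
    for name in FRESHNESS_POLICY
}
```

Treating unlisted reads as requiring the primary keeps new endpoints correct by default, at the cost of extra primary load until someone classifies them.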

What to study next

These links keep the session moving: read prerequisites first, then open the systems, concepts, and patterns that deepen this page.

Prerequisites

Read these first if the mechanics feel unfamiliar.

Eventual Consistency: Start here if Eventual Consistency is still fuzzy.
Database Indexing: Start here if Database Indexing is still fuzzy.

Used In Systems

System studies where this idea appears in context.

URL Shortener System: See the idea under full production pressure.

Related Concepts

Core ideas that connect to this topic.

Caching: Understand the concept behind the design decision.
Sharding: Understand the concept behind the design decision.