Concepts

Eventual Consistency

A consistency model where replicas, caches, or projections may temporarily disagree but are expected to converge if processing continues.

foundation4 min readUpdated unknownReliabilityOperationsTradeoffs

StalenessConvergenceReplication LagDerived ViewsProduct Consistency Contracts

Concepts Covered

Stale reads
Convergence
Replication lag
Caches
Derived projections
Product consistency contracts
Repair and reconciliation
Bounded versus unbounded lag

Definition

Eventual consistency means different copies or views of data may temporarily disagree, but they are expected to converge over time if no new updates arrive and the system keeps processing.

It does not mean "anything can be wrong forever."

A serious eventually consistent system defines which data may be stale, how stale it may become, how lag is monitored, and how incorrect derived state is repaired.

The Pain That Forces Eventual Consistency

At small scale, one database can answer every question directly. Reads and writes go to the same place.

At larger scale, systems copy data because one source table cannot efficiently serve every read pattern:

primary database to read replicas
source rows to cache
events to derived projections
messages to inbox views
likes to counter projections
raw events to analytics stores

Copies improve performance and availability, but copies create a new problem: they can lag behind the source.

The Naive Expectation

A beginner often expects this:

write completes -> every user immediately sees the new value everywhere

That expectation is expensive in distributed systems. If every cache, read replica, projection, search index, counter, and analytics table must update synchronously before the user request succeeds, the write path becomes slow and fragile.

Example:

1. User likes a post.
2. Like edge is inserted.
3. Counter projection must update.
4. Feed ranking features must update.
5. Notification candidates must update.
6. Analytics aggregates must update.
7. API returns only after all of that succeeds.

This design makes a simple like depend on many downstream systems.

Mental Model

Eventual consistency separates source truth from derived visibility.

The source of truth should become correct first. Copies, caches, and projections can catch up.

For Instagram-style likes:

source truth: user_7 likes post_42
derived view: post_42 has 10,001 likes

The user's own like button may need immediate correctness. The global count can usually be a little stale if the product defines that contract clearly.

Concrete Example: Likes

After a user likes a post:

viewer state: strongly consistent for current user
aggregate count: eventually consistent

That means:

The user should see their own button as liked immediately after success.
The total like count may update a moment later.
Other users may briefly see the old count.
The system should eventually converge to the correct count.

This is not laziness. It is an explicit product and engineering tradeoff.

Where Staleness Comes From

Common causes:

read replica lag
cache invalidation delay
event stream consumer lag
projection worker failure
retries and duplicate events
regional replication delay
batch processing windows

The system should make staleness visible. Invisible lag becomes user confusion and operational risk.

What Eventual Consistency Guarantees

Eventual consistency can guarantee convergence if:

source changes are durable
update events are not permanently lost
consumers keep processing
repair mechanisms exist for drift
no conflicting writes continue forever

It does not guarantee:

immediate visibility everywhere
monotonic reads for every user
bounded lag unless explicitly engineered
automatic repair after missed events
safe staleness for every product feature

Some data should not be eventually consistent. Authorization checks, payment state, inventory reservation, and safety decisions often need stronger guarantees.

Operational Reality

Teams should monitor:

replication lag
event consumer lag
cache hit ratio and stale reads
projection drift
age of oldest unprocessed event
reconciliation corrections
user-visible inconsistency reports
lag by region or partition

Failure modes:

Treating eventual consistency as permission for unbounded lag.
Showing contradictory states that break user trust.
Letting stale cache entries affect safety or authorization.
Missing events and never repairing the projection.
Having no owner for lag alerts.
Allowing a derived view to become the only practical truth.

Knowledge links

Use these links to understand what to know first, where this idea appears, and what to study next.

Prerequisites

Read these first if this topic feels unfamiliar.

Derived Projections

Used In Systems

System studies where this idea appears in context.

Instagram Likes System WhatsApp-Style Messaging System

Related Concepts

Core ideas that connect to this topic.

Read Replicas Caching Projection Drift

Related Patterns

Reusable architecture moves built from these ideas.

Reconciliation Job