Concepts
Eventual Consistency
A consistency model where replicas, caches, or projections may temporarily disagree but are expected to converge if processing continues.
Concepts Covered
- Stale reads
- Convergence
- Replication lag
- Caches
- Derived projections
- Product consistency contracts
- Repair and reconciliation
- Bounded versus unbounded lag
Definition
Eventual consistency means different copies or views of data may temporarily disagree, but they are expected to converge over time if no new updates arrive and the system keeps processing.
It does not mean "anything can be wrong forever."
A serious eventually consistent system defines which data may be stale, how stale it may become, how lag is monitored, and how incorrect derived state is repaired.
The Pain That Forces Eventual Consistency
At small scale, one database can answer every question directly. Reads and writes go to the same place.
At larger scale, systems copy data because one source table cannot efficiently serve every read pattern:
- primary database to read replicas
- source rows to cache
- events to derived projections
- messages to inbox views
- likes to counter projections
- raw events to analytics stores
Copies improve performance and availability, but copies create a new problem: they can lag behind the source.
The Naive Expectation
A beginner often expects this:
write completes -> every user immediately sees the new value everywhere
That expectation is expensive in distributed systems. If every cache, read replica, projection, search index, counter, and analytics table must update synchronously before the user request succeeds, the write path becomes slow and fragile.
Example:
1. User likes a post.
2. Like edge is inserted.
3. Counter projection must update.
4. Feed ranking features must update.
5. Notification candidates must update.
6. Analytics aggregates must update.
7. API returns only after all of that succeeds.
This design makes a simple like depend on many downstream systems.
Mental Model
Eventual consistency separates source truth from derived visibility.
The source of truth should become correct first. Copies, caches, and projections can catch up.
For Instagram-style likes:
source truth: user_7 likes post_42
derived view: post_42 has 10,001 likes
The user's own like button may need immediate correctness. The global count can usually be a little stale if the product defines that contract clearly.
Concrete Example: Likes
After a user likes a post:
viewer state: strongly consistent for current user
aggregate count: eventually consistent
That means:
- The user should see their own button as liked immediately after success.
- The total like count may update a moment later.
- Other users may briefly see the old count.
- The system should eventually converge to the correct count.
This is not laziness. It is an explicit product and engineering tradeoff.
Where Staleness Comes From
Common causes:
- read replica lag
- cache invalidation delay
- event stream consumer lag
- projection worker failure
- retries and duplicate events
- regional replication delay
- batch processing windows
The system should make staleness visible. Invisible lag becomes user confusion and operational risk.
What Eventual Consistency Guarantees
Eventual consistency can guarantee convergence if:
- source changes are durable
- update events are not permanently lost
- consumers keep processing
- repair mechanisms exist for drift
- no conflicting writes continue forever
It does not guarantee:
- immediate visibility everywhere
- monotonic reads for every user
- bounded lag unless explicitly engineered
- automatic repair after missed events
- safe staleness for every product feature
Some data should not be eventually consistent. Authorization checks, payment state, inventory reservation, and safety decisions often need stronger guarantees.
Operational Reality
Teams should monitor:
- replication lag
- event consumer lag
- cache hit ratio and stale reads
- projection drift
- age of oldest unprocessed event
- reconciliation corrections
- user-visible inconsistency reports
- lag by region or partition
Failure modes:
- Treating eventual consistency as permission for unbounded lag.
- Showing contradictory states that break user trust.
- Letting stale cache entries affect safety or authorization.
- Missing events and never repairing the projection.
- Having no owner for lag alerts.
- Allowing a derived view to become the only practical truth.
Related Topics
Knowledge links
Use these links to understand what to know first, where this idea appears, and what to study next.
Prerequisites
Read these first if this topic feels unfamiliar.
Used In Systems
System studies where this idea appears in context.
Related Concepts
Core ideas that connect to this topic.
Related Patterns
Reusable architecture moves built from these ideas.