Projection Drift
When a derived read model or aggregate becomes inconsistent with the source-of-truth data it was computed from.
Concepts Covered
- Projection drift
- Source of truth
- Derived read models
- Counter correctness
- Event replay
- Reconciliation
- Repair jobs
- Acceptable error
Definition
Projection drift happens when a derived view no longer matches the source-of-truth data it was computed from.
Example:
source of truth: 1,000 active like edges
counter projection: 997 likes
The counter has drifted by three.
A projection is any computed view used for reads: counters, timelines, dashboards, search indexes, inboxes, unread counts, analytics tables, or materialized views.
The Pain That Forces Drift Handling
Derived projections exist because reading from the source of truth every time can be too slow or expensive.
An Instagram-like system might store the real like relationship as:
likes(user_id, post_id)
But counting that table on every post view is expensive. So the system maintains a derived counter:
post_like_counts(post_id, count)
That makes reads fast, but it creates a new correctness question:
Does the count still match the real likes?
Over time, retries, missed events, duplicate events, failed workers, manual fixes, and code bugs can make the projection diverge from the truth.
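The two read paths can be sketched with an in-memory SQLite database. This is a minimal illustration, not a production schema: the table and column names follow the text above, while the helper names `count_from_truth` and `count_from_projection` are assumptions made for this example.

```python
import sqlite3

# In-memory database; table and column names follow the text above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE likes (
        user_id INTEGER,
        post_id INTEGER,
        PRIMARY KEY (user_id, post_id)
    );
    CREATE TABLE post_like_counts (
        post_id INTEGER PRIMARY KEY,
        "count" INTEGER NOT NULL
    );
""")


def count_from_truth(post_id: int) -> int:
    """Slow path: count the real like edges (hypothetical helper)."""
    row = conn.execute(
        'SELECT COUNT(*) FROM likes WHERE post_id = ?', (post_id,)
    ).fetchone()
    return row[0]


def count_from_projection(post_id: int) -> int:
    """Fast path: read the maintained counter, or 0 if no row exists."""
    row = conn.execute(
        'SELECT "count" FROM post_like_counts WHERE post_id = ?', (post_id,)
    ).fetchone()
    return row[0] if row else 0


# Three users like post 7, but the counter only recorded two of them.
conn.executemany("INSERT INTO likes VALUES (?, ?)", [(1, 7), (2, 7), (3, 7)])
conn.execute("INSERT INTO post_like_counts VALUES (7, 2)")

drift = count_from_truth(7) - count_from_projection(7)
print(drift)  # 1
```

The fast path is a single primary-key lookup, which is why the projection exists. The slow path is the only one that is correct by definition.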
Mental Model
The source of truth is the durable fact. The projection is a convenient view.
Projection drift is the gap between them.
truth -> events -> projection
Every arrow can fail. An event may not be published. A consumer may process it twice. A deployment may contain a bug. A backfill may skip a partition. A manual database correction may bypass the event path, so the projection never hears about the change.
If a system uses derived projections, it should also have a plan for detecting and repairing drift.
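Detection can be as simple as comparing the two sides key by key. The sketch below assumes both the truth and the projection can be loaded as plain dictionaries of counts. The function name `find_drift` and the `post:N` keys are hypothetical, chosen only for illustration.

```python
def find_drift(truth: dict[str, int], projection: dict[str, int]) -> dict[str, int]:
    """Return {key: truth - projection} for every key whose values differ.

    A key missing from either side is treated as a count of 0.
    """
    drift = {}
    for key in truth.keys() | projection.keys():
        delta = truth.get(key, 0) - projection.get(key, 0)
        if delta != 0:
            drift[key] = delta
    return drift


truth = {"post:1": 1000, "post:2": 15, "post:3": 4}
projection = {"post:1": 997, "post:2": 15, "post:4": 2}

print(sorted(find_drift(truth, projection).items()))
# [('post:1', 3), ('post:3', 4), ('post:4', -2)]
```

A positive value means the projection undercounts and a negative value means it overcounts. Keys that exist on only one side are reported too, since a missing or orphaned projection row is also drift.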
Why Drift Happens
Common causes:
- missed events
- duplicate events
- consumer bugs
- events applied out of order
- partial outages
- manual data changes
- replaying old events with new logic
- schema migrations that change meaning
- race conditions between source writes and projection updates
Drift is not a rare edge case in long-running systems. It is a normal risk of keeping more than one representation of the same facts.
Example: Like Counter Drift
Suppose a user likes a post:
1. Insert like edge succeeds.
2. LikeCreated event is published.
3. Counter consumer increments count.
If step 2 fails, the source of truth has the like but the counter never increments.
If step 3 runs twice, the counter increments twice.
If an unlike event is processed before the like event, the counter may end up wrong depending on the implementation.
The user sees the projection, not the source table. So even small drift can damage trust if users notice impossible numbers.
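The duplicate and out-of-order cases can be reproduced in a few lines. This is a sketch under stated assumptions: the `Event` shape, the producer-assigned `event_id`, and both counter classes are hypothetical, and the clamping in `NaiveCounter` is just one plausible implementation choice.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str  # hypothetical unique id assigned by the producer
    kind: str      # "like" or "unlike"
    user_id: int
    post_id: int


@dataclass
class NaiveCounter:
    """Applies +1/-1 per event, with no deduplication."""
    count: int = 0

    def apply(self, event: Event) -> None:
        if event.kind == "like":
            self.count += 1
        else:
            # Clamp so the displayed count never goes negative.
            self.count = max(0, self.count - 1)


@dataclass
class IdempotentCounter:
    """Ignores event ids it has already applied; does not clamp."""
    count: int = 0
    seen: set = field(default_factory=set)

    def apply(self, event: Event) -> None:
        if event.event_id in self.seen:
            return
        self.seen.add(event.event_id)
        self.count += 1 if event.kind == "like" else -1


like = Event("e1", "like", user_id=1, post_id=7)
unlike = Event("e2", "unlike", user_id=1, post_id=7)

# Duplicate delivery: the same like event arrives twice.
naive, safe = NaiveCounter(), IdempotentCounter()
for e in (like, like):
    naive.apply(e)
    safe.apply(e)
print(naive.count, safe.count)  # 2 1

# Out-of-order delivery: unlike arrives before like. The truth is 0 likes.
reordered = NaiveCounter()
for e in (unlike, like):
    reordered.apply(e)
print(reordered.count)  # 1
```

Deduplicating by event id handles the duplicate case, and an unclamped counter happens to survive this particular reordering. Neither helps when the event is never published at all, which is why detection and repair are still needed.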
Repair Strategies
| Strategy | Idea | Tradeoff |
|---|---|---|
| Recompute from source tables | Count truth directly and overwrite projection | Accurate but expensive |
| Replay event log | Rebuild projection from historical events | Requires reliable event history |
| Compare samples | Check parts of the dataset for mismatch | Cheaper but incomplete |
| Incremental reconciliation | Repair partitions, keys, or time windows gradually | Needs scheduling and tracking |
| Dual calculation | Compare old and new projection logic during migration | Extra cost |
The repair method depends on the business cost of being wrong.
For like counts, small temporary drift may be acceptable. For payments, balances, inventory, or permissions, drift may be unacceptable and require stronger consistency boundaries.
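As an illustration, the first and fourth rows of the table can be combined: recompute counts from the source edges, then overwrite the projection a few keys at a time. The function names and the batch size of two are assumptions for this sketch. It also assumes the source does not change while the job runs, which a real system would have to handle.

```python
from collections import Counter
from typing import Iterable


def recompute_counts(like_edges: Iterable[tuple[int, int]]) -> Counter:
    """Count like edges per post_id directly from the source of truth."""
    return Counter(post_id for _user_id, post_id in like_edges)


def reconcile_batch(
    truth: Counter, projection: dict[int, int], post_ids: Iterable[int]
) -> int:
    """Overwrite the projection for the given keys; return how many were repaired."""
    repaired = 0
    for post_id in post_ids:
        expected = truth.get(post_id, 0)
        if projection.get(post_id, 0) != expected:
            projection[post_id] = expected
            repaired += 1
    return repaired


like_edges = [(1, 7), (2, 7), (3, 7), (1, 8)]   # (user_id, post_id)
projection = {7: 2, 8: 1, 9: 5}  # post 7 undercounts; post 9 has no likes at all

truth = recompute_counts(like_edges)
all_keys = sorted(truth.keys() | projection.keys())

# Incremental reconciliation: repair two keys per pass instead of everything at once.
total = 0
for start in range(0, len(all_keys), 2):
    total += reconcile_batch(truth, projection, all_keys[start:start + 2])

print(total, projection)  # 2 {7: 3, 8: 1, 9: 0}
```

Returning the number of repaired keys per batch gives the job something to report, and batching by key range keeps each pass small enough to schedule and retry independently.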
Operational Reality
Important signals:
- drift detected by reconciliation jobs
- projection freshness
- consumer error rate
- duplicate event rate
- event replay failures
- repair job duration
- number of repaired records
- largest drift by key
- stuck partitions
The key product question is: how wrong can this projection be, for how long, before it becomes unacceptable?
That answer determines whether the system needs strict transactions, frequent reconciliation, approximate counters, user-visible freshness labels, or manual repair tools.
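One way to make that answer explicit is to write it down as a per-projection budget. The sketch below is hypothetical: `DriftBudget`, its fields, and the numbers are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftBudget:
    max_abs_drift: int        # largest tolerated |truth - projection| per key
    max_stale_seconds: float  # longest tolerated time since the last known-correct state


def within_budget(budget: DriftBudget, drift: int, stale_seconds: float) -> bool:
    """True if both the size and the age of the drift are tolerable."""
    return (
        abs(drift) <= budget.max_abs_drift
        and stale_seconds <= budget.max_stale_seconds
    )


# Illustrative numbers only; real tolerances are a product decision.
like_counts = DriftBudget(max_abs_drift=5, max_stale_seconds=3600)
account_balance = DriftBudget(max_abs_drift=0, max_stale_seconds=0)

print(within_budget(like_counts, drift=3, stale_seconds=120))      # True
print(within_budget(account_balance, drift=3, stale_seconds=120))  # False
```

A zero budget is a signal that the value probably should not be an asynchronously maintained projection in the first place.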