Transactional Outbox
Reliably publish events that correspond to committed database changes by storing publish intent inside the same local database transaction.
Concepts Covered
- Local database transactions
- Cross-system consistency gaps
- Durable publish intent
- Outbox tables
- Background publishers
- At-least-once delivery
- Idempotent consumers
- Outbox table retention
1. Intent
The Transactional Outbox pattern exists because a service usually cannot safely commit a database write and publish a Kafka or message-broker event as one atomic operation.
The database and the broker are different systems. The database has its own transaction log, commit rules, replication behavior, and failure modes. Kafka or another broker has its own acknowledgements, partitions, retries, and availability profile. Unless the system uses a distributed transaction protocol across both systems, there is no single transaction that can guarantee:
commit database row AND publish broker event atomically
That gap is the entire reason the outbox pattern exists.
The pattern changes the problem. Instead of trying to write to the database and publish to the broker in one unsafe request path, the service writes two rows inside one local database transaction:
- the business state
- the intent to publish an event about that business state
After the transaction commits, a separate background publisher reads the outbox row and publishes it to the broker.
2. The Problem Without This Pattern
Imagine an Instagram-style like service. A user taps the like button. The service needs to do two things:
- Store that the user liked the post.
- Publish a LikeCreated event so counters, notifications, ranking, and analytics can react.
A naive implementation might look like this:
insert like into database
publish LikeCreated event to Kafka
return success
The problem is the failure window between the database write and the broker publish.
Suppose the database insert succeeds, but the service crashes before publishing to Kafka. Now the like exists in the database, but downstream systems never hear about it. The counter does not update. The notification system does not know. Analytics misses the event. Ranking features may be stale.
The source-of-truth state changed, but the event-driven world did not receive the change.
Reversing the order is not safe either:
publish LikeCreated event to Kafka
insert like into database
return success
If the publish succeeds but the database insert fails, downstream systems react to a like that does not exist. The notification system might notify someone about a like that was never committed. A counter might increment even though the source edge is absent.
Both orders are broken because the system is trying to coordinate two different systems without one shared transaction.
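The failure window can be made concrete with a small simulation. This sketch uses Python with an in-memory SQLite database as the store and a plain list as a stand-in broker; the crash flag is an illustrative device, not part of any real API:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE likes (user_id TEXT, post_id TEXT)")
broker = []  # stand-in for Kafka

def naive_like(user_id, post_id, crash_before_publish=False):
    # Step 1: commit the business write.
    db.execute("INSERT INTO likes VALUES (?, ?)", (user_id, post_id))
    db.commit()
    # Step 2: publish the event. A crash here loses the event forever.
    if crash_before_publish:
        raise RuntimeError("process crashed before publish")
    broker.append({"type": "LikeCreated", "user_id": user_id, "post_id": post_id})

try:
    naive_like("user_7", "post_42", crash_before_publish=True)
except RuntimeError:
    pass

# The like is committed, but downstream systems never hear about it.
likes = db.execute("SELECT COUNT(*) FROM likes").fetchone()[0]
print(likes, len(broker))  # 1 like row, 0 published events
```

No amount of retry logic in the request path closes this window, because the crash can land between any two steps.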
3. How The Pattern Works
The Transactional Outbox pattern keeps the request path inside one transactional boundary: the local database.
Instead of publishing directly to Kafka during the request, the service writes an outbox row in the same database transaction as the business write.
BEGIN TRANSACTION;

INSERT INTO likes (
    user_id,
    post_id,
    created_at
) VALUES (
    'user_7',
    'post_42',
    now()
);

INSERT INTO outbox_events (
    event_id,
    aggregate_id,
    event_type,
    payload,
    created_at,
    processed_at
) VALUES (
    'evt_1001',
    'post_42',
    'LikeCreated',
    '{"user_id":"user_7","post_id":"post_42"}',
    now(),
    NULL
);

COMMIT;
Now the important guarantee is local and concrete:
If the like row commits, the outbox event row commits too.
If the transaction rolls back, neither row exists.
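A minimal sketch of that request path in application code, using Python's sqlite3 as an illustrative local database (table and column names follow the SQL above; the rollback demonstration via a duplicate event_id is an assumption for the example):

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE likes (user_id TEXT, post_id TEXT, created_at TEXT);
CREATE TABLE outbox_events (
    event_id TEXT PRIMARY KEY,
    aggregate_id TEXT,
    event_type TEXT,
    payload TEXT,
    created_at TEXT,
    processed_at TEXT
);
""")

def like_post(event_id, user_id, post_id):
    # One local transaction: business row and outbox row commit together.
    with db:  # commits on success, rolls back on exception
        db.execute(
            "INSERT INTO likes VALUES (?, ?, datetime('now'))",
            (user_id, post_id),
        )
        db.execute(
            "INSERT INTO outbox_events VALUES (?, ?, 'LikeCreated', ?, datetime('now'), NULL)",
            (event_id, post_id, json.dumps({"user_id": user_id, "post_id": post_id})),
        )

like_post("evt_1001", "user_7", "post_42")

# A failure mid-transaction (here, a duplicate event_id) rolls back both
# inserts, so there is never a like row without a matching outbox row.
try:
    like_post("evt_1001", "user_7", "post_42")
except sqlite3.IntegrityError:
    pass

print(db.execute("SELECT COUNT(*) FROM likes").fetchone()[0])          # 1
print(db.execute("SELECT COUNT(*) FROM outbox_events").fetchone()[0])  # 1
```

The only atomicity the code relies on is the database's own transaction, which is exactly the point of the pattern.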
The outbox itself is not Kafka. It is a durable staging area inside the database that bridges the transactional database world and the asynchronous event-driven world.
After commit, a separate background publisher continuously reads unprocessed outbox rows:
select unpublished outbox rows
publish each event to Kafka
mark rows as processed
repeat
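That loop can be sketched as a small polling publisher against the outbox_events table, again with a list standing in for the broker. Note the ordering inside the loop: publish first, then mark processed, which is what makes delivery at-least-once rather than exactly-once:

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE outbox_events (
    event_id TEXT PRIMARY KEY, aggregate_id TEXT, event_type TEXT,
    payload TEXT, created_at TEXT, processed_at TEXT)""")
db.execute(
    "INSERT INTO outbox_events VALUES (?, ?, ?, ?, datetime('now'), NULL)",
    ("evt_1001", "post_42", "LikeCreated", '{"post_id":"post_42"}'),
)
db.commit()

broker = []  # stand-in for Kafka

def publish_batch(batch_size=100):
    rows = db.execute(
        "SELECT event_id, event_type, payload FROM outbox_events "
        "WHERE processed_at IS NULL ORDER BY created_at LIMIT ?",
        (batch_size,),
    ).fetchall()
    for event_id, event_type, payload in rows:
        # Publish first, then mark processed. A crash between these two
        # steps means a re-publish on the next pass: at-least-once.
        broker.append({"type": event_type, "payload": json.loads(payload)})
        db.execute(
            "UPDATE outbox_events SET processed_at = datetime('now') WHERE event_id = ?",
            (event_id,),
        )
        db.commit()
    return len(rows)

published = publish_batch()
print(published, len(broker))  # 1 1
```

A second pass finds no unprocessed rows and publishes nothing, which is what makes the loop safe to run forever.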
The full flow looks like this:
sequenceDiagram
    participant API as Like API
    participant DB as Database
    participant Publisher as Outbox Publisher
    participant Broker as Kafka / Broker
    participant Consumers as Consumers
    API->>DB: Begin transaction
    API->>DB: Insert like row
    API->>DB: Insert outbox event row
    API->>DB: Commit transaction
    Publisher->>DB: Read unprocessed outbox rows
    Publisher->>Broker: Publish LikeCreated
    Broker->>Consumers: Deliver event
    Publisher->>DB: Mark outbox row processed
The API no longer needs Kafka to be healthy at the exact moment the user likes a post. If Kafka is temporarily unavailable, the outbox rows remain in the database. The publisher can retry later.
4. When To Use It
Use the Transactional Outbox pattern when a committed database change must reliably produce an event.
Good use cases:
- A like should emit events for counters, notifications, ranking, and analytics.
- An order should emit OrderCreated for payment or fulfillment workflows.
- A user signup should emit UserRegistered for email, onboarding, or CRM systems.
- A message write should emit delivery work for asynchronous workers.
- A payment state change should emit events for ledgers, receipts, or risk systems.
The pattern is useful when downstream systems would become incorrect if they silently missed committed changes.
It is especially important when:
- the database is the source of truth
- downstream projections are event-driven
- direct broker publish happens outside the database transaction
- losing events would create durable product inconsistency
- retrying publication is acceptable
5. When Not To Use It
The pattern may be unnecessary if losing the event is acceptable.
For example, a best-effort analytics ping may not need an outbox if the product can tolerate occasional loss. A debug log event may not need this reliability either.
It may also be unnecessary if the event can be reconstructed cheaply from periodic scans. For example, a nightly batch job could rebuild a low-priority report from source tables.
Do not add an outbox just because the architecture uses events. Add it when the event is part of the correctness contract.
Also be honest about the operational cost. An outbox introduces:
- an outbox table
- a publisher process
- retry behavior
- monitoring
- retention or cleanup
- duplicate publish handling
- schema evolution concerns
If nobody owns those operations, the outbox can become another reliability problem.
6. Data And Operational Model
A practical outbox row usually contains enough information to publish, retry, inspect, and clean up events.
outbox_events
- event_id
- aggregate_type
- aggregate_id
- event_type
- payload
- status
- created_at
- processed_at
- attempt_count
- next_attempt_at
- last_error
The publisher needs a safe way to claim work. If multiple publisher instances run, they must not all publish the same row at the same time unless duplicate publish is acceptable and consumers are idempotent.
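One common claiming sketch, assuming the status column from the model above plus a hypothetical claimed_by column: atomically flip a batch of pending rows to a claimed state tagged with the worker's id, then publish only what this worker claimed. In PostgreSQL the same goal is usually met with SELECT ... FOR UPDATE SKIP LOCKED; the SQLite version below is purely illustrative:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE outbox_events (
    event_id TEXT PRIMARY KEY, event_type TEXT, payload TEXT,
    status TEXT DEFAULT 'pending', claimed_by TEXT)""")
db.executemany(
    "INSERT INTO outbox_events (event_id, event_type, payload) "
    "VALUES (?, 'LikeCreated', '{}')",
    [("evt_1",), ("evt_2",), ("evt_3",)],
)
db.commit()

def claim_batch(worker_id, batch_size=2):
    # Atomically tag a batch of pending rows with this worker's id,
    # so two publisher instances never claim the same row.
    with db:
        db.execute(
            "UPDATE outbox_events SET status = 'claimed', claimed_by = ? "
            "WHERE event_id IN (SELECT event_id FROM outbox_events "
            "WHERE status = 'pending' LIMIT ?)",
            (worker_id, batch_size),
        )
    return [row[0] for row in db.execute(
        "SELECT event_id FROM outbox_events "
        "WHERE claimed_by = ? AND status = 'claimed'",
        (worker_id,),
    )]

a = claim_batch("worker_a")
b = claim_batch("worker_b")
print(a, b)  # disjoint sets of event ids
```

A production claim also needs a timeout or lease so that rows claimed by a crashed worker eventually return to the pending pool.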
Common publisher responsibilities:
- read unprocessed rows
- publish events to the broker
- mark rows as processed after broker acknowledgement
- retry transient failures with backoff
- record permanent failures for inspection
- expose lag and failure metrics
- clean up or archive old processed rows
Important metrics:
- oldest unprocessed outbox row age
- number of unprocessed rows
- publish success rate
- publish failure rate
- retry count
- outbox table size
- publisher lag
- dead-lettered or stuck events
Outbox retention matters. If processed rows are never deleted or archived, the outbox table becomes a hidden storage and query-performance problem.
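A retention sweep can be as simple as deleting (or archiving) processed rows older than a cutoff. This sketch assumes the processed_at column from the model above; the specific timestamps are fabricated for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE outbox_events (event_id TEXT PRIMARY KEY, processed_at TEXT)")
db.executemany("INSERT INTO outbox_events VALUES (?, ?)", [
    ("evt_old", "2020-01-01 00:00:00"),     # processed long ago: eligible
    ("evt_recent", "2099-01-01 00:00:00"),  # inside the window (far-future placeholder)
    ("evt_pending", None),                  # unprocessed intent must never age out
])
db.commit()

def purge_processed(older_than_days=30):
    # Only processed rows age out; unprocessed rows are durable intent.
    cur = db.execute(
        "DELETE FROM outbox_events WHERE processed_at IS NOT NULL "
        "AND processed_at < datetime('now', ?)",
        (f"-{older_than_days} days",),
    )
    db.commit()
    return cur.rowcount

deleted = purge_processed()
print(deleted)  # 1
```

Running the sweep on a schedule keeps the table bounded without touching rows the publisher still needs.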
7. Failure Modes
The outbox pattern prevents lost publish intent, but it does not remove all failure modes.
Important failures:
- The publisher publishes an event but crashes before marking the row processed.
- The publisher retries and publishes the same event more than once.
- Consumers are not idempotent and duplicate side effects appear.
- The outbox table grows without retention.
- Multiple publishers claim the same row without safe coordination.
- Event payload schemas change in a way consumers cannot handle.
- Publisher lag grows and downstream projections become stale.
- The database becomes overloaded because outbox polling is too aggressive.
The most common misunderstanding is thinking the outbox gives exactly-once delivery everywhere. It does not.
The outbox guarantees that if the business transaction commits, the event intent is durably recorded. Publishing may still happen more than once. Consumers still need idempotency.
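Consumer-side idempotency is often a dedup table keyed by event_id: record the event id and apply the side effect in one local transaction, and skip anything already seen. A sketch under those assumptions (table names are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE like_counts (post_id TEXT PRIMARY KEY, count INTEGER);
CREATE TABLE consumed_events (event_id TEXT PRIMARY KEY);
""")

def handle_like_created(event_id, post_id):
    # Record the event id and apply the side effect in one transaction.
    # A duplicate delivery hits the primary key and changes nothing.
    try:
        with db:
            db.execute("INSERT INTO consumed_events VALUES (?)", (event_id,))
            db.execute(
                "INSERT INTO like_counts VALUES (?, 1) "
                "ON CONFLICT(post_id) DO UPDATE SET count = count + 1",
                (post_id,),
            )
    except sqlite3.IntegrityError:
        pass  # already consumed: safe to ignore

# At-least-once delivery: the same event arrives twice.
handle_like_created("evt_1001", "post_42")
handle_like_created("evt_1001", "post_42")

print(db.execute(
    "SELECT count FROM like_counts WHERE post_id = 'post_42'"
).fetchone()[0])  # 1
```

Because the dedup insert and the counter update share one transaction, a crash between them cannot leave the event half-applied.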
8. Tradeoffs
| Benefit | Cost |
|---|---|
| Keeps business write and publish intent in one local transaction | Adds an outbox table |
| Prevents committed changes from silently losing events | Adds a publisher workflow |
| Decouples request success from broker availability | Events can be delayed |
| Makes publishing retryable | Consumers must handle duplicates |
| Supports event-driven projections | Requires retention and monitoring |
The key tradeoff is latency versus reliability. Direct publishing may look simpler and faster, but it creates a dangerous consistency gap. The outbox adds an asynchronous step so the system can retry safely when the broker, network, or publisher fails.
For user-facing flows, this is usually the right tradeoff. The user action can commit quickly, and downstream systems can catch up.
9. Related Systems And Concepts