Delivery Guarantees

The reliability contract that defines whether messages can be lost, duplicated, retried, acknowledged, or eventually delivered.

intermediate | 4 min read | Updated 2026-05-14 | Modeling | Reliability | Operations | Tradeoffs

At-Most-Once | At-Least-Once | Exactly-Once Illusion | Acknowledgements | Idempotency

After this, you will understand

Where delivery guarantees appear in production systems, what problem forces them, and how to reason about the tradeoffs.

Naive mental model

Treat the idea as a definition to memorize.

Production pressure

Real systems force a choice between at-most-once and at-least-once delivery, and expose exactly-once as an illusion built on top of them.

Better reasoning

Use the concept to decide what the system guarantees, what it risks, and what it costs to operate.

Think before reading

Where would Delivery Guarantees appear in a real production system, and what failure or bottleneck would it help you reason about?

As you read, look for the pressure that creates the idea first. The mechanics matter more once the reason is clear.

Concepts Covered

At-most-once delivery
At-least-once delivery
Exactly-once semantics
Durable acceptance
Acknowledgements
Retries
Duplicate prevention
User-visible delivery states

Definition

A delivery guarantee is the reliability contract a system makes about what can happen after work is accepted.

In a messaging system, this means answering questions like:

Can an accepted message be lost?
Can a recipient receive the same message twice?
What does "sent" mean?
What does "delivered" mean?
Does the sender wait for every recipient device?
What happens if a worker crashes halfway through delivery?

Delivery guarantees are not just implementation details. They shape user trust. A chat product can tolerate a receipt arriving late. It cannot casually lose a message after telling the sender it was accepted.

The Pain That Forces Delivery Guarantees

Naive messaging systems often mix several meanings into one word: "sent."

But a message moves through multiple stages:

client creates message
  -> server accepts message
  -> message is stored durably
  -> delivery worker processes it
  -> gateway pushes it
  -> recipient device receives it
  -> recipient reads it

If the UI shows one checkmark after the first network request returns, what does that checkmark actually mean? Did the server store the message? Did a recipient receive it? Did only a gateway see it? Did the message survive a crash?

Delivery guarantees force the system to define these boundaries clearly.

The Three Classic Guarantees

Guarantee	Meaning	Common consequence
At-most-once	Try once; do not retry after uncertainty	Work can be lost
At-least-once	Retry until success is observed	Duplicates are possible
Exactly-once	Each logical operation affects the system once	Usually built from idempotency and deduplication
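
The rows of this table can be contrasted in a small simulation. The sketch below is illustrative only: `FlakyChannel`, `Receiver`, and the scripted outcomes ("drop", "lost_ack", "ok") are hypothetical names invented for the example, not part of any real messaging library. It shows that a single attempt can lose work, that retrying can deliver the same message twice, and that a receiver which deduplicates by message id turns those duplicates into one logical effect.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Recipient that applies each message_id at most once (idempotent)."""
    seen: set = field(default_factory=set)
    applied: list = field(default_factory=list)
    raw_deliveries: int = 0

    def receive(self, message_id: str, body: str) -> None:
        self.raw_deliveries += 1      # every physical delivery is counted
        if message_id in self.seen:   # duplicate attempt: no new effect
            return
        self.seen.add(message_id)
        self.applied.append(body)


class FlakyChannel:
    """Hypothetical channel whose outcome per attempt is scripted.

    "drop"     -> message never reaches the receiver
    "lost_ack" -> message is delivered, but the sender never learns it
    "ok"       -> message is delivered and acknowledged
    """

    def __init__(self, receiver: Receiver, script: list):
        self.receiver = receiver
        self.script = list(script)

    def send(self, message_id: str, body: str) -> bool:
        outcome = self.script.pop(0) if self.script else "ok"
        if outcome == "drop":
            return False
        self.receiver.receive(message_id, body)
        return outcome == "ok"


def send_at_most_once(channel, message_id, body) -> bool:
    # One attempt; uncertainty is never resolved by retrying.
    return channel.send(message_id, body)


def send_at_least_once(channel, message_id, body, max_attempts=5) -> bool:
    # Retry until an acknowledgement is observed or attempts run out.
    for _ in range(max_attempts):
        if channel.send(message_id, body):
            return True
    return False


# At-most-once: the only attempt is dropped, so the message is lost.
lossy = Receiver()
send_at_most_once(FlakyChannel(lossy, ["drop"]), "m1", "hello")

# At-least-once: the first ack is lost, so the message is delivered twice,
# but the idempotent receiver applies it once.
retried = Receiver()
send_at_least_once(FlakyChannel(retried, ["lost_ack", "ok"]), "m1", "hello")
```

In the second run the receiver sees two physical deliveries but records one applied message, which is the practical meaning of exactly-once in the table.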

"Exactly once" is often misunderstood. In distributed systems, networks fail and components retry. A practical design usually uses at-least-once delivery underneath, then adds idempotency so repeated attempts do not create repeated logical effects.

Accepted Is Not Delivered

A chat system should separate these states:

pending    -> client has not received server acceptance
accepted   -> server durably stored the message
delivered  -> recipient account or device acknowledged receipt
read       -> recipient viewed the message according to product rules
failed     -> system could not accept or deliver under policy

The exact UI labels are product decisions, but the engineering boundary matters.
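
One way to make the boundary explicit is a small state machine. This is a minimal sketch under assumptions: the state names come from the list above, but the transition table is invented for illustration, and a real product might allow other moves (for example, a read receipt arriving before a delivery receipt).

```python
from enum import Enum


class MessageState(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    DELIVERED = "delivered"
    READ = "read"
    FAILED = "failed"


# Assumed transition rules, not taken from any specific product.
ALLOWED = {
    MessageState.PENDING: {MessageState.ACCEPTED, MessageState.FAILED},
    MessageState.ACCEPTED: {MessageState.DELIVERED, MessageState.FAILED},
    MessageState.DELIVERED: {MessageState.READ},
    MessageState.READ: set(),
    MessageState.FAILED: set(),
}


def advance(current: MessageState, target: MessageState) -> MessageState:
    """Return the new state, ignoring repeats and rejecting illegal jumps."""
    if target == current:
        return current  # duplicate acknowledgement: idempotent no-op
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Treating a repeated acknowledgement as a no-op keeps retried signals harmless, while rejecting a jump such as pending to delivered keeps the UI from claiming more than the system has observed.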

If the server says a message was accepted, the message should be durable enough to survive gateway crashes, worker retries, and recipient offline periods.
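
The ordering "store first, acknowledge second" can be sketched as follows. `DurableStore`, `put_if_absent`, `accept_message`, and the `client_key` parameter are hypothetical names for this example, and the in-memory dict only stands in for a real durable store. The sketch also assumes the client supplies an idempotency key so that a retried request maps to the same stored message.

```python
import uuid


class DurableStore:
    """Stand-in for a durable store; a dict is NOT durable in reality."""

    def __init__(self):
        self.by_client_key = {}

    def put_if_absent(self, client_key: str, record: dict) -> dict:
        # Returns the existing record when the same key is written twice.
        return self.by_client_key.setdefault(client_key, record)


def accept_message(store: DurableStore, client_key: str, body: str) -> dict:
    """Store first, acknowledge second."""
    record = {"message_id": str(uuid.uuid4()), "body": body, "state": "accepted"}
    stored = store.put_if_absent(client_key, record)  # durable write happens here
    # Only after the write returns is it honest to tell the sender "accepted".
    return {"ack": "accepted", "message_id": stored["message_id"]}


store = DurableStore()
first = accept_message(store, "client-key-1", "hi")
retry = accept_message(store, "client-key-1", "hi")  # same key, same message
```

Because the acknowledgement is built from the stored record, a client retry after a lost response receives the original message id instead of creating a second message.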

Why At-Least-Once Is Common

At-least-once delivery chooses retry over silent loss.

Suppose a delivery worker sends a message to a gateway and then crashes before recording success.

Did the device receive the message? The worker may not know.

Retrying is safer than giving up, but retrying can duplicate delivery unless the recipient side or delivery state is idempotent.

This is why delivery systems need stable identifiers:

message_id + recipient_user_id + device_id

That key lets the system recognize:

this delivery task already exists
this device already acknowledged this message
this retry is the same logical work
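
A minimal sketch of how such a key might be used, assuming an in-memory ledger. `DeliveryKey` and `DeliveryLedger` are illustrative names, and a real system would persist this state rather than hold it in a Python dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryKey:
    message_id: str
    recipient_user_id: str
    device_id: str


class DeliveryLedger:
    """In-memory sketch; a real system would persist this state."""

    def __init__(self):
        self.status = {}  # DeliveryKey -> "scheduled" | "acked"

    def schedule(self, key: DeliveryKey) -> bool:
        """Create the task once; return False if it already exists."""
        if key in self.status:
            return False
        self.status[key] = "scheduled"
        return True

    def acknowledge(self, key: DeliveryKey) -> None:
        self.status[key] = "acked"

    def needs_send(self, key: DeliveryKey) -> bool:
        """A retry is only worth sending if the device has not acked."""
        return self.status.get(key) == "scheduled"


ledger = DeliveryLedger()
key = DeliveryKey("m1", "user-7", "phone-a")
created = ledger.schedule(key)        # first time: task is created
created_again = ledger.schedule(key)  # retry: recognized as the same work
```

The frozen dataclass makes the key hashable, so two retries that carry the same three identifiers collapse onto one ledger entry.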

Acknowledgements

Acknowledgements are signals from one part of the system to another. They are not all equal.

Important acknowledgement types:

Server acceptance acknowledgement to the sender.
Gateway write acknowledgement from gateway to delivery worker.
Device receipt acknowledgement from client to backend.
Read acknowledgement from client to receipt service.
Consumer checkpoint acknowledgement to a stream or queue.

A gateway saying "I wrote the event to the socket" is weaker than a device saying "I processed the message." A read receipt is different from a delivery receipt.

Precise language prevents false confidence.
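
One way to encode that precision is to rank acknowledgements by how much they prove about the recipient and derive the user-visible state from the strongest one seen. The sketch below is an assumption-laden illustration: the `Ack` enum, its ordering, and the `visible_state` mapping are invented here, and the consumer checkpoint acknowledgement is left out because it concerns queue progress, not a single message.

```python
from enum import IntEnum


class Ack(IntEnum):
    """Ordered from weakest to strongest evidence about the recipient."""
    SERVER_ACCEPTED = 1   # server durably stored the message
    GATEWAY_WROTE = 2     # gateway wrote the event to a socket
    DEVICE_RECEIVED = 3   # client reported receipt
    READ = 4              # client reported a read


def visible_state(acks: set) -> str:
    """Map the strongest acknowledgement seen so far to a UI state."""
    if not acks:
        return "pending"
    strongest = max(acks)
    if strongest >= Ack.READ:
        return "read"
    if strongest >= Ack.DEVICE_RECEIVED:
        return "delivered"
    # A gateway write does not prove the device got anything,
    # so it is deliberately shown the same as server acceptance.
    return "accepted"
```

With this mapping, a gateway write alone never promotes a message to "delivered"; only a device acknowledgement does.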

Failure Modes

Common failures:

The server acknowledges before durably storing the message.
The client retries without an idempotency key and creates duplicates.
The delivery worker retries push notifications without deduplication.
The UI treats "sent to gateway" as "delivered to recipient."
Poison messages retry forever and block a queue.
Retries happen too aggressively and create overload.
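
The last failure in this list is commonly mitigated by spacing retries out. The sketch below shows exponential backoff with full jitter as one possible approach; the function name `backoff_delay` and the default `base` and `cap` values are assumptions chosen for illustration, not recommended settings.

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 60.0,
                  rng: random.Random = None) -> float:
    """Full-jitter delay in seconds for a 0-based attempt number."""
    rng = rng or random
    # The ceiling doubles with each attempt until it reaches the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # A random delay below the ceiling keeps retries from synchronizing.
    return rng.uniform(0.0, ceiling)
```

Passing a seeded `random.Random` instance as `rng` makes the delays reproducible in tests.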

Dead-letter queues are useful for delivery tasks that repeatedly fail. They prevent one bad item from blocking a whole queue, while giving operators a place to inspect and repair.
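
A dead-letter policy can be sketched in a few lines. This is a simplified illustration that assumes tasks are plain dicts with a hypothetical `payload` field and that a fixed `max_attempts` is the only policy; real queues usually add delays between attempts and persist the attempt count.

```python
from collections import deque


def drain(queue: deque, handler, max_attempts: int = 3) -> list:
    """Process tasks; move repeatedly failing ones to a dead-letter list."""
    dead_letters = []
    while queue:
        task = queue.popleft()
        try:
            handler(task["payload"])
        except Exception as exc:
            task["attempts"] = task.get("attempts", 0) + 1
            if task["attempts"] >= max_attempts:
                task["last_error"] = repr(exc)
                dead_letters.append(task)  # parked for operators to inspect
            else:
                queue.append(task)         # retry later, behind other work
    return dead_letters
```

Re-queueing a failed task at the back, and parking it after the final attempt, is what keeps one poison message from blocking the healthy tasks behind it.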

Operational Reality

Important signals:

accepted message count
delivery task retry rate
duplicate delivery attempts
acknowledgement latency
undelivered backlog age
dead-letter queue depth
device receipt lag
gateway push failures
messages stuck in uncertain states
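
A few of these signals can be derived directly from delivery task records. The sketch below assumes each task is a dict with hypothetical fields `created_at` (seconds), `attempts`, and `state` ("pending", "acked", or "dead"); the field names and the definition of retry rate are illustrative choices, not a standard.

```python
def delivery_signals(tasks: list, now: float) -> dict:
    """Compute a few health signals from delivery task records."""
    pending = [t for t in tasks if t["state"] == "pending"]
    total_attempts = sum(t["attempts"] for t in tasks)
    # Every attempt after the first one on a task counts as a retry.
    retries = sum(max(0, t["attempts"] - 1) for t in tasks)
    return {
        "oldest_undelivered_age_s": max(
            (now - t["created_at"] for t in pending), default=0.0
        ),
        "retry_rate": retries / total_attempts if total_attempts else 0.0,
        "dead_letter_depth": sum(1 for t in tasks if t["state"] == "dead"),
    }


sample = [
    {"created_at": 100.0, "attempts": 1, "state": "acked"},
    {"created_at": 90.0, "attempts": 3, "state": "pending"},
    {"created_at": 50.0, "attempts": 4, "state": "dead"},
]
signals = delivery_signals(sample, now=120.0)
```

Tracking the age of the oldest undelivered task, and not only the backlog size, helps surface messages that are stuck while the rest of the queue looks healthy.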

The product question is: what should the user believe at each state? The engineering job is to make that belief honest.

What to study next

These links keep the session moving: read prerequisites first, then open the systems, concepts, and patterns that deepen this page.

Prerequisites

Read these first if the mechanics feel unfamiliar.

Idempotency: Start here if Idempotency is still fuzzy.
Retry With Backoff And Jitter: Start here if Retry With Backoff And Jitter is still fuzzy.

Used In Systems

System studies where this idea appears in context.

WhatsApp-Style Messaging System: See the idea under full production pressure.

Related Concepts

Core ideas that connect to this topic.

Offline Delivery: Understand the concept behind the design decision.
Message Ordering: Understand the concept behind the design decision.
Message Receipts: Understand the concept behind the design decision.

Related Patterns

Reusable architecture moves built from these ideas.

Dead-Letter Queue: Learn the reusable move this page points toward.