Upload-Then-Reference Media
Upload large media outside the message send path, then send a lightweight message that references the stored media object.
Concepts Covered
- Media upload separation
- Object storage
- Message references
- Resumable upload
- Async processing
- Orphan cleanup
- Download authorization
- Placeholder delivery
1. Intent
Upload-Then-Reference Media keeps large files out of the core message send path.
Instead of sending a video, image, or document through the same pipeline as a text message, the client uploads the file to a media service first. The chat message then stores a lightweight reference:
message_type = image
media_object_id = media_123
This keeps messaging focused on durable metadata, ordering, delivery, and receipts.
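A minimal sketch of such a message on the wire, assuming a JSON encoding. The `build_media_message` helper is hypothetical; the field names are the ones used in this document.

```python
import json


def build_media_message(message_id: str, conversation_id: str,
                        message_type: str, media_object_id: str) -> bytes:
    """Serialize a chat message that points at media instead of carrying it."""
    payload = {
        "message_id": message_id,
        "conversation_id": conversation_id,
        "message_type": message_type,
        "media_object_id": media_object_id,  # reference only, no file bytes
    }
    return json.dumps(payload).encode("utf-8")


encoded = build_media_message("msg_1", "conv_1", "image", "media_123")
# The encoded message stays small no matter how large the referenced file is.
print(len(encoded), "bytes")
```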
2. The Problem Without This Pattern
If raw media travels through realtime gateways, message APIs, queues, workers, and message tables, every one of those stages has to buffer, copy, and retry large payloads.
Large files increase:
- request time
- memory pressure
- retry cost
- queue storage
- gateway bandwidth
- database row size
- failure probability
On a weak mobile network, the send path can block for as long as the upload takes. A failed video upload then looks like a failed message delivery, even though the two are different problems.
The system needs to separate:
store this file
from:
send this message
3. How The Pattern Works
Typical flow:
1. Client requests upload authorization.
2. Media service creates an upload session.
3. Client uploads bytes to object storage or an upload endpoint.
4. Media service stores metadata and optional processing state.
5. Client sends a chat message with media_object_id.
6. Recipients receive the message reference.
7. Recipients download media through authorized URLs or proxies.
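The steps above can be sketched with two in-memory services. This is an illustration under stated assumptions, not a reference implementation: `MediaService`, `ChatService`, their method names, and the `"pending"`/`"uploaded"` state values are all hypothetical, and a dictionary stands in for object storage.

```python
import uuid


class MediaService:
    """Hypothetical in-memory media service; a dict stands in for object storage."""

    def __init__(self) -> None:
        self.objects: dict[str, dict] = {}

    def create_upload_session(self, owner_user_id: str, content_type: str) -> str:
        # Steps 1-2: authorize the upload and create a session.
        media_object_id = f"media_{uuid.uuid4().hex[:8]}"
        self.objects[media_object_id] = {
            "owner_user_id": owner_user_id,
            "content_type": content_type,
            "upload_state": "pending",
            "data": b"",
        }
        return media_object_id

    def upload(self, media_object_id: str, data: bytes) -> None:
        # Steps 3-4: store the bytes and record metadata.
        obj = self.objects[media_object_id]
        obj["data"] = data
        obj["size_bytes"] = len(data)
        obj["upload_state"] = "uploaded"

    def download(self, media_object_id: str) -> bytes:
        # Step 7: a real service would check authorization here.
        return self.objects[media_object_id]["data"]


class ChatService:
    """Hypothetical message path: it only ever handles the reference."""

    def __init__(self, media: MediaService) -> None:
        self.media = media
        self.messages: list[dict] = []

    def send(self, conversation_id: str, message_type: str,
             media_object_id: str) -> dict:
        # Step 5: reject references to media that has not finished uploading.
        obj = self.media.objects.get(media_object_id)
        if obj is None or obj["upload_state"] != "uploaded":
            raise ValueError("media is not ready to be referenced")
        message = {
            "message_id": f"msg_{uuid.uuid4().hex[:8]}",
            "conversation_id": conversation_id,
            "message_type": message_type,
            "media_object_id": media_object_id,
        }
        self.messages.append(message)
        return message


media = MediaService()
chat = ChatService(media)
media_id = media.create_upload_session("user_1", "image/png")
media.upload(media_id, b"fake image bytes")
message = chat.send("conv_1", "image", media_id)        # step 6: recipients get this
received = media.download(message["media_object_id"])   # step 7: lazy download
```

The message dictionary never contains the file bytes, so the message path can be sized and retried independently of the media path.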
If processing is needed, it runs asynchronously:
media_uploaded -> thumbnail_worker -> scan_worker -> transcode_worker
The message can be delivered while thumbnails or previews are still being prepared, depending on product rules.
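A rough sketch of that asynchronous hand-off, using a plain in-process queue in place of a real job system. The stage order follows the flow shown above; the `transcode_state` field, the `"pending"`/`"done"` values, and the rule "show a placeholder until the thumbnail is ready" are assumptions made for illustration.

```python
from collections import deque

STAGES = ["thumbnail", "scan", "transcode"]  # order taken from the flow above


def new_media(media_object_id):
    """Create a media record with every processing stage still pending."""
    record = {"media_object_id": media_object_id}
    for stage in STAGES:
        record[f"{stage}_state"] = "pending"
    return record


def render_for_recipient(media):
    """Assumed product rule: show a placeholder until the thumbnail exists."""
    return "thumbnail" if media["thumbnail_state"] == "done" else "placeholder"


def run_worker(queue, media_store):
    """Drain the queue; each job is (media_object_id, stage_index)."""
    while queue:
        media_object_id, stage_index = queue.popleft()
        stage = STAGES[stage_index]
        # Real work (resize, virus scan, transcode) would happen here.
        media_store[media_object_id][f"{stage}_state"] = "done"
        if stage_index + 1 < len(STAGES):
            queue.append((media_object_id, stage_index + 1))


media_store = {"media_123": new_media("media_123")}
jobs = deque([("media_123", 0)])  # enqueued when media_uploaded fires

before = render_for_recipient(media_store["media_123"])  # message already delivered
run_worker(jobs, media_store)
after = render_for_recipient(media_store["media_123"])
print(before, "->", after)  # placeholder -> thumbnail
```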
4. When To Use It
Use this pattern when:
- messages can contain large files
- uploads need retry or resume
- media processing is separate from message delivery
- files belong in object storage
- recipients can download media lazily
- media authorization matters
- uploads may fail independently from sends
It is common for chat apps, collaboration tools, social platforms, and document-sharing products.
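Retry and resume are easier to reason about with a concrete shape in mind. The sketch below assumes an offset-based scheme in which the server remembers how many bytes it has stored and the client continues from that offset; `UploadSession`, `upload_with_resume`, and the `fail_at_call` parameter used to simulate a dropped connection are hypothetical names.

```python
class UploadSession:
    """Hypothetical server-side session that tracks how many bytes have arrived."""

    def __init__(self, total_size: int) -> None:
        self.total_size = total_size
        self.buffer = bytearray()

    @property
    def offset(self) -> int:
        return len(self.buffer)

    @property
    def complete(self) -> bool:
        return self.offset == self.total_size

    def append(self, offset: int, chunk: bytes) -> int:
        # Reject chunks that do not start where the server left off.
        if offset != self.offset:
            raise ValueError(f"expected offset {self.offset}, got {offset}")
        if self.offset + len(chunk) > self.total_size:
            raise ValueError("chunk exceeds declared size")
        self.buffer.extend(chunk)
        return self.offset


def upload_with_resume(session, data, chunk_size, fail_at_call=None):
    """Send chunks from the server's current offset; optionally simulate one drop."""
    calls = 0
    while not session.complete:
        start = session.offset  # ask the server where to resume
        calls += 1
        if calls == fail_at_call:
            raise ConnectionError("simulated network drop")
        session.append(start, data[start:start + chunk_size])


data = bytes(range(10))
session = UploadSession(total_size=len(data))
try:
    upload_with_resume(session, data, chunk_size=4, fail_at_call=2)
except ConnectionError:
    pass  # the first chunk is already stored on the server
resumed_from = session.offset
upload_with_resume(session, data, chunk_size=4)  # continues instead of restarting
```

Because the chat message is only sent once the session is complete, a dropped connection here costs a partial re-upload rather than a failed send.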
5. When Not To Use It
It may be unnecessary when:
- payloads are always tiny
- files are never retained
- the product only sends small inline metadata
- upload reliability is not important
- media does not require processing, authorization, or storage lifecycle
Even then, keeping binary data out of core transactional tables is usually a good instinct.
6. Data And Operational Model
Media object:
media_object
- media_object_id
- owner_user_id
- upload_state
- size_bytes
- content_type
- object_storage_key
- thumbnail_state
- scan_state
- expires_at
Message reference:
message
- message_id
- conversation_id
- message_type
- media_object_id
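The two records above might be modeled as follows. The field names come from the lists above; the `UploadState` values, the default states, and the `can_reference` rule (only the uploader may attach fully uploaded media) are assumptions added for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class UploadState(Enum):
    # Assumed set of states; the source only names the upload_state field.
    PENDING = "pending"
    UPLOADED = "uploaded"
    FAILED = "failed"


@dataclass
class MediaObject:
    media_object_id: str
    owner_user_id: str
    upload_state: UploadState
    size_bytes: int
    content_type: str
    object_storage_key: str
    thumbnail_state: str = "pending"
    scan_state: str = "pending"
    expires_at: Optional[datetime] = None


@dataclass
class Message:
    message_id: str
    conversation_id: str
    message_type: str
    media_object_id: Optional[str] = None  # None for plain text messages


def can_reference(sender_user_id: str, media: MediaObject) -> bool:
    """Assumed rule: only the uploader may attach media, and only once uploaded."""
    return (media.owner_user_id == sender_user_id
            and media.upload_state is UploadState.UPLOADED)


photo = MediaObject(
    media_object_id="media_123",
    owner_user_id="user_1",
    upload_state=UploadState.UPLOADED,
    size_bytes=2_048_000,
    content_type="image/jpeg",
    object_storage_key="uploads/user_1/media_123",
)
msg = Message("msg_1", "conv_1", "image", photo.media_object_id)
```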
Operators should monitor:
- upload success rate
- upload session age
- orphaned media count
- processing queue lag
- download error rate
- object storage cost
- thumbnail and scan failure rate
- authorization failure rate
Orphan cleanup is part of the pattern. Uploaded media that never becomes referenced by a sent message should expire or be garbage collected.
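A cleanup job could look roughly like this. It assumes each media record carries a `created_at` timestamp (not part of the model listed above) and that the set of referenced IDs can be obtained from the message store; the 24-hour grace period is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone


def find_orphans(media_objects, referenced_ids, now, grace=timedelta(hours=24)):
    """Return IDs of media older than `grace` that no message references."""
    return [
        m["media_object_id"]
        for m in media_objects
        if m["media_object_id"] not in referenced_ids
        and now - m["created_at"] > grace
    ]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
media_objects = [
    {"media_object_id": "media_old_unused", "created_at": now - timedelta(days=2)},
    {"media_object_id": "media_old_used", "created_at": now - timedelta(days=2)},
    {"media_object_id": "media_new_unused", "created_at": now - timedelta(minutes=5)},
]
referenced_ids = {"media_old_used"}  # would come from the message store

orphans = find_orphans(media_objects, referenced_ids, now)
print(orphans)  # ['media_old_unused']
```

The grace period matters: a freshly uploaded object whose message is still in flight is not yet an orphan and should not be collected.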
7. Failure Modes
- Upload succeeds but message send fails, leaving orphaned media.
- Message send succeeds but media processing fails.
- Download authorization expires too early.
- Clients retry upload and create duplicate media objects.
- Processing queues lag and previews appear late.
- Media metadata is deleted while messages still reference it.
- Raw media accidentally flows through realtime gateways.
- A placeholder is delivered but the final media never becomes available.
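One of the failure modes above, download authorization expiring too early, is easier to discuss with a concrete token scheme. The sketch below assumes an HMAC-signed token bound to the media object, the requesting user, and an expiry time; `sign_download`, `verify_download`, the token layout, and the one-hour TTL are illustrative choices, not a prescribed design.

```python
import hashlib
import hmac
import time

SECRET = b"demo-secret"  # placeholder; a real key would come from a secret store


def sign_download(media_object_id: str, user_id: str, expires_at: int) -> str:
    """Sign the (media, user, expiry) triple so it cannot be altered."""
    message = f"{media_object_id}:{user_id}:{expires_at}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def verify_download(media_object_id: str, user_id: str, expires_at: int,
                    signature: str, now=None) -> bool:
    """Accept the request only if the token is unexpired and the signature matches."""
    now = int(time.time()) if now is None else now
    if now >= expires_at:
        return False  # token expired
    expected = sign_download(media_object_id, user_id, expires_at)
    return hmac.compare_digest(expected, signature)


issued_at = 1_700_000_000
expires_at = issued_at + 3600  # TTL should exceed a realistic slow download
token = sign_download("media_123", "user_2", expires_at)

ok = verify_download("media_123", "user_2", expires_at, token, now=issued_at + 60)
expired = verify_download("media_123", "user_2", expires_at, token, now=expires_at + 1)
wrong_user = verify_download("media_123", "user_3", expires_at, token, now=issued_at + 60)
```

If the TTL is shorter than the time a recipient on a slow connection needs, the client has to be able to request a fresh token and continue, otherwise the download fails even though the media is intact.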
8. Tradeoffs
| Benefit | Cost |
|---|---|
| Keeps message path lightweight | Adds upload sessions and media metadata |
| Supports resumable uploads | Requires orphan cleanup |
| Allows async thumbnails and scans | Recipients may see placeholders |
| Reduces gateway and queue pressure | Download authorization becomes its own concern |
| Separates file storage from message delivery | More states to reconcile |
Upload-then-reference turns media into a storage lifecycle problem attached to messaging, instead of letting media overload the messaging pipeline itself.