System Design
URL Shortener System
Design a reliable URL shortening service that supports short-code generation, low-latency redirects, abuse controls, analytics, and horizontal scaling.
Study path
Read these in order
Start with the mechanics, then move into the patterns that explain why the system is shaped this way.
Concepts Covered
- Short-code generation
- HTTP redirects
- Database indexing
- B-tree-backed lookups
- Caching and cache invalidation
- Rate limiting
- Read replicas
- Sharding and partition keys
- Hot key mitigation
- Abuse prevention
- Analytics pipelines
- Cache-aside
- Idempotent consumers
- Dead-letter queues
- Synchronous vs asynchronous write paths
1. Introduction
A URL shortener converts a long URL into a compact URL that redirects users to the original destination.
Example:
https://example.com/a/very/long/product/path?campaign=spring
becomes:
https://arc.fl/x7Kp9Q
At first, this seems like a simple key-value lookup: store short_code -> long_url, then redirect. That intuition is useful, but a production system quickly becomes more interesting. It needs low-latency reads, reliable short-code generation, abuse controls, analytics, expiration policies, custom aliases, and a data model that can survive very high read traffic.
The most important mental model: URL creation is usually write-light, while redirect traffic can become read-heavy and latency-sensitive. This is why the page keeps separating creation-time concerns like short-code generation, validation, and rate limiting from redirect-time concerns like HTTP redirects, caching, and asynchronous analytics.
2. Product Requirements
Functional Requirements
- Users can submit a long URL and receive a short URL.
- Visiting a short URL redirects to the original long URL.
- The system can reject invalid, unsafe, or abusive URLs.
- Short links can optionally expire.
- Users can optionally request custom aliases, depending on product scope.
- The system records click events for analytics.
- Operators can disable malicious or policy-violating links.
Non-Functional Requirements
- Redirects should be low latency.
- Short-code lookups should be highly available.
- Short codes should be unique.
- The system should tolerate traffic spikes on popular links.
- Analytics collection should not slow down redirects.
- Link creation should be rate limited to prevent spam.
- The system should avoid accidental reuse of active short codes.
- Data should be durable because broken links damage trust.
3. Core Engineering Challenges
| Challenge | Why it matters |
|---|---|
| Short-code uniqueness | Two long URLs must not accidentally receive the same active short code. |
| Low-latency redirects | Every redirect adds delay before the user reaches the destination. |
| Read-heavy workload | Popular links can receive far more reads than writes. |
| Abuse prevention | URL shorteners are attractive for spam, phishing, and malware distribution. |
| Analytics pressure | Click tracking can create huge write volume. |
| Cache correctness | Disabled or expired links should not continue redirecting because of stale cache entries. |
| Hot keys | A single viral link can dominate cache and database traffic. |
Naive implementations usually fail because they put too much work on the redirect path. If every redirect performs a database lookup, writes analytics synchronously, checks abuse systems synchronously, and updates counters inline, the path becomes slow and fragile.
4. High-Level Architecture
flowchart LR
Client[Client] --> CreateAPI[Create Link API]
CreateAPI --> Validator[URL validation and policy checks]
Validator --> CodeGen[Short-code generator]
CodeGen --> LinkDB[(Link database)]
CreateAPI --> User[Return short URL]
Visitor[Visitor] --> RedirectAPI[Redirect service]
RedirectAPI --> Cache[(Cache)]
Cache --> LinkDB
RedirectAPI --> EventQueue[Click event queue]
RedirectAPI --> Destination[Long URL]
EventQueue --> AnalyticsWorkers[Analytics workers]
AnalyticsWorkers --> AnalyticsStore[(Analytics store)]

The create path and redirect path have different priorities.
The create path cares about validation, uniqueness, durability, and abuse controls.
The redirect path cares about latency, availability, and safe degradation.
5. Core Components
Create Link API
The Create Link API owns the write path for new short links. Its job is not only to accept a URL and return a code; it is the boundary where the system decides whether the requested link is allowed to exist.
On a normal request, it validates the URL syntax, rejects unsupported schemes, checks user or IP-based rate limits, performs basic abuse checks, asks the short-code generator for a candidate code, and writes the mapping durably. If custom aliases are supported, this API also owns alias availability checks and conflict responses.
The important scaling detail is that link creation is usually not the highest-volume path, but it is riskier than it looks. Public creation endpoints attract spam and abuse. If this API does not enforce quotas and rate limits, attackers can create huge numbers of links, poison analytics, or use the platform for phishing. Operationally, teams would watch creation rate, rejection rate, collision retries, abuse-rule latency, and database insert failures.
This component should avoid doing expensive long-running checks synchronously unless the product requires it. A practical design may do fast validation inline and send deeper reputation or malware scans to asynchronous workers, marking suspicious links for later disablement.
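The rate limiting mentioned above is often a token bucket. A minimal sketch under stated assumptions (the capacity and refill rate are illustrative, and a production system would keep this state per user or per IP in a shared store such as Redis rather than in process memory):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, rate=1.0)  # 3-request burst, 1 request/second sustained
```

The bucket lets legitimate users create a few links quickly while forcing sustained creation down to the refill rate, which is exactly the shape of abuse this endpoint attracts.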
Redirect Service
The Redirect Service is the most latency-sensitive component in the system. It handles requests like /x7Kp9Q, extracts the short code, resolves the destination, verifies that the link is active, emits a click event, and returns an HTTP redirect.
This service should do as little synchronous work as possible. The user is waiting to reach the destination page, so every extra dependency adds visible delay. The service should not synchronously update analytics tables, run deep malware scans, or call multiple downstream services before redirecting. Those decisions belong in creation-time checks, cached policy state, or asynchronous processing.
At scale, the redirect service is usually horizontally replicated. Its main operational signals are p95/p99 latency, cache hit ratio, database fallback rate, redirect error rate, and event queue publish failures. If cache misses spike, the database can suddenly become overloaded. If the click-event queue slows down, the redirect path should have an explicit policy: buffer briefly, sample, or drop analytics rather than break redirects.
The response status code is part of this component's product contract. A permanent redirect or temporary redirect changes how much control the service keeps after the first visit. Permanent redirects can be cached aggressively by browsers and intermediaries, but that reduces server-side control. Temporary redirects keep control centralized, which is often valuable when links can expire, be disabled, or need analytics.
Short-Code Generator
The Short-Code Generator creates the compact identifier users see in the short URL. The generator must balance uniqueness, length, predictability, throughput, and operational simplicity.
There are several common strategies, and none is universally best. A random base62 token is easy for many application servers to generate independently, but collisions are possible and must be handled with a uniqueness check. A base62-encoded sequence is collision-free if IDs are allocated correctly, but it can reveal approximate creation volume and may introduce coordination around ID generation. Pre-generated pools let workers reserve batches of codes, which can reduce request-time generation work, but now the system must monitor pool exhaustion.
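The first two strategies are easy to sketch side by side. Both function names and the 7-character default are illustrative assumptions:

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase  # base62

def random_code(length: int = 7) -> str:
    """Random base62 token; collisions are possible, so the DB must enforce uniqueness."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def encode_base62(n: int) -> str:
    """Encode a sequential integer ID as base62; collision-free but reveals volume."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))
```

Note the tradeoff in miniature: `random_code` needs no coordination but no uniqueness guarantee, while `encode_base62` guarantees uniqueness only as long as something hands out each integer exactly once.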
The database uniqueness constraint is still important even with a good generator. It is the final safety rail. The generator proposes; the database confirms. If the insert fails because of a collision, the API retries with a new candidate.
Operationally, collision rate is a useful signal. A rising collision rate may mean the code space is getting crowded, the generator is biased, or a bug is producing repeated candidates.
Link Database
The Link Database is the source of truth for short-code mappings. It stores the destination URL, ownership metadata, status, expiration, and moderation state.
The dominant lookup is by short_code, so the short_code column needs a unique database index. Without that index, the redirect service would eventually be forced to scan far too much data to find a destination. With the index, resolution becomes a precise B-tree-backed key lookup.
The database should not be treated as the only performance layer for redirects. It is the durable authority, not necessarily the fastest read path. Caches and read replicas can absorb read load, but the primary database remains responsible for correctness during creation, updates, disablement, and expiration.
Important failure modes include primary database unavailability, replica lag, slow index lookups, and accidental full-table scans from poorly designed admin queries. Operators should watch query latency, lock contention, replication lag, connection pool saturation, and cache miss load.
Cache
The cache stores hot short_code -> destination mappings close to the redirect service, commonly using the cache-aside pattern. A cache hit lets the system redirect without touching the database, which lowers latency and protects the database during traffic spikes.
The cache should usually store enough information to make the redirect decision safely: destination URL, status, expiration, and possibly a version or updated timestamp. If the cache stores only the destination URL, it may accidentally redirect disabled or expired links until the entry expires.
The hard part is invalidation. When a link is disabled, edited, or expires, stale cache entries can become correctness bugs. A practical system may combine short TTLs, explicit invalidation messages, and versioned cache values. The right choice depends on whether stale redirects are merely annoying or actively dangerous.
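A cache-aside lookup that stores status alongside the destination might look like the following sketch (the dict-backed `cache`, the `db_lookup` stub, and the 60-second TTL are stand-ins for a real cache client and database, chosen here for illustration):

```python
import time

cache = {}      # short_code -> (expiry_ts, record); stand-in for Redis or similar
CACHE_TTL = 60  # a short TTL bounds how long a stale entry can keep redirecting

def db_lookup(code):
    # Stand-in for the indexed link-database query.
    db = {"x7Kp9Q": {"long_url": "https://example.com/landing", "status": "active"}}
    return db.get(code)

def resolve(code):
    """Cache-aside: try the cache, fall back to the database, populate on miss."""
    entry = cache.get(code)
    if entry and entry[0] > time.monotonic():
        record = entry[1]
    else:
        record = db_lookup(code)
        if record is not None:
            cache[code] = (time.monotonic() + CACHE_TTL, record)
    # Because status is cached too, disabled links stop redirecting within one TTL.
    if record is None or record["status"] != "active":
        return None
    return record["long_url"]
```

The TTL is the fallback correctness bound: even if an explicit invalidation message is lost, a disabled link redirects for at most `CACHE_TTL` seconds.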
Hot keys are another concern. A viral short link may receive enormous traffic. The cache helps, but the team should still monitor top keys, cache node load, and whether a single key is creating uneven pressure.
Click Event Queue
The Click Event Queue decouples redirect serving from analytics pipeline processing. Instead of writing analytics synchronously inside the redirect request, the service emits an event and lets workers process it later. This is a small version of an event stream: redirects publish facts, and analytics consumers build useful views from them.
This queue protects user-facing latency. If analytics storage slows down, redirects can continue as long as the queue can accept events. The queue also allows batching, replay, sampling, and separate consumer groups for different use cases such as dashboards, abuse detection, billing, or campaign reporting.
The queue must have an explicit durability policy. Some products require highly accurate click counts; others can tolerate sampled analytics during extreme load. That decision changes how the redirect service behaves when the queue is unavailable and what kind of backpressure policy it needs.
Operators should watch publish error rate, queue depth, consumer lag, event age, and dead-letter volume. Queue lag does not necessarily mean redirects are broken, but it means analytics are getting stale.
Analytics Workers
Analytics Workers consume click events and transform raw redirect activity into useful product data: total clicks, unique visitors, referrers, device breakdowns, country-level summaries, campaign metrics, and abuse signals.
These workers should usually follow the idempotent consumer pattern or deduplicate by event ID because event queues often deliver at least once. If a worker processes the same click event twice, counters can inflate unless the aggregation model accounts for duplicates.
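Deduplication by event ID can be sketched like this (the in-memory `processed` set stands in for a durable dedup store with a retention window; field names are illustrative):

```python
processed = set()  # event_id dedup store; production would use a durable store with TTL
clicks = {}        # short_code -> click count

def handle_click(event: dict) -> bool:
    """Process each click event at most once per event_id (idempotent consumer)."""
    if event["event_id"] in processed:
        return False  # duplicate delivery from the at-least-once queue; skip it
    processed.add(event["event_id"])
    clicks[event["short_code"]] = clicks.get(event["short_code"], 0) + 1
    return True

handle_click({"event_id": "e1", "short_code": "x7Kp9Q"})
handle_click({"event_id": "e1", "short_code": "x7Kp9Q"})  # redelivered duplicate
# clicks["x7Kp9Q"] stays at 1 despite two deliveries
```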
Analytics storage should be separated from the primary link database. The access pattern is different: analytics is append-heavy and aggregation-heavy, while the link database is lookup-heavy and correctness-sensitive. Mixing these workloads too early can make the main redirect system fragile.
At scale, analytics workers may process events in batches, aggregate by time windows, and write rollups instead of updating one counter per event. Teams should watch consumer lag, aggregation delay, duplicate rate, storage write latency, and the freshness of dashboards.
Abuse and Policy System
The Abuse and Policy System exists because public URL shorteners are attractive to spammers, phishers, and malware distributors. This is not a side concern; it is part of the core product safety model.
At creation time, the system can perform fast checks: allowed URL schemes, domain blocklists, user reputation, account age, and rate-limit state. Deeper checks, such as crawling the destination or consulting slower external reputation services, may happen asynchronously after creation.
On the redirect path, policy checks need to be fast. The redirect service should be able to use cached policy state from the link record, such as active, disabled, expired, or under_review. Calling a slow abuse service for every redirect would make the critical path fragile.
This component introduces a product tradeoff. Aggressive blocking reduces abuse but can create false positives. Loose blocking improves user convenience but may harm trust and safety. Operators should monitor abuse reports, false-positive rates, policy check latency, disabled-link redirects, and manual review queues.
6. Data Modeling
The core table can be simple.
CREATE TABLE links (
id BIGINT PRIMARY KEY,
short_code VARCHAR(16) NOT NULL UNIQUE,
long_url TEXT NOT NULL,
owner_id BIGINT,
status VARCHAR(32) NOT NULL,
created_at TIMESTAMP NOT NULL,
expires_at TIMESTAMP,
last_checked_at TIMESTAMP
);
Important indexes:
| Index | Purpose |
|---|---|
| UNIQUE(short_code) | Guarantees code uniqueness and supports redirect lookup. |
| (owner_id, created_at) | Supports user link management pages. |
| (expires_at) | Supports cleanup or expiration scans. |
| (status) | Supports moderation and operational views. |
The redirect path primarily needs fast lookup by short_code.
The analytics model should usually be separate:
click_events
- event_id
- short_code
- occurred_at
- user_agent
- referrer
- ip_prefix_or_geo
Raw click events can be high volume, so they are usually processed asynchronously into aggregated views.
7. Request Lifecycle
Create Short Link
- Client submits a long URL.
- API validates URL syntax and allowed schemes.
- API checks rate limits and abuse rules.
- Short-code generator proposes a code.
- Database insert attempts to store the mapping.
- If the code collides, generation retries with a new code.
- API returns the short URL.
sequenceDiagram
participant Client
participant API
participant DB
Client->>API: Submit long URL
API->>API: Validate and rate limit
API->>API: Generate short code
API->>DB: Insert mapping with unique code
alt Collision
DB-->>API: Unique constraint violation
API->>API: Generate another code
API->>DB: Retry insert
end
DB-->>API: Insert accepted
API-->>Client: Return short URL
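The flow above, including the collision retry, can be sketched in a few lines (the in-memory `taken` set stands in for the database's UNIQUE(short_code) constraint, and the retry limit is an illustrative choice):

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
taken = set()  # stand-in for the UNIQUE(short_code) constraint

def insert_mapping(code: str, long_url: str) -> bool:
    """Simulate a DB insert that fails on a unique-constraint violation."""
    if code in taken:
        return False
    taken.add(code)
    return True

def create_short_link(long_url: str, max_retries: int = 5) -> str:
    """Propose random codes until the durable insert accepts one."""
    for _ in range(max_retries):
        code = "".join(secrets.choice(ALPHABET) for _ in range(7))
        if insert_mapping(code, long_url):
            return code
    raise RuntimeError("could not allocate a unique short code")
```

Bounding the retries matters: a rising retry count is the collision-rate signal mentioned earlier, and an unbounded loop would hide it.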
Redirect Short Link
- Visitor requests the short URL.
- Redirect service extracts the short code.
- Service checks cache.
- On cache miss, service reads the link database.
- Service verifies status and expiration.
- Service emits a click event asynchronously.
- Service returns an HTTP redirect.
The redirect response is commonly 301, 302, 307, or 308, depending on product needs. Permanent redirects can be cached aggressively by clients and intermediaries, which is useful for stable links but risky if destinations can change. Temporary redirects preserve more control for mutable links and analytics-heavy products.
8. Scaling Problems
Redirect Volume
Redirects can be orders of magnitude more frequent than link creation. The redirect service should be horizontally scalable and should avoid unnecessary synchronous dependencies.
Database Read Pressure
If every redirect hits the primary database, the database becomes a bottleneck. A cache and read replicas can absorb most read traffic.
Hot Links
A link shared by a celebrity, news event, or large marketing campaign can become a hot key. Caches help, but a single extremely hot key can still create uneven load. The system should monitor hot keys and keep the redirect path lightweight.
Analytics Write Amplification
Click analytics can create one write per redirect. At high scale, that is expensive and can overwhelm transactional databases. Queue-based ingestion and batch aggregation are safer.
Collision Handling
Short codes are finite. Random generation must handle collisions with a database uniqueness check or a reserved-code system.
Abuse Load
Attackers can create many links or generate traffic to malicious destinations. Rate limiting, reputation systems, and moderation tooling become part of the architecture, not optional polish.
9. Distributed Systems Concepts
Caching
Caching improves redirect latency and reduces database pressure. The key correctness challenge is invalidation: disabled, expired, or updated links must not remain valid forever in cache.
Sharding
At very large scale, link data may be partitioned across database shards. A common approach is to shard by short code or internal link ID. The partition key should match the main access pattern.
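Hash-based shard selection on the partition key can be sketched as follows (the shard count is illustrative, and plain modulo is shown for clarity; real systems often prefer consistent hashing so that resharding does not remap every key):

```python
import zlib

NUM_SHARDS = 8  # illustrative shard count

def shard_for(short_code: str) -> int:
    """Hash the partition key (short_code) to pick a shard deterministically."""
    return zlib.crc32(short_code.encode()) % NUM_SHARDS
```

Because the redirect path always knows the short code, this scheme routes every lookup to exactly one shard, which is why the partition key should match the dominant access pattern.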
Read Replicas
Read replicas can serve redirect lookups, but replication lag matters. If a newly created link is read from a lagging replica, the redirect may briefly fail unless the system uses read-after-write strategies.
Idempotency
If the create API accepts client retries, idempotency keys can prevent duplicate short links for the same creation request.
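An idempotency-key check can be sketched like this (the dict-backed result store and the `new_short_code` stub are stand-ins for a durable store and the real generator-plus-insert path):

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
idempotency_results = {}  # idempotency key -> previously returned short code

def new_short_code() -> str:
    # Stand-in for the real generator and durable insert.
    return "".join(secrets.choice(ALPHABET) for _ in range(7))

def create_with_idempotency(key: str, long_url: str) -> str:
    """Client retries carrying the same key get the same short code back."""
    if key in idempotency_results:
        return idempotency_results[key]
    code = new_short_code()
    idempotency_results[key] = code
    return code

a = create_with_idempotency("req-123", "https://example.com")
b = create_with_idempotency("req-123", "https://example.com")  # network retry
# a == b: the retry does not allocate a second short code
```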
Backpressure
Click event queues need backpressure controls. If analytics workers lag, redirects should continue while the system preserves or deliberately samples events according to product requirements.
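A deliberate sampling policy under queue pressure might look like the sketch below (the depth threshold, sample rate, and `sampled` marker field are illustrative assumptions about one possible policy):

```python
import random

QUEUE_LIMIT = 10_000  # illustrative depth threshold before sampling kicks in
SAMPLE_RATE = 0.1     # keep roughly 10% of events while under pressure

def maybe_publish(event: dict, queue_depth: int, publish) -> bool:
    """Publish normally; under backpressure, sample events instead of blocking."""
    if queue_depth < QUEUE_LIMIT:
        publish(event)
        return True
    if random.random() < SAMPLE_RATE:
        event["sampled"] = True  # mark it so analytics can scale counts back up
        publish(event)
        return True
    return False  # dropped by explicit policy, never by breaking the redirect
```

The key property is that the redirect path never blocks on the queue: events are either published, sampled, or consciously dropped, and the choice is visible in metrics.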
10. Reliability & Failure Handling
| Failure | Impact | Mitigation |
|---|---|---|
| Cache unavailable | Redirect service hits database more often | Fail open to database with load protection |
| Database unavailable | Redirects may fail on cache miss | High cache hit ratio, replicas, graceful error page |
| Analytics queue unavailable | Click events may be delayed or dropped | Local buffering, sampling policy, clear durability goals |
| Abuse service unavailable | Link creation decisions become risky | Conservative fallback for creation, cached policy for redirect |
| Replica lag | Newly created links may not resolve | Read-after-write from primary or cache newly created mapping |
| Bad deploy | Redirect path breaks user traffic | Canary deploys, fast rollback, synthetic checks |
The redirect path deserves strong monitoring:
- redirect latency
- cache hit ratio
- database lookup latency
- error rate by status
- queue lag for click events
- top hot keys
- abuse rejection rates
11. Real-World Company Approaches
Public URL shorteners and link platforms generally optimize heavily around redirect availability and latency. The exact internals vary and should not be assumed without public engineering sources.
A company operating this at large scale would commonly:
- place redirect services close to users through edge locations or regional deployments
- keep hot mappings cached
- separate analytics ingestion from redirects
- rate limit creation endpoints
- invest in abuse detection
- use observability to detect hot links and traffic spikes
- avoid putting expensive work directly in the redirect path
The general principle is stable: redirects are the critical path, and secondary work should be decoupled.
12. Tradeoffs & Alternatives
Random Short Codes
Random codes are easy to generate independently, but collisions must be handled. Longer codes reduce collision probability.
Sequential IDs Encoded As Base62
Sequential IDs are simple and collision-free if generated centrally, but they can reveal creation volume and may require coordination or ID allocation.
Hash-Based Codes
Hashing a long URL can make repeated URLs map predictably, but collisions still exist and product requirements may not want different users sharing the same short code for the same destination.
301 vs 302 Redirects
301 can be faster for stable permanent links because clients and intermediaries may cache it. 302 keeps more server-side control for mutable destinations, analytics, expiration, and abuse intervention.
Synchronous Analytics
Synchronous analytics are simpler but slow down redirects. Asynchronous analytics are more scalable but introduce queueing, replay, deduplication, and delayed metrics.
13. Evolution Path
- Start with one database table keyed by short_code.
- Add a unique index for short-code safety.
- Add basic validation and rate limiting.
- Add cache for redirect lookups.
- Add asynchronous click event ingestion.
- Add read replicas for redirect reads.
- Add abuse detection and moderation workflows.
- Add sharding when the link table or write volume exceeds one database's practical limits.
- Add regional redirect services or edge caching for global latency.
- Add richer analytics pipelines and hot-key mitigation.
The system should not begin with every advanced feature. It should evolve as bottlenecks become real.
14. Key Engineering Lessons
- URL shorteners are read-heavy systems with a latency-sensitive redirect path.
- The short code must be unique, indexed, and fast to resolve.
- Analytics should not block redirects.
- Abuse prevention is part of the core system because public link creation invites misuse.
- Cache invalidation matters because disabled and expired links are correctness-sensitive.
- The right redirect status code depends on whether the product values cacheability or server-side control.
- Sharding strategy should follow the dominant lookup pattern.
- A simple product can still teach many backend fundamentals.