System Design
URL Shortener System
Design a reliable URL shortening service that supports short-code generation, low-latency redirects, abuse controls, analytics, and horizontal scaling.
Study path
Read these in order
Start with the mechanics, then move into the patterns that explain why the system is shaped this way.
Concepts Covered
- Short-code generation
- HTTP redirects
- Database indexing
- B-tree-backed lookups
- Caching and cache invalidation
- Rate limiting
- Read replicas
- Sharding and partition keys
- Hot key mitigation
- Abuse prevention
- Analytics pipelines
- Cache-aside
- Idempotent consumers
- Dead-letter queues
- Synchronous vs asynchronous write paths
1. Introduction
A URL shortener converts a long URL into a compact URL that redirects users to the original destination.
Example:
https://example.com/a/very/long/product/path?campaign=spring
becomes:
https://arc.fl/x7Kp9Q
At first, this seems like a simple key-value lookup: store short_code -> long_url, then redirect. That intuition is useful, but a production system quickly becomes more interesting. It needs low-latency reads, reliable short-code generation, abuse controls, analytics, expiration policies, custom aliases, and a data model that can survive very high read traffic.
The most important mental model: URL creation is usually write-light, while redirect traffic can become read-heavy and latency-sensitive. This is why the page keeps separating creation-time concerns like short-code generation, validation, and rate limiting from redirect-time concerns like HTTP redirects, caching, and asynchronous analytics.
2. Product Requirements
Functional Requirements
- Users can submit a long URL and receive a short URL.
- Visiting a short URL redirects to the original long URL.
- The system can reject invalid, unsafe, or abusive URLs.
- Short links can optionally expire.
- Users can optionally request custom aliases, depending on product scope.
- The system records click events for analytics.
- Operators can disable malicious or policy-violating links.
Non-Functional Requirements
- Redirects should be low latency.
- Short-code lookups should be highly available.
- Short codes should be unique.
- The system should tolerate traffic spikes on popular links.
- Analytics collection should not slow down redirects.
- Link creation should be rate limited to prevent spam.
- The system should avoid accidental reuse of active short codes.
- Data should be durable because broken links damage trust.
3. Core Engineering Challenges
| Challenge | Why it matters |
|---|---|
| Short-code uniqueness | Two long URLs must not accidentally receive the same active short code. |
| Low-latency redirects | Every redirect adds delay before the user reaches the destination. |
| Read-heavy workload | Popular links can receive far more reads than writes. |
| Abuse prevention | URL shorteners are attractive for spam, phishing, and malware distribution. |
| Analytics pressure | Click tracking can create huge write volume. |
| Cache correctness | Disabled or expired links should not continue redirecting because of stale cache entries. |
| Hot keys | A single viral link can dominate cache and database traffic. |
Naive implementations usually fail because they put too much work on the redirect path. If every redirect performs a database lookup, writes analytics synchronously, checks abuse systems synchronously, and updates counters inline, the path becomes slow and fragile.
4. High-Level Architecture
flowchart LR
Client[Client] --> CreateAPI[Create Link API]
CreateAPI --> Validator[URL validation and policy checks]
Validator --> CodeGen[Short-code generator]
CodeGen --> LinkDB[(Link database)]
CreateAPI --> User[Return short URL]
Visitor[Visitor] --> RedirectAPI[Redirect service]
RedirectAPI --> Cache[(Cache)]
Cache --> LinkDB
RedirectAPI --> EventQueue[Click event queue]
RedirectAPI --> Destination[Long URL]
EventQueue --> AnalyticsWorkers[Analytics workers]
AnalyticsWorkers --> AnalyticsStore[(Analytics store)]

The create path and redirect path have different priorities.
The create path cares about validation, uniqueness, durability, and abuse controls.
The redirect path cares about latency, availability, and safe degradation.
5. Core Components
Create Link API
The Create Link API owns the write path for new short links. Its job is not only to accept a URL and return a code; it is the boundary where the system decides whether the requested link is allowed to exist.
On a normal request, it validates the URL syntax, rejects unsupported schemes, checks user or IP-based rate limits, performs basic abuse checks, asks the short-code generator for a candidate code, and writes the mapping durably. If custom aliases are supported, this API also owns alias availability checks and conflict responses.
The important scaling detail is that link creation is usually not the highest-volume path, but it is riskier than it looks. Public creation endpoints attract spam and abuse. If this API does not enforce quotas and rate limits, attackers can create huge numbers of links, poison analytics, or use the platform for phishing. Operationally, teams would watch creation rate, rejection rate, collision retries, abuse-rule latency, and database insert failures.
This component should avoid doing expensive long-running checks synchronously unless the product requires it. A practical design may do fast validation inline and send deeper reputation or malware scans to asynchronous workers, marking suspicious links for later disablement.
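The rate limiting mentioned above is often a token bucket. A minimal sketch under stated assumptions (the capacity and refill rate are illustrative, and a production system would keep this state per user or per IP in a shared store such as Redis rather than in process memory):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, rate=1.0)  # 3-request burst, 1 request/second sustained
```

The bucket lets legitimate users create a few links quickly while forcing sustained creation down to the refill rate, which is exactly the shape of abuse this endpoint attracts.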
Redirect Service
The Redirect Service is the most latency-sensitive component in the system. It handles requests like /x7Kp9Q, extracts the short code, resolves the destination, verifies that the link is active, emits a click event, and returns an HTTP redirect.
This service should do as little synchronous work as possible. The user is waiting to reach the destination page, so every extra dependency adds visible delay. The service should not synchronously update analytics tables, run deep malware scans, or call multiple downstream services before redirecting. Those decisions belong in creation-time checks, cached policy state, or asynchronous processing.
At scale, the redirect service is usually horizontally replicated. Its main operational signals are p95/p99 latency, cache hit ratio, database fallback rate, redirect error rate, and event queue publish failures. If cache misses spike, the database can suddenly become overloaded. If the click-event queue slows down, the redirect path should have an explicit policy: buffer briefly, sample, or drop analytics rather than break redirects.
The response status code is part of this component's product contract. A permanent redirect or temporary redirect changes how much control the service keeps after the first visit. Permanent redirects can be cached aggressively by browsers and intermediaries, but that reduces server-side control. Temporary redirects keep control centralized, which is often valuable when links can expire, be disabled, or need analytics.
Short-Code Generator
The Short-Code Generator creates the compact identifier users see in the short URL. The generator must balance uniqueness, length, predictability, throughput, and operational simplicity.
There are several common strategies, and none is universally best. A random base62 token is easy for many application servers to generate independently, but collisions are possible and must be handled with a uniqueness check. A base62-encoded sequence is collision-free if IDs are allocated correctly, but it can reveal approximate creation volume and may introduce coordination around ID generation. Pre-generated pools let workers reserve batches of codes, which can reduce request-time generation work, but now the system must monitor pool exhaustion.
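The first two strategies are easy to sketch side by side. Both function names and the 7-character default are illustrative assumptions:

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase  # base62

def random_code(length: int = 7) -> str:
    """Random base62 token; collisions are possible, so the DB must enforce uniqueness."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def encode_base62(n: int) -> str:
    """Encode a sequential integer ID as base62; collision-free but reveals volume."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))
```

Note the tradeoff in miniature: `random_code` needs no coordination but no uniqueness guarantee, while `encode_base62` guarantees uniqueness only as long as something hands out each integer exactly once.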
The database uniqueness constraint is still important even with a good generator. It is the final safety rail. The generator proposes; the database confirms. If the insert fails because of a collision, the API retries with a new candidate.
Operationally, collision rate is a useful signal. A rising collision rate may mean the code space is getting crowded, the generator is biased, or a bug is producing repeated candidates.
Link Database
The Link Database is the source of truth for short-code mappings. It stores the destination URL, ownership metadata, status, expiration, and moderation state.
The dominant lookup is by short_code, so the short_code column needs a unique database index. Without that index, the redirect service would eventually be forced to scan far too much data to find a destination. With the index, resolution becomes a precise B-tree-backed key lookup.
The database should not be treated as the only performance layer for redirects. It is the durable authority, not necessarily the fastest read path. Caches and read replicas can absorb read load, but the primary database remains responsible for correctness during creation, updates, disablement, and expiration.
Important failure modes include primary database unavailability, replica lag, slow index lookups, and accidental full-table scans from poorly designed admin queries. Operators should watch query latency, lock contention, replication lag, connection pool saturation, and cache miss load.
Cache
The cache stores hot short_code -> destination mappings close to the redirect service, commonly using the cache-aside pattern. A cache hit lets the system redirect without touching the database, which lowers latency and protects the database during traffic spikes.
The cache should usually store enough information to make the redirect decision safely: destination URL, status, expiration, and possibly a version or updated timestamp. If the cache stores only the destination URL, it may accidentally redirect disabled or expired links until the entry expires.
The hard part is invalidation. When a link is disabled, edited, or expires, stale cache entries can become correctness bugs. A practical system may combine short TTLs, explicit invalidation messages, and versioned cache values. The right choice depends on whether stale redirects are merely annoying or actively dangerous.
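A cache-aside lookup that stores status alongside the destination might look like the following sketch (the dict-backed `cache`, the `db_lookup` stub, and the 60-second TTL are stand-ins for a real cache client and database, chosen here for illustration):

```python
import time

cache = {}      # short_code -> (expiry_ts, record); stand-in for Redis or similar
CACHE_TTL = 60  # a short TTL bounds how long a stale entry can keep redirecting

def db_lookup(code):
    # Stand-in for the indexed link-database query.
    db = {"x7Kp9Q": {"long_url": "https://example.com/landing", "status": "active"}}
    return db.get(code)

def resolve(code):
    """Cache-aside: try the cache, fall back to the database, populate on miss."""
    entry = cache.get(code)
    if entry and entry[0] > time.monotonic():
        record = entry[1]
    else:
        record = db_lookup(code)
        if record is not None:
            cache[code] = (time.monotonic() + CACHE_TTL, record)
    # Because status is cached too, disabled links stop redirecting within one TTL.
    if record is None or record["status"] != "active":
        return None
    return record["long_url"]
```

The TTL is the fallback correctness bound: even if an explicit invalidation message is lost, a disabled link redirects for at most `CACHE_TTL` seconds.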
Hot keys are another concern. A viral short link may receive enormous traffic. The cache helps, but the team should still monitor top keys, cache node load, and whether a single key is creating uneven pressure.
Click Event Queue
The Click Event Queue decouples redirect serving from analytics pipeline processing. Instead of writing analytics synchronously inside the redirect request, the service emits an event and lets workers process it later. This is a small version of an event stream: redirects publish facts, and analytics consumers build useful views from them.
This queue protects user-facing latency. If analytics storage slows down, redirects can continue as long as the queue can accept events. The queue also allows batching, replay, sampling, and separate consumer groups for different use cases such as dashboards, abuse detection, billing, or campaign reporting.
The queue must have an explicit durability policy. Some products require highly accurate click counts; others can tolerate sampled analytics during extreme load. That decision changes how the redirect service behaves when the queue is unavailable and what kind of backpressure policy it needs.
Operators should watch publish error rate, queue depth, consumer lag, event age, and dead-letter volume. Queue lag does not necessarily mean redirects are broken, but it means analytics are getting stale.
Analytics Workers
Analytics Workers consume click events and transform raw redirect activity into useful product data: total clicks, unique visitors, referrers, device breakdowns, country-level summaries, campaign metrics, and abuse signals.
These workers should usually follow the idempotent consumer pattern or deduplicate by event ID because event queues often deliver at least once. If a worker processes the same click event twice, counters can inflate unless the aggregation model accounts for duplicates.
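Deduplication by event ID can be sketched like this (the in-memory `processed` set stands in for a durable dedup store with a retention window; field names are illustrative):

```python
processed = set()  # event_id dedup store; production would use a durable store with TTL
clicks = {}        # short_code -> click count

def handle_click(event: dict) -> bool:
    """Process each click event at most once per event_id (idempotent consumer)."""
    if event["event_id"] in processed:
        return False  # duplicate delivery from the at-least-once queue; skip it
    processed.add(event["event_id"])
    clicks[event["short_code"]] = clicks.get(event["short_code"], 0) + 1
    return True

handle_click({"event_id": "e1", "short_code": "x7Kp9Q"})
handle_click({"event_id": "e1", "short_code": "x7Kp9Q"})  # redelivered duplicate
# clicks["x7Kp9Q"] stays at 1 despite two deliveries
```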
Analytics storage should be separated from the primary link database. The access pattern is different: analytics is append-heavy and aggregation-heavy, while the link database is lookup-heavy and correctness-sensitive. Mixing these workloads too early can make the main redirect system fragile.
At scale, analytics workers may process events in batches, aggregate by time windows, and write rollups instead of updating one counter per event. Teams should watch consumer lag, aggregation delay, duplicate rate, storage write latency, and the freshness of dashboards.
Abuse and Policy System
The Abuse and Policy System exists because public URL shorteners are attractive to spammers, phishers, and malware distributors. This is not a side concern; it is part of the core product safety model.
At creation time, the system can perform fast checks: allowed URL schemes, domain blocklists, user reputation, account age, and rate-limit state. Deeper checks, such as crawling the destination or consulting slower external reputation services, may happen asynchronously after creation.
On the redirect path, policy checks need to be fast. The redirect service should be able to use cached policy state from the link record, such as active, disabled, expired, or under_review. Calling a slow abuse service for every redirect would make the critical path fragile.
This component introduces a product tradeoff. Aggressive blocking reduces abuse but can create false positives. Loose blocking improves user convenience but may harm trust and safety. Operators should monitor abuse reports, false-positive rates, policy check latency, disabled-link redirects, and manual review queues.
6. Data Modeling
The core table can be simple.
CREATE TABLE links (
id BIGINT PRIMARY KEY,
short_code VARCHAR(16) NOT NULL UNIQUE,
long_url TEXT NOT NULL,
owner_id BIGINT,
status VARCHAR(32) NOT NULL,
created_at TIMESTAMP NOT NULL,
expires_at TIMESTAMP,
last_checked_at TIMESTAMP
);
Important indexes:
| Index | Purpose |
|---|---|
| UNIQUE(short_code) | Guarantees code uniqueness and supports redirect lookup. |
| (owner_id, created_at) | Supports user link management pages. |
| (expires_at) | Supports cleanup or expiration scans. |
| (status) | Supports moderation and operational views. |
The redirect path primarily needs fast lookup by short_code.
The analytics model should usually be separate:
click_events
- event_id
- short_code
- occurred_at
- user_agent
- referrer
- ip_prefix_or_geo
Raw click events can be high volume, so they are usually processed asynchronously into aggregated views.
7. Request Lifecycle
Create Short Link
- Client submits a long URL.
- API validates URL syntax and allowed schemes.
- API checks rate limits and abuse rules.
- Short-code generator proposes a code.
- Database insert attempts to store the mapping.
- If the code collides, generation retries with a new code.
- API returns the short URL.
sequenceDiagram
participant Client
participant API
participant DB
Client->>API: Submit long URL
API->>API: Validate and rate limit
API->>API: Generate short code
API->>DB: Insert mapping with unique code
alt Collision
DB-->>API: Unique constraint violation
API->>API: Generate another code
API->>DB: Retry insert
end
DB-->>API: Insert accepted
API-->>Client: Return short URL
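The flow above, including the collision retry, can be sketched in a few lines (the in-memory `taken` set stands in for the database's UNIQUE(short_code) constraint, and the retry limit is an illustrative choice):

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
taken = set()  # stand-in for the UNIQUE(short_code) constraint

def insert_mapping(code: str, long_url: str) -> bool:
    """Simulate a DB insert that fails on a unique-constraint violation."""
    if code in taken:
        return False
    taken.add(code)
    return True

def create_short_link(long_url: str, max_retries: int = 5) -> str:
    """Propose random codes until the durable insert accepts one."""
    for _ in range(max_retries):
        code = "".join(secrets.choice(ALPHABET) for _ in range(7))
        if insert_mapping(code, long_url):
            return code
    raise RuntimeError("could not allocate a unique short code")
```

Bounding the retries matters: a rising retry count is the collision-rate signal mentioned earlier, and an unbounded loop would hide it.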
Redirect Short Link
- Visitor requests the short URL.
- Redirect service extracts the short code.
- Service checks cache.
- On cache miss, service reads the link database.
- Service verifies status and expiration.
- Service emits a click event asynchronously.
- Service returns an HTTP redirect.
The redirect response is commonly 301, 302, 307, or 308, depending on product needs. Permanent redirects can be cached aggressively by clients and intermediaries, which is useful for stable links but risky if destinations can change. Temporary redirects preserve more control for mutable links and analytics-heavy products.
8. Scaling Problems
Redirect Volume
Redirects can be orders of magnitude more frequent than link creation. The redirect service should be horizontally scalable and should avoid unnecessary synchronous dependencies.
Database Read Pressure
If every redirect hits the primary database, the database becomes a bottleneck. A cache and read replicas can absorb most read traffic.
Hot Links
A link shared by a celebrity, news event, or large marketing campaign can become a hot key. Caches help, but a single extremely hot key can still create uneven load. The system should monitor hot keys and keep the redirect path lightweight.
Analytics Write Amplification
Click analytics can create one write per redirect. At high scale, that is expensive and can overwhelm transactional databases. Queue-based ingestion and batch aggregation are safer.
Collision Handling
Short codes are finite. Random generation must handle collisions with a database uniqueness check or a reserved-code system.
Abuse Load
Attackers can create many links or generate traffic to malicious destinations. Rate limiting, reputation systems, and moderation tooling become part of the architecture, not optional polish.
9. Distributed Systems Concepts
Caching
Caching improves redirect latency and reduces database pressure. The key correctness challenge is invalidation: disabled, expired, or updated links must not remain valid forever in cache.
Sharding
At very large scale, link data may be partitioned across database shards. A common approach is to shard by short code or internal link ID. The partition key should match the main access pattern.
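Hash-based shard selection on the partition key can be sketched as follows (the shard count is illustrative, and plain modulo is shown for clarity; real systems often prefer consistent hashing so that resharding does not remap every key):

```python
import zlib

NUM_SHARDS = 8  # illustrative shard count

def shard_for(short_code: str) -> int:
    """Hash the partition key (short_code) to pick a shard deterministically."""
    return zlib.crc32(short_code.encode()) % NUM_SHARDS
```

Because the redirect path always knows the short code, this scheme routes every lookup to exactly one shard, which is why the partition key should match the dominant access pattern.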
Read Replicas
Read replicas can serve redirect lookups, but replication lag matters. If a newly created link is read from a lagging replica, the redirect may briefly fail unless the system uses read-after-write strategies.
Idempotency
If the create API accepts client retries, idempotency keys can prevent duplicate short links for the same creation request.
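An idempotency-key check can be sketched like this (the dict-backed result store and the `new_short_code` stub are stand-ins for a durable store and the real generator-plus-insert path):

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
idempotency_results = {}  # idempotency key -> previously returned short code

def new_short_code() -> str:
    # Stand-in for the real generator and durable insert.
    return "".join(secrets.choice(ALPHABET) for _ in range(7))

def create_with_idempotency(key: str, long_url: str) -> str:
    """Client retries carrying the same key get the same short code back."""
    if key in idempotency_results:
        return idempotency_results[key]
    code = new_short_code()
    idempotency_results[key] = code
    return code

a = create_with_idempotency("req-123", "https://example.com")
b = create_with_idempotency("req-123", "https://example.com")  # network retry
# a == b: the retry does not allocate a second short code
```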
Backpressure
Click event queues need backpressure controls. If analytics workers lag, redirects should continue while the system preserves or deliberately samples events according to product requirements.
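A deliberate sampling policy under queue pressure might look like the sketch below (the depth threshold, sample rate, and `sampled` marker field are illustrative assumptions about one possible policy):

```python
import random

QUEUE_LIMIT = 10_000  # illustrative depth threshold before sampling kicks in
SAMPLE_RATE = 0.1     # keep roughly 10% of events while under pressure

def maybe_publish(event: dict, queue_depth: int, publish) -> bool:
    """Publish normally; under backpressure, sample events instead of blocking."""
    if queue_depth < QUEUE_LIMIT:
        publish(event)
        return True
    if random.random() < SAMPLE_RATE:
        event["sampled"] = True  # mark it so analytics can scale counts back up
        publish(event)
        return True
    return False  # dropped by explicit policy, never by breaking the redirect
```

The key property is that the redirect path never blocks on the queue: events are either published, sampled, or consciously dropped, and the choice is visible in metrics.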
10. Reliability & Failure Handling
| Failure | Impact | Mitigation |
|---|---|---|
| Cache unavailable | Redirect service hits database more often | Fail open to database with load protection |
| Database unavailable | Redirects may fail on cache miss | High cache hit ratio, replicas, graceful error page |
| Analytics queue unavailable | Click events may be delayed or dropped | Local buffering, sampling policy, clear durability goals |
| Abuse service unavailable | Link creation decisions become risky | Conservative fallback for creation, cached policy for redirect |
| Replica lag | Newly created links may not resolve | Read-after-write from primary or cache newly created mapping |
| Bad deploy | Redirect path breaks user traffic | Canary deploys, fast rollback, synthetic checks |
The redirect path deserves strong monitoring:
- redirect latency
- cache hit ratio
- database lookup latency
- error rate by status
- queue lag for click events
- top hot keys
- abuse rejection rates
11. Real-World Company Approaches
Public URL shorteners and link platforms generally optimize heavily around redirect availability and latency. The exact internals vary and should not be assumed without public engineering sources.
A company operating this at large scale would commonly:
- place redirect services close to users through edge locations or regional deployments
- keep hot mappings cached
- separate analytics ingestion from redirects
- rate limit creation endpoints
- invest in abuse detection
- use observability to detect hot links and traffic spikes
- avoid putting expensive work directly in the redirect path
The general principle is stable: redirects are the critical path, and secondary work should be decoupled.
12. Tradeoffs & Alternatives
Random Short Codes
Random codes are easy to generate independently, but collisions must be handled. Longer codes reduce collision probability.
Sequential IDs Encoded As Base62
Sequential IDs are simple and collision-free if generated centrally, but they can reveal creation volume and may require coordination or ID allocation.
Hash-Based Codes
Hashing a long URL can make repeated URLs map predictably, but collisions still exist and product requirements may not want different users sharing the same short code for the same destination.
301 vs 302 Redirects
301 can be faster for stable permanent links because clients and intermediaries may cache it. 302 keeps more server-side control for mutable destinations, analytics, expiration, and abuse intervention.
Synchronous Analytics
Synchronous analytics are simpler but slow down redirects. Asynchronous analytics are more scalable but introduce queueing, replay, deduplication, and delayed metrics.
13. Evolution Path
- Start with one database table keyed by short_code.
- Add a unique index for short-code safety.
- Add basic validation and rate limiting.
- Add cache for redirect lookups.
- Add asynchronous click event ingestion.
- Add read replicas for redirect reads.
- Add abuse detection and moderation workflows.
- Add sharding when the link table or write volume exceeds one database's practical limits.
- Add regional redirect services or edge caching for global latency.
- Add richer analytics pipelines and hot-key mitigation.
The system should not begin with every advanced feature. It should evolve as bottlenecks become real.
14. Key Engineering Lessons
- URL shorteners are read-heavy systems with a latency-sensitive redirect path.
- The short code must be unique, indexed, and fast to resolve.
- Analytics should not block redirects.
- Abuse prevention is part of the core system because public link creation invites misuse.
- Cache invalidation matters because disabled and expired links are correctness-sensitive.
- The right redirect status code depends on whether the product values cacheability or server-side control.
- Sharding strategy should follow the dominant lookup pattern.
- A simple product can still teach many backend fundamentals.