Designing a URL shortener
From one API + one Postgres to a globally distributed redirect fabric — five revisions, each driven by the thing that broke.
Why this design question bites
A URL shortener looks trivial. Take a long URL, hand back a short code, 302 the short code to the long URL when someone hits it. Two endpoints, one row per link. The reason it shows up in interviews so often is that the naive version is a one-pager, and most of the obvious next steps push the cost into a place you don't notice until production traffic finds it.
Requirements we'll design against
- Functional: create a short link from a long URL; redirect it back to its destination; (optional) custom aliases, expiration, click counts.
- Non-functional: redirect latency under 100ms at p99; availability "never visibly down" (99.95%+); read-heavy workload at roughly 100:1; a small number of links will take a large share of traffic (power law).
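The 100:1 ratio turns into concrete numbers quickly. A minimal back-of-envelope sketch; the absolute write rate and row size are assumptions (the requirements only fix the ratio), so scale them to your own traffic:

```python
# Back-of-envelope sizing under the stated 100:1 read/write ratio.
WRITES_PER_SEC = 100                # assumed: ~100 link creations/s
READ_RATIO = 100                    # from the requirements: reads:writes ~ 100:1
SECONDS_PER_YEAR = 365 * 24 * 3600
BYTES_PER_ROW = 500                 # assumed: code + URL + metadata

redirects_per_sec = WRITES_PER_SEC * READ_RATIO            # 10,000 redirects/s
links_per_year = WRITES_PER_SEC * SECONDS_PER_YEAR         # ~3.15B rows/year
storage_gb_per_year = links_per_year * BYTES_PER_ROW / 1e9 # ~1.6 TB/year
```

Even at these modest assumed numbers, the read path dominates by two orders of magnitude, which is why every iteration below works on reads first.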
Kata setup
Scored characteristics
Archie reviews every URL shortener iteration against the same three quality attributes, then turns the lowest-scoring gaps into the next concrete design move.
Performance
The system's responsiveness and throughput under various loads. This includes latency, stress testing, peak analysis, and capacity planning.
Scalability
The ability of the system to handle an increasing number of users, transactions, or data volume, typically by adding or removing resources (elasticity).
Security
A broad characteristic covering data encryption (at rest and in transit), threat modeling, and overall protection against malicious attacks.
Five versions, five failure modes
Five iterations below. Each one is the smallest change that fixes the thing that broke in the previous version — no leaping ahead. Every diagram on this page is live; click around inline, or open it in Archie if you want to keep going.
v1 — The naive design
One API server, one Postgres, one row per link, short code derived from a BIGSERIAL id. For a side project, perfectly fine. Public-facing, it has three problems waiting to happen, and traffic will find all three pretty quickly:
- Coupled write path: the short code is derived from the returned id, so every create is INSERT-then-encode with the write transaction held open across both. Sequential codes are also trivially enumerable — anyone can walk the namespace and see your business volume. (Right-edge B-tree page contention from the monotonic PK shows up too at very high write concurrency, but the round-trip and enumeration problems bite first.)
- Single read path: every redirect hits Postgres. The connection pool runs out long before disk does.
- No hot-key insulation: one viral link can saturate the DB for everyone else trying to read.
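The coupled write path is easiest to see in code. A minimal sketch of v1's id-to-code step (the alphabet ordering is a choice, not a standard): the code is just the BIGSERIAL id in base62, so consecutive ids produce obviously consecutive codes.

```python
# v1's create path in miniature: short code = base62(row id).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))
```

Because `encode_base62(1000)` and `encode_base62(1001)` differ predictably, anyone who creates two links a minute apart can estimate your creation rate, which is the enumeration problem in a nutshell.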
Design a URL shortener that creates compact links, redirects read-heavy traffic quickly, and handles analytics and abuse safely.
v2 — Caching the hot reads
Reads are the first thing to fall over, because traffic is power-law: a tiny fraction of codes get a huge share of the hits. Stick Redis in front of Postgres, key by short code, set a TTL somewhere between an hour and a day. Long enough that hot keys actually stay cached, short enough that a stale entry can't haunt you forever.

TTL is not invalidation
If a user deletes or rotates a link, you have to DEL the cache entry yourself. Otherwise the old destination keeps redirecting until the TTL runs out. TTL is a safety net for bugs in your invalidation logic, not a replacement for it.
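The read path plus the explicit delete looks like this. A sketch only: an in-process dict stands in for Redis (SETEX/DEL) and a second dict stands in for the Postgres lookup.

```python
import time

TTL_SECONDS = 3600
cache: dict[str, tuple[str, float]] = {}   # code -> (long_url, expires_at)
db = {"abc123": "https://example.com/long"}  # stand-in for Postgres

def resolve(code: str) -> str | None:
    hit = cache.get(code)
    if hit and hit[1] > time.monotonic():
        return hit[0]                       # cache hit: Postgres never sees it
    url = db.get(code)                      # miss: fall through to the database
    if url is not None:
        cache[code] = (url, time.monotonic() + TTL_SECONDS)
    return url

def delete_link(code: str) -> None:
    db.pop(code, None)
    cache.pop(code, None)                   # the explicit DEL — TTL alone isn't enough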
v3 — Decoupling the write path
Reads are taken care of, but the create path is doing extra work. Two things hurt: deriving the short code from the returned id forces an INSERT-then-encode round trip with the write transaction held open across both steps, and sequential codes leak the namespace to anyone willing to increment. Both go away if you stop deriving the code from the database. Pre-mint random base62 codes in a separate Key Pool service, lease batches to API workers, and use the code itself as the primary key. As a side effect, random PKs scatter writes across the B-tree, so the right-edge page contention you'd otherwise hit at higher write concurrency never gets a chance to bite either.
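Minting and leasing can be sketched in a few lines. The batch bookkeeping is elided here (the real Key Pool persists issued batches so two workers can never overlap); the point is that code generation no longer touches the links table at all.

```python
import secrets

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODE_LEN = 7   # 62**7 ≈ 3.5T possible codes

def mint_code() -> str:
    # Cryptographically random, so codes are neither guessable nor enumerable.
    return "".join(secrets.choice(ALPHABET) for _ in range(CODE_LEN))

def lease_batch(size: int = 1000) -> list[str]:
    # Sketch: a real pool records the issued batch before handing it out.
    return [mint_code() for _ in range(size)]
```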
Don't coordinate to recover unused codes
If an API server crashes with codes still in its batch, those codes are gone. Don't try to recover them — against a ~3.5T keyspace they're a rounding error. Spend that engineering budget on the parts that matter. Handle the rare insert collision with INSERT … ON CONFLICT retry (about 36k expected over 500M random codes against 62⁷), and make the Key Pool the single source of truth for issued ranges so two workers can never lease the same batch.
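The retry loop is short. Sketched here with SQLite standing in for Postgres, since SQLite's upsert syntax mirrors `INSERT … ON CONFLICT DO NOTHING`; the worker just advances through its leased batch until an insert lands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the Postgres links table
conn.execute("CREATE TABLE links (code TEXT PRIMARY KEY, long_url TEXT NOT NULL)")

def create_link(long_url: str, code_source) -> str:
    # code_source yields pre-minted codes from the worker's leased batch.
    for code in code_source:
        cur = conn.execute(
            "INSERT INTO links (code, long_url) VALUES (?, ?) "
            "ON CONFLICT (code) DO NOTHING",
            (code, long_url),
        )
        if cur.rowcount == 1:   # insert landed; rowcount 0 means a collision
            return code
    raise RuntimeError("leased batch exhausted")
```

At ~36k expected collisions over 500M codes, the second iteration of this loop runs roughly once per fourteen thousand creates, which is why it isn't worth anything fancier.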
v4 — Pushing reads to the edge
A redirect is a 302 with a Location header (or a 301 with a short Cache-Control: max-age). That's a response any CDN can cache and serve without touching origin. Put the edge in front, give popular codes a few minutes of TTL, and the head of the distribution stops reaching you at all. Cold codes still fall through to origin like before. Behind the CDN, add Postgres read replicas so cold-miss reads don't compete with the write primary, which from here on mostly sees writes and the occasional warm-up fill.
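What origin hands the CDN is just a status and two headers. A sketch with illustrative values: `s-maxage` governs shared caches (the CDN), while the short browser `max-age` keeps an edited link from getting pinned client-side.

```python
# An edge-cacheable redirect: 302 plus a Cache-Control the CDN can honor.
def redirect_response(long_url: str, edge_ttl: int = 300) -> tuple[int, dict]:
    return 302, {
        "Location": long_url,
        "Cache-Control": f"public, max-age=0, s-maxage={edge_ttl}",
    }
```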
What edge TTLs actually buy you (and cost you)
A brand-new link isn't stuck behind the TTL. The edge just misses on it and falls through to origin like any other cold key. TTLs are doing two different jobs here. One is propagating updates and deletes: an edited or deleted link will keep redirecting to its old destination until the cached entry expires or you purge it, so editable links want short TTLs, explicit purges on mutation, and 302 rather than 301 (browsers cache 301 aggressively, well past anything you can control from the CDN). The other is negative caching: if someone happened to hit the code before it existed, the edge may have cached the 404. Keep negative TTLs short — seconds, not minutes — or new links genuinely will look unreachable for a while.
v5 — Analytics and abuse
Two product requirements show up on every shortener's roadmap eventually. People want click counts. And someone, sooner or later, will try to use your domain for phishing. Click analytics has a wrinkle that's easy to miss now that the CDN serves the head of the distribution: origin only sees cache misses, so any counter rooted at origin systematically undercounts the hottest links by the cache hit rate. There are three honest ways to capture clicks at the edge instead. Ship CDN access logs to a bucket and batch-ingest (cheapest, lossy on the long tail, lag of minutes). Run an edge worker — Cloudflare Workers, Lambda@Edge, Fastly Compute — that emits one event per redirect to a stream (every hit captured, costs per request). Or use whatever real-time log streaming your CDN exposes. Pick one explicitly and don't pretend origin can see what it can't.
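The batch-ingest flavor reduces to counting redirect events per code. A sketch; the field names are assumptions, so map them onto whatever schema your CDN's access logs actually use.

```python
from collections import Counter

# Aggregate per-code click counts from a batch of CDN access-log events,
# counting only served redirects (302s), not 404s or errors.
def count_clicks(events: list[dict]) -> Counter:
    return Counter(e["code"] for e in events if e.get("status") == 302)
```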
Abuse handling comes in three layers. WAF rate limits at the front door, so attackers can't bulk-mint thousands of phishing links in a minute. A synchronous threat-intel lookup on the create path, so known-bad destinations never get a short code in the first place. And an async scoring job over the click stream that catches the rest and feeds findings back into the WAF blocklist. The redirect path itself stays edge-cached and fast.
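The first two layers can be sketched as one gate on the create path. Everything here is illustrative: real deployments would enforce the rate limit at the WAF and call a threat-intel service rather than a local set, and the fixed window below ignores window expiry for brevity.

```python
from urllib.parse import urlparse

BLOCKLIST = {"evil.example"}           # fed back by the async scoring job
RATE_LIMIT = 10                        # creates per client per window (assumed)
_window_counts: dict[str, int] = {}

def allow_create(client_id: str, long_url: str) -> bool:
    host = urlparse(long_url).hostname or ""
    if host in BLOCKLIST:
        return False                   # known-bad destination: never mint a code
    _window_counts[client_id] = _window_counts.get(client_id, 0) + 1
    return _window_counts[client_id] <= RATE_LIMIT
```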
Known limitations
v5 is still single-region for writes. Going multi-region active-active is a much bigger lift than it looks: conflict-free code allocation across regions, per-region key pools, replicated abuse blocklists. Almost no shortener actually needs that. If yours does, scope it as its own project, not a tweak on the side of v5.
A few things we didn't dig into here: custom domains (mostly routing and TLS), per-user quotas (a rate-limit rule and a counter), and link ownership and ACLs (a column, an index, some middleware). Each is a small diff on top of v5.
What Archie unlocked, step by step
Every version above started from an Archie review of the previous one. Archie scored each characteristic, picked the lowest, and turned the gap into the next concrete design move: caching, then the key pool, then the edge, then async analytics and abuse handling. The table below is the cumulative scorecard.
Score lift under Archie's reviews
Each step is one Archie review — the gap it flagged became the next version's design move. Rows are ordered from the area Archie still nudges hardest to the one it took furthest, with the overall score on the bottom row. Deltas show the total lift from v1 to v5.
Security: 18 → 78 (+60)
Scalability: 28 → 86 (+58)
Performance: 34 → 90 (+56)
Overall: 27 → 84 (+57)