System Design · Lesson 21 of 26
Case Study: Design a URL Shortener (like bit.ly)
A URL shortener is one of the most common system design interview questions — not because the product is complex, but because a simple spec hides a surprising number of interesting trade-offs. This case study walks through the full design.
Requirements
Start every system design by nailing down the requirements before touching architecture.
Functional:
- Given a long URL, return a short URL (e.g., https://sys.fm/xK3p9)
- Visiting the short URL redirects to the original long URL
- Custom aliases (e.g., /my-blog) are optionally supported
- Links can optionally expire
Non-functional:
- Read-heavy: redirects vastly outnumber writes (100:1 ratio is typical)
- Low latency redirects — users feel every millisecond on a redirect
- High availability — a dead link is a bad user experience
- Analytics: click counts, geography, referrers (optional but common in interviews)
Scale estimates (back-of-envelope):
Write: 100M new URLs/day → ~1,200 writes/second
Read: 10B redirects/day → ~115,000 reads/second
Storage: 100M/day * 365 days * 5 years * 500 bytes/URL ≈ 90 TB over 5 years
The Core Problem: Short Code Generation
The most interesting design decision is how to generate the short code. You have three main options.
Option 1: Hash + Truncate
MD5 or SHA-256 the long URL, encode the digest (hex or base62), and take the first 6–7 characters.
MD5("https://example.com/very/long/url") = "1a2b3c4d5e6f..."
Short code = "1a2b3c" (first 6 chars of the digest)
Problem: Hash collisions. Two different URLs can produce the same prefix, so you need a collision check on every write; on a collision, try 7 chars, then 8, and so on.
When to use: Works fine for moderate scale where collisions are rare and a DB lookup on write is acceptable.
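A minimal sketch of the hex-prefix variant; exists() is a hypothetical placeholder for the database uniqueness check, and the retry loop simply lengthens the prefix on collision:

import hashlib

def hash_code(long_url: str, length: int) -> str:
    # MD5 the URL and keep the first `length` chars of the hex digest.
    return hashlib.md5(long_url.encode()).hexdigest()[:length]

def shorten(long_url: str, exists) -> str:
    # exists(code) is a hypothetical DB uniqueness check.
    # On collision, retry with a longer prefix before giving up.
    for length in range(6, 11):
        code = hash_code(long_url, length)
        if not exists(code):
            return code
    raise RuntimeError("no free code found")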
Option 2: Counter + Base62 Encoding
A global auto-increment counter. Every new URL gets the next number, encoded in base62 (digits 0-9, a-z, A-Z).
Counter: 1 → base62 → "1"
Counter: 1000000 → base62 → "4c92"
Counter: 56,800,235,583 → base62 → "ZZZZZZ" (the largest 6-char code; 62^6 ≈ 56.8B values)
Problem: The counter is a single point of failure and a bottleneck. Solutions:
- Use a distributed counter (Zookeeper, Redis INCR) — adds complexity and latency
- Pre-allocate ranges to app servers (each server gets 1M IDs at a time, hands them out locally)
When to use: Best when predictable, sequential IDs are acceptable. Sequential codes are guessable in order, so anyone can enumerate your links; fine for most services, not for private sharing.
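A sketch of this option, assuming the 0-9/a-z/A-Z alphabet used in the examples above; RangeAllocator is a hypothetical stand-in for whatever hands out ID blocks (a ZooKeeper sequence, a database row, etc.):

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    # Convert a counter value to a short code, e.g. 1000000 -> "4c92".
    if n == 0:
        return BASE62[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(BASE62[rem])
    return "".join(reversed(out))

class RangeAllocator:
    # Hypothetical stand-in for a coordination service that hands each
    # app server a block of IDs at a time.
    def __init__(self, block_size: int = 1_000_000):
        self.block_size = block_size
        self._next_start = 0

    def claim_block(self) -> range:
        start = self._next_start
        self._next_start += self.block_size
        return range(start, start + self.block_size)

class CodeGenerator:
    def __init__(self, allocator: RangeAllocator):
        self.allocator = allocator
        self._ids = iter(allocator.claim_block())

    def next_code(self) -> str:
        # IDs are handed out locally; the allocator is only contacted
        # once per block, so there is no hot shared counter.
        try:
            n = next(self._ids)
        except StopIteration:
            self._ids = iter(self.allocator.claim_block())
            n = next(self._ids)
        return base62_encode(n)

Each server burns through its block locally and only returns to the allocator once per million IDs, which keeps the shared counter off the write hot path.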
Option 3: Random Base62
Generate 6–7 random base62 characters and check for uniqueness in the DB.
import random, string
def generate_code(length=7):
    chars = string.ascii_letters + string.digits  # 62 chars
    return ''.join(random.choices(chars, k=length))
# 62^7 = ~3.5 trillion combinations
Problem: As the DB fills up, collision probability increases. At 10% fill (~350B entries for 7-char codes), each attempt has a ~10% chance of colliding. Usually acceptable with retry logic (see the sketch below).
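The retry wrapper on top of generate_code() above is a few lines; exists() is again a hypothetical uniqueness check (in practice, catching a unique-constraint violation on INSERT works just as well):

def unique_random_code(exists, length=7, max_tries=5):
    # exists(code) is a hypothetical DB uniqueness check.
    for _ in range(max_tries):
        code = generate_code(length)
        if not exists(code):
            return code
    # Several misses in a row stays vanishingly unlikely until the keyspace is very full.
    raise RuntimeError("too many collisions")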
Recommended approach for interviews: Counter with pre-allocated ranges. It's predictable, fast, and distributed without a single hot counter.
Data Model
Keep it simple:
CREATE TABLE urls (
  id BIGINT PRIMARY KEY,
  short_code VARCHAR(10) NOT NULL,
  long_url TEXT NOT NULL,
  user_id BIGINT,
  created_at TIMESTAMPTZ DEFAULT now(),
  expires_at TIMESTAMPTZ,
  click_count BIGINT DEFAULT 0
);
CREATE UNIQUE INDEX idx_short_code ON urls(short_code);
For analytics (if required):
CREATE TABLE clicks (
  id BIGINT PRIMARY KEY,
  short_code VARCHAR(10) NOT NULL,
  clicked_at TIMESTAMPTZ DEFAULT now(),
  ip_hash VARCHAR(64), -- hashed for privacy
  country VARCHAR(2),
  referrer TEXT,
  user_agent TEXT
);
Keep clicks in a separate table (or a separate service). Click volume dwarfs URL creation volume; don't let analytics writes block redirects.
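One way to keep analytics writes off the redirect path is to enqueue click events and persist them from a background worker. A toy sketch, using an in-process queue purely as a stand-in for a real message queue (Kafka, SQS, etc.); store_click() is a hypothetical persistence call:

import queue
import threading
import time

click_queue = queue.Queue()

def record_click(short_code, country, referrer):
    # Called from the redirect handler; returns immediately.
    click_queue.put({
        "short_code": short_code,
        "clicked_at": time.time(),
        "country": country,
        "referrer": referrer,
    })

def click_writer():
    # Background consumer: drains the queue and writes to the clicks table.
    while True:
        event = click_queue.get()
        # store_click(event)  # hypothetical insert into the clicks table
        click_queue.task_done()

threading.Thread(target=click_writer, daemon=True).start()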
Architecture
Browser → CDN/Edge → Load Balancer → API Servers → Cache (Redis)
↓ (cache miss)
DB Read Replica
Write path:
- API server generates short code
- Writes to primary DB
- Optionally warms the cache (SET short_code → long_url EX 86400)
Read path (the hot path):
- Request hits the CDN; if cached at the edge, it returns the 301/302 with no origin hit
- CDN miss → Redis lookup (sub-millisecond)
- Redis miss → DB read replica
- Returns the redirect (this path is sketched below)
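A compact sketch of the read path, assuming Flask and redis-py; db_lookup() is a hypothetical query against the read replica:

import redis
from flask import Flask, abort, redirect

app = Flask(__name__)
cache = redis.Redis()  # assumes a local Redis instance

def db_lookup(short_code):
    # Hypothetical read-replica query; returns the long URL or None.
    ...

@app.route("/<short_code>")
def follow(short_code):
    # 1. Try Redis first (sub-millisecond for hot codes).
    long_url = cache.get(short_code)
    if long_url is not None:
        long_url = long_url.decode()
    else:
        # 2. Fall back to the DB read replica and backfill the cache.
        long_url = db_lookup(short_code)
        if long_url is None:
            abort(404)
        cache.set(short_code, long_url, ex=86400)  # 24h TTL
    # 3. 302 so every click still reaches the origin (see 301 vs 302 below).
    return redirect(long_url, code=302)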
301 vs 302:
- 301 Permanent: the browser caches the redirect, so no future requests reach your servers. Good for performance, bad for analytics (you can't count clicks).
- 302 Temporary: every click hits your servers. Better for analytics, higher origin load.
Use 302 if you want click tracking. Use 301 for static links where analytics don't matter.
Caching Strategy
Redirects are the hot path. Cache aggressively.
Layer 1: CDN edge cache (Cloudflare, CloudFront). TTL: 1 hour for active links.
Layer 2: Redis (per data center). TTL: 24 hours, LRU eviction.
Layer 3: DB read replica. Hot rows live in the Postgres buffer pool anyway.
The 80/20 rule applies hard here: 20% of short codes account for 80% of traffic. A modest Redis cluster handles this.
Cache invalidation: when a URL is deleted or expires, you need to evict it from Redis. Either (both options are sketched after this list):
- Set TTL equal to expiry time at write time
- On delete, send a cache invalidation event via message queue
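Both options are small. A redis-py sketch; the TTL values and the message-queue fan-out are assumptions:

import time
import redis

cache = redis.Redis()
DEFAULT_TTL = 86400  # 24 hours

def warm_cache(short_code, long_url, expires_at=None):
    # Option 1: cap the TTL at the link's remaining lifetime so the
    # cache entry can never outlive the link itself.
    ttl = DEFAULT_TTL
    if expires_at is not None:
        ttl = min(ttl, max(1, int(expires_at - time.time())))
    cache.set(short_code, long_url, ex=ttl)

def invalidate(short_code):
    # Option 2: called by a consumer of the invalidation event; in a
    # multi-region setup each region's consumer evicts its own Redis.
    cache.delete(short_code)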
Expiry and Cleanup
Two mechanisms:
- Check-on-read: when handling a redirect, check expires_at. If expired, return 410 Gone. Fast, no background job needed.
- Background cleanup: a nightly job deletes expired rows (see the sketch after this list). It keeps the DB clean but is optional; expired rows that are never accessed waste space but don't cause bugs.
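The nightly cleanup can be a single DELETE run from a cron job. A psycopg2 sketch; the connection string is a placeholder:

import psycopg2

def purge_expired(dsn="dbname=shortener"):
    # Delete rows whose expiry has passed. On very large tables,
    # batching the deletes is kinder to the primary.
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "DELETE FROM urls WHERE expires_at IS NOT NULL AND expires_at < now()"
            )
        conn.commit()
    finally:
        conn.close()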
Abuse Prevention
URL shorteners are a favourite tool for spam and phishing.
Basic controls:
- Rate limit by IP: max 10 URL creations per minute per IP (see the sketch after this list)
- Rate limit by account: max 1,000 per day for free tier
- Blocklist of known malicious long URLs (Google Safe Browsing API)
- Custom alias validation: alphanumeric only, reserved words blocked (api, admin, login)
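The per-IP limit maps to a classic fixed-window counter in Redis. A sketch with redis-py; key names and limits are assumptions:

import redis

cache = redis.Redis()

def allow_create(ip, limit=10, window=60):
    # Fixed-window counter: INCR a per-IP key and expire it after `window` seconds.
    key = f"ratelimit:create:{ip}"
    count = cache.incr(key)
    if count == 1:
        cache.expire(key, window)
    return count <= limit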
For the interview: mentioning Safe Browsing API integration is a strong signal; it shows you're thinking about abuse, not just the happy path.
Scaling Discussion
| Bottleneck | Solution |
|-----------|---------|
| Write throughput | Horizontal scale API servers; pre-allocated ID ranges eliminate counter bottleneck |
| Read throughput | Redis cluster + CDN edge caching |
| DB storage | Partition clicks table by month; archive old data to cold storage |
| Single-region failure | Multi-region with DNS failover; Redis replication per region |
| Hot short codes | CDN caching means top 1% of codes never reach origin servers |
What Interviewers Are Actually Testing
- You nail the back-of-envelope before proposing any architecture
- You compare code-generation strategies — not just "use a hash"
- You separate the write path from the read path — they have different scaling requirements
- You use caching correctly — CDN → Redis → DB, not just "add Redis"
- You discuss the 301 vs 302 trade-off — this is a signal you've thought about analytics
- You mention abuse prevention — shows product thinking, not just engineering
Quick Reference
Short code: 6–7 base62 chars → 56B–3.5T combinations
Generation: Counter with pre-allocated ranges (fast, distributed)
DB: PostgreSQL, index on short_code
Cache: Redis (24h TTL) + CDN edge (1h TTL)
Redirect: 302 if analytics needed, 301 if not
Expiry: Check on read + background cleanup
Rate limiting: Per-IP and per-account, Safe Browsing for abuse
Scale: Read replicas for DB, Redis cluster for cache, CDN for edge