System Design · Lesson 21 of 26
Case Study: Design a URL Shortener (like bit.ly)
A URL shortener is one of the most common system design interview questions — not because the product is complex, but because a simple spec hides a surprising number of interesting trade-offs. This case study walks through the full design.
Requirements
Start every system design by nailing down the requirements before touching architecture.
Functional:
- Given a long URL, return a short URL (e.g., https://sys.fm/xK3p9)
- Visiting the short URL redirects to the original long URL
- Custom aliases (e.g., /my-blog) are optionally supported
- Links can optionally expire
Non-functional:
- Read-heavy: redirects vastly outnumber writes (100:1 ratio is typical)
- Low latency redirects — users feel every millisecond on a redirect
- High availability — a dead link is a bad user experience
- Analytics: click counts, geography, referrers (optional but common in interviews)
Scale estimates (back-of-envelope):
Write: 100M new URLs/day → ~1,200 writes/second
Read: 10B redirects/day → ~115,000 reads/second
Storage: 100M/day * 365 days * 5 years * 500 bytes/URL ≈ 90 TB over 5 years
The Core Problem: Short Code Generation
The most interesting design decision is how to generate the short code. You have three main options.
Option 1: Hash + Truncate
MD5 or SHA-256 the long URL, encode the digest (hex or base62), and take the first 6–7 characters.
MD5("https://example.com/very/long/url") = "1a2b3c4d5e6f..."
Short code = "1a2b3c" (first 6 chars of the digest)
Problem: Hash collisions. Two different URLs can produce the same prefix, so you need a collision check on every write; on a collision, try 7 chars, then 8, and so on.
When to use: Works fine for moderate scale where collisions are rare and a DB lookup on write is acceptable.
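A minimal sketch of the hex-prefix variant; exists() is a hypothetical placeholder for the database uniqueness check, and the retry loop simply lengthens the prefix on collision:

import hashlib

def hash_code(long_url: str, length: int) -> str:
    # MD5 the URL and keep the first `length` chars of the hex digest.
    return hashlib.md5(long_url.encode()).hexdigest()[:length]

def shorten(long_url: str, exists) -> str:
    # exists(code) is a hypothetical DB uniqueness check.
    # On collision, retry with a longer prefix before giving up.
    for length in range(6, 11):
        code = hash_code(long_url, length)
        if not exists(code):
            return code
    raise RuntimeError("no free code found")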
Option 2: Counter + Base62 Encoding
A global auto-increment counter. Every new URL gets the next number, encoded in base62 (digits 0-9, a-z, A-Z).
Counter: 1 → base62 → "1"
Counter: 1000000 → base62 → "4c92"
Counter: 56,800,235,583 → base62 → "ZZZZZZ" (the largest 6-char code; 62^6 ≈ 56.8B values)
Problem: The counter is a single point of failure and a bottleneck. Solutions:
- Use a distributed counter (Zookeeper, Redis INCR) — adds complexity and latency
- Pre-allocate ranges to app servers (each server gets 1M IDs at a time, hands them out locally)
When to use: Best when predictable, sequential IDs are acceptable. Sequential codes are guessable in order, so anyone can enumerate your links; fine for most services, not for private sharing.
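A sketch of this option, assuming the 0-9/a-z/A-Z alphabet used in the examples above; RangeAllocator is a hypothetical stand-in for whatever hands out ID blocks (a ZooKeeper sequence, a database row, etc.):

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    # Convert a counter value to a short code, e.g. 1000000 -> "4c92".
    if n == 0:
        return BASE62[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(BASE62[rem])
    return "".join(reversed(out))

class RangeAllocator:
    # Hypothetical stand-in for a coordination service that hands each
    # app server a block of IDs at a time.
    def __init__(self, block_size: int = 1_000_000):
        self.block_size = block_size
        self._next_start = 0

    def claim_block(self) -> range:
        start = self._next_start
        self._next_start += self.block_size
        return range(start, start + self.block_size)

class CodeGenerator:
    def __init__(self, allocator: RangeAllocator):
        self.allocator = allocator
        self._ids = iter(allocator.claim_block())

    def next_code(self) -> str:
        # IDs are handed out locally; the allocator is only contacted
        # once per block, so there is no hot shared counter.
        try:
            n = next(self._ids)
        except StopIteration:
            self._ids = iter(self.allocator.claim_block())
            n = next(self._ids)
        return base62_encode(n)

Each server burns through its block locally and only returns to the allocator once per million IDs, which keeps the shared counter off the write hot path.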
Option 3: Random Base62
Generate 6–7 random base62 characters and check for uniqueness in the DB.
import random, string
def generate_code(length=7):
    chars = string.ascii_letters + string.digits  # 62 chars
    return ''.join(random.choices(chars, k=length))
# 62^7 = ~3.5 trillion combinations
Problem: As the DB fills up, collision probability increases. At 10% fill (~350B entries for 7-char codes), each attempt has a ~10% chance of colliding. Usually acceptable with retry logic (see the sketch below).
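The retry wrapper on top of generate_code() above is a few lines; exists() is again a hypothetical uniqueness check (in practice, catching a unique-constraint violation on INSERT works just as well):

def unique_random_code(exists, length=7, max_tries=5):
    # exists(code) is a hypothetical DB uniqueness check.
    for _ in range(max_tries):
        code = generate_code(length)
        if not exists(code):
            return code
    # Several misses in a row stays vanishingly unlikely until the keyspace is very full.
    raise RuntimeError("too many collisions")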
Recommended approach for interviews: Counter with pre-allocated ranges. It's predictable, fast, and distributed without a single hot counter.
Data Model
Keep it simple:
CREATE TABLE urls (
  id BIGINT PRIMARY KEY,
  short_code VARCHAR(10) NOT NULL,
  long_url TEXT NOT NULL,
  user_id BIGINT,
  created_at TIMESTAMPTZ DEFAULT now(),
  expires_at TIMESTAMPTZ,
  click_count BIGINT DEFAULT 0
);
CREATE UNIQUE INDEX idx_short_code ON urls(short_code);
For analytics (if required):
CREATE TABLE clicks (
  id BIGINT PRIMARY KEY,
  short_code VARCHAR(10) NOT NULL,
  clicked_at TIMESTAMPTZ DEFAULT now(),
  ip_hash VARCHAR(64), -- hashed for privacy
  country VARCHAR(2),
  referrer TEXT,
  user_agent TEXT
);
Keep clicks in a separate table (or a separate service). Click volume dwarfs URL creation volume; don't let analytics writes block redirects.
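One way to keep analytics writes off the redirect path is to enqueue click events and persist them from a background worker. A toy sketch, using an in-process queue purely as a stand-in for a real message queue (Kafka, SQS, etc.); store_click() is a hypothetical persistence call:

import queue
import threading
import time

click_queue = queue.Queue()

def record_click(short_code, country, referrer):
    # Called from the redirect handler; returns immediately.
    click_queue.put({
        "short_code": short_code,
        "clicked_at": time.time(),
        "country": country,
        "referrer": referrer,
    })

def click_writer():
    # Background consumer: drains the queue and writes to the clicks table.
    while True:
        event = click_queue.get()
        # store_click(event)  # hypothetical insert into the clicks table
        click_queue.task_done()

threading.Thread(target=click_writer, daemon=True).start()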
Architecture
Browser → CDN/Edge → Load Balancer → API Servers → Cache (Redis)
↓ (cache miss)
DB Read Replica
Write path:
- API server generates short code
- Writes to primary DB
- Optionally warms the cache (SET short_code → long_url EX 86400)
Read path (the hot path):
- Request hits the CDN; if cached at the edge, it returns the 301/302 with no origin hit
- CDN miss → Redis lookup (sub-millisecond)
- Redis miss → DB read replica
- Returns the redirect (this path is sketched below)
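A compact sketch of the read path, assuming Flask and redis-py; db_lookup() is a hypothetical query against the read replica:

import redis
from flask import Flask, abort, redirect

app = Flask(__name__)
cache = redis.Redis()  # assumes a local Redis instance

def db_lookup(short_code):
    # Hypothetical read-replica query; returns the long URL or None.
    ...

@app.route("/<short_code>")
def follow(short_code):
    # 1. Try Redis first (sub-millisecond for hot codes).
    long_url = cache.get(short_code)
    if long_url is not None:
        long_url = long_url.decode()
    else:
        # 2. Fall back to the DB read replica and backfill the cache.
        long_url = db_lookup(short_code)
        if long_url is None:
            abort(404)
        cache.set(short_code, long_url, ex=86400)  # 24h TTL
    # 3. 302 so every click still reaches the origin (see 301 vs 302 below).
    return redirect(long_url, code=302)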
301 vs 302:
- 301 Permanent: the browser caches the redirect, so no future requests reach your servers. Good for performance, bad for analytics (you can't count clicks).
- 302 Temporary: every click hits your servers. Better for analytics, higher origin load.
Use 302 if you want click tracking. Use 301 for static links where analytics don't matter.
Caching Strategy
Redirects are the hot path. Cache aggressively.
Layer 1: CDN edge cache (Cloudflare, CloudFront). TTL: 1 hour for active links.
Layer 2: Redis (per data center). TTL: 24 hours, LRU eviction.
Layer 3: DB read replica. Hot rows live in the Postgres buffer pool anyway.
The 80/20 rule applies hard here: 20% of short codes account for 80% of traffic. A modest Redis cluster handles this.
Cache invalidation: when a URL is deleted or expires, you need to evict it from Redis. Either (both options are sketched after this list):
- Set TTL equal to expiry time at write time
- On delete, send a cache invalidation event via message queue
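Both options are small. A redis-py sketch; the TTL values and the message-queue fan-out are assumptions:

import time
import redis

cache = redis.Redis()
DEFAULT_TTL = 86400  # 24 hours

def warm_cache(short_code, long_url, expires_at=None):
    # Option 1: cap the TTL at the link's remaining lifetime so the
    # cache entry can never outlive the link itself.
    ttl = DEFAULT_TTL
    if expires_at is not None:
        ttl = min(ttl, max(1, int(expires_at - time.time())))
    cache.set(short_code, long_url, ex=ttl)

def invalidate(short_code):
    # Option 2: called by a consumer of the invalidation event; in a
    # multi-region setup each region's consumer evicts its own Redis.
    cache.delete(short_code)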
Expiry and Cleanup
Two mechanisms:
- Check-on-read: when handling a redirect, check expires_at. If expired, return 410 Gone. Fast, no background job needed.
- Background cleanup: a nightly job deletes expired rows (see the sketch after this list). It keeps the DB clean but is optional; expired rows that are never accessed waste space but don't cause bugs.
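The nightly cleanup can be a single DELETE run from a cron job. A psycopg2 sketch; the connection string is a placeholder:

import psycopg2

def purge_expired(dsn="dbname=shortener"):
    # Delete rows whose expiry has passed. On very large tables,
    # batching the deletes is kinder to the primary.
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "DELETE FROM urls WHERE expires_at IS NOT NULL AND expires_at < now()"
            )
        conn.commit()
    finally:
        conn.close()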
Abuse Prevention
URL shorteners are a favourite tool for spam and phishing.
Basic controls:
- Rate limit by IP: max 10 URL creations per minute per IP (see the sketch after this list)
- Rate limit by account: max 1,000 per day for free tier
- Blocklist of known malicious long URLs (Google Safe Browsing API)
- Custom alias validation: alphanumeric only, reserved words blocked (api, admin, login)
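The per-IP limit maps to a classic fixed-window counter in Redis. A sketch with redis-py; key names and limits are assumptions:

import redis

cache = redis.Redis()

def allow_create(ip, limit=10, window=60):
    # Fixed-window counter: INCR a per-IP key and expire it after `window` seconds.
    key = f"ratelimit:create:{ip}"
    count = cache.incr(key)
    if count == 1:
        cache.expire(key, window)
    return count <= limit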
For the interview: mentioning Safe Browsing API integration is a strong signal; it shows you're thinking about abuse, not just the happy path.
Scaling Discussion
| Bottleneck | Solution |
|-----------|---------|
| Write throughput | Horizontal scale API servers; pre-allocated ID ranges eliminate counter bottleneck |
| Read throughput | Redis cluster + CDN edge caching |
| DB storage | Partition clicks table by month; archive old data to cold storage |
| Single-region failure | Multi-region with DNS failover; Redis replication per region |
| Hot short codes | CDN caching means top 1% of codes never reach origin servers |
What Interviewers Are Actually Testing
- You nail the back-of-envelope before proposing any architecture
- You compare code-generation strategies — not just "use a hash"
- You separate the write path from the read path — they have different scaling requirements
- You use caching correctly — CDN → Redis → DB, not just "add Redis"
- You discuss the 301 vs 302 trade-off — this is a signal you've thought about analytics
- You mention abuse prevention — shows product thinking, not just engineering
Quick Reference
Short code: 6–7 base62 chars → 56B–3.5T combinations
Generation: Counter with pre-allocated ranges (fast, distributed)
DB: PostgreSQL, index on short_code
Cache: Redis (24h TTL) + CDN edge (1h TTL)
Redirect: 302 if analytics needed, 301 if not
Expiry: Check on read + background cleanup
Rate limiting: Per-IP and per-account, Safe Browsing for abuse
Scale: Read replicas for DB, Redis cluster for cache, CDN for edge