System Design Interview
Design a URL Shortener (bit.ly)
6-character codes that redirect billions of times a day — deceptively simple to design wrong
The Interview Question
"Design a URL shortener like bit.ly. Users paste a long URL and get a short code. Anyone with the short code can be redirected to the original URL."
This question is deceptively simple. The interesting problems are: how do you generate unique short codes at scale, how fast does the redirect need to be, and what happens when the same short URL gets hit by a million people simultaneously.
Step 1: Define the Requirements
Functional requirements
- Given a long URL, generate a unique short URL (e.g., sho.rt/aB3kP9)
- Redirect anyone who visits the short URL to the original
- Optional: custom aliases (sho.rt/my-brand)
- Optional: analytics (click count, referrer, geography)
Non-functional requirements
- 100 million new URLs created per day (~1,150 writes/sec)
- 10 billion redirects per day (~115,000 reads/sec) — heavily read-skewed (100:1)
- Short codes must be unique — collisions cause wrong redirects
- Redirect latency: under 10ms (users feel redirects; this is a UX requirement)
- URLs should not expire unless explicitly set
Scale implication: Over 5 years, ~180 billion URLs stored. At ~500 bytes per record, that's ~90TB of structured data. Manageable in a distributed database.
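The back-of-envelope numbers above are easy to reproduce. A quick sketch of the arithmetic (assuming the stated 100M URLs/day, ~500 bytes/record, 5-year retention):

```python
# Capacity estimate for the requirements above. All constants are the
# article's assumptions, not measured figures.
URLS_PER_DAY = 100_000_000
BYTES_PER_RECORD = 500
YEARS = 5

writes_per_sec = URLS_PER_DAY / 86_400             # seconds per day
total_urls = URLS_PER_DAY * 365 * YEARS
storage_tb = total_urls * BYTES_PER_RECORD / 1e12

print(f"{writes_per_sec:,.0f} writes/sec")         # ~1,157
print(f"{total_urls / 1e9:.1f} billion URLs")      # ~182.5 billion
print(f"{storage_tb:.0f} TB")                      # ~91 TB
```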
Step 2: The Short Code — How Do You Generate It?
This is the core of the question. Most candidates say "hash the URL" — that's wrong.
Option A: Hash the URL (MD5 / SHA-256)
MD5("https://example.com/very/long/path") → "a9993e364706816aba3e25717850c26c"
Take first 7 characters → "a9993e3"

Problems:
- Two different URLs that hash to the same 7-character prefix → collision
- The same URL always produces the same code — if you shorten the same URL twice you get the same code (might be desired, might not be)
- Hash space of 7 hex chars = 16^7 = ~268 million. At 100M URLs/day you exhaust this in 3 days.
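Both failure modes are easy to demonstrate. A minimal sketch of Option A (shown only to illustrate why it is a poor fit, not as a recommendation):

```python
import hashlib

def hash_code(url: str, length: int = 7) -> str:
    """Option A: first `length` hex chars of MD5(url)."""
    return hashlib.md5(url.encode()).hexdigest()[:length]

# Deterministic: the same URL always yields the same code.
assert hash_code("https://example.com/a") == hash_code("https://example.com/a")

# Tiny code space: 16^7 ≈ 268M total codes, and by the birthday bound
# collisions become likely after only ~sqrt(16^7) ≈ 16K codes issued.
print(f"{16**7:,} possible 7-hex-char codes")  # 268,435,456
```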
Option B: Base62 Encoding of an Auto-Increment ID
Database ID: 1,000,000,000
Base62 encode (0-9, a-z, A-Z): → "15FTGg" (6 characters)
62^6 = 56 billion combinations — enough for ~1.5 years at 100M/day
62^7 = 3.5 trillion combinations — enough for 95+ years

Start with 6 characters, add a 7th when needed. Simple, no collisions, globally sortable.
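Base62 encoding is a few lines of code. A sketch, assuming the alphabet is ordered digits, then lowercase, then uppercase (other orderings produce different codes):

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Option B: convert a numeric database ID to a short Base62 code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

print(base62_encode(1_000_000_000))  # "15FTGg" — matches the example above
```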
The problem: a single auto-increment counter lives in one database, which makes it both a single point of failure and a scaling ceiling as write volume grows.
Option C: Distributed ID Generation (Snowflake-style)
64-bit ID structure:
[ 1 bit unused | 41 bits timestamp (ms) | 10 bits machine ID | 12 bits sequence ]
Timestamp: ~69 years before overflow
Machine ID: up to 1,024 generator nodes
Sequence: 4,096 IDs per millisecond per node
Total capacity: 4,096 × 1,024 nodes = ~4.2 million IDs/ms

Each generator node produces IDs independently — no coordination, no single point of failure. Twitter built this and called it Snowflake. Every major URL shortener uses this pattern or a variation.
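A minimal single-process sketch of the bit layout above. The custom epoch and locking strategy are illustrative choices (clock-regression handling is elided for brevity):

```python
import threading
import time

class SnowflakeGenerator:
    """Snowflake-style 64-bit IDs: 41-bit ms timestamp | 10-bit machine | 12-bit seq."""
    EPOCH = 1577836800000  # 2020-01-01 in ms — an arbitrary custom epoch

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024          # fits in 10 bits
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - self.EPOCH
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit wrap
                if self.sequence == 0:          # 4,096 IDs exhausted this ms
                    while now <= self.last_ms:  # spin to the next millisecond
                        now = int(time.time() * 1000) - self.EPOCH
            else:
                self.sequence = 0
            self.last_ms = now
            return (now << 22) | (self.machine_id << 12) | self.sequence

gen = SnowflakeGenerator(machine_id=7)
ids = [gen.next_id() for _ in range(5)]
assert ids == sorted(ids) and len(set(ids)) == 5  # monotonic and unique
```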
Step 3: Architecture
Write path (create short URL):
Client → API Gateway → URL Service → ID Generator → get unique 64-bit ID
→ Base62 encode → "aB3kP9"
→ write (short_code, original_url, created_at) to DB
→ optionally write to Redis cache
→ return short URL to client
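The write path above can be sketched end to end. Here a dict stands in for the database and a simple counter for the distributed ID generator; `BASE_URL` and the starting ID are assumptions for illustration:

```python
from itertools import count

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE_URL = "https://sho.rt/"

db: dict[str, str] = {}          # short_code -> original_url (DB stand-in)
next_id = count(1_000_000_000)   # stand-in for the Snowflake-style generator

def base62(n: int) -> str:
    out = ""
    while n:
        n, r = divmod(n, 62)
        out = ALPHABET[r] + out
    return out or ALPHABET[0]

def create_short_url(original_url: str) -> str:
    """Write path: unique ID → Base62 encode → persist → return short URL."""
    code = base62(next(next_id))
    db[code] = original_url
    return BASE_URL + code

print(create_short_url("https://example.com/very/long/path"))
# → https://sho.rt/15FTGg  (the first ID is 1,000,000,000)
```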
Read path (redirect):
Client → visits sho.rt/aB3kP9
→ Load Balancer → Redirect Service
→ check Redis cache: HIT → return 301/302 redirect immediately
    MISS → query DB → write to cache → return redirect

┌───────────────┐    write    ┌──────────────────┐
│  URL Service  │────────────▶│   PostgreSQL /   │
│ (create URL)  │             │   Cassandra DB   │
└───────────────┘             └────────┬─────────┘
                                       │
┌───────────────┐    read              │
│   Redirect    │────────▶ Redis       │
│   Service     │◀── HIT               │
│               │─── MISS ─────────────┘
└───────────────┘

Step 4: Database Choice
PostgreSQL
Good choice if:
- You want ACID transactions
- Your scale is under ~500M rows with proper indexing
- You want SQL for analytics queries
Schema:
urls
─────────────────────────────
short_code CHAR(7) PRIMARY KEY
original_url TEXT NOT NULL
user_id UUID (nullable for anonymous)
created_at TIMESTAMPTZ
expires_at TIMESTAMPTZ (nullable)
click_count BIGINT DEFAULT 0
Index: (short_code) — this is the hot path, already the primary key

Cassandra
Better choice if:
- You need horizontal write scale across data centres
- You're writing > 10K URLs/sec
- You can accept eventual consistency on click counts
Cassandra's primary key model maps perfectly to this use case — you always look up by short_code, never scan. Reads and writes are O(1) distributed hash lookups.
For an interview: start with PostgreSQL, explain the migration path to Cassandra when you cross 500M rows or 10K writes/sec. This demonstrates pragmatism.
Step 5: The Redirect — 301 vs 302
A detail interviewers probe:
| | 301 Permanent | 302 Temporary |
|---|---|---|
| Browser caches | Yes — forever | No |
| Analytics tracking | Breaks (browser skips you) | Works (every click hits your server) |
| Server load | Lower (cached redirects) | Higher |
If you want analytics (click counting): use 302. If you want maximum performance and don't need analytics: use 301.
Most URL shorteners use 302 because click analytics is a revenue driver.
Step 6: Caching — The 80/20 Rule
URL access follows a power law: ~20% of short codes account for ~80% of traffic. A viral tweet with a shortened link might receive 10 million clicks in one hour.
Redis cache: short_code → original_url
TTL: 24 hours (re-warm from DB on miss)
Eviction: LRU (Least Recently Used)
Cache hit rate target: 95%+
At 95% hit rate on 115,000 redirects/sec:
109,250 served from Redis (sub-millisecond)
  5,750 hit the database

The Redis key is just the short code. At ~200 bytes per entry and 100M active URLs, the full working set is ~20GB — fits in a single Redis instance; shard if needed.
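The cache-aside read path can be sketched in a few lines. Plain dicts stand in for Redis and the database, and TTL handling is elided; the 302 follows the analytics reasoning below:

```python
# Cache-aside redirect: Redis hit returns immediately; a miss falls
# through to the DB and warms the cache for subsequent requests.
db = {"aB3kP9": "https://example.com/very/long/path"}  # DB stand-in
cache: dict[str, str] = {}                             # Redis stand-in

def redirect(short_code: str) -> tuple[int, str]:
    url = cache.get(short_code)
    if url is None:                 # cache MISS → query DB, warm cache
        url = db.get(short_code)
        if url is None:
            return (404, "")
        cache[short_code] = url     # in real Redis: SET key url EX 86400
    return (302, url)               # 302 so every click stays countable

assert redirect("aB3kP9") == (302, "https://example.com/very/long/path")
assert "aB3kP9" in cache            # the next hit is served from cache
```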
Step 7: Thundering Herd Problem
When a viral link gets shared and a million people click it simultaneously, every one of those requests that arrives before the cache is warm will hit the database in parallel — potentially crashing it.
Solution: Cache locking
When the cache is empty for a popular key, only one request goes to the database. All others wait on a distributed lock (Redis SETNX) and then read the result the first requester wrote.
This converts a million parallel DB queries into one DB query and 999,999 cache reads.
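A runnable single-flight sketch of the cache-locking idea. A real deployment would use Redis SETNX with a TTL as the distributed lock; here a process-local lock table stands in so the behaviour can be observed:

```python
import threading

cache: dict[str, str] = {}
locks: dict[str, threading.Lock] = {}
locks_guard = threading.Lock()
db_queries = 0

def fetch_from_db(code: str) -> str:
    global db_queries
    db_queries += 1                    # count how often the DB is actually hit
    return f"https://example.com/{code}"

def get_url(code: str) -> str:
    if code in cache:
        return cache[code]
    with locks_guard:                  # like SETNX: first caller takes the key's lock
        lock = locks.setdefault(code, threading.Lock())
    with lock:
        if code not in cache:          # re-check: another thread may have filled it
            cache[code] = fetch_from_db(code)
    return cache[code]

# 100 concurrent requests for a cold key → exactly one DB query.
threads = [threading.Thread(target=get_url, args=("aB3kP9",)) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(db_queries)  # 1
```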
Step 8: Cost Control
| Decision | Cheaper alternative | Why |
|----------|---------------------|-----|
| Store full URL in DB for every request | Deduplicate identical URLs (same URL → same code) | Cuts storage for popular destinations |
| Redirect service queries DB every time | Redis cache, 24h TTL | 95% cost reduction on DB reads |
| Keep all URLs forever | Expire anonymous URLs after 1 year | Reduces storage linearly over time |
| Run own infrastructure | Serverless redirect function (AWS Lambda@Edge) | Pay per redirect, not per idle server |
What the Interviewer Is Actually Testing
- Can you explain why hashing is wrong for key generation?
- Do you know what Base62 encoding is and why it's used?
- Can you describe distributed ID generation (Snowflake)?
- Do you understand 301 vs 302 and make a deliberate choice?
- Can you explain the thundering herd and how to prevent it?
- Do you think about the read/write ratio and size the cache accordingly?
Go Deeper
Case studies teach the "what". Our courses teach the "how" — the patterns behind these decisions, built up from first principles.