System Design Interview
Design a Ride-Sharing App (Uber / Lyft)
Matching riders to nearby drivers in real time — geospatial indexing, supply/demand pricing, trip state machines
The Interview Question
"Design a ride-sharing application. Riders request a trip, drivers nearby are matched, and the system handles the full trip lifecycle from request to payment."
This question is one of the most complex in the interview canon. It combines real-time geospatial data, state machines, matching algorithms, surge pricing, and distributed coordination — all at massive scale.
Step 1: Requirements
Functional
- Rider opens app and sees nearby drivers on a map
- Rider requests a trip — system matches them to the best driver
- Driver accepts or declines; if declined, next driver is tried
- Both parties see each other's live location during the trip
- Trip has full lifecycle: requested → matched → en_route → arrived → in_trip → completed
- Payment processed at trip end
Non-functional
- 5 million trips per day (~58 per second)
- 3 million active drivers globally at peak
- Driver location updates every 4 seconds — 750,000 location writes/second
- Match a rider to a driver in under 3 seconds
- Location data accurate to within 50 metres
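The capacity numbers above follow from quick back-of-envelope arithmetic, which is worth showing explicitly in the interview:

```python
# Back-of-envelope checks for the non-functional requirements above.

trips_per_day = 5_000_000
seconds_per_day = 24 * 60 * 60              # 86,400

trips_per_second = trips_per_day / seconds_per_day
print(f"{trips_per_second:.0f} trips/sec")  # ≈ 58

drivers = 3_000_000
update_interval_s = 4                       # one location fix every 4 seconds

location_writes_per_second = drivers / update_interval_s
print(f"{location_writes_per_second:,.0f} location writes/sec")  # 750,000
```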
Step 2: The Geospatial Problem — Finding Nearby Drivers
This is the heart of the question. How do you efficiently find all drivers within 2km of a given point when you have 3 million drivers moving every 4 seconds?
Geohash
Geohash divides the world into a grid of cells, each represented by a short string. The key property: cells with similar prefixes are geographically nearby.
Precision 6 geohash: each cell ≈ 1.2km × 0.6km
Precision 7 geohash: each cell ≈ 150m × 150m
Rider at position P → geohash = "dp3wj6"
Find nearby drivers → query for all drivers with geohash prefix "dp3wj"
(neighbouring cells)
How it works in practice:
Driver location update:
Compute geohash of (lat, lng) → "dp3wj6t"
Redis GEOADD drivers_geo {lng} {lat} {driver_id}
Rider requests trip at (lat=40.71, lng=-74.00):
Redis GEORADIUS drivers_geo -74.00 40.71 2 km ASC COUNT 10
→ returns the 10 nearest drivers, sorted by distance
Redis has native geospatial commands built on geohash. GEORADIUS (superseded in Redis 6.2 by GEOSEARCH, which works the same way) returns the nearest N drivers within a radius in well under a millisecond, even with millions of drivers indexed.
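To make the prefix property concrete, here is a minimal geohash encoder using the standard base32 bit-interleaving algorithm; the test coordinates are illustrative:

```python
# Minimal geohash encoder: interleave longitude/latitude bits (longitude
# first) and emit one base32 character per 5 bits.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat: float, lng: float, precision: int = 6) -> str:
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    use_lng = True                    # geohash starts with a longitude bit
    ch, bit_count = 0, 0
    out = []
    while len(out) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                ch = (ch << 1) | 1    # eastern half -> 1 bit
                lng_lo = mid
            else:
                ch <<= 1              # western half -> 0 bit
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch = (ch << 1) | 1    # northern half -> 1 bit
                lat_lo = mid
            else:
                ch <<= 1              # southern half -> 0 bit
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:            # 5 bits = one base32 character
            out.append(BASE32[ch])
            ch, bit_count = 0, 0
    return "".join(out)

# The prefix property: a longer hash refines a shorter one, so nearby
# points (usually) share a prefix and prefix queries find neighbours.
print(geohash(40.71, -74.00, 6))
print(geohash(40.71, -74.00, 5))
```

Note the "usually": points just either side of a cell boundary can be metres apart yet share no prefix, which is why real systems also query the neighbouring cells.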
Step 3: Architecture
┌───────────────┐
│ Rider App │
└──────┬────────┘
│ request trip
┌─────────────▼────────────┐
│ API Gateway │
└──┬──────────┬────────────┘
│ │
┌────────────▼──┐ ┌────▼────────────────┐
│ Location │ │ Trip Service │
│ Service │ │ (state machine) │
│ │ └────────┬─────────────┘
│ ← driver │ │
│ updates │ ┌──────▼──────┐
│ every 4s │ │ Matching │
└───────┬───────┘ │ Service │
│ └──────┬──────┘
┌───────▼───────┐ │ notify driver
│ Redis Geo │ ┌──────▼──────┐
│ Index │ │ Driver App │
│ (live map) │ └─────────────┘
└───────────────┘
│ write history
┌───────▼───────┐
│ Location DB │
│ (Cassandra) │
└───────────────┘
Step 4: Driver Location Updates — The Write Volume Problem
3 million drivers × 1 update every 4 seconds = 750,000 writes/second.
This cannot go directly to a relational database. Two separate concerns:
1. Live positions (for matching): Redis geospatial index. GEOADD updates a driver's position in microseconds. The entire live position dataset fits in memory — 3M drivers × ~100 bytes = 300MB.
2. Historical positions (for dispute resolution, analytics): Write to Cassandra. Partition by (driver_id, date), cluster by timestamp. Append-only, no updates, scales horizontally.
On every driver location update:
→ Update Redis geospatial index (synchronous, fast path)
→ Enqueue to Kafka (async, durable path)
→ Kafka consumer writes to Cassandra (background, doesn't block the driver app)
If Redis loses its data (restart/failure), the live index is rebuilt by replaying recent location events from Kafka.
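The dual write path above can be sketched as follows; a dict stands in for the Redis geo index and a deque for the Kafka topic, and all names are illustrative:

```python
from collections import deque

# Stand-ins for the real infrastructure: a dict plays the Redis live
# index, a deque plays the Kafka topic.
live_index: dict[str, tuple[float, float]] = {}
location_topic: deque = deque()

def on_driver_location_update(driver_id: str, lat: float, lng: float) -> None:
    # 1. Synchronous fast path: update the live index (real system: GEOADD).
    live_index[driver_id] = (lat, lng)
    # 2. Durable async path: enqueue for history (real system: Kafka produce).
    location_topic.append({"driver_id": driver_id, "lat": lat, "lng": lng})

def history_consumer(batch_size: int = 100) -> list[dict]:
    # Background consumer: drain the topic and write to Cassandra (simulated
    # here by returning the batch). Never blocks the driver-facing path.
    batch = []
    while location_topic and len(batch) < batch_size:
        batch.append(location_topic.popleft())
    return batch

on_driver_location_update("driver:42", 40.71, -74.00)
on_driver_location_update("driver:42", 40.712, -74.001)  # newer fix overwrites
```

The live index keeps only the latest position per driver, while the topic preserves every fix, which is exactly the split between matching and history.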
Step 5: Trip State Machine
Every trip progresses through states. Invalid transitions must be rejected — you cannot move from requested to completed without going through in_trip.
┌──────────────┐
│ REQUESTED │ ← rider submits trip request
└──────┬───────┘
│ driver matched
┌──────▼───────┐
│ MATCHED │ ← driver accepted
└──────┬───────┘
│ driver moving to rider
┌──────▼───────┐
│ EN_ROUTE │
└──────┬───────┘
│ driver at pickup location
┌──────▼───────┐
│ ARRIVED │
└──────┬───────┘
│ rider gets in, trip starts
┌──────▼───────┐
│ IN_TRIP │
└──────┬───────┘
│ destination reached
┌──────▼───────┐
│ COMPLETED │ → trigger payment
└──────────────┘
CANCELLED is reachable from: REQUESTED, MATCHED, EN_ROUTE, ARRIVED
This state machine lives in the Trip Service. The trip row in PostgreSQL stores current_state. Every state transition is an atomic DB update with an optimistic lock check: "only update to state X if current state is Y." If two events try to transition simultaneously, one wins and the other retries or fails gracefully.
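A minimal sketch of that compare-and-set transition check, with the DB's conditional UPDATE simulated as an in-memory function:

```python
# Allowed transitions for the trip lifecycle described above.
TRANSITIONS: dict[str, set[str]] = {
    "REQUESTED": {"MATCHED", "CANCELLED"},
    "MATCHED":   {"EN_ROUTE", "CANCELLED"},
    "EN_ROUTE":  {"ARRIVED", "CANCELLED"},
    "ARRIVED":   {"IN_TRIP", "CANCELLED"},
    "IN_TRIP":   {"COMPLETED"},
    "COMPLETED": set(),
    "CANCELLED": set(),
}

def transition(trip: dict, expected: str, new_state: str) -> bool:
    """Compare-and-set: succeed only if the trip is still in `expected`
    and `expected -> new_state` is legal. In the real system this is a
    single conditional UPDATE on the trips row."""
    if trip["state"] != expected:
        return False        # lost the race: another event moved it first
    if new_state not in TRANSITIONS[expected]:
        return False        # illegal jump, e.g. REQUESTED -> COMPLETED
    trip["state"] = new_state
    return True

trip = {"id": "t1", "state": "REQUESTED"}
assert transition(trip, "REQUESTED", "MATCHED")
assert not transition(trip, "REQUESTED", "MATCHED")   # stale expectation fails
assert not transition(trip, "MATCHED", "COMPLETED")   # illegal jump rejected
```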
Step 6: Matching Algorithm
The matching service has a deceptively hard sub-problem: which driver should be matched to this rider?
Naive: pick the nearest driver.
Reality: nearest driver might be stuck in traffic, might be heading away from the rider's destination, might decline frequently (lowering platform reliability).
Production factors:
Score = f(
distance_to_rider,
estimated_time_to_rider (ETA, not just distance — accounts for traffic),
driver_acceptance_rate,
driver_rating,
direction_alignment (is the driver heading toward the rider's destination?)
)
For an interview, describe the scoring function rather than the specific weights. The important point: it's not just nearest-driver, it's a multi-variable optimisation problem.
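A sketch of such a scoring function: the factors come from the list above, but the weights and the ranking direction (lower is better) are made up for illustration:

```python
# Illustrative scoring: lower score = better candidate. The weights are
# arbitrary example values, not production numbers.
def score(driver: dict, w_eta=1.0, w_accept=5.0, w_rating=2.0, w_align=3.0) -> float:
    return (
        w_eta * driver["eta_minutes"]                      # time to rider, not distance
        + w_accept * (1 - driver["acceptance_rate"])       # penalise frequent decliners
        + w_rating * (5.0 - driver["rating"])              # penalise low ratings
        + w_align * (0 if driver["heading_toward_destination"] else 1)
    )

drivers = [
    {"id": "a", "eta_minutes": 2, "acceptance_rate": 0.4, "rating": 4.9,
     "heading_toward_destination": False},   # nearest, but declines often
    {"id": "b", "eta_minutes": 4, "acceptance_rate": 0.95, "rating": 4.8,
     "heading_toward_destination": True},
]

ranked = sorted(drivers, key=score)
print([d["id"] for d in ranked])   # driver "b" beats the nearer "a"
```

The example shows the interview talking point directly: the nearest driver ("a") loses to a slightly farther driver with a better acceptance rate and direction alignment.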
Timeout and retry:
Match attempt:
Find top 3 drivers by score
Send match request to driver #1, wait 10 seconds
→ Accept: create trip
→ Decline or timeout: try driver #2
→ Repeat for driver #3
→ No match: notify rider, retry search with wider radius
Step 7: Surge Pricing — Why It's an Engineering Problem
Surge pricing isn't just a business decision — it's an infrastructure component that prevents cascade failures.
When demand spikes (e.g., rain, concerts, airport rush), without surge:
- All riders request trips simultaneously
- Matching service is overwhelmed
- No drivers are available
- Every request fails or times out
With surge:
- Higher price → some riders wait or cancel (demand reduction)
- Higher earnings → more drivers go online (supply increase)
- System stabilises
Implementation:
Every 60 seconds:
Count open ride requests per geohash zone (demand)
Count available drivers per zone (supply)
Surge multiplier = f(demand / supply) — published to a config cache
Client reads surge multiplier from cache before showing price
Trip price = base_fare × surge_multiplier
The surge multiplier is computed by a background job and cached — it does not need real-time precision.
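The background job's core computation can be sketched like this; the piecewise shape and the 3.0x cap are illustrative choices, not values from the text:

```python
# Surge multiplier per geohash zone: a function of demand / supply,
# clamped to a sane range. Shape and cap are illustrative.
def surge_multiplier(open_requests: int, available_drivers: int,
                     cap: float = 3.0) -> float:
    if available_drivers == 0:
        return cap                        # no supply at all: maximum surge
    ratio = open_requests / available_drivers
    if ratio <= 1.0:
        return 1.0                        # supply covers demand: no surge
    return min(round(ratio, 1), cap)      # e.g. 30 requests / 20 drivers -> 1.5x

assert surge_multiplier(10, 20) == 1.0
assert surge_multiplier(30, 20) == 1.5
assert surge_multiplier(100, 5) == 3.0
```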
Step 8: Database Choice Summary
| Data | Database | Why |
|------|----------|-----|
| Live driver positions | Redis Geospatial | In-memory, sub-ms geo queries |
| Trip state / metadata | PostgreSQL | ACID, state machine transitions, payment records |
| Historical locations | Cassandra | 750K writes/sec, append-only, time-series |
| Driver/rider profiles | PostgreSQL | Relational, low write volume |
| Surge pricing config | Redis | Config cache, read-heavy, fast TTL refresh |
What the Interviewer Is Actually Testing
- Do you identify the geospatial index problem and propose Redis geohash or similar?
- Do you separate live positions (Redis) from historical positions (Cassandra)?
- Do you design a state machine for the trip lifecycle?
- Can you explain matching beyond "find the nearest driver"?
- Do you handle the 750K writes/second location update problem explicitly?
- Do you understand that surge pricing is both a business and an engineering feature?