System design interviews are graded on three things: whether you structure the problem, whether you know the building blocks, and whether you can reason about tradeoffs out loud. The good news — there's a repeatable structure that works for almost any prompt, and a small catalog of patterns that cover 80% of real interview problems. Learn both and the "open-ended" part of these interviews gets much more manageable.
Here's the cheatsheet.
The 7-step structure (use this for every prompt)
Whenever a prompt lands ("design Twitter," "design a rate limiter," "design a notification service"), walk through these seven steps in order. Out loud.
- Clarify requirements (3–5 min). Functional first, then non-functional. Ask about scale, read/write ratio, latency targets, consistency vs availability preference.
- Run back-of-envelope estimates (2–3 min). QPS, storage per year, bandwidth. Even rough numbers show you think at scale.
- Define the API (2 min). 3–5 endpoints. This forces concrete thinking.
- Sketch the high-level architecture (5 min). Clients, load balancers, services, data stores, caches. Keep it boxy.
- Deep-dive the two hardest pieces (10 min). The interviewer will often pick. If not, pick the data model and the hottest read path.
- Address bottlenecks and failures (5 min). What breaks at 10x? What happens when the cache dies? How do you handle hot keys?
- Summarize and flag tradeoffs (2 min). What you chose, what you didn't, and why.
45 minutes, seven steps. The candidates who time-box this well clear the bar. The ones who spend 25 minutes on the high-level sketch and never get to bottlenecks do not.
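Step 2 is just arithmetic, and it helps to have the unit conversions internalized. A minimal sketch of the math, with made-up example inputs (the DAU, request rate, write ratio, and peak factor below are all assumptions you'd replace with the interviewer's numbers):

```python
def estimate(dau, requests_per_user_per_day, bytes_per_write,
             write_ratio=0.1, peak_factor=3):
    """Back-of-envelope numbers for step 2. Every input is an assumption."""
    seconds_per_day = 86_400
    total_qps = dau * requests_per_user_per_day / seconds_per_day
    write_qps = total_qps * write_ratio
    storage_per_year = write_qps * bytes_per_write * seconds_per_day * 365
    return {
        "avg_qps": round(total_qps),
        "peak_qps": round(total_qps * peak_factor),
        "write_qps": round(write_qps),
        "storage_tb_per_year": round(storage_per_year / 1e12, 1),
    }

# e.g. 10M DAU, 20 requests/user/day, 1 KB per write
print(estimate(10_000_000, 20, 1_000))
# ~2.3k QPS average, ~7k peak, ~7 TB/year of new data
```

Saying these numbers out loud ("so roughly 2,000 QPS average, call it 7,000 at peak") is what makes the rest of the design concrete.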
The six patterns that cover most problems
Nearly every system design prompt maps to one (or a combination) of these six patterns. Learn them and you stop reinventing the wheel.
1. Read-heavy content distribution
Examples: Twitter timeline, YouTube, news feed, product catalog.
Core moves:
- Precompute at write time (fan-out-on-write) or at read time (fan-out-on-read) — choose based on fanout distribution
- Aggressive caching at edges (CDN for content, Redis for precomputed feeds)
- Eventually consistent is usually fine
- Sharding by user or content ID
Trap: failing to distinguish celebrity-fanout problems (pull, not push) from normal-user problems (push, not pull).
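The hybrid approach is worth being able to sketch: push for normal users, pull for celebrities. A toy version, assuming a hypothetical in-memory feed cache and a follower-count threshold you'd tune from the real fanout distribution:

```python
CELEBRITY_THRESHOLD = 10_000  # assumption; tune from the fanout distribution

def on_tweet(author, tweet_id, follower_ids, feed_cache):
    """Fan-out-on-write for normal users. Celebrities are skipped here:
    their tweets are pulled and merged at read time instead."""
    if len(follower_ids) >= CELEBRITY_THRESHOLD:
        return  # pull path: readers fetch celebrity tweets on demand
    for fid in follower_ids:
        feed_cache.setdefault(fid, []).append(tweet_id)  # push into each feed

def read_feed(user_id, feed_cache, followed_celebrities, celebrity_tweets):
    """Merge the precomputed feed with a read-time pull for celebrities."""
    feed = list(feed_cache.get(user_id, []))
    for celeb in followed_celebrities:
        feed.extend(celebrity_tweets.get(celeb, []))
    return feed
```

The design point: a celebrity with 50M followers would generate 50M cache writes per tweet on the push path, while a normal user's feed read would touch hundreds of authors on the pull path. The threshold splits the two regimes.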
2. Write-heavy ingest and aggregation
Examples: metrics, logs, analytics, clickstream, IoT.
Core moves:
- Buffer with a message queue (Kafka, Kinesis) — never write directly to the DB
- Stream processing for real-time aggregation (Flink, Spark Streaming)
- Time-series storage or columnar store
- Batch compact periodically
Trap: using a normalized OLTP database for write-heavy analytics. The interviewer is waiting for you to say Kafka.
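The shape of the write path is: producers append to a durable log, and a stream job rolls events up into pre-aggregated windows before anything touches storage. A toy single-process sketch (a deque stands in for the Kafka topic, and the one-shot drain stands in for what a Flink job does continuously):

```python
from collections import defaultdict, deque

buffer = deque()  # stands in for the Kafka topic

def ingest(event):
    buffer.append(event)  # producers append; never write straight to the DB

def aggregate_minute_counts():
    """Drain the buffer and roll events up into per-minute counts —
    the kind of pre-aggregation a stream processor would do continuously."""
    counts = defaultdict(int)
    while buffer:
        event = buffer.popleft()
        minute = event["ts"] // 60
        counts[(event["metric"], minute)] += 1
    return dict(counts)

ingest({"metric": "clicks", "ts": 61})
ingest({"metric": "clicks", "ts": 90})
ingest({"metric": "clicks", "ts": 125})
print(aggregate_minute_counts())  # {('clicks', 1): 2, ('clicks', 2): 1}
```

The point to make out loud: the database only ever sees one row per (metric, window), not one row per event, which is why the write path survives millions of events per second.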
3. Chat / realtime / notifications
Examples: WhatsApp, Slack, push notifications, live leaderboards.
Core moves:
- Long-lived connections (WebSocket) with a connection-management layer
- Message queue per user (or per channel) for durability
- Presence service decoupled from message delivery
- At-least-once delivery + client-side dedup
Trap: forgetting that "mobile clients drop connections all the time" is the hard part — not the core protocol.
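At-least-once delivery plus client-side dedup is easy to sketch and interviewers often ask for it: the server may resend after a dropped connection, so the client drops anything whose message ID it has already applied. A minimal sketch (unbounded `seen` set is an assumption; a real client would bound it or use per-sender sequence numbers):

```python
class DedupingClient:
    """Client-side dedup for at-least-once delivery: the server may
    redeliver after a reconnect, so already-seen message ids are dropped."""
    def __init__(self):
        self.seen = set()
        self.messages = []

    def on_message(self, msg_id, payload):
        if msg_id in self.seen:
            return False  # duplicate redelivery; ignore
        self.seen.add(msg_id)
        self.messages.append(payload)
        return True

client = DedupingClient()
client.on_message("m1", "hello")
client.on_message("m1", "hello")  # redelivery after a reconnect
client.on_message("m2", "world")
print(client.messages)  # ['hello', 'world']
```

This pairs with the "mobile clients drop connections" point: redelivery on reconnect is the normal case, not the exception, so dedup has to live on the client.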
4. Geo / location-aware services
Examples: Uber, Yelp, delivery, proximity search.
Core moves:
- Geohash or S2 cells for spatial indexing
- QuadTree or R-tree for range queries
- Separate write path (driver location updates at 5 Hz) from read path (rider proximity queries)
Trap: not distinguishing update rate (very high, not durable) from query rate (high, needs to be correct).
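Geohashing is worth knowing well enough to explain the mechanism: interleave longitude and latitude bits by repeated bisection, emit base32 characters, and nearby points end up sharing a prefix — so a prefix match on an ordinary B-tree index approximates a proximity query. A standard-algorithm sketch:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash(lat, lon, precision=6):
    """Geohash encoding: alternate longitude/latitude bisection bits,
    5 bits per base32 character. Nearby points share a prefix."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits, nbits, even = 0, 0, True
    out = []
    while len(out) < precision:
        if even:  # longitude bit
            mid = (lon_lo + lon_hi) / 2
            bits = (bits << 1) | (lon >= mid)
            lon_lo, lon_hi = (mid, lon_hi) if lon >= mid else (lon_lo, mid)
        else:     # latitude bit
            mid = (lat_lo + lat_hi) / 2
            bits = (bits << 1) | (lat >= mid)
            lat_lo, lat_hi = (mid, lat_hi) if lat >= mid else (lat_lo, mid)
        even = not even
        nbits += 1
        if nbits == 5:
            out.append(BASE32[bits])
            bits, nbits = 0, 0
    return "".join(out)

print(geohash(57.64911, 10.40744, 11))  # u4pruydqqvj (the classic example)
```

Also mention the known weakness: two points can be meters apart yet sit in different cells across a boundary, which is why real systems query the cell plus its eight neighbors.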
5. Rate-limiting / throttling / quota
Examples: API gateway, DDoS protection, abuse detection.
Core moves:
- Token bucket or leaky bucket algorithm
- Per-user and per-IP limits
- Distributed counter with Redis + atomic INCR
- Sliding window vs fixed window tradeoffs
Trap: using a naive counter that has race conditions. Mention atomicity explicitly.
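Token bucket is short enough to write in the interview. A single-process sketch (the distributed version moves this state into Redis behind a Lua script or atomic operations, precisely so the check-and-decrement below cannot race):

```python
class TokenBucket:
    """Token bucket: up to `capacity` burst tokens, refilled continuously
    at `refill_rate` tokens/sec. A request is allowed if a token exists."""
    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        # refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, refill_rate=1)  # burst 3, 1 req/sec sustained
print([bucket.allow(t) for t in (0, 0, 0, 0, 1)])
# [True, True, True, False, True] — burst absorbed, 4th rejected, refill allows the 5th
```

The capacity parameter is the burst allowance and the refill rate is the sustained limit; being able to name both separately is usually what the interviewer is probing for.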
6. Payment / transactions / consistency-critical
Examples: Stripe, double-entry ledger, banking, inventory.
Core moves:
- Strong consistency in the hot path (Postgres, Spanner)
- Idempotency keys for every external-facing endpoint
- Two-phase commit or saga for multi-service transactions
- Audit log / event sourcing for compliance
Trap: hand-waving "we'll just use a database transaction" across services. You can't. Say saga.
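Idempotency keys are the easiest of these moves to demonstrate concretely: the client sends the same key on every retry of one logical charge, and the server replays the stored result instead of charging twice. A toy in-memory sketch (in production the key→result map lives in the transactional database and is written in the same transaction as the charge):

```python
class PaymentService:
    """Idempotent charge endpoint: retries with the same key return the
    original result instead of creating a second charge."""
    def __init__(self):
        self.results = {}  # idempotency_key -> stored response
        self.charges = []

    def charge(self, idempotency_key, amount):
        if idempotency_key in self.results:
            return self.results[idempotency_key]  # replay; no second charge
        self.charges.append(amount)
        response = {"charge_id": len(self.charges), "amount": amount}
        self.results[idempotency_key] = response
        return response

svc = PaymentService()
first = svc.charge("key-abc", 500)
retry = svc.charge("key-abc", 500)  # client retried after a network timeout
print(first == retry, len(svc.charges))  # True 1
```

The scenario to narrate: the client's request succeeds server-side but the response is lost in transit, the client retries, and without the key you've double-charged.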
What senior vs staff answers actually sound like
This distinction catches a lot of candidates. Both senior and staff candidates can produce a reasonable architecture. The difference is how they reason.
Senior (L5/L6 range):
- Produces a clean architecture and names the right components
- Knows the standard tradeoffs (consistency vs availability, push vs pull)
- Handles the obvious bottlenecks (DB sharding, caching)
- Answers "what if we get 10x traffic?" correctly
Staff (L6/L7 range):
- Explicitly asks about org context: "who owns this? what's the team topology?"
- Names the second-order failure modes (cache stampedes, thundering herds, cold-cache recovery)
- Discusses operational concerns (deployment, rollback, observability) unprompted
- Makes and defends non-obvious tradeoffs: "I'd pick eventual consistency here even though it makes the UI worse, because the business tolerates staleness but cannot tolerate downtime during a region failure"
- Articulates what they would not build and why
If you're interviewing for staff+, practice the meta-moves above. If you're interviewing at senior, focus on structure and fluency with the six patterns.
Back-of-envelope numbers to memorize
Having these in your head makes estimation feel natural:
- 100k DAU at 1 request/user/hour ≈ 27 QPS average; peak is typically 3–4× average, so call it ~100 QPS peak
- A single well-tuned Postgres instance: ~10k QPS for simple reads, ~1k for writes
- Redis: ~100k ops/sec per instance
- Kafka: ~1M messages/sec per cluster (cheap)
- SSD read: ~100 µs. Network round-trip same-region: ~0.5 ms. Cross-region: 50–100 ms.
- 1 TB/year = ~32 KB/sec average. Sanity-check your storage math against this.
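The DAU and storage rules above are plain unit conversions; working them through once is what makes them stick:

```python
# Sanity-check two of the memorized numbers with plain arithmetic.
seconds_per_year = 365 * 86_400          # 31,536,000 seconds

kb_per_sec = 1e12 / seconds_per_year / 1e3
print(round(kb_per_sec, 1))              # ~31.7 KB/sec sustained = 1 TB/year

avg_qps = 100_000 / 3_600
print(round(avg_qps, 1))                 # ~27.8 QPS for 100k DAU, 1 req/user/hour
```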
Common mistakes that tank otherwise-strong candidates
- Drawing boxes before asking about requirements. Get scale and access pattern first.
- "I'd just use Kubernetes." Not a system design answer. Name the algorithmic and data choices.
- Forgetting the database schema. Interviewers love to ask. Have it sketched.
- No discussion of failure modes. What happens when the cache goes down? When a region fails?
- Over-engineering for imaginary scale. If the prompt says 1k QPS, don't propose a 50-service microservice mesh.
Practice plan for the next two weeks
Day 1–3: read through the six patterns, pick one example of each, and sketch it from scratch on paper.
Day 4–7: practice prompts out loud, 45-minute timer, speaking through the 7-step structure. Record yourself.
Day 8–10: two mock interviews with a friend. The friend's only job is to keep asking "why?" and "what happens when that fails?"
Day 11–14: taper. Do one prompt a day to stay sharp. Sleep.
Where to go next
- Pair this with behavioral prep from our STAR method deep dive
- Technical fundamentals in the software engineer interview questions guide
- For senior+ role targeting, see the software engineer salary guide to anchor your negotiation
- Structure your resume to reflect system-level impact — the software engineer resume example is a good reference
The bottom line
System design interviews reward structure over novelty. Run the 7-step flow every time. Know the six patterns cold. Be explicit about tradeoffs. Time-box ruthlessly. Candidates who do this consistently pass the bar regardless of whether they've seen the exact prompt before — because the prompt is always a recombination of things you've seen.
