Learnixo
Backend · Advanced · 13 min read

System Design Interview

Design a High-Throughput Event Ingestion API in .NET

50K events/sec on Channel<T>, batch writes, backpressure, and when to bypass EF Core for Dapper

Key outcome: 50K events/sec with predictable p99 latency
Tags: System Design, .NET, C#, Performance, Channel, EF Core, Dapper, Backpressure, Interview Prep

The Interview Question

"Design an API that ingests 50,000 events per second from IoT devices and webhooks. Events must be persisted reliably with low latency on the accept path. How would you build this in .NET?"

This separates candidates who know ASP.NET request handling from those who understand backpressure, batching, and when EF Core is the wrong tool.


Step 1: Requirements

Functional

  • POST /events — accept JSON event batch (1–100 events per request)
  • Events queryable within 30 seconds for dashboards
  • Dead-letter queue for malformed payloads

Non-functional

  • 50,000 events/sec sustained, bursts to 80K
  • Accept path p99 under 20ms (return 202 quickly)
  • No event loss on process crash (durability after ACK)
  • Horizontal scale across 10 API instances

Event size: ~500 bytes average → ~25 MB/sec raw ingress.
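
The sizing arithmetic behind these numbers can be sketched as below. The helper names are illustrative, the even spread across instances is an assumption about the load balancer, and the 500-event flush size is the one used in Step 3.

```csharp
using System;

public static class IngestSizing
{
    // Raw ingress in MB/s (decimal megabytes) for a given event rate and average size.
    public static double MegabytesPerSecond(int eventsPerSecond, int avgEventBytes) =>
        eventsPerSecond * (double)avgEventBytes / 1_000_000;

    // Events per second each instance must absorb, assuming an even spread.
    public static int PerInstance(int eventsPerSecond, int instances) =>
        eventsPerSecond / instances;

    // Bulk writes per second per instance when flushing every `flushSize` events.
    public static double FlushesPerSecond(int perInstanceRate, int flushSize) =>
        (double)perInstanceRate / flushSize;
}
```

With the figures above this gives 25 MB/s sustained and 40 MB/s at the 80K burst, 5,000 events/sec per instance, and about 10 bulk writes per second per instance (roughly 100 per second against PostgreSQL across all 10 instances).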


Step 2: Wrong Approaches

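Synchronous EF SaveChanges per request. One DB round-trip per HTTP request. At 50K events/sec with 1–100 events per request, that is between 500 and 50,000 commits per second, far more than a single PostgreSQL primary can be expected to sustain. Even at 1K RPS, connection pool exhaustion is likely.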

Unbounded in-memory queue. Process crash = lost events. OOM under burst traffic.

Single auto-increment ID generator. A central sequence is a bottleneck and a single point of failure; use Snowflake-style IDs or DB-independent UUIDs generated on each API instance.
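
A minimal sketch of a Snowflake-style generator, to make the alternative concrete. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) follows the commonly cited scheme; the class name, the custom epoch, and the way each instance gets its worker ID are assumptions, not something the case study specifies.

```csharp
using System;

public sealed class SnowflakeIdGenerator
{
    private const int WorkerBits = 10, SequenceBits = 12;
    private const long MaxSequence = (1L << SequenceBits) - 1;

    // Custom epoch (assumed): keeps the timestamp part small.
    private static readonly long Epoch =
        new DateTimeOffset(2024, 1, 1, 0, 0, 0, TimeSpan.Zero).ToUnixTimeMilliseconds();

    private readonly long _workerId;
    private readonly object _gate = new();
    private long _lastMs = -1;
    private long _sequence;

    public SnowflakeIdGenerator(int workerId)
    {
        if (workerId < 0 || workerId >= (1 << WorkerBits))
            throw new ArgumentOutOfRangeException(nameof(workerId));
        _workerId = workerId;
    }

    public long NextId()
    {
        lock (_gate)
        {
            var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
            if (now < _lastMs) now = _lastMs; // clock moved backwards: reuse the last timestamp

            if (now == _lastMs)
            {
                _sequence = (_sequence + 1) & MaxSequence;
                if (_sequence == 0)
                {
                    // 4,096 IDs already issued in this millisecond: wait for the next one.
                    while ((now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds()) <= _lastMs) { }
                }
            }
            else
            {
                _sequence = 0;
            }

            _lastMs = now;
            return ((now - Epoch) << (WorkerBits + SequenceBits))
                 | (_workerId << SequenceBits)
                 | _sequence;
        }
    }
}
```

IDs from one generator are strictly increasing, and IDs from different instances cannot collide as long as each instance has a distinct worker ID, so no database round-trip is needed to assign them.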


Step 3: Accept Fast, Process Async

POST /events
  1. Validate schema (minimal: type, timestamp, tenantId)
  2. Try to write to Channel (bounded capacity, non-blocking)
  3. On success → return 202 Accepted immediately
  4. On channel full → return 503 with Retry-After

Background consumer:
  5. Drain channel into a buffer; flush at 500 events or after 100 ms
  6. Bulk INSERT to PostgreSQL (COPY or multi-row INSERT)
C#
using System.Threading.Channels;
using Microsoft.AspNetCore.Mvc;

public class EventIngestionService
{
    private readonly Channel<EventBatch> _channel;

    public EventIngestionService()
    {
        // 10,000 batches of up to 100 events each: size this against available memory.
        _channel = Channel.CreateBounded<EventBatch>(new BoundedChannelOptions(10_000)
        {
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = false,
            SingleWriter = false,
        });
    }

    // Read side, consumed by the background writer.
    public ChannelReader<EventBatch> Reader => _channel.Reader;

    // Non-blocking: with FullMode.Wait, TryWrite returns false as soon as the channel is full.
    // Awaiting WaitToWriteAsync instead would park the request until space frees up,
    // so the 503 path would never be reached.
    public bool TryEnqueue(EventBatch batch) => _channel.Writer.TryWrite(batch);
}

[ApiController]
[Route("events")]
public class EventsController(EventIngestionService ingestion) : ControllerBase
{
    [HttpPost]
    public IActionResult Ingest([FromBody] EventBatchDto dto)
    {
        var batch = EventBatch.From(dto);
        if (!ingestion.TryEnqueue(batch))
        {
            Response.Headers["Retry-After"] = "5"; // seconds
            return StatusCode(503, new { retryAfterSeconds = 5 });
        }

        return Accepted(new { batchId = batch.Id });
    }
}
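
The 503 path relies on a bounded channel reporting "full" to a non-blocking writer. A small self-contained sketch of that behaviour, using a capacity of 2 instead of 10,000 so it is easy to follow (the `BackpressureDemo` name is illustrative):

```csharp
using System;
using System.Threading.Channels;

public static class BackpressureDemo
{
    // Returns the TryWrite results for three writes into a capacity-2 channel,
    // followed by one more write after a single item has been read.
    public static bool[] Run()
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(capacity: 2)
        {
            FullMode = BoundedChannelFullMode.Wait,
        });

        var first = channel.Writer.TryWrite(1);   // true: slot 1 of 2
        var second = channel.Writer.TryWrite(2);  // true: slot 2 of 2
        var third = channel.Writer.TryWrite(3);   // false: channel is full
        channel.Reader.TryRead(out _);            // the consumer frees one slot
        var fourth = channel.Writer.TryWrite(4);  // true: space is available again

        return new[] { first, second, third, fourth };
    }
}
```

A `false` result is the signal to answer 503. Note that with the `Drop*` full modes `TryWrite` always succeeds and silently discards data, which is why `Wait` is the mode to pair with an explicit rejection.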

Step 4: Batched Persistence

C#
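using System.Diagnostics;
using System.Threading.Channels;
using Dapper;
using Npgsql;

// Assumes ChannelReader<EventBatch> and NpgsqlDataSource are registered in DI, and that each
// event exposes Id, TenantId, Type, PayloadJson and ReceivedAt (illustrative names).
public class EventBatchWriter(ChannelReader<EventBatch> reader, NpgsqlDataSource dataSource) : BackgroundService
{
    private const int MaxBatchSize = 500;
    private static readonly TimeSpan MaxDelay = TimeSpan.FromMilliseconds(100);
    private readonly List<Event> _buffer = new(MaxBatchSize);

    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        while (await reader.WaitToReadAsync(ct))
        {
            // Fill the buffer until it holds 500 events or 100 ms have passed,
            // so a quiet period cannot leave events stranded in memory.
            var started = Stopwatch.GetTimestamp();
            while (_buffer.Count < MaxBatchSize && Stopwatch.GetElapsedTime(started) < MaxDelay)
            {
                if (reader.TryRead(out var batch)) _buffer.AddRange(batch.Events);
                else await Task.Delay(5, ct);
            }
            await FlushAsync(ct);
        }
    }

    private async Task FlushAsync(CancellationToken ct)
    {
        // One multi-row INSERT per flush: a single round-trip instead of one per event.
        // Typically much faster than per-row EF inserts; measure on your own workload.
        const string sql = """
            INSERT INTO events (id, tenant_id, type, payload, received_at)
            SELECT * FROM UNNEST(@ids, @tenantIds, @types, @payloads::jsonb[], @receivedAt)
            """;
        var args = new
        {
            ids = _buffer.Select(e => e.Id).ToArray(),
            tenantIds = _buffer.Select(e => e.TenantId).ToArray(),
            types = _buffer.Select(e => e.Type).ToArray(),
            payloads = _buffer.Select(e => e.PayloadJson).ToArray(),
            receivedAt = _buffer.Select(e => e.ReceivedAt).ToArray(),
        };
        await using var connection = await dataSource.OpenConnectionAsync(ct);
        await connection.ExecuteAsync(new CommandDefinition(sql, args, cancellationToken: ct));
        _buffer.Clear();
    }
}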

EF Core vs Dapper vs COPY — Decision Matrix

| Tool | Throughput | When to use |
|------|------------|-------------|
| EF Core AddRange + SaveChanges | Low | Complex domain logic per event |
| Dapper multi-row INSERT | High | Simple event log, no change tracking |
| PostgreSQL COPY / BinaryImporter | Highest | 50K+ rows/sec, append-only log |

Interview answer: "EF Core is wrong for the hot ingestion path — I'd use Dapper or Npgsql COPY. EF stays on the read/admin side if needed."
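
For the COPY row of the matrix, a hedged sketch using Npgsql's binary importer. The `EventRow` record and `EventCopyWriter` class are hypothetical, and `tenant_id` is assumed to be a `text` column; only the column list comes from the INSERT shown in Step 4.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Npgsql;
using NpgsqlTypes;

// Hypothetical row shape; the case study does not define the event type.
public sealed record EventRow(
    Guid Id, string TenantId, string Type, string PayloadJson, DateTime ReceivedAtUtc);

public static class EventCopyWriter
{
    // Streams all rows through a single COPY ... FROM STDIN and returns the row count.
    public static async Task<ulong> CopyAsync(
        NpgsqlDataSource dataSource, IReadOnlyList<EventRow> rows, CancellationToken ct)
    {
        await using var connection = await dataSource.OpenConnectionAsync(ct);
        await using var importer = await connection.BeginBinaryImportAsync(
            "COPY events (id, tenant_id, type, payload, received_at) FROM STDIN (FORMAT BINARY)",
            ct);

        foreach (var row in rows)
        {
            await importer.StartRowAsync(ct);
            await importer.WriteAsync(row.Id, NpgsqlDbType.Uuid, ct);
            await importer.WriteAsync(row.TenantId, NpgsqlDbType.Text, ct);
            await importer.WriteAsync(row.Type, NpgsqlDbType.Text, ct);
            await importer.WriteAsync(row.PayloadJson, NpgsqlDbType.Jsonb, ct);
            // timestamptz expects a DateTime with Kind = Utc.
            await importer.WriteAsync(row.ReceivedAtUtc, NpgsqlDbType.TimestampTz, ct);
        }

        // Without CompleteAsync the import is discarded when the importer is disposed.
        return await importer.CompleteAsync(ct);
    }
}
```

COPY skips per-statement parsing and parameter binding, which is where its throughput advantage comes from. The trade-off is all-or-nothing failure handling: one bad row aborts the whole import, so validation has to happen before rows reach this path.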


Step 5: Architecture

  Devices / Webhooks
         │
         ▼
┌─────────────────┐   bounded    ┌──────────────────┐
│  Kestrel (×10)  │──Channel────▶│  BatchWriter     │
│  POST /events   │              │  (per instance)  │
│  → 202 Accepted │              └────────┬─────────┘
└─────────────────┘                       │
                                          ▼
                          ┌───────────────────────────────┐
                          │  PostgreSQL                   │
                          │  events (partitioned by day)  │
                          └───────────────────────────────┘

Partitioning: PARTITION BY RANGE (received_at) — drop old partitions instead of DELETE.
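
A small sketch of how the daily partition DDL could be generated, for example from a scheduled job that creates tomorrow's partition and drops expired ones. The `events_yyyy_MM_dd` naming scheme and the `EventPartitions` helper are assumptions; the parent table is taken to be declared with `PARTITION BY RANGE (received_at)` as described above.

```csharp
using System;
using System.Globalization;

public static class EventPartitions
{
    private static string Name(DateOnly day) =>
        "events_" + day.ToString("yyyy_MM_dd", CultureInfo.InvariantCulture);

    // DDL for one daily partition. The upper bound of a range partition is exclusive.
    public static string CreateSql(DateOnly day)
    {
        var from = day.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        var to = day.AddDays(1).ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        return $"CREATE TABLE IF NOT EXISTS {Name(day)} PARTITION OF events " +
               $"FOR VALUES FROM ('{from}') TO ('{to}');";
    }

    // Dropping a partition is a metadata operation, unlike a DELETE over millions of rows.
    public static string DropSql(DateOnly day) => $"DROP TABLE IF EXISTS {Name(day)};";
}
```

For 31 January 2025, `CreateSql` yields `CREATE TABLE IF NOT EXISTS events_2025_01_31 PARTITION OF events FOR VALUES FROM ('2025-01-31') TO ('2025-02-01');`. If `received_at` is `timestamptz`, the bounds are interpreted in the session time zone, so the job should run with a fixed one (UTC is the usual choice).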

Alternative at higher scale: Accept path writes to Kafka / Azure Event Hubs, separate consumer group does DB writes. Adds operational complexity but decouples ingest from persistence completely.


Step 6: Backpressure & Health

C#
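using System.Threading.Channels;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Readiness check: report Unhealthy once the channel is 90% full.
// Degraded maps to HTTP 200 by default, so it would not take the pod out of rotation.
public class IngestionHealthCheck(ChannelReader<EventBatch> reader) : IHealthCheck
{
    // Channel<T> does not expose its capacity, so it is repeated here (10,000 as in Step 3).
    private const int Capacity = 10_000;

    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext ctx, CancellationToken ct = default)
    {
        // ChannelReader<T>.Count is supported for bounded channels.
        var utilization = (double)reader.Count / Capacity;
        return Task.FromResult(utilization > 0.9
            ? HealthCheckResult.Unhealthy("Ingestion backlog high")
            : HealthCheckResult.Healthy());
    }
}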

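When this check backs the readiness probe and the probe fails, Kubernetes removes the overloaded pod from rotation → load shifts to healthy instances. If every pod is saturated this only moves the problem, so the 503 path remains the primary backpressure signal.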

Rate limiting: Per-tenant token bucket in Redis — prevents one tenant from starving others.
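
The token bucket logic itself can be sketched in-process as below. This is an in-memory stand-in to show the algorithm, not the Redis-backed version named above: sharing the limit across 10 instances would need the same refill-and-deduct step to run atomically in Redis. The class name and the capacity and refill parameters are illustrative.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class TenantRateLimiter
{
    private sealed class Bucket
    {
        public double Tokens;
        public DateTimeOffset LastRefill;
    }

    private readonly ConcurrentDictionary<string, Bucket> _buckets = new();
    private readonly double _capacity;
    private readonly double _refillPerSecond;

    public TenantRateLimiter(double capacity, double refillPerSecond)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
    }

    // Returns true and deducts `cost` tokens if the tenant has enough budget at time `now`.
    public bool TryAcquire(string tenantId, int cost, DateTimeOffset now)
    {
        // A new tenant starts with a full bucket.
        var bucket = _buckets.GetOrAdd(
            tenantId, _ => new Bucket { Tokens = _capacity, LastRefill = now });

        lock (bucket)
        {
            // Refill in proportion to elapsed time, capped at the bucket capacity.
            var elapsed = (now - bucket.LastRefill).TotalSeconds;
            if (elapsed > 0)
            {
                bucket.Tokens = Math.Min(_capacity, bucket.Tokens + elapsed * _refillPerSecond);
                bucket.LastRefill = now;
            }

            if (bucket.Tokens < cost) return false;
            bucket.Tokens -= cost;
            return true;
        }
    }
}
```

Charging `cost` as the number of events in the request, rather than 1 per request, keeps a tenant that sends 100-event batches from getting 100 times the budget of one that sends single events.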


Step 7: Durability Trade-off

| ACK timing | Durability | Latency |
|------------|------------|---------|
| 202 after Channel write | Lost if process crashes before flush | Lowest |
| 202 after DB commit | Fully durable | Higher (defeats purpose) |
| 202 after Channel + WAL spill to disk | Durable with local file | Medium |

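Production pattern: Write to Channel, background flusher persists to DB. This gives the lowest latency but does not meet the "no event loss on process crash" requirement from Step 1: anything still in the channel or the flush buffer is lost with the process. For stricter durability, spill to a local append-only file (or Redis Stream) before ACK.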
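
A minimal sketch of the local append-only spill file, assuming a simple length-prefixed record format. The `SpillLog` class, the 4-byte little-endian header, and the replay method are all illustrative choices, not a format the case study prescribes.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.IO;

public sealed class SpillLog : IDisposable
{
    private readonly FileStream _file;
    private readonly object _gate = new();

    public SpillLog(string path) =>
        _file = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read);

    // Appends one length-prefixed record and forces it to disk before returning,
    // so the caller can ACK only once the bytes would survive a process crash.
    public void Append(ReadOnlySpan<byte> payload)
    {
        Span<byte> header = stackalloc byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(header, payload.Length);
        lock (_gate)
        {
            _file.Write(header);
            _file.Write(payload);
            _file.Flush(flushToDisk: true);
        }
    }

    public void Dispose() => _file.Dispose();

    // Replays records after a restart; a torn record at the tail is ignored.
    public static List<byte[]> ReadAll(string path)
    {
        var records = new List<byte[]>();
        using var file = File.OpenRead(path);
        var header = new byte[4];
        while (file.ReadAtLeast(header, 4, throwOnEndOfStream: false) == 4)
        {
            var payload = new byte[BinaryPrimitives.ReadInt32LittleEndian(header)];
            if (file.ReadAtLeast(payload, payload.Length, throwOnEndOfStream: false) != payload.Length)
                break;
            records.Add(payload);
        }
        return records;
    }
}
```

Flushing to disk on every request is what moves this option to "Medium" latency in the table. Batching several requests per flush (group commit) is the usual way to claw some of that back, at the cost of holding each 202 until its group has been flushed.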


What Interviewers Are Testing

  1. Separation of accept and persist paths — never block HTTP on DB
  2. Channel<T> — bounded, backpressure, in-process producer-consumer
  3. Batching — amortise DB round-trips
  4. Tool choice — articulate why not EF on hot path
  5. Horizontal scale — stateless API, partition-aware DB
  6. Observability — channel depth metric, flush latency, events/sec
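
The three observability signals in the last point could be exposed with `System.Diagnostics.Metrics` roughly as follows. The meter name, instrument names, and the `IngestionMetrics` wrapper are illustrative; the channel depth is passed in as a delegate so the sketch does not depend on how the channel is held.

```csharp
using System;
using System.Diagnostics.Metrics;

public sealed class IngestionMetrics : IDisposable
{
    private readonly Meter _meter = new("Ingestion");
    private readonly Counter<long> _eventsAccepted;
    private readonly Histogram<double> _flushMs;

    public IngestionMetrics(Func<int> channelDepth)
    {
        // Counter: exporters derive events/sec from its rate of change.
        _eventsAccepted = _meter.CreateCounter<long>("ingestion.events.accepted");
        // Histogram: flush latency distribution, for p50/p99 on the persist path.
        _flushMs = _meter.CreateHistogram<double>("ingestion.flush.duration", unit: "ms");
        // Observable gauge: sampled when metrics are collected, not on the hot path.
        _meter.CreateObservableGauge("ingestion.channel.depth", channelDepth);
    }

    public void EventsAccepted(int count) => _eventsAccepted.Add(count);

    public void FlushCompleted(TimeSpan elapsed) => _flushMs.Record(elapsed.TotalMilliseconds);

    public void Dispose() => _meter.Dispose();
}
```

Channel depth is the leading indicator here: it rises before flush latency degrades and well before requests start receiving 503.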

Strong closing: "I'd load-test with NBomber or k6, watch Gen2 GC and pool exhaustion, and only add Kafka when single-region PostgreSQL batching stops meeting SLO — not before."
