Kafka Architecture: Topics, Partitions, Offsets & Consumer Groups
Understand Apache Kafka from the ground up: brokers, topics, partitions, replication, offsets, consumer groups, delivery guarantees, and when to use Kafka vs a message queue.
What Is Kafka?
Apache Kafka is a distributed event streaming platform. Unlike traditional message queues, Kafka:
- Persists messages: events are stored on disk (configurable retention, default 7 days)
- Replays at will: consumers can re-read old events
- Scales horizontally: partitions distribute load across brokers
- Decouples producers from consumers: many consumers can read the same event
When to use Kafka vs SQS/RabbitMQ:
| Kafka | SQS / RabbitMQ |
|-------|----------------|
| Event streaming (retain & replay) | Task queues (process once, discard) |
| Multiple consumers per event | Each message consumed once |
| High throughput (millions/sec) | Moderate throughput |
| Complex event processing | Simple job dispatch |
| Audit logs, CDC, analytics | Background jobs, notifications |
Core Architecture
Producers
    ↓ (publish events)
Kafka Cluster
 ├── Broker 1
 ├── Broker 2
 └── Broker 3
    ↓ (subscribe)
Consumer Groups
 ├── Group A (analytics-service)
 └── Group B (notification-service)

Topics
A topic is a named stream of events, like a table in a database or a channel in a chat app. Events are appended to the topic and retained for the configured period.
Topic: "appointments"
──────────────────────────────────────────────▶
0: {id:"APT-1", event:"created",   clinic:"CLN-001"}
1: {id:"APT-1", event:"confirmed", clinic:"CLN-001"}
2: {id:"APT-2", event:"created",   clinic:"CLN-002"}
3: {id:"APT-1", event:"cancelled", clinic:"CLN-001"}

Partitions
Each topic is split into partitions: ordered, immutable sequences of events. Partitions enable parallelism: multiple brokers host different partitions, and multiple consumers read different partitions concurrently.
Topic: "appointments" (3 partitions)
Partition 0: [0, 3, 6, 9, ...] → events for CLN-001
Partition 1: [0, 1, 4, 7, ...] → events for CLN-002
Partition 2: [0, 2, 5, 8, ...] → events for CLN-003

The partition key determines which partition a message goes to. Same key → same partition → guaranteed ordering per key.
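The key-to-partition mapping can be sketched as a stable hash mod the partition count. Real clients use murmur2 over the key bytes; the md5 below is purely illustrative, as is the function name.

```python
import hashlib

def pick_partition(key: str, num_partitions: int) -> int:
    """Sketch of a default partitioner: hash the key, mod the partition count."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same key always lands on the same partition, which is what gives
# Kafka its per-key ordering guarantee.
assert pick_partition("CLN-001", 3) == pick_partition("CLN-001", 3)
assert 0 <= pick_partition("CLN-002", 3) < 3
```

Note that changing the partition count changes the mapping, which is one reason partition counts are usually chosen generously up front.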
# Messages with the same key go to the same partition
producer.send("appointments",
    key="CLN-001",  # all CLN-001 events → same partition → ordered
    value=event_data
)

Offsets
Each message within a partition has a sequential offset (0, 1, 2, ...). Consumers track their position by committing offsets; this is how Kafka knows what each consumer has processed.
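The bookkeeping can be sketched without a broker. The class and method names below are illustrative, not a real client API:

```python
class PartitionCursor:
    """Sketch of per-partition offset tracking for one consumer group."""

    def __init__(self):
        self.committed = 0  # offset of the next record to process

    def poll(self, log):
        # Return records at and after the committed position.
        return log[self.committed:]

    def commit(self, offset):
        # Committing offset N means "resume at N + 1 after a restart".
        self.committed = offset + 1

log = ["created", "confirmed", "cancelled"]
cursor = PartitionCursor()
assert cursor.poll(log) == log            # fresh group: starts at offset 0
cursor.commit(1)                           # processed offsets 0 and 1
assert cursor.poll(log) == ["cancelled"]  # a restart resumes at offset 2
```

This is why a crashed consumer re-reads from its last committed offset rather than losing its place.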
Partition 0: [0][1][2][3][4][5]
                      ↑
       committed offset (consumer has processed up to here)

Replication
Each partition has a leader and replicas on other brokers:
Partition 0:
  Broker 1 → LEADER   (receives writes, serves reads)
  Broker 2 → FOLLOWER (replicates from leader)
  Broker 3 → FOLLOWER (replicates from leader)

If Broker 1 fails, Kafka elects a new leader from the followers. Set replication.factor=3 for production.
With min.insync.replicas=2, a write sent with acks=all is only acknowledged once at least 2 replicas (including the leader) have written it. This prevents data loss on broker failure.
Consumer Groups
A consumer group is a set of consumers that share the work of reading from a topic. Each partition is assigned to exactly one consumer within the group:
Topic: "appointments" (3 partitions)
Consumer Group "analytics":
  Consumer A → reads Partition 0
  Consumer B → reads Partition 1
  Consumer C → reads Partition 2

Two different consumer groups read independently; they each get all the events:
Group "analytics"     → sees all events (for analytics)
Group "notifications" → sees all events (for sending reminders)

Rebalancing: when a consumer joins or leaves, Kafka redistributes partitions. During a rebalance, the group pauses briefly.
Consumer Group Sizing Rule
You can't have more consumers than partitions; extra consumers sit idle:

3 partitions, 5 consumers → only 3 consume, 2 are idle

Scale partitions when you need more parallelism.
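The assignment rule can be sketched as follows. Round-robin is shown for illustration; Kafka's actual assignor is configurable (range, round-robin, cooperative sticky):

```python
def assign(partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Sketch: distribute partition numbers across group members round-robin."""
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 3 partitions, 5 consumers: each partition goes to exactly one consumer,
# so two consumers end up with nothing to read.
result = assign(3, ["A", "B", "C", "D", "E"])
assert result == {"A": [0], "B": [1], "C": [2], "D": [], "E": []}
```

Whatever the assignor, the invariant is the same: each partition belongs to exactly one consumer within a group at any moment.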
Delivery Guarantees
| Guarantee | How | Risk |
|-----------|-----|------|
| At most once | Commit offset before processing | Message loss if processing fails |
| At least once | Commit offset after processing | Duplicate processing on retry |
| Exactly once | Idempotent producer + transactional consumer | Complex, some overhead |
At-least-once is the default and most common. Design your consumers to be idempotent (safe to re-process the same message).
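A minimal idempotency sketch, assuming each event carries a unique id (the names here are illustrative). In production the seen-id store would be a database table or cache that shares a transaction with the side effect:

```python
seen_ids = set()  # stand-in for a durable dedupe store
applied = []      # stand-in for the real side effect (DB write, email, ...)

def handle(event):
    """Process an event at-least-once delivery safely: skip duplicates."""
    if event["id"] in seen_ids:
        return  # redelivered message: side effect already applied
    applied.append(event)
    seen_ids.add(event["id"])

handle({"id": "APT-1", "event": "created"})
handle({"id": "APT-1", "event": "created"})  # redelivered after a retry
assert len(applied) == 1  # the duplicate had no effect
```

Alternatively, make the side effect naturally idempotent (e.g. an upsert keyed by event id) and skip explicit dedupe entirely.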
Key Configuration Parameters
Producer
| Config | Recommended | Effect |
|--------|------------|--------|
| acks=all | Production | Wait for all ISR replicas to confirm |
| retries=2147483647 | Production | Retry on transient failures |
| enable.idempotence=true | Production | No duplicate writes from producer retries |
| compression.type=snappy | Production | Compress batches, reducing network I/O |
| linger.ms=5 | High throughput | Buffer messages for 5ms to batch |
| batch.size=65536 | High throughput | 64KB batch size |
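The settings above, expressed as a config dict in Kafka's own key names (the form accepted by clients such as confluent-kafka; the broker address is an assumption):

```python
# Production-leaning producer configuration from the table above.
producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: local broker
    "acks": "all",                 # wait for all in-sync replicas
    "retries": 2147483647,         # effectively infinite retries
    "enable.idempotence": True,    # broker dedupes producer retries
    "compression.type": "snappy",  # compress batches on the wire
    "linger.ms": 5,                # small delay to fill batches
    "batch.size": 65536,           # 64 KB batches
}
```

With a confluent-kafka client this dict would be passed straight to the `Producer` constructor; other clients use the same values under their own parameter names.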
Consumer
| Config | Recommended | Effect |
|--------|------------|--------|
| auto.offset.reset=earliest | Production | Read from the beginning when a group has no committed offset |
| enable.auto.commit=false | Production | Manual commit for reliability |
| max.poll.records=500 | Production | Process 500 records per poll |
| session.timeout.ms=45000 | Production | 45s before consumer is declared dead |
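And the consumer side as a config dict (group name and broker address are illustrative; key availability varies by client, e.g. max.poll.records exists in the Java and kafka-python consumers but not in librdkafka-based ones):

```python
# Production-leaning consumer configuration from the table above.
consumer_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: local broker
    "group.id": "analytics",                # illustrative group name
    "auto.offset.reset": "earliest",        # no committed offset → start at 0
    "enable.auto.commit": False,            # commit manually after processing
    "max.poll.records": 500,                # cap records returned per poll
    "session.timeout.ms": 45000,            # 45s before the group evicts us
}
```

Manual commits pair with the at-least-once pattern above: process the batch, then commit, accepting possible duplicates on retry.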
Topics in Practice
Naming Conventions
<domain>.<entity>.<event>
appointments.appointments.created
appointments.appointments.status_changed
appointments.appointments.cancelled
patients.patients.registered
calls.calls.started
calls.calls.ended
calls.transcripts.created

Topic Configuration (via Terraform)
resource "kafka_topic" "appointments" {
  name               = "appointments.appointments.created"
  replication_factor = 3
  partitions         = 12 # 12 allows up to 12 parallel consumers
  config = {
    "retention.ms"        = "604800000" # 7 days
    "cleanup.policy"      = "delete"
    "min.insync.replicas" = "2"
    "compression.type"    = "snappy"
    "max.message.bytes"   = "1048576" # 1MB max message
  }
}

Log Compaction (for state topics)
resource "kafka_topic" "clinic_state" {
  name       = "clinics.clinics.state"
  partitions = 6
  config = {
    "cleanup.policy" = "compact" # keep latest value per key
    "retention.ms"   = "-1"      # never delete
  }
}

Compacted topics keep only the latest message per key, making them ideal for a "current state" view:
Key: CLN-001 → only the most recent clinic record
Key: CLN-002 → only the most recent clinic record

Kafka vs Alternatives
| Use Case | Use |
|----------|-----|
| High-throughput event streaming | Kafka |
| Simple background jobs | SQS, RabbitMQ |
| Real-time analytics pipeline | Kafka |
| Dead-letter queue with visibility timeout | SQS |
| Fan-out to many consumers | Kafka or SNS + SQS |
| Ordered processing per entity | Kafka (partition by key) |
| Serverless (Lambda triggers) | SQS, SNS |