Publish/Subscribe
Master the Pub/Sub pattern: topics, subscriptions, fan-out, message filtering, durable subscriptions, competing consumers, at-least-once delivery, and how Pub/Sub compares to direct messaging and point-to-point queues.
Publish/Subscribe (Pub/Sub) is the pattern that makes event-driven architecture possible at scale. It is simple in concept — publishers send messages to a topic; subscribers receive them — but its power comes from the decoupling it creates and the topology flexibility it enables. This lesson covers how Pub/Sub works, the delivery guarantees it can offer, the filter and routing capabilities that make it practical, and the operational realities of running Pub/Sub systems in production.
What Pub/Sub Is
Publish/Subscribe is a messaging pattern where:
- Publishers send messages to a named topic (a logical channel) without knowing who will receive them
- Subscribers register interest in a topic and receive all messages published to it
- The broker manages the topic, stores messages, and delivers them to all subscribers
```
Publisher A ──►
Publisher B ──► [ order.placed topic ] ──► Subscriber 1 (Inventory)
                                       ──► Subscriber 2 (Billing)
                                       ──► Subscriber 3 (Analytics)
                                       ──► Subscriber 4 (Notification)
```

The defining property: each message is delivered to every subscriber (fan-out). This is different from a queue, where each message is delivered to exactly one consumer.
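To make fan-out concrete, here is a minimal in-memory sketch (the `Topic` class and its names are illustrative, not any real broker's API) of a topic that delivers each published message to every subscriber:

```python
from collections import defaultdict

class Topic:
    """Minimal in-memory topic: every subscription gets its own copy."""
    def __init__(self, name):
        self.name = name
        self.handlers = {}  # subscription name -> callback

    def subscribe(self, subscription, handler):
        self.handlers[subscription] = handler

    def publish(self, message):
        # Fan-out: one write by the publisher, one delivery per subscription.
        for subscription, handler in self.handlers.items():
            handler(dict(message))  # each subscriber receives its own copy

received = defaultdict(list)
orders = Topic("order.placed")
orders.subscribe("inventory", lambda m: received["inventory"].append(m))
orders.subscribe("billing",   lambda m: received["billing"].append(m))
orders.publish({"orderId": 1234})

print(received["inventory"])  # [{'orderId': 1234}]
print(received["billing"])    # [{'orderId': 1234}]
```

Note the contrast with a queue: here the publisher makes one `publish` call, and every subscription receives the full message independently.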
Pub/Sub vs. Point-to-Point Queue
These two patterns are often confused because they both use a broker and both involve async message delivery. They solve different problems:
| Property | Point-to-Point Queue | Pub/Sub Topic |
|----------|----------------------|---------------|
| Delivery | One consumer per message | All subscribers per message |
| Purpose | Work distribution / load balancing | Event broadcasting / fan-out |
| Consumer awareness | Consumers share a queue | Subscribers are independent |
| Adding a consumer | Shares the load | Receives a full copy of all messages |
| Typical use | Task queues, job processing | Event notifications, data pipelines |
```
Queue: 1 message → 1 consumer  (work is shared)
Topic: 1 message → N consumers (each gets their own copy)
```
Topics and Subscriptions
Topic
A topic is a named logical channel. Publishers do not know who subscribes to a topic — they simply write to it.
Topic naming conventions matter for discoverability and routing:
```
com.organisation.domain.entity.event-type

com.systemforge.orders.order.placed
com.systemforge.payments.payment.failed
com.systemforge.inventory.stock.low
```

Subscription
A subscription is a consumer's named interest in a topic. Each subscription maintains its own independent cursor (position) in the message stream — one subscriber falling behind does not affect others.
```
topic: order.placed
├── subscription: inventory-service  (cursor at message 1205)
├── subscription: billing-service    (cursor at message 1198)
└── subscription: analytics-pipeline (cursor at message 900)
```

Each subscription processes the message stream at its own pace. The analytics pipeline is behind — that is fine; it does not affect inventory or billing.
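The independent-cursor idea can be sketched as an append-only log with one offset per subscription (a simplification — a real broker persists these offsets durably):

```python
class TopicLog:
    """Append-only message log with an independent cursor per subscription."""
    def __init__(self):
        self.log = []
        self.cursors = {}  # subscription name -> next offset to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, subscription, max_count=10):
        # Each subscription reads from its own position; one slow
        # subscription never blocks or affects the others.
        start = self.cursors.get(subscription, 0)
        batch = self.log[start:start + max_count]
        self.cursors[subscription] = start + len(batch)
        return batch

topic = TopicLog()
for i in range(5):
    topic.publish({"seq": i})

fast = topic.poll("inventory-service", max_count=5)   # reads all 5
slow = topic.poll("analytics-pipeline", max_count=2)  # lags at offset 2
print(len(fast), len(slow))  # 5 2
```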
Durable vs. Non-Durable Subscriptions
Durable subscriptions persist even when the subscriber is offline. Messages that arrive while the subscriber is down are stored and delivered when it reconnects.
Non-durable subscriptions exist only while the subscriber is connected. Messages that arrive when the subscriber is offline are lost.
Use durable subscriptions for any production integration where message loss is not acceptable.
Fan-Out
Fan-out is the mechanism by which a single published message is delivered to multiple subscriptions independently:
```
message: OrderPlaced { orderId: 1234 }
   │
   ├──► inventory subscription → Inventory Service processes the message
   ├──► billing subscription   → Billing Service processes the message
   └──► notify subscription    → Notification Service processes the message
```

Fan-out is the primary value proposition of Pub/Sub. It means:
- The publisher needs to make one write to publish to all consumers
- Adding a new consumer requires zero changes to the publisher or any existing consumer
- Each consumer's failure is isolated — if Billing fails, Inventory continues processing
Message Filtering
Without filtering, every subscriber receives every message on the topic — including messages it does not care about. Filtering reduces unnecessary processing and network transfer.
Subscription-Level Filters
Most Pub/Sub brokers support filter expressions evaluated at the broker before delivery:
Azure Service Bus Topics — SQL filter:
```
-- Only deliver messages where region = 'EU'
region = 'EU'

-- Only deliver high-value orders
amount > 1000 AND currency = 'GBP'
```

AWS SNS — filter policy:

```
{
  "eventType": ["order.placed", "order.updated"],
  "region": ["EU", "UK"]
}
```

Google Cloud Pub/Sub — message filtering:

```
attributes.region = "EU" AND attributes.priority = "high"
```

Filters are evaluated against message attributes (metadata in the message header), not the message body. Keep filter-relevant data in attributes so the broker does not have to deserialize the payload for every message.
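A filter policy like the SNS example above can be modelled as a predicate over message attributes. This is a sketch of the basic matching semantics only (real SNS policies also support prefix, numeric, and anything-but matchers):

```python
def matches(filter_policy, attributes):
    """True if every policy key has an attribute value in its allowed list."""
    return all(
        attributes.get(key) in allowed
        for key, allowed in filter_policy.items()
    )

policy = {
    "eventType": ["order.placed", "order.updated"],
    "region": ["EU", "UK"],
}

# The broker evaluates this before delivery, so non-matching
# messages never reach the subscription at all.
print(matches(policy, {"eventType": "order.placed", "region": "EU"}))    # True
print(matches(policy, {"eventType": "order.cancelled", "region": "EU"})) # False
```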
Topic-Per-Event-Type vs. Filtering
Two strategies for managing multiple event types:
Separate topics per event type:
```
order.placed    ── subscribers for placed events
order.cancelled ── subscribers for cancelled events
order.shipped   ── subscribers for shipped events
```

Single topic with filtering:

```
orders topic ── inventory subscription (filter: type = 'placed')
             ── billing subscription   (filter: type = 'placed' OR 'cancelled')
             ── audit subscription     (no filter — receives all events)
```

Guidance:
- Separate topics when event volume is high and routing would be expensive
- Filtering when the events are naturally grouped and most consumers need multiple types
Competing Consumers (Queue + Pub/Sub Combined)
Pub/Sub fan-out and queue-based competing consumers are complementary:
```
topic: order.placed
├── subscription: inventory-workers
│   ├── consumer instance 1 ─┐
│   ├── consumer instance 2 ─┼── competing consumers sharing the subscription
│   └── consumer instance 3 ─┘
└── subscription: billing-workers
    ├── consumer instance 1
    └── consumer instance 2
```

Within each subscription, competing consumers share the workload — each message is delivered to exactly one consumer instance. This gives you both fan-out (across subscriptions) and horizontal scaling (within a subscription).
This pattern is native to:
- Azure Service Bus Topics — subscription = queue for competing consumers
- Apache Kafka — consumer group = competing consumers within the same partition
- AWS SNS + SQS — SNS fan-out to multiple SQS queues; each SQS queue has competing consumers
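The competing-consumers half of this topology can be sketched with a shared work queue: instances inside one subscription each take distinct messages, so the workload is divided rather than duplicated (worker names here are illustrative):

```python
import queue
import threading

subscription = queue.Queue()  # one subscription's message buffer
processed = {"worker-1": [], "worker-2": []}
lock = threading.Lock()

def worker(name):
    while True:
        try:
            msg = subscription.get(timeout=0.2)
        except queue.Empty:
            return  # no more work: shut this instance down
        with lock:
            # Each message is taken by exactly one worker instance.
            processed[name].append(msg)
        subscription.task_done()

for i in range(10):
    subscription.put({"orderId": i})

threads = [threading.Thread(target=worker, args=(n,)) for n in processed]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(len(msgs) for msgs in processed.values())
print(total)  # 10 — workload shared, no message processed twice
```

Fan-out would sit one level above this: each subscription gets its own copy of every message, and each subscription then runs its own pool of competing workers like the one above.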
Delivery Guarantees
At-Least-Once Delivery
The standard guarantee in most Pub/Sub systems: the broker will deliver the message to each subscription at least once. In failure scenarios (consumer crashes after processing but before acknowledging), the message may be redelivered.
Consequence: every consumer must be idempotent — processing the same message twice must have the same effect as processing it once.
Idempotency techniques:
- Store processed message IDs in a deduplication table with a TTL; check before processing
- Use conditional database writes (upsert / optimistic concurrency)
- Design state transitions to be re-entrant
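The deduplication-table technique can be sketched as follows (an in-memory illustration — a production version would persist the table and expire entries with a TTL):

```python
class IdempotentConsumer:
    """Skips messages whose ID has already been processed."""
    def __init__(self):
        self.processed_ids = set()  # stand-in for a dedup table with TTL
        self.balance = 0

    def handle(self, message):
        if message["messageId"] in self.processed_ids:
            return  # duplicate redelivery: safe to ignore
        self.balance += message["amount"]
        self.processed_ids.add(message["messageId"])

consumer = IdempotentConsumer()
msg = {"messageId": "m-1", "amount": 100}
consumer.handle(msg)
consumer.handle(msg)  # at-least-once redelivery of the same message
print(consumer.balance)  # 100, not 200
```

In a real system the ID check and the state change should commit atomically (same database transaction), otherwise a crash between the two reintroduces the duplicate-processing window.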
At-Most-Once Delivery
The broker delivers the message once and does not redeliver if delivery fails. Possible message loss. Appropriate only for high-volume, low-value events where occasional loss is acceptable (telemetry, metrics).
Exactly-Once Delivery
The hardest guarantee: each message delivered exactly once, with no duplicates and no loss. True exactly-once delivery requires coordination between the broker and the consumer's storage system (a distributed transaction). Kafka supports this with idempotent producers and its transactional API, but at a performance cost.
Practical advice: Build idempotent consumers rather than relying on exactly-once broker guarantees. Idempotency is more portable and more resilient than broker-level deduplication.
Message Ordering
Ordering Within a Topic
Most Pub/Sub systems do not guarantee global message ordering across all publishers. Within a single publisher, messages are usually ordered, but across multiple publishers they interleave unpredictably.
Kafka: ordering guaranteed within a partition. Messages for the same entity (same order ID, same customer ID) routed to the same partition via the partition key will arrive in order to any consumer in the same consumer group.
Azure Service Bus: ordering within a session (messages with the same SessionId are delivered in order to one consumer at a time).
Google Cloud Pub/Sub: ordering guaranteed when using an ordering key.
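Per-entity ordering in all three systems rests on the same idea: deterministic routing of a key to a partition (or session, or ordering key), so all messages for one entity travel through one ordered channel. A sketch of the routing step (Kafka's actual default partitioner uses murmur2 hashing; CRC32 here is just for illustration):

```python
import zlib

NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    # Deterministic hash: the same key always maps to the same partition,
    # so messages for one entity stay in order within that partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

p1 = partition_for("order-1234")
p2 = partition_for("order-1234")
print(p1 == p2)  # True — same entity, same partition, ordering preserved
```

The trade-off: ordering is only guaranteed per key, and a hot key concentrates all of its traffic on one partition.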
Designing for Out-of-Order Delivery
When ordering cannot be guaranteed, design consumers to handle it:
- Include a sequence number or event timestamp in the message
- Use a sequence buffer: hold messages until you have received all preceding messages before processing
- Apply the last-write-wins rule for state updates when re-ordering is acceptable
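The sequence-buffer technique above can be sketched like this (assuming each message carries a `seq` number starting at 0):

```python
class SequenceBuffer:
    """Holds out-of-order messages until all predecessors have arrived."""
    def __init__(self):
        self.next_seq = 0
        self.pending = {}    # seq -> message held back
        self.delivered = []

    def receive(self, message):
        self.pending[message["seq"]] = message
        # Release every message whose predecessors have all arrived.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

buf = SequenceBuffer()
for seq in [1, 0, 3, 2]:  # messages arrive out of order
    buf.receive({"seq": seq})
print([m["seq"] for m in buf.delivered])  # [0, 1, 2, 3]
```

A production version also needs a timeout or gap-detection policy, since an unbounded buffer waiting for a lost message will stall forever.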
Push vs. Pull Delivery
Push Delivery
The broker delivers messages to consumers by calling a consumer endpoint (webhook/HTTP). Consumers do not poll.
```
Broker ──► POST /webhook → Consumer Service
           (broker drives the delivery rate)
```

Advantages: Simple consumer — just expose an HTTP endpoint. The broker handles timing and retries.
Disadvantages: Consumer must be publicly reachable. Broker controls delivery rate — consumer must handle backpressure.
Pull Delivery
Consumers poll the broker to retrieve messages at their own pace.
```
Consumer Service ──► GET /messages?maxCount=10 → Broker
                 ◄── [batch of messages]       ◄──
```

Advantages: The consumer controls the rate. Works behind firewalls. Natural backpressure — the consumer fetches when ready.
Disadvantages: Consumer must actively poll. Latency depends on poll frequency.
Most enterprise messaging systems support pull. Many also support push for webhook-style integrations (AWS SNS → HTTP endpoint, Azure Event Grid, Google Cloud Pub/Sub push subscriptions).
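A pull consumer's core loop can be sketched as follows. The `FakeBroker` stand-in and its `pull` method are hypothetical, purely so the loop is runnable — real SDK calls differ per platform:

```python
import time

class FakeBroker:
    """Stand-in broker so the pull loop below is runnable."""
    def __init__(self, messages):
        self.messages = list(messages)

    def pull(self, max_count):
        batch = self.messages[:max_count]
        self.messages = self.messages[max_count:]
        return batch

def consume(broker, handle, idle_sleep=0.01, max_idle_polls=3):
    idle = 0
    while idle < max_idle_polls:
        batch = broker.pull(max_count=10)  # consumer sets the pace (backpressure)
        if not batch:
            idle += 1
            time.sleep(idle_sleep)  # back off when there is nothing to fetch
            continue
        idle = 0
        for msg in batch:
            handle(msg)

seen = []
consume(FakeBroker([{"id": i} for i in range(25)]), seen.append)
print(len(seen))  # 25
```

The backoff-on-empty step is where the latency/poll-frequency trade-off mentioned above lives: longer sleeps mean fewer wasted polls but slower pickup of new messages.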
Retention and Replay
Message Retention
How long does the broker keep messages after delivery?
| System | Default Retention | Maximum Retention |
|--------|-------------------|-------------------|
| Apache Kafka | Configurable (time or size) | Indefinite (with tiered storage) |
| Azure Service Bus | 1 day | 14 days |
| Azure Event Hubs | 1 day | 90 days (with Capture) |
| AWS SQS | 4 days | 14 days |
| Google Cloud Pub/Sub | 7 days | 7 days |
Kafka's indefinite retention is its killer feature for integration: consumers can replay the entire event history at any time. This enables:
- Rebuilding a failed consumer's state from scratch
- Onboarding a new consumer that needs historical data
- Replaying events after a bug fix to re-process affected records
Replay from a Subscription
For brokers without indefinite retention, implement your own replay capability:
- Archive all messages to cold storage (Azure Blob Storage, S3, GCS) as they are published
- Build a replay service that reads from cold storage and republishes to the topic
Dead Letter Queue in Pub/Sub
Each subscription should have a Dead Letter Queue (DLQ) — a separate channel for messages that failed to process after exhausting retries.
```
subscription: inventory-service
└── On message processing failure × N:
        Route to: inventory-service.DLQ
```

DLQ configuration per subscription is critical — a poison message in one subscription must not affect any other subscription's processing.
DLQ operations:
- Alert immediately when any message lands in a DLQ
- Investigate root cause before resubmitting
- Fix the root cause (data, code, or configuration)
- Resubmit in controlled batches, monitoring for re-failure
Platform Comparison
| Platform | Model | Retention | Ordering | Filtering | Best For |
|----------|-------|-----------|----------|-----------|----------|
| Apache Kafka | Partitioned log | Indefinite | Per-partition | Consumer-side | High-throughput event streaming, replay |
| Azure Service Bus Topics | Subscriptions | Up to 14 days | Sessions | SQL filter | Enterprise messaging, ordered workflows |
| Azure Event Grid | Push-based | 24 hours | No | Event type, subject | Serverless triggers, cloud event routing |
| AWS SNS + SQS | Fan-out to queues | 14 days (SQS) | FIFO queues | Filter policy | AWS-native fan-out |
| Google Cloud Pub/Sub | Subscriptions | 7 days | Ordering keys | Attribute filter | GCP-native, push and pull |
| RabbitMQ | Exchanges + queues | Configurable | Per-queue | Routing keys, headers | Flexible routing, on-premises |
Pub/Sub Anti-Patterns to Avoid
Chatty topics — publishing every tiny state change (field updates, cursor movements) as an event. The topic becomes high-volume noise, and consumers struggle to filter signal from it.
Overloaded topic — putting unrelated event types on the same topic to "keep it simple." Consumers receive irrelevant events; filtering becomes complex.
Consumer groups that do too much — one consumer group handling dozens of different event types. Split into focused consumers per domain.
Missing idempotency — trusting at-least-once delivery to mean "exactly once." It does not. Always build idempotent consumers.
No DLQ configuration — without a DLQ, a consumer that fails continuously will either hold up message processing or silently drop messages.
Unbounded subscriptions — creating subscriptions for testing or experiments and never cleaning them up. Orphaned subscriptions consume broker resources and retention space indefinitely.
Lesson Summary
- Pub/Sub delivers each published message to every subscriber independently. Fan-out is the defining property that makes it different from a point-to-point queue.
- Durable subscriptions maintain a per-subscriber cursor so each consumer progresses at its own pace without affecting others.
- Filtering at the subscription level keeps consumers from processing irrelevant messages — use message attributes, not payload parsing, for filters.
- Competing consumers (multiple instances sharing a subscription) provide horizontal scaling within the fan-out topology.
- At-least-once delivery is the standard — build idempotent consumers. Do not rely on broker-level deduplication.
- Kafka's event log with indefinite retention enables replay and late-joining consumers — the most powerful Pub/Sub capability for integration use cases.
Next: Messaging Systems