Module 4: Integration Patterns and Techniques
Master the canonical integration patterns from the Enterprise Integration Patterns catalogue — point-to-point vs hub-and-spoke, publish/subscribe, event-driven integration, transformation techniques, and error handling strategies.
Patterns are the vocabulary of integration architecture. When an architect says "we'll use a scatter-gather here" or "route to the dead letter queue after three retries," they are referencing a shared catalogue of proven solutions to recurring integration problems. This module covers the most important patterns, when to apply them, and how to combine them into complete integration solutions.
Common Integration Patterns and Their Applications
The Enterprise Integration Patterns (EIP) catalogue, established by Gregor Hohpe and Bobby Woolf, defines 65 patterns across messaging channels, message construction, message routing, message transformation, and system management. This module covers the patterns you will use on almost every project.
Message Construction Patterns
Command Message
A message that tells the receiver to do something. The sender expects the receiver to execute the action.
{ "type": "CreateOrder", "orderId": "ORD-1234", "customerId": "CUST-99", "items": [...] }
Event Message
A message that announces something that happened. The sender does not know or care who receives it.
{ "type": "OrderPlaced", "orderId": "ORD-1234", "placedAt": "2026-04-18T10:30:00Z" }
Document Message
A message that transfers a data structure — no command, no event, just a document to be stored or processed.
{ "invoice": { "invoiceId": "INV-5678", "amount": 1250.00, "dueDate": "2026-05-18" } }
Request-Reply
A command message paired with a reply channel. The sender includes a replyTo address in the message header; the receiver sends its response to that address.
Use a correlation ID (a unique identifier included in both the request and reply) to match responses to requests when processing is asynchronous.
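As a minimal sketch of this correlation technique — the `send_request` and `handle_reply` names and the in-memory `pending` registry are illustrative, not part of any particular messaging library:

```python
import uuid

pending = {}  # correlation_id -> original request (durable store in production)

def send_request(payload, reply_to="replies-queue"):
    """Build a request message carrying a replyTo address and a
    correlation ID, and remember it so the reply can be matched."""
    correlation_id = str(uuid.uuid4())
    message = {
        "headers": {"correlationId": correlation_id, "replyTo": reply_to},
        "body": payload,
    }
    pending[correlation_id] = message
    return message  # hand off to the transport here

def handle_reply(reply):
    """Match an asynchronous reply back to its request via the
    correlation ID; late or duplicate replies are ignored."""
    correlation_id = reply["headers"]["correlationId"]
    request = pending.pop(correlation_id, None)
    if request is None:
        return None
    return request["body"], reply["body"]
```

The same matching logic applies whether replies arrive on a shared reply queue or a per-requester one; the correlation ID, not the channel, is what ties reply to request.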
Message Routing Patterns
Content-Based Router
Inspects the message content and routes it to the appropriate channel based on what it contains.
if (message.region == "EU") → route to EU-orders-queue
if (message.region == "US") → route to US-orders-queue
if (message.region == "APAC") → route to APAC-orders-queue
Use case: Route orders to regional fulfilment systems based on delivery address.
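The routing rules above can be sketched as a plain dictionary lookup; the `invalid-message-queue` fallback channel is an assumption added here (EIP calls this the Invalid Message Channel), not part of the original rules:

```python
def route(message, routes=None):
    """Content-based router: inspect the message body and return the
    name of the channel it should be delivered to."""
    routes = routes or {
        "EU": "EU-orders-queue",
        "US": "US-orders-queue",
        "APAC": "APAC-orders-queue",
    }
    # Unroutable messages go to a fallback channel rather than being
    # dropped silently.
    return routes.get(message.get("region"), "invalid-message-queue")
```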
Message Filter
Discards messages that do not meet a specified criterion, passing only those that do.
Use case: A downstream system only cares about high-value orders (> £10,000). Filter out all others.
Splitter
Takes a single message containing multiple items and splits it into individual messages, one per item.
Use case: A batch file arrives with 500 customer records. Split into 500 individual customer messages for parallel processing.
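A splitter can be sketched as follows — the `batchId`/`seq`/`total` envelope fields are illustrative choices, included so a downstream aggregator can tell when the set is complete:

```python
def split(batch):
    """Split one batch message into one message per record, tagging
    each with sequence metadata for later reassembly."""
    total = len(batch["records"])
    return [
        {"batchId": batch["batchId"], "seq": i, "total": total, "record": r}
        for i, r in enumerate(batch["records"])
    ]
```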
Aggregator
The inverse of the splitter. Collects related messages and combines them into a single message once a completion condition is met.
Use case: Collect all line items for an order before sending the complete order to the warehouse system.
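A matching aggregator sketch, assuming each incoming message carries the `batchId`/`seq`/`total` fields stamped by a splitter. A known total count is only one possible completion condition; timeouts are another common choice:

```python
class Aggregator:
    """Collect related messages and emit one combined message once the
    completion condition (all `total` parts seen) is met."""

    def __init__(self):
        self.buckets = {}  # batchId -> messages collected so far

    def add(self, msg):
        bucket = self.buckets.setdefault(msg["batchId"], [])
        bucket.append(msg)
        if len(bucket) == msg["total"]:  # completion condition reached
            del self.buckets[msg["batchId"]]
            ordered = sorted(bucket, key=lambda m: m["seq"])
            return {"batchId": msg["batchId"],
                    "records": [m["record"] for m in ordered]}
        return None  # still waiting for more parts
```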
Scatter-Gather
Sends the same message to multiple recipients simultaneously, collects their responses, and aggregates them into a single reply.
Use case: Send a price-check request to three supplier systems in parallel and return the lowest price.
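The price-check use case can be sketched with a thread pool; the supplier endpoints are represented as plain callables here, standing in for real API clients:

```python
from concurrent.futures import ThreadPoolExecutor

def scatter_gather(request, suppliers):
    """Scatter: send the same request to every supplier in parallel.
    Gather: aggregate the responses into a single best-price reply."""
    with ThreadPoolExecutor(max_workers=len(suppliers)) as pool:
        quotes = list(pool.map(lambda call: call(request), suppliers))
    return min(quotes, key=lambda q: q["price"])
```

A production version would also need a per-supplier timeout and a policy for partial results (e.g. answer with the best quote received so far).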
Process Manager (Saga)
Maintains the state of a multi-step, long-running business process and coordinates the sequence of messages across multiple systems.
Use case: Order fulfilment workflow — check inventory, charge payment, allocate stock, dispatch shipping.
Point-to-Point Integration vs. Hub-and-Spoke Integration
Point-to-Point
Each system connects directly to the systems it needs to exchange data with. No intermediary.
System A ──────────────────► System B
│
└──────────────────────► System C
System B ──────────────────► System D
Advantages:
- Simple for small numbers of integrations
- No central infrastructure dependency
- Low latency (no broker hop)
Disadvantages:
- Scales as O(n²) — each new system potentially connects to every existing system
- Transformation logic is duplicated across connections
- Hard to monitor — no central point of visibility
- Change is painful — modifying System A's output requires updating every consumer
When to use: Fewer than 5 systems, integrations are simple, team is small, or you are prototyping.
Hub-and-Spoke
All systems connect to a central hub (ESB, integration broker, or iPaaS). The hub handles routing, transformation, and protocol mediation.
System A ──►
System B ──►   ESB / Hub   ──► System C
System D ──►               ──► System E
Advantages:
- Scales as O(n) — each new system connects only to the hub
- Single place to apply transformation, routing, and security
- Centralised monitoring and logging
- Loose coupling between systems — System A does not know about System C
Disadvantages:
- Hub is a single point of failure (mitigate with clustering and HA configuration)
- Hub becomes a bottleneck at high throughput
- Changes to the hub affect all integrations
- Requires skilled hub administrators
When to use: More than 5 systems, diverse protocols, centralised governance is required, or the organisation has an existing ESB investment.
Publish/Subscribe Model and Event-Driven Integration
Publish/Subscribe
In a pub/sub topology, publishers emit messages to a topic (a named channel) without knowing who the consumers are. Consumers subscribe to topics and receive all messages published to them.
Order Service ──► [order.placed topic] ──► Inventory Service
                                       ──► Billing Service
                                       ──► Notification Service
                                       ──► Analytics Service
Key characteristics:
- Publishers and consumers are fully decoupled — neither knows the other exists
- Adding a new consumer requires no change to the publisher
- The broker delivers each message to all active subscribers (fan-out)
- Durable subscriptions (in systems like Azure Service Bus Topics or Kafka) ensure consumers receive messages even if they were offline
Event-Driven Integration extends pub/sub by making events the primary integration mechanism across the entire system landscape. Systems no longer call each other directly — they emit events and react to events.
Event Schema Design
Good event schemas:
- Are self-describing — include enough context for a consumer to act without making additional calls
- Use a standard envelope — a consistent set of metadata fields (event ID, type, source, timestamp, version)
- Follow CloudEvents specification for portability across event brokers
{
"specversion": "1.0",
"type": "com.systemforge.order.placed",
"source": "order-service",
"id": "abc-123",
"time": "2026-04-18T10:30:00Z",
"data": {
"orderId": "ORD-1234",
"customerId": "CUST-99",
"totalAmount": 299.99
}
}
Event Ordering and Consistency
Events are not always delivered in the order they were produced. Design consumers for out-of-order delivery:
- Include a sequence number or timestamp in events so consumers can detect and handle out-of-order arrival
- Use event versioning (event schema evolution) to handle schema changes without breaking consumers
- Design consumers to be idempotent — processing the same event twice must not corrupt state
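The idempotency requirement above can be sketched with a processed-event registry; the in-memory `set` is illustrative — in production this would be a durable store keyed by event ID:

```python
processed = set()  # event IDs already handled (durable store in production)

def handle_event(event, apply_change):
    """Idempotent consumer: skip events that have already been
    processed, so redelivery cannot corrupt state."""
    if event["id"] in processed:
        return False  # duplicate delivery: ignore
    apply_change(event["data"])
    processed.add(event["id"])
    return True
```

Note the ordering: if marking the event as processed and applying the change cannot be done atomically, the consumer must tolerate a crash between the two, which is itself an argument for making `apply_change` idempotent at the data level too.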
Transformation and Mapping Techniques
Transformation is required whenever source and target systems use different data models, formats, or protocols.
Structural Transformation
Field mapping — map a field in the source to a different field name in the target.
Source: customer.firstName → Target: contact.givenName
Source: customer.lastName → Target: contact.familyName
Format conversion — change the data format while preserving the same logical data.
Source: dateOfBirth: "18-04-1985" → Target: birthDate: "1985-04-18"
Source: amount: "1250.50" → Target: amount: 1250.50 (string → number)
Aggregation — combine multiple source fields into one target field.
Source: firstName + " " + lastName → Target: fullName
Splitting — break one source field into multiple target fields.
Source: fullAddress → Target: street, city, postcode, country
Canonical Data Model (CDM)
A canonical data model is a shared, neutral data format that all systems translate to and from. Instead of building a transformation for every pair of systems — up to n(n-1)/2 pairs, each typically needed in both directions — you build one pair of transformations per system: one into the CDM and one out of it.
System A ──► [A→CDM transform] ──► CDM ──► [CDM→B transform] ──► System B
──► [CDM→C transform] ──► System C
CDMs add transformation complexity upfront but pay back significantly as the number of systems grows. Most enterprise integration platforms include a built-in CDM or provide tools to define one.
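A minimal sketch of the CDM idea: each system contributes only its own to-CDM and from-CDM transforms, and the integration layer composes them. The field names (`firstName`, `givenName`, `contact_name`) are hypothetical schemas chosen for illustration:

```python
def system_a_to_cdm(rec):
    # System A's outbound transform into the neutral model.
    return {"givenName": rec["firstName"], "familyName": rec["lastName"]}

def cdm_to_system_b(cdm):
    # System B's inbound transform out of the neutral model.
    return {"contact_name": f'{cdm["givenName"]} {cdm["familyName"]}'}

def integrate(rec, to_cdm, from_cdm):
    """Compose any system pair via the CDM: source → CDM → target."""
    return from_cdm(to_cdm(rec))
```

Adding a System C costs one new transform pair, not a new transform per existing system — which is exactly the payoff described above.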
Protocol Mediation
Sometimes the source and target use different protocols. The integration layer translates between them:
- REST (HTTP/JSON) ↔ SOAP (HTTP/XML)
- HTTP ↔ AMQP (message queue)
- File (CSV/XML) ↔ REST API
- Database polling ↔ Event stream
Most integration platforms handle common protocol mediation natively. For less common combinations, a custom adapter may be required.
Error Handling and Exception Management in Integrations
Categories of Integration Errors
| Category | Examples | Handling Strategy |
|---|---|---|
| Transient | Network timeout, brief service unavailability | Retry with exponential backoff |
| Data quality | Missing required field, invalid format | Reject to DLQ, alert data owner |
| Business rule | Duplicate order ID, unknown customer | Reject, log, notify business |
| Systemic | Target system down, schema mismatch | Alert operations, halt or pause flow |
| Security | Auth failure, cert expiry | Alert security team immediately |
Retry Policies
Never retry immediately on failure. Use exponential backoff with jitter:
Attempt 1: fail → wait 1s
Attempt 2: fail → wait 2s
Attempt 3: fail → wait 4s
Attempt 4: fail → wait 8s + random jitter
Attempt 5: fail → route to Dead Letter Queue
The jitter (random offset) prevents all retrying consumers from hammering a recovering service simultaneously (the thundering herd problem).
Set a maximum retry count. Without a limit, a permanently broken message will be retried indefinitely, blocking healthy messages behind it.
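The backoff schedule above can be sketched as follows; `call_with_retry` is an illustrative helper, and most integration platforms provide this policy as configuration rather than code:

```python
import random
import time

def call_with_retry(operation, max_attempts=5, base_delay=1.0):
    """Retry with exponential backoff plus jitter. Once the retry
    budget is exhausted, the exception propagates so the caller can
    route the message to the Dead Letter Queue."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: caller sends to DLQ
            # 1s, 2s, 4s, 8s ... plus a random offset to avoid the
            # thundering herd against a recovering service.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```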
Dead Letter Queue (DLQ)
The DLQ is a special channel where messages go after exhausting their retry budget. A DLQ is not a bin — it is a diagnostic and recovery tool.
DLQ best practices:
- Alert on DLQ depth — a DLQ with growing message count requires immediate investigation
- Preserve original message and error context — store the original message alongside the error reason and timestamp
- Build a resubmission mechanism — make it easy to fix the root cause and replay DLQ messages
- Review DLQ messages regularly — recurring patterns indicate a systemic issue
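The "preserve original message and error context" practice suggests a DLQ envelope along these lines — the field names are illustrative, not a standard schema:

```python
from datetime import datetime, timezone

def to_dead_letter(original, error, attempts):
    """Wrap a failed message for the DLQ, keeping the original payload
    verbatim plus the context needed to diagnose and resubmit it."""
    return {
        "originalMessage": original,   # untouched, ready for replay
        "errorReason": str(error),
        "retriesAttempted": attempts,
        "failedAt": datetime.now(timezone.utc).isoformat(),
    }
```

Because `originalMessage` is preserved unmodified, a resubmission tool can fix the root cause and replay the payload without any reverse transformation.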
Saga Pattern for Distributed Transactions
When an integration involves multiple systems that must all succeed or all fail (a distributed transaction), use the Saga pattern.
A saga is a sequence of local transactions, each publishing an event. If any step fails, compensating transactions undo the previous steps.
Example: Order fulfilment saga
1. Reserve inventory → success: emit InventoryReserved
2. Charge payment → success: emit PaymentCharged
3. Dispatch shipment → success: emit ShipmentDispatched
If step 3 fails:
Compensate step 2: Refund payment
Compensate step 1: Release inventory
The saga coordinator (or choreography via events) manages the state machine. The key principle: there are no distributed locks, only compensations.
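An orchestrated saga coordinator can be sketched as a list of (action, compensation) pairs — a simplification that omits the persistence a real coordinator needs to survive its own crashes:

```python
def run_saga(steps):
    """Execute (action, compensation) pairs in order. If any action
    fails, run the compensations for the completed steps in reverse
    order and report failure."""
    completed = []  # compensations for steps that succeeded
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for comp in reversed(completed):
                comp()  # undo in reverse: last completed step first
            return False
    return True
```

Note that compensations run newest-first, mirroring the fulfilment example: a failed dispatch refunds the payment before releasing the inventory.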
Module Summary
- Master the core EIP patterns: message construction (command, event, document), routing (content-based router, splitter, aggregator, scatter-gather), and process management (saga).
- Point-to-point is simple but scales poorly; hub-and-spoke centralises governance but introduces a bottleneck — choose based on system count and governance needs.
- Publish/subscribe decouples publishers and consumers completely; it is the preferred topology for event-driven integration at scale.
- Canonical data models reduce transformation complexity as the system count grows — introduce one early, not retroactively.
- Error handling is not optional: define retry policies, DLQ routing, and alerting for every integration. Sagas handle distributed transaction failure gracefully.
Next: Module 5 — Integration Governance and Security