Module 3: Integration Design and Planning

Integration architecture starts long before any code is written. The quality of an integration solution is determined more by how well it is designed and planned than by which technology it uses. This module covers the end-to-end process: from the first conversation with a business stakeholder to a signed-off integration design ready for build.

Gathering Integration Requirements

Why Integration Requirements Are Different

Application requirements ask: "What should this system do?" Integration requirements ask: "How should this system behave at its boundaries — what data does it send and receive, when, and under what conditions?"

Gathering integration requirements well prevents:

Discovering critical data format mismatches during development
Building the integration with the wrong direction or initiation style (push vs. pull)
Missing non-functional requirements that determine technology choice
Building integrations that work in test but fail under production load

The Integration Requirements Workshop

Run a structured requirements workshop with representatives from:

The business owner — explains the process that needs to be automated or improved
The system owners — understand what each system can and cannot do
The IT operations team — knows about network constraints, security policies, monitoring requirements
The data team — can advise on data quality, master data, and canonical models

Requirements Checklist

Functional requirements:

[ ] What business process does this integration support?
[ ] What data needs to flow from which system to which system?
[ ] In which direction does data flow? (unidirectional, bidirectional, or a different direction at different times?)
[ ] What triggers the integration? (event, schedule, user action, inbound message?)
[ ] What happens when the integration completes? What happens when it fails?
[ ] Are there business rules that must be applied during the integration? (filtering, routing, enrichment?)

Data requirements:

[ ] What is the data model at the source?
[ ] What is the data model at the target?
[ ] What transformation is needed (field mapping, format conversion, aggregation)?
[ ] Is there a canonical data model, or will one need to be defined?
[ ] What are the data quality rules? What happens when a record fails validation?
[ ] What is the volume of data per integration run or per day?

Non-functional requirements:

[ ] What is the required latency? (real-time, near-real-time, batch?)
[ ] What is the throughput requirement? (messages per second, records per batch run?)
[ ] What is the availability SLA? (what percentage of the time must the integration be working?)
[ ] What is the recovery time objective (RTO) if the integration fails?
[ ] What are the security requirements? (encryption in transit, at rest, authentication method?)
[ ] Are there compliance or regulatory constraints? (GDPR data residency, HIPAA, PCI-DSS?)
[ ] Who needs to be notified when the integration fails?
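
To make the availability question concrete in a workshop, it can help to translate a percentage into a downtime budget. The following is a minimal sketch, assuming a 30-day month; the function name and defaults are illustrative, not part of any standard.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Compare a few common targets over an assumed 30-day period.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} min per 30 days")
```

A 99.9% target over 30 days leaves roughly 43 minutes of downtime, which is often a useful reality check against the stated RTO.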

Analysing System Landscapes and Identifying Integration Points

The System Landscape Map

Before designing any integration, produce a system landscape map — a high-level view of all systems involved, their owners, and the existing connections between them.

A basic landscape map captures:

┌───────────────┬─────────────┬──────────────┬──────────────────┐
│ System Name   │ Owner Team  │ Technology   │ APIs/Interfaces  │
├───────────────┼─────────────┼──────────────┼──────────────────┤
│ SAP ERP       │ Finance     │ SAP S/4HANA  │ IDoc, REST API   │
│ Salesforce    │ Sales       │ Salesforce   │ REST API         │
│ Warehouse WMS │ Operations  │ Infor WMS    │ SOAP, CSV files  │
│ Data Platform │ Analytics   │ Azure        │ Event Hub, SQL   │
└───────────────┴─────────────┴──────────────┴──────────────────┘

Identifying Integration Points

Walk through each business process and identify every point where data crosses a system boundary. For each integration point, capture:

Source system and target system
Trigger — what initiates the data flow
Data entity — what data is exchanged (order, customer, invoice, event)
Direction — source → target, target → source, or bidirectional
Frequency — real-time, hourly batch, daily batch, on-demand
Criticality — what breaks if this integration fails?

Integration Inventory

Maintain an integration inventory (sometimes called an interface catalogue or integration registry) throughout the project. This becomes a living document that the operations team uses after go-live.

| ID | Name | Source | Target | Trigger | Frequency | Criticality | Status |
|----|------|--------|--------|---------|-----------|-------------|--------|
| INT-001 | Order to WMS | SAP ERP | Warehouse WMS | Order confirmed | Real-time | High | Live |
| INT-002 | Customer sync | Salesforce | SAP ERP | Nightly batch | Daily | Medium | Design |
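
If the inventory is kept in a machine-readable form, simple impact questions ("what is affected if SAP ERP is down?") become easy to answer. The sketch below is one possible shape, assuming the columns of the inventory table; the class name `IntegrationPoint` and the helper `touching` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationPoint:
    id: str
    name: str
    source: str
    target: str
    trigger: str
    frequency: str
    criticality: str  # e.g. "High", "Medium", "Low"
    status: str       # e.g. "Design", "Build", "Live"


# The two example rows from the inventory table.
INVENTORY = [
    IntegrationPoint("INT-001", "Order to WMS", "SAP ERP", "Warehouse WMS",
                     "Order confirmed", "Real-time", "High", "Live"),
    IntegrationPoint("INT-002", "Customer sync", "Salesforce", "SAP ERP",
                     "Nightly batch", "Daily", "Medium", "Design"),
]


def touching(system: str, inventory=INVENTORY) -> list[str]:
    """IDs of integrations that read from or write to the given system."""
    return [i.id for i in inventory if system in (i.source, i.target)]


print(touching("SAP ERP"))
```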

Designing Integration Solutions Based on Business Needs

The Design Process

Understand the business goal — not the technical requirement, but the underlying business outcome the integration enables
Define the integration contract — the agreed interface: message format, protocol, authentication, SLA
Select the integration pattern — based on coupling, latency, and reliability requirements (covered in Module 4)
Select the technology — based on pattern, ecosystem, and non-functional requirements (covered in Module 2)
Design the data mapping — document field-by-field transformation between source and target schemas
Design error handling — define what happens at every failure point
Produce the Integration Design Document (IDD)

The Integration Design Document

An IDD is the authoritative design artefact for a single integration. A well-structured IDD includes:

Overview — business context, integration purpose, systems involved
Interface contract — source schema, target schema, transformation rules
Sequence diagram — step-by-step message flow including error paths
Data mapping table — source field → target field, with transformation notes
Error handling — retry policy, DLQ routing, alerting, escalation
Security — authentication mechanism, encryption requirements
Non-functional — latency SLA, throughput requirement, availability target
Monitoring — what metrics to capture, what thresholds trigger alerts
Open issues — outstanding decisions or questions
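
The data mapping table in an IDD translates fairly directly into code. The following sketch assumes a hypothetical customer mapping; the field names, the `FIELD_MAP` structure, and the source date format are invented for illustration and are not taken from any real system.

```python
from datetime import datetime

# Hypothetical mapping table: target field -> (source field, transformation).
FIELD_MAP = {
    "customer_id": ("AccountNumber", str.strip),
    "name": ("Name", str.strip),
    "country": ("BillingCountry", str.upper),
    "created_on": ("CreatedDate",
                   lambda v: datetime.strptime(v, "%d/%m/%Y").date().isoformat()),
}


def map_record(source: dict) -> dict:
    """Apply the mapping table; raises KeyError if a source field is missing."""
    return {target: transform(source[src])
            for target, (src, transform) in FIELD_MAP.items()}


record = {"AccountNumber": " C-1001 ", "Name": "Acme Ltd",
          "BillingCountry": "de", "CreatedDate": "05/03/2024"}
print(map_record(record))
```

Keeping the mapping as data rather than scattered assignments makes it easier to review against the IDD field by field.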

Design for the Unhappy Path First

Most integration failures come from poorly designed error paths. Before designing the happy path, ask:

What happens if the source system sends malformed data?
What happens if the target system is unavailable?
What happens if the message is delivered twice?
What happens if the transformation fails for one record in a batch of 10,000?

Design these paths explicitly — do not leave them to developer discretion.
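
As an illustration of an explicitly designed error path, the sketch below isolates failures per record so that one bad record does not fail a whole batch, retries delivery a bounded number of times, and collects failures in a dead-letter list. The function names and the choice of three attempts are assumptions; a real implementation would also wait between attempts and persist the dead letters.

```python
def process_batch(records, transform, deliver, max_attempts=3):
    """Process each record independently; return (delivered, dead_letters)."""
    delivered, dead_letters = [], []
    for record in records:
        try:
            payload = transform(record)        # malformed data fails here
        except (KeyError, ValueError) as exc:
            dead_letters.append((record, f"transform: {exc!r}"))
            continue                           # do not fail the whole batch
        for attempt in range(1, max_attempts + 1):
            try:
                deliver(payload)               # target may be unavailable
                delivered.append(payload)
                break
            except ConnectionError as exc:
                if attempt == max_attempts:    # retries exhausted
                    dead_letters.append((record, f"deliver: {exc!r}"))
    return delivered, dead_letters


def to_amount(record):
    """Hypothetical transformation: requires 'id' and a numeric 'amount'."""
    return {"id": record["id"], "amount": float(record["amount"])}


sent = []
ok, dlq = process_batch(
    [{"id": 1, "amount": "10.5"}, {"id": 2, "amount": "n/a"}, {"id": 3}],
    to_amount, sent.append)
print(len(ok), "delivered,", len(dlq), "dead-lettered")
```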

Planning Integration Projects and Creating Project Timelines

Why Integration Projects Are Often Underestimated

Integration projects consistently run over time because:

System owners are not available to answer questions or provide test environments
Undocumented system behaviours are discovered only during build
Data quality issues appear only when real data is used
Non-functional requirements (performance, security) are addressed too late
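Testing effort is underestimated: integration tests depend on several systems, shared environments, and realistic data, which makes them harder to set up and run than unit tests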

Integration Project Phases

Phase 1: Discovery (1–3 weeks)

Requirements workshops
System landscape mapping
Integration inventory
Dependency identification (whose access or input do we need, and when?)

Phase 2: Design (1–4 weeks)

Integration Design Documents for each interface
Canonical data model design (if required)
Security design review
Design sign-off from all system owners

Phase 3: Build (2–8 weeks, depending on scope)

Integration development
Unit testing of individual transformations and flows
Developer integration testing (system A → integration layer → system B in dev environments)

Phase 4: Testing (2–4 weeks)

System integration testing (SIT)
Performance testing
Security testing
User acceptance testing (UAT)

Phase 5: Deployment and Go-Live (1–2 weeks)

Deployment to production
Parallel running (if applicable)
Monitoring validation
Handover to operations

Estimation Guidelines

| Integration Complexity | Rough Build Estimate |
|---|---|
| Simple data pass-through (no transformation) | 1–3 days |
| Field mapping + format transformation | 3–7 days |
| Multi-step orchestration with business rules | 1–3 weeks |
| Bidirectional sync with conflict resolution | 2–4 weeks |
| Complex legacy system integration (SOAP, file-based) | 3–6 weeks |

Always add 20–30% contingency for discovery of undocumented behaviours and data quality issues.
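
A short worked example of applying the contingency range to a build estimate follows. It is a sketch for a hypothetical project, assuming five working days per week; the function name is illustrative.

```python
def with_contingency(low_days, high_days, contingency=(0.20, 0.30)):
    """Return the (low, high) estimate after adding the contingency range."""
    return low_days * (1 + contingency[0]), high_days * (1 + contingency[1])


# Hypothetical project: two field-mapping interfaces (3-7 days each) and
# one multi-step orchestration (1-3 weeks, assumed 5 working days per week).
low = 2 * 3 + 1 * 5     # 11 days
high = 2 * 7 + 3 * 5    # 29 days

low_c, high_c = with_contingency(low, high)
print(f"Build estimate: {low_c:.1f} to {high_c:.1f} working days")
```

Applying the lower percentage to the low estimate and the higher percentage to the high estimate keeps the range honest rather than collapsing it to a single figure.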

Considering Scalability, Performance, and Security in Integration Design

Scalability

Design integrations to scale before they need to.

Horizontal scaling — add more consumer instances to process messages in parallel. Requires message idempotency (processing the same message twice must produce the same result).
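
Idempotency matters here because parallel consumers and retries make duplicate delivery likely. One common way to achieve it is to record the IDs of messages already applied. The sketch below assumes every message carries a unique `message_id`; the class and field names are hypothetical, and the in-memory set stands in for a durable store shared by all consumer instances.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by a unique message ID."""

    def __init__(self):
        self.processed_ids = set()   # assumption: durable and shared in production
        self.balances = {}

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message["message_id"] in self.processed_ids:
            return False
        account = message["account"]
        self.balances[account] = self.balances.get(account, 0) + message["amount"]
        self.processed_ids.add(message["message_id"])
        return True


consumer = IdempotentConsumer()
payment = {"message_id": "m-1", "account": "A", "amount": 50}
consumer.handle(payment)
consumer.handle(payment)             # redelivery of the same message
print(consumer.balances)
```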

Partitioning — in high-throughput event streaming (Kafka), partition topics by a key (e.g., customer ID) so messages for the same entity are always processed in order by the same consumer.
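
The idea behind key-based partitioning can be sketched in a few lines: hash the key and take the result modulo the partition count, so the same key always maps to the same partition. This is an illustration only; real Kafka clients ship their own partitioner, and SHA-256 is used here simply as a stable stand-in.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in a stable, repeatable way."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Python's built-in hash() is randomised per process for strings,
    # so a cryptographic digest is used to get a repeatable value.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


for customer_id in ("C-1001", "C-1002", "C-1001"):
    print(customer_id, "->", partition_for(customer_id, 6))
```

Note that changing the number of partitions changes the mapping, which is one reason partition counts are worth deciding at design time.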

Back-pressure — design consumers to signal when they are at capacity so producers slow down, rather than overwhelming downstream systems.
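
The simplest form of back-pressure is a bounded buffer: when it is full, the producer blocks until the consumer catches up. The sketch below demonstrates this in-process with Python's standard `queue` module; the function name, buffer size, and simulated delay are assumptions, and a distributed system would rely on the broker's or protocol's own flow control instead.

```python
import queue
import threading
import time


def run_pipeline(items, buffer_size=5):
    """Feed items through a bounded buffer; return (consumed, peak_depth)."""
    buffer = queue.Queue(maxsize=buffer_size)
    consumed = []
    peak = 0

    def consumer():
        while True:
            item = buffer.get()
            if item is None:          # sentinel: no more work
                break
            time.sleep(0.001)         # simulate a slow downstream system
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for item in items:
        buffer.put(item)              # blocks while the buffer is full
        peak = max(peak, buffer.qsize())
    buffer.put(None)
    worker.join()
    return consumed, peak


consumed, peak = run_pipeline(range(50))
print(len(consumed), "items consumed; peak buffer depth", peak)
```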

Batch sizing — for database integrations, choose batch sizes that balance throughput against memory and lock contention. Test at production data volumes.
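
Making the batch size a parameter lets it be tuned during performance testing rather than fixed in code. A minimal sketch of a batching helper, with a hypothetical name:

```python
from itertools import islice


def batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while batch := list(islice(iterator, batch_size)):
        yield batch


for batch in batches(range(10), batch_size=4):
    print(len(batch), "rows:", batch)
```

Each yielded batch would typically be written in its own transaction, so the batch size also bounds how much work is lost or repeated when a write fails.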

Performance

Define performance targets before design, not after.

Key metrics to specify upfront:

Throughput — messages per second or records per batch
End-to-end latency — time from event to target system update
Processing time — time the integration layer takes to transform and route
Backlog tolerance — how large can the message queue grow before business impact occurs?
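
Backlog tolerance can be reasoned about with simple arithmetic: if messages arrive faster than they are processed, the queue grows at the difference between the two rates. The sketch below estimates how long it takes to reach an agreed limit; the rates and the limit are made-up figures.

```python
def seconds_until_backlog_limit(arrival_rate, processing_rate,
                                backlog_limit, current_backlog=0):
    """Seconds until the queue reaches backlog_limit, or None if it never does.

    Rates are in messages per second; backlog values are message counts.
    """
    growth = arrival_rate - processing_rate
    if growth <= 0:
        return None                   # the consumer keeps up or catches up
    return max(backlog_limit - current_backlog, 0) / growth


# Assumed figures: a peak of 500 msg/s against 400 msg/s processing capacity,
# with business impact once 60,000 messages are waiting.
print(seconds_until_backlog_limit(500, 400, 60_000))
```

With these figures the queue grows by 100 messages per second, so the limit is reached after 600 seconds. A peak that lasts longer than ten minutes would therefore need more consumer capacity.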

Common performance anti-patterns:

Calling synchronous APIs inside a loop for each record in a batch (N+1 calls)
Not indexing lookup tables used in transformation
Not using connection pooling for database lookups
Holding database transactions open during network calls
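
The first anti-pattern is easy to see when the number of remote calls is counted. In the sketch below, `PriceService` is a hypothetical stand-in for a remote API that offers both a single-item and a bulk lookup; whether a bulk endpoint exists in a real system is something to confirm during requirements gathering.

```python
class PriceService:
    """Stand-in for a remote API; counts how many calls it receives."""

    def __init__(self, prices):
        self._prices = prices
        self.calls = 0

    def get_price(self, sku):
        self.calls += 1
        return self._prices[sku]

    def get_prices(self, skus):
        self.calls += 1
        return {sku: self._prices[sku] for sku in skus}


def enrich_per_record(lines, service):
    """Anti-pattern: one remote call per record."""
    return [{**line, "price": service.get_price(line["sku"])} for line in lines]


def enrich_batched(lines, service):
    """One bulk call for the distinct keys, then an in-memory join."""
    prices = service.get_prices({line["sku"] for line in lines})
    return [{**line, "price": prices[line["sku"]]} for line in lines]


lines = [{"sku": s} for s in ("A", "B", "A", "C", "B")]
slow, fast = PriceService({"A": 1, "B": 2, "C": 3}), PriceService({"A": 1, "B": 2, "C": 3})
assert enrich_per_record(lines, slow) == enrich_batched(lines, fast)
print("per-record calls:", slow.calls, "| batched calls:", fast.calls)
```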

Security

Build security into the design — do not bolt it on after build.

Authentication between systems:

Use service accounts with the minimum required permissions
Prefer mutual TLS (mTLS) or OAuth 2.0 client credentials for service-to-service auth
Never embed credentials in code — use a secrets manager (Azure Key Vault, AWS Secrets Manager, HashiCorp Vault)
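
Whichever secrets manager is used, the integration code should resolve credentials at runtime and fail fast when one is missing. The sketch below uses environment variables as a stand-in for a secrets-manager lookup; the function, exception, and secret names are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret has not been provisioned."""


def get_secret(name: str, env=os.environ) -> str:
    """Resolve a secret at runtime instead of hard-coding it.

    Assumption: the deployment platform injects secrets from the secrets
    manager into the process environment.
    """
    value = env.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not configured")
    return value


# Demonstration with an explicit mapping instead of the real environment.
token = get_secret("WMS_API_TOKEN", env={"WMS_API_TOKEN": "example-only"})
print("token loaded:", bool(token))
```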

Data in transit:

All integration traffic must use TLS 1.2 or higher
Verify certificates — do not disable certificate validation in production
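
In Python, both rules can be expressed with the standard `ssl` module. The sketch below builds a client-side context that verifies certificates and host names and refuses anything older than TLS 1.2; the function name is illustrative.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client TLS context: certificate verification on, TLS 1.2 as the floor."""
    context = ssl.create_default_context()            # verifies certs and host names
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS 1.0 and 1.1
    return context


context = strict_client_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

The resulting context can be passed to standard-library clients that accept one, for example `urllib.request.urlopen(url, context=context)`.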

Data at rest:

Messages stored in queues or event stores must be encrypted at rest
PII and sensitive data must be identified and handled according to data classification policy

Network:

Restrict integration endpoints to known IP ranges where possible
Use private endpoints for cloud messaging services
Never expose internal integration endpoints directly to the internet

Module Summary

Integration requirements must capture functional needs, data requirements, and non-functional requirements — all three are equally important.
The system landscape map and integration inventory are essential artefacts that document what exists and what needs to be built.
The Integration Design Document is the authoritative specification for each integration — it defines the contract, transformation, error handling, and security requirements.
Integration projects are consistently underestimated; build in contingency for undocumented system behaviours and data quality issues.
Scalability, performance, and security must be designed in from the start — retrofitting them after build is expensive and error-prone.

Next: Module 4 — Integration Patterns and Techniques