Trade-off Analysis — CAP, Consistency vs Availability — Solution Architecture | Learnixo

Every Architecture Decision Is a Trade-off

There is no pattern that is universally best.
There is no ORM, architecture style, or deployment model that has no downsides.

Architecture is the act of choosing which trade-offs to live with,
given the specific context, team, and constraints of the system.

Questions to ask before every significant decision:
  → What does this make easier?
  → What does this make harder?
  → Who benefits from this choice?
  → Who pays the cost?
  → What becomes difficult to change later?
  → What constraint or requirement drives this choice?

If you cannot answer these questions, you are not ready to make the decision.

Consistency vs Availability

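CAP theorem (simplified):
  A distributed data store can guarantee at most two of these three properties at once:
    → Consistency: every read returns the most recent write (or an error)
    → Availability: every request to a working node gets a non-error response
    → Partition tolerance: the system keeps operating when the network drops messages between nodes

  Network partitions happen, so partition tolerance is not optional.
  The real choice is what to give up during a partition: consistency or availability.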

Clinical example: Warfarin prescription approval
  Option A (Consistency):
    Prescription approval requires a synchronous INR check against the LabResults service.
    If LabResults is down, prescriptions cannot be approved.
    → Consistent, but unavailable when the upstream is unavailable.

  Option B (Availability):
    Prescription approval writes to an outbox. The INR check runs asynchronously.
    If LabResults is down, prescriptions are queued; approval completes once the check has run.
    → Available, but the approval decision is delayed until the check result arrives.

  Clinical decision: Option A for Warfarin specifically (clinical safety over availability).
  Option B is acceptable for non-critical medications.

The CAP trade-off is not abstract: here it directly determines patient safety policy.
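
The two options can be expressed as a single policy that picks its behaviour per medication. The following is a minimal sketch, not the platform's actual code: `PrescriptionApprovalPolicy`, `ApprovalOutcome`, and both delegates are hypothetical names, with the delegates standing in for the LabResults call and the outbox write.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum ApprovalOutcome { Approved, Rejected, PendingInrCheck, UnavailableRetryLater }

// Hypothetical policy class: the delegates stand in for the LabResults call
// and the outbox write, neither of which is specified in the text above.
public sealed class PrescriptionApprovalPolicy
{
    private readonly Func<Guid, CancellationToken, Task<bool>> _isInrInRange;
    private readonly Func<Guid, Task> _enqueueDeferredCheck;

    public PrescriptionApprovalPolicy(
        Func<Guid, CancellationToken, Task<bool>> isInrInRange,
        Func<Guid, Task> enqueueDeferredCheck)
    {
        _isInrInRange = isInrInRange;
        _enqueueDeferredCheck = enqueueDeferredCheck;
    }

    public async Task<ApprovalOutcome> ApproveAsync(
        Guid patientId, bool requiresSynchronousInrCheck, CancellationToken ct)
    {
        if (!requiresSynchronousInrCheck)
        {
            // Option B: accept now, run the INR check later from the queue.
            await _enqueueDeferredCheck(patientId);
            return ApprovalOutcome.PendingInrCheck;
        }

        try
        {
            // Option A: the decision depends on a live answer from LabResults.
            bool inRange = await _isInrInRange(patientId, ct);
            return inRange ? ApprovalOutcome.Approved : ApprovalOutcome.Rejected;
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Fail closed: no answer from LabResults means no approval.
            return ApprovalOutcome.UnavailableRetryLater;
        }
    }
}
```

Calling `ApproveAsync` with `requiresSynchronousInrCheck: true` for Warfarin and `false` for non-critical medications keeps the safety decision in one visible place instead of scattering it across handlers.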

Coupling vs Autonomy

Tight coupling:
  Module A calls Module B's API synchronously.
  A is available only when B is available.
  A and B must be released in step whenever B's API changes in a breaking way.

Loose coupling (events):
  Module A publishes an event; Module B subscribes.
  A and B can deploy independently.
  B's internal implementation can change without affecting A.
  Cost: eventual consistency, event contract versioning complexity.

Example decision: Prescriptions module and INR validation

  Tightly coupled (simpler):
    var inr = await _labService.GetLatestInrAsync(patientId, ct);
    → Prescription handler fails if LabService is unreachable.
    → Simple to understand. Hard to test in isolation.
    → Both services must be deployed at compatible versions simultaneously.

  Loosely coupled (more robust):
    // Prescriptions subscribes to InrResultRecordedModuleEvent
    // Caches the latest INR per patient locally
    var localInr = await _db.InrSnapshots
        .Where(s => s.PatientId == patientId)
        .OrderByDescending(s => s.RecordedAt)
        .FirstOrDefaultAsync(ct);
    → Works when LabService is down.
    → INR value may be up to N minutes stale.
    → More complex: snapshot table, event handler, staleness policy.

Trade-off: simplicity vs resilience. Both are valid choices for different parts of the system.
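
The loosely coupled variant only works if "up to N minutes stale" is enforced somewhere. A minimal sketch of such a staleness policy follows. The `InrSnapshot` shape (beyond `PatientId` and `RecordedAt`), the `InrLookupResult` enum, and the `maxAge` parameter are assumptions for illustration.

```csharp
#nullable enable
using System;

// Hypothetical snapshot shape; the query above only shows PatientId and RecordedAt.
public sealed record InrSnapshot(Guid PatientId, decimal Value, DateTimeOffset RecordedAt);

public enum InrLookupResult { Fresh, Stale, Missing }

public static class InrStalenessPolicy
{
    // Classifies a cached snapshot so the caller can decide what to do with it:
    // use it, refuse to proceed, or fall back to a synchronous lookup.
    public static InrLookupResult Evaluate(
        InrSnapshot? snapshot, DateTimeOffset now, TimeSpan maxAge)
    {
        if (snapshot is null)
        {
            return InrLookupResult.Missing;
        }

        // Age is measured from when the result was recorded, not when the event arrived.
        TimeSpan age = now - snapshot.RecordedAt;
        return age <= maxAge ? InrLookupResult.Fresh : InrLookupResult.Stale;
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the policy deterministic and easy to test. What counts as an acceptable `maxAge` is a clinical decision, not a technical one.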

Normalisation vs Read Performance

// Fully normalised (correct write model):
// To display a prescription list, you need 3 joins across 4 tables:
// prescriptions joined to patients, wards, and medications

SELECT p.id, p.dose_amount, pat.full_name, w.ward_name, m.medication_name
FROM prescriptions.prescriptions p
JOIN patients.patients pat ON pat.id = p.patient_id
JOIN wards.wards w ON w.id = p.ward_id
JOIN medications.medications m ON m.id = p.medication_id
WHERE p.status = 'Active'
ORDER BY p.created_at DESC
OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY;

// Problem: the joins cross module boundaries, which violates module isolation.
// Also: 3 joins per query at 500 concurrent requests can become slow under load.

// Denormalised read model (solves read performance AND isolation):
// Each module maintains its own denormalised read table

CREATE TABLE prescriptions.prescription_list_view (
    id               UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    patient_full_name NVARCHAR(200)   NOT NULL,
    ward_name        NVARCHAR(100)    NOT NULL,
    medication_name  NVARCHAR(200)    NOT NULL,
    dose_amount      DECIMAL(10,4)    NOT NULL,
    status           NVARCHAR(50)     NOT NULL,
    created_at       DATETIME2        NOT NULL
);
// Updated via domain events — no join needed at query time.

// Trade-off:
//   + Reads are fast (one table, no joins)
//   + No cross-module joins
//   - Data may be seconds stale
//   - Extra table to maintain and keep consistent
//   - Event handler bugs cause stale read model data
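
The "updated via domain events" step is where most read-model bugs live, so it is worth sketching. The example below keeps the read model in memory to stay self-contained; a real handler would write to `prescription_list_view` instead. The event types `PrescriptionCreated` and `PrescriptionStatusChanged` are hypothetical, as is the `PrescriptionListProjection` class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain events; the text does not name the real ones.
public sealed record PrescriptionCreated(
    Guid Id, string PatientFullName, string WardName,
    string MedicationName, decimal DoseAmount, DateTime CreatedAt);

public sealed record PrescriptionStatusChanged(Guid Id, string NewStatus);

// Mirrors the columns of prescription_list_view.
public sealed record PrescriptionListRow(
    Guid Id, string PatientFullName, string WardName,
    string MedicationName, decimal DoseAmount, string Status, DateTime CreatedAt);

public sealed class PrescriptionListProjection
{
    private readonly Dictionary<Guid, PrescriptionListRow> _rows = new();

    public void Handle(PrescriptionCreated e)
    {
        var row = new PrescriptionListRow(e.Id, e.PatientFullName, e.WardName,
            e.MedicationName, e.DoseAmount, "Active", e.CreatedAt);

        // TryAdd makes redelivery of the same event a no-op.
        _rows.TryAdd(e.Id, row);
    }

    // Returns false when the row does not exist yet, so the caller can retry or alert.
    public bool Handle(PrescriptionStatusChanged e)
    {
        if (!_rows.TryGetValue(e.Id, out var row))
        {
            return false;
        }

        _rows[e.Id] = row with { Status = e.NewStatus };
        return true;
    }

    // Equivalent of the list query, answered from the read model alone.
    public IReadOnlyList<PrescriptionListRow> ActiveNewestFirst(int take) =>
        _rows.Values
            .Where(r => r.Status == "Active")
            .OrderByDescending(r => r.CreatedAt)
            .Take(take)
            .ToList();
}
```

Two choices here address the costs listed above: the insert is idempotent, so at-least-once delivery does not duplicate or reset rows, and an update for an unknown row is reported rather than silently dropped.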

Abstraction vs Simplicity

Every abstraction has a cost:
  → More code to write and maintain
  → More cognitive overhead for new developers
  → Potential over-engineering for a problem that doesn't exist yet

When abstraction pays:
  → You have a known variation point (multiple notification channels, multiple report formats)
  → You need to swap implementations (real DB vs test double)
  → The abstraction boundary matches a module or deployment boundary

When abstraction costs more than it saves:
  → YAGNI: "You Ain't Gonna Need It"
  → Adding IEmailService as an interface when there will never be a second implementation
     and there's no test that benefits from mocking it
  → Wrapping every repository in a Unit of Work when EF Core's DbContext is already a UoW

Example: Do you need IWarfarinDosageCalculator?
  If dosage calculation will never change → no. A static method or simple helper is enough.
  If dosage calculation depends on clinical guidelines that vary by hospital → yes.
  If you need to test handlers without triggering real dosage logic → yes.

The question is always: what real problem does this abstraction solve today?
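
For the case where the interface does earn its place (testing handlers without real dosage logic), a sketch might look like the following. The method signature, `ProposeDoseHandler`, `DoseProposal`, and `FixedDoseCalculator` are all hypothetical, and the test double returns a placeholder value with no clinical meaning.

```csharp
using System;

// Assumed signature; the text only names the interface.
public interface IWarfarinDosageCalculator
{
    decimal CalculateDailyDoseMg(decimal latestInr);
}

public sealed record DoseProposal(Guid PatientId, decimal DailyDoseMg);

// The handler depends on the interface, so hospital-specific guidelines
// can be swapped in without touching this class.
public sealed class ProposeDoseHandler
{
    private readonly IWarfarinDosageCalculator _calculator;

    public ProposeDoseHandler(IWarfarinDosageCalculator calculator) =>
        _calculator = calculator;

    public DoseProposal Handle(Guid patientId, decimal latestInr) =>
        new DoseProposal(patientId, _calculator.CalculateDailyDoseMg(latestInr));
}

// Test double: returns a fixed placeholder and contains no dosing logic.
public sealed class FixedDoseCalculator : IWarfarinDosageCalculator
{
    private readonly decimal _doseMg;

    public FixedDoseCalculator(decimal doseMg) => _doseMg = doseMg;

    public decimal CalculateDailyDoseMg(decimal latestInr) => _doseMg;
}
```

If neither the per-hospital variation nor the testing need exists, the interface and the test double are exactly the extra code the YAGNI point warns about.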

Trade-off Decision Framework

Step 1: State the forces
  → What does Option A give us?
  → What does Option A cost us?
  → Same for Option B (and C if applicable)

Step 2: Apply context
  → Team size and experience level
  → Operational capability (who runs this in production?)
  → Regulatory constraints
  → Timeline and budget

Step 3: Weight by quality attributes
  → Which quality attributes are high priority for this system?
  → Which option better satisfies the top 2-3 quality attributes?

Step 4: Make the decision explicit
  → Write an ADR. Include the options and why you rejected each.
  → State what gets harder with the chosen option.

Step 5: Define a review point
  → "We will revisit this decision if: the team grows past 10, or response times exceed 1s."
  → Constraints change. Decisions should be reviewable.
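
Step 3 can be made concrete with a simple weighted sum. This is a sketch of one possible scoring approach, not a prescribed method; `TradeOffScoring`, the attribute names, and the weights and scores in the usage note are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TradeOffScoring
{
    // weights:      quality attribute -> how much it matters for this system
    // optionScores: quality attribute -> how well the option satisfies it
    // Attributes missing from optionScores count as zero.
    public static decimal WeightedScore(
        IReadOnlyDictionary<string, decimal> weights,
        IReadOnlyDictionary<string, decimal> optionScores)
    {
        return weights.Sum(w =>
            optionScores.TryGetValue(w.Key, out var score) ? w.Value * score : 0m);
    }
}
```

With weights of safety 5, availability 3, simplicity 2, an option scored 5 / 2 / 4 totals 39 and an option scored 2 / 5 / 2 totals 29. The number is not the decision. Its value is that the weights force the team to state which quality attributes matter most, and that belongs in the ADR.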

Production issue I've seen: A team chose event sourcing for their entire clinical platform, attracted by the audit trail benefit. After 18 months, querying current prescription state required replaying thousands of events per patient. Simple reports that previously took 200ms took 8 seconds. The trade-off analysis had been done once, for the audit trail requirement, and the read-performance cost was acknowledged but dismissed as "we can add projections later." Two years of technical debt later, they were adding projections — while the system was live, under load, with data already in production. The read-performance trade-off of event sourcing needed to be part of the original decision with a concrete plan for handling it.

Key Takeaway

Every architecture decision trades something for something else. Consistency buys correctness at the cost of availability. Loose coupling buys autonomy at the cost of complexity. Denormalisation buys read speed at the cost of stale data and extra maintenance. Make trade-offs explicit: state what you gain, what you give up, and which quality attribute drives the choice. Write the decision in an ADR with a review condition so it can be revisited when the context changes.