CAP Theorem in Practice — Consistency, Availability, and Partition Tolerance

The Theorem Everyone Misquotes

Eric Brewer's CAP theorem states that a distributed data store cannot simultaneously guarantee all three of the following properties:

Consistency (C) — Every read receives the most recent write, or an error.
Availability (A) — Every request receives a non-error response (though it may not be the most recent data).
Partition Tolerance (P) — The system continues to operate despite network partitions (message loss or delay).

The "pick 2" framing is misleading. In any real distributed system, partitions will happen — network failures are a fact of life. The real choice is: when a partition occurs, do you sacrifice consistency or availability? You always need P. The choice is between CP and AP.

This distinction matters enormously in .NET systems that connect to external databases, message buses, and caches. Getting it wrong means either data corruption or phantom downtime.

CP Systems — Consistency Over Availability

A CP system guarantees that all nodes see the same data, but if a partition separates a node from the majority, that node will refuse to serve reads or writes rather than risk returning stale data.

PostgreSQL — The Quintessential CP Database

PostgreSQL with synchronous replication is CP. When you configure:

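# postgresql.conf on primary
synchronous_standby_names = 'replica1'
synchronous_commit = on

With these settings, the primary does not acknowledge a commit to the client until replica1 has flushed the corresponding WAL (Write-Ahead Log) records to disk. If replica1 becomes unreachable, commits on the primary block. Writes hang. That is the consistency guarantee at work: every acknowledged write already exists on replica1, so failing over to it cannot lose committed data. Note that synchronous_commit = on does not by itself guarantee that a read on replica1 sees the write immediately; for that, use synchronous_commit = remote_apply, which waits until the replica has replayed the change.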

This is the appropriate trade-off for financial ledger data, inventory counts, or any domain where "reading stale data" leads to real business harm (double-spending, overselling, regulatory violations).

SQL Server Always On — Synchronous Mode

SQL Server Availability Groups support the same model:

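-- Run on the primary; set each option in its own statement
ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON 'Replica1'
WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);

ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON 'Replica1'
WITH (FAILOVER_MODE = AUTOMATIC);

The secondary replica must harden the log before the primary reports a commit as successful. If the primary's node loses Windows Server Failover Cluster quorum during a partition, the availability group goes offline on that node, which prevents a split-brain scenario. If only the synchronous secondary becomes unreachable, the primary by default stops waiting for it once the session timeout expires and keeps committing; setting REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT (SQL Server 2017 and later) makes the databases unavailable for user transactions instead.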

AP Systems — Availability Over Consistency

An AP system stays responsive during partitions but may return stale or conflicting data. Nodes can diverge and reconcile later.
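A toy model can make the two behaviours concrete. The sketch below is not how any real database is implemented; ReplicaNode, PartitionPolicy, and their members are hypothetical names invented for illustration. It models a single replica that holds a local copy of one value and knows whether it can currently reach the majority.

```csharp
using System;

public enum PartitionPolicy { PreferConsistency, PreferAvailability }

// Toy replica (hypothetical): holds a possibly stale local copy of one value.
public sealed class ReplicaNode
{
    private readonly PartitionPolicy _policy;
    private int _localValue;

    public ReplicaNode(PartitionPolicy policy, int initialValue)
    {
        _policy = policy;
        _localValue = initialValue;
    }

    // True while the node cannot reach the majority of the cluster.
    public bool IsPartitioned { get; set; }

    // Replicated writes only arrive while the node is connected.
    public void ApplyReplicatedWrite(int value)
    {
        if (!IsPartitioned) _localValue = value;
    }

    // CP: fail rather than risk a stale answer. AP: answer with the local copy.
    public int Read()
    {
        if (IsPartitioned && _policy == PartitionPolicy.PreferConsistency)
            throw new InvalidOperationException("Partitioned from the majority; refusing to serve.");
        return _localValue;
    }
}
```

With both nodes partitioned and a write of 7 made elsewhere, the PreferAvailability node keeps answering with its old value, while the PreferConsistency node throws. Neither is wrong; they fail in different directions.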

Cassandra — Tunable AP by Default

Cassandra is famously AP. It uses consistent hashing to distribute data across a ring of nodes, and replication factor RF determines how many copies exist. When you read or write, the consistency level (CL) determines how many replicas must respond.
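The arithmetic behind consistency levels is small enough to write down. The helper below is an illustrative sketch (ReplicaMath and its method names are made up here, not part of any Cassandra driver): a quorum is a majority of the replicas, and a read is guaranteed to touch at least one replica that took the latest acknowledged write only when the read and write replica counts add up to more than RF.

```csharp
public static class ReplicaMath
{
    // Majority of replicas: floor(rf / 2) + 1.
    public static int Quorum(int replicationFactor) => replicationFactor / 2 + 1;

    // The read set and the write set must share at least one replica
    // when writeReplicas + readReplicas > replicationFactor.
    public static bool ReadOverlapsWrite(int replicationFactor, int writeReplicas, int readReplicas)
        => writeReplicas + readReplicas > replicationFactor;
}
```

For RF=3, Quorum(3) is 2, so QUORUM writes plus QUORUM reads overlap (2 + 2 > 3), while ONE plus ONE does not (1 + 1 <= 3).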

With RF=3 and CL=ONE, a write succeeds if one replica acknowledges it. A partition can leave two nodes with different values. Cassandra resolves this with last-write-wins based on the write timestamp, which is usually supplied by the client driver. The strategy silently drops updates if clocks drift.
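To see how clock drift loses data, here is a minimal last-write-wins merge. It is a simplification written for this article, not Cassandra's actual code: Versioned and LastWriteWins are hypothetical names, and ties simply keep the current value, which is an assumption of this sketch.

```csharp
using System;

// A value tagged with the timestamp the writer supplied (microseconds).
public readonly record struct Versioned(string Value, long TimestampMicros);

public static class LastWriteWins
{
    // Keep whichever version carries the higher timestamp.
    // Ties keep the current value (a simplification for this sketch).
    public static Versioned Merge(Versioned current, Versioned incoming)
        => incoming.TimestampMicros > current.TimestampMicros ? incoming : current;
}
```

Suppose client A's clock runs five seconds fast and it writes "price=10" stamped 5,000,000. One real second later, client B writes "price=12" stamped 1,000,000. Merge keeps "price=10": the write that happened later in real time is discarded, and nobody is told.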

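The trade-off: at CL=ONE, Cassandra accepts a write as long as a single replica for that key is reachable (higher consistency levels can still fail with an unavailable error when too few replicas respond). This makes it excellent for time-series data, event logs, and IoT telemetry where availability trumps strict accuracy.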

DynamoDB — Tunable Consistency Per Read

DynamoDB sits in the AP camp by default but lets you opt into consistency:

// Eventual consistency read (AP behaviour — cheaper, faster)
var getRequest = new GetItemRequest
{
    TableName = "Inventory",
    Key = new Dictionary<string, AttributeValue>
    {
        ["ProductId"] = new AttributeValue { S = productId }
    },
    ConsistentRead = false  // default — may return stale data
};

// Strong consistency read (CP behaviour — more expensive, 2x read capacity)
var strongRequest = new GetItemRequest
{
    TableName = "Inventory",
    Key = new Dictionary<string, AttributeValue>
    {
        ["ProductId"] = new AttributeValue { S = productId }
    },
    ConsistentRead = true  // reads from the leader replica
};

This per-operation tunability is why DynamoDB is described as "tunable" — you pay the availability cost only where business logic demands it.
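The "2x read capacity" cost can be estimated up front. The sketch below assumes provisioned capacity mode and the documented rule that one read capacity unit covers one strongly consistent read of an item up to 4 KB, or two eventually consistent reads. DynamoReadCost is a hypothetical helper, not part of the AWS SDK, and it ignores transactional reads.

```csharp
using System;

public static class DynamoReadCost
{
    private const int BytesPerUnit = 4 * 1024;

    // Read capacity units consumed by a single GetItem of the given size.
    public static double ReadCapacityUnits(int itemSizeBytes, bool consistentRead)
    {
        if (itemSizeBytes <= 0) throw new ArgumentOutOfRangeException(nameof(itemSizeBytes));
        int blocks = (itemSizeBytes + BytesPerUnit - 1) / BytesPerUnit; // round up to 4 KB
        return consistentRead ? blocks : blocks / 2.0;
    }
}
```

A 1,000-byte item costs 1 unit with ConsistentRead = true and 0.5 with false; a 5,000-byte item costs 2 and 1 respectively.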

How .NET Systems Choose Consistency Level

EF Core with Read Replicas

The most common .NET pattern for scaling PostgreSQL is directing writes to the primary and reads to read replicas. This is deliberately eventually consistent: a row written to the primary may not yet appear on the replica. EF Core's UseQuerySplittingBehavior is a separate concern (it controls whether a query that includes collections runs as one SQL statement with JOINs or as several statements) and does no replica routing. The read-replica pattern requires explicit routing.

// DbContext factory that routes by operation type
public class AppDbContextFactory
{
    private readonly string _primaryCs;
    private readonly string _replicaCs;

    public AppDbContextFactory(IConfiguration config)
    {
        _primaryCs = config.GetConnectionString("Primary")!;
        _replicaCs = config.GetConnectionString("Replica")!;
    }

    public AppDbContext CreateWriteContext()
    {
        var opts = new DbContextOptionsBuilder<AppDbContext>()
            .UseNpgsql(_primaryCs,
                o => o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery))
            .Options;
        return new AppDbContext(opts);
    }

    public AppDbContext CreateReadContext()
    {
        var opts = new DbContextOptionsBuilder<AppDbContext>()
            .UseNpgsql(_replicaCs,
                o => o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery))
            .Options;
        return new AppDbContext(opts);
    }
}

// Service usage
public class ProductService(AppDbContextFactory factory)
{
    // Reads tolerate eventual consistency — use replica
    public async Task<List<Product>> GetCatalogAsync()
    {
        await using var ctx = factory.CreateReadContext();
        return await ctx.Products.AsNoTracking().ToListAsync();
    }

    // Writes must hit the primary
    public async Task UpdatePriceAsync(Guid productId, decimal newPrice)
    {
        await using var ctx = factory.CreateWriteContext();
        var product = await ctx.Products.FindAsync(productId)
            ?? throw new KeyNotFoundException();
        product.Price = newPrice;
        await ctx.SaveChangesAsync();
    }
}

The risk: if you write a price update and immediately read it back through the replica, you may see the old price until the replica catches up. That is typically a matter of milliseconds, but replication lag has no fixed upper bound and grows under heavy load. For a product catalog, that is acceptable. For a payment confirmation screen, it is not.

Redis — The `WAIT` Command for Synchronous Replication

Redis is AP by default. If you use Redis Sentinel or Redis Cluster and the primary fails before replication completes, the replica-that-becomes-primary will be missing recent writes.

The WAIT numreplicas timeout command blocks the client until numreplicas replicas acknowledge the previous writes on that connection, or until timeout milliseconds pass. This pushes a specific operation toward CP behaviour without fully reaching it:

public class RedisInventoryCache(IConnectionMultiplexer redis)
{
    private readonly IDatabase _db = redis.GetDatabase();

    // Write to the primary, then wait (bounded) for 1 replica to acknowledge
    public async Task SetStockAsync(string productId, int quantity)
    {
        var key = $"stock:{productId}";
        await _db.StringSetAsync(key, quantity);

        // Wait for at least 1 replica to acknowledge, up to 500 ms.
        // WAIT returns the number of replicas that acknowledged.
        var replicated = await _db.ExecuteAsync("WAIT", 1, 500);
        var count = (long)replicated;

        if (count < 1)
        {
            // Replication did not complete in time: log a warning,
            // but do NOT throw, because the write already landed on the primary.
            // Treat this as a monitoring signal, not a hard failure.
        }
    }

    // Reads go to the primary by default. A read issued with
    // CommandFlags.PreferReplica may return data that has not propagated yet.
    public async Task<int?> GetStockAsync(string productId)
    {
        var value = await _db.StringGetAsync($"stock:{productId}");
        return value.HasValue ? (int?)(int)value : null;
    }
}

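WAIT does not give you true CP Redis. The write is applied on the primary, and visible to other clients, before WAIT returns, and if the timeout expires with too few acknowledgements the write is not rolled back. A failover can therefore still promote a replica that lacks it. What WAIT does provide is a much smaller chance of losing an acknowledged write, which is often sufficient for inventory reservation workflows.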

PACELC — The More Practical Model

The CAP theorem only applies when there is a network partition. Daniel Abadi's PACELC theorem extends this by asking: even when the system is running normally (no partition), does it optimise for latency or consistency?

PACELC:
  If Partition   → choose between Availability and Consistency  (the "PAC" part)
  Else (normal)  → choose between Latency      and Consistency  (the "ELC" part)

This captures the real engineering trade-off: synchronous replication adds latency even when nothing is wrong. Systems like DynamoDB and Cassandra sacrifice consistency all the time (EL) to get lower latency, not just during partitions.

| System | Partition Behaviour | Normal Behaviour |
|---|---|---|
| PostgreSQL sync replication | CP: writes block | EC: higher write latency |
| PostgreSQL async replication | AP: possible stale reads | EL: low latency |
| Cassandra CL=QUORUM | CP: may reject requests | EC: waits for a majority of replicas |
| Cassandra CL=ONE | AP: replicas diverge | EL: very low latency |
| Redis (no WAIT) | AP: may lose writes | EL: very low latency |
| DynamoDB ConsistentRead=true | CP | EC |
| DynamoDB ConsistentRead=false | AP | EL |

For .NET architects, PACELC is the frame you should use when justifying database choices to stakeholders. "We use async replication (AP during partitions, EL otherwise) because our SLA allows for 200ms data lag but cannot tolerate write latency above 20ms" is a real architectural decision backed by the model.

Real Scenario: Inventory Service Trade-Off

Consider an e-commerce inventory service. The stock_levels table tracks how many units remain for each product. Two competing options:

Option A — Strong Consistency (CP, sync replication)

Every stock decrement waits for the replica to acknowledge
Write latency: ~5–15ms extra per operation
No risk of overselling: the primary is always authoritative
During a replica network partition, writes block until the replica returns or an operator removes it from synchronous_standby_names

Option B — Eventual Consistency (AP, async replication + optimistic concurrency)

Stock decrements write to the primary with async replication
Reads from replicas may be 0–300ms behind
Oversell risk is managed with a row-version check on the primary at checkout time
No latency penalty during normal operation

Here is Option B in practice, using xmin (a PostgreSQL system column holding the ID of the transaction that created the current row version) for optimistic concurrency:

// Assumes two data sources are registered (for example via keyed services):
// one pointing at the primary, one at a read replica
public class InventoryService(NpgsqlDataSource primary, NpgsqlDataSource replica)
{
    // Read from the replica by default: may be slightly stale, fine for display.
    // Pass fromPrimary: true when re-reading after a failed reservation.
    public async Task<StockLevel> GetDisplayStockAsync(Guid productId, bool fromPrimary = false)
    {
        var source = fromPrimary ? primary : replica;
        await using var conn = await source.OpenConnectionAsync();
        await using var cmd = conn.CreateCommand();
        cmd.CommandText = """
            SELECT product_id, quantity, xmin
            FROM stock_levels
            WHERE product_id = @productId
            """;
        cmd.Parameters.AddWithValue("productId", productId);
        await using var reader = await cmd.ExecuteReaderAsync();
        if (!await reader.ReadAsync())
            throw new KeyNotFoundException($"No stock row for product {productId}");
        return new StockLevel(
            reader.GetGuid(0),
            reader.GetInt32(1),
            RowVersion: reader.GetFieldValue<uint>(2)
        );
    }

    // Decrement stock at checkout: MUST go to the primary, checks row version
    public async Task<bool> TryReserveAsync(Guid productId, int quantity, uint expectedXmin)
    {
        await using var conn = await primary.OpenConnectionAsync();
        await using var cmd = conn.CreateCommand();
        // The xmin check prevents lost updates from concurrent requests
        cmd.CommandText = """
            UPDATE stock_levels
            SET quantity = quantity - @qty
            WHERE product_id = @productId
              AND xmin = @expectedXmin
              AND quantity >= @qty
            """;
        cmd.Parameters.AddWithValue("productId", productId);
        cmd.Parameters.AddWithValue("qty", quantity);
        cmd.Parameters.AddWithValue("expectedXmin",
            NpgsqlTypes.NpgsqlDbType.Xid, expectedXmin);

        var rowsAffected = await cmd.ExecuteNonQueryAsync();
        return rowsAffected == 1; // false means conflict or insufficient stock
    }
}

public record StockLevel(Guid ProductId, int Quantity, uint RowVersion);

When TryReserveAsync returns false, the caller calls GetDisplayStockAsync again with fromPrimary: true and repeats the attempt with the fresh row version. This is classic optimistic concurrency. The replica is used only for display; all mutating operations touch the primary.
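The retry loop itself is worth sketching, because an unbounded loop under contention is its own outage. The helper below is one possible shape, not a prescribed implementation: OptimisticRetry, its parameter names, and the default of three attempts are all assumptions. It takes the two operations as delegates so it can be exercised without a database.

```csharp
using System;
using System.Threading.Tasks;

public static class OptimisticRetry
{
    // readVersionFromPrimary: fetches the current row version from the primary.
    // tryReserve: attempts the conditional update with the given version.
    public static async Task<bool> ReserveWithRetryAsync(
        Func<Task<uint>> readVersionFromPrimary,
        Func<uint, Task<bool>> tryReserve,
        uint initialVersion,
        int maxAttempts = 3)
    {
        uint version = initialVersion;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (await tryReserve(version)) return true;

            // Conflict (or insufficient stock): refresh the version before retrying.
            if (attempt < maxAttempts)
                version = await readVersionFromPrimary();
        }
        return false; // give up and let the caller report the failure
    }
}
```

Because a false result covers both a version conflict and insufficient stock, a real caller would probably check the refreshed quantity before retrying, so that a sold-out product fails fast instead of burning every attempt.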

This is a conscious AP/EL choice for reads and CP behaviour at the final write. It is not "pick 2"; it is using the right model for each operation.

Practical Decision Framework for .NET Teams

Use strong consistency when:

Data drives financial calculations (balances, invoices, credits)
Regulatory compliance requires audit trails with no gaps
Two services must agree on a fact before proceeding (e.g., payment before shipment)

Use eventual consistency when:

Data is additive or idempotent (log entries, telemetry)
The user experience tolerates brief staleness (product catalog, recommendation lists)
Write throughput is the primary constraint (high-volume event ingestion)

Patterns that bridge the gap:

Read-your-writes consistency — after a write, route the user's next read to the primary for a short TTL, then fall back to the replica.
Monotonic read consistency — record which replica version a user last read; always route to an equal or newer replica. Not natively supported by most ORMs, requires a sticky session or token.
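Causal consistency — pass a causality token (e.g., a write position or timestamp) to the read call; the server waits until it has caught up to that point. With PostgreSQL, one way to do this is to capture pg_current_wal_lsn() on the primary after the write and have the read wait until the replica's pg_last_wal_replay_lsn() has reached it. DynamoDB's ClientRequestToken is not such a token: it makes transactional writes idempotent.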
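Of the three patterns, read-your-writes is the easiest to add to an existing service. Here is a minimal in-memory sketch, assuming a single application instance and a per-user sticky window. ReadYourWritesRouter, ReadTarget, and the injectable clock are hypothetical names; a multi-instance deployment would need to keep the last-write marker in a cookie, a token, or a shared cache instead.

```csharp
using System;
using System.Collections.Concurrent;

public enum ReadTarget { Replica, Primary }

public sealed class ReadYourWritesRouter
{
    private readonly ConcurrentDictionary<string, DateTimeOffset> _lastWrite = new();
    private readonly TimeSpan _stickyWindow;
    private readonly Func<DateTimeOffset> _now;

    // now: injectable clock, defaults to the system UTC clock.
    public ReadYourWritesRouter(TimeSpan stickyWindow, Func<DateTimeOffset>? now = null)
    {
        _stickyWindow = stickyWindow;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    // Call after every successful write made on behalf of the user.
    public void RecordWrite(string userId) => _lastWrite[userId] = _now();

    // Recent writers read from the primary; everyone else reads from a replica.
    public ReadTarget ChooseReadTarget(string userId)
        => _lastWrite.TryGetValue(userId, out var at) && _now() - at < _stickyWindow
            ? ReadTarget.Primary
            : ReadTarget.Replica;
}
```

The window should comfortably exceed the replication lag you observe in production. Note that this sketch never evicts old entries, so a long-running process would want a cache with expiry rather than a bare dictionary.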

Key Takeaways

The CAP theorem is often taught as a static system property. In reality, modern databases let you tune consistency per operation. The engineering skill is in identifying which operations truly need strong consistency and paying the latency cost only there.

PACELC is the honest extension: even without partitions, synchronous replication adds latency. AP/EL systems (Cassandra, Redis without WAIT, async PostgreSQL) are not broken. They are making an explicit trade-off that is correct for many workloads.

In .NET systems:

Use ConsistentRead = true in DynamoDB for checkout flows; false for browse flows.
Use WAIT 1 500 in Redis when a write should reach at least one replica before you proceed, for example so that it is likely to survive a primary failover.
Route EF Core SaveChangesAsync to primary, AsNoTracking queries to replicas — and validate at write time with optimistic concurrency.
Configure synchronous replication in PostgreSQL (synchronous_standby_names together with synchronous_commit = on or remote_apply) only for data where the cost of a lost write exceeds the cost of higher write latency.
