System Design · Lesson 14 of 26

Sync vs Async Communication — REST, gRPC & Message Queues

How your services communicate is one of the most consequential architectural decisions you'll make. Get it wrong and you'll have cascading failures, data inconsistency, and a system that's hard to reason about.

This article gives you a framework for choosing the right communication pattern for each service boundary.

Synchronous vs Asynchronous — The Core Trade-off

Synchronous (Request-Response):
  Service A ──── request ────→ Service B
  Service A ←─── response ─── Service B
  Service A blocks until B responds

Asynchronous (Fire and Forget):
  Service A ──── message ────→ Message Broker
  Service A continues (doesn't wait)
                               Message Broker ──→ Service B (eventually)

| | Synchronous | Asynchronous |
|---|---|---|
| Coupling | Temporal coupling (A needs B to be up) | Decoupled (A just needs the broker) |
| Complexity | Simple request/response | Publisher, broker, consumer, at-least-once delivery |
| Latency | Immediate response | Eventual processing |
| Resilience | A fails if B is down | A succeeds even if B is down |
| Best for | Need a result right now | Fire-and-forget, workflows, events |
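
The trade-off can be sketched in-process. This is a minimal illustration rather than a production setup: `System.Threading.Channels` stands in for a real broker, and `OrderPlaced` and `CommunicationStyles` are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical message type, used only for this illustration.
public record OrderPlaced(string OrderId);

public static class CommunicationStyles
{
    // Synchronous style: the caller awaits the callee and gets its answer
    // (or its failure) before it can continue.
    public static async Task<bool> ReserveStockSync(
        Func<string, Task<bool>> inventoryService, string orderId)
    {
        return await inventoryService(orderId); // throws if the callee is down
    }

    // Asynchronous style: the caller hands the message to the broker and returns.
    // The channel plays the role of the broker's queue.
    public static bool Publish(Channel<OrderPlaced> broker, OrderPlaced message)
    {
        return broker.Writer.TryWrite(message); // works even if no consumer is running
    }

    // The consumer drains the queue whenever it comes online.
    // Returns once the writer side has been completed.
    public static async Task<List<string>> ConsumeAll(Channel<OrderPlaced> broker)
    {
        var processed = new List<string>();
        await foreach (var message in broker.Reader.ReadAllAsync())
        {
            processed.Add(message.OrderId);
        }
        return processed;
    }
}
```

With an unbounded channel from `Channel.CreateUnbounded<OrderPlaced>()`, `Publish` returns immediately whether or not anything is reading, which is the resilience row of the table. The price is that the publisher never learns the outcome of the processing.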

REST — The Default for Synchronous

REST over HTTP/HTTPS is the default for service-to-service communication. It's well understood, well tooled, and language-agnostic.

When REST works well:

CRUD operations on resources
Public-facing APIs
When the caller needs an immediate response
When teams use different tech stacks (REST is language-agnostic)

REST best practices for microservices:

GET    /orders/{id}           → get order by ID
POST   /orders                → create order
PUT    /orders/{id}/status    → update order status
DELETE /orders/{id}           → cancel order

Versioning:
  /api/v1/orders  → version in URL (simple, visible)
  Accept: application/vnd.myapi.v1+json → version in header (clean URLs)
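
As a rough client-side sketch, the two versioning styles differ only in where the version travels. The `VersionedRequests` helper and its method names are hypothetical. The media type is the one shown above.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class VersionedRequests
{
    // Option 1: the version is part of the URL path.
    public static HttpRequestMessage GetOrderByUrlVersion(Uri baseUri, string orderId)
    {
        var path = $"/api/v1/orders/{Uri.EscapeDataString(orderId)}";
        return new HttpRequestMessage(HttpMethod.Get, new Uri(baseUri, path));
    }

    // Option 2: the URL stays version-free and the version is carried
    // by a vendor media type in the Accept header.
    public static HttpRequestMessage GetOrderByHeaderVersion(Uri baseUri, string orderId)
    {
        var path = $"/orders/{Uri.EscapeDataString(orderId)}";
        var request = new HttpRequestMessage(HttpMethod.Get, new Uri(baseUri, path));
        request.Headers.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/vnd.myapi.v1+json"));
        return request;
    }
}
```

URL versioning is easier to see in logs and to route at a gateway. Header versioning keeps one URL per resource but needs every client to set the header correctly.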

REST limitations:

HTTP overhead for high-frequency internal calls
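No schema enforcement at compile time (unless you add OpenAPI and code generation)
No first-class streaming model (chunked responses and server-sent events only flow server-to-client)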
Over-fetching (response includes fields you don't need)

gRPC — For High-Performance Internal Calls

gRPC uses HTTP/2 + Protocol Buffers (binary serialization). It's typically faster than JSON over REST, enforces a schema, and supports streaming.

gRPC vs REST performance:

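JSON over HTTP/1.1 (illustrative figures):
  Request:  ~200 bytes (text, headers)
  Parse:    ~0.5ms (JSON deserialization)

Protobuf over HTTP/2 (illustrative figures):
  Request:  ~50 bytes (compact binary, compressed headers)
  Parse:    ~0.05ms (binary deserialization)
  
In this illustration: ~4× smaller messages, ~10× faster parsing. Actual gains depend on payload shape and size.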

Defining a gRPC service:

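// order.proto
syntax = "proto3";

import "google/protobuf/timestamp.proto";  // required for google.protobuf.Timestamp below

// OrderItem, CreateOrderRequest, and CreateOrderResponse are omitted for brevity.
message GetOrderRequest { string id = 1; }
message ListOrdersRequest { string user_id = 1; }
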
service OrderService {
  rpc GetOrder (GetOrderRequest) returns (Order);
  rpc ListOrders (ListOrdersRequest) returns (stream Order);  // server streaming
  rpc CreateOrder (CreateOrderRequest) returns (CreateOrderResponse);
}

message Order {
  string id = 1;
  string user_id = 2;
  OrderStatus status = 3;
  repeated OrderItem items = 4;
  google.protobuf.Timestamp created_at = 5;
}

enum OrderStatus {
  PENDING = 0;
  CONFIRMED = 1;
  SHIPPED = 2;
  DELIVERED = 3;
}

// .NET gRPC service implementation
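public class OrderGrpcService : OrderService.OrderServiceBase
{
    private readonly IOrderRepository _orders;

    // The repository is supplied by dependency injection
    public OrderGrpcService(IOrderRepository orders) => _orders = orders;
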
    public override async Task<Order> GetOrder(
        GetOrderRequest request, ServerCallContext context)
    {
        var order = await _orders.GetByIdAsync(request.Id);
        if (order == null)
            throw new RpcException(new Status(StatusCode.NotFound, "Order not found"));
        
        return MapToProto(order);
    }
    
    // Server-side streaming: push multiple responses for one request
    public override async Task ListOrders(
        ListOrdersRequest request,
        IServerStreamWriter<Order> responseStream,
        ServerCallContext context)
    {
        await foreach (var order in _orders.StreamByUserAsync(request.UserId))
        {
            await responseStream.WriteAsync(MapToProto(order));
        }
    }
}

When to use gRPC:

High-frequency internal service-to-service calls
When you need streaming (real-time data, large result sets)
When schema enforcement matters (breaking change detection at compile time)
Mobile clients on poor network connections (binary is smaller)

When NOT to use gRPC:

Browser clients (browsers can't speak native gRPC; gRPC-Web needs a proxy or server-side support, so REST is easier)
Public APIs (REST is more accessible)
Simple one-off queries (overhead of proto definition isn't worth it)

Message Queues — For Asynchronous Decoupling

A message queue sits between services. The publisher sends a message and moves on. The consumer processes it when it's ready.

RabbitMQ — Task Queues and Routing

RabbitMQ uses an exchange/queue model. Exchanges route messages to queues based on routing keys.

Topology:
  Publisher → Exchange → (routing rule) → Queue → Consumer

Exchange types:
  Direct:  exact routing key match
  Topic:   routing key pattern (order.created, order.*)
  Fanout:  broadcast to all bound queues
  Headers: route by message headers
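
To make the topic exchange concrete, here is a small sketch of how a binding pattern is compared with a routing key. It follows RabbitMQ's documented semantics (words separated by dots, `*` matches exactly one word, `#` matches zero or more words), but `TopicMatcher` is an illustrative re-implementation, not the broker's own code.

```csharp
using System;

public static class TopicMatcher
{
    // Returns true when the routing key satisfies the binding pattern.
    public static bool Matches(string pattern, string routingKey)
    {
        return Match(pattern.Split('.'), 0, routingKey.Split('.'), 0);
    }

    private static bool Match(string[] p, int pi, string[] k, int ki)
    {
        // Pattern exhausted: it matches only if the key is exhausted too.
        if (pi == p.Length) return ki == k.Length;

        if (p[pi] == "#")
        {
            // '#' may absorb zero or more words, so try every possible split.
            for (int next = ki; next <= k.Length; next++)
            {
                if (Match(p, pi + 1, k, next)) return true;
            }
            return false;
        }

        // Any other pattern word must consume exactly one key word.
        if (ki == k.Length) return false;
        if (p[pi] == "*" || p[pi] == k[ki]) return Match(p, pi + 1, k, ki + 1);
        return false;
    }
}
```

Under these rules `order.*` matches `order.created` but not `order.created.eu`, while `order.#` matches both. A pattern with no wildcards behaves like a direct exchange.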

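// Publishing to RabbitMQ (RabbitMQ.Client 6.x API)
public class OrderEventPublisher
{
    private readonly IModel _channel;

    public OrderEventPublisher(IModel channel) => _channel = channel;

    public void PublishOrderCreated(OrderCreatedEvent @event)
    {
        var body = JsonSerializer.SerializeToUtf8Bytes(@event);

        // Mark the message persistent so it survives a broker restart in a durable queue
        var properties = _channel.CreateBasicProperties();
        properties.Persistent = true;

        _channel.BasicPublish(
            exchange: "order-events",
            routingKey: "order.created",
            basicProperties: properties,
            body: body
        );
    }
}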

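// Consuming from RabbitMQ (RabbitMQ.Client 6.x API)
public class NotificationConsumer : BackgroundService
{
    private readonly IModel _channel;
    // INotificationService is an assumed interface name; the original does not show the field's type
    private readonly INotificationService _notificationService;

    public NotificationConsumer(IModel channel, INotificationService notificationService)
    {
        _channel = channel;
        _notificationService = notificationService;
    }

    protected override Task ExecuteAsync(CancellationToken ct)
    {
        // exclusive and autoDelete default to true, so set them explicitly for a durable shared queue
        _channel.QueueDeclare("notification-queue", durable: true, exclusive: false, autoDelete: false);
        _channel.QueueBind("notification-queue", "order-events", "order.created");

        var consumer = new EventingBasicConsumer(_channel);
        consumer.Received += async (_, ea) =>
        {
            try
            {
                var @event = JsonSerializer.Deserialize<OrderCreatedEvent>(ea.Body.ToArray());
                await _notificationService.SendOrderConfirmationAsync(@event);
                _channel.BasicAck(ea.DeliveryTag, multiple: false);  // acknowledge only after success
            }
            catch (Exception)
            {
                // Reject without requeue so a poison message is dead-lettered (if configured) instead of looping
                _channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: false);
            }
        };

        _channel.BasicConsume("notification-queue", autoAck: false, consumer);
        return Task.CompletedTask;
    }
}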

Azure Service Bus — Enterprise Messaging

Azure Service Bus is the managed equivalent for Azure workloads. It supports queues (point-to-point) and topics (publish-subscribe).

// Azure Service Bus sender
// In a real service, register ServiceBusClient as a singleton instead of creating one per send.
await using var client = new ServiceBusClient(connectionString);
await using ServiceBusSender sender = client.CreateSender("order-events");

var message = new ServiceBusMessage(
    JsonSerializer.SerializeToUtf8Bytes(orderCreatedEvent))
{
    ContentType = "application/json",
    CorrelationId = correlationId,
    SessionId = order.UserId,  // per-user ordering; takes effect only on a session-enabled queue or subscription
};

await sender.SendMessageAsync(message);

Kafka — Event Streaming at Scale

Kafka is a distributed log. Messages are persisted and consumers can replay history.

Kafka key concepts:
  Topic: a named log (e.g., "order-events")
  Partition: topics split into partitions for parallelism
  Offset: position of a message in a partition
  Consumer Group: multiple consumers sharing topic consumption
  
Partitioning by key:
  Kafka's default partitioner hashes the key, so messages with the same key land in the same partition
  This guarantees ordering per key (e.g., per user_id or order_id); many keys can share one partition
  
                  Partition 0: user_1 messages
  Topic ─────────→ Partition 1: user_2 messages
                  Partition 2: user_3 messages
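
The routing rule can be sketched as hash-then-modulo. This is an approximation for illustration: Kafka's Java client hashes keys with murmur2 by default, while the hypothetical `KeyPartitioner` below uses FNV-1a as a simple stand-in, so the partition numbers it produces will not match a real cluster.

```csharp
using System;
using System.Text;

public static class KeyPartitioner
{
    // FNV-1a 32-bit hash: a small, stable hash used here as a stand-in
    // for the partitioner's real hash function.
    public static uint Fnv1a(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Same key and same partition count always give the same partition,
    // which is what preserves per-key ordering.
    public static int PartitionFor(string key, int partitionCount)
    {
        if (partitionCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(partitionCount));
        return (int)(Fnv1a(key) % (uint)partitionCount);
    }
}
```

Because the result depends on `partitionCount`, adding partitions to a topic later can move a key to a different partition. Ordering per key holds only while the partition count stays fixed.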

Use Kafka when: you need event replay, high throughput (millions of messages/sec), event sourcing, or stream processing (Kafka Streams, ksqlDB).

Use RabbitMQ or Azure Service Bus when: you need routing flexibility, priority queues, or dead-letter queues with easy management. Azure Service Bus is the natural choice if you're already on Azure.

Events vs Commands

These are conceptually different message types:

Command: A request to do something. One sender, one receiver.

Message: "ProcessPayment" for order #123 → Payment Service
Semantics: "Please do this"
If Payment Service is down, this is a problem

Event: A notification that something happened. One publisher, many subscribers.

Message: "OrderPlaced" for order #123
Semantics: "This happened. Do what you need."
Publisher doesn't know or care who reacts
Notification Service, Inventory Service, Analytics Service all subscribe

Rule of thumb:

Use commands when you need a result or acknowledgment
Use events when you want to notify interested parties without coupling
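
The difference shows up in how a bus treats the two message types. The sketch below is an in-memory illustration with hypothetical names (`InMemoryBus`, `Send`, `Publish`). Real brokers enforce none of this for you; it is a convention your code or messaging library applies.

```csharp
using System;
using System.Collections.Generic;

public sealed class InMemoryBus
{
    private readonly Dictionary<string, Func<string, string>> _commandHandlers = new();
    private readonly Dictionary<string, List<Action<string>>> _eventSubscribers = new();

    // A command has exactly one handler; a second registration is a configuration error.
    public void RegisterCommandHandler(string commandName, Func<string, string> handler)
    {
        if (!_commandHandlers.TryAdd(commandName, handler))
            throw new InvalidOperationException($"'{commandName}' already has a handler.");
    }

    // Sending a command with no handler fails loudly: the sender expected the work to be done.
    public string Send(string commandName, string payload)
    {
        if (!_commandHandlers.TryGetValue(commandName, out var handler))
            throw new InvalidOperationException($"No handler for '{commandName}'.");
        return handler(payload);
    }

    public void Subscribe(string eventName, Action<string> subscriber)
    {
        if (!_eventSubscribers.TryGetValue(eventName, out var subscribers))
            _eventSubscribers[eventName] = subscribers = new List<Action<string>>();
        subscribers.Add(subscriber);
    }

    // Publishing an event with zero subscribers is fine: the publisher does not care who listens.
    // Returns the number of subscribers that were notified.
    public int Publish(string eventName, string payload)
    {
        if (!_eventSubscribers.TryGetValue(eventName, out var subscribers)) return 0;
        foreach (var subscriber in subscribers) subscriber(payload);
        return subscribers.Count;
    }
}
```

Sending `ProcessPayment` with no registered handler throws, while publishing `OrderPlaced` with no subscribers quietly returns 0. That asymmetry is the command/event distinction in miniature.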

Choreography vs Orchestration

When coordinating a multi-step workflow across services, you have two options:

Choreography — Services React to Events

No central coordinator. Each service knows what to do when it receives an event.

OrderPlaced event published
  → Inventory Service reacts: reserve stock → InventoryReserved event
  → InventoryReserved event published
  → Payment Service reacts: charge card → PaymentProcessed event
  → PaymentProcessed event published
  → Notification Service reacts: send email
  
No one is "in charge". Services are decoupled.

Pros: Loose coupling. Services can be added/removed without changing others.

Cons: Workflow logic is distributed — hard to understand the full picture. Hard to track what step a business process is at. Debugging requires tracing events across multiple services.
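
The event chain described above can be sketched with an in-memory bus. `EventBus` and `CheckoutChoreography` are hypothetical names, each lambda stands in for a separate service, and the synchronous dispatch is a simplification of what a broker would do asynchronously.

```csharp
using System;
using System.Collections.Generic;

public sealed class EventBus
{
    private readonly Dictionary<string, List<Action<string>>> _subscribers = new();

    // Records every published event, so the flow can be inspected afterwards.
    public List<string> Log { get; } = new();

    public void Subscribe(string eventName, Action<string> handler)
    {
        if (!_subscribers.TryGetValue(eventName, out var handlers))
            _subscribers[eventName] = handlers = new List<Action<string>>();
        handlers.Add(handler);
    }

    public void Publish(string eventName, string orderId)
    {
        Log.Add(eventName);
        if (_subscribers.TryGetValue(eventName, out var handlers))
        {
            foreach (var handler in handlers) handler(orderId);
        }
    }
}

public static class CheckoutChoreography
{
    // Each service subscribes to the one event it cares about and announces its own result.
    // No component holds the whole workflow.
    public static EventBus Wire()
    {
        var bus = new EventBus();
        bus.Subscribe("OrderPlaced", id => bus.Publish("InventoryReserved", id));      // Inventory Service
        bus.Subscribe("InventoryReserved", id => bus.Publish("PaymentProcessed", id)); // Payment Service
        bus.Subscribe("PaymentProcessed", id => bus.Log.Add("EmailSent:" + id));       // Notification Service
        return bus;
    }
}
```

Note that the order of steps exists only implicitly, in which service subscribes to which event. That is the source of both the loose coupling and the debugging difficulty listed above.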

Orchestration — Central Coordinator

A dedicated orchestrator service drives the workflow.

// Order Saga Orchestrator
public class CreateOrderSaga
{
    public async Task ExecuteAsync(CreateOrderCommand command)
    {
        // Step 1: Reserve inventory
        var reservation = await _inventoryService.ReserveAsync(command.Items);
        
        if (!reservation.Success)
        {
            await _orderService.MarkFailedAsync(command.OrderId, "Inventory unavailable");
            return;
        }
        
        // Step 2: Process payment
        var payment = await _paymentService.ChargeAsync(command.UserId, command.Total);
        
        if (!payment.Success)
        {
            // Compensate: release inventory
            await _inventoryService.ReleaseAsync(reservation.Id);
            await _orderService.MarkFailedAsync(command.OrderId, "Payment failed");
            return;
        }
        
        // Step 3: Confirm order
        await _orderService.ConfirmAsync(command.OrderId, payment.TransactionId);
        
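        // Step 4: Notify (best effort: a failed email must not fail a confirmed order)
        try
        {
            await _notificationService.SendConfirmationAsync(command.UserId, command.OrderId);
        }
        catch (Exception)
        {
            // Do not rethrow; log the failure and retry out of band
        }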
    }
}

Pros: Workflow is visible in one place. Easy to track state and handle failures.

Cons: Orchestrator is coupled to all participants. Can become a bottleneck.

Recommendation: Use choreography for simple event fan-outs. Use orchestration (Saga) for complex, multi-step transactions where you need explicit compensating transactions.

The Outbox Pattern — Reliable Event Publishing

The hardest problem in event-driven microservices: publishing an event and updating the database atomically.

The problem:
  1. Update order status in DB  ✓
  2. Publish "OrderShipped" event to Kafka  ✗ (Kafka is down)
  
  → DB updated, event never published
  → Notification never sent, inventory never updated
  → Data inconsistency

The Outbox pattern uses the database as a reliable relay:

1. In the SAME database transaction:
   - Update order status
   - INSERT into outbox table: {event_type: "OrderShipped", payload: {...}, published_at: NULL}
   
2. Separate background worker (Outbox Processor):
   - Polls outbox table for unpublished events
   - Publishes each event to Kafka/Service Bus
   - Marks event as published in outbox table
   
Result: Either both the order update and the outbox entry commit,
        or neither does. The outbox processor handles eventual publishing.

-- Outbox table (PostgreSQL)
CREATE TABLE outbox_events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    event_type VARCHAR(100) NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    published_at TIMESTAMP NULL,
    retry_count INT DEFAULT 0
);

// Service layer — atomic: DB update + outbox insert
public async Task ShipOrderAsync(Guid orderId)
{
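    await using var tx = await _db.BeginTransactionAsync();
    
    // Pass the transaction to both commands so they commit or roll back together
    // (assumes Dapper's ExecuteAsync, which takes an optional transaction argument)
    await _db.ExecuteAsync(
        "UPDATE orders SET status = 'Shipped', shipped_at = NOW() WHERE id = @id",
        new { id = orderId },
        transaction: tx);
    
    await _db.ExecuteAsync(
        "INSERT INTO outbox_events (event_type, payload) VALUES (@type, @payload::jsonb)",
        new { type = "OrderShipped", payload = JsonSerializer.Serialize(new { OrderId = orderId }) },
        transaction: tx);
    
    await tx.CommitAsync();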
    // Even if Kafka is down, the event is safely stored in the DB
}
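
The second half of the pattern, the Outbox Processor, is not shown above. Here is a minimal in-memory sketch of one polling pass, under stated assumptions: the `List<OutboxEvent>` stands in for the `outbox_events` table, the `publish` delegate stands in for a Kafka or Service Bus client, and all type names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public sealed class OutboxEvent
{
    public Guid Id { get; init; } = Guid.NewGuid();
    public string EventType { get; init; } = "";
    public string Payload { get; init; } = "";
    public DateTime? PublishedAt { get; set; }
    public int RetryCount { get; set; }
}

public sealed class OutboxProcessor
{
    private readonly List<OutboxEvent> _outbox;         // stand-in for the outbox_events table
    private readonly Func<OutboxEvent, Task> _publish;  // stand-in for the broker client

    public OutboxProcessor(List<OutboxEvent> outbox, Func<OutboxEvent, Task> publish)
    {
        _outbox = outbox;
        _publish = publish;
    }

    // One polling pass: try to publish every unpublished row, in insertion order.
    // Returns the number of events published in this pass.
    public async Task<int> RunOnceAsync()
    {
        var pending = _outbox.Where(e => e.PublishedAt == null).ToList();
        var published = 0;
        foreach (var evt in pending)
        {
            try
            {
                await _publish(evt);
                evt.PublishedAt = DateTime.UtcNow; // mark only after the broker accepted it
                published++;
            }
            catch (Exception)
            {
                evt.RetryCount++; // leave the row unpublished; the next pass retries it
            }
        }
        return published;
    }
}
```

If the processor crashes after publishing but before marking the row, the next pass publishes the same event again. The pattern therefore gives at-least-once delivery, and consumers should be written to tolerate duplicates.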

Correlation IDs — Tracing Across Services

When a single user request touches several services (four in the example below), you need to trace it across all of them.

Incoming request:
  POST /checkout  →  X-Correlation-ID: req-abc123

Service A logs:  [req-abc123] Created order
Service B logs:  [req-abc123] Reserved inventory
Service C logs:  [req-abc123] Charged payment
Service D logs:  [req-abc123] Sent confirmation email

Without correlation IDs:
  When checkout fails, you search logs across 4 services with no link.

With correlation IDs:
  grep for "req-abc123" and see the full trace.

// ASP.NET Core middleware — propagate correlation ID
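public class CorrelationIdMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<CorrelationIdMiddleware> _logger;

    public CorrelationIdMiddleware(RequestDelegate next, ILogger<CorrelationIdMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Reuse the caller's ID if present, otherwise start a new trace
        var correlationId = context.Request.Headers["X-Correlation-ID"].FirstOrDefault()
            ?? Guid.NewGuid().ToString();
        
        context.Items["CorrelationId"] = correlationId;
        context.Response.Headers["X-Correlation-ID"] = correlationId;
        
        // Attach the ID to every log entry written while this request is processed
        using var scope = _logger.BeginScope(
            new Dictionary<string, object> { ["CorrelationId"] = correlationId });
        await _next(context);
    }
}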

// HttpClient factory — inject correlation ID into downstream calls
services.AddHttpClient("order-service")
    .AddHttpMessageHandler<CorrelationIdDelegatingHandler>();
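
The registration above references `CorrelationIdDelegatingHandler` without showing it. One possible shape is sketched below. It assumes the current ID is kept in an `AsyncLocal` holder (`CorrelationContext`, a hypothetical class that the middleware would set at the start of each request). Reading it from `IHttpContextAccessor` would be an equally reasonable choice.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical ambient holder; AsyncLocal makes the value flow through async calls
// that belong to the same incoming request.
public static class CorrelationContext
{
    private static readonly AsyncLocal<string?> _current = new();

    public static string? Current
    {
        get => _current.Value;
        set => _current.Value = value;
    }
}

public class CorrelationIdDelegatingHandler : DelegatingHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var correlationId = CorrelationContext.Current;

        // Copy the ID onto the outgoing request unless the caller already set one.
        if (!string.IsNullOrEmpty(correlationId) && !request.Headers.Contains("X-Correlation-ID"))
        {
            request.Headers.Add("X-Correlation-ID", correlationId);
        }

        return base.SendAsync(request, cancellationToken);
    }
}
```

For `AddHttpMessageHandler<CorrelationIdDelegatingHandler>()` to resolve the handler, it also has to be registered in the container, typically with `services.AddTransient<CorrelationIdDelegatingHandler>()`.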

Choosing the Right Protocol for Each Boundary

Boundary                           Protocol
────────────────────────────────────────────────────────────
API Gateway → Services             REST (public-facing, tooling)
Service → Service (sync needed)    gRPC (internal, typed, fast)
Service → Service (async ok)       Message Queue (decoupled, resilient)
Order placed → Notification        Event (async, publish-subscribe)
Checkout → Inventory               gRPC or REST (sync, need immediate confirmation)
Order shipped → Notify/Analytics   Event via Kafka (fanout, replay needed)

Key Takeaways

REST for CRUD and public APIs. gRPC for high-frequency internal calls (faster, typed, streaming).
Message queues decouple services: publisher doesn't need receiver to be up.
RabbitMQ/Azure Service Bus for routing and task queues. Kafka for event streams at scale with replay.
Events broadcast what happened. Commands request an action with a specific handler.
Choreography for simple event fan-outs. Orchestration (Saga) for complex multi-step transactions.
Outbox pattern is the correct way to reliably publish events alongside a database write.
Correlation IDs are mandatory for tracing requests across microservices.
