Docker Compose · Lesson 5 of 5

Health Checks and Service Dependencies

The Startup Order Problem

Without health checks:
  docker compose up starts all services simultaneously
  prescription-service starts and immediately tries to connect to sqlserver
  sqlserver is still initialising — connection fails
  prescription-service crashes with "Cannot connect to SQL Server"
  Docker restarts it (if restart: unless-stopped)
  sqlserver finishes starting
  prescription-service starts again — works
  
  This wastes 20-40 seconds and fills logs with spurious errors.
  In CI/CD it can fail the pipeline before the service has a chance to start.

With health checks + depends_on:
  sqlserver starts
  Docker waits until sqlserver is healthy (health check passes)
  Only then does prescription-service start
  No connection failures, no restart loop
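
Waiting until a dependency is healthy amounts to a bounded retry loop around a probe command. Below is a minimal shell sketch of that idea, not how Compose implements it. `wait_until` is a hypothetical helper name, and its two numeric arguments only mirror the `retries` and `interval` health check settings.

```shell
# wait_until RETRIES INTERVAL CMD [ARGS...]
# Hypothetical helper: runs CMD until it exits 0, trying at most RETRIES times
# and sleeping INTERVAL seconds between attempts.
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  local retries="$1" interval="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$retries" ]; do
    if "$@"; then
      return 0
    fi
    # No point sleeping after the final failed attempt.
    if [ "$attempt" -lt "$retries" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example (not run here): poll a probe up to 10 times, 10 seconds apart.
# wait_until 10 10 some-probe-command && echo "dependency is ready"
```

With health checks and depends_on conditions, Compose does this waiting for you, so application containers do not need a wrapper script like this.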

SQL Server Health Check

YAML
services:
  sqlserver:
    image: mcr.microsoft.com/mssql/server:2022-latest
    environment:
      - ACCEPT_EULA=Y
      - SA_PASSWORD=${DB_PASSWORD}
    healthcheck:
      test:
        - CMD
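        # Recent 2022 images ship sqlcmd under /opt/mssql-tools18 (older images: /opt/mssql-tools)
        - /opt/mssql-tools18/bin/sqlcmd
        - -C                 # sqlcmd 18 encrypts by default; trust the self-signed certificate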
        - -S
        - localhost
        - -U
        - sa
        - -P
        - ${DB_PASSWORD}
        - -Q
        - "SELECT 1"
      interval: 10s      # check every 10 seconds
      timeout: 5s        # fail if no response in 5 seconds
      retries: 10        # declare unhealthy after 10 consecutive failures
      start_period: 30s  # don't count failures in the first 30 seconds (SQL Server init time)
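
The four timing settings interact. As a rough estimate (an assumption about how the checks are scheduled, not a documented guarantee), a container that never becomes ready is marked unhealthy after about start_period + retries × (interval + timeout) seconds. A small sketch of that arithmetic follows; `max_seconds_to_unhealthy` is a hypothetical helper name.

```shell
# max_seconds_to_unhealthy START_PERIOD INTERVAL TIMEOUT RETRIES
# Hypothetical helper. All arguments are whole seconds.
# Rough estimate only: failures inside the start period are not counted,
# then RETRIES consecutive failures are needed, each taking at most
# INTERVAL + TIMEOUT seconds.
max_seconds_to_unhealthy() {
  local start_period="$1" interval="$2" timeout="$3" retries="$4"
  echo $((start_period + retries * (interval + timeout)))
}

# Settings from the sqlserver service: 30s start period, 10s interval,
# 5s timeout, 10 retries.
max_seconds_to_unhealthy 30 10 5 10   # prints 180
```

Services that depend on sqlserver with condition: service_healthy can wait up to roughly this long before Compose gives up, which is worth keeping in mind when tuning CI timeouts.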

ASP.NET Core Health Check Endpoint

C#
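// Program.cs
// Assumes the AspNetCore.HealthChecks.SqlServer, AspNetCore.HealthChecks.Redis
// and AspNetCore.HealthChecks.UI.Client NuGet packages are installed.
using HealthChecks.UI.Client;                          // UIResponseWriter
using Microsoft.AspNetCore.Diagnostics.HealthChecks;   // HealthCheckOptions

// Register health checks
builder.Services.AddHealthChecks()
    .AddSqlServer(
        connectionString: builder.Configuration.GetConnectionString("Clinical")!,
        name: "sql-server",
        tags: new[] { "ready" })
    .AddRedis(
        redisConnectionString: builder.Configuration.GetConnectionString("Redis")!,
        name: "redis",
        tags: new[] { "ready" });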

// Map endpoints
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false  // no dependency checks — just "is the process running?"
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready"),
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

YAML
# Docker Compose: check the /health/live endpoint
services:
  prescription-service:
    image: clinical/prescription-service:latest
    healthcheck:
      # curl must exist inside the image; the stock ASP.NET runtime images may not include it
      test: ["CMD", "curl", "-f", "http://localhost:8080/health/live"]
      interval: 15s
      timeout: 5s
      retries: 3
      start_period: 20s   # allow time for migrations to run on first start
    depends_on:
      sqlserver:
        condition: service_healthy  # wait for sqlserver to be healthy
      redis:
        condition: service_healthy
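
To see what the Docker health check sees, you can probe both endpoints by hand. The sketch below assumes ASP.NET Core's default status mapping (200 for Healthy or Degraded, 503 for Unhealthy) and that port 8080 is published on the host. `classify_health_code` and `probe_health` are hypothetical helper names.

```shell
# classify_health_code HTTP_STATUS
# Hypothetical helper: maps an HTTP status code from a health endpoint to a verdict.
# Assumes ASP.NET Core's default mapping (200 = Healthy/Degraded, 503 = Unhealthy).
classify_health_code() {
  case "$1" in
    200) echo "healthy" ;;
    503) echo "unhealthy" ;;
    000) echo "unreachable" ;;   # curl reports 000 when no HTTP response was received
    *)   echo "unexpected" ;;
  esac
}

# probe_health URL
# Fetches URL with curl, prints the verdict, and returns non-zero unless healthy.
probe_health() {
  local code verdict
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1") || true
  verdict=$(classify_health_code "$code")
  echo "$1 -> $code ($verdict)"
  [ "$verdict" = "healthy" ]
}

# Example (not run here; assumes the service publishes port 8080 on the host):
# probe_health http://localhost:8080/health/live
# probe_health http://localhost:8080/health/ready
```

If /health/live reports healthy while /health/ready does not, the process is up but one of the tagged dependencies (SQL Server or Redis) is failing its check.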

depends_on Conditions

YAML
services:
  prescription-service:
    depends_on:
      sqlserver:
        condition: service_healthy    # wait for health check to pass
      patient-service:
        condition: service_started    # just wait for container to start (no health check on patient-service)
      migration-job:
        condition: service_completed_successfully  # wait for a one-shot migration job to exit 0

  # One-shot migration job that runs EF Core migrations
  migration-job:
    image: clinical/prescription-service:latest
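    # Assumes this image contains the .NET SDK, the dotnet-ef tool and the project files;
    # a runtime-only image cannot run "dotnet ef"
    command: ["dotnet", "ef", "database", "update"]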
    environment:
      - ConnectionStrings__Clinical=Server=sqlserver;...
    depends_on:
      sqlserver:
        condition: service_healthy
    restart: "no"   # don't restart — it's a one-shot job
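
The three conditions differ only in what they require of the dependency before the dependent service may start. The function below is an illustrative model of those rules, not Compose's actual implementation. `condition_met` and the simplified state and health values are assumptions made for the sketch.

```shell
# condition_met CONDITION STATE HEALTH EXIT_CODE
# Illustrative model only (hypothetical helper, not Compose's implementation).
#   STATE:  created | running | exited
#   HEALTH: healthy | unhealthy | starting | none
condition_met() {
  local condition="$1" state="$2" health="$3" exit_code="$4"
  case "$condition" in
    service_started)
      # The container has been started; readiness is not considered.
      [ "$state" = "running" ] || [ "$state" = "exited" ] ;;
    service_healthy)
      # The container is running and its health check has passed.
      [ "$state" = "running" ] && [ "$health" = "healthy" ] ;;
    service_completed_successfully)
      # The container ran to completion and exited with code 0.
      [ "$state" = "exited" ] && [ "$exit_code" = "0" ] ;;
    *)
      echo "unknown condition: $condition" >&2
      return 2 ;;
  esac
}

# Example: a database that is still initialising does not satisfy service_healthy.
condition_met service_healthy running starting 0 || echo "keep waiting"
```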

Debugging Unhealthy Containers

Bash
# List containers with health status
docker compose ps
# Look for: STATUS column shows "(unhealthy)" or "(health: starting)"

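# Inspect the last 5 health check results (Docker keeps only the 5 most recent).
# Resolving the container ID via Compose avoids guessing the generated container name:
docker inspect --format='{{json .State.Health}}' "$(docker compose ps -q sqlserver)" | jq .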

# Example output:
# {
#   "Status": "unhealthy",
#   "FailingStreak": 3,
#   "Log": [
#     {
#       "Start": "2026-03-15T10:00:00Z",
#       "End":   "2026-03-15T10:00:05Z",
#       "ExitCode": 1,
#       "Output": "Sqlcmd: Error: Microsoft ODBC Driver 17 for SQL Server..."
#     }
#   ]
# }

# Stream logs from a specific service:
docker compose logs sqlserver -f --tail=50

# Execute a command inside a container to debug:
docker compose exec sqlserver bash
# Or for SQL Server specifically:
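# (single quotes make $SA_PASSWORD expand inside the container, not on the host;
#  -C trusts the self-signed certificate, and older images use /opt/mssql-tools instead)
docker compose exec sqlserver bash -c \
  '/opt/mssql-tools18/bin/sqlcmd -C -S localhost -U sa -P "$SA_PASSWORD" -Q "SELECT @@VERSION"'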
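
With many services it helps to list only the containers that are not healthy. This is a small sketch: `not_healthy` is a hypothetical helper name, and the "NAME STATUS" input format is an assumption produced by the docker inspect template shown in the comment.

```shell
# not_healthy reads "NAME STATUS" lines on stdin and prints the names whose
# health status is anything other than "healthy" or "none" (no health check defined).
not_healthy() {
  awk '$2 != "healthy" && $2 != "none" { print $1 }'
}

# Example (not run here): feed it the health status of every container in the project.
# docker inspect \
#   --format '{{.Name}} {{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' \
#   $(docker compose ps -q) | not_healthy
```

Each name it prints is a candidate for the docker inspect and docker compose logs commands above.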

Redis Health Check

YAML
services:
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      # Returns "PONG" if Redis is up; exit code 0 = healthy
      interval: 10s
      timeout: 3s
      retries: 5

Nginx Waiting for All Backend Services

YAML
services:
  nginx:
    image: nginx:alpine
    depends_on:
      prescription-service:
        condition: service_healthy
      patient-service:
        condition: service_healthy  # requires patient-service to define its own healthcheck
    # Nginx only starts after all backend services are healthy
    # Prevents nginx from routing traffic to services that aren't ready
    ports:
      - "443:443"

Production issue I've seen: A team's Compose file had no health checks and no depends_on conditions, so every deployment started all services at once. The prescription service started before the database was ready and crashed. restart: unless-stopped brought it back, but the migration job had also been restarted and was running again, so the restarting API raced the migration. On some deployments the API began serving requests before the migration finished, and users hit "invalid object name" SQL errors. Adding health checks, plus condition: service_completed_successfully on the migration job, took 20 minutes and eliminated the race condition.


Key Takeaway

Health checks in Docker Compose solve the startup ordering problem — services wait until their dependencies are genuinely ready, not just started. Define health checks on every infrastructure service (SQL Server, Redis, RabbitMQ). Use depends_on with condition: service_healthy to express real readiness dependencies. ASP.NET Core's /health/live endpoint is the target for Docker health checks — it should be fast and dependency-free. Debug unhealthy containers with docker inspect --format='{{json .State.Health}}' to see the last health check results and output.