System Design · Lesson 20 of 26
Deploying Microservices — Docker, Kubernetes & CI/CD
Why Each Service Needs Its Own Pipeline
In a monolith, one pipeline builds and deploys everything. Carry that into a microservices system and the single pipeline becomes a bottleneck: a change to the Catalog Service forces every other service to wait on the same pipeline run.
Independent pipelines enable:
- Independent deployment — ship Order Service without touching Catalog
- Independent scaling — scale only the services under load
- Isolated failures — a broken Inventory pipeline doesn't block Orders from deploying
- Team autonomy — each team owns their service's pipeline end-to-end
The trade-off is more infrastructure to manage. The right approach: shared pipeline templates with per-service configuration.
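A minimal sketch of that approach using GitHub Actions reusable workflows (the file names, inputs, and build step here are illustrative assumptions, not taken from this repository):

# .github/workflows/service-pipeline.yml, the shared template every service calls
on:
  workflow_call:
    inputs:
      service:
        required: true
        type: string
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t $ACR_REGISTRY/${{ inputs.service }}:${{ github.sha }} -f services/${{ inputs.service }}/Dockerfile .

# .github/workflows/orders-ci.yml, the per-service configuration
on:
  push:
    paths: ["services/orders/**"]
jobs:
  pipeline:
    uses: ./.github/workflows/service-pipeline.yml
    with:
      service: orders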
Docker Multi-Stage Builds for .NET
A production Docker image should be as small as possible. The .NET SDK (~800 MB) is not needed at runtime — only the ASP.NET Core runtime (~200 MB) is.
Multi-stage builds solve this: use the SDK image to build, then copy only the published output into a minimal runtime image.
Standard .NET 8 multi-stage Dockerfile
# Stage 1: Restore dependencies (cached separately for faster rebuilds)
FROM mcr.microsoft.com/dotnet/sdk:8.0-alpine AS restore
WORKDIR /src
COPY ["services/orders/Orders.Api/Orders.Api.csproj", "Orders.Api/"]
COPY ["services/orders/Orders.Application/Orders.Application.csproj", "Orders.Application/"]
COPY ["services/orders/Orders.Domain/Orders.Domain.csproj", "Orders.Domain/"]
COPY ["services/orders/Orders.Infrastructure/Orders.Infrastructure.csproj", "Orders.Infrastructure/"]
COPY ["shared/MicroMart.Contracts/MicroMart.Contracts.csproj", "../../shared/MicroMart.Contracts/"]
RUN dotnet restore "Orders.Api/Orders.Api.csproj"
# Stage 2: Build and publish
FROM restore AS publish
COPY services/orders/ .
COPY shared/ ../../shared/
RUN dotnet publish "Orders.Api/Orders.Api.csproj" \
-c Release \
-o /app/publish \
--no-restore \
/p:UseAppHost=false \
/p:PublishTrimmed=false
# Stage 3: Final runtime image (no SDK, no build tools)
FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine AS runtime
WORKDIR /app
# Security: run as non-root (Alpine's busybox adduser takes short flags, not --system/--ingroup)
RUN addgroup -S appgroup && adduser -S -G appgroup appuser
USER appuser
COPY --from=publish /app/publish .
# Health check for local Docker runs (Kubernetes ignores HEALTHCHECK and uses its own probes)
HEALTHCHECK --interval=10s --timeout=3s --start-period=30s --retries=3 \
CMD wget -qO- http://localhost:8080/health/live || exit 1
ENTRYPOINT ["dotnet", "Orders.Api.dll"]Result: the final image is typically 80–120 MB, down from 800 MB+ if you used the SDK image directly.
.dockerignore — keep the context small
**/.git
**/.vs
**/bin
**/obj
**/*.user
**/node_modules
**/.env
**/appsettings.Development.json
**/*.Tests/
**/Dockerfile*
**/.dockerignore

Verify your image size
docker build -t orders-service:local -f services/orders/Dockerfile .
docker images orders-service:local
# REPOSITORY        TAG     SIZE
# orders-service    local   98.4MB

Docker Compose for Local Development
In local dev, you want all services running with hot-reload, shared networking, and real dependencies (PostgreSQL, RabbitMQ, Redis).
# docker-compose.yml — infrastructure only; application services live in the override file below
services:
postgres:
image: postgres:16-alpine
environment:
POSTGRES_PASSWORD: dev_password
POSTGRES_USER: micromart
ports: ["5432:5432"]
volumes:
- postgres_data:/var/lib/postgresql/data
- ./scripts/init-databases.sql:/docker-entrypoint-initdb.d/init.sql
rabbitmq:
image: rabbitmq:3-management-alpine
ports:
- "5672:5672"
- "15672:15672" # management UI
environment:
RABBITMQ_DEFAULT_USER: micromart
RABBITMQ_DEFAULT_PASS: dev_password
redis:
image: redis:7-alpine
ports: ["6379:6379"]
command: redis-server --appendonly yes
seq:
image: datalust/seq:latest
ports: ["5341:80"]
environment:
ACCEPT_EULA: Y
volumes:
postgres_data:

# docker-compose.override.yml — run all services in containers
services:
gateway:
build: { context: ., dockerfile: gateway/Dockerfile }
ports: ["5000:8080"]
environment:
- ASPNETCORE_ENVIRONMENT=Development
depends_on: [orders, catalog, inventory] # assumes a catalog service defined like orders (elided here)
orders:
build: { context: ., dockerfile: services/orders/Dockerfile }
environment:
- ConnectionStrings__Orders=Host=postgres;Database=orders;Username=micromart;Password=dev_password
- RabbitMQ__Host=rabbitmq
- Redis__ConnectionString=redis:6379
depends_on: [postgres, rabbitmq, redis]
develop:
  watch:
    - action: rebuild # rebuild the image on source changes (sync+restart would need a dev image running dotnet watch from /src)
      path: ./services/orders
inventory:
build: { context: ., dockerfile: services/inventory/Dockerfile }
environment:
- ConnectionStrings__Inventory=Host=postgres;Database=inventory;Username=micromart;Password=dev_password
- RabbitMQ__Host=rabbitmq
depends_on: [postgres, rabbitmq]

# Start everything
docker compose up -d
# Start only infrastructure (run services locally with dotnet run)
docker compose up -d postgres rabbitmq redis seq
# Rebuild a single service after code changes
docker compose up -d --build orders
# Watch logs from all services
docker compose logs -f orders inventory
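The develop/watch block in the override file pairs with Compose's watch mode (Compose v2.22 or later):

# Rebuild and restart services automatically as source files change
docker compose watch

Kubernetes Manifests Per Service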
Each service needs its own set of Kubernetes resources. The minimal production set:
services/orders/k8s/
├── deployment.yaml
├── service.yaml
├── configmap.yaml
├── hpa.yaml
└── networkpolicy.yaml

Deployment
# services/orders/k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: orders
namespace: production
labels:
app: orders
version: "1.0.0"
spec:
replicas: 2
selector:
matchLabels:
app: orders
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1 # allow 1 extra pod during update
maxUnavailable: 0 # never take a pod down before a new one is ready
template:
metadata:
labels:
app: orders
version: "1.0.0"
spec:
serviceAccountName: orders-service
containers:
- name: orders
image: acrmicromart.azurecr.io/orders:$(IMAGE_TAG) # placeholder the CI pipeline substitutes before applying
ports:
- containerPort: 8080
env:
- name: ASPNETCORE_ENVIRONMENT
value: Production
- name: ConnectionStrings__Orders
valueFrom:
secretKeyRef:
name: orders-db-secret
key: connection-string
envFrom:
- configMapRef:
name: orders-config
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
failureThreshold: 3
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 30
periodSeconds: 15
failureThreshold: 3
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: orders
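Because maxUnavailable is 0, a rollout only replaces pods as new ones pass the readiness probe. Two commands worth knowing while one is in flight:

# Follow the rollout; undo reverts to the previous ReplicaSet
kubectl rollout status deployment/orders --namespace production
kubectl rollout undo deployment/orders --namespace production

ConfigMap and Service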
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: orders-config
namespace: production
data:
RabbitMQ__Host: rabbitmq.production.svc.cluster.local
Redis__ConnectionString: redis.production.svc.cluster.local:6379
Auth__Authority: https://auth.micromart.com
Logging__MinimumLevel__Default: Information
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
name: orders
namespace: production
spec:
selector:
app: orders
ports:
- port: 80
targetPort: 8080
type: ClusterIP # internal only — gateway handles external traffic

Horizontal Pod Autoscaler
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: orders-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: orders
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleUp:
stabilizationWindowSeconds: 60 # wait 60s before scaling up again
scaleDown:
stabilizationWindowSeconds: 300 # wait 5 min before scaling down
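The file tree above also lists networkpolicy.yaml, which has not been shown yet. A minimal sketch that admits traffic only from gateway pods (the app: gateway label is an assumption about how the gateway is labeled):

# services/orders/k8s/networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: orders-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: orders
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway # assumption: gateway pods carry this label
      ports:
        - port: 8080

Helm for Packaging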
Helm packages all Kubernetes manifests into a reusable chart with environment-specific values.
charts/orders/
├── Chart.yaml
├── values.yaml # defaults
├── values.staging.yaml # staging overrides
├── values.production.yaml # production overrides
└── templates/
├── deployment.yaml
├── service.yaml
├── configmap.yaml
├── hpa.yaml
├── _helpers.tpl
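The templates are the manifests from the previous section with values substituted. A sketch of what an excerpt of templates/deployment.yaml might look like (not the full file):

# charts/orders/templates/deployment.yaml (excerpt)
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: orders
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}

# charts/orders/values.yaml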
image:
repository: acrmicromart.azurecr.io/orders
tag: latest
pullPolicy: IfNotPresent
replicaCount: 2
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 70
config:
environment: Production
rabbitMqHost: rabbitmq.production.svc.cluster.local
logLevel: Information

# values.staging.yaml — overrides for staging
replicaCount: 1
config:
environment: Staging
logLevel: Debug
autoscaling:
enabled: false

# Deploy to staging
helm upgrade --install orders-staging ./charts/orders \
--namespace staging \
--values ./charts/orders/values.staging.yaml \
--set image.tag=$GIT_SHA
# Deploy to production
helm upgrade --install orders ./charts/orders \
--namespace production \
--values ./charts/orders/values.production.yaml \
--set image.tag=$GIT_SHA \
--timeout 5m \
--atomic # roll back automatically if deployment fails

Independent Deployment
The power of microservices is independent deployment. Only deploy what changed:
# .github/workflows/orders-ci.yml
on:
push:
branches: [main]
paths:
- "services/orders/**"
- "shared/MicroMart.Contracts/**"
- "charts/orders/**"
# Only triggers when orders service code or its chart changes
# Catalog, Inventory, Gateway pipelines are NOT triggered

jobs:
build-and-deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build and push image
run: |
docker buildx build \
--platform linux/amd64 \
--tag $ACR_REGISTRY/orders:${{ github.sha }} \
--tag $ACR_REGISTRY/orders:latest \
--push \
--cache-from $ACR_REGISTRY/orders:latest \
-f services/orders/Dockerfile .
- name: Deploy to staging
run: |
helm upgrade --install orders-staging ./charts/orders \
--namespace staging \
--values ./charts/orders/values.staging.yaml \
--set image.tag=${{ github.sha }} \
--atomic --timeout 3m
- name: Run smoke tests
run: dotnet test tests/Smoke/ --filter "Category=Smoke"
- name: Deploy to production
run: |
helm upgrade --install orders ./charts/orders \
--namespace production \
--values ./charts/orders/values.production.yaml \
--set image.tag=${{ github.sha }} \
--atomic --timeout 5m

Blue-Green Deployments
Blue-green runs two identical environments. Traffic points to Blue (live). You deploy to Green (idle), run smoke tests, then flip the traffic switch. Rollback is instant: flip back to Blue.
# Two deployments — Blue (current) and Green (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
name: orders-green
namespace: production
spec:
  replicas: 2
  selector: # required by apps/v1; must match the pod template labels below
    matchLabels:
      app: orders
      slot: green
template:
metadata:
labels:
app: orders
slot: green # ← label distinguishes blue from green
spec:
containers:
- name: orders
image: acrmicromart.azurecr.io/orders:v2.0.0

# Service — points to "blue" initially
apiVersion: v1
kind: Service
metadata:
name: orders
spec:
selector:
app: orders
slot: blue # ← change to "green" to flip traffic

# After green is healthy, flip traffic
kubectl patch service orders \
-p '{"spec":{"selector":{"app":"orders","slot":"green"}}}' \
--namespace production
# Rollback: flip back to blue
kubectl patch service orders \
-p '{"spec":{"selector":{"app":"orders","slot":"blue"}}}' \
--namespace production

Canary Releases
A canary release routes a small percentage of traffic to the new version before rolling it out fully.
# orders-canary-deployment.yaml — 1 replica = ~20% of traffic (vs 4 stable)
apiVersion: apps/v1
kind: Deployment
metadata:
name: orders-canary
namespace: production
spec:
  replicas: 1 # 1 canary + 4 stable = 20% canary traffic
  selector:
    matchLabels:
      app: orders
      track: canary
template:
metadata:
labels:
app: orders # same label — Service load-balances across both deployments
track: canary
spec:
containers:
- name: orders
image: acrmicromart.azurecr.io/orders:v2.0.0-canary

For this to work, the stable Deployment must carry its own track: stable label in both its pod template and its selector; otherwise the two Deployments would adopt each other's pods. The Service keeps selecting on app: orders alone, so it load-balances across both tracks.

With Istio, you can do percentage-based routing without replica math:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: orders
namespace: production
spec:
hosts: ["orders"]
http:
- route:
- destination:
host: orders
subset: stable
weight: 90
- destination:
host: orders
subset: canary
weight: 10 # 10% to canary
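The stable and canary subsets referenced by the VirtualService must be defined in a companion DestinationRule. A minimal sketch, assuming the pods carry the track labels from the earlier manifests:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: orders
  namespace: production
spec:
  host: orders
  subsets:
    - name: stable
      labels:
        track: stable
    - name: canary
      labels:
        track: canary

GitOps with Argo CD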
GitOps: the desired cluster state is declared in Git. Argo CD (or Flux) watches the Git repo and reconciles the cluster to match.
# Install Argo CD
kubectl create namespace argocd
kubectl apply -n argocd \
-f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# argo/applications/orders.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: orders
namespace: argocd
spec:
project: micromart
source:
repoURL: https://github.com/asmanasir/MicroMart.git
targetRevision: main
path: charts/orders
helm:
valueFiles:
- values.production.yaml
destination:
server: https://kubernetes.default.svc
namespace: production
syncPolicy:
automated:
prune: true # remove resources deleted from Git
selfHeal: true # reconcile if cluster state drifts from Git
syncOptions:
- CreateNamespace=true

With this setup:
- Developer pushes to main
- GitHub Actions builds the image, pushes it to ACR, and updates the image tag in the Helm chart values file (see the sketch below)
- Argo CD detects the change in Git and applies the new Helm release to the cluster
- Rollback = git revert + push
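One common way to implement the values-file update from step 2 is mikefarah's yq v4 inside the CI job; a sketch, with the file path and commit message as illustrative assumptions:

# Bump the image tag in Git; Argo CD reconciles the cluster from this commit
export GIT_SHA=$(git rev-parse HEAD)
yq -i '.image.tag = strenv(GIT_SHA)' charts/orders/values.production.yaml
git commit -am "orders: deploy ${GIT_SHA}"
git push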
CI/CD Strategy: Which Pipeline Runs When
Repository structure:
services/
orders/ → orders-ci.yml (triggers on: services/orders/**)
catalog/ → catalog-ci.yml (triggers on: services/catalog/**)
inventory/ → inventory-ci.yml
notifications/ → notifications-ci.yml
gateway/ → gateway-ci.yml
charts/ → (triggered by above, not independently)
shared/contracts/ → all pipelines (triggers all service pipelines)Shared contracts are the exception — a change to a shared message contract must trigger all consumer service pipelines to verify compatibility.
# .github/workflows/contracts-changed.yml
on:
push:
paths:
- "shared/MicroMart.Contracts/**"
jobs:
trigger-all-consumers:
  runs-on: ubuntu-latest
  permissions:
    actions: write # the default GITHUB_TOKEN needs this to dispatch other workflows
  strategy:
matrix:
service: [orders, catalog, inventory, notifications]
steps:
- name: Trigger ${{ matrix.service }} pipeline
uses: actions/github-script@v7
with:
script: |
await github.rest.actions.createWorkflowDispatch({
owner: context.repo.owner,
repo: context.repo.repo,
workflow_id: '${{ matrix.service }}-ci.yml',
ref: 'main',
});

Summary
| Concern | Solution |
|---------|----------|
| Small Docker images | Multi-stage builds — SDK for build, runtime for final image |
| Local development | docker-compose with infrastructure + service containers |
| Per-service Kubernetes config | Deployment + Service + ConfigMap + HPA per service |
| Environment-specific config | Helm charts with values.staging.yaml / values.production.yaml |
| Independent deployment | Per-service GitHub Actions pipeline with path filters |
| Zero-downtime updates | Rolling deployment with maxUnavailable: 0 |
| Instant rollback | Blue-green deployment — flip service selector |
| Gradual rollout | Canary deployment — 10% traffic to new version first |
| Autoscaling | HPA on CPU + memory (or custom metrics) |
| GitOps | Argo CD — cluster state declared in Git, reconciled automatically |
| Shared contract changes | Trigger all consumer pipelines automatically |
The key discipline: each service is deployed, scaled, and rolled back independently. If deploying one service requires coordination with another team, you have a coupling problem — not a deployment problem.