Learnixo

FastAPI for AI Engineers · Lesson 11 of 12

Deploying to Azure Container Apps

What Is Azure Container Apps?

Azure Container Apps (ACA) is Microsoft's serverless container hosting platform. It sits above raw Kubernetes — you do not manage nodes, clusters, or Kubernetes YAML directly. Instead, you describe your container and ACA handles scheduling, scaling, networking, and TLS.

Key characteristics relevant to AI services:

| Feature | Detail |
|---------|--------|
| Scale to zero | Containers scale down to zero replicas when idle: no traffic, no cost |
| KEDA-based scaling | Scales up based on HTTP concurrency, queue depth, CPU, or custom metrics |
| Ingress | Built-in HTTPS ingress with automatic TLS certificate |
| Secrets | Native secrets or Azure Key Vault references |
| Managed identity | Container can authenticate to Azure services without stored credentials |
| Dapr | Optional sidecar for service discovery, pub/sub, and state management |
| Revision management | Multiple revisions run side by side for canary or blue-green deployments |

Prerequisites

Bash
# Check the installed Azure CLI version (install the CLI first if this command is missing)
az version   # needs 2.53+

# Install the Container Apps extension
az extension add --name containerapp --upgrade

# Login
az login

# Register providers (one-time per subscription)
az provider register --namespace Microsoft.App
az provider register --namespace Microsoft.OperationalInsights

Core Concepts

  • Environment: a shared networking boundary. Multiple Container Apps can talk to each other within one environment.
  • Container App: one deployable unit. Runs one main container, optionally with sidecar containers alongside it.
  • Revision: an immutable snapshot of a Container App's configuration. Deploying a new image creates a new revision.
  • Ingress: controls how traffic reaches the Container App from outside.
  • Scale rule: tells ACA when to add or remove replicas.

Create an Environment and Container Registry

Bash
RESOURCE_GROUP="rg-ai-platform"
LOCATION="uksouth"
ENVIRONMENT="cae-ai-platform"
REGISTRY="acraiplat.azurecr.io"

# Resource group
az group create --name $RESOURCE_GROUP --location $LOCATION

# Container Apps Environment
az containerapp env create \
  --name $ENVIRONMENT \
  --resource-group $RESOURCE_GROUP \
  --location $LOCATION

# Azure Container Registry (if you don't have one)
az acr create \
  --name acraiplat \
  --resource-group $RESOURCE_GROUP \
  --sku Basic \
  --admin-enabled true

Building and Pushing the Image

Bash
# Build using ACR Tasks (no local Docker needed)
az acr build \
  --registry acraiplat \
  --image ai-platform-api:1.0.0 \
  --file Dockerfile \
  .

# Or build locally and push
docker build -t acraiplat.azurecr.io/ai-platform-api:1.0.0 .
docker push acraiplat.azurecr.io/ai-platform-api:1.0.0

containerapp.yaml

containerapp.yaml is the declarative specification for a Container App. Treat it as the source of truth and check it into version control alongside your application code.

YAML
# containerapp.yaml
location: uksouth
resourceGroup: rg-ai-platform
type: Microsoft.App/containerApps
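name: ca-ai-platform-api
# The "identity: system" references below need a system-assigned identity on the app
identity:
  type: SystemAssigned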
properties:
  managedEnvironmentId: /subscriptions/<SUB_ID>/resourceGroups/rg-ai-platform/providers/Microsoft.App/managedEnvironments/cae-ai-platform

  configuration:
    activeRevisionsMode: Single

    # Registry authentication (using managed identity is preferred over admin key)
    registries:
      - server: acraiplat.azurecr.io
        identity: system

    # Ingress: expose port 8000 externally with HTTPS
    ingress:
      external: true
      targetPort: 8000
      transport: http
      allowInsecure: false
      traffic:
        - weight: 100
          latestRevision: true

    # Secrets — reference Key Vault instead of storing values here
    secrets:
      - name: openai-api-key
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/AzureOpenAIKey
        identity: system
      - name: database-url
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/DatabaseUrl
        identity: system
      - name: redis-url
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/RedisUrl
        identity: system

  template:
    containers:
      - name: api
        image: acraiplat.azurecr.io/ai-platform-api:1.0.0

        # Resource allocation
        resources:
          cpu: 1.0
          memory: 2.0Gi

        # Environment variables
        env:
          - name: AZURE_OPENAI_API_KEY
            secretRef: openai-api-key
          - name: DATABASE_URL
            secretRef: database-url
          - name: REDIS_URL
            secretRef: redis-url
          - name: AZURE_OPENAI_ENDPOINT
            value: https://my-aoai.openai.azure.com/
          - name: ENVIRONMENT
            value: production
          - name: LOG_LEVEL
            value: info
          - name: PYTHONUNBUFFERED
            value: "1"

        # Health probes
        probes:
          - type: liveness
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 3

          - type: readiness
            httpGet:
              path: /health/ready
              port: 8000
            periodSeconds: 15
            failureThreshold: 2
            successThreshold: 1

          - type: startup
            httpGet:
              path: /health/started
              port: 8000
            failureThreshold: 30
            periodSeconds: 10

    # Scaling rules
    scale:
      minReplicas: 0      # Scale to zero when idle
      maxReplicas: 10
      rules:
        - name: http-scaling
          http:
            metadata:
              concurrentRequests: "10"   # Add a replica when any existing replica has 10+ concurrent requests
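
The three probes in the YAML above expect the application to answer on /health, /health/ready, and /health/started. The following is a minimal, framework-free sketch of what each endpoint could report. The `HealthState` class and its field names are hypothetical and not part of the lesson's code; in a FastAPI app, each method would sit behind a route handler that returns the status code.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    """Tracks what each probe endpoint should report (hypothetical helper)."""

    started: bool = False          # set once startup work (e.g. model loading) is done
    dependencies_ok: bool = False  # set once database/cache connections are verified

    def liveness(self) -> int:
        # /health: the process is running and can answer, so always 200.
        return 200

    def startup(self) -> int:
        # /health/started: 503 until startup work has finished.
        return 200 if self.started else 503

    def readiness(self) -> int:
        # /health/ready: 200 only when started AND dependencies are reachable.
        return 200 if self.started and self.dependencies_ok else 503


state = HealthState()
codes_while_loading = (state.liveness(), state.startup(), state.readiness())

# Simulate the end of startup: model loaded, dependencies checked.
state.started = True
state.dependencies_ok = True
codes_when_ready = (state.liveness(), state.startup(), state.readiness())

print(codes_while_loading)
print(codes_when_ready)
```

Keeping liveness independent of downstream dependencies is a deliberate choice in this sketch: a database outage then takes the replica out of rotation through the readiness probe instead of triggering container restarts.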

Deploy with az CLI

Bash
# First deploy (create). With --yaml, the CLI reads the app configuration,
# including the environment, from the file and ignores other settings flags.
az containerapp create \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --yaml containerapp.yaml

# Update image (new deployment)
az containerapp update \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --image acraiplat.azurecr.io/ai-platform-api:1.1.0

# Apply full YAML changes
az containerapp update \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --yaml containerapp.yaml

Managed Identity and Key Vault

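Using a managed identity lets the Container App access Azure Key Vault without storing credentials anywhere. The identity must exist and hold the right permissions before a revision that references Key Vault secrets can start, so on a brand-new app you may need to create the app without the Key Vault references first (or use a user-assigned identity created in advance) and then grant access: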

Bash
# Enable system-assigned managed identity
az containerapp identity assign \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --system-assigned

# Get the principal ID
PRINCIPAL_ID=$(az containerapp show \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --query "identity.principalId" \
  --output tsv)

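# Grant the identity access to Key Vault secrets
# (access-policy vaults; for an RBAC-enabled vault, assign the "Key Vault Secrets User" role instead)
az keyvault set-policy \
  --name kv-ai-platform \
  --resource-group rg-ai-platform \
  --object-id $PRINCIPAL_ID \
  --secret-permissions get list

# Allow the identity to pull images from ACR (needed for "identity: system" under registries)
az role assignment create \
  --assignee-object-id $PRINCIPAL_ID \
  --assignee-principal-type ServicePrincipal \
  --role AcrPull \
  --scope $(az acr show --name acraiplat --query id --output tsv)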

Once the identity has access, the keyVaultUrl reference in containerapp.yaml resolves at runtime — no API key stored in the YAML.
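
On the application side, the resolved secrets arrive as ordinary environment variables. A minimal sketch of reading them at startup and failing fast when one is missing is shown here. The variable names come from containerapp.yaml; the `Settings` class and `load_settings` function are hypothetical names introduced for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Environment variables populated from secretRef entries in containerapp.yaml.
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "DATABASE_URL", "REDIS_URL")


@dataclass(frozen=True)
class Settings:
    """Secret-backed settings (hypothetical container for this sketch)."""

    azure_openai_api_key: str
    database_url: str
    redis_url: str


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read required variables, raising one error that lists everything missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        azure_openai_api_key=environ["AZURE_OPENAI_API_KEY"],
        database_url=environ["DATABASE_URL"],
        redis_url=environ["REDIS_URL"],
    )


# Example with a fake environment; real code would call load_settings() with no arguments.
example = load_settings(
    {
        "AZURE_OPENAI_API_KEY": "example-key",
        "DATABASE_URL": "postgresql://example",
        "REDIS_URL": "redis://example",
    }
)
print(example.database_url)
```

Failing at startup rather than on first use means a misconfigured Key Vault reference shows up as a failed revision instead of as errors on live requests.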

Checking Logs

Bash
# Stream live logs from the latest revision
az containerapp logs show \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --follow

# Show logs from a specific revision
az containerapp logs show \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --revision ca-ai-platform-api--abc123 \
  --tail 200

# Filter to error lines only (plain-text output piped through grep)
az containerapp logs show \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --format text \
  --tail 300 \
  | grep "ERROR"

Logs are also available in Azure Log Analytics (linked to the Container Apps Environment):

KUSTO
// Log Analytics query — last 100 errors in the past hour
ContainerAppConsoleLogs_CL
| where TimeGenerated > ago(1h)
| where ContainerAppName_s == "ca-ai-platform-api"
| where Log_s contains "ERROR"
| project TimeGenerated, RevisionName_s, Log_s
| order by TimeGenerated desc
| limit 100
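
The query above matches on the text "ERROR" inside each log line, so it only works if the application writes the level name into its console output. A minimal sketch of a standard-library logging setup that does this follows; the logger name `ai_platform_api` and the format string are assumptions for illustration, not something the lesson prescribes.

```python
import io
import logging
import sys

# Each line carries the level name, so a text filter on "ERROR" can find it.
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s %(message)s"


def configure_logging(stream=None, level=logging.INFO) -> logging.Logger:
    """Send log lines to stdout (or a given stream) with the level in every line."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger("ai_platform_api")  # hypothetical logger name
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False
    return logger


# Demonstration against an in-memory stream instead of stdout.
buffer = io.StringIO()
log = configure_logging(stream=buffer)
log.info("request handled")
log.error("upstream call failed")

lines = buffer.getvalue().splitlines()
error_lines = [line for line in lines if "ERROR" in line]
print(error_lines)
```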

Revision Management: Canary Deployments

Azure Container Apps supports traffic splitting between revisions:

Bash
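# Traffic splitting needs multiple-revision mode (the YAML above uses Single)
az containerapp revision set-mode \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --mode multiple

# Deploy a new revision; the suffix makes its name ca-ai-platform-api--v2.
# While the traffic rule targets latestRevision, the new revision receives traffic
# immediately; pin the weight to the current revision by name first if it must start at 0%.
az containerapp revision copy \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --revision-suffix v2 \
  --image acraiplat.azurecr.io/ai-platform-api:2.0.0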

# Get revision names
az containerapp revision list \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --query "[].name" \
  --output table

# Split traffic: 90% to stable, 10% to canary
az containerapp ingress traffic set \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --revision-weight \
    ca-ai-platform-api--v1=90 \
    ca-ai-platform-api--v2=10

# After validation, route all traffic to the new revision
az containerapp ingress traffic set \
  --resource-group rg-ai-platform \
  --name ca-ai-platform-api \
  --revision-weight ca-ai-platform-api--v2=100
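
When the split is scripted, it helps to validate the weights before calling the CLI. The sketch below builds the `--revision-weight` arguments from a mapping and rejects splits that do not total 100. The helper name `revision_weight_args` is hypothetical, and the sum-to-100 rule is stated here as an assumption about how ACA validates traffic weights.

```python
from typing import Dict, List


def revision_weight_args(weights: Dict[str, int]) -> List[str]:
    """Build CLI arguments for a traffic split, validating the percentages first."""
    if not weights:
        raise ValueError("at least one revision is required")
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("weights must not be negative")
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights must sum to 100, got {total}")
    # Dicts keep insertion order, so the arguments follow the order given.
    return ["--revision-weight"] + [
        f"{name}={weight}" for name, weight in weights.items()
    ]


canary_args = revision_weight_args(
    {"ca-ai-platform-api--v1": 90, "ca-ai-platform-api--v2": 10}
)
print(" ".join(canary_args))
```

The returned list can be appended to an `az containerapp ingress traffic set` invocation, for example through `subprocess.run`.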

CI/CD Pipeline (GitHub Actions)

YAML
# .github/workflows/deploy.yml
name: Build and Deploy

on:
  push:
    branches: [main]

jobs:
  build-deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write    # For OIDC auth to Azure
      contents: read

    steps:
      - uses: actions/checkout@v4

      - name: Azure login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Build and push image to ACR
        run: |
          az acr build \
            --registry acraiplat \
            --image ai-platform-api:${{ github.sha }} \
            --file Dockerfile \
            .

      - name: Deploy to Container Apps
        run: |
          az containerapp update \
            --resource-group rg-ai-platform \
            --name ca-ai-platform-api \
            --image acraiplat.azurecr.io/ai-platform-api:${{ github.sha }}

      - name: Wait for readiness
        run: |
          # Poll /health/ready until it returns 200 or timeout
          URL=$(az containerapp show \
            --resource-group rg-ai-platform \
            --name ca-ai-platform-api \
            --query "properties.configuration.ingress.fqdn" \
            --output tsv)

          for i in $(seq 1 20); do
            STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://$URL/health/ready)
            if [ "$STATUS" = "200" ]; then
              echo "Service is ready"
              exit 0
            fi
            echo "Attempt $i: status=$STATUS — waiting 15s"
            sleep 15
          done
          echo "Service did not become ready in time"
          exit 1
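
The readiness polling in the last workflow step can also be expressed in Python, which makes the retry logic easy to test. This is a sketch under assumptions: `wait_until_ready` and `http_status` are hypothetical helper names, and the status check is passed in as a function so the loop can be exercised without a network call.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses are raised as exceptions by urllib
    except OSError:
        return 0  # connection errors and timeouts


def wait_until_ready(
    check_status: Callable[[], int],
    attempts: int = 20,
    delay_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until check_status() returns 200 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if check_status() == 200:
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # no pause after the final failed attempt
    return False


# Demonstration with canned responses and a recording stand-in for sleep.
responses = iter([503, 503, 200])
pauses = []
ready = wait_until_ready(lambda: next(responses), sleep=pauses.append)
print(ready, pauses)
```

In a pipeline, the check would be something like `lambda: http_status("https://" + fqdn + "/health/ready")`, where `fqdn` is the value returned by `az containerapp show`.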

Scale-to-Zero Considerations

With minReplicas: 0, the Container App scales down to zero replicas when idle. The first request that arrives after that has to wait for a new container to start; this delay is the cold start latency. For an AI service that loads a large model in lifespan, it can be 30–60 seconds.

Options:

  1. Set minReplicas: 1. One replica is always running, which eliminates cold starts at the cost of a small idle charge.
  2. Use a startup probe with a long failureThreshold. This gives the container time to load the model before readiness checks begin.
  3. Separate services. Move heavy model loading to a separate service and keep the API stateless and fast to start.

YAML
# Minimum 1 replica to avoid cold starts for production
scale:
  minReplicas: 1
  maxReplicas: 20
  rules:
    - name: http-scaling
      http:
        metadata:
          concurrentRequests: "20"
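
The numbers in this section can be checked with a little arithmetic. The sketch below estimates how long a startup probe tolerates a slow start and how many replicas a given load would ask for. Both functions are hypothetical helpers, and the replica estimate is a steady-state approximation that ignores the autoscaler's polling intervals and cool-down behaviour.

```python
import math


def startup_budget_seconds(
    failure_threshold: int, period_seconds: int, initial_delay_seconds: int = 0
) -> int:
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + failure_threshold * period_seconds


def estimated_replicas(
    concurrent_requests: int,
    target_per_replica: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Replicas needed to keep each one at or below the concurrency target."""
    wanted = math.ceil(concurrent_requests / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Startup probe from containerapp.yaml: failureThreshold 30, periodSeconds 10.
budget = startup_budget_seconds(failure_threshold=30, period_seconds=10)
print(budget)  # 300 seconds, comfortably above a 30-60 second model load

# Scale settings above: concurrentRequests 20, minReplicas 1, maxReplicas 20.
idle = estimated_replicas(0, 20, 1, 20)
busy = estimated_replicas(150, 20, 1, 20)
saturated = estimated_replicas(1000, 20, 1, 20)
print(idle, busy, saturated)  # 1 8 20
```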

Key Takeaways

  • Azure Container Apps abstracts Kubernetes — you describe your container in YAML and ACA handles scheduling, scaling, and TLS
  • Use keyVaultUrl in the secrets section with a managed identity instead of storing secret values in containerapp.yaml
  • Set liveness (/health), readiness (/health/ready), and startup (/health/started) probes — they are essential for zero-downtime deployments
  • Scale to zero with minReplicas: 0 minimises cost; use minReplicas: 1 to eliminate cold start latency for production services
  • az containerapp update --image <image> creates a new revision; combine with traffic splitting for canary deployments
  • az containerapp logs show --follow streams logs in real time; use Log Analytics KQL queries for historical analysis
  • The GitHub Actions OIDC flow (azure/login with client-id) avoids storing long-lived secrets in CI

Next lesson: Interview Questions — FastAPI and async Python for AI services.