FastAPI for AI Engineers · Lesson 11 of 12
Deploying to Azure Container Apps
What Is Azure Container Apps?
Azure Container Apps (ACA) is Microsoft's serverless container hosting platform. It sits above raw Kubernetes — you do not manage nodes, clusters, or Kubernetes YAML directly. Instead, you describe your container and ACA handles scheduling, scaling, networking, and TLS.
Key characteristics relevant to AI services:
| Feature | Detail |
|---------|--------|
| Scale to zero | Containers scale down to zero replicas when idle: no traffic, no cost |
| KEDA-based scaling | Scales up based on HTTP concurrency, queue depth, CPU, or custom metrics |
| Ingress | Built-in HTTPS ingress with automatic TLS certificate |
| Secrets | Native secrets or Azure Key Vault references |
| Managed identity | Container can authenticate to Azure services without stored credentials |
| Dapr | Optional sidecar for service discovery, pub/sub, and state management |
| Revision management | Multiple revisions run side by side for canary or blue-green deployments |
Prerequisites
# Check that the Azure CLI is installed and recent enough
az version # needs 2.53+
# Install the Container Apps extension
az extension add --name containerapp --upgrade
# Login
az login
# Register providers (one-time per subscription)
az provider register --namespace Microsoft.App
az provider register --namespace Microsoft.OperationalInsights
Core Concepts
- Environment: a shared networking boundary. Multiple Container Apps can talk to each other within one environment.
- Container App: one deployable unit. Runs one main container, optionally with sidecar containers alongside it.
- Revision: an immutable snapshot of a Container App's configuration. Deploying a new image creates a new revision.
- Ingress: controls how traffic reaches the Container App from outside.
- Scale rule: tells ACA when to add or remove replicas.
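To make the scale rule concept concrete, here is a rough model of how an HTTP concurrency rule could translate load into a replica count. It is a simplified sketch, not ACA's actual algorithm: the real scaler (KEDA) also applies polling intervals and cool-down periods, and the function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(
    concurrent_requests: int,
    target_per_replica: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Approximate replica count for an HTTP concurrency scale rule.

    Simplified assumption: replicas = ceil(load / target), clamped to
    the [min_replicas, max_replicas] range.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if concurrent_requests <= 0:
        # Idle: fall back to the floor (0 means scale to zero).
        return min_replicas
    needed = math.ceil(concurrent_requests / target_per_replica)
    return max(min_replicas, min(needed, max_replicas))


# 35 in-flight requests with a target of 10 per replica
print(desired_replicas(35, 10, min_replicas=0, max_replicas=10))
```

The clamp is the part worth remembering: no amount of traffic pushes the app past `maxReplicas`, and an idle app never drops below `minReplicas`.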
Create an Environment and Container Registry
RESOURCE_GROUP="rg-ai-platform"
LOCATION="uksouth"
ENVIRONMENT="cae-ai-platform"
REGISTRY="acraiplat.azurecr.io"
# Resource group
az group create --name $RESOURCE_GROUP --location $LOCATION
# Container Apps Environment
az containerapp env create \
--name $ENVIRONMENT \
--resource-group $RESOURCE_GROUP \
--location $LOCATION
# Azure Container Registry (if you don't have one)
az acr create \
--name acraiplat \
--resource-group $RESOURCE_GROUP \
--sku Basic \
--admin-enabled true
Building and Pushing the Image
# Build using ACR Tasks (no local Docker needed)
az acr build \
--registry acraiplat \
--image ai-platform-api:1.0.0 \
--file Dockerfile \
.
# Or build locally and push
docker build -t acraiplat.azurecr.io/ai-platform-api:1.0.0 .
docker push acraiplat.azurecr.io/ai-platform-api:1.0.0
containerapp.yaml
containerapp.yaml is the declarative specification for the Container App. Treat it as the source of truth and check it into version control alongside your application code.
# containerapp.yaml
location: uksouth
resourceGroup: rg-ai-platform
type: Microsoft.App/containerApps
name: ca-ai-platform-api
properties:
  managedEnvironmentId: /subscriptions/<SUB_ID>/resourceGroups/rg-ai-platform/providers/Microsoft.App/managedEnvironments/cae-ai-platform
  configuration:
    activeRevisionsMode: Single
    # Registry authentication (using managed identity is preferred over admin key)
    registries:
      - server: acraiplat.azurecr.io
        identity: system
    # Ingress: expose port 8000 externally with HTTPS
    ingress:
      external: true
      targetPort: 8000
      transport: http
      allowInsecure: false
      traffic:
        - weight: 100
          latestRevision: true
    # Secrets: reference Key Vault instead of storing values here
    secrets:
      - name: openai-api-key
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/AzureOpenAIKey
        identity: system
      - name: database-url
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/DatabaseUrl
        identity: system
      - name: redis-url
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/RedisUrl
        identity: system
  template:
    containers:
      - name: api
        image: acraiplat.azurecr.io/ai-platform-api:1.0.0
        # Resource allocation
        resources:
          cpu: 1.0
          memory: 2.0Gi
        # Environment variables
        env:
          - name: AZURE_OPENAI_API_KEY
            secretRef: openai-api-key
          - name: DATABASE_URL
            secretRef: database-url
          - name: REDIS_URL
            secretRef: redis-url
          - name: AZURE_OPENAI_ENDPOINT
            value: https://my-aoai.openai.azure.com/
          - name: ENVIRONMENT
            value: production
          - name: LOG_LEVEL
            value: info
          - name: PYTHONUNBUFFERED
            value: "1"
        # Health probes
        probes:
          - type: liveness
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 3
          - type: readiness
            httpGet:
              path: /health/ready
              port: 8000
            periodSeconds: 15
            failureThreshold: 2
            successThreshold: 1
          - type: startup
            httpGet:
              path: /health/started
              port: 8000
            failureThreshold: 30
            periodSeconds: 10
    # Scaling rules
    scale:
      minReplicas: 0 # Scale to zero when idle
      maxReplicas: 10
      rules:
        - name: http-scaling
          http:
            metadata:
              concurrentRequests: "10" # Add a replica when any existing replica has 10+ concurrent requests
Deploy with az CLI
# First deploy (create)
az containerapp create \
--resource-group rg-ai-platform \
--environment cae-ai-platform \
--yaml containerapp.yaml \
--name ca-ai-platform-api
# Update image (new deployment)
az containerapp update \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--image acraiplat.azurecr.io/ai-platform-api:1.1.0
# Apply full YAML changes
az containerapp update \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--yaml containerapp.yaml
Managed Identity and Key Vault
Using a managed identity lets the Container App access Azure Key Vault without storing credentials anywhere:
# Enable system-assigned managed identity
az containerapp identity assign \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--system-assigned
# Get the principal ID
PRINCIPAL_ID=$(az containerapp show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--query "identity.principalId" \
--output tsv)
# Grant the identity access to Key Vault secrets
az keyvault set-policy \
--name kv-ai-platform \
--resource-group rg-ai-platform \
--object-id $PRINCIPAL_ID \
--secret-permissions get list
Once the identity has access, the keyVaultUrl reference in containerapp.yaml resolves at runtime, so no API key is stored in the YAML.
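From the application's side, the resolved secrets arrive as ordinary environment variables through the secretRef entries in containerapp.yaml. The sketch below shows one way the API might read them at startup and fail fast if any are missing. The `Settings` class and `load_settings` function are hypothetical names, not part of the lesson's codebase; only the variable names come from the YAML.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Environment variable names taken from the env section of containerapp.yaml.
REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "DATABASE_URL",
    "REDIS_URL",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container for the API process."""

    azure_openai_api_key: str
    azure_openai_endpoint: str
    database_url: str
    redis_url: str
    log_level: str = "info"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from the environment, failing fast on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        azure_openai_api_key=env["AZURE_OPENAI_API_KEY"],
        azure_openai_endpoint=env["AZURE_OPENAI_ENDPOINT"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        log_level=env.get("LOG_LEVEL", "info"),
    )
```

Failing at startup rather than on first use means a misconfigured Key Vault reference shows up as a failed revision instead of as errors on live requests.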
Checking Logs
# Stream live logs from the latest revision
az containerapp logs show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--follow
# Show logs from a specific revision
az containerapp logs show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--revision ca-ai-platform-api--abc123 \
--tail 200
# Filter to error lines only
az containerapp logs show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--format text \
| grep "ERROR"
Logs are also available in Azure Log Analytics (linked to the Container Apps Environment):
// Log Analytics query — last 100 errors in the past hour
ContainerAppConsoleLogs_CL
| where TimeGenerated > ago(1h)
| where ContainerAppName_s == "ca-ai-platform-api"
| where Log_s contains "ERROR"
| project TimeGenerated, RevisionName_s, Log_s
| order by TimeGenerated desc
| limit 100
Revision Management: Canary Deployments
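Azure Container Apps supports traffic splitting between revisions when the app runs in multiple-revision mode (activeRevisionsMode: Multiple rather than the Single used in containerapp.yaml):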
# Deploy new revision without sending traffic to it yet
az containerapp revision copy \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--image acraiplat.azurecr.io/ai-platform-api:2.0.0
# Get revision names
az containerapp revision list \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--query "[].name" \
--output table
# Split traffic: 90% to stable, 10% to canary
az containerapp ingress traffic set \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--revision-weight \
ca-ai-platform-api--v1=90 \
ca-ai-platform-api--v2=10
# After validation — route all traffic to new revision
az containerapp ingress traffic set \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--revision-weight ca-ai-platform-api--v2=100
CI/CD Pipeline (GitHub Actions)
# .github/workflows/deploy.yml
name: Build and Deploy
on:
  push:
    branches: [main]
jobs:
  build-deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # For OIDC auth to Azure
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Azure login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Build and push image to ACR
        run: |
          az acr build \
            --registry acraiplat \
            --image ai-platform-api:${{ github.sha }} \
            --file Dockerfile \
            .
      - name: Deploy to Container Apps
        run: |
          az containerapp update \
            --resource-group rg-ai-platform \
            --name ca-ai-platform-api \
            --image acraiplat.azurecr.io/ai-platform-api:${{ github.sha }}
      - name: Wait for readiness
        run: |
          # Poll /health/ready until it returns 200 or timeout
          URL=$(az containerapp show \
            --resource-group rg-ai-platform \
            --name ca-ai-platform-api \
            --query "properties.configuration.ingress.fqdn" \
            --output tsv)
          for i in $(seq 1 20); do
            STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://$URL/health/ready")
            if [ "$STATUS" = "200" ]; then
              echo "Service is ready"
              exit 0
            fi
            echo "Attempt $i: status=$STATUS - waiting 15s"
            sleep 15
          done
          echo "Service did not become ready in time"
          exit 1
Scale-to-Zero Considerations
When minReplicas is 0, the Container App scales down completely when idle. The next request then triggers a cold start: it has to wait while ACA schedules and starts a new container. For an AI service that loads a large model in its lifespan handler, this can take 30–60 seconds.
Options:
- Set minReplicas: 1: always one replica running, eliminates cold starts, costs a small idle charge
- Startup probe with long failureThreshold: gives the container time to load the model before readiness
- Separate endpoints: move heavy model loading to a separate service; keep the API stateless and fast to start
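The startup-probe option only works if the app reports "started" and "ready" as separate states. The lesson does not show the health endpoints themselves, so the following is a framework-agnostic sketch under that assumption: `HealthState` is a hypothetical class whose methods return the status code and body that route handlers for /health, /health/started, and /health/ready could send back.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (HTTP status code, JSON-serialisable body)
Response = Tuple[int, Dict[str, str]]


@dataclass
class HealthState:
    """Hypothetical shared state updated by the app's startup code."""

    started: bool = False          # set once model loading has finished
    dependencies_ok: bool = False  # set once database/cache checks pass

    def liveness(self) -> Response:
        # The process is up and able to answer; no dependency checks here.
        return 200, {"status": "alive"}

    def startup(self) -> Response:
        # 503 until startup work is done, so the startup probe keeps waiting.
        if self.started:
            return 200, {"status": "started"}
        return 503, {"status": "starting"}

    def readiness(self) -> Response:
        # Only accept traffic once startup is done and dependencies respond.
        if self.started and self.dependencies_ok:
            return 200, {"status": "ready"}
        return 503, {"status": "not ready"}


state = HealthState()
print(state.startup())    # while the model is still loading
state.started = True
state.dependencies_ok = True
print(state.readiness())  # after startup completes
```

With the startup probe values from containerapp.yaml (failureThreshold: 30, periodSeconds: 10), the container gets roughly 300 seconds to flip `started` before ACA restarts it.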
# Minimum 1 replica to avoid cold starts for production
scale:
  minReplicas: 1
  maxReplicas: 20
  rules:
    - name: http-scaling
      http:
        metadata:
          concurrentRequests: "20"
Key Takeaways
- Azure Container Apps abstracts Kubernetes — you describe your container in YAML and ACA handles scheduling, scaling, and TLS
- Use keyVaultUrl in the secrets section with a managed identity instead of storing secret values in containerapp.yaml
- Set liveness (/health), readiness (/health/ready), and startup (/health/started) probes; they are essential for zero-downtime deployments
- Scale to zero with minReplicas: 0 minimises cost; use minReplicas: 1 to eliminate cold start latency for production services
- az containerapp update --image <image> creates a new revision; combine with traffic splitting for canary deployments
- az containerapp logs show --follow streams logs in real time; use Log Analytics KQL queries for historical analysis
- The GitHub Actions OIDC flow (azure/login with client-id) avoids storing long-lived secrets in CI
Next lesson: Interview Questions — FastAPI and async Python for AI services.