Deploying FastAPI to Azure Container Apps
Deploy a FastAPI AI service to Azure Container Apps with containerapp.yaml, Key Vault secret references, health probes, scaling rules, and az CLI deployment commands.
What Is Azure Container Apps?
Azure Container Apps (ACA) is Microsoft's serverless container hosting platform. It sits above raw Kubernetes — you do not manage nodes, clusters, or Kubernetes YAML directly. Instead, you describe your container and ACA handles scheduling, scaling, networking, and TLS.
Key characteristics relevant to AI services:
| Feature | Detail |
|---------|--------|
| Scale to zero | Containers scale down to zero replicas when idle: no traffic, no cost |
| KEDA-based scaling | Scales up based on HTTP concurrency, queue depth, CPU, or custom metrics |
| Ingress | Built-in HTTPS ingress with automatic TLS certificate |
| Secrets | Native secrets or Azure Key Vault references |
| Managed identity | Container can authenticate to Azure services without stored credentials |
| Dapr | Optional sidecar for service discovery, pub/sub, and state management |
| Revision management | Multiple revisions run side by side for canary or blue-green deployments |
Prerequisites
# Confirm the Azure CLI is installed and check its version
az version # needs 2.53+
# Install the Container Apps extension
az extension add --name containerapp --upgrade
# Login
az login
# Register providers (one-time per subscription)
az provider register --namespace Microsoft.App
az provider register --namespace Microsoft.OperationalInsights
Core Concepts
- Environment — a shared networking boundary. Multiple Container Apps can talk to each other within one environment.
- Container App — one deployable unit. Wraps one container (or multiple sidecars).
- Revision — an immutable snapshot of a Container App's configuration. Deploying a new image creates a new revision.
- Ingress — controls how traffic reaches the Container App from outside.
- Scale rule — tells ACA when to add or remove replicas.
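The scale rule concept can be sketched as a small Python model. This is a simplified illustration, assuming the scaler aims for a fixed number of concurrent requests per replica and clamps the result to the configured bounds; it is not the actual KEDA or ACA algorithm, and the function name `desired_replicas` is hypothetical. The numbers mirror the concurrentRequests: "10", minReplicas: 0, and maxReplicas: 10 settings used in containerapp.yaml later in this lesson.

```python
import math


def desired_replicas(concurrent_requests: int, target_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified model of an HTTP concurrency scale rule (illustrative only)."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Enough replicas that each one handles at most the target concurrency.
    needed = math.ceil(concurrent_requests / target_per_replica)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    for load in (0, 5, 25, 500):
        replicas = desired_replicas(load, target_per_replica=10,
                                    min_replicas=0, max_replicas=10)
        print(f"{load} concurrent requests -> {replicas} replicas")
```

In this model an idle app drops to zero replicas, and a burst beyond 100 concurrent requests is capped at the maximum of 10 replicas.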
Create an Environment and Container Registry
RESOURCE_GROUP="rg-ai-platform"
LOCATION="uksouth"
ENVIRONMENT="cae-ai-platform"
REGISTRY="acraiplat.azurecr.io"
# Resource group
az group create --name $RESOURCE_GROUP --location $LOCATION
# Container Apps Environment
az containerapp env create \
--name $ENVIRONMENT \
--resource-group $RESOURCE_GROUP \
--location $LOCATION
# Azure Container Registry (if you don't have one)
az acr create \
--name acraiplat \
--resource-group $RESOURCE_GROUP \
--sku Basic \
--admin-enabled true
Building and Pushing the Image
# Build using ACR Tasks (no local Docker needed)
az acr build \
--registry acraiplat \
--image ai-platform-api:1.0.0 \
--file Dockerfile \
.
# Or build locally and push
docker build -t acraiplat.azurecr.io/ai-platform-api:1.0.0 .
docker push acraiplat.azurecr.io/ai-platform-api:1.0.0
containerapp.yaml
The declarative specification for a Container App. This is the source of truth — check it into version control alongside your application code.
# containerapp.yaml
location: uksouth
resourceGroup: rg-ai-platform
type: Microsoft.App/containerApps
name: ca-ai-platform-api
properties:
  managedEnvironmentId: /subscriptions/<SUB_ID>/resourceGroups/rg-ai-platform/providers/Microsoft.App/managedEnvironments/cae-ai-platform
  configuration:
    activeRevisionsMode: Single
    # Registry authentication (using managed identity is preferred over admin key)
    registries:
      - server: acraiplat.azurecr.io
        identity: system
    # Ingress: expose port 8000 externally with HTTPS
    ingress:
      external: true
      targetPort: 8000
      transport: http
      allowInsecure: false
      traffic:
        - weight: 100
          latestRevision: true
    # Secrets: reference Key Vault instead of storing values here
    secrets:
      - name: openai-api-key
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/AzureOpenAIKey
        identity: system
      - name: database-url
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/DatabaseUrl
        identity: system
      - name: redis-url
        keyVaultUrl: https://kv-ai-platform.vault.azure.net/secrets/RedisUrl
        identity: system
  template:
    containers:
      - name: api
        image: acraiplat.azurecr.io/ai-platform-api:1.0.0
        # Resource allocation
        resources:
          cpu: 1.0
          memory: 2.0Gi
        # Environment variables
        env:
          - name: AZURE_OPENAI_API_KEY
            secretRef: openai-api-key
          - name: DATABASE_URL
            secretRef: database-url
          - name: REDIS_URL
            secretRef: redis-url
          - name: AZURE_OPENAI_ENDPOINT
            value: https://my-aoai.openai.azure.com/
          - name: ENVIRONMENT
            value: production
          - name: LOG_LEVEL
            value: info
          - name: PYTHONUNBUFFERED
            value: "1"
        # Health probes
        probes:
          - type: liveness
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 3
          - type: readiness
            httpGet:
              path: /health/ready
              port: 8000
            periodSeconds: 15
            failureThreshold: 2
            successThreshold: 1
          - type: startup
            httpGet:
              path: /health/started
              port: 8000
            failureThreshold: 30
            periodSeconds: 10
    # Scaling rules
    scale:
      minReplicas: 0 # Scale to zero when idle
      maxReplicas: 10
      rules:
        - name: http-scaling
          http:
            metadata:
              concurrentRequests: "10" # Add a replica when any existing replica has 10+ concurrent requests
Deploy with az CLI
# First deploy (create)
az containerapp create \
--resource-group rg-ai-platform \
--environment cae-ai-platform \
--yaml containerapp.yaml \
--name ca-ai-platform-api
# Update image (new deployment)
az containerapp update \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--image acraiplat.azurecr.io/ai-platform-api:1.1.0
# Apply full YAML changes
az containerapp update \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--yaml containerapp.yaml
Managed Identity and Key Vault
Using a managed identity lets the Container App access Azure Key Vault without storing credentials anywhere:
# Enable system-assigned managed identity
az containerapp identity assign \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--system-assigned
# Get the principal ID
PRINCIPAL_ID=$(az containerapp show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--query "identity.principalId" \
--output tsv)
# Grant the identity access to Key Vault secrets
az keyvault set-policy \
--name kv-ai-platform \
--resource-group rg-ai-platform \
--object-id $PRINCIPAL_ID \
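--secret-permissions get list
Once the identity has access, the keyVaultUrl reference in containerapp.yaml resolves at runtime, so no API key is stored in the YAML. Note that az keyvault set-policy applies to vaults that use the access policy permission model; a vault configured for Azure RBAC needs a role assignment such as Key Vault Secrets User instead.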
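On the application side, the resolved secrets arrive as ordinary environment variables. The following is a minimal sketch of how the service might read them at startup, assuming the variable names from the env section of containerapp.yaml; the `Settings` class and `load_settings` helper are hypothetical names, not part of FastAPI or any Azure SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object populated from container environment variables."""
    azure_openai_api_key: str
    database_url: str
    redis_url: str
    azure_openai_endpoint: str


# Maps each Settings field to the environment variable set in containerapp.yaml.
ENV_VARS = {
    "azure_openai_api_key": "AZURE_OPENAI_API_KEY",
    "database_url": "DATABASE_URL",
    "redis_url": "REDIS_URL",
    "azure_openai_endpoint": "AZURE_OPENAI_ENDPOINT",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read required variables and fail fast if any is missing or empty."""
    missing = sorted(var for var in ENV_VARS.values() if not env.get(var))
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(**{field: env[var] for field, var in ENV_VARS.items()})


if __name__ == "__main__":
    try:
        settings = load_settings()
        print("Settings loaded for endpoint", settings.azure_openai_endpoint)
    except RuntimeError as exc:
        print(exc)
```

Failing fast on a missing variable makes a broken Key Vault reference show up as a startup error in the logs rather than as a failed request later.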
Checking Logs
# Stream live logs from the latest revision
az containerapp logs show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--follow
# Show logs from a specific revision
az containerapp logs show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--revision ca-ai-platform-api--abc123 \
--tail 200
# Filter to error lines only (requires jq)
az containerapp logs show \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--output json \
| jq '.[] | select(.severity == "ERROR")'
Logs are also available in Azure Log Analytics (linked to the Container Apps Environment):
// Log Analytics query — last 100 errors in the past hour
ContainerAppConsoleLogs_CL
| where TimeGenerated > ago(1h)
| where ContainerAppName_s == "ca-ai-platform-api"
| where Log_s contains "ERROR"
| project TimeGenerated, RevisionName_s, Log_s
| order by TimeGenerated desc
| limit 100
Revision Management: Canary Deployments
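Azure Container Apps supports traffic splitting between revisions. Splitting requires multiple revision mode (activeRevisionsMode: Multiple) and traffic weights pinned to named revisions; in Single mode with latestRevision: true, as in the containerapp.yaml above, all traffic moves to the newest revision as soon as it is ready: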
# Deploy new revision without sending traffic to it yet
az containerapp revision copy \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--image acraiplat.azurecr.io/ai-platform-api:2.0.0
# Get revision names
az containerapp revision list \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--query "[].name" \
--output table
# Split traffic: 90% to stable, 10% to canary
az containerapp ingress traffic set \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--revision-weight \
ca-ai-platform-api--v1=90 \
ca-ai-platform-api--v2=10
# After validation — route all traffic to new revision
az containerapp ingress traffic set \
--resource-group rg-ai-platform \
--name ca-ai-platform-api \
--revision-weight ca-ai-platform-api--v2=100
CI/CD Pipeline (GitHub Actions)
# .github/workflows/deploy.yml
name: Build and Deploy
on:
  push:
    branches: [main]
jobs:
  build-deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # For OIDC auth to Azure
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Azure login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Build and push image to ACR
        run: |
          az acr build \
            --registry acraiplat \
            --image ai-platform-api:${{ github.sha }} \
            --file Dockerfile \
            .
      - name: Deploy to Container Apps
        run: |
          az containerapp update \
            --resource-group rg-ai-platform \
            --name ca-ai-platform-api \
            --image acraiplat.azurecr.io/ai-platform-api:${{ github.sha }}
      - name: Wait for readiness
        run: |
          # Poll /health/ready until it returns 200 or timeout
          URL=$(az containerapp show \
            --resource-group rg-ai-platform \
            --name ca-ai-platform-api \
            --query "properties.configuration.ingress.fqdn" \
            --output tsv)
          for i in $(seq 1 20); do
            STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://$URL/health/ready)
            if [ "$STATUS" = "200" ]; then
              echo "Service is ready"
              exit 0
            fi
            echo "Attempt $i: status=$STATUS, waiting 15s"
            sleep 15
          done
          echo "Service did not become ready in time"
          exit 1
Scale-to-Zero Considerations
With minReplicas: 0, the Container App scales down to zero replicas when idle. The first request that arrives afterwards has to wait for a new container to start, which is the cold start latency. For an AI service that loads a large model in its lifespan handler, this can take 30–60 seconds.
Options:
- Set minReplicas: 1: always one replica running, eliminates cold starts, costs a small idle charge
- Startup probe with long failureThreshold: gives the container time to load the model before readiness
- Separate endpoints: move heavy model loading to a separate service; keep the API stateless and fast to start
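The startup probe option comes down to simple arithmetic, sketched here in Python. The sketch assumes the time budget is roughly failureThreshold multiplied by periodSeconds (plus any initial delay), and that the /health/started endpoint returns 503 until the model is in memory; `startup_budget_seconds` and `ProbeState` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


def startup_budget_seconds(failure_threshold: int, period_seconds: int,
                           initial_delay_seconds: int = 0) -> int:
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + failure_threshold * period_seconds


@dataclass
class ProbeState:
    """Hypothetical holder for what the startup endpoint should report."""
    model_loaded: bool = False

    def started_status(self) -> int:
        # 503 while the model is still loading, 200 once it is in memory.
        return 200 if self.model_loaded else 503


if __name__ == "__main__":
    # Values from the startup probe in containerapp.yaml.
    budget = startup_budget_seconds(failure_threshold=30, period_seconds=10)
    print(f"startup budget: {budget}s")

    state = ProbeState()
    print("while loading:", state.started_status())
    state.model_loaded = True  # would be set at the end of the lifespan startup
    print("after loading:", state.started_status())
```

With failureThreshold: 30 and periodSeconds: 10 the budget is about 300 seconds, which leaves headroom over a 30–60 second model load.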
# Minimum 1 replica to avoid cold starts for production
scale:
  minReplicas: 1
  maxReplicas: 20
  rules:
    - name: http-scaling
      http:
        metadata:
          concurrentRequests: "20"
Key Takeaways
- Azure Container Apps abstracts Kubernetes: you describe your container in YAML and ACA handles scheduling, scaling, and TLS
- Use keyVaultUrl in the secrets section with a managed identity instead of storing secret values in containerapp.yaml
- Set liveness (/health), readiness (/health/ready), and startup (/health/started) probes; they are essential for zero-downtime deployments
- Scale to zero with minReplicas: 0 minimises cost; use minReplicas: 1 to eliminate cold start latency for production services
- az containerapp update --image <image> creates a new revision; combine with traffic splitting for canary deployments
- az containerapp logs show --follow streams logs in real time; use Log Analytics KQL queries for historical analysis
- The GitHub Actions OIDC flow (azure/login with client-id) avoids storing long-lived secrets in CI
Next lesson: Interview Questions — FastAPI and async Python for AI services.