AI Systems · Advanced

Skill 9 — Azure Cloud: Container Apps, Azure OpenAI & AI Search in Production

Deploy PharmaBot to Azure Container Apps, configure Azure OpenAI and AI Search for production, manage secrets with Azure Key Vault, and set up autoscaling.

Asma Hafeez Khan · May 15, 2026 · 4 min read
Azure · Container Apps · Azure OpenAI · Azure AI Search · Key Vault · Cloud Deployment

Azure Services Used

| Service | Role | Cost Model |
|---|---|---|
| Azure Container Apps | Runs the FastAPI backend | Pay per use: vCPU/memory seconds and requests (scales to zero) |
| Azure OpenAI | GPT-4o inference + embeddings | Pay per token |
| Azure AI Search | Vector + hybrid search | Pay per search unit |
| Azure Container Registry | Stores Docker images | Pay per GB stored |
| Azure Cache for Redis | Rate limiting + session cache | Pay per tier |
| Azure Database for PostgreSQL | Drug metadata | Pay per compute hour |
| Azure Key Vault | All secrets (no secrets in code) | Pay per operation |


Step 1: Build and Push the Docker Image

DOCKERFILE
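# Dockerfile
FROM python:3.11-slim AS builder
WORKDIR /app
COPY pyproject.toml .
# The editable install needs the package source to be present at /app
COPY pharmabot/ ./pharmabot/
RUN pip install --no-cache-dir -e ".[prod]"

FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY pharmabot/ ./pharmabot/
COPY data/ ./data/

EXPOSE 8000
# Run uvicorn as a module: only site-packages is copied from the builder,
# so the uvicorn console script in /usr/local/bin is not in this stage.
CMD ["python", "-m", "uvicorn", "pharmabot.main:app", "--host", "0.0.0.0", "--port", "8000"]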
Bash
# Build and push to Azure Container Registry
az acr login --name pharmabotacr

docker build -t pharmabot:latest .
docker tag pharmabot:latest pharmabotacr.azurecr.io/pharmabot:latest
docker push pharmabotacr.azurecr.io/pharmabot:latest

The multi-stage build keeps the final image small (under 200 MB in this setup) by leaving build tools and test dependencies out of the runtime stage.


Step 2: Azure OpenAI Setup

Bash
# Create Azure OpenAI resource
# --custom-domain gives the resource the https://pharmabot-openai.openai.azure.com/ endpoint used in Step 5
az cognitiveservices account create \
  --name pharmabot-openai \
  --resource-group pharmabot-rg \
  --kind OpenAI \
  --sku S0 \
  --location eastus \
  --custom-domain pharmabot-openai

# Deploy GPT-4o model
az cognitiveservices account deployment create \
  --name pharmabot-openai \
  --resource-group pharmabot-rg \
  --deployment-name gpt-4o \
  --model-name gpt-4o \
  --model-version "2024-05-13" \
  --model-format OpenAI \
  --sku-name "Standard" \
  --sku-capacity 10    # 10K tokens per minute limit

# Deploy text-embedding-3-small for RAG
az cognitiveservices account deployment create \
  --name pharmabot-openai \
  --resource-group pharmabot-rg \
  --deployment-name text-embedding-3-small \
  --model-name text-embedding-3-small \
  --model-version "1" \
  --model-format OpenAI \
  --sku-name "Standard" \
  --sku-capacity 50    # 50K tokens per minute limit
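
The capacity values above are in units of 1,000 tokens per minute, so it can help to estimate how many chat requests fit in that budget before picking a number. The sketch below is plain arithmetic. The helper name is hypothetical, and the 1,500 tokens per request is an assumed average (prompt plus completion), not a measurement from PharmaBot.

```python
def max_requests_per_minute(sku_capacity: int, avg_tokens_per_request: int) -> int:
    """Estimate how many requests per minute a deployment can sustain.

    sku_capacity is in units of 1,000 tokens per minute (TPM),
    matching the --sku-capacity flag used in the deployment commands.
    """
    if sku_capacity < 0 or avg_tokens_per_request <= 0:
        raise ValueError("capacity must be >= 0 and tokens per request > 0")
    tokens_per_minute = sku_capacity * 1_000
    return tokens_per_minute // avg_tokens_per_request


# Assumed average of 1,500 tokens per chat request (prompt + completion)
print(max_requests_per_minute(10, 1_500))  # 6
```

If the estimate is lower than the traffic you expect, raise the capacity or trim the prompt before going to production.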

Step 3: Azure AI Search Setup

Bash
# Create AI Search instance (free tier for dev, Standard S1 for prod)
az search service create \
  --name pharmabot-search \
  --resource-group pharmabot-rg \
  --sku Standard \
  --location eastus

# Get the admin key
az search admin-key show \
  --service-name pharmabot-search \
  --resource-group pharmabot-rg

Create the index (run once, via Python):

Python
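import asyncio
import os

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes.aio import SearchIndexClient
from pharmabot.rag.search_index import create_drug_index_schema


async def create_index() -> None:
    # Read the admin key from the environment rather than hardcoding it.
    # AZURE_SEARCH_ADMIN_KEY is an assumed variable name; use your own.
    credential = AzureKeyCredential(os.environ["AZURE_SEARCH_ADMIN_KEY"])
    async with SearchIndexClient(
        endpoint="https://pharmabot-search.search.windows.net",
        credential=credential,
    ) as client:
        await client.create_or_update_index(create_drug_index_schema())
    print("Index created or updated.")


if __name__ == "__main__":
    asyncio.run(create_index())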

Step 4: Secrets in Azure Key Vault

Never pass secret values directly as plain environment variables. Store them in Key Vault and reference them from the Container App:

Bash
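# Create Key Vault with Azure RBAC authorization, so access can be
# granted through roles such as Key Vault Secrets User
az keyvault create \
  --name pharmabot-kv \
  --resource-group pharmabot-rg \
  --location eastus \
  --enable-rbac-authorization true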

# Store all secrets
az keyvault secret set --vault-name pharmabot-kv --name "AzureOpenAIKey"    --value "your-key"
az keyvault secret set --vault-name pharmabot-kv --name "AzureSearchKey"    --value "your-key"
az keyvault secret set --vault-name pharmabot-kv --name "DatabaseURL"       --value "your-conn-string"
az keyvault secret set --vault-name pharmabot-kv --name "RedisURL"          --value "your-redis-url"
az keyvault secret set --vault-name pharmabot-kv --name "JWTSecretKey"      --value "your-jwt-secret"

Step 5: Deploy to Azure Container Apps

Bash
# Create Container Apps environment
az containerapp env create \
  --name pharmabot-env \
  --resource-group pharmabot-rg \
  --location eastus

# Deploy the container with Key Vault secret references.
# keyvaultref takes the full secret URI; identityref:system reads it with
# the app's system-assigned managed identity (enabled by --system-assigned).
az containerapp create \
  --name pharmabot-api \
  --resource-group pharmabot-rg \
  --environment pharmabot-env \
  --image pharmabotacr.azurecr.io/pharmabot:latest \
  --registry-server pharmabotacr.azurecr.io \
  --system-assigned \
  --target-port 8000 \
  --ingress external \
  --min-replicas 0 \
  --max-replicas 5 \
  --cpu 0.5 \
  --memory 1.0Gi \
  --secrets \
    "azure-openai-key=keyvaultref:https://pharmabot-kv.vault.azure.net/secrets/AzureOpenAIKey,identityref:system" \
    "db-url=keyvaultref:https://pharmabot-kv.vault.azure.net/secrets/DatabaseURL,identityref:system" \
  --env-vars \
    "AZURE_OPENAI_API_KEY=secretref:azure-openai-key" \
    "DATABASE_URL=secretref:db-url" \
    "AZURE_OPENAI_ENDPOINT=https://pharmabot-openai.openai.azure.com/" \
    "AZURE_OPENAI_DEPLOYMENT=gpt-4o"

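Scale to zero (--min-replicas 0) means the app incurs no compute charges while no requests are coming in. The trade-off is a cold start on the first request after an idle period: about 3 s in this setup, though it varies with image size and startup work.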
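
Inside the container, the secretref entries in the deploy command arrive as ordinary environment variables. A minimal sketch of reading them at startup and failing fast when one is missing could look like this. `Settings` and `load_settings` are hypothetical names for illustration, not PharmaBot's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names taken from the --env-vars list in the deploy command
REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "DATABASE_URL",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT",
)


@dataclass(frozen=True)
class Settings:
    azure_openai_api_key: str
    database_url: str
    azure_openai_endpoint: str
    azure_openai_deployment: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, raising if anything is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Field names are the lowercased variable names
    return Settings(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Calling `load_settings()` once at startup turns a misconfigured secret reference into an immediate, readable error instead of a failure on the first request.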


Step 6: Configure Autoscaling

Bash
# Scale based on HTTP request count
az containerapp update \
  --name pharmabot-api \
  --resource-group pharmabot-rg \
  --scale-rule-name http-scaler \
  --scale-rule-type http \
  --scale-rule-http-concurrency 20    # scale up when >20 concurrent requests

With this config:

  • 0 replicas → 0 requests/min
  • 1 replica → up to 20 concurrent requests
  • 5 replicas → up to 100 concurrent requests
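
The mapping above is roughly a ceiling division clamped to the replica limits. The sketch below reproduces that arithmetic for capacity planning. It is an approximation, since the real scaler reacts over time rather than instantly, and `expected_replicas` is a hypothetical helper name.

```python
import math


def expected_replicas(
    concurrent_requests: int,
    per_replica_concurrency: int = 20,  # --scale-rule-http-concurrency
    min_replicas: int = 0,              # --min-replicas
    max_replicas: int = 5,              # --max-replicas
) -> int:
    """Approximate the replica count the HTTP scale rule aims for."""
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be non-negative")
    wanted = math.ceil(concurrent_requests / per_replica_concurrency)
    return max(min_replicas, min(max_replicas, wanted))


for load in (0, 20, 21, 100, 500):
    print(load, expected_replicas(load))
```

Beyond 100 concurrent requests the count stays at 5, so any extra load is shared by the same five replicas unless the maximum is raised.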

Checkpoint

Verify production deployment:

Bash
# Get the production URL
APP_URL=$(az containerapp show \
  --name pharmabot-api \
  --resource-group pharmabot-rg \
  --query "properties.configuration.ingress.fqdn" -o tsv)

# Health check
curl https://$APP_URL/health

# Test production chat
curl -N -X POST https://$APP_URL/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is metformin?", "session_id": "prod-test"}'

Both requests should return HTTP 200 (add -i to curl to see the status line). If the health check shows database: error, check the Key Vault reference permissions: the Container App's managed identity needs the Key Vault Secrets User role on the vault.
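
The same check can be scripted so it runs after every deploy. This sketch assumes the /health endpoint returns a flat JSON object mapping component names to a status string such as "ok" or "error". That shape is inferred from the database: error example and may differ from the real response. Both function names are hypothetical.

```python
import json
import urllib.request


def failing_components(health: dict) -> list[str]:
    """Return the names of components whose status is not "ok" (assumed value)."""
    return sorted(name for name, status in health.items() if status != "ok")


def check_health(base_url: str, timeout: float = 10.0) -> list[str]:
    """Fetch <base_url>/health and return the failing component names."""
    url = f"{base_url.rstrip('/')}/health"
    # urlopen raises HTTPError for 4xx/5xx, so a parsed body implies success
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return failing_components(json.load(response))
```

An empty list from `check_health("https://<your-app-fqdn>")` means every reported component is healthy under that assumption.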
