Dockerising a FastAPI AI Service

Why Containerise?

Docker containers give AI services:

Reproducible environments — the same image runs on a developer laptop, CI pipeline, and production cluster
Dependency isolation — Python packages and native libraries are bundled, not installed globally on the host
Horizontal scaling — run as many identical container instances as you need
Immutable deployments — roll back by tagging and running an older image

The Simplest Dockerfile

Before optimising, here is the simplest working Dockerfile for a FastAPI service:

DOCKERFILE

FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

This works but has problems:

Runs as root (security risk)
Single build stage — dev tools end up in the production image
No HEALTHCHECK (Docker and the orchestrator cannot tell whether the service is actually responding)
No .dockerignore — copies .git, __pycache__, secrets

Multi-Stage Build

Multi-stage builds separate the "build" environment from the "runtime" environment. The final image contains only what is needed to run the service: no compilers, development headers, or intermediate build files.

DOCKERFILE

# ============================================================
# Stage 1: Build — install dependencies into a virtual environment
# ============================================================
FROM python:3.12-slim AS builder

WORKDIR /build

# Install build tools needed for some Python packages (e.g. psycopg2, numpy)
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Create a virtual environment in a known path
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Copy requirements first — Docker caches this layer separately from app code
COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt

# ============================================================
# Stage 2: Runtime — minimal image with only the venv and app code
# ============================================================
FROM python:3.12-slim AS runtime

# Install only runtime native libs (e.g. libpq for psycopg2)
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*

# Copy the virtual environment from the build stage
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Create a non-root user and group
RUN groupadd --system appgroup \
    && useradd --system --gid appgroup --no-create-home appuser

WORKDIR /app

# Copy application code
COPY --chown=appuser:appgroup . .

# Switch to non-root user
USER appuser

EXPOSE 8000

CMD ["uvicorn", "main:app", \
     "--host", "0.0.0.0", \
     "--port", "8000", \
     "--workers", "1", \
     "--log-level", "info"]

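The final image does not contain gcc, the libpq-dev build headers, or the build stage's filesystem, so it is smaller and has a smaller attack surface. pip is still present, because it ships in the python:3.12-slim base image and inside the copied virtual environment.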

Non-Root User

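Running as root inside a container means that an attacker who escapes the container sandbox is likely to be root on the host as well (unless user-namespace remapping is enabled). Even without an escape, a root process can rewrite any file inside the container. Running as a non-root user is a fundamental security baseline.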

DOCKERFILE

# Create system user (no home dir, no shell, no password)
RUN groupadd --system appgroup \
    && useradd --system --gid appgroup --no-create-home --shell /bin/false appuser

# Give the user ownership of the app directory
COPY --chown=appuser:appgroup . .

# Switch to non-root
USER appuser

Verify in a running container:

Bash

docker exec my-container whoami   # Should print: appuser
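
As an extra safeguard, the application itself can refuse to start as root. The following is a minimal sketch, not part of the original service: the function names is_root and refuse_root are hypothetical, and os.geteuid() is only available on Unix-like systems such as the Linux base image used here.

```python
import os
import sys


def is_root(euid: int) -> bool:
    """Return True when the effective user ID is 0 (root)."""
    return euid == 0


def refuse_root() -> None:
    """Exit with an error message if the current process runs as root."""
    if is_root(os.geteuid()):
        sys.exit("Refusing to start as root: check the USER instruction in the Dockerfile")
```

Calling refuse_root() at the top of main.py would turn a forgotten USER instruction into an immediate startup failure instead of a silent risk.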

Layer Caching: Requirements Before App Code

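Docker builds images layer by layer. If a layer's inputs (the instruction and any files it copies) haven't changed, Docker reuses the cached layer; once one layer has to be rebuilt, every layer after it is rebuilt too. The key insight: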

requirements.txt changes rarely
Application code changes constantly

Copy requirements.txt first, run pip install, then copy the rest:

DOCKERFILE

# Layer 1: requirements (cached if requirements.txt unchanged)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Layer 2: app code (rebuilt on every code change — but layer 1 is cached)
COPY . .

If you copy everything in one step (COPY . . before pip install), any code change invalidates that layer and forces pip install to run again, which slows down every build.
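
The effect of instruction order can be illustrated with a toy model. This is only a sketch under a simplifying assumption: each layer's cache key is a hash of the previous key, the instruction text, and a fingerprint of the copied files. Docker's real cache keys are computed differently, and the names layer_keys and reused are made up for this example.

```python
import hashlib


def layer_keys(steps: list[tuple[str, str]]) -> list[str]:
    """Return one cache key per (instruction, content_fingerprint) step.

    Each key depends on the previous key, so a change in one step
    changes the key of every later step as well.
    """
    keys = []
    parent = ""
    for instruction, fingerprint in steps:
        digest = hashlib.sha256(f"{parent}|{instruction}|{fingerprint}".encode()).hexdigest()
        keys.append(digest)
        parent = digest
    return keys


def reused(before: list[str], after: list[str]) -> list[bool]:
    """For each layer, report whether the key is unchanged between two builds."""
    return [a == b for a, b in zip(before, after)]


# Good ordering: requirements are copied and installed before the app code.
good_v1 = [("COPY requirements.txt .", "reqs-v1"), ("RUN pip install", ""), ("COPY . .", "code-v1")]
good_v2 = [("COPY requirements.txt .", "reqs-v1"), ("RUN pip install", ""), ("COPY . .", "code-v2")]

# Bad ordering: everything is copied before pip install runs.
bad_v1 = [("COPY . .", "code-v1"), ("RUN pip install", "")]
bad_v2 = [("COPY . .", "code-v2"), ("RUN pip install", "")]

print(reused(layer_keys(good_v1), layer_keys(good_v2)))  # [True, True, False]
print(reused(layer_keys(bad_v1), layer_keys(bad_v2)))    # [False, False]
```

With the good ordering, a code change only invalidates the final COPY layer. With the bad ordering, the same change also invalidates the pip install layer that follows it.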

CMD: Uvicorn Configuration

Single worker (recommended for containers)

In a containerised environment, process-level parallelism is handled by running multiple container replicas, not by running multiple workers per container. Use a single worker and let the orchestrator scale the number of containers:

DOCKERFILE

CMD ["uvicorn", "main:app", \
     "--host", "0.0.0.0", \
     "--port", "8000", \
     "--workers", "1", \
     "--log-level", "info", \
     "--access-log"]

Multiple workers (on large VMs)

If you run on a large VM (8+ cores) and want to use them within a single container:

DOCKERFILE

CMD ["uvicorn", "main:app", \
     "--host", "0.0.0.0", \
     "--port", "8000", \
     "--workers", "4"]

Gunicorn as process manager

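Gunicorn provides better process management (automatic worker respawn, graceful restarts). It starts Uvicorn through a worker class; recent Uvicorn releases mark the uvicorn.workers module as deprecated in favour of the separate uvicorn-worker package, so check the release notes for the version you pin: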

DOCKERFILE

CMD ["gunicorn", "main:app", \
     "--workers", "4", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--bind", "0.0.0.0:8000", \
     "--timeout", "120", \
     "--graceful-timeout", "30", \
     "--log-level", "info"]

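The worker counts above are hard-coded to 4. If you would rather derive the number, Gunicorn's documentation suggests (2 x cores) + 1 as a starting point. The sketch below applies that formula; the function name suggested_workers and the cap of 8 are assumptions of this example, not values from Gunicorn. Be aware that os.cpu_count() may report the host's CPUs rather than the container's CPU limit.

```python
import os


def suggested_workers(cores: int, cap: int = 8) -> int:
    """Gunicorn's documented starting point, (2 x cores) + 1, limited by a cap.

    The cap is an assumption of this sketch: every worker loads its own
    copy of the app, so memory is often the real limit rather than CPU.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return min(2 * cores + 1, cap)


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to a single core.
    print(suggested_workers(os.cpu_count() or 1))
```

The printed value could be passed to --workers by whatever script starts the container.
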
The .dockerignore File

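.dockerignore keeps files out of the Docker build context, so COPY can never copy them into the image. Without it, COPY . . copies your entire project directory, including unnecessary or sensitive files. Patterns are matched relative to the root of the build context, so a bare __pycache__ or *.pyc only matches at the top level; prefix a pattern with **/ to match it in every subdirectory.

# Version control
.git
.gitignore

# Python cache (**/ matches at any depth)
**/__pycache__
**/*.pyc
**/*.pyo
**/*.pyd
.Python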

# Virtual environments
.venv
venv
env

# Tests (not needed in production image)
tests/
pytest.ini
.pytest_cache
coverage.xml
.coverage

# Development config
.env
.env.local
*.env
docker-compose*.yml

# IDE files
.vscode
.idea
*.swp

# Build artifacts
*.egg-info
dist/
build/

# Documentation
*.md
docs/

# CI
.github/
.gitlab-ci.yml

This keeps the image small and — critically — prevents .env files with secrets from ending up inside the image layer.
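
To reason about which paths a pattern excludes, it can help to model the matching rules. The sketch below is an approximation, not Docker's implementation: it matches patterns segment by segment relative to the context root, treats ** as "zero or more directories", lets a matching parent directory exclude everything beneath it, and ignores ! exception lines entirely. The names is_ignored and _match are hypothetical.

```python
from fnmatch import fnmatchcase


def _match(pat: list[str], parts: list[str]) -> bool:
    """Match pattern segments against path segments; ** spans zero or more segments."""
    if not pat:
        return not parts
    if pat[0] == "**":
        return any(_match(pat[1:], parts[i:]) for i in range(len(parts) + 1))
    return bool(parts) and fnmatchcase(parts[0], pat[0]) and _match(pat[1:], parts[1:])


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Approximate check: is path (relative to the context root) excluded?

    A pattern that matches a parent directory excludes everything below it.
    Exception lines starting with ! are skipped in this sketch.
    """
    parts = path.strip("/").split("/")
    for raw in patterns:
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        pat = line.strip("/").split("/")
        if any(_match(pat, parts[:i]) for i in range(1, len(parts) + 1)):
            return True
    return False


nested = "app/__pycache__/main.cpython-312.pyc"
print(is_ignored(nested, ["__pycache__", "*.pyc"]))  # False
print(is_ignored(nested, ["**/__pycache__"]))        # True
print(is_ignored(".env", [".env"]))                  # True
```

The first result shows why patterns without a **/ prefix miss cache directories that sit below the project root.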

Requirements File

Use pip freeze > requirements.txt for reproducibility, or better, use pip-tools to manage a requirements.in → requirements.txt workflow:

# requirements.txt — pinned for reproducible builds
fastapi==0.115.5
uvicorn[standard]==0.32.1
gunicorn==23.0.0
openai==1.55.0
pydantic==2.10.3
pydantic-settings==2.6.1
httpx==0.28.0
asyncpg==0.30.0
redis==5.2.0
python-dotenv==1.0.1
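
A small check in CI can catch requirements that slip in without an exact pin. This is a rough sketch with a hypothetical function name (unpinned); it only understands plain name==version lines with optional extras, so lines that use environment markers or URLs would be reported as unpinned.

```python
import re

# A line counts as pinned when it has the form name[extras]==version.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.*+!_-]+$")


def unpinned(requirements_text: str) -> list[str]:
    """Return the requirement lines that are not pinned with ==."""
    problems = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options such as -r
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = "fastapi==0.115.5\nuvicorn[standard]==0.32.1\nhttpx>=0.28\nredis\n"
print(unpinned(sample))  # ['httpx>=0.28', 'redis']
```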

Environment Variable Injection

Never bake secrets into the Docker image. Inject them at runtime:

Bash

# Development — from a .env file
docker run --env-file .env -p 8000:8000 my-ai-service:latest

# Production — pass individual variables
docker run \
  -e OPENAI_API_KEY="sk-..." \
  -e DATABASE_URL="postgresql://..." \
  -e REDIS_URL="redis://..." \
  -p 8000:8000 \
  my-ai-service:latest

Read them in FastAPI via pydantic-settings:

Python

# config.py
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    openai_api_key: str
    azure_openai_endpoint: str = ""
    azure_openai_api_key: str = ""
    database_url: str
    redis_url: str = "redis://localhost:6379"
    environment: str = "production"
    debug: bool = False
    log_level: str = "info"

settings = Settings()

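BaseSettings gives environment variables priority and falls back to .env only for settings that are not set in the environment. Fields without a default (openai_api_key and database_url here) are required: Settings() raises a validation error at import time if neither source provides them. The .dockerignore shown earlier excludes .env, so in the built image every value has to come from the container's environment.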

Complete Production Dockerfile

DOCKERFILE

# ============================================================
# FastAPI AI Service — Production Dockerfile
# ============================================================

# ---- Build stage ----
FROM python:3.12-slim AS builder

LABEL stage=builder

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        gcc \
        libpq-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /build

RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade pip==24.3.1 \
    && pip install --no-cache-dir -r requirements.txt

# ---- Runtime stage ----
FROM python:3.12-slim AS runtime

LABEL maintainer="Asma Hafeez Khan <asma@example.com>"
LABEL org.opencontainers.image.title="AI Platform Service"
LABEL org.opencontainers.image.version="1.0.0"

# Runtime native deps only
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libpq5 \
        curl \
    && rm -rf /var/lib/apt/lists/*

# Non-root user
RUN groupadd --gid 1001 appgroup \
    && useradd --uid 1001 --gid appgroup --no-create-home --shell /bin/false appuser

# Copy venv from builder
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

WORKDIR /app

# Copy app code (owned by non-root user)
COPY --chown=appuser:appgroup . .

USER appuser

EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=5s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["uvicorn", "main:app", \
     "--host", "0.0.0.0", \
     "--port", "8000", \
     "--workers", "1", \
     "--log-level", "info", \
     "--access-log", \
     "--no-use-colors"]

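The HEALTHCHECK above needs curl in the runtime image. One possible alternative, sketched here under the assumption that the service exposes GET /health on port 8000 as in the Dockerfile, is a small standard-library script. The file name healthcheck.py and the function names are hypothetical.

```python
import sys
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def main(argv: list[str]) -> int:
    """Return a process exit code: 0 when healthy, 1 otherwise."""
    target = argv[1] if len(argv) > 1 else "http://localhost:8000/health"
    return 0 if is_healthy(target) else 1
```

To use it as a script, end the file with sys.exit(main(sys.argv)) under an if __name__ == "__main__": guard and point the HEALTHCHECK at python healthcheck.py. curl could then be dropped from the apt-get install list.
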
Build and Run

Bash

# Build the image
docker build -t my-ai-service:latest .

# Run locally in the background (-d) with env vars from file,
# so the commands below can be run from the same terminal
docker run --rm -d \
  --env-file .env \
  -p 8000:8000 \
  --name ai-service \
  my-ai-service:latest

# Check health
curl http://localhost:8000/health

# View logs
docker logs ai-service --follow

# Inspect image layers and size
docker image inspect my-ai-service:latest
docker history my-ai-service:latest

Docker Compose for Local Development

YAML

# docker-compose.yml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - .:/app   # Mount source for hot-reload (dev only)
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: aidb
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d aidb"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3
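
Inside the Compose network, containers reach each other by service name, so the api container has to connect to postgres and redis rather than localhost. The sketch below builds the matching connection URLs from the values in the docker-compose.yml above; the helper names postgres_url and redis_url are hypothetical, and the resulting strings are what DATABASE_URL and REDIS_URL in .env would need to contain for this setup.

```python
from urllib.parse import quote


def postgres_url(user: str, password: str, host: str, database: str, port: int = 5432) -> str:
    """Build a PostgreSQL URL, percent-encoding the credentials."""
    return f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{database}"


def redis_url(host: str, port: int = 6379) -> str:
    """Build a Redis URL for the given host and port."""
    return f"redis://{host}:{port}"


# Hosts are the Compose service names, not localhost.
print(postgres_url("user", "pass", "postgres", "aidb"))  # postgresql://user:pass@postgres:5432/aidb
print(redis_url("redis"))                                # redis://redis:6379
```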

Key Takeaways

Multi-stage builds keep the production image small — the runtime stage contains only the venv, app code, and native runtime libs, not build tools
Copy requirements.txt and pip install before COPY . . to maximise layer caching — unchanged dependencies don't get reinstalled on every build
Always create a non-root user (useradd --system) and switch to it with USER before the CMD
Use a .dockerignore file to exclude .git, __pycache__, .env, tests/, and *.md from the build context
Inject secrets via environment variables at runtime — never bake them into the image
PYTHONUNBUFFERED=1 ensures Python output is flushed immediately to Docker logs without buffering
Use a single worker per container and let the orchestrator scale container replicas horizontally

Next lesson: Deploying FastAPI to Azure Container Apps.
