# Skill 2 — Backend Engineering: Build the FastAPI Core (Async, Pydantic v2, OpenAPI)

Build the PharmaBot FastAPI backend from scratch — async endpoints, Pydantic v2 request/response schemas, Server-Sent Events streaming, and automatic OpenAPI documentation.
## Why FastAPI for an AI Backend?

Three reasons FastAPI is the right choice for PharmaBot:

- **Async native** — `async def` endpoints don't block threads while waiting for Azure OpenAI
- **Pydantic v2 built-in** — request validation and response serialization with zero boilerplate
- **Auto OpenAPI docs** — Swagger UI at `/docs` generated from your type hints
## Project Setup

```bash
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies (quote the extras so zsh doesn't expand the brackets)
pip install fastapi "uvicorn[standard]" pydantic-settings "sqlalchemy[asyncio]" aiosqlite
```

The `pyproject.toml` groups dependencies into extras so you install everything with one command:

```bash
pip install -e ".[dev]"  # installs pharmabot + all dev dependencies
```

## Application Entry Point
```python
# pharmabot/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from pharmabot.api.chat import router as chat_router
from pharmabot.api.health import router as health_router
from pharmabot.api.search import router as search_router


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: initialise DB connection pool, Redis client, embedder
    from pharmabot.database import engine
    from pharmabot.cache import redis_client

    # Open (and release) one pooled connection to verify the DB is reachable
    async with engine.connect():
        pass
    yield
    # Shutdown: close all connections gracefully
    await engine.dispose()
    await redis_client.aclose()


app = FastAPI(
    title="PharmaBot AI",
    description="AI pharmacist assistant — drug information and interaction checking",
    version="1.0.0",
    lifespan=lifespan,
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:5173"],  # React dev server
    allow_methods=["*"],
    allow_headers=["*"],
)

app.include_router(chat_router, prefix="/api")
app.include_router(search_router, prefix="/api")
app.include_router(health_router)
```

## Pydantic v2 Schemas
```python
# pharmabot/schemas/chat.py
from pydantic import BaseModel, Field


class ChatRequest(BaseModel):
    message: str = Field(
        min_length=3,
        max_length=500,
        description="The user's pharmaceutical question",
        examples=["What are the side effects of metformin?"],
    )
    session_id: str = Field(
        description="Unique session identifier for conversation memory",
        examples=["user-abc-123"],
    )


class SearchRequest(BaseModel):
    query: str = Field(min_length=3, max_length=300)
    top_k: int = Field(default=3, ge=1, le=10)


class SearchResult(BaseModel):
    chunk: str
    drug_name: str
    score: float
    source: str  # "azure_search" or "pgvector"
    label_section: str  # e.g. "WARNINGS", "DOSAGE AND ADMINISTRATION"
```

Pydantic v2 validates at the boundary — if a request is malformed, FastAPI returns a 422 Unprocessable Entity with a detailed error before your handler even runs.
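You can see that boundary validation directly. A small standalone demo (the schema is re-declared inline so the snippet runs on its own; the constraints mirror `ChatRequest` above):

```python
from pydantic import BaseModel, Field, ValidationError


class ChatRequest(BaseModel):
    # Same constraints as the real schema, inlined for a self-contained demo
    message: str = Field(min_length=3, max_length=500)
    session_id: str


# A well-formed payload parses cleanly
ok = ChatRequest.model_validate({"message": "What is metformin?", "session_id": "test-001"})
print(ok.message)  # What is metformin?

# A message below min_length is rejected with a structured error
try:
    ChatRequest.model_validate({"message": "hi", "session_id": "test-001"})
except ValidationError as exc:
    error_type = exc.errors()[0]["type"]
print(error_type)  # string_too_short
```

FastAPI wraps that same `ValidationError` into the 422 response body, so clients get the field name and constraint that failed without any handler code running.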
## The Chat Endpoint — Streaming SSE
```python
# pharmabot/api/chat.py
from fastapi import APIRouter, Depends
from fastapi.responses import StreamingResponse

from pharmabot.schemas.chat import ChatRequest
from pharmabot.agents.triage import TriageAgent
from pharmabot.security.rate_limiter import check_rate_limit
from pharmabot.security.sanitizer import sanitize_input

router = APIRouter()


@router.post("/chat")
async def chat(
    request: ChatRequest,
    _: None = Depends(check_rate_limit),
) -> StreamingResponse:
    # 1. Sanitize — block prompt injection attempts
    clean_message = sanitize_input(request.message)

    # 2. Stream agent response via Server-Sent Events
    async def event_stream():
        agent = TriageAgent(session_id=request.session_id)
        async for chunk in agent.stream(clean_message):
            # SSE format: "data: {text}\n\n"
            yield f"data: {chunk}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(
        event_stream(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "X-Accel-Buffering": "no",  # disables nginx buffering
        },
    )
```

### Why StreamingResponse?
Without streaming, the user stares at a blank screen for 3–8 seconds while GPT-4o generates the full response. With SSE, each token appears as soon as it's generated — the UI feels instantaneous even for long responses.
## Health Check Endpoint
```python
# pharmabot/api/health.py
from fastapi import APIRouter
from fastapi.responses import JSONResponse
from sqlalchemy import text

from pharmabot.database import engine
from pharmabot.cache import redis_client

router = APIRouter()


@router.get("/health")
async def health():
    checks = {}

    # Database ping — SQLAlchemy 2.x requires text() for raw SQL
    try:
        async with engine.connect() as conn:
            await conn.execute(text("SELECT 1"))
        checks["database"] = "ok"
    except Exception:
        checks["database"] = "error"

    # Redis ping
    try:
        await redis_client.ping()
        checks["redis"] = "ok"
    except Exception:
        checks["redis"] = "error"

    status = "healthy" if all(v == "ok" for v in checks.values()) else "degraded"
    # Return 503 when degraded so orchestrator probes see a failing check
    return JSONResponse(
        status_code=200 if status == "healthy" else 503,
        content={"status": status, "checks": checks},
    )
```

Azure Container Apps uses this endpoint for liveness and readiness probes — if it returns a non-200 status, the container is restarted automatically.
## Run the Server

```bash
uvicorn pharmabot.main:app --reload --port 8000
```

Open these in your browser:

- `http://localhost:8000/docs` — Swagger UI (all endpoints, try them live)
- `http://localhost:8000/health` — should return `{"status": "healthy", ...}`
## Checkpoint

Test the chat endpoint manually (before agents are wired up, it returns a placeholder stream):

```bash
curl -N http://localhost:8000/api/chat \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"message": "What is metformin?", "session_id": "test-001"}'
```

You should see SSE events flowing:

```
data: Metformin
data: is
data: a
data: biguanide
...
data: [DONE]
```

If you see that, the FastAPI async streaming backbone is working. Next: wiring in prompt engineering.