
Structured Logging with structlog

Replace print() and unstructured logs with structlog for AI services. Learn how to add context, trace IDs, and machine-readable logs that make debugging LLM pipelines trivial.

Asma Hafeez Khan · May 15, 2026 · 5 min read
LLMOps · Logging · structlog · Observability · Python

Why Print Statements Fail in Production

Every AI engineer starts with print(f"Got response: {response}"). It works on your laptop. In production, across 10 container replicas, you get 50,000 lines of text per hour with no way to answer:

  • Which request caused that error?
  • How long did the LLM call take?
  • Which user sent the prompt that triggered a safety violation?

Structured logging replaces free-text messages with JSON objects where every field is queryable.
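
A minimal before-and-after sketch of the difference, using illustrative values for `elapsed` and `user_id`:

Python
import structlog

log = structlog.get_logger()
elapsed, user_id = 1.2, "u_42"  # illustrative values

# Before: free text — you can grep it, but you can't filter by field
print(f"Got response in {elapsed}s for user {user_id}")

# After: every field is a queryable JSON key
log.info("llm_response_received", duration_s=elapsed, user_id=user_id)
# {"event": "llm_response_received", "duration_s": 1.2, "user_id": "u_42", ...}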


structlog vs Python's logging Module

| Feature | logging | structlog |
|---|---|---|
| Output format | Plain text | JSON (configurable) |
| Adding context | Manual string formatting | bind() — attach key-value pairs |
| Processor pipeline | No | Yes — chain transformations |
| Async support | Limited | First-class |
| Cloud-native | Needs handlers | Outputs JSON directly |
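
The "adding context" row is the one that changes daily life most. A quick contrast, assuming a `user_id` you want attached to several log lines:

Python
import logging

import structlog

user_id = "u_42"  # illustrative

# stdlib logging: context is baked into each message string
logging.getLogger(__name__).info("request received for user %s", user_id)

# structlog: bind once, and every later call carries the field
log = structlog.get_logger().bind(user_id=user_id)
log.info("request_received")
log.info("request_completed")  # same user_id field, no re-formatting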


Installation

Bash
pip install structlog

Basic Configuration

Put this in your main.py before any imports that use logging:

Python
import structlog
import logging

def configure_logging():
    structlog.configure(
        processors=[
            # use the structlog.processors variants here — the structlog.stdlib
            # ones expect stdlib loggers, which PrintLoggerFactory doesn't create
            structlog.contextvars.merge_contextvars,       # async-safe request context
            structlog.processors.add_log_level,            # add "level" field
            structlog.processors.TimeStamper(fmt="iso"),   # ISO 8601 timestamps
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,          # format exceptions
            structlog.processors.JSONRenderer(),           # output JSON
        ],
        wrapper_class=structlog.make_filtering_bound_logger(logging.DEBUG),
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(),
    )

configure_logging()

Getting a Logger

Python
import structlog

log = structlog.get_logger()

# Basic usage
log.info("app_started", port=8000, environment="production")

# Output:
# {"event": "app_started", "port": 8000, "environment": "production",
#  "level": "info", "timestamp": "2026-05-15T10:23:44.123Z"}

Binding Context to Requests

The most powerful feature: bind request-level context once, and it appears in every log line for that request.

Python
import structlog
import time
import uuid

from fastapi import Request

log = structlog.get_logger()

async def request_logging_middleware(request: Request, call_next):
    request_id = str(uuid.uuid4())[:8]
    
    # Bind context — appears in all logs during this request
    structlog.contextvars.clear_contextvars()
    structlog.contextvars.bind_contextvars(
        request_id=request_id,
        method=request.method,
        path=request.url.path,
    )
    
    log.info("request_started")
    
    start = time.perf_counter()
    response = await call_next(request)
    duration_ms = round((time.perf_counter() - start) * 1000, 1)
    
    log.info(
        "request_completed",
        status_code=response.status_code,
        duration_ms=duration_ms,
    )
    
    return response

Register the middleware in main.py:

Python
app.middleware("http")(request_logging_middleware)

Now every log line for a request automatically includes request_id, method, and path — without passing them through every function.
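
For example, a helper buried deep in the call stack (a hypothetical safety check here) picks up the bound fields automatically:

Python
import structlog

safety_log = structlog.get_logger()

async def check_safety(prompt: str) -> bool:
    # no request_id parameter anywhere — merge_contextvars injects the
    # fields bound by the middleware into this log line automatically
    safety_log.info("safety_check_started", prompt_length=len(prompt))
    return True

# {"event": "safety_check_started", "prompt_length": 42, "request_id": "a3f9b12c",
#  "method": "POST", "path": "/api/chat", "level": "info", ...}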


Logging LLM Calls

Wrap every OpenAI call with structured logs:

Python
import structlog
import time

# assumes `client` is an already-configured AsyncOpenAI / AsyncAzureOpenAI instance
log = structlog.get_logger()

async def call_azure_openai(messages: list, model: str = "gpt-4o") -> str:
    bound_log = log.bind(model=model, message_count=len(messages))
    
    bound_log.info("llm_call_started")
    start = time.perf_counter()
    
    try:
        response = await client.chat.completions.create(
            model=model,
            messages=messages,
        )
        
        duration_ms = round((time.perf_counter() - start) * 1000, 1)
        usage = response.usage
        
        bound_log.info(
            "llm_call_completed",
            duration_ms=duration_ms,
            prompt_tokens=usage.prompt_tokens,
            completion_tokens=usage.completion_tokens,
            total_tokens=usage.total_tokens,
            finish_reason=response.choices[0].finish_reason,
        )
        
        return response.choices[0].message.content
        
    except Exception as e:
        duration_ms = round((time.perf_counter() - start) * 1000, 1)
        bound_log.error(
            "llm_call_failed",
            duration_ms=duration_ms,
            error_type=type(e).__name__,
            error=str(e),
        )
        raise

This gives you a log line like:

JSON
{
  "event": "llm_call_completed",
  "model": "gpt-4o",
  "message_count": 3,
  "duration_ms": 1243.7,
  "prompt_tokens": 312,
  "completion_tokens": 87,
  "total_tokens": 399,
  "finish_reason": "stop",
  "request_id": "a3f9b12c",
  "level": "info",
  "timestamp": "2026-05-15T10:23:45.366Z"
}

Every LLM call, every token count, every latency — fully searchable in your logging system.
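
Even before you wire up a logging backend, machine-readable lines pay off. A small sketch, assuming the app's stdout was saved to a hypothetical app.log, that pulls latency and token counts for every completed call:

Python
import json

# read JSON log lines and keep only completed LLM calls
with open("app.log") as f:
    for line in f:
        entry = json.loads(line)
        if entry.get("event") == "llm_call_completed":
            print(entry["request_id"], entry["duration_ms"], entry["total_tokens"])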


Logging RAG Pipeline Steps

Python
import hashlib

import structlog

log = structlog.get_logger()

async def rag_pipeline(query: str) -> str:
    # built-in hash() is salted per process — use a stable digest so the
    # same query hashes identically across replicas and restarts
    query_hash = hashlib.sha256(query.encode()).hexdigest()[:8]
    pipeline_log = log.bind(query_hash=query_hash)
    
    # Step 1: Embed
    pipeline_log.info("embedding_started")
    embedding = await embed(query)
    pipeline_log.info("embedding_completed", dim=len(embedding))
    
    # Step 2: Retrieve
    pipeline_log.info("retrieval_started")
    docs = await retrieve(embedding, top_k=5)
    pipeline_log.info(
        "retrieval_completed",
        docs_returned=len(docs),
        top_score=round(docs[0].score, 3) if docs else None,
    )
    
    # Step 3: Generate
    answer = await call_azure_openai(build_messages(query, docs))
    
    return answer
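
If the started/completed pairs get repetitive, a small helper (a hypothetical log_step, not part of structlog) keeps each step to one line:

Python
import time
from contextlib import contextmanager

import structlog

log = structlog.get_logger()

@contextmanager
def log_step(step: str, **fields):
    # emits a started/completed pair with duration around any pipeline step
    step_log = log.bind(step=step, **fields)
    step_log.info(f"{step}_started")
    start = time.perf_counter()
    try:
        yield step_log
    finally:
        step_log.info(
            f"{step}_completed",
            duration_ms=round((time.perf_counter() - start) * 1000, 1),
        )

# Usage inside the pipeline:
# with log_step("retrieval", top_k=5):
#     docs = await retrieve(embedding, top_k=5)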

Log Levels — What Goes Where

| Level | Use for |
|---|---|
| debug | Prompt content, full retrieved chunks (dev only) |
| info | Request start/end, LLM call start/end, token counts |
| warning | Slow LLM call (over 3s), low retrieval score, fallback triggered |
| error | LLM API failure, DB connection error, unexpected exception |
| critical | Data loss, security event, service completely down |

Python
# Never log prompt content at info in production — it may contain PII
log.debug("prompt_content", prompt=messages)   # filtered out in prod
log.info("prompt_sent", length=len(str(messages)))  # safe metadata only

Filtering Log Levels by Environment

Python
import logging
import os

import structlog

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper()

structlog.configure(
    wrapper_class=structlog.make_filtering_bound_logger(
        getattr(logging, LOG_LEVEL)
    ),
    ...
)

.env.development: LOG_LEVEL=DEBUG
.env.production:  LOG_LEVEL=INFO
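
The same pattern works for the output format. A sketch that assumes an ENVIRONMENT variable and swaps in structlog's human-readable console renderer for local development:

Python
import logging
import os

import structlog

ENVIRONMENT = os.getenv("ENVIRONMENT", "development")  # assumed convention

# pretty, colored output locally; JSON everywhere else
renderer = (
    structlog.dev.ConsoleRenderer()
    if ENVIRONMENT == "development"
    else structlog.processors.JSONRenderer()
)

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        renderer,
    ],
    wrapper_class=structlog.make_filtering_bound_logger(
        getattr(logging, os.getenv("LOG_LEVEL", "INFO").upper())
    ),
    logger_factory=structlog.PrintLoggerFactory(),
)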


Sending Logs to Azure Monitor

The simplest path needs no exporter at all: structlog writes JSON to stdout, and Azure Container Apps forwards stdout to Log Analytics via diagnostic settings. (For distributed tracing, you can layer the azure-monitor-opentelemetry exporter on top.)

Query the forwarded logs with KQL:

Kusto
// column names vary by workspace — Container Apps console logs usually land
// in ContainerAppConsoleLogs_CL with the raw line in Log_s
ContainerAppConsoleLogs_CL
| where ContainerName_s == "pharmabot"
| extend entry = parse_json(Log_s)
| where entry.event == "llm_call_completed"
| project TimeGenerated, duration_ms = entry.duration_ms, total_tokens = entry.total_tokens

Checkpoint

After adding structlog, run your app and make a request. You should see JSON output:

Bash
curl http://localhost:8000/api/chat -d '{"message":"What is ibuprofen?"}' -H "Content-Type: application/json"

Check your terminal for log output like:

JSON
{"event": "request_started", "method": "POST", "path": "/api/chat", "request_id": "f3a1b2c9", "level": "info"}
{"event": "llm_call_started", "model": "gpt-4o", "request_id": "f3a1b2c9", "level": "info"}
{"event": "llm_call_completed", "duration_ms": 1891.2, "total_tokens": 423, "request_id": "f3a1b2c9", "level": "info"}
{"event": "request_completed", "status_code": 200, "duration_ms": 1954.8, "request_id": "f3a1b2c9", "level": "info"}

Every line has a request_id — you can filter all logs for one request instantly.
