Annotated State with Custom Reducers

The Annotated Pattern

Annotated[T, metadata] is a standard Python typing construct that attaches metadata to a type without changing it. LangGraph uses this to attach reducer functions to state fields:

Python
from typing import Annotated
import operator

# Base type: list[str]
# Metadata: operator.add (the reducer)
field: Annotated[list[str], operator.add]

When a node returns a value for this field, LangGraph calls operator.add(existing_value, new_value) and stores the result instead of overwriting the old value. Accumulation is therefore declared once, in the type definition, and nodes need no extra merge logic.
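To make the mechanism concrete, the sketch below reads the reducer back out of an Annotated hint using only the standard library and applies it by hand. It illustrates the idea and is not LangGraph's internal implementation; `ExampleState` and `reducer_for` are names made up for this example.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class ExampleState(TypedDict):
    sources: Annotated[list[str], operator.add]  # accumulated via reducer
    current_stage: str                           # plain field, no reducer


def reducer_for(schema: type, field: str):
    """Return the first callable attached to a field's Annotated hint, or None."""
    # include_extras=True keeps the Annotated metadata instead of stripping it
    hint = get_type_hints(schema, include_extras=True)[field]
    for meta in getattr(hint, "__metadata__", ()):
        if callable(meta):
            return meta
    return None


add_sources = reducer_for(ExampleState, "sources")
merged = add_sources(["source A"], ["source B"])       # list concatenation
no_reducer = reducer_for(ExampleState, "current_stage")  # nothing attached
```

A field without metadata yields no reducer, which corresponds to the plain "latest value wins" behavior described for non-annotated fields.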


Building a Comprehensive State Schema

A well-designed state schema for a multi-step drug research agent:

Python
from typing import TypedDict, Annotated, Optional
import operator

class DrugResearchState(TypedDict):
    # --- Inputs (set once at start, never accumulated) ---
    drug_name: str
    research_depth: str  # "quick", "comprehensive"
    requester_id: str

    # --- Accumulated fields (grow as pipeline runs) ---
    messages: Annotated[list[dict], operator.add]          # Conversation history
    drug_interactions: Annotated[list[str], operator.add]  # Found interactions
    safety_flags: Annotated[list[str], operator.add]       # Safety concerns
    sources: Annotated[list[str], operator.add]            # Citations

    # --- Replaced fields (latest value only) ---
    current_stage: str           # "research", "safety_check", "synthesis"
    confidence_score: float      # 0.0-1.0
    is_complete: bool
    needs_human_review: bool

    # --- Optional fields (may not be set in all paths) ---
    error_message: Optional[str]
    human_reviewer_id: Optional[str]
    approved_at: Optional[str]
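A schema like this is easier to work with when a run starts from a fully populated state, so that nodes should not need to guard against missing keys. The helper below is a minimal sketch of one way to build that starting value; `make_initial_state` is a hypothetical name, and the starting values for the replaced fields are assumptions rather than anything the schema requires.

```python
def make_initial_state(drug_name: str, research_depth: str, requester_id: str) -> dict:
    """Return a fresh starting state with every schema field populated."""
    return {
        # Inputs
        "drug_name": drug_name,
        "research_depth": research_depth,
        "requester_id": requester_id,
        # Accumulated fields start as empty lists
        "messages": [],
        "drug_interactions": [],
        "safety_flags": [],
        "sources": [],
        # Replaced fields (starting values are assumptions for this sketch)
        "current_stage": "research",
        "confidence_score": 0.0,
        "is_complete": False,
        "needs_human_review": False,
        # Optional fields
        "error_message": None,
        "human_reviewer_id": None,
        "approved_at": None,
    }


initial_state = make_initial_state("warfarin", "quick", "requester-1")
another_state = make_initial_state("aspirin", "comprehensive", "requester-2")
```

Because the lists are created inside the function, each call returns independent list objects, so two runs never share an accumulator by accident.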

Custom Reducers with Annotated

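A reducer is any callable that takes the existing value and the update and returns the merged value, so custom merge logic can be attached directly in the type: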

Python
from typing import Annotated, TypedDict

def merge_dicts_deep(existing: dict, update: dict) -> dict:
    """Deep merge two dicts — nested dicts are merged, not replaced."""
    if not existing:
        return update.copy()
    result = existing.copy()
    for key, value in update.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = merge_dicts_deep(result[key], value)
        else:
            result[key] = value
    return result

def capped_list(max_size: int):
    """Factory for a reducer that caps list at max_size."""
    def reducer(existing: list, update: list) -> list:
        combined = existing + update
        return combined[-max_size:]  # Keep only latest N items
    return reducer

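def unique_list(existing: list, update: list) -> list:
    """Append only items not already present, including repeats within update."""
    seen = set(str(x) for x in existing)
    result = list(existing)
    for x in update:
        key = str(x)  # compare by string form so unhashable items (e.g. dicts) work
        if key not in seen:
            seen.add(key)
            result.append(x)
    return result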

class AdvancedState(TypedDict):
    # Deep merge nested dicts
    drug_profile: Annotated[dict, merge_dicts_deep]

    # Only keep last 50 messages (sliding window)
    recent_messages: Annotated[list[str], capped_list(50)]

    # No duplicates in the interactions list
    interactions: Annotated[list[str], unique_list]
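Since reducers are ordinary functions, they can be checked by calling them directly, without building a graph. The sketch below repeats two of the reducers from this section so that it runs on its own; the sample values are invented for illustration.

```python
def merge_dicts_deep(existing: dict, update: dict) -> dict:
    """Deep merge two dicts: nested dicts are merged, other values replaced."""
    if not existing:
        return update.copy()
    result = existing.copy()
    for key, value in update.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = merge_dicts_deep(result[key], value)
        else:
            result[key] = value
    return result


def capped_list(max_size: int):
    """Factory for a reducer that keeps only the latest max_size items."""
    def reducer(existing: list, update: list) -> list:
        combined = existing + update
        return combined[-max_size:]
    return reducer


# Nested "dosing" dicts are merged key by key; the new top-level key is added
profile = merge_dicts_deep(
    {"name": "warfarin", "dosing": {"route": "oral"}},
    {"dosing": {"unit": "mg"}, "reviewed": True},
)

# A window of 3 drops the oldest message once the combined list has 4 items
keep_last_three = capped_list(3)
window = keep_last_three(["m1", "m2"], ["m3", "m4"])
```

Here `profile["dosing"]` ends up holding both `route` and `unit`, and `window` keeps the three most recent messages.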

Validating State at Initialization

With a TypedDict schema, LangGraph does not validate state values at runtime; the annotations are type hints only. Add validation in the entry node:

Python
from langgraph.graph import StateGraph, END

def validate_input(state: DrugResearchState) -> dict:
    """Validate initial state and fail fast if inputs are invalid."""
    errors = []

    # "or ''" also covers an explicit None, which would otherwise crash on .strip()
    if not (state.get("drug_name") or "").strip():
        errors.append("drug_name is required")

    if state.get("research_depth") not in ("quick", "comprehensive"):
        errors.append("research_depth must be 'quick' or 'comprehensive'")

    if errors:
        return {
            "error_message": "; ".join(errors),
            "is_complete": True,
            "current_stage": "failed_validation",
        }

    return {
        "current_stage": "validated",
        "messages": [{"role": "system", "content": f"Starting research for {state['drug_name']}"}],
    }
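A validation node only records the failure in state; something still has to act on it. One common approach is a small routing function that inspects the state and returns a label for the next step. The sketch below is written under that assumption: `route_after_validation`, the label `"stop"`, and the node name `"research"` are all hypothetical.

```python
def route_after_validation(state: dict) -> str:
    """Pick the next step: stop on validation failure, otherwise continue."""
    if state.get("error_message"):
        return "stop"
    return "research"  # hypothetical name of the first research node


# Shapes mirror the two return values of validate_input
failed = {"error_message": "drug_name is required", "is_complete": True}
passed = {"current_stage": "validated", "error_message": None}

next_after_failure = route_after_validation(failed)
next_after_success = route_after_validation(passed)
```

In LangGraph, a function like this is typically passed to `add_conditional_edges` on the graph builder, together with a mapping from the returned labels to node names or `END`.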

State Snapshots and Inspection

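Compile the graph with a checkpointer to read back the full state of a thread at any point. The snippet assumes graph is a StateGraph builder with its nodes already added and initial_state is a dict matching the schema: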

Python
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)

config = {"configurable": {"thread_id": "research_session_1"}}

# Run the graph
app.invoke(initial_state, config=config)

# Inspect current state
snapshot = app.get_state(config)
print(f"Current step: {snapshot.values['current_stage']}")
print(f"Interactions found: {snapshot.values['drug_interactions']}")
print(f"Next node: {snapshot.next}")

# Walk through history (most recent checkpoint first); each item is a StateSnapshot
for past in app.get_state_history(config):
    step = (past.metadata or {}).get("step")
    print(f"At step {step}: stage={past.values.get('current_stage')}")

Using Pydantic for State Validation

LangGraph also accepts Pydantic models as state, enabling automatic validation:

Python
from pydantic import BaseModel, Field, field_validator
from typing import Annotated
import operator

class DrugResearchStatePydantic(BaseModel):
    drug_name: str = Field(min_length=1, max_length=200)
    research_depth: str = Field(pattern="^(quick|comprehensive)$")
    # Reducers are attached with Annotated, as in the TypedDict schema
    messages: Annotated[list[dict], operator.add] = Field(default_factory=list)
    drug_interactions: Annotated[list[str], operator.add] = Field(default_factory=list)
    is_complete: bool = False
    confidence_score: float = Field(default=0.0, ge=0.0, le=1.0)

    # Pydantic v2 API (Field(pattern=...) above is also v2-only)
    @field_validator("drug_name")
    @classmethod
    def drug_name_must_not_be_blank(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("drug_name cannot be blank")
        return v.strip()

# Use as state schema
graph = StateGraph(DrugResearchStatePydantic)

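Pydantic state adds runtime validation: when the state fails the model's constraints, a validation error is raised instead of the bad value flowing silently through the graph. Per LangGraph's documentation, validation is applied to the inputs of nodes rather than to the values nodes return, so an invalid update may only surface the next time state is validated. Check the behavior of your installed version before relying on it.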
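The constraints themselves can be exercised with Pydantic alone, without running a graph. The sketch below assumes Pydantic v2 and uses a trimmed-down, hypothetical model (`DrugQuery`) with the same field rules; `failing_fields` is a helper invented for this example.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class DrugQuery(BaseModel):
    drug_name: str = Field(min_length=1, max_length=200)
    research_depth: str = Field(pattern="^(quick|comprehensive)$")
    confidence_score: float = Field(default=0.0, ge=0.0, le=1.0)

    @field_validator("drug_name")
    @classmethod
    def drug_name_must_not_be_blank(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("drug_name cannot be blank")
        return v.strip()


def failing_fields(**kwargs) -> list[str]:
    """Return the names of fields that fail validation (empty list if valid)."""
    try:
        DrugQuery(**kwargs)
    except ValidationError as exc:
        return sorted(str(err["loc"][0]) for err in exc.errors())
    return []


# Valid input: surrounding whitespace is stripped by the validator
ok = DrugQuery(drug_name="  warfarin  ", research_depth="quick")

# Invalid input: blank name, unknown depth, score out of range
bad = failing_fields(drug_name="   ", research_depth="deep", confidence_score=1.5)
```

Collecting the failing field names this way makes it easy to assert in a unit test that each constraint rejects what it is supposed to reject.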


Documenting State Fields

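For team codebases, document each field with a string literal placed directly beneath it (an attribute docstring). Python discards these strings at runtime, but many editors and documentation generators display them: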

Python
from typing import TypedDict

class WellDocumentedState(TypedDict):
    """State for the drug research pipeline.

    Flow: input → research → safety_check → synthesis → [human_review?] → report
    """

    drug_name: str
    """The pharmaceutical drug being researched."""

    current_stage: str
    """Current pipeline stage. Valid values: research, safety_check, synthesis, human_review, report, complete."""

    confidence_score: float
    """LLM's confidence in the research quality, 0.0-1.0. Triggers human review if below 0.7."""

Clear state documentation makes the graph's behavior understandable to the whole team, not just the person who wrote the nodes.