Structured Output and Pydantic Models

Why Structured Output Matters

By default, CrewAI tasks return free-form text. This works for human-readable reports but not for programmatic processing. If your downstream code needs to extract specific fields (drug name, severity level, list of interactions), you need structured output.

CrewAI supports two structured output modes:

output_pydantic: returns a typed Pydantic model instance
output_json: returns a plain Python dictionary with the fields defined by a Pydantic model

output_pydantic

Define a Pydantic model for the expected output shape, then assign it to the task:

Python

from crewai import Agent, Task, Crew
from pydantic import BaseModel, Field

# Define the output schema
class DrugInteractionReport(BaseModel):
    drug_a: str = Field(description="First drug name")
    drug_b: str = Field(description="Second drug name")
    severity: str = Field(description="Interaction severity: minor, moderate, major, contraindicated")
    mechanism: str = Field(description="Pharmacological mechanism of the interaction")
    clinical_effects: list[str] = Field(description="List of clinical effects of the interaction")
    management: str = Field(description="How to manage this interaction clinically")
    monitoring_parameters: list[str] = Field(description="What to monitor if both drugs are given")
    recommendation: str = Field(description="Final clinical recommendation")

# Agent
pharmacologist = Agent(
    role="Clinical Pharmacologist",
    goal="Analyze drug interactions accurately",
    backstory="Board-certified pharmacologist specializing in drug interactions",
    verbose=True,
)

# Task with structured output
interaction_task = Task(
    description=(
        "Analyze the drug interaction between {drug_a} and {drug_b}. "
        "Provide a comprehensive interaction report."
    ),
    expected_output=(
        "A complete drug interaction report with all required fields: "
        "severity (minor/moderate/major/contraindicated), mechanism, "
        "clinical effects, management, monitoring parameters, and recommendation."
    ),
    agent=pharmacologist,
    output_pydantic=DrugInteractionReport,  # Force structured output
)

crew = Crew(agents=[pharmacologist], tasks=[interaction_task])

result = crew.kickoff(inputs={"drug_a": "Warfarin", "drug_b": "Ibuprofen"})

# Access structured output
report: DrugInteractionReport = result.pydantic
print(f"Severity: {report.severity}")
print(f"Mechanism: {report.mechanism}")
print(f"Effects: {report.clinical_effects}")
print(f"Recommendation: {report.recommendation}")
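Because severity is declared as a plain str, the model accepts any string the agent returns. One possible tightening, sketched here with Pydantic alone (no CrewAI call), is to declare the allowed values with typing.Literal so that an unexpected value fails validation. The class name StrictInteractionReport, the helper is_valid_report, and the trimmed field list are illustrative and not part of the original example; Pydantic v2 is assumed.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError

# The four severity levels named in the original field description
Severity = Literal["minor", "moderate", "major", "contraindicated"]


class StrictInteractionReport(BaseModel):
    """Illustrative, trimmed variant of DrugInteractionReport."""
    drug_a: str = Field(description="First drug name")
    drug_b: str = Field(description="Second drug name")
    severity: Severity = Field(description="Interaction severity")


def is_valid_report(payload: dict) -> bool:
    """Return True if the payload satisfies the schema, False otherwise."""
    try:
        StrictInteractionReport.model_validate(payload)
        return True
    except ValidationError:
        return False


good = {"drug_a": "Warfarin", "drug_b": "Ibuprofen", "severity": "major"}
bad = {"drug_a": "Warfarin", "drug_b": "Ibuprofen", "severity": "very bad"}

print(is_valid_report(good))
print(is_valid_report(bad))
```

A stricter type like this trades some flexibility for safety: a response that uses a synonym such as "severe" would be rejected rather than passed through.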

output_json

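Set output_json to a Pydantic model class (not True) when you want the result as a plain dict rather than a typed model instance. The model still defines the expected shape: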

Python

class DrugProfile(BaseModel):
    name: str
    drug_class: str
    mechanism: str
    indications: list[str]
    side_effects: list[str]
    interactions: list[str]

# `researcher` is assumed to be an Agent defined elsewhere
profile_task = Task(
    description="Create a drug profile for {drug_name}",
    expected_output="Complete drug profile with all standard pharmaceutical fields",
    agent=researcher,
    output_json=DrugProfile,  # Validates against schema but returns dict
)

crew = Crew(agents=[researcher], tasks=[profile_task])
result = crew.kickoff(inputs={"drug_name": "Metformin"})

# Access as dictionary
profile_dict: dict = result.json_dict
print(profile_dict["name"])
print(profile_dict["indications"])
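A dict obtained through output_json can still be turned back into a typed model later if that becomes useful. The sketch below uses a hand-written dict as a stand-in for result.json_dict (the values are illustrative, not real crew output) and relies on Pydantic v2's model_validate and model_dump to move between the two forms.

```python
from pydantic import BaseModel


class DrugProfile(BaseModel):
    name: str
    drug_class: str
    mechanism: str
    indications: list[str]
    side_effects: list[str]
    interactions: list[str]


# Stand-in for result.json_dict; illustrative values only
profile_dict = {
    "name": "Metformin",
    "drug_class": "Biguanide",
    "mechanism": "Decreases hepatic glucose production",
    "indications": ["Type 2 diabetes mellitus"],
    "side_effects": ["Nausea", "Diarrhea"],
    "interactions": ["Iodinated contrast media"],
}

# dict -> typed model (raises ValidationError if a field is missing or mistyped)
profile = DrugProfile.model_validate(profile_dict)
print(profile.name)

# typed model -> dict again
round_trip = profile.model_dump()
print(round_trip == profile_dict)
```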

Accessing Task vs Crew Output

Python

result = crew.kickoff()

# Crew-level output (from the last task)
print(result.raw)          # String output of final task
print(result.pydantic)     # Pydantic model (if output_pydantic was set on last task)
print(result.json_dict)    # Dict (if output_json was set on last task)
print(result.token_usage)  # Token usage across all tasks

# Task-level output (any task in the crew)
for task in crew.tasks:
    output = task.output
    print(f"\nTask: {task.description[:60]}")
    print(f"  Raw: {output.raw[:100]}")
    if output.pydantic:
        print(f"  Typed: {output.pydantic}")
    if output.json_dict:
        print(f"  Dict: {output.json_dict}")
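The same three attributes (raw, pydantic, json_dict) appear on both crew-level and task-level output, so the selection logic can be folded into one helper. The function below is a hypothetical convenience, not a CrewAI API, and the SimpleNamespace objects are stand-ins that only mimic those three attributes so the sketch runs without a crew.

```python
from types import SimpleNamespace
from typing import Any


def best_representation(output: Any) -> Any:
    """Prefer the typed model, then the dict, then the raw string."""
    if getattr(output, "pydantic", None) is not None:
        return output.pydantic
    if getattr(output, "json_dict", None) is not None:
        return output.json_dict
    return output.raw


# Stand-ins for task.output / crew output objects
text_only = SimpleNamespace(raw="Free-form answer", pydantic=None, json_dict=None)
dict_only = SimpleNamespace(
    raw='{"name": "Metformin"}',
    pydantic=None,
    json_dict={"name": "Metformin"},
)

print(best_representation(text_only))
print(best_representation(dict_only))
```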

Multi-Task Pipeline with Typed Interfaces

Giving each task a typed output makes the pipeline easier to debug, because every stage's result can be inspected as a model with known fields:

Python

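from crewai import Crew, Task
from pydantic import BaseModel

# `researcher` and `writer` are assumed to be Agents defined elsewhere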

class ResearchData(BaseModel):
    drug_name: str
    key_facts: list[str]
    safety_concerns: list[str]
    sources: list[str]

class PatientLeaflet(BaseModel):
    title: str
    what_is_it: str
    when_to_use: str
    how_to_use: str
    side_effects: list[str]
    warnings: list[str]
    when_to_see_doctor: str

# Task 1: Research with typed output
research_task = Task(
    description="Research {drug_name} for a patient information leaflet",
    expected_output="Research data as JSON with drug_name, key_facts, safety_concerns, and sources",
    agent=researcher,
    output_pydantic=ResearchData,
)

# Task 2: Write with typed output (context from Task 1)
write_task = Task(
    description=(
        "Write a patient information leaflet for {drug_name} "
        "using the research data provided in context."
    ),
    expected_output=(
        "Complete patient leaflet as JSON with all required fields: "
        "title, what_is_it, when_to_use, how_to_use, side_effects, warnings, when_to_see_doctor"
    ),
    agent=writer,
    context=[research_task],
    output_pydantic=PatientLeaflet,
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
)

result = crew.kickoff(inputs={"drug_name": "Ibuprofen"})

# Access typed results
research: ResearchData = research_task.output.pydantic
leaflet: PatientLeaflet = result.pydantic

# Use in downstream systems
print(f"Generated leaflet for: {leaflet.title}")
print(f"Side effects: {leaflet.side_effects}")

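# Save to database. `db` is a placeholder for your own async storage client,
# so this call must run inside an async function.
await db.save_drug_leaflet(
    drug_name=research.drug_name,
    content=leaflet.model_dump(),
    sources=research.sources,
)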
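The await call above only works inside a coroutine, and db is not defined in the snippet. A minimal sketch of how the save step could be wrapped, assuming a hypothetical in-memory store in place of a real database client, might look like this. InMemoryLeafletStore, persist, and the placeholder values are all invented for illustration.

```python
import asyncio


class InMemoryLeafletStore:
    """Hypothetical stand-in for the `db` object; not a real library."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    async def save_drug_leaflet(
        self, drug_name: str, content: dict, sources: list[str]
    ) -> int:
        # Store the record and return the number of rows saved so far
        self.rows.append(
            {"drug_name": drug_name, "content": content, "sources": sources}
        )
        return len(self.rows)


async def persist(db: InMemoryLeafletStore) -> int:
    # In the pipeline, content would be leaflet.model_dump() and
    # sources would be research.sources; placeholders are used here
    return await db.save_drug_leaflet(
        drug_name="Ibuprofen",
        content={"title": "Ibuprofen: patient information"},
        sources=["placeholder-source"],
    )


db = InMemoryLeafletStore()
saved_count = asyncio.run(persist(db))  # run the coroutine from synchronous code
print(saved_count)
```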

output_file: Save Task Output to Disk

Python

# Save task output directly to a file
report_task = Task(
    description="Generate a comprehensive drug safety report for {drug_name}",
    expected_output="Full safety report in markdown format",
    agent=safety_officer,
    output_file="drug_safety_report.md",  # Saved after task completes
)

# Assumes `safety_officer` is defined elsewhere and `crew` includes report_task
crew.kickoff(inputs={"drug_name": "Warfarin"})
# File "drug_safety_report.md" is now created
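Downstream code that consumes the saved file may want to guard against the case where the task never wrote it. The helper below is a hypothetical sketch using only the standard library; a temporary directory stands in for the project folder, and the report text is a placeholder rather than real task output.

```python
import tempfile
from pathlib import Path
from typing import Optional


def load_report(path: Path) -> Optional[str]:
    """Return the report text, or None if the file was never written."""
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8")


# Demonstration with a temporary directory standing in for the project folder
with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "drug_safety_report.md"
    missing = load_report(report_path)  # nothing written yet
    report_path.write_text("# Warfarin Safety Report\n", encoding="utf-8")
    loaded = load_report(report_path)

print(missing)
print(loaded)
```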

Validation and Error Handling

If the agent's output doesn't match the Pydantic schema, CrewAI will attempt to retry. If the retries fail, task.output.pydantic will be None and task.output.raw will contain the unstructured response, so check for None before reading any fields:

Python

result = crew.kickoff()

report = result.pydantic
if report is None:
    # Structured output failed; fall back to the raw text
    raw_text = result.raw
    report = fallback_parse(raw_text)  # Your custom extraction logic (not part of CrewAI)
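The fallback_parse function is left to the reader above. One way it might be written, assuming the raw response still contains a JSON object somewhere in the text, is to slice from the first opening brace to the last closing brace and validate the result. MiniReport is an illustrative two-field schema, and this approach is only a sketch: it will not recover output that contains no JSON at all.

```python
import json
from typing import Optional

from pydantic import BaseModel, ValidationError


class MiniReport(BaseModel):
    """Illustrative two-field schema, kept small for the example."""
    severity: str
    recommendation: str


def fallback_parse(raw_text: str) -> Optional[MiniReport]:
    """Try to recover a MiniReport from text that embeds a JSON object."""
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end <= start:
        return None  # no JSON-looking span found
    try:
        data = json.loads(raw_text[start : end + 1])
        return MiniReport.model_validate(data)
    except (json.JSONDecodeError, ValidationError):
        return None  # malformed JSON or wrong shape


raw = (
    "Here is the report:\n"
    '{"severity": "major", "recommendation": "Avoid combination"}\n'
    "Let me know if you need more."
)
recovered = fallback_parse(raw)
unparseable = fallback_parse("No JSON here")

print(recovered)
print(unparseable)
```

Returning None on failure keeps the caller's check symmetrical with the result.pydantic check: either way, a missing report is signalled the same way.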

To improve structured output reliability:

Write very specific expected_output descriptions that describe the schema
Include field names in the expected output description
Use description= on Pydantic fields to guide the agent
Keep schemas focused: a model with fewer fields gives the agent less to get wrong
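The first three tips can be combined by generating the expected_output text from the model itself, so the field names and descriptions never drift apart. The sketch below assumes Pydantic v2 (it reads the model_fields mapping); DrugSummary and describe_schema are hypothetical names introduced for illustration.

```python
from pydantic import BaseModel, Field


class DrugSummary(BaseModel):
    """Illustrative schema; not from the original examples."""
    name: str = Field(description="Generic drug name")
    severity: str = Field(description="One of: minor, moderate, major, contraindicated")
    monitoring_parameters: list[str] = Field(description="What to monitor")


def describe_schema(model: type[BaseModel]) -> str:
    """Build an expected_output string naming every field and its description."""
    parts = []
    for field_name, field_info in model.model_fields.items():
        parts.append(f"{field_name} ({field_info.description})")
    return "A JSON object with these fields: " + "; ".join(parts)


expected_output = describe_schema(DrugSummary)
print(expected_output)
```

The resulting string could then be passed as expected_output when constructing the Task.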
