Structured Output and Pydantic Models
Force CrewAI task output into Pydantic models with output_pydantic, or get raw JSON with output_json. Access typed results for downstream processing.
Why Structured Output Matters
By default, CrewAI tasks return free-form text. This works for human-readable reports but not for programmatic processing. If your downstream code needs to extract specific fields (drug name, severity level, list of interactions), you need structured output.
CrewAI supports two structured output modes:
- output_pydantic: returns a typed Pydantic model instance
- output_json: returns a Python dictionary
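To make the difference between the two modes concrete, here is a minimal sketch using Pydantic alone, with no CrewAI call involved. It assumes Pydantic v2 (model_validate_json and model_dump). The Interaction class and the raw JSON string are made up for illustration and stand in for what an agent might return.

```python
from pydantic import BaseModel, ValidationError


class Interaction(BaseModel):  # hypothetical schema, for illustration only
    drug_a: str
    drug_b: str
    severity: str


# Stand-in for the JSON text an agent might produce
raw = '{"drug_a": "Warfarin", "drug_b": "Ibuprofen", "severity": "major"}'

# Typed model instance: validated fields, attribute access
typed = Interaction.model_validate_json(raw)
print(typed.severity)

# Plain dictionary: key access, no further type guarantees
as_dict = typed.model_dump()
print(as_dict["severity"])

# What the model adds is validation: missing fields raise instead of passing silently
try:
    Interaction.model_validate_json('{"drug_a": "Warfarin"}')
except ValidationError as exc:
    missing = [err["loc"][0] for err in exc.errors()]
    print(f"Missing fields: {missing}")
```

The typed form suits code that benefits from attribute access and type checking; the dictionary form is convenient when the result is passed straight to a JSON API or a database layer.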
output_pydantic
Define a Pydantic model for the expected output shape, then assign it to the task:
from crewai import Agent, Task, Crew
from pydantic import BaseModel, Field

# Define the output schema
class DrugInteractionReport(BaseModel):
    drug_a: str = Field(description="First drug name")
    drug_b: str = Field(description="Second drug name")
    severity: str = Field(description="Interaction severity: minor, moderate, major, contraindicated")
    mechanism: str = Field(description="Pharmacological mechanism of the interaction")
    clinical_effects: list[str] = Field(description="List of clinical effects of the interaction")
    management: str = Field(description="How to manage this interaction clinically")
    monitoring_parameters: list[str] = Field(description="What to monitor if both drugs are given")
    recommendation: str = Field(description="Final clinical recommendation")

# Agent
pharmacologist = Agent(
    role="Clinical Pharmacologist",
    goal="Analyze drug interactions accurately",
    backstory="Board-certified pharmacologist specializing in drug interactions",
    verbose=True,
)

# Task with structured output
interaction_task = Task(
    description=(
        "Analyze the drug interaction between {drug_a} and {drug_b}. "
        "Provide a comprehensive interaction report."
    ),
    expected_output=(
        "A complete drug interaction report with all required fields: "
        "severity (minor/moderate/major/contraindicated), mechanism, "
        "clinical effects, management, monitoring parameters, and recommendation."
    ),
    agent=pharmacologist,
    output_pydantic=DrugInteractionReport,  # Force structured output
)

crew = Crew(agents=[pharmacologist], tasks=[interaction_task])
result = crew.kickoff(inputs={"drug_a": "Warfarin", "drug_b": "Ibuprofen"})

# Access structured output
report: DrugInteractionReport = result.pydantic
print(f"Severity: {report.severity}")
print(f"Mechanism: {report.mechanism}")
print(f"Effects: {report.clinical_effects}")
print(f"Recommendation: {report.recommendation}")

output_json
Pass a Pydantic model class to output_json when you want a plain dict rather than a typed model instance. The model still defines the expected shape:
class DrugProfile(BaseModel):
    name: str
    drug_class: str
    mechanism: str
    indications: list[str]
    side_effects: list[str]
    interactions: list[str]

# Assumes `researcher` is an Agent defined elsewhere, like `pharmacologist` above
profile_task = Task(
    description="Create a drug profile for {drug_name}",
    expected_output="Complete drug profile with all standard pharmaceutical fields",
    agent=researcher,
    output_json=DrugProfile,  # Validates against schema but returns dict
)

# The crew must include profile_task for its output to be the crew result
crew = Crew(agents=[researcher], tasks=[profile_task])
result = crew.kickoff(inputs={"drug_name": "Metformin"})

# Access as dictionary
profile_dict: dict = result.json_dict
print(profile_dict["name"])
print(profile_dict["indications"])

Accessing Task vs Crew Output
result = crew.kickoff()

# Crew-level output (from the last task)
print(result.raw)          # String output of final task
print(result.pydantic)     # Pydantic model (if output_pydantic was set on last task)
print(result.json_dict)    # Dict (if output_json was set on last task)
print(result.token_usage)  # Token usage across all tasks

# Task-level output (any task in the crew)
for task in crew.tasks:
    output = task.output
    print(f"\nTask: {task.description[:60]}")
    print(f"  Raw: {output.raw[:100]}")
    if output.pydantic:
        print(f"  Typed: {output.pydantic}")
    if output.json_dict:
        print(f"  Dict: {output.json_dict}")

Multi-Task Pipeline with Typed Interfaces
Typed output between tasks creates a clean, debuggable pipeline:
from crewai import Crew, Task
from pydantic import BaseModel

class ResearchData(BaseModel):
    drug_name: str
    key_facts: list[str]
    safety_concerns: list[str]
    sources: list[str]

class PatientLeaflet(BaseModel):
    title: str
    what_is_it: str
    when_to_use: str
    how_to_use: str
    side_effects: list[str]
    warnings: list[str]
    when_to_see_doctor: str

# Assumes `researcher` and `writer` are Agent instances defined elsewhere

# Task 1: Research with typed output
research_task = Task(
    description="Research {drug_name} for a patient information leaflet",
    expected_output="Research data as JSON with drug_name, key_facts, safety_concerns, and sources",
    agent=researcher,
    output_pydantic=ResearchData,
)

# Task 2: Write with typed output (context from Task 1)
write_task = Task(
    description=(
        "Write a patient information leaflet for {drug_name} "
        "using the research data provided in context."
    ),
    expected_output=(
        "Complete patient leaflet as JSON with all required fields: "
        "title, what_is_it, when_to_use, how_to_use, side_effects, warnings, when_to_see_doctor"
    ),
    agent=writer,
    context=[research_task],
    output_pydantic=PatientLeaflet,
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
)
result = crew.kickoff(inputs={"drug_name": "Ibuprofen"})

# Access typed results
research: ResearchData = research_task.output.pydantic
leaflet: PatientLeaflet = result.pydantic

# Use in downstream systems
print(f"Generated leaflet for: {leaflet.title}")
print(f"Side effects: {leaflet.side_effects}")

# Save to database. `db` stands in for your own async client (not part of CrewAI);
# `await` is only valid inside an async function.
async def save_leaflet(db) -> None:
    await db.save_drug_leaflet(
        drug_name=leaflet.title,
        content=leaflet.model_dump(),
        sources=research.sources,
    )

output_file: Save Task Output to Disk
# Save task output directly to a file
report_task = Task(
    description="Generate a comprehensive drug safety report for {drug_name}",
    expected_output="Full safety report in markdown format",
    agent=safety_officer,
    output_file="drug_safety_report.md",  # Saved after task completes
)

crew.kickoff(inputs={"drug_name": "Warfarin"})
# File "drug_safety_report.md" is now created

Validation and Error Handling
If the agent doesn't produce output matching the Pydantic schema, CrewAI will attempt to retry. If retries fail, task.output.pydantic will be None and task.output.raw will contain the unstructured response:
result = crew.kickoff()
report = result.pydantic
if report is None:
    # Structured output failed: handle gracefully
    raw_text = result.raw
    report = fallback_parse(raw_text)  # Your custom extraction logic

To improve structured output reliability:
- Write very specific expected_output descriptions that describe the schema
- Include field names in the expected output description
- Use description= on Pydantic fields to guide the agent
- Keep schemas focused: fewer fields = higher success rate
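The error-handling snippet above calls fallback_parse without defining it. One possible sketch, using only the standard library, scans the raw text for the first JSON object that decodes cleanly. This is an assumption about what such a helper might do, not part of CrewAI, and the sample strings are invented for illustration.

```python
import json
from typing import Any, Optional


def fallback_parse(raw_text: str) -> Optional[dict[str, Any]]:
    """Best-effort extraction of the first JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(raw_text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it
            obj, _end = decoder.raw_decode(raw_text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        if isinstance(obj, dict):
            return obj
    return None  # nothing usable found; the caller decides what to do next


# Hypothetical agent response with prose around the JSON payload
sample = (
    "Here is the report:\n"
    '{"severity": "major", "clinical_effects": ["bleeding"]}\n'
    "Let me know if you need more detail."
)
parsed = fallback_parse(sample)
print(parsed)
```

The returned dict can then be passed back through your model, for example DrugInteractionReport.model_validate(parsed) in Pydantic v2, so the fallback path still goes through schema validation.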