Defining Agents in CrewAI

The Agent Is the Heart of CrewAI

Every capability in a CrewAI system flows through agents. Tools, memory, delegation, output quality — all of it depends on how well your agents are defined. A poorly defined agent with a great task description still produces mediocre results. A well-defined agent with a clear backstory produces consistent, high-quality work.

This lesson covers every parameter of the Agent class and explains the reasoning behind each design choice.

Full Agent Constructor

Python

from crewai import Agent, LLM

agent = Agent(
    # --- Identity ---
    role="Senior Pharmacovigilance Scientist",
    goal=(
        "Identify, assess, and report drug safety signals with scientific rigor, "
        "ensuring patient safety and regulatory compliance."
    ),
    backstory=(
        "You spent 10 years at the EMA reviewing MAAs and post-market safety reports "
        "before joining industry. You apply ICH E2A guidelines instinctively, "
        "always distinguish causality from association, "
        "and write in precise regulatory English. "
        "You never approve a signal without a clinical rationale."
    ),

    # --- Model ---
    llm=LLM(model="gpt-4o"),

    # --- Capabilities ---
    tools=[],                    # List of Tool instances
    function_calling_llm=None,   # Separate LLM for tool calling (optional)

    # --- Behavior ---
    allow_delegation=False,      # Can this agent delegate to others?
    max_iter=15,                 # Max ReAct iterations before giving up
    max_rpm=None,                # Max LLM calls per minute (rate limiting)
    max_execution_time=None,     # Max seconds per task execution

    # --- Memory ---
    memory=True,                 # Enable memory for this agent

    # --- Output ---
    verbose=True,                # Log reasoning steps to console

    # --- Advanced ---
    system_template=None,        # Override the system prompt template
    prompt_template=None,        # Override the prompt template
    response_template=None,      # Override the response template
)

Role: The Job Title

The role is the shortest identity statement for an agent. It appears in the system prompt and is used by the manager (in hierarchical mode) to decide task routing.

Rules for a good role:

Specific, not generic ("Clinical Data Manager" not "Data Person")
A real professional title if possible (LLMs have strong priors about job roles)
Reflects the actual capability you want

Python

# Weak
role="Helper"

# Better
role="Research Assistant"

# Best
role="Clinical Systematic Review Specialist"

Goal: The North Star

The goal states what the agent is optimizing for. It should be specific enough to guide behavior in ambiguous situations.

Python

# Vague goal — agent will do whatever seems "helpful"
goal="Help with tasks."

# Specific goal — agent knows what "good" means
goal=(
    "Produce safety signal assessments that meet ICH E2A standards: "
    "temporally plausible, biologically coherent, and supported by at least "
    "two independent data sources before escalating to the PSUR."
)

The goal is especially important when:

The agent has multiple tools and must choose which to use
The task description is ambiguous
The agent must decide when it has done enough work

Backstory: The System Prompt That Shapes Everything

The backstory is injected into the agent's system prompt. It sets:

Domain expertise and vocabulary
Behavioral tendencies (conservative vs. bold, formal vs. casual)
How the agent handles uncertainty
What it cites and how it structures output

Anatomy of a Strong Backstory

Python

backstory=(
    # 1. Professional history (gives the LLM domain priors to activate)
    "You spent 8 years as a regulatory affairs specialist at a major EU pharma company, "
    "reviewing SmPCs and PILs for biologics. "

    # 2. Specific domain knowledge (more specific = more reliable output)
    "You are deeply familiar with EudraVigilance, the PSUR/PBRER format, "
    "and the EMA's good pharmacovigilance practice (GVP) modules. "

    # 3. Behavioral priors (shapes how the agent handles edge cases)
    "You are systematic and conservative. You never approve a claim without "
    "a regulatory basis. When evidence is ambiguous, you flag it rather than "
    "resolve it in the document. "

    # 4. Output style (shapes formatting and language)
    "You write in precise regulatory English. Your sections are numbered. "
    "You always cite the specific GVP module or CFR section you are applying."
)

allow_delegation: Collaborative vs. Focused Agents

When allow_delegation=True, an agent can hand off a subtask to another agent in the crew if it determines another agent is better suited.

Python

# A generalist orchestrator agent that routes to specialists
orchestrator = Agent(
    role="Research Orchestrator",
    goal="Coordinate research tasks and delegate to specialist agents",
    backstory=(
        "You are a principal scientist who leads a team of specialists. "
        "You know each team member's strengths and route tasks to the right person."
    ),
    allow_delegation=True,   # Can delegate to other crew members
    verbose=True,
)

# Specialist agents that focus and don't delegate
pharmacologist = Agent(
    role="Clinical Pharmacologist",
    goal="Analyze drug mechanisms and PK/PD properties",
    backstory="You are a PharmD with expertise in clinical pharmacokinetics.",
    allow_delegation=False,  # Stays focused on its own work
    verbose=True,
)

Default is False. Only set delegation to True on agents that are explicitly meant to orchestrate. Otherwise agents waste time second-guessing their assignments.

max_iter: Preventing Runaway Agents

Agents use a ReAct (Reasoning + Acting) loop: think → act (tool call) → observe → repeat. Without a limit, an agent could loop indefinitely when stuck.

Python

agent = Agent(
    role="Data Analyst",
    goal="Find the answer",
    backstory="You are thorough and persistent.",
    max_iter=10,   # Stop after 10 reasoning/action cycles
    verbose=True,
)

The default is 15. For simple tasks (no tool use), agents typically finish in 1-3 iterations. For complex tool-using tasks (web search, database queries), 10-20 is reasonable. Higher values = more thorough but more expensive.

function_calling_llm: Splitting Reasoning and Tool Use

You can use a separate, cheaper LLM for tool calling while using a more powerful LLM for reasoning:

Python

from crewai import LLM

reasoning_llm = LLM(model="gpt-4o")          # Strong reasoning
tool_llm = LLM(model="gpt-4o-mini")          # Cheap tool calling

agent = Agent(
    role="Research Analyst",
    goal="Find information efficiently",
    backstory="You are a resourceful researcher.",
    llm=reasoning_llm,
    function_calling_llm=tool_llm,            # Cheaper model handles tool calls
    verbose=True,
)

This can reduce cost significantly for tool-heavy agents where the tool calls themselves are simple (e.g., "search for X" does not need GPT-4o).

Complete Example: Three-Agent Pharmaceutical Pipeline

Python

from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import SerperDevTool, FileReadTool

llm = LLM(model="gpt-4o")
search_tool = SerperDevTool()
file_tool = FileReadTool()

# Agent 1: Research Analyst
research_analyst = Agent(
    role="Clinical Research Analyst",
    goal=(
        "Find high-quality clinical evidence for pharmaceutical compounds, "
        "prioritizing Phase 3 RCTs and systematic reviews."
    ),
    backstory=(
        "You hold a PhD in clinical pharmacology and have published "
        "systematic reviews in the Cochrane Database. "
        "You evaluate evidence using GRADE criteria and always note "
        "the quality of evidence alongside your findings. "
        "You distrust industry-sponsored studies without independent replication."
    ),
    tools=[search_tool],
    llm=llm,
    allow_delegation=False,
    max_iter=15,
    verbose=True,
)

# Agent 2: Medical Writer
medical_writer = Agent(
    role="Senior Medical Writer",
    goal=(
        "Transform clinical evidence into clear, accurate, audience-appropriate "
        "medical communications that meet AMWA standards."
    ),
    backstory=(
        "You have 12 years of medical writing experience at a top-tier CRO. "
        "You have written over 150 clinical study reports, SmPCs, and MSLs. "
        "You follow AMWA guidelines strictly, avoid promotional language, "
        "and always match tone and complexity to the target audience. "
        "You structure documents with numbered sections and clear headings."
    ),
    tools=[file_tool],
    llm=llm,
    allow_delegation=False,
    max_iter=10,
    verbose=True,
)

# Agent 3: Regulatory Reviewer
regulatory_reviewer = Agent(
    role="Regulatory Affairs Specialist",
    goal=(
        "Ensure all medical communications comply with applicable regulations "
        "and accurately represent the clinical evidence without overstatement."
    ),
    backstory=(
        "You spent 7 years at the FDA's Center for Drug Evaluation and Research "
        "before moving to industry. You apply 21 CFR Part 202 and EMA guidelines "
        "to every document you review. "
        "You are conservative by nature: if a claim is not directly supported "
        "by cited evidence, you flag it for revision. "
        "Your approval means the document is ready for external distribution."
    ),
    tools=[],
    llm=llm,
    allow_delegation=False,
    max_iter=8,
    verbose=True,
)

# Tasks
research_task = Task(
    description=(
        "Research the clinical evidence for tirzepatide in adults with type 2 diabetes. "
        "Focus on the SURPASS trial program. Include efficacy (HbA1c reduction, weight loss) "
        "and safety (GI adverse events, thyroid findings) from the pivotal Phase 3 studies."
    ),
    expected_output=(
        "A structured evidence summary with: "
        "1) Overview of the SURPASS program (trials included), "
        "2) Efficacy table (trial name, comparator, HbA1c reduction, weight loss), "
        "3) Safety summary (adverse events above 5% incidence with rates), "
        "4) Evidence quality assessment using GRADE criteria."
    ),
    agent=research_analyst,
)

writing_task = Task(
    description=(
        "Write a 500-word HCP-facing clinical summary on tirzepatide for type 2 diabetes. "
        "Target audience: endocrinologists at a US academic medical center. "
        "Do not use promotional language. All efficacy claims must reference the SURPASS program."
    ),
    expected_output=(
        "A structured 500-word clinical summary with: "
        "Title, Clinical Background (1 paragraph), "
        "Efficacy Evidence (2 paragraphs, referencing specific SURPASS trials), "
        "Safety Profile (1 paragraph), "
        "Clinical Considerations (1 paragraph with prescribing context). "
        "No superlatives. No promotional tone."
    ),
    agent=medical_writer,
    context=[research_task],
)

review_task = Task(
    description=(
        "Review the tirzepatide clinical summary for regulatory compliance. "
        "Apply 21 CFR Part 202 standards for prescription drug promotional materials. "
        "Specifically check: are all efficacy claims supported by the cited evidence? "
        "Is the safety information presented fairly and prominently? "
        "Is there any promotional language that overstates benefit or understates risk?"
    ),
    expected_output=(
        "A regulatory review report with: "
        "Status (Approved / Approved with Required Changes / Rejected), "
        "Compliance Issues (numbered list — each issue must cite the specific regulation violated), "
        "Required Changes (numbered list of exact changes needed), "
        "Reviewer Notes (any additional observations). "
        "If the document is Approved, Required Changes list should be empty."
    ),
    agent=regulatory_reviewer,
    context=[writing_task],
)

# Crew
crew = Crew(
    agents=[research_analyst, medical_writer, regulatory_reviewer],
    tasks=[research_task, writing_task, review_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print("=== REGULATORY REVIEW STATUS ===")
print(result.raw)

Agent Design Checklist

Before deploying an agent in production, verify:

[ ] Role is specific enough to activate domain priors in the LLM
[ ] Goal is measurable — the agent knows when it has succeeded
[ ] Backstory sets expertise level, behavioral tendencies, and output style
[ ] Tools are only the tools this agent actually needs
[ ] allow_delegation is False unless the agent is an orchestrator
[ ] max_iter is set appropriate to task complexity
[ ] verbose=True during development, consider disabling in production

Summary

The Agent class has many parameters, but three matter most: role, goal, and backstory. These three define the agent's identity, optimize its behavior, and shape its output style. Everything else — tools, memory, delegation, rate limits — is configuration. Get the identity right first, then tune the configuration.

Defining Agents in CrewAI

The Agent Is the Heart of CrewAI

Full Agent Constructor

Role: The Job Title

Goal: The North Star

Backstory: The System Prompt That Shapes Everything

Anatomy of a Strong Backstory

allow_delegation: Collaborative vs. Focused Agents

max_iter: Preventing Runaway Agents

function_calling_llm: Splitting Reasoning and Tool Use

Complete Example: Three-Agent Pharmaceutical Pipeline

Agent Design Checklist

Summary

Enjoyed this article?

Leave a comment