Learnixo

CrewAI Multi-Agents · Lesson 7 of 16

Agent Memory in CrewAI

Why Memory Matters

By default, each CrewAI task starts fresh. The agent reads the task description, the context from previous tasks (if you declared it), and its own identity — but it has no recollection of previous crew runs.

This is fine for one-shot jobs. But for agents that run repeatedly — daily safety monitoring, ongoing research synthesis, iterative document drafting — memory becomes essential. Without it:

  • The agent re-discovers the same information every run
  • It cannot build on previous conclusions
  • It cannot recognize named entities it has encountered before

CrewAI provides three memory systems that work together to solve these problems.


The Three Memory Systems

Short-Term Memory

Short-term memory persists within a single crew run. It stores the sequence of agent actions and observations, giving agents access to what they (and other agents) have done during the current execution.

Think of it as the agent's working memory for the current session.

Long-Term Memory

Long-term memory persists across crew runs. Results from previous runs are stored in a local SQLite database and retrieved via semantic similarity when relevant to the current task.

This allows an agent to "remember" that it researched metformin last week and pull those findings rather than re-fetching them.
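The store-then-recall idea can be sketched with the standard library alone. This is a toy illustration, not CrewAI's actual schema or retrieval logic: the table layout, the function names (open_store, save_result, recall), and the word-overlap score standing in for embedding similarity are all assumptions made for the example.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a SQLite table of past task results.

    Pass a file path instead of ":memory:" to keep results between runs.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS results (task TEXT, output TEXT)")
    return conn

def save_result(conn, task, output):
    """Persist one task result so a later run can find it."""
    conn.execute("INSERT INTO results VALUES (?, ?)", (task, output))
    conn.commit()

def overlap(a, b):
    """Toy relevance score: Jaccard overlap of lowercase words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def recall(conn, task, top_k=1):
    """Return the stored outputs whose task text best matches `task`."""
    rows = conn.execute("SELECT task, output FROM results").fetchall()
    ranked = sorted(rows, key=lambda r: overlap(task, r[0]), reverse=True)
    return [output for _, output in ranked[:top_k]]

conn = open_store()
save_result(conn, "research metformin safety", "metformin notes from last week")
save_result(conn, "summarize statin trials", "statin notes from last week")
print(recall(conn, "update metformin safety research"))
```

A real system would replace overlap with a similarity measure over embeddings, but the shape is the same: write results at the end of a run, rank stored results against the new task at the start of the next.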

Entity Memory

Entity memory extracts and stores named entities — people, organizations, drugs, compounds, diseases — that appear in the agent's reasoning. It uses embeddings to make entity relationships searchable.

This is useful for agents that process large volumes of documents and need to track what they have learned about specific entities over time.
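As a rough mental model, entity memory behaves like a lookup table keyed by entity name. The sketch below is a simplified stand-in, not CrewAI's implementation: EntityStore and its methods are hypothetical names, and exact-name matching replaces the embedding search described above.

```python
from collections import defaultdict

class EntityStore:
    """Toy entity memory: maps a normalized entity name to observed facts."""

    def __init__(self):
        self._facts = defaultdict(list)

    def remember(self, entity, fact):
        """Record a fact about an entity, skipping exact duplicates."""
        key = entity.strip().lower()
        if fact not in self._facts[key]:
            self._facts[key].append(fact)

    def lookup(self, entity):
        """Return everything recorded so far about an entity."""
        return list(self._facts.get(entity.strip().lower(), []))

store = EntityStore()
store.remember("Tirzepatide", "dual GIP and GLP-1 receptor agonist")
store.remember("tirzepatide", "compared with semaglutide in SURPASS-2")
store.remember("Semaglutide", "compared with tirzepatide in SURPASS-2")
print(store.lookup("TIRZEPATIDE"))
```

Normalizing the key is what lets facts from different documents accumulate under one entity even when the spelling or casing varies.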


Enabling Memory

Memory is enabled with a single parameter on the Crew:

Python
from crewai import Crew, Process

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    memory=True,         # Enables all three memory systems
    verbose=True,
)

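Some CrewAI versions also accept a memory flag on an individual agent. Check the Agent reference for your installed version before relying on it:

Python
from crewai import Agent

researcher = Agent(
    role="Research Analyst",
    goal="Find and retain information across sessions",
    backstory="You build on your previous research findings.",
    memory=True,         # Agent-level flag; availability depends on the CrewAI version
    verbose=True,
)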

How Long-Term Memory Is Stored

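CrewAI uses embeddings + SQLite for long-term memory storage. By default it uses OpenAI embeddings and writes its files to a default storage directory whose exact path depends on your operating system and CrewAI version. Set CREWAI_STORAGE_DIR to choose the location explicitly:

Python
import os
from crewai import Crew, Process

# Override the default storage location; set this before the crew runs
os.environ["CREWAI_STORAGE_DIR"] = "./crew_memory"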

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    memory=True,
    verbose=True,
)
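After a run, it can help to confirm that files actually landed in the directory you configured. The helper below is a minimal sketch using only the standard library. list_storage_files is a hypothetical name, and the file names it reports depend on the CrewAI version.

```python
import os
from pathlib import Path

def list_storage_files(env_var="CREWAI_STORAGE_DIR"):
    """Return relative paths of all files under the directory named by env_var.

    Returns an empty list when the variable is unset or the directory
    does not exist yet (for example, before the first crew run).
    """
    root = os.environ.get(env_var)
    if not root or not Path(root).is_dir():
        return []
    base = Path(root)
    return sorted(str(p.relative_to(base)) for p in base.rglob("*") if p.is_file())

print(list_storage_files())
```

An empty list after a completed run suggests the variable was set too late or memory was not enabled.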

Custom Embedder

You can configure which embedding model is used for memory:

Python
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",  # cheaper than ada-002
        }
    },
    verbose=True,
)

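For Azure (the provider name and config keys have varied between CrewAI releases, so confirm them against the embedder documentation for your version):

Python
import os

crew = Crew(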
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    memory=True,
    embedder={
        "provider": "azure_openai",
        "config": {
            "model": "text-embedding-ada-002",
            "deployment_name": "your-embedding-deployment",
            "api_base": "https://your-resource.openai.azure.com/",
            "api_key": os.getenv("AZURE_OPENAI_API_KEY"),
        }
    },
    verbose=True,
)

Practical Example: Memory Across Runs

Python
import os
from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import SerperDevTool

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; prefer setting this in your shell or a .env file
os.environ["CREWAI_STORAGE_DIR"] = "./pharmacovigilance_memory"

llm = LLM(model="gpt-4o")
search_tool = SerperDevTool()

# This agent's long-term memory accumulates across daily runs
safety_monitor = Agent(
    role="Drug Safety Monitor",
    goal=(
        "Monitor drug safety signals and build institutional knowledge "
        "about adverse event patterns over time."
    ),
    backstory=(
        "You are a pharmacovigilance specialist running daily safety reviews. "
        "You reference your previous findings to spot emerging patterns "
        "and avoid re-investigating signals you have already assessed."
    ),
    tools=[search_tool],
    llm=llm,
    memory=True,      # This agent remembers across runs
    verbose=True,
)

daily_monitoring_task = Task(
    description=(
        "Review today's adverse event reports for GLP-1 receptor agonists. "
        "Check if any signals are new or represent an escalation of previously noted patterns. "
        "Today's date: {today}."
    ),
    expected_output=(
        "A daily safety monitoring report with: "
        "1) New signals identified today (if any), "
        "2) Escalating signals (comparing to previous reports), "
        "3) Signals that have resolved or decreased, "
        "4) Recommended actions."
    ),
    agent=safety_monitor,
)

crew = Crew(
    agents=[safety_monitor],
    tasks=[daily_monitoring_task],
    process=Process.sequential,
    memory=True,
    verbose=True,
)

# Run 1: first day
result_day1 = crew.kickoff(inputs={"today": "2026-05-14"})

# Run 2: next day — agent recalls what it found on day 1
result_day2 = crew.kickoff(inputs={"today": "2026-05-15"})

print("Day 2 Report:")
print(result_day2.raw)

On the second run, long-term memory is queried for past findings relevant to the task. If the agent found a nausea signal on day 1, it can cite that signal when reviewing day 2's data instead of treating it as a new discovery.
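In a scheduled job, the date passed to kickoff would normally come from the clock rather than a hardcoded string. A minimal sketch, where build_inputs is a hypothetical helper name:

```python
from datetime import date

def build_inputs(run_date=None):
    """Build the kickoff inputs dict; `today` fills the {today} placeholder."""
    run_date = run_date or date.today()
    return {"today": run_date.isoformat()}

# Fixed date for a reproducible example; omit the argument in a real daily job.
print(build_inputs(date(2026, 5, 14)))
```

The result can be passed straight to crew.kickoff(inputs=build_inputs()).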


Entity Memory Example

Entity memory is most useful when agents process documents that mention many named entities:

Python
from crewai import Agent, Task, Crew, Process, LLM

llm = LLM(model="gpt-4o")

literature_analyst = Agent(
    role="Biomedical Literature Analyst",
    goal=(
        "Analyze biomedical literature and build a knowledge graph "
        "of drug-disease-mechanism relationships."
    ),
    backstory=(
        "You read clinical papers and extract structured relationships. "
        "You track which drugs you have analyzed, what diseases they treat, "
        "and what mechanisms are involved."
    ),
    llm=llm,
    memory=True,    # Entity memory extracts named entities (drugs, diseases, genes)
    verbose=True,
)

analysis_task = Task(
    description=(
        "Analyze the following abstract and extract all drug-disease relationships:\n\n"
        "Tirzepatide, a dual GIP and GLP-1 receptor agonist, demonstrated superior "
        "HbA1c reduction compared to semaglutide in the SURPASS-2 trial. "
        "Patients with type 2 diabetes showed 2.3% reduction with 15mg tirzepatide "
        "vs 1.9% with 1mg semaglutide. GI adverse events were comparable."
    ),
    expected_output=(
        "A structured extraction with: "
        "Drugs mentioned (list), Diseases mentioned (list), "
        "Drug-Disease relationships (table: drug, disease, relationship type, evidence), "
        "Mechanisms mentioned (list)."
    ),
    agent=literature_analyst,
)

crew = Crew(
    agents=[literature_analyst],
    tasks=[analysis_task],
    process=Process.sequential,
    memory=True,
    verbose=True,
)

result = crew.kickoff()
# Entity memory now stores: tirzepatide, semaglutide, GIP, GLP-1, type 2 diabetes, etc.
# Future runs can query: "what do you know about tirzepatide?"

When to Use Memory vs When to Skip It

| Scenario | Use Memory? | Reason |
|----------|-------------|--------|
| One-shot task (run once, done) | No | Overhead not worth it |
| Daily recurring monitoring | Yes | Agent builds on previous runs |
| Multiple tasks in one run with context | No | Use context=[] parameter instead |
| Agent processes 100s of documents over weeks | Yes | Entity + long-term memory pays off |
| Tight latency requirements | No | Memory retrieval adds latency |
| Sensitive/private data | Carefully | Memory is stored locally, not on CrewAI servers |
| Development / iteration | No | Memory from broken runs can pollute future runs |


Memory Overhead and Cost

Memory adds cost in two ways:

  1. Embedding calls: Every task completion triggers an embedding call to store the result in long-term memory. At roughly 1,000 tokens per task output and a list price of about $0.02 per million tokens for text-embedding-3-small, that is about $0.00002 per task, which is negligible. Check current pricing before budgeting.

  2. Retrieval context: Relevant memories are prepended to the agent's context window. If your agent has a large accumulated memory, this can add hundreds to thousands of tokens per call. Monitor token usage when memory is enabled.
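Both costs reduce to simple arithmetic, sketched below. The price, the per-call memory size, and the number of LLM calls are assumed inputs for illustration, not measured values. Substitute your provider's current rate and your own token counts.

```python
def embedding_cost_usd(tokens_per_task, tasks, price_per_million_usd):
    """Cost of embedding `tasks` outputs of `tokens_per_task` tokens each."""
    return tokens_per_task * tasks / 1_000_000 * price_per_million_usd

def retrieval_overhead_tokens(memory_tokens_per_call, llm_calls):
    """Extra prompt tokens added when memories are prepended to each call."""
    return memory_tokens_per_call * llm_calls

# Assumed inputs: replace with current pricing and your own measurements.
ASSUMED_PRICE_PER_MILLION = 0.02  # USD per 1M embedding tokens (assumption)

cost = embedding_cost_usd(
    tokens_per_task=1_000, tasks=1, price_per_million_usd=ASSUMED_PRICE_PER_MILLION
)
extra = retrieval_overhead_tokens(memory_tokens_per_call=500, llm_calls=10)

print(f"embedding cost per task: ${cost:.6f}")
print(f"extra prompt tokens per run: {extra}")
```

The embedding side usually stays tiny. The retrieval side scales with the number of LLM calls and is billed at the chat model's prompt rate, so it tends to dominate.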

To monitor:

Python
result = crew.kickoff()
print(result.token_usage)
# Illustrative output; the exact fields and formatting depend on the CrewAI version:
# TokenUsage(total_tokens=14523, prompt_tokens=12100, completion_tokens=2423, ...)

Resetting Memory

During development, stale memories from broken test runs can interfere. Clear them:

Python
# Clear all memory for a crew
crew.reset_memories(command_type="all")

# Or clear specific memory types
crew.reset_memories(command_type="short")
crew.reset_memories(command_type="long")
crew.reset_memories(command_type="entity")

Or delete the storage directory manually (here, the ./crew_memory path set through CREWAI_STORAGE_DIR earlier):

Bash
rm -rf ./crew_memory
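The same cleanup can be done from Python, which is convenient in test fixtures. This is a sketch under the assumption that CREWAI_STORAGE_DIR points at the memory directory. clear_storage is a hypothetical helper, not part of CrewAI, and it deliberately refuses to guess a path when the variable is unset.

```python
import os
import shutil
from pathlib import Path

def clear_storage(env_var="CREWAI_STORAGE_DIR"):
    """Delete the directory named by env_var.

    Returns True if a directory was removed, False if the variable is
    unset or the directory does not exist.
    """
    root = os.environ.get(env_var)
    if not root:
        return False  # never fall back to a guessed default path
    path = Path(root)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

Calling clear_storage() before each development run gives the crew a clean slate without touching any other directory.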

Summary

| Memory Type | Scope | Storage | Best For |
|-------------|-------|---------|----------|
| Short-term | Current run only | In-memory | Sharing context within a run |
| Long-term | Across runs | SQLite + embeddings | Recurring tasks that build knowledge |
| Entity memory | Across runs | SQLite + embeddings | Document-heavy pipelines |

Enable memory with memory=True on the Crew or individual Agent. Configure the embedder if you want a different model or provider. Clear memory between development iterations to avoid state pollution.