AssistantAgent vs UserProxyAgent
Deep dive into AutoGen's two core agent types: how AssistantAgent generates responses and how UserProxyAgent executes code and manages human input.
The Two Pillars of AutoGen
Every AutoGen workflow is built from two agent types. Understanding the difference is the most important conceptual leap in learning AutoGen.
| | AssistantAgent | UserProxyAgent |
|---|---|---|
| Backed by LLM? | Yes | No (by default) |
| Role | Think and generate | Execute and relay |
| Calls the API? | Every turn | Never (unless configured) |
| Executes code? | No | Yes (if configured) |
| Represents | An AI specialist | A human or automation layer |
Both inherit from ConversableAgent, which means both can send and receive messages. The difference is what happens when it is their turn to respond.
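To make the shared-base idea concrete, here is a toy model in plain Python. The names `ToyAgent`, `ToyAssistant`, `ToyUserProxy`, and `fake_llm` are hypothetical stand-ins invented for illustration; this is a sketch of the concept, not AutoGen's actual implementation.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (sender name, content)


class ToyAgent:
    """Hypothetical base class: every agent can send and receive messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.history: List[Message] = []

    def send(self, content: str, recipient: "ToyAgent") -> None:
        self.history.append((self.name, content))
        recipient.receive(content, self)

    def receive(self, content: str, sender: "ToyAgent") -> None:
        self.history.append((sender.name, content))

    def generate_reply(self) -> str:
        raise NotImplementedError


class ToyAssistant(ToyAgent):
    """Replies by handing its whole history to a model callable."""

    def __init__(self, name: str, llm: Callable[[List[Message]], str]) -> None:
        super().__init__(name)
        self.llm = llm

    def generate_reply(self) -> str:
        return self.llm(self.history)


class ToyUserProxy(ToyAgent):
    """Replies with fixed logic; no model call is involved."""

    def generate_reply(self) -> str:
        return "Please continue."


def fake_llm(history: List[Message]) -> str:
    # Stand-in for a real model call: report how many messages were seen.
    return f"I have read {len(history)} message(s)."


proxy = ToyUserProxy("user_proxy")
assistant = ToyAssistant("assistant", fake_llm)
proxy.send("Write a function.", assistant)
print(assistant.generate_reply())
print(proxy.generate_reply())
```

Both toy agents share `send` and `receive`; only `generate_reply` differs, which mirrors the distinction drawn in the table.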
AssistantAgent In Depth
AssistantAgent is an LLM-backed agent. When it is its turn to speak, it sends the full conversation history to the configured model and returns the generated text as its message.
Full Configuration
import autogen

assistant = autogen.AssistantAgent(
    name="python_expert",
    # LLM configuration: which model to call and how
    llm_config={
        "config_list": [
            {
                "model": "gpt-4o",
                "api_key": "YOUR_OPENAI_API_KEY",
                # Optional: base_url for Azure OpenAI or other providers
                # "base_url": "https://YOUR_RESOURCE.openai.azure.com/",
                # "api_type": "azure",
                # "api_version": "2024-02-15-preview",
            }
        ],
        "temperature": 0,  # 0 = deterministic output (good for code)
        "timeout": 60,     # seconds before timing out the LLM call
        "cache_seed": 42,  # cache responses with this seed (speeds up dev)
    },
    # The agent's persona and instructions
    system_message="""You are an expert Python engineer.
Write clean, type-annotated Python code.
Include docstrings for all functions.
After writing code, add a brief explanation.
End your final message with TERMINATE.""",
    # Maximum consecutive replies from this agent before forcing a human check
    max_consecutive_auto_reply=10,
)
What Happens When AssistantAgent Responds
- AutoGen collects the full conversation history as a list of message dicts
- The system message is prepended as {"role": "system", "content": ...}
- The full list is sent to the LLM via the configured API
- The LLM response text becomes the agent's next message
- AutoGen checks whether the response contains a fenced code block (for example, a triple-backtick block tagged python)
If a code block is found and the receiving agent has code execution enabled, AutoGen automatically extracts and queues it for execution.
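The steps above can be sketched with the standard library alone. `build_request` and `extract_code_blocks` are hypothetical helper names, and the regular expression is a simplified assumption about how fenced blocks might be detected; AutoGen's own extraction logic may differ.

```python
import re
from typing import Dict, List, Tuple

# Built programmatically so the fence marker never appears literally in this example.
FENCE = "`" * 3
CODE_BLOCK_RE = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)


def build_request(system_message: str, history: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Prepend the system message to the conversation history."""
    return [{"role": "system", "content": system_message}] + list(history)


def extract_code_blocks(text: str) -> List[Tuple[str, str]]:
    """Return (language, code) pairs for each fenced block in a reply."""
    return [(lang or "unknown", code) for lang, code in CODE_BLOCK_RE.findall(text)]


history = [{"role": "user", "content": "Print a greeting."}]
request = build_request("You are an expert Python engineer.", history)

reply = "Here you go:\n" + FENCE + "python\nprint('hello')\n" + FENCE + "\nDone."
blocks = extract_code_blocks(reply)

print(request[0]["role"], len(request))
print(blocks)
```

The request list is what would be sent to the model; the extracted pairs are what a receiving agent with code execution enabled would go on to run.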
Using Multiple LLM Configs (Fallback)
AutoGen supports fallback configs. If the first model fails, it tries the next:
llm_config = {
    "config_list": [
        {
            "model": "gpt-4o",
            "api_key": "YOUR_PRIMARY_KEY",
        },
        {
            "model": "gpt-4o-mini",
            "api_key": "YOUR_FALLBACK_KEY",
        },
    ],
    "temperature": 0,
}
AutoGen will automatically fall back to gpt-4o-mini if gpt-4o returns an error (rate limit, timeout, etc.).
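As a rough illustration of the fallback idea, the sketch below tries each entry in order and returns the first success. `call_with_fallback` and `fake_call` are hypothetical names, and the simulated failure stands in for a real API error; this shows the general pattern, not AutoGen's internal client code.

```python
from typing import Callable, Dict, List


def call_with_fallback(
    config_list: List[Dict[str, str]],
    call_model: Callable[[Dict[str, str]], str],
) -> str:
    """Try each config in order and return the first successful response."""
    errors = []
    for config in config_list:
        try:
            return call_model(config)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{config['model']}: {exc}")
    raise RuntimeError("All configs failed: " + "; ".join(errors))


def fake_call(config: Dict[str, str]) -> str:
    # Hypothetical stand-in for an API call: the primary model always fails.
    if config["model"] == "gpt-4o":
        raise TimeoutError("simulated rate limit")
    return f"response from {config['model']}"


configs = [{"model": "gpt-4o"}, {"model": "gpt-4o-mini"}]
print(call_with_fallback(configs, fake_call))
```

Collecting the per-model errors before raising makes it easier to see why every option failed when none succeeds.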
UserProxyAgent In Depth
UserProxyAgent is the workhorse. It represents the human in the loop, or a fully automated substitute for one. Its most important jobs are:
- Sending the initial task message
- Executing code blocks produced by the assistant
- Relaying execution results back to the assistant
- Deciding when to ask a real human for input
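A minimal sketch of the execute-and-relay jobs, assuming a local (non-Docker) run and only the standard library. `run_python_snippet` and the file name `snippet.py` are hypothetical; the result string imitates the "exitcode / Code output" format that appears in this lesson's transcript, and AutoGen's own executor may differ in its details.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python_snippet(code: str, timeout: int = 60) -> str:
    """Save code to a temporary file, run it, and format the result as a message."""
    with tempfile.TemporaryDirectory() as work_dir:
        script = Path(work_dir) / "snippet.py"  # hypothetical file name
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=work_dir,
            )
        except subprocess.TimeoutExpired:
            return "exitcode: 1 (execution failed)\nCode output: timed out"
    status = "execution succeeded" if proc.returncode == 0 else "execution failed"
    return f"exitcode: {proc.returncode} ({status})\nCode output: {proc.stdout}{proc.stderr}"


print(run_python_snippet("print(2 + 2)"))
```

Returning the exit code together with stdout and stderr gives the assistant enough information to notice a failure and propose a fix on its next turn.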
Full Configuration
user_proxy = autogen.UserProxyAgent(
    name="executor",
    # human_input_mode controls when real human input is requested
    human_input_mode="NEVER",
    # Maximum automated replies before requiring human input (safety limit)
    max_consecutive_auto_reply=10,
    # Function that returns True when the conversation should end
    is_termination_msg=lambda msg: (
        isinstance(msg.get("content"), str)
        and "TERMINATE" in msg["content"]
    ),
    # Code execution configuration
    code_execution_config={
        "work_dir": "coding_workspace",  # directory where code files are saved and run
        "use_docker": False,             # True = run in Docker container (safer)
        "timeout": 60,                   # seconds before killing a running process
        "last_n_messages": 3,            # how many recent messages to scan for code
    },
    # Reply sent when no code was executed and no human input is given
    default_auto_reply="Please continue.",
)
human_input_mode: The Three Modes Explained
This single parameter controls an enormous amount of behavior. Getting it wrong leads to either an agent that never stops or a workflow that constantly interrupts.
Mode 1: NEVER (Fully Automated)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    # "or" guards against messages whose content is None
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)
What happens: The agent never pauses to ask a human. It executes code, sends results, and continues until the termination condition is met or max_consecutive_auto_reply is reached.
Use when: Running in a CI/CD pipeline, automated testing, batch jobs, or any context where no human is watching.
Risk: Without a reliable termination condition, conversations can loop until you hit rate limits or token budgets.
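One way to reduce that risk is a stricter termination predicate. The sketch below is an assumption about what "reliable" could mean here, not a rule from AutoGen: it tolerates messages whose content is not text and only fires when the keyword closes the message, matching a system message that asks the assistant to end with TERMINATE.

```python
from typing import Any, Dict


def is_termination_msg(msg: Dict[str, Any]) -> bool:
    """Return True only when the message text ends with the TERMINATE keyword."""
    content = msg.get("content")
    if not isinstance(content, str):
        return False  # content may be None or non-text
    return content.rstrip().endswith("TERMINATE")


print(is_termination_msg({"content": "All tests pass. TERMINATE"}))
print(is_termination_msg({"content": "I will reply TERMINATE once the tests pass."}))
print(is_termination_msg({"content": None}))
```

The trade-off is that a reply ending in "TERMINATE." with trailing punctuation would not match, so the predicate and the system message need to agree on the exact convention.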
Mode 2: TERMINATE (Ask Only at the End)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=10,
    # "or" guards against messages whose content is None
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)
What happens: The agent runs automatically until the conversation ends (or max_consecutive_auto_reply is reached), then pauses and prompts the human for feedback. The human can type a new instruction to continue, or press Enter to stop.
Use when: You want the agent to work autonomously but want a human to review the result before any further action.
Practical pattern:
[Automated] assistant: Here is the analysis. TERMINATE
[TERMINATE mode pauses here]
> Human prompt: "The analysis looks wrong. Recalculate with Q2 data."
[Automated continues with new instruction]
Mode 3: ALWAYS (Human Reviews Every Turn)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="ALWAYS",
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)
What happens: After every single message from the assistant, the terminal displays the message and waits for human input. The human must type a response before the conversation continues.
Use when: Demoing, teaching, or debugging: situations where a human needs to approve every step before proceeding.
Warning: This mode is not suitable for production automation. The process will block indefinitely waiting for terminal input.
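The three modes can be condensed into one decision function. This is a simplified model of the behavior described in this section, with `should_prompt_human` as a hypothetical name; it is not AutoGen's actual control flow.

```python
def should_prompt_human(
    mode: str,
    is_termination: bool,
    auto_replies: int,
    max_auto_replies: int,
) -> bool:
    """Decide whether to pause for human input before the next reply."""
    if mode == "ALWAYS":
        return True  # a human reviews every turn
    if mode == "TERMINATE":
        # Ask only when the conversation would otherwise stop.
        return is_termination or auto_replies >= max_auto_replies
    if mode == "NEVER":
        return False  # fully automated
    raise ValueError(f"unknown human_input_mode: {mode!r}")


for mode in ("NEVER", "TERMINATE", "ALWAYS"):
    print(mode, should_prompt_human(mode, is_termination=True, auto_replies=3, max_auto_replies=10))
```

Written this way, the difference between NEVER and TERMINATE is visible at a glance: both run unattended mid-conversation, but only TERMINATE hands control back to a person at the stopping point.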
The Conversation Loop Between the Two Agents
Here is a concrete view of how the two agents alternate in a real workflow:
STEP 1: user_proxy sends initial task
--------------------------------------------------------------
user_proxy → assistant:
"Write a Python function that reads a CSV and returns summary stats."

STEP 2: assistant calls LLM, generates code
--------------------------------------------------------------
assistant → user_proxy:
"Here is the function:
```python
import pandas as pd

def summarise_csv(filepath: str) -> dict:
    df = pd.read_csv(filepath)
    return df.describe().to_dict()
```
This function reads the CSV into a DataFrame and returns descriptive statistics as a dictionary."

STEP 3: user_proxy detects code block, executes it
--------------------------------------------------------------
[AutoGen saves code to workspace/tmp_code.py and runs it]
[Output: no output, function definition only]
user_proxy → assistant:
"exitcode: 0 (execution succeeded)
Code output: (empty)"

STEP 4: assistant sees empty output, adds a test
--------------------------------------------------------------
assistant → user_proxy:
"The function was defined successfully. Let me add a quick test:
```python
import pandas as pd
import io

# Create sample data
sample_csv = io.StringIO('a,b,c\n1,2,3\n4,5,6\n7,8,9\n')
df = pd.read_csv(sample_csv)

def summarise_csv(df: pd.DataFrame) -> dict:
    return df.describe().to_dict()

result = summarise_csv(df)
print(result)
```
"

STEP 5: user_proxy executes the test
--------------------------------------------------------------
user_proxy → assistant:
"exitcode: 0 (execution succeeded)
Code output:
{'a': {'count': 3.0, 'mean': 4.0, ...}, 'b': ...}"

STEP 6: assistant sees results, wraps up
--------------------------------------------------------------
assistant → user_proxy:
"The function works correctly. The test confirms it returns
summary statistics for all numeric columns. TERMINATE"

STEP 7: user_proxy checks termination condition
--------------------------------------------------------------
is_termination_msg({"content": "... TERMINATE"}) → True
Conversation ends.
Combining Agents: The Complete Pattern
import os

import autogen

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
    "cache_seed": 42,
}

# The thinker
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="""You are a Python expert.
Write clean, tested code.
Always test your code before replying TERMINATE.
Reply TERMINATE only when you are confident the code works.""",
)

# The executor
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=8,
    is_termination_msg=lambda msg: (
        isinstance(msg.get("content"), str)
        and "TERMINATE" in msg["content"]
    ),
    code_execution_config={
        "work_dir": "autogen_workspace",
        "use_docker": False,
        "timeout": 30,
    },
)

# Start the conversation
result = user_proxy.initiate_chat(
    assistant,
    message=(
        "Write a Python function that takes a list of numbers and "
        "returns the median. Handle edge cases: empty list, single element, "
        "even-length list. Include at least 5 test cases."
    ),
)

# After completion
print("\n--- Conversation complete ---")
print(f"Total messages: {len(user_proxy.chat_messages[assistant])}")
When UserProxyAgent Has No LLM
By default, UserProxyAgent has llm_config=False, meaning it does not call any LLM. Instead, it uses simple logic to decide its next message:
- If code was executed: send the execution output as the next message
- If no code was executed and human_input_mode is NEVER: send default_auto_reply
- If human_input_mode is ALWAYS or TERMINATE triggers: wait for human input
You can give UserProxyAgent an LLM config too, making it capable of generating responses without human input. This is how you build agent networks where every participant is LLM-backed:
# UserProxyAgent with LLM: acts as a critic
critic_proxy = autogen.UserProxyAgent(
    name="critic",
    llm_config=llm_config,  # give it an LLM
    human_input_mode="NEVER",
    system_message="""You are a strict code reviewer.
When you receive code, point out any bugs, style issues, or missing tests.
If the code is correct, reply TERMINATE.""",
    code_execution_config=False,  # no code execution, just reviews
)
Summary
- AssistantAgent calls the LLM every turn: it is the AI thinker
- UserProxyAgent executes code and manages human interaction: it is the automation layer
- human_input_mode has three settings: NEVER (fully automated), TERMINATE (ask at end), ALWAYS (ask every turn)
- The conversation loop alternates between agents until is_termination_msg returns True or max_consecutive_auto_reply is reached
- UserProxyAgent can optionally be given an LLM config to become a fully autonomous agent
Next lesson: we build a complete, working two-agent conversation from scratch.