Human Input Mode: When to Ask the User

Human-in-the-Loop as a First-Class Feature

Most agent frameworks treat human involvement as an afterthought — you either get a fully autonomous agent or you build the approval UI yourself. AutoGen makes human oversight a first-class design decision through the human_input_mode parameter.

The question of "when should a human be involved?" is not just a UX question — it is a safety and reliability question. Autonomous agents make mistakes. The right human_input_mode setting determines whether those mistakes are caught before they cause damage.

The Three Modes

human_input_mode options:
─────────────────────────────────────────────────────────────
NEVER      → agent never pauses for human input
             good for: automated pipelines, testing, batch jobs

TERMINATE  → agent pauses only when the conversation would otherwise end
             (termination message received, or max_consecutive_auto_reply reached)
             good for: supervised automation, final review before action

ALWAYS     → agent pauses after every message from the assistant
             good for: demos, debugging, high-risk operations
─────────────────────────────────────────────────────────────

Mode 1: NEVER — Fully Automated

Python

import autogen
import os

llm_config = {
    "config_list": [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}],
    "temperature": 0,
}

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="""You are a Python code writer.
    Complete the given task, test your code, and reply TERMINATE when done.""",
)

# NEVER mode: fully automated, no human pauses
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config={"work_dir": "workspace", "use_docker": False, "timeout": 30},
)

user_proxy.initiate_chat(
    assistant,
    message="Write a function to check if a string is a palindrome. Test it.",
)

Behaviour: The conversation runs to completion without interruption. The process never blocks waiting on stdin.

When to use: CI/CD pipelines, scheduled batch jobs, automated testing harnesses, serverless functions. Anywhere a human cannot be at the keyboard.

Critical requirement: You MUST have a reliable termination condition. Without one, the conversation will loop until max_consecutive_auto_reply is hit.

Python

# NEVER mode without a good termination condition is dangerous.
# The default check only matches a message that is exactly "TERMINATE",
# so a reply such as "Done. TERMINATE" keeps the loop going until
# max_consecutive_auto_reply is exhausted, then the chat stops abruptly.
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    # Missing: is_termination_msg, so the run will usually use all 10 turns
)

Always pair NEVER with a meaningful is_termination_msg.
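A more defensive termination check might look like the sketch below. The helper name and its handling of `None` and list-style content are assumptions made for illustration, not part of AutoGen. The motivation: a lambda of the form `"TERMINATE" in msg.get("content", "")` raises a TypeError when a message carries `content=None`, and a plain substring test also fires on sentences that merely mention the keyword.

```python
from typing import Any, Dict


def is_termination_msg(msg: Dict[str, Any], keyword: str = "TERMINATE") -> bool:
    """Return True when the message text ends with the termination keyword."""
    content = msg.get("content")
    if content is None:
        # Messages without text (e.g. tool calls) never terminate the chat
        return False
    if isinstance(content, list):
        # List-style content: keep only the text parts
        content = " ".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    # Ignore trailing whitespace and a trailing "." or "!"
    text = str(content).rstrip().rstrip(".!")
    return text.endswith(keyword)
```

The function can be passed directly as `is_termination_msg=is_termination_msg` when constructing the user proxy. Matching only at the end of the message is a design choice: it is stricter than a substring test, so the system message should tell the assistant to finish its reply with the keyword.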

Mode 2: TERMINATE — Supervised Automation

Python

# TERMINATE mode: runs automatically, pauses only when conversation would end
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config={"work_dir": "workspace", "use_docker": False, "timeout": 30},
)

user_proxy.initiate_chat(
    assistant,
    message="Analyse the sales data and recommend three actions.",
)

What happens at termination: When the assistant sends a message containing "TERMINATE", instead of ending immediately, AutoGen prompts the human:

Please give feedback to assistant. Press enter or type 'exit' to stop the conversation:

The human can:

Press Enter: accept the termination and end the conversation
Type a new instruction: continue the conversation with additional guidance
Type 'exit': explicitly end the conversation
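The three options above can be modelled in a few lines of plain Python. This is a toy model of the decision, not AutoGen's own code; the function names `interpret_termination_input` and `replay` are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Outcome = Tuple[str, Optional[str]]


def interpret_termination_input(raw: str) -> Outcome:
    """Map what the human typed at the pause to an action."""
    text = raw.strip()
    if text == "" or text == "exit":
        # Enter and 'exit' both end the conversation
        return "stop", None
    # Anything else is sent back to the assistant as a new instruction
    return "continue", text


def replay(responses: Iterable[str]) -> List[Outcome]:
    """Apply the rules to a scripted series of responses, stopping at the first 'stop'."""
    outcomes: List[Outcome] = []
    for raw in responses:
        outcome = interpret_termination_input(raw)
        outcomes.append(outcome)
        if outcome[0] == "stop":
            break
    return outcomes
```

For example, `replay(["Use the Q2 data instead.", ""])` describes a human who redirects the assistant once and then accepts the second result.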

Conversation flow with TERMINATE mode:

[Automated]
user_proxy → assistant: "Analyse the sales data..."
assistant  → user_proxy: "Here is my analysis... TERMINATE"

[PAUSE — human sees the analysis]
> Human prompt: (presses Enter)

[Conversation ends]

Or if the human continues:

[PAUSE — human sees the analysis]
> Human prompt: "This is wrong. Use the Q2 data instead."

[Automated continues]
assistant → user_proxy: "I'll redo the analysis with Q2 data..."
assistant → user_proxy: "Updated analysis: ... TERMINATE"

[PAUSE again]
> Human prompt: (presses Enter to accept)

When to use: Any workflow where a human should review the final result before it is used, logged, or acted upon. Data analysis reports, code generation before deployment, content drafts before publishing.

Mode 3: ALWAYS — Human Reviews Every Turn

Python

# ALWAYS mode: human sees and can redirect every single message
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="ALWAYS",
    code_execution_config={"work_dir": "workspace", "use_docker": False, "timeout": 30},
    # Note: max_consecutive_auto_reply is less meaningful here since
    # the human controls the pace
)

user_proxy.initiate_chat(
    assistant,
    message="Walk me through building a REST API step by step.",
)

What happens: After every assistant message, the terminal displays:

Please give feedback to assistant. Press enter to skip and use auto-reply,
or type 'exit' to stop the conversation:
>

The human can:

Press Enter: let the agent continue automatically with its default reply
Type a message: override the automatic reply with custom input
Type 'exit': end the conversation

When to use:

Pair-programming sessions with an AI
Teaching or demonstration scenarios
Debugging — step through the agent's reasoning one turn at a time
High-stakes operations where every action must be human-approved

Warning: This mode blocks the process on stdin. Never use it in a server, container, or automated pipeline — the process will hang indefinitely.
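One way to guard against this is to check for an interactive terminal before building the agent and fail fast otherwise. The sketch below is an assumption-laden illustration: `require_interactive_stdin` is a hypothetical helper, and it treats TERMINATE as blocking too, since that mode also reads from stdin at its pause.

```python
import sys
from typing import Optional, TextIO

BLOCKING_MODES = ("TERMINATE", "ALWAYS")
VALID_MODES = ("NEVER",) + BLOCKING_MODES


def require_interactive_stdin(mode: str, stdin: Optional[TextIO] = None) -> str:
    """Return mode unchanged, or raise if it would wait on a non-interactive stdin."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown human_input_mode: {mode!r}")
    # sys.stdin can be None in some embedded or GUI environments
    stream = sys.stdin if stdin is None else stdin
    interactive = stream is not None and stream.isatty()
    if mode in BLOCKING_MODES and not interactive:
        raise RuntimeError(
            f"human_input_mode={mode!r} needs an interactive terminal, "
            "but stdin is not a TTY"
        )
    return mode
```

Raising an error rather than silently downgrading to NEVER is deliberate: if a workflow was configured for human approval, running it without that approval is usually worse than not running it at all.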

Controlling Conversation Length with max_turns

max_turns is a parameter on initiate_chat that limits how many turns the conversation can have, independent of max_consecutive_auto_reply:

Python

# max_consecutive_auto_reply limits consecutive automated replies
# max_turns limits the total number of back-and-forth exchanges

user_proxy.initiate_chat(
    assistant,
    message="Tell me three interesting facts about Python.",
    max_turns=3,           # conversation ends after 3 turns regardless
    silent=False,          # print conversation as it happens
)

The difference:

| Setting | Controls | Counts |
|---|---|---|
| max_consecutive_auto_reply | Consecutive automated replies from user_proxy | Resets when human intervenes |
| max_turns | Total exchanges in initiate_chat | Absolute limit |
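The interaction between the two limits can be illustrated with a small simulation. This is a simplified model written for this article, not AutoGen's implementation; `simulate_limits` and its return values are hypothetical.

```python
from typing import Iterable, Tuple


def simulate_limits(
    replies: Iterable[str],
    max_consecutive_auto_reply: int,
    max_turns: int,
) -> Tuple[int, str]:
    """Toy model: each entry in replies is "auto" or "human".

    Returns (turns_completed, reason_the_conversation_stopped).
    """
    consecutive_auto = 0
    turns = 0
    for kind in replies:
        if turns >= max_turns:
            # Absolute ceiling: never resets
            return turns, "max_turns"
        if kind == "human":
            # A human reply resets the consecutive counter
            consecutive_auto = 0
        else:
            if consecutive_auto >= max_consecutive_auto_reply:
                return turns, "max_consecutive_auto_reply"
            consecutive_auto += 1
        turns += 1
    return turns, "replies exhausted"
```

With ten automated replies and `max_consecutive_auto_reply=3`, the model stops after three turns. Interleaving human replies keeps resetting that counter, so only `max_turns` can end the run in that case.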

Python

# Practical: use both for defence in depth
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=15,    # safety limit within the agent
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)

user_proxy.initiate_chat(
    assistant,
    message="Build and test a complete web scraper.",
    max_turns=20,                      # hard limit on initiate_chat call
)

Designing Multi-Stage Workflows with Human Checkpoints

TERMINATE mode is powerful for workflows where you want automation between checkpoints but human approval at key decision points.

Here is a three-stage example: research → code generation → documentation, with a human review after each of the first two stages and a fully automated final stage.

Python

import autogen
import os

llm_config = {
    "config_list": [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}],
    "temperature": 0,
}

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="""You complete tasks step by step.
    End each stage with TERMINATE so the human can review before continuing.""",
)


def run_stage(stage_name: str, task: str, user_proxy: autogen.UserProxyAgent) -> str:
    """Run a single stage and return the last assistant message."""
    print(f"\n{'='*60}")
    print(f"STAGE: {stage_name}")
    print(f"{'='*60}")

    user_proxy.initiate_chat(
        assistant,
        message=task,
        clear_history=True,     # start fresh for each stage
    )

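    history = user_proxy.chat_messages[assistant]
    # Collect the assistant's messages as recorded on the proxy's side.
    # This relies on stored messages carrying the sender's name.
    assistant_messages = [
        msg["content"] for msg in history
        if msg.get("name") == assistant.name and msg.get("content")
    ]
    if not assistant_messages:
        return ""
    # Strip the TERMINATE marker so it does not leak into the next stage's prompt
    return assistant_messages[-1].replace("TERMINATE", "").strip()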


# Stage 1: Research — automated, then human reviews
research_proxy = autogen.UserProxyAgent(
    name="research_proxy",
    human_input_mode="TERMINATE",           # pause at end for human review
    max_consecutive_auto_reply=5,
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config=False,
)

research_output = run_stage(
    "RESEARCH",
    "Research the best Python libraries for parsing CSV files. "
    "Compare: csv module, pandas, polars. List pros/cons. "
    "Give a recommendation. Then say TERMINATE.",
    research_proxy,
)

# The user sees the research and either accepts or redirects at the TERMINATE pause.
# Once the chat ends (Enter or 'exit'), the script continues to the next stage.
print("\n[Research stage approved. Moving to code generation...]\n")

# Stage 2: Code generation — automated, then human reviews
code_proxy = autogen.UserProxyAgent(
    name="code_proxy",
    human_input_mode="TERMINATE",           # pause at end for human review
    max_consecutive_auto_reply=8,
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config={"work_dir": "workspace", "use_docker": False, "timeout": 30},
)

code_output = run_stage(
    "CODE GENERATION",
    f"Based on this research:\n{research_output[:500]}\n\n"
    "Write a Python utility function that reads a CSV, handles missing values, "
    "and returns clean data as a list of dicts. Use pandas. Include tests. "
    "Then say TERMINATE.",
    code_proxy,
)

print("\n[Code stage approved. Moving to documentation...]\n")

# Stage 3: Documentation — fully automated (low risk, human already approved code)
docs_proxy = autogen.UserProxyAgent(
    name="docs_proxy",
    human_input_mode="NEVER",              # documentation is low-risk
    max_consecutive_auto_reply=3,
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config=False,
)

run_stage(
    "DOCUMENTATION",
    f"Write clear API documentation for this code:\n{code_output[:800]}\n\n"
    "Format it as a markdown docstring. Keep it concise. Then say TERMINATE.",
    docs_proxy,
)

print("\n[All stages complete.]")
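Note that the script above moves on to the next stage whenever a stage's chat ends, whether the human pressed Enter or typed 'exit'. If a stage should only proceed on explicit approval, a gate between stages is one option. The sketch below is a minimal illustration; `stage_approved` and `run_pipeline` are hypothetical helpers, and the `ask` parameter exists only so the prompt can be scripted in tests.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[str, Callable[[], None]]


def stage_approved(stage_name: str, ask: Callable[[str], str] = input) -> bool:
    """Ask for explicit approval; anything other than y/yes counts as a refusal."""
    answer = ask(f"Proceed past stage '{stage_name}'? [y/N] ").strip().lower()
    return answer in ("y", "yes")


def run_pipeline(stages: Sequence[Stage], ask: Callable[[str], str] = input) -> List[str]:
    """Run stages in order and stop at the first one the human does not approve.

    Returns the names of the stages that actually ran.
    """
    completed: List[str] = []
    for index, (name, run) in enumerate(stages):
        run()
        completed.append(name)
        is_last = index == len(stages) - 1
        if not is_last and not stage_approved(name, ask):
            break
    return completed
```

Each stage callable would wrap one `run_stage(...)` call from the example above. Defaulting to "no" means an accidental Enter stops the pipeline instead of advancing it.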

Custom Auto-Reply Logic

In NEVER mode, you can customise what the user proxy says automatically (when no code execution occurred):

Python

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda msg: "TERMINATE" in msg.get("content", ""),
    code_execution_config=False,
    # Default auto-reply when no code is executed and no human is present
    default_auto_reply=(
        "Please continue. If the task is complete, say TERMINATE."
    ),
)

For more dynamic auto-replies, override the generate_reply method:

Python

class SmartUserProxy(autogen.UserProxyAgent):
    """A user proxy that provides context-aware auto-replies."""

    def generate_reply(self, messages=None, sender=None, **kwargs):
        # Run the base logic first so termination checks, the auto-reply
        # limit, and code execution still apply.
        reply = super().generate_reply(messages=messages, sender=sender, **kwargs)

        # None means the conversation should end; a reply other than the
        # default auto-reply (e.g. code execution output) is passed through.
        # Assumes the private _default_auto_reply attribute of the base class.
        if reply is None or reply != self._default_auto_reply:
            return reply

        if messages is None and sender is not None:
            messages = self.chat_messages[sender]
        last_msg = (messages[-1].get("content") or "") if messages else ""

        # If the assistant is asking for clarification, provide a default answer
        if "?" in last_msg:
            return "Please use your best judgment and continue."

        # Otherwise nudge toward completion
        return "Please proceed with the next step or say TERMINATE if complete."
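Keeping the reply rules in a plain function makes them easy to unit-test without an agent or an LLM call. The sketch below is one possible arrangement; `choose_auto_reply` is a hypothetical helper, and returning `None` is used here only as a signal that the caller should fall back to the agent's normal handling.

```python
from typing import Optional


def choose_auto_reply(last_msg: Optional[str]) -> Optional[str]:
    """Pick a canned reply for the assistant's last message, or None to defer."""
    text = last_msg or ""
    if "TERMINATE" in text:
        # Leave termination to the agent's own is_termination_msg check
        return None
    if "?" in text:
        # The assistant asked something; give a default answer
        return "Please use your best judgment and continue."
    # No question asked; nudge toward completion
    return "Please proceed with the next step or say TERMINATE if complete."
```

A subclass such as the one above could call this function from inside `generate_reply` and return the base class result whenever it yields `None`.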

Summary

| Mode | Pauses for human? | Best for |
|---|---|---|
| NEVER | Never | Automated pipelines, CI/CD, batch jobs |
| TERMINATE | Only at conversation end | Supervised workflows with final review |
| ALWAYS | Every turn | Demos, debugging, high-risk decisions |

Always pair NEVER mode with a reliable is_termination_msg function
max_turns in initiate_chat is a hard ceiling on conversation length
max_consecutive_auto_reply is a safety limit on automated replies within an agent
Use TERMINATE mode to build multi-stage workflows with human approval checkpoints
Never use ALWAYS mode in automated environments — it will block on stdin

Next: we scale up from two agents to group conversations with GroupChat.
