Plan-and-Execute Pattern

ReAct works well for exploratory tasks where you do not know upfront what steps are needed. But it has a structural limitation: it is inherently sequential. Each action is chosen only after the previous Observation comes back, so the loop does not know its future steps until it reaches them, and those steps cannot be run in parallel.

The Plan-and-Execute pattern fixes this by splitting the agent into two separate roles:

Planner — An LLM that takes the goal and produces a complete list of steps upfront
Executor — One or more LLMs that execute each step, potentially in parallel

This is analogous to how a software engineer writes a design document before writing code, rather than discovering the design as they go.

Architecture

User Goal
    │
    ▼
┌─────────────┐
│   Planner   │  ← LLM call #1: produces N steps
│    (LLM)    │
└──────┬──────┘
       │
       ▼ Step list: [step1, step2, step3, ...]
       │
┌──────┴──────────────────────────────┐
│          Execution Layer            │
│  ┌────────┐ ┌────────┐ ┌────────┐   │
│  │ Exec 1 │ │ Exec 2 │ │ Exec 3 │   │  ← Can run in parallel
│  └────────┘ └────────┘ └────────┘   │
└──────────────────────┬──────────────┘
                       │
                       ▼
              Synthesizer (LLM)
                       │
                       ▼
                 Final Answer

The planner produces the full plan in a single LLM call. Then the executor runs each step — independently or sequentially depending on whether steps have dependencies. A final synthesizer aggregates results into a coherent answer.

Planner vs Executor LLM

The planner and executor do not need to be the same model. In fact, using different models is often the right call:

| Role | Recommended Model | Reason |
|---|---|---|
| Planner | GPT-4o or Claude Opus | Needs strong reasoning to produce a good plan |
| Executor | GPT-4o-mini or Claude Haiku | Each step is simpler; cost savings add up |
| Synthesizer | GPT-4o-mini | Aggregation is usually straightforward |

The planner runs once per goal, while executor calls scale with the number of steps, so using a cheaper model for execution removes most of the cost on plans with many steps.
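
To see why, it helps to put rough numbers on it. The sketch below compares a plan run entirely on the stronger model with one that hands the steps to a cheaper model. The per-call prices `STRONG_COST` and `CHEAP_COST` are made-up values for illustration, not real vendor rates, and the sketch assumes every step costs exactly one executor call.

```python
# Hypothetical per-call costs in dollars; illustrative values, not real pricing.
STRONG_COST = 0.02   # assumed cost of one call to the stronger model
CHEAP_COST = 0.002   # assumed cost of one call to the cheaper model


def pipeline_cost(num_steps: int, executor_cost: float) -> float:
    """One planner call, one executor call per step, one cheap synthesizer call."""
    return STRONG_COST + num_steps * executor_cost + CHEAP_COST


all_strong = pipeline_cost(10, STRONG_COST)
mixed = pipeline_cost(10, CHEAP_COST)
print(f"strong executor: ${all_strong:.3f}, cheap executor: ${mixed:.3f}")
```

Under these assumed prices the planner call is a fixed cost, and the gap between the two totals grows linearly with the number of steps.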

Planner Implementation

Python

import json
import openai
from dataclasses import dataclass
from typing import List, Optional

client = openai.OpenAI()


@dataclass
class Step:
    """A single step in an execution plan."""
    step_number: int
    description: str
    tool: str
    tool_input: str
    depends_on: List[int]  # Step numbers this step depends on (empty = independent)
    result: Optional[str] = None


def plan(goal: str) -> List[Step]:
    """
    Call the planner LLM to produce a structured list of steps.
    Returns a list of Step objects.
    """
    planner_prompt = f"""You are a planning agent. Given a user goal, produce a
step-by-step execution plan. Each step should be concrete and executable.

Available tools:
- search(query): Search the web for information
- calculate(expression): Evaluate a math expression
- summarize(text): Summarize a long piece of text

Return a JSON object with a single key "steps" whose value is an array. Each element has:
- step_number: integer starting at 1
- description: what this step does
- tool: which tool to use (search, calculate, or summarize)
- tool_input: the exact input to pass to the tool
- depends_on: list of step numbers this step depends on ([] if independent)

Goal: {goal}

Return ONLY valid JSON. No explanation before or after.
"""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": planner_prompt}],
        response_format={"type": "json_object"},
        temperature=0,
    )

    raw = response.choices[0].message.content
    data = json.loads(raw)

    # JSON mode returns an object, so the step list is expected under "steps";
    # a bare list is also accepted in case JSON mode is turned off.
    steps_data = data["steps"] if isinstance(data, dict) else data

    steps = []
    for s in steps_data:
        steps.append(Step(
            step_number=s["step_number"],
            description=s["description"],
            tool=s["tool"],
            tool_input=s["tool_input"],
            depends_on=s.get("depends_on", []),
        ))

    return steps
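
Because the plan comes from an LLM, it can be worth checking it before any tool runs. The sketch below is one possible check, not part of the planner above: `validate_plan` is a hypothetical helper that works on the parsed JSON dictionaries and reports unknown tools, duplicate step numbers, and dependencies that point at missing steps.

```python
from typing import Any, Dict, List, Set


def validate_plan(steps_data: List[Dict[str, Any]], known_tools: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the plan looks usable."""
    problems: List[str] = []
    numbers = [s.get("step_number") for s in steps_data]
    if len(set(numbers)) != len(numbers):
        problems.append("duplicate step_number values")
    for s in steps_data:
        n = s.get("step_number")
        if s.get("tool") not in known_tools:
            problems.append(f"step {n}: unknown tool {s.get('tool')!r}")
        for dep in s.get("depends_on", []):
            if dep not in numbers:
                problems.append(f"step {n}: depends on missing step {dep}")
            elif dep == n:
                problems.append(f"step {n}: depends on itself")
    return problems


# Example input with two deliberate mistakes in step 2.
example = [
    {"step_number": 1, "tool": "search", "tool_input": "x", "depends_on": []},
    {"step_number": 2, "tool": "browse", "tool_input": "y", "depends_on": [3]},
]
print(validate_plan(example, {"search", "calculate", "summarize"}))
```

A caller could run this right after parsing and ask the planner to try again when the list is not empty.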

Executor Implementation

Python

def search(query: str) -> str:
    """Simulated web search."""
    knowledge = {
        "python": "Python is a high-level programming language known for its simplicity.",
        "rust": "Rust is a systems language focused on safety and performance.",
        "salary software engineer": "Average software engineer salary in the US is $130,000-$180,000.",
        "cost of living san francisco": "San Francisco cost of living index is approximately 94% above the US average.",
        "cost of living austin": "Austin cost of living index is approximately 4% above the US average.",
    }
    query_lower = query.lower()
    for key, value in knowledge.items():
        if key in query_lower:
            return value
    return f"Search result for '{query}': General information found."


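def calculate(expression: str) -> str:
    """Numeric calculator restricted to digits and basic arithmetic operators."""
    import re
    # Allow only digits, whitespace, + - * / . and parentheses. Reject "**"
    # explicitly, since chained exponentiation can exhaust CPU and memory.
    if "**" in expression or not re.match(r'^[\d\s\+\-\*/\.\(\)]+$', expression):
        return "Error: only numeric expressions allowed"
    try:
        # eval is reached only after the character whitelist check above
        return str(eval(expression, {"__builtins__": {}}, {}))  # noqa: S307
    except Exception as e:
        return f"Error: {e}"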


def summarize(text: str) -> str:
    """Summarize text using an LLM."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "user",
                "content": f"Summarize this in one sentence:\n\n{text}",
            }
        ],
        max_tokens=100,
    )
    return response.choices[0].message.content


TOOL_REGISTRY = {
    "search": search,
    "calculate": calculate,
    "summarize": summarize,
}


def execute_step(step: Step) -> str:
    """Execute a single step using the appropriate tool."""
    tool_fn = TOOL_REGISTRY.get(step.tool)
    if not tool_fn:
        return f"Error: unknown tool '{step.tool}'"
    try:
        result = tool_fn(step.tool_input)
        return result
    except Exception as e:
        return f"Error executing step {step.step_number}: {e}"
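
The `calculate` tool filters its input with a regular expression before handing it to `eval`. An alternative that avoids `eval` altogether is to parse the expression and walk the syntax tree, allowing only the node types you expect. The sketch below assumes only `+ - * /` and unary minus are needed; `safe_calculate` is a hypothetical name, not part of the code above.

```python
import ast
import operator

# Only these binary operators are permitted; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_calculate(expression: str) -> str:
    """Evaluate + - * / arithmetic by walking the parsed syntax tree."""

    def walk(node: ast.AST) -> float:
        # Plain numbers (bool is excluded because its type is not int or float)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return str(walk(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as e:
        return f"Error: {e}"


print(safe_calculate("150000 - 45000"))
print(safe_calculate("2 ** 10"))  # exponentiation is not in the allowed set
```

Names, function calls, and exponentiation all end up in the `ValueError` branch, so the tool returns an error string in the same style as the other tools.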

Parallel Execution with Dependency Resolution

Independent steps can run concurrently. Steps with dependencies must wait:

Python

import concurrent.futures
from typing import Dict


def execute_plan(steps: List[Step]) -> Dict[int, str]:
    """
    Execute steps in dependency order.
    Independent steps run in parallel; dependent steps wait.

    Returns a dict mapping step_number -> result.
    """
    results: Dict[int, str] = {}
    completed: set = set()

    def is_ready(step: Step) -> bool:
        """True if all dependencies have completed."""
        return all(dep in completed for dep in step.depends_on)

    remaining = list(steps)

    while remaining:
        # Find all steps that are ready to execute
        ready = [s for s in remaining if is_ready(s)]
        if not ready:
            raise RuntimeError(
                f"Circular dependency or unresolvable plan. "
                f"Remaining steps: {[s.step_number for s in remaining]}"
            )

        # Execute ready steps in parallel
        with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
            future_to_step = {
                executor.submit(execute_step, step): step
                for step in ready
            }

            for future in concurrent.futures.as_completed(future_to_step):
                step = future_to_step[future]
                result = future.result()
                results[step.step_number] = result
                step.result = result
                completed.add(step.step_number)
                print(f"Step {step.step_number} complete: {result[:80]}...")

        # Remove completed steps
        remaining = [s for s in remaining if s.step_number not in completed]

    return results
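
The loop above works in waves: every step whose dependencies are complete runs in one batch, and the next batch starts only when that batch has finished. To preview the grouping without calling any tools, the same logic can be applied to the dependency lists alone. `execution_waves` below is a hypothetical helper that takes a mapping from step number to its `depends_on` list.

```python
from typing import Dict, List


def execution_waves(deps: Dict[int, List[int]]) -> List[List[int]]:
    """Group step numbers into waves; each wave depends only on earlier waves."""
    done: set = set()
    remaining = dict(deps)
    waves: List[List[int]] = []
    while remaining:
        ready = sorted(n for n, d in remaining.items() if all(x in done for x in d))
        if not ready:
            # Nothing can start: a cycle or a reference to a step that does not exist
            raise ValueError(f"unresolvable dependencies: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for n in ready:
            del remaining[n]
    return waves


print(execution_waves({1: [], 2: [], 3: [], 4: [3], 5: [1, 4]}))
# prints [[1, 2, 3], [4], [5]]
```

One consequence of the wave design is that a slow step holds back the whole next wave, including steps that do not depend on it. Submitting each step as soon as its own dependencies finish would avoid that, at the cost of a more involved scheduler.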

Synthesizer

After all steps complete, the synthesizer aggregates results:

Python

def synthesize(goal: str, steps: List[Step], results: Dict[int, str]) -> str:
    """
    Combine all step results into a coherent final answer.
    """
    # Build a summary of what was done and found
    execution_summary = "\n".join(
        f"Step {s.step_number} ({s.description}):\n  Result: {results.get(s.step_number, 'No result')}"
        for s in steps
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "Synthesize the following research results into a clear, concise answer.",
            },
            {
                "role": "user",
                "content": f"Original goal: {goal}\n\nResearch results:\n{execution_summary}",
            },
        ],
        temperature=0,
    )
    return response.choices[0].message.content

Full Pipeline

Python

def run_plan_execute(goal: str) -> str:
    """
    Complete Plan-and-Execute agent.
    1. Plan: generate steps
    2. Execute: run steps in parallel where possible
    3. Synthesize: combine results into final answer
    """
    print(f"Goal: {goal}\n")

    # Phase 1: Plan
    print("=== PLANNING ===")
    steps = plan(goal)
    for s in steps:
        deps = f" (depends on: {s.depends_on})" if s.depends_on else ""
        print(f"  Step {s.step_number}: {s.description}{deps}")

    # Phase 2: Execute
    print("\n=== EXECUTING ===")
    results = execute_plan(steps)

    # Phase 3: Synthesize
    print("\n=== SYNTHESIZING ===")
    final_answer = synthesize(goal, steps, results)

    return final_answer


if __name__ == "__main__":
    answer = run_plan_execute(
        "Compare the cost of living in San Francisco vs Austin for a "
        "software engineer earning $150,000 per year. "
        "Calculate how much disposable income they would have in each city "
        "assuming housing costs take 30% of salary."
    )
    print(f"\n=== FINAL ANSWER ===\n{answer}")

Example plan generated:

JSON

{
  "steps": [
    {
      "step_number": 1,
      "description": "Get cost of living data for San Francisco",
      "tool": "search",
      "tool_input": "cost of living san francisco",
      "depends_on": []
    },
    {
      "step_number": 2,
      "description": "Get cost of living data for Austin",
      "tool": "search",
      "tool_input": "cost of living austin",
      "depends_on": []
    },
    {
      "step_number": 3,
      "description": "Calculate housing cost at 30% of $150,000",
      "tool": "calculate",
      "tool_input": "150000 * 0.30",
      "depends_on": []
    },
    {
      "step_number": 4,
      "description": "Calculate remaining income after housing",
      "tool": "calculate",
      "tool_input": "150000 - 45000",
      "depends_on": [3]
    }
  ]
}

Steps 1, 2, and 3 are independent and run in parallel. Step 4 waits for step 3.
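
Note that step 4 carries the literal `45000` in its `tool_input`: the planner worked out step 3's answer itself, and `depends_on` only controls ordering. If later steps should consume earlier results, one option is a placeholder convention. The sketch below assumes a `{step_N}` placeholder syntax and a `resolve_input` helper, both hypothetical; the planner prompt would also have to describe the convention, and `execute_step` would have to call the helper before invoking the tool.

```python
import re
from typing import Dict


def resolve_input(tool_input: str, results: Dict[int, str]) -> str:
    """Replace each {step_N} placeholder with the stored result of step N."""

    def substitute(match: re.Match) -> str:
        step_number = int(match.group(1))
        if step_number not in results:
            raise KeyError(f"step {step_number} has no result yet")
        return results[step_number]

    return re.sub(r"\{step_(\d+)\}", substitute, tool_input)


results = {3: "45000.0"}
print(resolve_input("150000 - {step_3}", results))
# prints 150000 - 45000.0
```

With this in place, step 4 could be planned as `150000 - {step_3}` and would use whatever step 3 actually returned.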

Advantages Over ReAct

| Dimension | ReAct | Plan-and-Execute |
|---|---|---|
| Parallelism | None: strictly sequential | Independent steps run concurrently |
| Upfront clarity | No: plan emerges step by step | Yes: full plan visible before execution |
| Replanning | Easy: just continue the loop | Harder: requires re-invoking planner |
| Best for | Exploratory, unknown steps | Well-structured, decomposable tasks |
| Debugging | Trace each iteration | Inspect plan before execution begins |

When to Replan

Sometimes execution reveals that the plan was wrong. A step fails, or a result changes what needs to happen next. You have two options:

Option 1: Static plan — Accept the original plan and fail gracefully if a step fails. Simpler, more predictable.

Option 2: Dynamic replanning — After each step completes, check if the remaining plan still makes sense. If not, call the planner again with the accumulated results as context. More robust but adds latency and cost.

For most production use cases, static plans with good error handling are sufficient. Dynamic replanning is useful for long-running, open-ended research tasks.
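
Option 2 can be sketched as a loop that goes back to the planner whenever a step fails. The version below is deliberately simplified and rests on several assumptions: steps are plain strings, they run one at a time, a failure is any result starting with `Error` (the convention the tools above use), and `make_plan` and `run_step` are hypothetical callables standing in for the planner and executor.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (step, outcome) pairs collected so far


def run_with_replanning(
    goal: str,
    make_plan: Callable[[str, History], List[str]],
    run_step: Callable[[str], str],
    max_replans: int = 2,
) -> History:
    """Run steps in order; after a failed step, request a fresh plan."""
    history: History = []
    replans = 0
    queue = list(make_plan(goal, history))
    while queue:
        step = queue.pop(0)
        outcome = run_step(step)
        history.append((step, outcome))
        if outcome.startswith("Error") and replans < max_replans:
            replans += 1
            # Discard the rest of the old plan and pass the planner what was learned
            queue = list(make_plan(goal, history))
    return history


def fake_planner(goal: str, history: History) -> List[str]:
    # Stand-in planner: after any failure, switch to a fallback query
    if any(outcome.startswith("Error") for _, outcome in history):
        return ["fallback query"]
    return ["bad query", "never reached"]


def fake_runner(step: str) -> str:
    return "Error: no data" if step == "bad query" else f"ok: {step}"


print(run_with_replanning("demo goal", fake_planner, fake_runner))
```

The `max_replans` cap bounds the extra latency and cost; once it is reached, the loop behaves like a static plan and simply records further failures.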

Summary

Plan-and-Execute separates the "figure out what to do" step from the "do it" step
The planner uses a strong model to produce a structured list of steps with dependency information
The executor runs independent steps in parallel using ThreadPoolExecutor
A synthesizer aggregates all results into a final coherent answer
Use Plan-and-Execute when tasks are decomposable and you want parallelism
Use ReAct when tasks are exploratory and the next step cannot be known upfront

Next: the Self-Reflection pattern, where an agent evaluates and improves its own output.
