What Is Agentic AI?

Most LLM applications you have built so far follow a simple pattern: user sends a message, LLM generates a response, done. That is a single-turn interaction. Agentic AI breaks that mold. An agent runs in a loop: it perceives its environment, decides what to do next, takes an action, observes the result, and repeats until the task is complete or a stopping condition is reached.

This lesson defines what makes a system "agentic," walks through the core ReAct loop, contrasts agents with RAG, and gives you a working Python example of a minimal agent.

The Three Pillars of Agency

An AI agent has three core capabilities:

1. Perception — The agent receives input from its environment. This could be a user message, a tool result, a database query response, or output from another agent.

2. Decision — The agent uses an LLM to decide what to do next. It reasons over what it knows and picks the next action.

3. Action — The agent executes that action. Actions might include calling a web search API, writing to a file, querying a database, or calling another LLM.

These three pillars form a loop. The output of an action becomes new perception, which feeds the next decision. This is fundamentally different from a single-shot LLM call.

Perceive → Decide → Act → Perceive → Decide → Act → ... → Done
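The cycle above can be sketched without any LLM at all. The example below is a hypothetical toy, not part of any framework: a hand-written midpoint rule stands in for the model's decision, and the "environment" is a hidden number that answers each guess with a hint.

```python
def run_loop(secret: int, low: int = 1, high: int = 100, max_steps: int = 20) -> tuple[int, int]:
    """Toy perceive-decide-act loop that finds `secret` by bisection.

    Returns (final_guess, steps_taken). `decide` and `act` are illustrative
    stand-ins for an LLM and a tool.
    """
    def decide(low: int, high: int) -> int:
        # Decision: pick the midpoint of the range that is still possible
        return (low + high) // 2

    def act(guess: int) -> str:
        # Action: ask the environment, which replies with a hint
        if guess < secret:
            return "higher"
        if guess > secret:
            return "lower"
        return "correct"

    for step in range(1, max_steps + 1):
        guess = decide(low, high)
        observation = act(guess)
        # Perception: the observation updates what the agent knows
        if observation == "correct":
            return guess, step
        if observation == "higher":
            low = guess + 1
        else:
            high = guess - 1
    raise RuntimeError("stopping condition: max_steps reached")


print(run_loop(37))  # (37, 3): guesses 50, 25, 37
```

Each pass narrows the range based on the previous observation, which is the property that separates a loop from a single-shot call.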

The ReAct Loop

ReAct (Reasoning + Acting) is one of the most widely used agentic patterns. It was introduced in a 2022 paper by Yao et al., and many agent frameworks build on it.

The loop looks like this:

Thought: [LLM reasons about what to do next]
Action: [LLM selects a tool and provides arguments]
Observation: [Tool runs and returns a result]
Thought: [LLM reasons about what the result means]
Action: [LLM picks the next action]
...
Thought: I now have enough information.
Final Answer: [LLM provides the final response]

Each "Thought" is the LLM reasoning in plain text before committing to an action. This explicit reasoning trace is what makes ReAct agents interpretable — you can read the trace and understand why the agent did what it did.
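One way to keep such a trace readable is to store each step as data and render it in the Thought/Action/Observation layout shown above. This is a minimal sketch: the `Step` class, the `render_trace` helper, and the action string format are assumptions made for illustration, not part of the ReAct paper or any framework.

```python
from dataclasses import dataclass


@dataclass
class Step:
    thought: str      # the model's reasoning before acting
    action: str       # the tool call, written here as plain text
    observation: str  # what the tool returned


def render_trace(steps: list[Step], final_answer: str) -> str:
    """Render recorded steps in the Thought/Action/Observation layout."""
    lines = []
    for step in steps:
        lines.append(f"Thought: {step.thought}")
        lines.append(f"Action: {step.action}")
        lines.append(f"Observation: {step.observation}")
    lines.append("Thought: I now have enough information.")
    lines.append(f"Final Answer: {final_answer}")
    return "\n".join(lines)


trace = render_trace(
    [Step("I need the dosage.", 'search_web("paracetamol dosage")', "max 4g per day")],
    "The maximum is 4g per day.",
)
print(trace)
```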

Agents vs RAG: What Is the Difference?

Both agents and RAG systems use LLMs augmented with external knowledge. The critical difference is action capability.

| Capability | RAG System | Agent |
|---|---|---|
| Retrieve documents | Yes | Yes |
| Take actions (write, call APIs) | No | Yes |
| Multi-step reasoning | Limited | Yes |
| Loop until task complete | No | Yes |
| Modify its environment | No | Yes |

A RAG system retrieves relevant documents and passes them to an LLM. The LLM then generates a response grounded in those documents. That is it — one retrieval, one generation, done.

An agent can search, read what it finds, decide it needs more information, search again with a refined query, then call a calculation tool, then write the result to a database — all in a single task execution. Agents are stateful, multi-step, and action-capable.
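For contrast, the RAG control flow can be sketched as a single straight-line function. Everything here is a stand-in chosen for illustration: `DOCS` is a two-line corpus, `retrieve` ranks by word overlap rather than embeddings, and `generate` is a stub where a real system would make one LLM call.

```python
import re

DOCS = [
    "Paracetamol: adults may take up to 4g per day.",
    "Ibuprofen may interact with blood thinners.",
]


def tokens(text: str) -> set[str]:
    """Lowercased word set, used for a toy overlap score."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    query_words = tokens(query)
    ranked = sorted(DOCS, key=lambda d: len(query_words & tokens(d)), reverse=True)
    return ranked[:k]


def generate(prompt: str) -> str:
    """Stub for the single LLM call; it only echoes the prompt it was given."""
    return f"Answer based on:\n{prompt}"


def rag_answer(question: str) -> str:
    # One retrieval, one generation: no loop, no tools, no side effects
    context = "\n".join(retrieve(question))
    return generate(f"Context:\n{context}\n\nQuestion: {question}")


print(rag_answer("paracetamol daily limit"))
```

There is no branch where the system looks at its own output and decides to retrieve again; adding that branch is what turns the pipeline into an agent.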

When to Use Agents

Use an agent when:

The task requires multiple steps that cannot be determined upfront
The task requires taking actions in the world (write, send, compute, fetch)
The correct next step depends on the result of the previous step
The task involves reasoning over intermediate results

Good agent use cases:

Automated research: search, read, synthesize, repeat
Code generation with testing: write code, run tests, fix errors, repeat
Customer support with tool access: look up account, check policy, respond
Data pipeline orchestration: query, transform, validate, load

When NOT to Use Agents

Agents are powerful but they come with real costs:

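Latency: Each loop iteration adds an LLM call, and the calls run one after another. A 5-step agent makes at least five model round trips, so expect roughly 5x the latency of a single call, plus tool execution time.
Cost: More LLM calls mean more tokens, and more tokens mean more money. Because the full message history is resent on every call, input tokens grow with each step.
Reliability: More steps mean more opportunities for failure. Agents can loop forever, hallucinate tool calls, or get stuck.
Debugging complexity: A 10-step agent trace is much harder to debug than a single prompt.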
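The latency and cost points can be made concrete with a back-of-envelope estimate. The sketch below assumes fixed token counts per step and a fixed time per call; the function name, its parameters, and the prices in the usage example are all made up for illustration and do not reflect any provider's pricing.

```python
def estimate_agent_cost(
    steps: int,
    base_prompt_tokens: int,
    tokens_added_per_step: int,
    output_tokens_per_step: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
    seconds_per_call: float,
) -> dict:
    """Rough token, cost, and latency estimate for a sequential agent loop."""
    input_tokens = 0
    for i in range(steps):
        # Call i resends the base prompt plus everything appended so far
        input_tokens += base_prompt_tokens + i * tokens_added_per_step
    output_tokens = steps * output_tokens_per_step
    cost = (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost": round(cost, 6),
        "latency_seconds": steps * seconds_per_call,
    }


# Hypothetical numbers: 200-token prompt, 150 tokens appended per step
print(estimate_agent_cost(5, 200, 150, 50, 0.001, 0.002, 1.5))
print(estimate_agent_cost(1, 200, 150, 50, 0.001, 0.002, 1.5))
```

With these invented figures the five-step run uses 2,500 input tokens against 200 for a single call, so the cost ratio comes out near 10x rather than 5x. The growth of the resent history is the reason.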

Do NOT use an agent when:

A single LLM call with a well-engineered prompt can answer the question
You need sub-second response times
The task is fully deterministic and can be handled by a pipeline
You cannot tolerate unpredictable costs

A good rule: start with the simplest approach that works. Only add agentic behavior when simpler approaches demonstrably fail.

Minimal Agent in Python

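Here is a working minimal agent loop in Python. It uses the OpenAI API directly, with no frameworks and no magic, so you have full visibility into what is happening. It assumes the openai Python package (v1 or later) is installed and the OPENAI_API_KEY environment variable is set.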

Python

import json
import openai

client = openai.OpenAI()

# Tool implementations (registered by name in TOOLS below)
def search_web(query: str) -> str:
    """Simulated web search — replace with real Serper/Brave/Tavily call."""
    results = {
        "paracetamol dosage": "Adults: 500mg-1000mg per dose, max 4g per day.",
        "ibuprofen interactions": "Ibuprofen may interact with blood thinners and ACE inhibitors.",
    }
    for key, value in results.items():
        if key.lower() in query.lower():
            return value
    return f"No results found for: {query}"


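def calculate(expression: str) -> str:
    """Calculator restricted to digits and basic arithmetic operators.

    The character whitelist keeps names and function calls out of eval,
    but this is a teaching shortcut, not a hardened sandbox.
    """
    import re
    # Allow only digits, whitespace, + - * / . and parentheses
    if not re.match(r'^[\d\s\+\-\*/\.\(\)]+$', expression):
        return "Error: only numeric expressions allowed"
    # '**' passes the whitelist but can produce enormous numbers, so reject it
    if "**" in expression:
        return "Error: exponentiation is not supported"
    try:
        result = eval(expression)  # noqa: S307
        return str(result)
    except Exception as e:
        return f"Calculation error: {e}"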


TOOLS = {
    "search_web": search_web,
    "calculate": calculate,
}

# OpenAI function definitions
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for factual information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query"}
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a numeric math expression",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "A numeric expression like '2 * 500 * 3'",
                    }
                },
                "required": ["expression"],
            },
        },
    },
]


def run_agent(user_query: str, max_iterations: int = 10) -> str:
    """
    Run a minimal ReAct-style agent loop.
    Returns the final answer string.
    """
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful assistant with access to tools. "
                "Use tools to look up factual information. "
                "When you have enough information, provide a final answer."
            ),
        },
        {"role": "user", "content": user_query},
    ]

    for iteration in range(max_iterations):
        print(f"\n--- Iteration {iteration + 1} ---")

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOL_SCHEMAS,
            tool_choice="auto",
        )

        message = response.choices[0].message

        # If no tool calls, the agent is done
        if not message.tool_calls:
            # content is typed as optional, so fall back to an empty string
            final_answer = message.content or ""
            print(f"Final answer: {final_answer}")
            return final_answer

        # Process each tool call
        messages.append(message)  # Add assistant message with tool calls

        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            tool_args = json.loads(tool_call.function.arguments)

            print(f"Tool call: {tool_name}({tool_args})")

            if tool_name in TOOLS:
                result = TOOLS[tool_name](**tool_args)
            else:
                result = f"Error: unknown tool '{tool_name}'"

            print(f"Tool result: {result}")

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })

    return "Max iterations reached without a final answer."


if __name__ == "__main__":
    answer = run_agent(
        "What is the maximum daily dose of paracetamol for an adult, "
        "and how many 500mg tablets is that?"
    )
    print(f"\nFinal Answer: {answer}")

Running this produces output like:

--- Iteration 1 ---
Tool call: search_web({'query': 'paracetamol dosage adult'})
Tool result: Adults: 500mg-1000mg per dose, max 4g per day.

--- Iteration 2 ---
Tool call: calculate({'expression': '4000 / 500'})
Tool result: 8.0

--- Iteration 3 ---
Final answer: The maximum daily dose of paracetamol for an adult is 4g (4000mg).
At 500mg per tablet, that is 8 tablets per day. However, most guidelines
recommend not exceeding 8 tablets per day and spacing doses at least 4 hours apart.
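The calculate tool in the listing above hands whitelisted text to eval. A stricter alternative, sketched here as one possible approach rather than the lesson's own method, parses the expression with Python's ast module and evaluates only the node types it recognizes. The name `calculate_ast` and the set of supported operators are choices made for this example.

```python
import ast
import operator

# Only these operators are evaluated; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST) -> float:
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def calculate_ast(expression: str) -> str:
    """Evaluate + - * / arithmetic without calling eval."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree))
    except (SyntaxError, ValueError, ZeroDivisionError) as e:
        return f"Calculation error: {e}"


print(calculate_ast("4000 / 500"))  # 8.0
```

Because names, calls, and exponentiation never match a supported node type, inputs such as `__import__('os')` or `2 ** 8` come back as error strings instead of being executed.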

Key Concepts

Stopping condition — Every agent loop needs a way to stop. In the example above, stopping happens when the model returns no tool calls. Always set a max_iterations limit as a safety net.
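Beyond an iteration cap, a loop can also stop on a time budget or when the model keeps repeating the same tool call. The helper below is a hypothetical sketch: `StopGuard`, its parameters, and its default limits are assumptions for illustration and would need tuning for a real workload.

```python
import time
from typing import Optional


class StopGuard:
    """Tracks budgets for an agent loop (illustrative helper, not a library class)."""

    def __init__(self, max_iterations: int = 10, max_seconds: float = 60.0, max_repeats: int = 2):
        self.max_iterations = max_iterations
        self.max_seconds = max_seconds
        self.max_repeats = max_repeats
        self.iterations = 0
        self.started = time.monotonic()
        self.seen: dict = {}  # (tool name, raw arguments) -> times requested

    def check(self, tool_name: str, raw_arguments: str) -> Optional[str]:
        """Record one tool call; return a stop reason, or None to continue."""
        self.iterations += 1
        if self.iterations > self.max_iterations:
            return "max_iterations exceeded"
        if time.monotonic() - self.started > self.max_seconds:
            return "time budget exceeded"
        key = (tool_name, raw_arguments)
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] > self.max_repeats:
            return "same tool call repeated"
        return None


guard = StopGuard(max_iterations=5, max_repeats=2)
for _ in range(3):
    reason = guard.check("search_web", '{"query": "paracetamol dosage"}')
print(reason)  # same tool call repeated
```

In a loop like the one in this lesson, `check` would be called once per tool call, and a non-None reason would end the run with an explanatory message.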

Tool registry — Tools are just Python functions. The agent picks which one to call based on the schema descriptions. Write clear descriptions — the LLM uses them to decide which tool is appropriate.
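Since the model writes the tool name and arguments, a registry lookup benefits from defensive handling. The sketch below shows one possible shape for that: `dispatch_tool` is a hypothetical helper that turns every failure into an error string, so the model can read the problem and try again instead of the loop crashing.

```python
import json


def dispatch_tool(tools: dict, name: str, raw_arguments: str) -> str:
    """Run one tool call and always return a string the model can read."""
    if name not in tools:
        return f"Error: unknown tool '{name}'"
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as e:
        return f"Error: arguments are not valid JSON ({e.msg})"
    if not isinstance(args, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(tools[name](**args))
    except TypeError as e:
        # Typically raised when parameters are missing or unexpected
        return f"Error: bad arguments for '{name}': {e}"
    except Exception as e:
        return f"Error: tool '{name}' failed: {e}"


# A stand-in registry with a single toy tool
demo_tools = {"shout": lambda text: text.upper()}
print(dispatch_tool(demo_tools, "shout", '{"text": "hi"}'))  # HI
print(dispatch_tool(demo_tools, "whisper", '{"text": "hi"}'))
```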

Message history — The agent maintains a growing list of messages. Each tool result gets appended as a tool role message. This gives the LLM full context of everything that has happened.
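To see what the history looks like after one tool round, the messages can be written out as plain dictionaries in the Chat Completions format. This is a sketch: the helper name `record_tool_round` and the id `call_1` are invented for the example, since real ids are generated by the API.

```python
import json


def record_tool_round(messages: list, call_id: str, name: str, args: dict, result: str) -> None:
    """Append the two messages that one tool round adds to the history."""
    # 1. The assistant message that requested the tool call
    messages.append({
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": call_id,
            "type": "function",
            "function": {"name": name, "arguments": json.dumps(args)},
        }],
    })
    # 2. The tool result, linked back to the request by tool_call_id
    messages.append({"role": "tool", "tool_call_id": call_id, "content": result})


history = [
    {"role": "system", "content": "You are a helpful assistant with access to tools."},
    {"role": "user", "content": "What is the maximum daily dose of paracetamol?"},
]
record_tool_round(
    history, "call_1", "search_web", {"query": "paracetamol dosage"},
    "Adults: 500mg-1000mg per dose, max 4g per day.",
)
print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'tool']
```

Every tool message must carry the id of the call it answers, which is why the loop appends the assistant message before the tool results.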

Stateless LLM, stateful loop — The LLM itself has no memory between calls. All state lives in the messages list. The loop is what creates the illusion of a thinking, acting agent.

Summary

An agent perceives, decides, and acts in a loop — unlike single-shot LLM calls
The ReAct pattern interleaves reasoning (Thought) with action (Action) and observation (Observation)
Agents can take actions in the world; RAG systems can only retrieve
Use agents for multi-step, action-requiring tasks; avoid them when a single LLM call suffices
Every agent needs a stopping condition and a maximum iteration limit

In the next lesson, you will implement the full ReAct pattern from scratch, including the explicit Thought/Action/Observation text format.
