Agentic AI Patterns · Lesson 1 of 15
What Makes AI 'Agentic'?
What Is Agentic AI?
Most LLM applications you have built so far follow a simple pattern: the user sends a message, the LLM generates a response, done. That is a single-turn interaction. Agentic AI breaks that mold. An agent runs in a loop: it perceives its environment, decides what to do next, takes an action, observes the result, and repeats until the task is complete or a stopping condition is reached.
This lesson defines what makes a system "agentic," walks through the core ReAct loop, contrasts agents with RAG, and gives you a working Python example of a minimal agent.
The Three Pillars of Agency
An AI agent has three core capabilities:
1. Perception — The agent receives input from its environment. This could be a user message, a tool result, a database query response, or output from another agent.
2. Decision — The agent uses an LLM to decide what to do next. It reasons over what it knows and picks the next action.
3. Action — The agent executes that action. Actions might include calling a web search API, writing to a file, querying a database, or calling another LLM.
These three pillars form a loop. The output of an action becomes new perception, which feeds the next decision. This is fundamentally different from a single-shot LLM call.
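The loop described above can be sketched without any LLM at all. The following is a minimal illustration, not a real agent: `run_loop`, the counter environment, and the hard-coded `decide` function are all hypothetical names invented for this example. In a real agent, `decide` would be an LLM call.

```python
from typing import Callable, Optional


def run_loop(
    perceive: Callable[[], int],
    decide: Callable[[int], Optional[str]],
    act: Callable[[str], None],
    max_steps: int = 20,
) -> int:
    """Run perceive -> decide -> act until decide returns None or max_steps is hit.

    Returns the number of actions taken.
    """
    for step in range(max_steps):
        observation = perceive()      # Perception: read the environment
        action = decide(observation)  # Decision: pick the next action
        if action is None:            # Stopping condition: nothing left to do
            return step
        act(action)                   # Action: changes the environment
    return max_steps


# Toy environment: a counter the "agent" must bring to a target value.
state = {"value": 3}
TARGET = 7


def perceive() -> int:
    return state["value"]


def decide(value: int) -> Optional[str]:
    if value == TARGET:
        return None
    return "increment" if value < TARGET else "decrement"


def act(action: str) -> None:
    state["value"] += 1 if action == "increment" else -1


steps_taken = run_loop(perceive, decide, act)
print(steps_taken, state["value"])  # 4 7
```

Each call to `act` changes what the next `perceive` returns, which is the feedback that a single-shot call lacks.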
Perceive → Decide → Act → Perceive → Decide → Act → ... → Done
The ReAct Loop
ReAct (Reasoning + Acting) is one of the most widely used agentic patterns. It was introduced in the 2022 paper "ReAct: Synergizing Reasoning and Acting in Language Models" by Yao et al., and many agent frameworks build on it.
The loop looks like this:
Thought: [LLM reasons about what to do next]
Action: [LLM selects a tool and provides arguments]
Observation: [Tool runs and returns a result]
Thought: [LLM reasons about what the result means]
Action: [LLM picks the next action]
...
Thought: I now have enough information.
Final Answer: [LLM provides the final response]
Each "Thought" is the LLM reasoning in plain text before committing to an action. This explicit reasoning trace is what makes ReAct agents interpretable: you can read the trace and understand why the agent did what it did.
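To make the shape of a trace concrete, here is a small sketch that produces one. It makes several assumptions for illustration only: `scripted_llm` is a hard-coded stand-in for the model, `lookup` is a hypothetical tool backed by a one-entry dictionary, and the `Action: name[argument]` text format is one possible convention, not a required one.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical tool: a tiny lookup table standing in for a real search API.
def lookup(term: str) -> str:
    facts = {"boiling point of water": "100 degrees Celsius at sea level"}
    return facts.get(term, "no result")


TOOLS: Dict[str, Callable[[str], str]] = {"lookup": lookup}


def scripted_llm(trace: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, argument).

    An action of "finish" means the argument is the final answer.
    """
    prefix = "Observation: "
    observations = [line for line in trace if line.startswith(prefix)]
    if not observations:
        return ("I need to look this up.", "lookup", "boiling point of water")
    fact = observations[-1][len(prefix):]
    return ("I now have enough information.", "finish", f"Water boils at {fact}.")


def react(question: str, max_steps: int = 5) -> List[str]:
    """Build a Thought/Action/Observation trace until the policy finishes."""
    trace = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, argument = scripted_llm(trace)
        trace.append(f"Thought: {thought}")
        if action == "finish":
            trace.append(f"Final Answer: {argument}")
            break
        trace.append(f"Action: {action}[{argument}]")
        trace.append(f"Observation: {TOOLS[action](argument)}")
    return trace


trace = react("At what temperature does water boil?")
print("\n".join(trace))
```

Reading the printed trace from top to bottom shows why each action was taken, which is the interpretability benefit described above.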
Agents vs RAG: What Is the Difference?
Both agents and RAG systems use LLMs augmented with external knowledge. The critical difference is action capability.
| Capability | RAG System | Agent |
|---|---|---|
| Retrieve documents | Yes | Yes |
| Take actions (write, call APIs) | No | Yes |
| Multi-step reasoning | Limited | Yes |
| Loop until task complete | No | Yes |
| Modify its environment | No | Yes |
A RAG system retrieves relevant documents and passes them to an LLM. The LLM then generates a response grounded in those documents. That is it — one retrieval, one generation, done.
An agent can search, read what it finds, decide it needs more information, search again with a refined query, then call a calculation tool, then write the result to a database — all in a single task execution. Agents are stateful, multi-step, and action-capable.
When to Use Agents
Use an agent when:
- The task requires multiple steps that cannot be determined upfront
- The task requires taking actions in the world (write, send, compute, fetch)
- The correct next step depends on the result of the previous step
- The task involves reasoning over intermediate results
Good agent use cases:
- Automated research: search, read, synthesize, repeat
- Code generation with testing: write code, run tests, fix errors, repeat
- Customer support with tool access: look up account, check policy, respond
- Data pipeline orchestration: query, transform, validate, load
When NOT to Use Agents
Agents are powerful but they come with real costs:
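- Latency: Each loop iteration adds at least one LLM call, and the calls run sequentially. A 5-step agent therefore takes roughly five times as long as a single LLM call, plus the time spent executing tools.
- Cost: More LLM calls means more tokens means more money. The full message history is resent on every call, so input tokens grow faster than the step count.
- Reliability: More steps means more opportunities for failure. Agents can hallucinate tool calls, get stuck, or keep looping until an iteration cap stops them.
- Debugging complexity: A 10-step agent trace is much harder to debug than a single prompt.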
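The cost point can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes the whole history is resent on every call and that each iteration appends a fixed number of tokens. The function name and all figures are hypothetical, and it ignores output tokens and any provider-side prompt caching.

```python
def total_input_tokens(n_iterations: int, prompt_tokens: int, added_per_iteration: int) -> int:
    """Sum the input tokens sent across all LLM calls in an agent run.

    Call k sends the initial prompt plus everything appended by the
    previous k - 1 iterations (tool calls and tool results).
    """
    total = 0
    history = prompt_tokens
    for _ in range(n_iterations):
        total += history                # this call resends the whole history
        history += added_per_iteration  # the history grows before the next call
    return total


# Hypothetical figures: a 500-token prompt, 300 tokens appended per iteration.
single_call = total_input_tokens(1, 500, 300)
five_steps = total_input_tokens(5, 500, 300)
print(single_call, five_steps)  # 500 5500
```

Under these assumptions a 5-step run sends eleven times the input tokens of a single call, not five times, because the total is `n * P + A * n * (n - 1) / 2` for prompt size `P` and per-iteration growth `A`.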
Do NOT use an agent when:
- A single LLM call with a well-engineered prompt can answer the question
- You need sub-second response times
- The task is fully deterministic and can be handled by a pipeline
- You cannot tolerate unpredictable costs
A good rule: start with the simplest approach that works. Only add agentic behavior when simpler approaches demonstrably fail.
Minimal Agent in Python
Here is a working minimal agent loop in Python. It uses the OpenAI API directly — no frameworks, no magic. This gives you full visibility into what is happening.
import json
import re

import openai

client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment


# Tools: plain Python functions the agent can call
def search_web(query: str) -> str:
    """Simulated web search; replace with a real Serper/Brave/Tavily call."""
    results = {
        "paracetamol dosage": "Adults: 500mg-1000mg per dose, max 4g per day.",
        "ibuprofen interactions": "Ibuprofen may interact with blood thinners and ACE inhibitors.",
    }
    for key, value in results.items():
        if key.lower() in query.lower():
            return value
    return f"No results found for: {query}"


def calculate(expression: str) -> str:
    """Evaluate basic arithmetic; eval is guarded by a character whitelist."""
    # Only allow digits, whitespace, and basic operators
    if not re.fullmatch(r"[\d\s+\-*/.()]+", expression):
        return "Error: only numeric expressions allowed"
    # The whitelist still permits "**", which can produce enormous numbers
    if "**" in expression:
        return "Error: exponentiation is not allowed"
    try:
        # Empty builtins so the expression cannot reach any function
        result = eval(expression, {"__builtins__": {}}, {})  # noqa: S307
        return str(result)
    except Exception as e:
        return f"Calculation error: {e}"


# Tool registry: name -> callable
TOOLS = {
    "search_web": search_web,
    "calculate": calculate,
}

# OpenAI function definitions
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for factual information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query"}
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a numeric math expression",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "A numeric expression like '2 * 500 * 3'",
                    }
                },
                "required": ["expression"],
            },
        },
    },
]

def run_agent(user_query: str, max_iterations: int = 10) -> str:
    """
    Run a minimal ReAct-style agent loop.
    Returns the final answer string.
    """
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful assistant with access to tools. "
                "Use tools to look up factual information. "
                "When you have enough information, provide a final answer."
            ),
        },
        {"role": "user", "content": user_query},
    ]
    for iteration in range(max_iterations):
        print(f"\n--- Iteration {iteration + 1} ---")
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOL_SCHEMAS,
            tool_choice="auto",
        )
        message = response.choices[0].message

        # If no tool calls, the agent is done
        if not message.tool_calls:
            print(f"Final answer: {message.content}")
            return message.content or ""  # content can be None

        # Process each tool call
        messages.append(message)  # Add assistant message with tool calls
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            try:
                tool_args = json.loads(tool_call.function.arguments)
            except json.JSONDecodeError:
                tool_args = None  # the model produced malformed JSON
            print(f"Tool call: {tool_name}({tool_args})")

            if tool_name not in TOOLS:
                result = f"Error: unknown tool '{tool_name}'"
            elif not isinstance(tool_args, dict):
                result = "Error: tool arguments were not a valid JSON object"
            else:
                try:
                    result = TOOLS[tool_name](**tool_args)
                except TypeError as e:  # wrong or missing argument names
                    result = f"Error: bad arguments for '{tool_name}': {e}"
            print(f"Tool result: {result}")

            # Report the result (or error) back so the model can react to it
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })
    return "Max iterations reached without a final answer."


if __name__ == "__main__":
    answer = run_agent(
        "What is the maximum daily dose of paracetamol for an adult, "
        "and how many 500mg tablets is that?"
    )
    print(f"\nFinal Answer: {answer}")

Running this produces output like:
--- Iteration 1 ---
Tool call: search_web({'query': 'paracetamol dosage adult'})
Tool result: Adults: 500mg-1000mg per dose, max 4g per day.
--- Iteration 2 ---
Tool call: calculate({'expression': '4000 / 500'})
Tool result: 8.0
--- Iteration 3 ---
Final answer: The maximum daily dose of paracetamol for an adult is 4g (4000mg).
At 500mg per tablet, that is 8 tablets per day. Most guidelines also
recommend spacing doses at least 4 hours apart.
Key Concepts
Stopping condition — Every agent loop needs a way to stop. In the example above, stopping happens when the model returns no tool calls. Always set a max_iterations limit as a safety net.
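An iteration cap is the simplest safety net, but a loop can also waste its whole budget repeating one call. The sketch below shows one way to combine the cap with repeated-call detection. `StopGuard` and its parameters are hypothetical names for illustration, not part of any library.

```python
import json
from typing import Dict, Optional


class StopGuard:
    """Tracks loop progress and reports why the loop should stop, if at all."""

    def __init__(self, max_iterations: int = 10, max_repeats: int = 2) -> None:
        self.max_iterations = max_iterations
        self.max_repeats = max_repeats
        self.iterations = 0
        self.seen: Dict[str, int] = {}

    def check(self, tool_name: str, tool_args: dict) -> Optional[str]:
        """Record one tool call; return a stop reason, or None to continue."""
        self.iterations += 1
        if self.iterations > self.max_iterations:
            return "max_iterations reached"
        # Identical name + arguments means the agent is probably stuck
        key = tool_name + ":" + json.dumps(tool_args, sort_keys=True)
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] > self.max_repeats:
            return f"repeated call to {tool_name} with identical arguments"
        return None


guard = StopGuard(max_iterations=10, max_repeats=2)
reasons = [guard.check("search_web", {"query": "paracetamol dosage"}) for _ in range(3)]
print(reasons)
```

The first two identical calls pass, and the third returns a stop reason. A loop could call `check` before executing each tool and return early when the result is not `None`.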
Tool registry — Tools are just Python functions. The agent picks which one to call based on the schema descriptions. Write clear descriptions — the LLM uses them to decide which tool is appropriate.
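Keeping the registry and the schema list in sync by hand gets tedious as tools are added. One possible approach, sketched here under assumptions, is a decorator that registers the function and derives its schema from the signature and docstring. The `tool` decorator, the `JSON_TYPES` mapping, and the `word_count` tool are all hypothetical. Unannotated parameters fall back to `"string"`.

```python
import inspect
from typing import Any, Callable, Dict, List

TOOLS: Dict[str, Callable[..., str]] = {}
TOOL_SCHEMAS: List[Dict[str, Any]] = []

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register func in TOOLS and derive a function schema from its signature."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value, so the model must supply it
    TOOLS[func.__name__] = func
    TOOL_SCHEMAS.append({
        "type": "function",
        "function": {
            "name": func.__name__,
            # The docstring becomes the description the LLM reads
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    })
    return func


@tool
def word_count(text: str) -> str:
    """Count the words in a piece of text."""
    return str(len(text.split()))


print(TOOL_SCHEMAS[0]["function"]["name"], TOOLS["word_count"]("one two three"))
```

Because the description comes from the docstring, a vague docstring becomes a vague tool description, so the advice about writing clear descriptions still applies.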
Message history — The agent maintains a growing list of messages. Each tool result gets appended as a tool role message. This gives the LLM full context of everything that has happened.
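It can help to see what the list looks like after one tool round trip. The sketch below writes the messages as plain dictionaries in the Chat Completions message shape. The `call_1` id is a made-up placeholder (the API generates real ones), and `summarize` is a hypothetical helper for printing a readable view.

```python
import json

messages = [
    {"role": "system", "content": "You are a helpful assistant with access to tools."},
    {"role": "user", "content": "How many 500mg tablets make 4g?"},
    {
        # The assistant turn that requested a tool call
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",  # placeholder id
            "type": "function",
            "function": {
                "name": "calculate",
                "arguments": json.dumps({"expression": "4000 / 500"}),
            },
        }],
    },
    # The tool result, linked to the request by tool_call_id
    {"role": "tool", "tool_call_id": "call_1", "content": "8.0"},
]


def summarize(history):
    """Render each message as one readable line."""
    lines = []
    for m in history:
        if m["role"] == "assistant" and m.get("tool_calls"):
            for call in m["tool_calls"]:
                fn = call["function"]
                lines.append(f"assistant -> {fn['name']}({fn['arguments']})")
        else:
            lines.append(f"{m['role']}: {m['content']}")
    return lines


print("\n".join(summarize(messages)))
```

Note that the assistant message containing the tool call is part of the history too, and each tool message points back to it through `tool_call_id`.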
Stateless LLM, stateful loop — The LLM itself has no memory between calls. All state lives in the messages list. The loop is what creates the illusion of a thinking, acting agent.
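A toy demonstration makes the point. Here `fake_llm` is a hypothetical stand-in for a model: a pure function whose reply depends only on the messages passed in. It "remembers" a name only when the caller sends the earlier turn again.

```python
def fake_llm(messages):
    """Stateless stand-in for an LLM: the reply depends only on its input."""
    marker = "My name is "
    names = [
        m["content"].split(marker)[1]
        for m in messages
        if m["role"] == "user" and marker in m["content"]
    ]
    return f"Your name is {names[-1]}." if names else "I don't know your name."


# The loop owns the state: it keeps appending to one list.
history = [{"role": "user", "content": "My name is Ada"}]
history.append({"role": "assistant", "content": fake_llm(history)})
history.append({"role": "user", "content": "What is my name?"})

with_history = fake_llm(history)
without_history = fake_llm([{"role": "user", "content": "What is my name?"}])

print(with_history)     # Your name is Ada.
print(without_history)  # I don't know your name.
```

The same function gives different answers to the same question, and the only difference is the list it was handed.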
Summary
- An agent perceives, decides, and acts in a loop — unlike single-shot LLM calls
- The ReAct pattern interleaves reasoning (Thought) with action (Action) and observation (Observation)
- Agents can take actions in the world; RAG systems can only retrieve
- Use agents for multi-step, action-requiring tasks; avoid them when a single LLM call suffices
- Every agent needs a stopping condition and a maximum iteration limit
In the next lesson, you will implement the full ReAct pattern from scratch, including the explicit Thought/Action/Observation text format.