Returning Tool Results to the LLM
Master the message flow for feeding tool results back to the LLM: correct role, format, ID matching, large result handling, and the full execution loop.
The Message Flow
Tool calling introduces a four-message sequence into your conversation:
1. User message (role: "user")
2. Assistant tool call (role: "assistant", contains tool_calls)
3. Tool result (role: "tool", contains the tool result)
4. Assistant answer (role: "assistant", final text response)

When it generates the final answer (message 4), the LLM sees messages 1 through 3. Get the format of message 3 wrong and either the API rejects the request or the model ignores the result and hallucinates an answer.
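As a rough illustration, the four messages might look like this when written out as plain dicts. The ID `call_abc123`, the question, and the answer text are placeholders chosen for this sketch, not real API output.

```python
import json

# Illustrative only: the ID and all values below are made-up placeholders.
messages = [
    # 1. User message
    {"role": "user", "content": "What medications is patient P-00123 taking?"},
    # 2. Assistant tool call (the request lives in tool_calls, not in content)
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "get_patient_record",
                    "arguments": json.dumps({"patient_id": "P-00123"}),
                },
            }
        ],
    },
    # 3. Tool result, linked to message 2 through tool_call_id
    {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": json.dumps(
            {"current_medications": ["Metformin 500mg", "Lisinopril 10mg"]}
        ),
    },
    # 4. Assistant answer
    {
        "role": "assistant",
        "content": "The patient is taking Metformin 500mg and Lisinopril 10mg.",
    },
]

print([m["role"] for m in messages])
```

Note that the `arguments` field in message 2 and the `content` field in message 3 are both JSON strings, not nested objects.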
The role: "tool" Message Format
{
    "role": "tool",
    "tool_call_id": "call_abc123",  # Must match the id from the tool_call request
    "content": json.dumps(result)   # Always a string: serialize to JSON
}

Three things matter:
- role must be exactly "tool", not "function", not "system", not "user"
- tool_call_id must match the id field on the tool_call object from the assistant message
- content must be a string: serialize your result dict with json.dumps()
The tool_call_id is how the LLM knows which tool call produced which result. This matters especially for parallel tool calls where multiple results come back in sequence.
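A minimal sketch of ID matching with parallel tool calls follows. The helper `build_tool_messages`, the `fake_call` stand-in, the tool names, and the IDs are all hypothetical; `SimpleNamespace` objects only mimic the `.id`, `.function.name`, and `.function.arguments` attributes of the real tool call objects.

```python
import json
from types import SimpleNamespace


def build_tool_messages(tool_calls, tool_map):
    """Return one role="tool" message per tool call, each carrying its own ID."""
    out = []
    for tc in tool_calls:
        fn = tool_map.get(tc.function.name)
        if fn is None:
            result = {"error": f"Unknown tool: {tc.function.name}"}
        else:
            result = fn(**json.loads(tc.function.arguments))
        # The ID is copied from the call that produced this result
        out.append(
            {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)}
        )
    return out


def fake_call(call_id, name, args):
    """Stand-in for an entry of assistant_message.tool_calls (IDs are made up)."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(args)),
    )


calls = [
    fake_call("call_1", "get_allergies", {"patient_id": "P-00123"}),
    fake_call("call_2", "get_medications", {"patient_id": "P-00123"}),
]
tool_map = {
    "get_allergies": lambda patient_id: {"allergies": ["Penicillin"]},
    "get_medications": lambda patient_id: {"medications": ["Metformin 500mg"]},
}

tool_messages = build_tool_messages(calls, tool_map)
for m in tool_messages:
    print(m["tool_call_id"], m["content"])
```

Because each result message carries the ID of its own call, the pairing stays unambiguous even when two calls use the same tool with different arguments.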
Minimal Correct Implementation
import json
import openai
client = openai.OpenAI()
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_patient_record",
            "description": "Retrieve a patient's medical record by patient ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "patient_id": {
                        "type": "string",
                        "description": "The patient's unique identifier, e.g. 'P-00123'."
                    }
                },
                "required": ["patient_id"]
            }
        }
    }
]

def get_patient_record(patient_id: str) -> dict:
    """Mock patient record lookup."""
    records = {
        "P-00123": {
            "name": "Jane Doe",
            "dob": "1975-03-14",
            "conditions": ["Type 2 Diabetes", "Hypertension"],
            "current_medications": ["Metformin 500mg", "Lisinopril 10mg"],
            "allergies": ["Penicillin"]
        }
    }
    record = records.get(patient_id)
    if not record:
        return {"error": f"Patient {patient_id} not found"}
    return record

def run_agent(user_message: str) -> str:
    messages = [
        {
            "role": "system",
            "content": "You are a clinical assistant. Use tools to look up patient records."
        },
        {"role": "user", "content": user_message}
    ]

    # First LLM call
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto"
    )
    assistant_message = response.choices[0].message

    # If no tool call, return the direct answer
    if not assistant_message.tool_calls:
        return assistant_message.content

    # Append the assistant's message (containing the tool_call request)
    messages.append(assistant_message)

    # Execute each tool call and append the result
    for tool_call in assistant_message.tool_calls:
        fn_name = tool_call.function.name
        fn_args = json.loads(tool_call.function.arguments)

        # Execute
        if fn_name == "get_patient_record":
            result = get_patient_record(**fn_args)
        else:
            result = {"error": f"Unknown tool: {fn_name}"}

        # Append the tool result message
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,   # Critical: must match
            "content": json.dumps(result)   # Must be a string
        })

    # Second LLM call: now the model has the real data
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools
    )
    return final_response.choices[0].message.content

print(run_agent("What medications is patient P-00123 currently taking?"))

Formatting Tool Results: JSON vs Plain Text vs Structured Error
JSON (preferred for structured data)
# Good: the LLM can parse and reason about structured fields
result = {
    "patient_id": "P-00123",
    "medications": ["Metformin 500mg", "Lisinopril 10mg"],
    "last_updated": "2026-05-01"
}
content = json.dumps(result)

Plain Text (acceptable for simple results)
# Acceptable for simple scalar results
result = "Patient P-00123: Jane Doe, DOB 1975-03-14"
content = result  # Already a string

Structured Error
# Errors should be structured too: the LLM reads them
result = {
    "error": "Patient not found",
    "patient_id": "P-99999",
    "suggestion": "Verify the patient ID and try again"
}
content = json.dumps(result)

The LLM will incorporate error information into its response, e.g., "I couldn't find patient P-99999 in the system. Could you double-check the ID?"
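The three formats can be funneled through one conversion step. The helper below, `to_tool_content`, is a hypothetical name for this sketch: it passes plain strings through, serializes everything else with `json.dumps`, and falls back to a structured error if serialization fails.

```python
import json
from datetime import date


def to_tool_content(result) -> str:
    """Convert a tool's return value into the string the content field needs."""
    if isinstance(result, str):
        return result  # Plain text passes through unchanged
    try:
        # default=str converts values json cannot encode (e.g. dates) to strings
        return json.dumps(result, default=str)
    except (TypeError, ValueError) as exc:
        # Raised for cases such as circular references or unsupported dict keys
        return json.dumps({"error": f"Could not serialize tool result: {exc}"})


print(to_tool_content("Patient P-00123: Jane Doe"))
print(to_tool_content({"patient_id": "P-00123", "last_updated": date(2026, 5, 1)}))
print(to_tool_content({"error": "Patient not found", "patient_id": "P-99999"}))
```

Using `default=str` is a design choice of this sketch: it keeps the tool from crashing on a stray date or decimal, at the cost of losing the original type.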
Handling Large Tool Results
LLMs have context limits. A tool that returns a 50,000-token database dump will either fail or crowd out useful context. Three strategies:
Strategy 1: Truncate at the tool level
# `database` and `format_results` are assumed to be defined elsewhere in your application
def search_medical_literature(query: str, max_chars: int = 4000) -> dict:
    """Search and return truncated results."""
    results = database.search(query)
    full_text = format_results(results)
    if len(full_text) > max_chars:
        truncated = full_text[:max_chars]
        return {
            "results": truncated,
            "truncated": True,
            "total_results": len(results),
            "returned_chars": max_chars,
            "note": "Results truncated. Ask for a more specific query for complete data."
        }
    return {"results": full_text, "truncated": False, "total_results": len(results)}

Strategy 2: Paginate
def get_patient_history(
    patient_id: str,
    page: int = 1,
    page_size: int = 10
) -> dict:
    """Paginated patient history."""
    all_events = database.get_events(patient_id)
    total = len(all_events)
    start = (page - 1) * page_size
    end = start + page_size
    return {
        "patient_id": patient_id,
        "page": page,
        "page_size": page_size,
        "total_events": total,
        "total_pages": (total + page_size - 1) // page_size,
        "events": all_events[start:end],
        "has_more": end < total
    }

The LLM can call get_patient_history with page=2 on the next turn if it needs more data.
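To see the paging arithmetic in action without a database, here is a small sketch that applies the same formulas to an in-memory list. The `paginate` helper and the 25 fake events are assumptions made for this illustration; the loop plays the role of the LLM requesting the next page while `has_more` is true.

```python
def paginate(items: list, page: int = 1, page_size: int = 10) -> dict:
    """Same arithmetic as get_patient_history, applied to an in-memory list."""
    total = len(items)
    start = (page - 1) * page_size
    end = start + page_size
    return {
        "page": page,
        "total_pages": (total + page_size - 1) // page_size,  # Ceiling division
        "events": items[start:end],
        "has_more": end < total,
    }


events = [f"event-{i}" for i in range(1, 26)]  # 25 made-up events

collected = []
page = 1
while True:
    result = paginate(events, page=page, page_size=10)
    collected.extend(result["events"])
    if not result["has_more"]:
        break
    page += 1

print(page, len(collected))  # prints: 3 25
```

The last page holds only five events, and `has_more` turns false there because `end` (30) is no longer below the total (25).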
Strategy 3: Summarize inside the tool
import openai
summarizer = openai.OpenAI()
def get_research_summary(topic: str) -> dict:
    """Fetch research papers and return an LLM-generated summary."""
    raw_papers = fetch_papers_from_pubmed(topic, limit=20)
    full_text = "\n\n".join(p["abstract"] for p in raw_papers)

    # Use a separate, cheap LLM call to summarize before returning
    summary_response = summarizer.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "user",
                "content": f"Summarize these research abstracts in under 500 words:\n\n{full_text}"
            }
        ],
        max_tokens=600
    )
    return {
        "topic": topic,
        "papers_found": len(raw_papers),
        "summary": summary_response.choices[0].message.content
    }

The Complete Multi-Turn Agent Loop
A robust agent handles multiple rounds of tool calls: the LLM may call a tool, receive the result, and then call another tool before giving a final answer.
import json
import openai
from typing import Callable
client = openai.OpenAI()
def run_agent_loop(
    user_message: str,
    tools: list,
    tool_map: dict[str, Callable],
    system_prompt: str = "You are a helpful assistant.",
    max_iterations: int = 10
) -> str:
    """
    General-purpose agentic loop.
    Continues calling the LLM until:
    - It returns a text response (no tool calls)
    - max_iterations is reached
    """
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]

    for iteration in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        msg = response.choices[0].message

        # If no tool calls, the LLM is done: return the answer
        if not msg.tool_calls:
            return msg.content or ""

        # Append the assistant message with tool_calls
        messages.append(msg)

        # Execute all tool calls in this batch
        for tool_call in msg.tool_calls:
            fn_name = tool_call.function.name
            fn_args = json.loads(tool_call.function.arguments)
            print(f"[Iteration {iteration + 1}] Calling {fn_name}({fn_args})")

            if fn_name in tool_map:
                try:
                    result = tool_map[fn_name](**fn_args)
                except Exception as e:
                    result = {"error": str(e), "tool": fn_name}
            else:
                result = {"error": f"Unknown tool: {fn_name}"}

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

    # If we hit max_iterations, return a fallback
    return "I was unable to complete the request within the allowed number of steps."

# Example usage with drug lookup agent
drug_tools = [
    {
        "type": "function",
        "function": {
            "name": "get_drug_info",
            "description": "Get drug information including dosage and interactions.",
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_name": {"type": "string", "description": "Drug name."},
                    "info_type": {
                        "type": "string",
                        "enum": ["dosage", "interactions", "all"],
                        "description": "Type of information needed."
                    }
                },
                "required": ["drug_name", "info_type"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_patient_allergies",
            "description": "Get a patient's known drug allergies.",
            "parameters": {
                "type": "object",
                "properties": {
                    "patient_id": {"type": "string", "description": "Patient ID."}
                },
                "required": ["patient_id"]
            }
        }
    }
]

def get_drug_info(drug_name: str, info_type: str) -> dict:
    return {
        "drug": drug_name,
        "dosage": "10mg once daily",
        "interactions": ["Warfarin: increased bleeding risk"]
    }

def get_patient_allergies(patient_id: str) -> dict:
    return {
        "patient_id": patient_id,
        "allergies": ["Penicillin", "Sulfonamides"]
    }

tool_map = {
    "get_drug_info": get_drug_info,
    "get_patient_allergies": get_patient_allergies
}

answer = run_agent_loop(
    user_message="Is Atorvastatin safe for patient P-00123? Check their allergies first.",
    tools=drug_tools,
    tool_map=tool_map,
    system_prompt=(
        "You are a clinical safety assistant. "
        "Always check patient allergies before confirming drug safety."
    )
)
print(answer)

Common Mistakes and How to Fix Them
Mistake: Not appending the assistant message before the tool result
# Wrong: missing the assistant message
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(result)
})

# Correct: assistant message comes first
messages.append(assistant_message)  # This must come before tool results
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(result)
})

Mistake: Passing a dict instead of a string as content
# Wrong: content must be a string
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": result  # This is a dict and will cause an API error
})

# Correct
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(result)  # Serialize to string
})

Mistake: Using a hardcoded or wrong tool_call_id
# Wrong: hardcoded ID
messages.append({
    "role": "tool",
    "tool_call_id": "my_fixed_id",  # Won't match the actual tool_call.id
    "content": json.dumps(result)
})

# Correct: always use the id from the tool_call object
# (execute_tool is a placeholder for your own dispatch function)
for tool_call in msg.tool_calls:
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,  # From the response object
        "content": json.dumps(execute_tool(tool_call))
    })

Summary
| Step | What To Do |
|---|---|
| Receive tool call | Read msg.tool_calls; each has .id, .function.name, .function.arguments |
| Append assistant msg | Add msg to messages before any tool results |
| Execute tool | Call your function with json.loads(tc.function.arguments) |
| Return result | Append {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)} |
| Get final answer | Call the LLM again with the full updated message list |
| Large results | Truncate, paginate, or pre-summarize inside the tool function |
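
The rules in this table can also be checked mechanically before a request is sent. The sketch below is one possible pre-flight check, not part of any SDK: `check_tool_messages` is a hypothetical helper, and it assumes every message is a plain dict, so SDK message objects would need converting to dicts first.

```python
def check_tool_messages(messages: list[dict]) -> list[str]:
    """Return a list of problems found in the tool-related messages."""
    problems = []
    pending = set()  # IDs requested by the latest assistant message, not yet answered
    for i, m in enumerate(messages):
        role = m.get("role")
        if role == "tool":
            if not isinstance(m.get("content"), str):
                problems.append(f"message {i}: content is not a string")
            call_id = m.get("tool_call_id")
            if call_id in pending:
                pending.discard(call_id)
            else:
                problems.append(
                    f"message {i}: tool_call_id {call_id!r} matches no pending tool call"
                )
            continue
        if pending:
            # A non-tool message arrived while tool calls were still unanswered
            problems.append(f"message {i}: unanswered tool calls {sorted(pending)}")
        if role == "assistant":
            pending = {tc["id"] for tc in (m.get("tool_calls") or [])}
        else:
            pending = set()
    return problems


assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_patient_record", "arguments": "{}"}}
    ],
}
good = [
    {"role": "user", "content": "Look up P-00123"},
    assistant,
    {"role": "tool", "tool_call_id": "call_1", "content": "{}"},
]
bad = [
    {"role": "user", "content": "Look up P-00123"},
    {"role": "tool", "tool_call_id": "my_fixed_id", "content": {"name": "Jane Doe"}},
]

print(check_tool_messages(good))
print(check_tool_messages(bad))
```

The `bad` list triggers two of the mistakes covered earlier at once: a dict passed as content, and a tool result with no matching assistant tool call before it.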