How the LLM Decides Which Tool to Call
Understand the mechanism behind tool selection: how descriptions, context, and tool_choice settings influence which function gets called and when.
The Selection Mechanism
The LLM does not run a keyword-matching algorithm to pick tools. It performs full semantic reasoning across the entire conversation context (the system prompt, conversation history, user message, and every tool description) and decides which action, if any, is most appropriate.
In practice, three factors dominate the decision:
- Description relevance: Does this tool description match what the user is asking for?
- Conversation context: What has been established in earlier turns that makes one tool more appropriate than another?
- tool_choice setting: The explicit constraint you set in the API call
The tool_choice Parameter
tool_choice is your primary lever for controlling whether and which tool gets called.
import openai

client = openai.OpenAI()

# Option 1: auto - LLM decides
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"  # Default. Model may or may not call a tool.
)

# Option 2: none - LLM cannot call tools in this turn
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="none"  # Forces a text response. Use for summaries, clarifications.
)

# Option 3: required - LLM must call at least one tool
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="required"  # Use when you always need structured output.
)

# Option 4: specific tool - force the LLM to call one specific tool
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice={
        "type": "function",
        "function": {"name": "get_drug_info"}  # Must call this exact tool
    }
)

When to use each:
| Setting | Use Case |
|---|---|
| "auto" | Most agent interactions; let the model reason |
| "none" | You want a summary or explanation after tool results are in |
| "required" | You need structured JSON output (use tools as structured output) |
| specific tool | Data extraction where you always need one particular schema |
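The table can be folded into a small helper that picks the setting for each turn. This is only a sketch: the stage names (`"summarize"`, `"extract"`) and the `pick_tool_choice` function are hypothetical and introduced here for illustration; only the returned values mirror the `tool_choice` options listed above.

```python
from typing import Optional, Union

ToolChoice = Union[str, dict]


def pick_tool_choice(stage: str, forced_tool: Optional[str] = None) -> ToolChoice:
    """Map a (hypothetical) conversation stage to a tool_choice value."""
    if forced_tool is not None:
        # Force one specific function, e.g. for schema-driven extraction.
        return {"type": "function", "function": {"name": forced_tool}}
    if stage == "summarize":
        # Tool results are already in the message list; ask for prose only.
        return "none"
    if stage == "extract":
        # Structured output is mandatory, but any tool may supply it.
        return "required"
    # Everything else: let the model decide.
    return "auto"


print(pick_tool_choice("chat"))
print(pick_tool_choice("summarize"))
print(pick_tool_choice("extract", forced_tool="get_drug_info"))
```

The return value can be passed straight through as the `tool_choice` argument of `client.chat.completions.create`.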
How Description Matching Works: A Demo
import json
import openai
client = openai.OpenAI()
# Two tools with overlapping domains but different scopes
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_drug_interactions",
            "description": (
                "Check for known interactions between two or more drugs. "
                "Use this when the user asks whether it is safe to combine medications, "
                "or asks about drug-drug interactions. Do NOT use for general drug information."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_a": {
                        "type": "string",
                        "description": "Name of the first drug."
                    },
                    "drug_b": {
                        "type": "string",
                        "description": "Name of the second drug."
                    }
                },
                "required": ["drug_a", "drug_b"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_drug_dosage",
            "description": (
                "Retrieve the standard dosage and administration guidelines for a single drug. "
                "Use this when the user asks how much of a medication to take, how often, "
                "or how to administer it. Do NOT use for interaction questions."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_name": {
                        "type": "string",
                        "description": "The name of the drug."
                    },
                    "patient_weight_kg": {
                        "type": "number",
                        "description": "Optional. Patient weight for weight-based dosing."
                    }
                },
                "required": ["drug_name"]
            }
        }
    }
]
def check_which_tool_is_called(user_message: str) -> str:
    """Send one user message and report which tool, if any, the model picked."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}],
        tools=tools,
        tool_choice="auto"
    )
    msg = response.choices[0].message
    if msg.tool_calls:
        tc = msg.tool_calls[0]  # report only the first call
        return f"Tool: {tc.function.name} | Args: {tc.function.arguments}"
    # content can be None, so guard before slicing
    return f"No tool called. Response: {(msg.content or '')[:100]}"
# Query that clearly maps to search_drug_interactions
print(check_which_tool_is_called("Is it safe to take Metformin and Ibuprofen together?"))
# Typical: Tool: search_drug_interactions | Args: {"drug_a": "Metformin", "drug_b": "Ibuprofen"}

# Query that clearly maps to get_drug_dosage
print(check_which_tool_is_called("What's the usual dose of Lisinopril for hypertension?"))
# Typical: Tool: get_drug_dosage | Args: {"drug_name": "Lisinopril"}

# Ambiguous query: which tool wins?
print(check_which_tool_is_called("Tell me about Aspirin."))
# Likely: No tool called (or get_drug_dosage, the only tool whose description
# does not rule out general questions)

What Happens When Multiple Tools Could Match
When a query could plausibly invoke multiple tools, the LLM picks the one whose description most closely matches the intent. You can influence this by:
1. Making descriptions mutually exclusive:
# Bad: both tools could handle "tell me about Metformin"
bad_descriptions = {
    "get_drug_info": "Provides information about drugs.",
    "get_drug_dosage": "Provides information about drug dosing.",
}

# Good: descriptions carve out distinct territory
good_descriptions = {
    "get_drug_info": (
        "Returns the clinical profile of a drug: mechanism of action, "
        "approved indications, side effect profile, and pharmacokinetics. "
        "Do NOT use for dosage questions; use get_drug_dosage instead."
    ),
    "get_drug_dosage": (
        "Returns dosage and administration guidelines only: how much, how often, "
        "and how to take a drug. Do NOT use for general drug information; "
        "use get_drug_info instead."
    ),
}

2. Cross-referencing other tools in descriptions:
Telling the LLM which tool to use instead is remarkably effective. When the model reads "Do NOT use for X; use tool_Y instead," it tends to follow the redirect, although this is a strong hint rather than a guaranteed constraint.
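A quick check can catch descriptions that never point at a sibling tool. The sketch below is an illustration, not part of any SDK: `missing_cross_references` is a hypothetical helper, and it uses a plain substring test, so one tool name that is a prefix of another would produce a false match.

```python
def missing_cross_references(tools: list[dict]) -> list[str]:
    """Return names of tools whose description mentions no sibling tool."""
    names = [t["function"]["name"] for t in tools]
    flagged = []
    for tool in tools:
        fn = tool["function"]
        siblings = [n for n in names if n != fn["name"]]
        description = fn.get("description", "")
        # Flag the tool if none of its siblings is named in the description.
        if siblings and not any(s in description for s in siblings):
            flagged.append(fn["name"])
    return flagged


example_tools = [
    {"type": "function", "function": {
        "name": "get_drug_info",
        "description": "Returns the clinical profile of a drug. "
                       "Do NOT use for dosage questions; use get_drug_dosage instead.",
    }},
    {"type": "function", "function": {
        "name": "get_drug_dosage",
        "description": "Returns dosage and administration guidelines only.",
    }},
]

print(missing_cross_references(example_tools))  # ['get_drug_dosage']
```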
3. Using parallel tool calls for genuinely ambiguous multi-part queries:
# User: "What's the dose of Metformin and does it interact with alcohol?"
# This legitimately needs both tools.
# gpt-4o will often call both in parallel:
# tool_calls: [
#   {function: {name: "get_drug_dosage", arguments: '{"drug_name": "Metformin"}'}},
#   {function: {name: "search_drug_interactions", arguments: '{"drug_a": "Metformin", "drug_b": "Alcohol"}'}}
# ]

Parallel Tool Calls
OpenAI models can return multiple tool calls in a single response. This happens when the LLM determines that several independent pieces of information are needed to answer the query.
import json
import openai

client = openai.OpenAI()
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_drug_dosage",
            "description": "Get dosage information for a drug.",
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_name": {"type": "string", "description": "Drug name."}
                },
                "required": ["drug_name"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_drug_interactions",
            "description": "Check interactions between two drugs.",
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_a": {"type": "string"},
                    "drug_b": {"type": "string"}
                },
                "required": ["drug_a", "drug_b"]
            }
        }
    }
]
# Stub implementations that return canned data
def get_drug_dosage(drug_name: str) -> dict:
    return {"drug": drug_name, "dose": "500mg twice daily", "route": "oral"}


def search_drug_interactions(drug_a: str, drug_b: str) -> dict:
    return {
        "drug_a": drug_a,
        "drug_b": drug_b,
        "interaction": "Monitor blood glucose; NSAIDs may impair renal metformin clearance",
        "severity": "moderate"
    }


# Maps tool names returned by the model to local Python functions
TOOL_MAP = {
    "get_drug_dosage": get_drug_dosage,
    "search_drug_interactions": search_drug_interactions,
}
def handle_parallel_tool_calls(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto"
    )
    msg = response.choices[0].message
    if not msg.tool_calls:
        return msg.content
    print(f"LLM requested {len(msg.tool_calls)} tool call(s)")
    # The assistant message carrying the tool calls must precede the tool results
    messages.append(msg)
    # Execute all tool calls and collect results
    for tc in msg.tool_calls:
        fn_name = tc.function.name
        fn_args = json.loads(tc.function.arguments)
        print(f" Executing: {fn_name}({fn_args})")
        result = TOOL_MAP[fn_name](**fn_args)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result)
        })
    # All results are now in the message list; get the final answer
    final = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools
    )
    return final.choices[0].message.content
# This query needs both tools
result = handle_parallel_tool_calls(
    "What's the dose of Metformin, and does it interact with Ibuprofen?"
)
print(result)

Expected output (the wording of the final answer will vary between runs):
LLM requested 2 tool call(s)
 Executing: get_drug_dosage({'drug_name': 'Metformin'})
 Executing: search_drug_interactions({'drug_a': 'Metformin', 'drug_b': 'Ibuprofen'})
Metformin is typically taken at 500mg twice daily by mouth. Regarding interactions:
there is a moderate interaction between Metformin and Ibuprofen. NSAIDs may impair
renal metformin clearance, so blood glucose should be monitored if both are used.

Handling Ambiguous Queries
Ambiguous queries are the main source of wrong tool selections. Examples:
- "Tell me about Aspirin": general info? dosage? interactions? history?
- "Is Metformin safe?": safe for whom? safe with what drug? side effects?
- "Check my prescription": check what exactly?
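One way to find out which phrasings are ambiguous for a given tool set is to send the same query several times and count which tool comes back. The sketch below assumes a `select_tool` callable that wraps the API call and returns the chosen tool name, or `None` when no tool is called. Both that callable and the `tally_tool_choices` helper are hypothetical, and the stub stands in for a real model call.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Optional


def tally_tool_choices(
    select_tool: Callable[[str], Optional[str]],
    query: str,
    runs: int = 5,
) -> Counter:
    """Call select_tool(query) `runs` times and count each outcome."""
    counts: Counter = Counter()
    for _ in range(runs):
        counts[select_tool(query) or "no_tool"] += 1
    return counts


def is_ambiguous(counts: Counter) -> bool:
    """Treat a query as ambiguous when more than one outcome was observed."""
    return len(counts) > 1


# Stub that alternates between two tools, standing in for a real model call.
_fake_answers = cycle(["get_drug_dosage", "search_drug_interactions"])


def fake_select_tool(query: str) -> Optional[str]:
    return next(_fake_answers)


counts = tally_tool_choices(fake_select_tool, "Tell me about Aspirin", runs=4)
print(counts, is_ambiguous(counts))
```

Queries that split across tools are good candidates for tighter descriptions or for one of the strategies below.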
Strategy 1: Ask for clarification before calling tools
Add to your system prompt:
system_prompt = """
You are a clinical pharmacist assistant.
If a user's query is ambiguous about what type of drug information they need,
ask one clarifying question before calling any tool. For example:
- If they ask "tell me about [drug]", ask whether they want dosage, interactions, or side effects.
- If they ask "is [drug] safe", ask whether they're asking about general safety or a specific interaction.
Once you understand the intent, use the appropriate tool.
"""

Strategy 2: Default to the most useful tool for your domain
If your app is primarily a dosage checker, make get_drug_dosage the default and note in its description that it is the primary reference for general drug questions.
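As a rough sketch of that idea, the default can be marked by appending a sentence to the tool's description. The `mark_as_default` helper and the exact wording of the note are assumptions made for illustration; the tool definition is a trimmed copy of the one used earlier.

```python
import copy


def mark_as_default(tool: dict, note: str) -> dict:
    """Return a copy of a tool definition with `note` appended to its description."""
    marked = copy.deepcopy(tool)  # leave the original definition untouched
    fn = marked["function"]
    fn["description"] = f'{fn["description"].rstrip()} {note}'
    return marked


dosage_tool = {
    "type": "function",
    "function": {
        "name": "get_drug_dosage",
        "description": "Retrieve the standard dosage and administration guidelines for a single drug.",
        "parameters": {
            "type": "object",
            "properties": {"drug_name": {"type": "string"}},
            "required": ["drug_name"],
        },
    },
}

default_tool = mark_as_default(
    dosage_tool,
    "This is the primary reference for general drug questions; "
    "use it when no other tool clearly applies.",
)
print(default_tool["function"]["description"])
```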
Strategy 3: Use a routing tool
Add a meta-tool that classifies intent before calling a specialized tool:
tools = [
    {
        "type": "function",
        "function": {
            "name": "classify_drug_query",
            "description": (
                "Use this first for any drug-related question. "
                "Classifies the user's intent so the right specialized tool can be called next."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "intent": {
                        "type": "string",
                        "enum": ["dosage", "interaction", "side_effects", "mechanism", "other"],
                        "description": "The primary intent of the user's drug question."
                    },
                    "drugs_mentioned": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "All drug names mentioned by the user."
                    }
                },
                "required": ["intent", "drugs_mentioned"]
            }
        }
    }
    # ... other specialized tools
]

Context Influences Tool Selection
Earlier messages in the conversation affect which tool the LLM picks. This is important for multi-turn agents.
messages = [
    {"role": "system", "content": "You are a pharmacy assistant."},
    {"role": "user", "content": "I was just prescribed Metformin."},
    {"role": "assistant", "content": "Congratulations on starting Metformin! It's a first-line treatment for type 2 diabetes. Do you have any questions about it?"},
    # The next message: context establishes we're discussing Metformin
    {"role": "user", "content": "What should I watch out for?"}
    # The LLM now knows "what should I watch out for?" means Metformin side effects/interactions
    # It will likely call the drug info tool with drug_name="Metformin" rather than ask for clarification
]

This is both a feature and a risk. If a user introduces a new drug mid-conversation, make sure your conversation management resets or correctly propagates context.
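One way to handle a mid-conversation switch is to track which drug the conversation is currently about and update it whenever the user names a different one. The sketch below is illustrative only: `KNOWN_DRUGS`, `find_drugs`, and `update_active_drug` are hypothetical names, and a real system would use a proper drug-name recognizer instead of a fixed set.

```python
import re
from typing import Optional

# Assumed vocabulary for the example; not a real drug database.
KNOWN_DRUGS = {"metformin", "ibuprofen", "lisinopril", "aspirin"}


def find_drugs(text: str) -> list[str]:
    """Return known drug names in order of appearance (case-insensitive)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w in KNOWN_DRUGS]


def update_active_drug(active: Optional[str], user_message: str) -> Optional[str]:
    """Switch the active drug when the user names a different one."""
    mentioned = find_drugs(user_message)
    if mentioned and mentioned[-1] != active:
        return mentioned[-1]
    return active


active = None
for turn in [
    "I was just prescribed Metformin.",
    "What should I watch out for?",
    "Actually, what about Lisinopril?",
]:
    active = update_active_drug(active, turn)
    print(f"{turn!r} -> active drug: {active}")
```

The tracked value could then be injected as a short system message before each request, so the model does not rely on stale context.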
Key Takeaways
- The LLM selects tools through semantic reasoning, not keyword matching
- tool_choice="auto" is correct for most use cases; use "required" or a specific tool when you always need structure
- Write descriptions that explicitly say what each tool does NOT handle and refer to sibling tools
- Parallel tool calls happen automatically when the query needs multiple independent answers
- Ambiguous queries are best handled by prompting the LLM to ask a clarifying question before tool use
- Conversation context influences tool selection, so design multi-turn flows with this in mind