How the LLM Decides Which Tool to Call — Agents & Tools Interview Prep | Learnixo

The Selection Mechanism

The LLM does not run a keyword-matching algorithm to pick tools. It conditions on the entire conversation context (the system prompt, conversation history, user message, and every tool's name, description, and parameter schema) and generates whichever action, if any, it judges most appropriate.

In practice, three factors dominate the decision:

Description relevance — Does this tool description match what the user is asking for?
Conversation context — What has been established in earlier turns that makes one tool more appropriate than another?
tool_choice setting — The explicit constraint you set in the API call

The tool_choice Parameter

tool_choice is your primary lever for controlling whether and which tool gets called.

Python

import openai

client = openai.OpenAI()

# Option 1: auto — LLM decides
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"  # Default. Model may or may not call a tool.
)

# Option 2: none — LLM cannot call tools in this turn
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="none"  # Forces a text response. Use for summaries, clarifications.
)

# Option 3: required — LLM must call at least one tool
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="required"  # Use when you always need structured output.
)

# Option 4: specific tool — Force the LLM to call one specific tool
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice={
        "type": "function",
        "function": {"name": "get_drug_info"}  # Must call this exact tool
    }
)

When to use each:

| Setting | Use Case |
|---|---|
| "auto" | Most agent interactions; let the model reason |
| "none" | You want a summary or explanation after tool results are in |
| "required" | You need structured JSON output (use tools as structured output) |
| specific tool | Data extraction where you always need one particular schema |
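The table can be condensed into a small helper. This is a minimal sketch: the function name `pick_tool_choice` and its three flags are hypothetical and not part of the OpenAI SDK; only the returned values follow the `tool_choice` shapes shown above.

```python
from typing import Optional, Union

ToolChoice = Union[str, dict]


def pick_tool_choice(
    summarizing: bool = False,
    must_call_tool: bool = False,
    forced_tool: Optional[str] = None,
) -> ToolChoice:
    """Map a situation to a tool_choice value (hypothetical helper)."""
    if forced_tool is not None:
        # Data extraction that always needs one particular schema.
        return {"type": "function", "function": {"name": forced_tool}}
    if summarizing:
        # Tool results are already in the message list; ask for prose only.
        return "none"
    if must_call_tool:
        # Any tool is acceptable, but a plain-text reply is not.
        return "required"
    # Default: let the model decide.
    return "auto"


print(pick_tool_choice())                   # auto
print(pick_tool_choice(summarizing=True))   # none
print(pick_tool_choice(forced_tool="get_drug_info"))
```

The order of the checks is a design choice: a forced tool takes priority because it is the most specific constraint, and "auto" is the fallback when nothing else applies.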

How Description Matching Works: A Demo

Python

import json
import openai

client = openai.OpenAI()

# Two tools with overlapping domains but different scopes
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_drug_interactions",
            "description": (
                "Check for known interactions between two or more drugs. "
                "Use this when the user asks whether it is safe to combine medications, "
                "or asks about drug-drug interactions. Do NOT use for general drug information."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_a": {
                        "type": "string",
                        "description": "Name of the first drug."
                    },
                    "drug_b": {
                        "type": "string",
                        "description": "Name of the second drug."
                    }
                },
                "required": ["drug_a", "drug_b"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_drug_dosage",
            "description": (
                "Retrieve the standard dosage and administration guidelines for a single drug. "
                "Use this when the user asks how much of a medication to take, how often, "
                "or how to administer it. Do NOT use for interaction questions."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_name": {
                        "type": "string",
                        "description": "The name of the drug."
                    },
                    "patient_weight_kg": {
                        "type": "number",
                        "description": "Optional. Patient weight for weight-based dosing."
                    }
                },
                "required": ["drug_name"]
            }
        }
    }
]

def check_which_tool_is_called(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}],
        tools=tools,
        tool_choice="auto"
    )
    msg = response.choices[0].message
    if msg.tool_calls:
        tc = msg.tool_calls[0]
        return f"Tool: {tc.function.name} | Args: {tc.function.arguments}"
    # content can be None (for example on a refusal), so guard before slicing
    return f"No tool called. Response: {(msg.content or '')[:100]}"

# Query that clearly maps to search_drug_interactions
print(check_which_tool_is_called("Is it safe to take Metformin and Ibuprofen together?"))
# Tool: search_drug_interactions | Args: {"drug_a": "Metformin", "drug_b": "Ibuprofen"}

# Query that clearly maps to get_drug_dosage
print(check_which_tool_is_called("What's the usual dose of Lisinopril for hypertension?"))
# Tool: get_drug_dosage | Args: {"drug_name": "Lisinopril"}

# Ambiguous query: neither description covers general drug information
print(check_which_tool_is_called("Tell me about Aspirin."))
# Likely: no tool called. The model may still fall back to get_drug_dosage,
# and the result can differ between runs and model versions.
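Because selection is probabilistic, it helps to keep a small set of labeled queries and measure how often the expected tool is picked after every description change. The sketch below is an assumption about how such a check could look: `selection_accuracy` and `fake_selector` are hypothetical names, and `fake_selector` is only an offline stand-in for a function that would call the API and return `msg.tool_calls[0].function.name` (or `None`).

```python
from typing import Callable, Optional

# (query, expected tool name, or None when no tool should be called)
CASES = [
    ("Is it safe to take Metformin and Ibuprofen together?", "search_drug_interactions"),
    ("What's the usual dose of Lisinopril for hypertension?", "get_drug_dosage"),
    ("Hello!", None),
]


def selection_accuracy(
    select_tool: Callable[[str], Optional[str]],
    cases: list[tuple[str, Optional[str]]],
) -> tuple[float, list[str]]:
    """Return (accuracy, queries for which the wrong tool was chosen)."""
    misses = [query for query, expected in cases if select_tool(query) != expected]
    accuracy = 1 - len(misses) / len(cases)
    return accuracy, misses


def fake_selector(query: str) -> Optional[str]:
    """Stand-in for a real model call so the harness runs offline."""
    text = query.lower()
    if "together" in text or "interact" in text:
        return "search_drug_interactions"
    if "dose" in text:
        return "get_drug_dosage"
    return None


accuracy, misses = selection_accuracy(fake_selector, CASES)
print(f"accuracy={accuracy:.2f}, misses={misses}")
# accuracy=1.00, misses=[]
```

With a real model behind `select_tool`, running each case several times gives a better picture than a single pass, since the same query can produce different selections.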

What Happens When Multiple Tools Could Match

When a query could plausibly invoke multiple tools, the LLM tends to pick the one whose description most closely matches the intent, and the choice can vary between runs. You can influence this by:

1. Making descriptions mutually exclusive:

Python

# Bad: both tools could handle "tell me about Metformin"
vague_descriptions = {
    "get_drug_info": "Provides information about drugs.",
    "get_drug_dosage": "Provides information about drug dosing.",
}

# Good: descriptions carve out distinct territory
distinct_descriptions = {
    "get_drug_info": (
        "Returns the clinical profile of a drug: mechanism of action, "
        "approved indications, side effect profile, and pharmacokinetics. "
        "Do NOT use for dosage questions; use get_drug_dosage instead."
    ),
    "get_drug_dosage": (
        "Returns dosage and administration guidelines only: how much, how often, "
        "and how to take a drug. Do NOT use for general drug information; "
        "use get_drug_info instead."
    ),
}

2. Cross-referencing other tools in descriptions:

Telling the LLM which tool to use instead is remarkably effective. When the model reads "Do NOT use for X; use tool_Y instead," it usually follows that instruction. It is still a strong hint rather than a guarantee, so verify it against real queries.
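A quick way to keep cross-references from going stale is to lint the descriptions. The following is a minimal sketch under the assumption that descriptions are held in a plain name-to-text dictionary; `missing_cross_references` is a hypothetical helper, not a library function.

```python
import re


def missing_cross_references(descriptions: dict[str, str]) -> dict[str, list[str]]:
    """For each tool, list the sibling tools its description never mentions."""
    report = {}
    for name, text in descriptions.items():
        siblings = [other for other in descriptions if other != name]
        # Word-boundary match so "get_drug" does not match "get_drug_dosage".
        missing = [
            other for other in siblings
            if not re.search(rf"\b{re.escape(other)}\b", text)
        ]
        if missing:
            report[name] = missing
    return report


descriptions = {
    "get_drug_info": (
        "Returns the clinical profile of a drug. "
        "Do NOT use for dosage questions; use get_drug_dosage instead."
    ),
    "get_drug_dosage": "Returns dosage and administration guidelines only.",
}
print(missing_cross_references(descriptions))
# {'get_drug_dosage': ['get_drug_info']}
```

With many tools, requiring every description to name every sibling would be excessive. Treat the report as a prompt to review tools with overlapping domains, not as a hard rule.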

3. Using parallel tool calls for genuinely multi-part queries:

Python

# User: "What's the dose of Metformin and does it interact with alcohol?"
# This legitimately needs both tools

# gpt-4o will often call both in parallel:
# tool_calls: [
#   {function: {name: "get_drug_dosage", arguments: '{"drug_name": "Metformin"}'}},
#   {function: {name: "search_drug_interactions", arguments: '{"drug_a": "Metformin", "drug_b": "Alcohol"}'}}
# ]

Parallel Tool Calls

OpenAI models can return multiple tool calls in a single response. This happens when the LLM determines that several independent pieces of information are needed to answer the query.
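If your executor can only handle one call per turn, the Chat Completions API accepts a `parallel_tool_calls` flag that can be set to `False`. The sketch below only assembles the request arguments; `build_request_kwargs` is a hypothetical helper, and the model name is the one used elsewhere in this article.

```python
def build_request_kwargs(messages: list, tools: list, allow_parallel: bool = True) -> dict:
    """Assemble keyword arguments for client.chat.completions.create (hypothetical helper)."""
    kwargs = {
        "model": "gpt-4o",
        "messages": messages,
        "tools": tools,
        "tool_choice": "auto",
    }
    if not allow_parallel:
        # Ask the model to return at most one tool call per response.
        kwargs["parallel_tool_calls"] = False
    return kwargs


kwargs = build_request_kwargs(
    messages=[{"role": "user", "content": "What's the dose of Metformin?"}],
    tools=[],  # pass the real tool list here
    allow_parallel=False,
)
print(kwargs["parallel_tool_calls"])  # False
```

The result would be passed as `client.chat.completions.create(**kwargs)`. Leaving the flag unset keeps the default behavior, in which the model may return several calls at once.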

Python

import json
import openai

client = openai.OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_drug_dosage",
            "description": "Get dosage information for a drug.",
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_name": {"type": "string", "description": "Drug name."}
                },
                "required": ["drug_name"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_drug_interactions",
            "description": "Check interactions between two drugs.",
            "parameters": {
                "type": "object",
                "properties": {
                    "drug_a": {"type": "string"},
                    "drug_b": {"type": "string"}
                },
                "required": ["drug_a", "drug_b"]
            }
        }
    }
]

def get_drug_dosage(drug_name: str) -> dict:
    return {"drug": drug_name, "dose": "500mg twice daily", "route": "oral"}

def search_drug_interactions(drug_a: str, drug_b: str) -> dict:
    return {
        "drug_a": drug_a,
        "drug_b": drug_b,
        "interaction": "Monitor blood glucose — NSAIDs may impair renal metformin clearance",
        "severity": "moderate"
    }

TOOL_MAP = {
    "get_drug_dosage": get_drug_dosage,
    "search_drug_interactions": search_drug_interactions,
}

def handle_parallel_tool_calls(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto"
    )

    msg = response.choices[0].message

    if not msg.tool_calls:
        return msg.content

    print(f"LLM requested {len(msg.tool_calls)} tool call(s)")
    messages.append(msg)

    # Execute all tool calls and collect results
    for tc in msg.tool_calls:
        fn_name = tc.function.name
        fn_args = json.loads(tc.function.arguments)

        print(f"  Executing: {fn_name}({fn_args})")
        result = TOOL_MAP[fn_name](**fn_args)

        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result)
        })

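    # All results are now in the message list. Request the final answer as text;
    # tool_choice="none" prevents another round of tool calls in this turn.
    final = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="none"
    )
    return final.choices[0].message.content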

# This query needs both tools
result = handle_parallel_tool_calls(
    "What's the dose of Metformin, and does it interact with Ibuprofen?"
)
print(result)

Example output (the stub tools return fixed data; the wording of the final answer will vary between runs):

LLM requested 2 tool call(s)
  Executing: get_drug_dosage({'drug_name': 'Metformin'})
  Executing: search_drug_interactions({'drug_a': 'Metformin', 'drug_b': 'Ibuprofen'})

Metformin is typically taken at 500mg twice daily by mouth. Regarding interactions:
there is a moderate interaction between Metformin and Ibuprofen — NSAIDs may impair
renal metformin clearance, so blood glucose should be monitored if both are used.
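The loop above assumes the model always names a known tool and sends well-formed arguments. A more defensive dispatcher might look like the sketch below. `run_tool_call` is a hypothetical helper, and returning errors as JSON in the tool message is one possible convention, chosen here so the model can see what went wrong and retry.

```python
import json
from typing import Callable


def run_tool_call(name: str, raw_arguments: str, tool_map: dict[str, Callable[..., dict]]) -> str:
    """Execute one tool call and always return a JSON string for the tool message."""
    if name not in tool_map:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(raw_arguments)
        result = tool_map[name](**args)
    except json.JSONDecodeError:
        return json.dumps({"error": "arguments were not valid JSON"})
    except TypeError as exc:
        # Typically raised when arguments are missing, unexpected, or not a JSON object.
        return json.dumps({"error": f"bad arguments: {exc}"})
    return json.dumps(result)


def get_drug_dosage(drug_name: str) -> dict:
    return {"drug": drug_name, "dose": "500mg twice daily", "route": "oral"}


tool_map = {"get_drug_dosage": get_drug_dosage}
print(run_tool_call("get_drug_dosage", '{"drug_name": "Metformin"}', tool_map))
print(run_tool_call("get_drug_price", "{}", tool_map))
print(run_tool_call("get_drug_dosage", "{not json", tool_map))
```

In the loop, the returned string would go into the `"content"` field of the `"tool"` message, so every `tool_call_id` still receives a reply even when execution fails.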

Handling Ambiguous Queries

Ambiguous queries are the main source of wrong tool selections. Examples:

"Tell me about Aspirin" — general info? dosage? interactions? history?
"Is Metformin safe?" — safe for whom? safe with what drug? side effects?
"Check my prescription" — check what exactly?

Strategy 1: Ask for clarification before calling tools

Add to your system prompt:

Python

system_prompt = """
You are a clinical pharmacist assistant.

If a user's query is ambiguous about what type of drug information they need,
ask one clarifying question before calling any tool. For example:
- If they ask "tell me about [drug]", ask whether they want dosage, interactions, or side effects.
- If they ask "is [drug] safe", ask whether they're asking about general safety or a specific interaction.

Once you understand the intent, use the appropriate tool.
"""

Strategy 2: Default to the most useful tool for your domain

If your app is primarily a dosage checker, make get_drug_dosage the default and note in its description that it is the primary reference for general drug questions.

Strategy 3: Use a routing tool

Add a meta-tool that classifies intent before calling a specialized tool:

Python

tools = [
    {
        "type": "function",
        "function": {
            "name": "classify_drug_query",
            "description": (
                "Use this first for any drug-related question. "
                "Classifies the user's intent so the right specialized tool can be called next."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "intent": {
                        "type": "string",
                        "enum": ["dosage", "interaction", "side_effects", "mechanism", "other"],
                        "description": "The primary intent of the user's drug question."
                    },
                    "drugs_mentioned": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "All drug names mentioned by the user."
                    }
                },
                "required": ["intent", "drugs_mentioned"]
            }
        }
    }
    # ... other specialized tools
]
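Once the router has returned an intent, the follow-up request can force the matching specialized tool. The sketch below is one possible wiring: the `INTENT_TO_TOOL` mapping and the `next_tool_choice` helper are assumptions, and sending both `side_effects` and `mechanism` to `get_drug_info` is an illustrative choice.

```python
from typing import Union

# Hypothetical mapping from the router's "intent" enum to specialized tools.
INTENT_TO_TOOL = {
    "dosage": "get_drug_dosage",
    "interaction": "search_drug_interactions",
    "side_effects": "get_drug_info",
    "mechanism": "get_drug_info",
}


def next_tool_choice(intent: str) -> Union[str, dict]:
    """Turn the router's intent into the tool_choice for the follow-up call."""
    tool_name = INTENT_TO_TOOL.get(intent)
    if tool_name is None:
        # "other" or anything unexpected: let the model decide.
        return "auto"
    return {"type": "function", "function": {"name": tool_name}}


print(next_tool_choice("dosage"))
# {'type': 'function', 'function': {'name': 'get_drug_dosage'}}
print(next_tool_choice("other"))
# auto
```

The first request can itself pin the router with `tool_choice={"type": "function", "function": {"name": "classify_drug_query"}}`. The trade-off is an extra round trip per question in exchange for a more predictable selection.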

Context Influences Tool Selection

Earlier messages in the conversation affect which tool the LLM picks. This is important for multi-turn agents.

Python

messages = [
    {"role": "system", "content": "You are a pharmacy assistant."},
    {"role": "user", "content": "I was just prescribed Metformin."},
    {"role": "assistant", "content": "Congratulations on starting Metformin! It's a first-line treatment for type 2 diabetes. Do you have any questions about it?"},
    # The next message — context establishes we're discussing Metformin
    {"role": "user", "content": "What should I watch out for?"}
    # Given this context, the LLM can resolve "What should I watch out for?" to Metformin side effects/interactions
    # It will likely call the drug info tool with drug_name="Metformin" rather than ask for clarification
]

This is both a feature and a risk. If a user introduces a new drug mid-conversation, make sure your conversation management resets or correctly propagates context.
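One way to make that propagation explicit is to track which drugs are currently in focus. The sketch below assumes some earlier step (for instance a `drugs_mentioned` field from a classifier tool) supplies the drugs named in the latest user message; `update_active_drugs` is a hypothetical helper, and its replace-on-mention policy is only one of several reasonable choices.

```python
def update_active_drugs(active: list[str], mentioned: list[str]) -> list[str]:
    """Return the drugs the next tool call should refer to.

    If the latest user message names drugs, they replace the previous focus;
    otherwise the earlier focus carries over (e.g. "What should I watch out for?").
    """
    seen, fresh = set(), []
    for drug in mentioned:
        # Deduplicate case-insensitively while preserving order.
        key = drug.strip().lower()
        if key and key not in seen:
            seen.add(key)
            fresh.append(drug.strip())
    return fresh if fresh else active


active = update_active_drugs([], ["Metformin"])
active = update_active_drugs(active, [])              # follow-up with no drug named
print(active)  # ['Metformin']
active = update_active_drugs(active, ["Lisinopril"])  # user switches topic
print(active)  # ['Lisinopril']
```

This policy drops the earlier drug as soon as a new one is named, which would be wrong for a follow-up such as "what about with alcohol?" after an interaction question. A production flow would need a rule for when to merge instead of replace.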

Key Takeaways

The LLM selects tools through semantic reasoning, not keyword matching
tool_choice="auto" is correct for most use cases; use "required" or specific tool when you always need structure
Write descriptions that explicitly say what each tool does NOT handle and refer to sibling tools
Parallel tool calls happen automatically when the query needs multiple independent answers
Ambiguous queries are best handled by prompting the LLM to ask a clarifying question before tool use
Conversation context influences tool selection — design multi-turn flows with this in mind