Learnixo

Agents & Tools Interview Prep · Lesson 2 of 12

Tool Schema: Name, Description, Parameters

Why Schemas Are the Most Important Part

When an LLM decides whether to call your tool and what arguments to pass, it reads exactly two things:

  1. The tool's description — a plain English explanation of what the tool does and when to use it
  2. The parameter definitions — what arguments the tool accepts, their types, and their descriptions

The model has no other information about your tool. It cannot see your implementation. A vague or incomplete schema produces wrong tool calls. A precise schema produces correct ones.

This lesson covers how to write schemas that work.


The JSON Schema Structure

Every tool you define for OpenAI's API follows this structure:

JSON
{
  "type": "function",
  "function": {
    "name": "function_name",
    "description": "What this function does and when to call it.",
    "parameters": {
      "type": "object",
      "properties": {
        "param_name": {
          "type": "string",
          "description": "What this parameter is and what values are valid."
        }
      },
      "required": ["param_name"]
    }
  }
}

The parameters field follows the JSON Schema specification (draft 7 subset). OpenAI supports the most commonly used keywords.


Supported Parameter Types

String

JSON
{
  "type": "string",
  "description": "The patient's full name as it appears in the record."
}

Use enum to restrict valid values:

JSON
{
  "type": "string",
  "enum": ["mg", "mcg", "ml", "units"],
  "description": "The unit for the dosage amount."
}

Number and Integer

JSON
{
  "type": "number",
  "description": "The dosage amount. Must be a positive value."
}
JSON
{
  "type": "integer",
  "description": "Maximum number of results to return. Between 1 and 50."
}

Boolean

JSON
{
  "type": "boolean",
  "description": "Set to true to include discontinued drugs in the results."
}

Array

JSON
{
  "type": "array",
  "items": {
    "type": "string"
  },
  "description": "List of drug names to search for interactions."
}

Object (Nested)

JSON
{
  "type": "object",
  "description": "Patient demographics for the query.",
  "properties": {
    "age": {
      "type": "integer",
      "description": "Patient age in years."
    },
    "weight_kg": {
      "type": "number",
      "description": "Patient weight in kilograms."
    }
  },
  "required": ["age"]
}

Required vs Optional Parameters

List required parameters in the required array. Parameters not listed are optional — the LLM may omit them, and your function must handle that.

Python
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_drugs",
            "description": "Search the drug database by name or active ingredient.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The drug name or active ingredient to search for."
                    },
                    "include_generics": {
                        "type": "boolean",
                        "description": "Whether to include generic equivalents. Defaults to true if not specified."
                    },
                    "max_results": {
                        "type": "integer",
                        "description": "Maximum number of results to return. Defaults to 10."
                    }
                },
                "required": ["query"]  # Only query is required
            }
        }
    }
]

def search_drugs(query: str, include_generics: bool = True, max_results: int = 10) -> dict:
    # Function handles optional params with defaults
    print(f"Searching for: {query}, generics={include_generics}, limit={max_results}")
    return {"results": [], "query": query}

Bad Schema vs Good Schema

The quality of your description directly determines whether the LLM calls the right tool with the right arguments. Here are side-by-side comparisons.

Bad: Vague tool description

JSON
{
  "name": "get_info",
  "description": "Gets information.",
  "parameters": {
    "type": "object",
    "properties": {
      "q": {
        "type": "string",
        "description": "The query."
      }
    },
    "required": ["q"]
  }
}

Problems:

  • "Gets information" — what kind? The LLM doesn't know when to call this vs other tools.
  • Parameter named q — ambiguous. Is it a drug name? A patient ID? A SQL query?
  • No guidance on what values are valid.

Good: Precise tool description

JSON
{
  "name": "get_drug_info",
  "description": "Retrieve detailed information about a pharmaceutical drug including dosage, interactions, and contraindications. Use this when the user asks about a specific medication's properties, safety profile, or prescribing information.",
  "parameters": {
    "type": "object",
    "properties": {
      "drug_name": {
        "type": "string",
        "description": "The brand name or generic name of the drug, e.g. 'Metformin' or 'Glucophage'. Case-insensitive."
      },
      "info_type": {
        "type": "string",
        "enum": ["dosage", "interactions", "contraindications", "all"],
        "description": "The type of information to retrieve. Use 'all' if unsure which section the user needs."
      }
    },
    "required": ["drug_name", "info_type"]
  }
}

Improvements:

  • Description says when to call this tool (user asks about medication properties)
  • Parameter name is self-documenting (drug_name)
  • Example values given ("Metformin" or "Glucophage")
  • info_type uses enum so the LLM can't hallucinate an invalid value
  • Enum includes guidance for the ambiguous case ("use 'all' if unsure")

Complete Example: get_drug_info Tool

Python
import json
import openai
from typing import Optional

client = openai.OpenAI()

# Full schema definition
GET_DRUG_INFO_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_drug_info",
        "description": (
            "Retrieve pharmaceutical information about a drug including dosage guidelines, "
            "drug-drug interactions, and contraindications. Use this tool whenever the user "
            "asks about a specific medication. Do not guess drug information — always call "
            "this tool for any factual claims about medications."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "drug_name": {
                    "type": "string",
                    "description": (
                        "The name of the drug (brand name or generic name). "
                        "Examples: 'Metformin', 'Lisinopril', 'Aspirin', 'Atorvastatin'."
                    )
                },
                "info_type": {
                    "type": "string",
                    "enum": ["dosage", "interactions", "contraindications", "all"],
                    "description": (
                        "Which information category to retrieve. "
                        "'dosage' for dose and administration, "
                        "'interactions' for drug-drug interactions, "
                        "'contraindications' for conditions where the drug is unsafe, "
                        "'all' for complete drug profile."
                    )
                },
                "patient_age": {
                    "type": "integer",
                    "description": (
                        "Optional. Patient age in years. Providing this enables "
                        "age-appropriate dosing recommendations (pediatric vs adult vs geriatric)."
                    )
                }
            },
            "required": ["drug_name", "info_type"]
        }
    }
}

# The implementation
def get_drug_info(
    drug_name: str,
    info_type: str,
    patient_age: Optional[int] = None
) -> dict:
    """Retrieve drug information from the pharmacy database."""

    # Mock data  replace with real DB query
    drug_database = {
        "metformin": {
            "brand_names": ["Glucophage", "Fortamet"],
            "dosage": {
                "adult": "500mg twice daily with meals, titrate to 2000mg/day max",
                "pediatric": "500mg once daily for ages 10 and up",
                "geriatric": "Start at 500mg, monitor renal function closely"
            },
            "interactions": [
                "Iodinated contrast media — hold 48h before imaging",
                "Alcohol — increased lactic acidosis risk",
                "Cimetidine — increases metformin levels by ~40%"
            ],
            "contraindications": [
                "eGFR under 30 mL/min/1.73m²",
                "Active or history of lactic acidosis",
                "Acute heart failure",
                "Hepatic impairment"
            ]
        }
    }

    normalized_name = drug_name.lower().strip()
    drug = drug_database.get(normalized_name)

    if not drug:
        return {
            "error": f"Drug '{drug_name}' not found in database",
            "suggestion": "Check spelling or try the generic name"
        }

    result = {"drug_name": drug_name, "brand_names": drug.get("brand_names", [])}

    if info_type in ("dosage", "all"):
        dosage_info = drug["dosage"]
        if patient_age is not None:
            if patient_age < 18:
                result["dosage"] = dosage_info.get("pediatric", "See adult dosing")
            elif patient_age >= 65:
                result["dosage"] = dosage_info.get("geriatric", dosage_info["adult"])
            else:
                result["dosage"] = dosage_info["adult"]
        else:
            result["dosage"] = dosage_info

    if info_type in ("interactions", "all"):
        result["interactions"] = drug["interactions"]

    if info_type in ("contraindications", "all"):
        result["contraindications"] = drug["contraindications"]

    return result

# Wire it together
def ask_about_drug(user_question: str) -> str:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a clinical pharmacist assistant. Always use the get_drug_info tool "
                "for any factual claims about medications. Never guess drug information."
            )
        },
        {"role": "user", "content": user_question}
    ]

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=[GET_DRUG_INFO_SCHEMA],
        tool_choice="auto"
    )

    msg = response.choices[0].message

    if not msg.tool_calls:
        return msg.content

    messages.append(msg)

    for tc in msg.tool_calls:
        args = json.loads(tc.function.arguments)
        result = get_drug_info(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result)
        })

    final = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=[GET_DRUG_INFO_SCHEMA]
    )
    return final.choices[0].message.content

# Example calls
print(ask_about_drug("What is the standard dose of Metformin for a 70-year-old patient?"))
print(ask_about_drug("Can Metformin be taken before an MRI with contrast?"))

Schema Writing Checklist

Before registering a tool, verify each point:

Tool description:

  • [ ] States what the tool does in one sentence
  • [ ] Specifies when to call it (not just what it does)
  • [ ] Differentiates it from similar tools if multiple tools exist
  • [ ] Includes a note like "always call this tool for X" if hallucination is a risk

Parameters:

  • [ ] Each parameter has a plain English name (not q, p1, v)
  • [ ] Each description explains the meaning AND the format
  • [ ] Examples given for string parameters where format matters
  • [ ] Enums used where only specific values are valid
  • [ ] Optional parameters have defaults documented in the description
  • [ ] required array matches which params are truly mandatory

Tips for Writing Good Descriptions

Be specific about scope. "Search the internal drug formulary" is better than "search drugs." The LLM has to differentiate your tool from web search, a patient database search, and other tools.

Tell the LLM when NOT to call. If your drug info tool only covers medications approved after 2010, say so. Otherwise the LLM will call it for old drugs and get confused by the empty result.

Use examples in descriptions. "e.g. 'Oslo' or 'New York'" costs three tokens and saves many wrong tool calls.

Enum is your friend. If a parameter has fewer than 20 valid values, use enum. The model literally cannot pass an invalid value.

Distinguish similar tools. If you have both get_patient_demographics and get_patient_medications, their descriptions should explicitly say what each one does NOT cover.


Summary

| Schema Element | Key Advice | |---|---| | Tool description | Say what it does AND when to call it | | Parameter names | Self-documenting, not abbreviated | | Parameter description | Meaning + format + examples | | enum | Use whenever valid values are finite | | required | Only list truly mandatory params | | Optional defaults | Document them in the description |