Validating Tool Inputs and Outputs
LLMs can hallucinate invalid arguments. Learn to validate tool inputs with Pydantic, validate outputs against expected schemas, and re-prompt on failure.
Why Validation Is Non-Negotiable
An LLM can request a tool call with any arguments it invents. The JSON Schema you provide constrains format and types — but only loosely. The model can still:
- Pass a negative number for a field that should be positive
- Pass a patient ID that doesn't match your ID format
- Pass a string where an enum is expected, hallucinating a value that is not in the enum
- Pass empty strings for required fields
- Omit required nested fields inside object parameters
- Pass a date string in the wrong format ("15-05-2026" vs "2026-05-15")
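Unless your provider strictly enforces the schema, none of these is rejected at the API level, and even strict enforcement cannot express domain rules such as a date that must lie in the future. The bad arguments arrive at your tool function and may silently produce wrong results, corrupt data, or crash.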
Validation at the tool boundary is mandatory. Pydantic makes it straightforward.
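To make the failure modes above concrete, here is a minimal sketch, assuming Pydantic v2. The `LookupArgs` model and its field names are hypothetical and exist only for this illustration. It shows a payload that is well-formed JSON with the right primitive types, yet wrong in every field.

```python
import json
from datetime import date
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class LookupArgs(BaseModel):
    """Hypothetical tool input, used only for this illustration."""
    patient_id: str = Field(pattern=r"^P-\d{5}$")
    visit_date: date  # ISO format (YYYY-MM-DD) expected
    priority: Literal["low", "normal", "urgent"]
    max_results: int = Field(gt=0)


# Well-formed JSON with the expected primitive types, but wrong in every field.
raw = json.loads(
    '{"patient_id": "12345", "visit_date": "15-05-2026", '
    '"priority": "asap", "max_results": -3}'
)

# A type-only check passes: every value has the expected JSON type.
types_ok = (
    all(isinstance(raw[k], str) for k in ("patient_id", "visit_date", "priority"))
    and isinstance(raw["max_results"], int)
)

# Full validation collects one error per offending field.
try:
    LookupArgs.model_validate(raw)
    failed_fields = set()
except ValidationError as exc:
    failed_fields = {str(err["loc"][0]) for err in exc.errors()}

print("type check passed:", types_ok)
print("fields rejected by Pydantic:", sorted(failed_fields))
```

The type-only check accepts the payload, while the model rejects all four fields. That gap is what validation at the tool boundary is meant to close.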
Layer 1: Pydantic Input Validation
Define a Pydantic model for every tool's input. Parse the LLM's arguments through it before executing anything.
from pydantic import BaseModel, Field, field_validator, model_validator
from typing import Optional
from datetime import date
from enum import Enum
import re


class AppointmentType(str, Enum):
    consultation = "consultation"
    followup = "followup"
    procedure = "procedure"


class BookAppointmentInput(BaseModel):
    """
    Input model for the book_appointment tool.
    Validates all LLM-provided arguments before any DB operation.
    """
    patient_id: str = Field(..., pattern=r"^P-\d{5}$")  # e.g. P-00123
    appointment_date: date
    appointment_type: AppointmentType
    provider_id: str = Field(..., min_length=3, max_length=20)
    notes: Optional[str] = Field(default=None, max_length=500)
    duration_minutes: int = Field(default=30, ge=15, le=120)

    @field_validator("appointment_date")
    @classmethod
    def date_must_be_future(cls, v: date) -> date:
        if v <= date.today():
            raise ValueError(f"Appointment date must be in the future. Got: {v}")
        return v

    @field_validator("notes")
    @classmethod
    def sanitize_notes(cls, v: Optional[str]) -> Optional[str]:
        if v is None:
            return None
        # Remove characters that could cause issues in downstream systems
        sanitized = re.sub(r"[<>\"'%;()&+]", "", v)
        return sanitized.strip() or None

    @model_validator(mode="after")
    def procedure_requires_longer_slot(self) -> "BookAppointmentInput":
        if self.appointment_type == AppointmentType.procedure and self.duration_minutes < 60:
            raise ValueError(
                f"Procedures require at least 60 minutes. "
                f"Got {self.duration_minutes} minutes."
            )
        return self

Using the Validator in Your Tool
import json
from pydantic import ValidationError


def book_appointment_tool(raw_args: dict) -> dict:
    """
    Tool function called by the agent loop.
    raw_args comes directly from json.loads(tool_call.function.arguments)
    """
    # Validate first, before ANY database or external calls.
    # model_validate also reports a non-dict payload as a ValidationError,
    # whereas BookAppointmentInput(**raw_args) would raise an uncaught TypeError.
    try:
        validated = BookAppointmentInput.model_validate(raw_args)
    except ValidationError as e:
        # Return a structured error the LLM can understand and relay to the user
        error_messages = []
        for error in e.errors():
            field = " -> ".join(str(loc) for loc in error["loc"])
            error_messages.append(f"{field}: {error['msg']}")
        return {
            "success": False,
            "error": "Invalid appointment parameters",
            "validation_errors": error_messages,
            "hint": "Please correct the parameters and try again."
        }
    # Now it's safe to proceed
    return create_appointment_in_db(
        patient_id=validated.patient_id,
        date=validated.appointment_date,
        type=validated.appointment_type.value,
        provider=validated.provider_id,
        duration=validated.duration_minutes,
        notes=validated.notes
    )


def create_appointment_in_db(
    patient_id: str,
    date: date,
    type: str,
    provider: str,
    duration: int,
    notes: Optional[str]
) -> dict:
    """Actual DB write; only called after validation passes."""
    # Real implementation here
    return {
        "success": True,
        "appointment_id": "APT-789012",
        "patient_id": patient_id,
        "scheduled_for": str(date),
        "duration_minutes": duration
    }

Layer 2: Output Schema Validation
Tools should also validate their own output before returning it. A bug in your tool could return the wrong fields, the wrong types, or missing required data, and the LLM will confidently build its answer on that corrupt output.
import logging
from pydantic import BaseModel, ValidationError
from typing import Optional
from datetime import datetime

logger = logging.getLogger(__name__)


class DrugInfoOutput(BaseModel):
    """Expected shape of get_drug_info output."""
    drug_id: str
    name: str
    generic_name: str
    dosage_adult: str
    dosage_unit: str
    interactions: list[str]
    contraindications: list[str]
    last_updated: datetime


def get_drug_info(drug_name: str) -> dict:
    """Returns validated drug information."""
    # Fetch from DB (`database` is your own data-access layer, defined elsewhere)
    raw = database.fetch_drug(drug_name)
    if raw is None:
        return {"error": f"Drug '{drug_name}' not found"}
    # Validate output before returning to the LLM
    try:
        validated_output = DrugInfoOutput(**raw)
        # mode="json" converts last_updated to an ISO string so json.dumps works
        return validated_output.model_dump(mode="json")
    except ValidationError as e:
        # Our own data has a problem: log it and return a safe error
        logger.error(
            "Drug info output validation failed",
            extra={"drug_name": drug_name, "errors": e.errors()}
        )
        return {
            "error": "Data quality issue: drug record is incomplete",
            "drug_name": drug_name,
            "action": "Contact the pharmacy data team"
        }

Layer 3: Re-prompting on Validation Failure
When validation fails, the best response is to return the error as a tool result — not to crash the loop. The LLM reads the error and either:
- Corrects its arguments and calls the tool again
- Asks the user for the missing information
- Explains why the request can't be completed
import json
import openai
from pydantic import ValidationError

client = openai.OpenAI()


def get_drug_info_tool(raw_args: dict) -> dict:
    """Adapter: get_drug_info expects a drug name, not the raw argument dict.
    Assumes the tool schema names the argument `drug_name`."""
    return get_drug_info(str(raw_args.get("drug_name", "")))


TOOL_MAP = {
    "book_appointment": book_appointment_tool,
    "get_drug_info": get_drug_info_tool,
}


def run_agent_with_validation(user_message: str, tools: list) -> str:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a clinical scheduling assistant. "
                "If a tool returns a validation error, correct the parameters "
                "and retry the tool call. Ask the user for clarification only if "
                "you cannot determine the correct value yourself."
            )
        },
        {"role": "user", "content": user_message}
    ]
    for _ in range(5):  # Allow up to 5 iterations (tool calls + retries)
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        msg = response.choices[0].message
        if not msg.tool_calls:
            return msg.content or ""
        messages.append(msg)
        for tc in msg.tool_calls:
            fn_name = tc.function.name
            try:
                raw_args = json.loads(tc.function.arguments)
                if fn_name in TOOL_MAP:
                    result = TOOL_MAP[fn_name](raw_args)
                else:
                    result = {"error": f"Unknown tool: {fn_name}"}
            except json.JSONDecodeError as e:
                # Malformed arguments are fed back too, rather than crashing the loop
                result = {"error": f"Tool arguments were not valid JSON: {e}"}
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": json.dumps(result)
            })
    return "Unable to complete request after multiple attempts."

When book_appointment_tool returns a validation error like:
{
  "success": false,
  "error": "Invalid appointment parameters",
  "validation_errors": ["appointment_date: Value error, Appointment date must be in the future. Got: 2026-05-01"],
  "hint": "Please correct the parameters and try again."
}

The LLM will typically read this, recognize that the date was in the past, and retry with a corrected date, all without user intervention. (Pydantic v2 prefixes messages from a raised ValueError with "Value error, ", which is why the prefix appears in the output.)
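The error-then-retry flow can be exercised without calling any API. The sketch below assumes Pydantic v2 and replaces the LLM with a scripted sequence of argument dicts. `BookingArgs`, `booking_tool`, and `scripted_model` are hypothetical, trimmed-down stand-ins and not the tool defined earlier.

```python
from datetime import date, timedelta

from pydantic import BaseModel, ValidationError, field_validator


class BookingArgs(BaseModel):
    """Hypothetical, trimmed-down stand-in for a booking tool's input model."""
    appointment_date: date

    @field_validator("appointment_date")
    @classmethod
    def must_be_future(cls, v: date) -> date:
        if v <= date.today():
            raise ValueError(f"Appointment date must be in the future. Got: {v}")
        return v


def booking_tool(raw_args: dict) -> dict:
    """Validate first; return a structured error instead of raising."""
    try:
        args = BookingArgs.model_validate(raw_args)
    except ValidationError as exc:
        return {
            "success": False,
            "validation_errors": [
                f"{'.'.join(map(str, e['loc']))}: {e['msg']}" for e in exc.errors()
            ],
        }
    return {"success": True, "scheduled_for": args.appointment_date.isoformat()}


def scripted_model(attempts: list[dict]):
    """Stand-in for the LLM: yields a fixed sequence of argument dicts."""
    yield from attempts


past = (date.today() - timedelta(days=1)).isoformat()
future = (date.today() + timedelta(days=7)).isoformat()

history = []
for raw_args in scripted_model([{"appointment_date": past}, {"appointment_date": future}]):
    result = booking_tool(raw_args)
    history.append(result)  # in a real loop this goes back to the model as a "tool" message
    if result["success"]:
        break

print(history)
```

The first attempt yields a structured error and the second succeeds. A real model is not guaranteed to correct itself this cleanly, which is why the loop above caps the number of iterations.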
Complete Example: Drug Prescription Validation
This example shows a more complete validation pattern for a prescription tool:
import logging
from pydantic import BaseModel, Field, ValidationError, field_validator, model_validator
from typing import Optional
from enum import Enum
import re
from datetime import date

logger = logging.getLogger(__name__)

# get_drug_by_id, get_patient and write_prescription_to_db are your own
# data-access helpers, defined elsewhere.


class RouteOfAdministration(str, Enum):
    oral = "oral"
    iv = "iv"
    topical = "topical"
    inhaled = "inhaled"
    subcutaneous = "subcutaneous"


class DosageUnit(str, Enum):
    mg = "mg"
    mcg = "mcg"
    ml = "ml"
    units = "units"
    puffs = "puffs"


class CreatePrescriptionInput(BaseModel):
    patient_id: str = Field(..., pattern=r"^P-\d{5}$")
    drug_id: str = Field(..., pattern=r"^D-\d{3,6}$")
    dose_amount: float = Field(..., gt=0, le=10000)
    dose_unit: DosageUnit
    frequency_per_day: int = Field(..., ge=1, le=8)
    route: RouteOfAdministration
    duration_days: Optional[int] = Field(default=None, ge=1, le=365)
    prescriber_id: str = Field(..., min_length=5, max_length=15)
    start_date: date

    @field_validator("start_date")
    @classmethod
    def start_date_not_too_far_future(cls, v: date) -> date:
        days_ahead = (v - date.today()).days
        if days_ahead > 90:
            raise ValueError(
                f"Start date cannot be more than 90 days in the future. Got {days_ahead} days ahead."
            )
        return v

    @model_validator(mode="after")
    def validate_iv_requires_duration(self) -> "CreatePrescriptionInput":
        """IV medications must have a specified duration for safety."""
        if self.route == RouteOfAdministration.iv and self.duration_days is None:
            raise ValueError(
                "IV medications require an explicit duration_days for patient safety."
            )
        return self


class PrescriptionOutput(BaseModel):
    prescription_id: str
    status: str
    patient_id: str
    drug_name: str
    instructions: str
    warnings: list[str]
    created_at: str


def create_prescription(raw_args: dict) -> dict:
    # Validate input
    try:
        inp = CreatePrescriptionInput(**raw_args)
    except ValidationError as e:
        return {
            "success": False,
            "error": "Prescription validation failed",
            "validation_errors": [
                {"field": ".".join(str(part) for part in err["loc"]), "message": err["msg"]}
                for err in e.errors()
            ]
        }
    # Business logic checks beyond format validation
    drug = get_drug_by_id(inp.drug_id)
    if not drug:
        return {"success": False, "error": f"Drug ID {inp.drug_id} not found in formulary"}
    patient = get_patient(inp.patient_id)
    if not patient:
        return {"success": False, "error": f"Patient {inp.patient_id} not found"}
    # Check allergy cross-reference
    if drug["generic_name"].lower() in [a.lower() for a in patient.get("allergies", [])]:
        return {
            "success": False,
            "error": "ALLERGY ALERT",
            "message": f"Patient {inp.patient_id} has a documented allergy to {drug['generic_name']}",
            "action": "Do not prescribe. Contact prescriber."
        }
    # Create the prescription
    raw_output = write_prescription_to_db(inp, drug, patient)
    # Validate output
    try:
        output = PrescriptionOutput(**raw_output)
        return {"success": True, **output.model_dump()}
    except ValidationError as e:
        logger.error("Prescription output validation failed: %s", e)
        return {
            "success": False,
            "error": "Internal error creating prescription record",
            "action": "Contact IT support"
        }

Validation Checklist
For inputs:
- [ ] Use Pydantic models with Field constraints (min/max, pattern, gt/lt)
- [ ] Use Enum for any finite set of valid values
- [ ] Add field-level validators for domain rules (dates must be future, IDs must match format)
- [ ] Add model-level validators for cross-field rules
- [ ] Return structured error dicts on failure — never raise unhandled exceptions to the agent loop
For outputs:
- [ ] Define an output Pydantic model for every tool
- [ ] Validate your tool's own data before returning it
- [ ] Log validation failures — they indicate data quality issues in your system
- [ ] Return a safe error dict if output validation fails
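One way to apply the output checklist consistently is a small decorator that wraps each tool. This is only a sketch, assuming Pydantic v2. `validated_output`, `StockLevel`, and `get_stock` are hypothetical names invented for the illustration, and the in-memory `fake_db` stands in for a real backend.

```python
import functools
import logging
from typing import Callable

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)


def validated_output(model: type[BaseModel]) -> Callable:
    """Hypothetical helper: check a tool's dict result against `model`."""
    def decorator(fn: Callable[..., dict]) -> Callable[..., dict]:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs) -> dict:
            raw = fn(*args, **kwargs)
            if isinstance(raw, dict) and "error" in raw:
                return raw  # pass through errors the tool reported itself
            try:
                return model.model_validate(raw).model_dump(mode="json")
            except ValidationError as exc:
                logger.error("Output validation failed in %s: %s", fn.__name__, exc.errors())
                return {"error": "Internal data problem; the result could not be verified"}
        return wrapper
    return decorator


class StockLevel(BaseModel):
    """Hypothetical output model."""
    drug_id: str
    units_in_stock: int


@validated_output(StockLevel)
def get_stock(drug_id: str) -> dict:
    # Simulated backend containing one complete and one incomplete record.
    fake_db = {
        "D-100": {"drug_id": "D-100", "units_in_stock": 40},
        "D-200": {"drug_id": "D-200"},
    }
    return fake_db.get(drug_id, {"error": f"Drug {drug_id} not found"})


print(get_stock("D-100"))
print(get_stock("D-200"))
```

The complete record comes back unchanged, while the incomplete one is logged and replaced with a safe error dict, so the LLM never sees the malformed data.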
For the agent loop:
- [ ] Give the LLM a system prompt instruction to retry on validation errors
- [ ] Cap iterations (5 is usually enough)
- [ ] Log every validation failure with the raw args for debugging
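The last item can be sketched as a small helper that validates and logs in one place. It assumes Pydantic v2. `validate_and_log`, `DoseArgs`, and the logger name are hypothetical. In a clinical system the raw arguments may contain patient data, so you would likely redact them before logging.

```python
import json
import logging

from pydantic import BaseModel, Field, ValidationError

logger = logging.getLogger("tool_validation")  # hypothetical logger name


class DoseArgs(BaseModel):
    """Hypothetical input model, used only to trigger a failure."""
    dose_amount: float = Field(gt=0)


def validate_and_log(model: type[BaseModel], tool_name: str, raw_args: object):
    """Return (validated, None) on success or (None, errors) on failure.

    Failures are logged together with the raw arguments the LLM sent.
    """
    try:
        return model.model_validate(raw_args), None
    except ValidationError as exc:
        errors = exc.errors()
        logger.warning(
            "tool input validation failed: tool=%s raw_args=%s errors=%s",
            tool_name,
            json.dumps(raw_args, default=str),  # default=str keeps odd values loggable
            json.dumps(errors, default=str),
        )
        return None, errors
```

A tool function would call `validate_and_log(DoseArgs, "create_prescription", raw_args)` and return a structured error dict whenever the second element is not `None`.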
Summary
| Layer | What to Validate | Tool |
|---|---|---|
| LLM arguments | Types, formats, ranges, enums, domain rules | Pydantic input model |
| Business logic | Entity existence, allergy checks, permissions | Custom code after input validation |
| Tool output | Expected fields and types before returning | Pydantic output model |
| Agent loop | Retry on structured errors, cap iterations | System prompt + loop logic |