AI Systems | Intermediate

Validating Tool Inputs and Outputs

LLMs can hallucinate invalid arguments. Learn to validate tool inputs with Pydantic, validate outputs against expected schemas, and re-prompt on failure.

Asma Hafeez Khan | May 15, 2026 | 8 min read
Tags: Tool Calling, Pydantic, Validation, Python, AI Agents

Why Validation Is Non-Negotiable

An LLM can request a tool call with any arguments it invents. The JSON Schema you provide constrains format and types — but only loosely. The model can still:

  • Pass a negative number for a field that should be positive
  • Pass a patient ID that doesn't match your ID format
  • Pass a string for an enum field but hallucinate a value that isn't in the enum
  • Pass empty strings for required fields
  • Omit required nested fields inside object parameters
  • Pass a date string in the wrong format ("15-05-2026" vs "2026-05-15")

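Without strict schema enforcement on the API side, none of these is rejected before it reaches you, and even a strict mode cannot check domain rules such as "the date must be in the future". The arguments arrive at your tool function and may silently produce wrong results, corrupt data, or crash the tool.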

Validation at the tool boundary is mandatory. Pydantic makes it straightforward.
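
To make the list above concrete, here is a minimal sketch, assuming Pydantic v2. `DoseArgs`, `Unit`, and `rejected_fields` are hypothetical names invented for this illustration; they are not part of any tool defined later in the article.

```python
from datetime import date
from enum import Enum

from pydantic import BaseModel, Field, ValidationError


class Unit(str, Enum):  # hypothetical enum for illustration
    mg = "mg"
    ml = "ml"


class DoseArgs(BaseModel):  # hypothetical tool input, not from the article
    patient_id: str = Field(pattern=r"^P-\d{5}$")
    amount: float = Field(gt=0)
    unit: Unit
    start: date


def rejected_fields(args: dict) -> list[str]:
    """Return the names of the fields Pydantic rejects, or [] if all pass."""
    try:
        DoseArgs(**args)
    except ValidationError as exc:
        return sorted({str(err["loc"][0]) for err in exc.errors()})
    return []


good = {"patient_id": "P-00123", "amount": 5, "unit": "mg", "start": "2026-05-15"}

print(rejected_fields(good))                            # []
print(rejected_fields({**good, "amount": -5}))          # ['amount']
print(rejected_fields({**good, "unit": "tablets"}))     # ['unit']
print(rejected_fields({**good, "start": "15-05-2026"})) # ['start']
print(rejected_fields({**good, "patient_id": ""}))      # ['patient_id']
print(rejected_fields({"patient_id": "P-00123"}))       # ['amount', 'start', 'unit']
```

Each payload is syntactically valid JSON with plausible types, which is why a loose schema check would let it through; the model rejects it at the tool boundary instead.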


Layer 1: Pydantic Input Validation

Define a Pydantic model for every tool's input. Parse the LLM's arguments through it before executing anything.

Python
from pydantic import BaseModel, Field, field_validator, model_validator
from typing import Optional
from datetime import date
from enum import Enum
import re

class AppointmentType(str, Enum):
    consultation = "consultation"
    followup = "followup"
    procedure = "procedure"

class BookAppointmentInput(BaseModel):
    """
    Input model for the book_appointment tool.
    Validates all LLM-provided arguments before any DB operation.
    """
    patient_id: str = Field(..., pattern=r"^P-\d{5}$")  # e.g. P-00123
    appointment_date: date
    appointment_type: AppointmentType
    provider_id: str = Field(..., min_length=3, max_length=20)
    notes: Optional[str] = Field(default=None, max_length=500)
    duration_minutes: int = Field(default=30, ge=15, le=120)

    @field_validator("appointment_date")
    @classmethod
    def date_must_be_future(cls, v: date) -> date:
        if v <= date.today():
            raise ValueError(f"Appointment date must be in the future. Got: {v}")
        return v

    @field_validator("notes")
    @classmethod
    def sanitize_notes(cls, v: Optional[str]) -> Optional[str]:
        if v is None:
            return None
        # Remove characters that could cause issues in downstream systems
        sanitized = re.sub(r"[<>\"'%;()&+]", "", v)
        return sanitized.strip() or None

    @model_validator(mode="after")
    def procedure_requires_longer_slot(self) -> "BookAppointmentInput":
        if self.appointment_type == AppointmentType.procedure and self.duration_minutes < 60:
            raise ValueError(
                f"Procedures require at least 60 minutes. "
                f"Got {self.duration_minutes} minutes."
            )
        return self
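
One way to keep the schema you send to the model and the validator you run in sync is to derive the schema from the Pydantic model. The sketch below assumes Pydantic v2 (`model_json_schema()`) and the OpenAI Chat Completions `tools` entry shape; `LookupPatientInput`, `lookup_patient`, and `to_tool_definition` are hypothetical names used only for illustration.

```python
from pydantic import BaseModel, Field


class LookupPatientInput(BaseModel):  # hypothetical tool input for illustration
    patient_id: str = Field(pattern=r"^P-\d{5}$", description="Patient ID, e.g. P-00123")
    max_results: int = Field(default=10, ge=1, le=50)


def to_tool_definition(name: str, description: str, model: type[BaseModel]) -> dict:
    """Build an OpenAI-style function tool entry from a Pydantic model."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": model.model_json_schema(),
        },
    }


tool = to_tool_definition("lookup_patient", "Find a patient by ID.", LookupPatientInput)
params = tool["function"]["parameters"]
print(params["required"])                              # ['patient_id']
print(params["properties"]["patient_id"]["pattern"])   # ^P-\d{5}$
print(params["properties"]["max_results"]["maximum"])  # 50
```

The generated schema only advertises the constraints to the model. Custom validators such as `date_must_be_future` never appear in it, so the Pydantic parse at the tool boundary is still required.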

Using the Validator in Your Tool

Python
import json
from datetime import date
from typing import Optional

from pydantic import ValidationError

def book_appointment_tool(raw_args: dict) -> dict:
    """
    Tool function called by the agent loop.
    raw_args comes directly from json.loads(tool_call.function.arguments)
    """
    # Validate first, before ANY database or external calls
    try:
        validated = BookAppointmentInput(**raw_args)
    except ValidationError as e:
        # Return a structured error the LLM can understand and relay to the user
        error_messages = []
        for error in e.errors():
            field = " -> ".join(str(loc) for loc in error["loc"])
            error_messages.append(f"{field}: {error['msg']}")

        return {
            "success": False,
            "error": "Invalid appointment parameters",
            "validation_errors": error_messages,
            "hint": "Please correct the parameters and try again."
        }

    # Now it's safe to proceed
    return create_appointment_in_db(
        patient_id=validated.patient_id,
        appointment_date=validated.appointment_date,
        appointment_type=validated.appointment_type.value,
        provider_id=validated.provider_id,
        duration_minutes=validated.duration_minutes,
        notes=validated.notes,
    )

def create_appointment_in_db(
    patient_id: str,
    appointment_date: date,
    appointment_type: str,
    provider_id: str,
    duration_minutes: int,
    notes: Optional[str],
) -> dict:
    """Actual DB write. Only called after validation passes."""
    # Real implementation here. The parameter names avoid shadowing
    # the `date` class and the built-in `type`.
    return {
        "success": True,
        "appointment_id": "APT-789012",  # placeholder ID
        "patient_id": patient_id,
        "scheduled_for": appointment_date.isoformat(),
        "duration_minutes": duration_minutes,
    }
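
If every tool repeats the same try/except, the pattern can be factored out. The following is a sketch under assumptions, not the article's implementation: `validated_tool`, `CancelInput`, and `cancel_appointment` are hypothetical names, and the error dict mirrors the shape returned by `book_appointment_tool`.

```python
from functools import wraps
from typing import Callable

from pydantic import BaseModel, Field, ValidationError


def validated_tool(model: type[BaseModel]) -> Callable:
    """Hypothetical decorator: parse raw LLM args through `model` first."""
    def decorator(fn: Callable[..., dict]) -> Callable[[dict], dict]:
        @wraps(fn)
        def wrapper(raw_args: dict) -> dict:
            try:
                parsed = model.model_validate(raw_args)
            except ValidationError as exc:
                # Same structured shape the LLM sees from book_appointment_tool
                return {
                    "success": False,
                    "error": f"Invalid arguments for {fn.__name__}",
                    "validation_errors": [
                        f"{' -> '.join(str(p) for p in err['loc'])}: {err['msg']}"
                        for err in exc.errors()
                    ],
                }
            return fn(parsed)
        return wrapper
    return decorator


class CancelInput(BaseModel):  # illustrative model, not from the article
    appointment_id: str = Field(pattern=r"^APT-\d{6}$")


@validated_tool(CancelInput)
def cancel_appointment(args: CancelInput) -> dict:
    # The body only ever sees a validated CancelInput instance
    return {"success": True, "cancelled": args.appointment_id}


print(cancel_appointment({"appointment_id": "APT-789012"}))
print(cancel_appointment({"appointment_id": "789012"})["success"])  # False
```

The decorated function keeps the `raw_args: dict` signature the agent loop expects, so it can be registered in a tool map unchanged.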

Layer 2: Output Schema Validation

Tools should also validate their own output before returning it. A bug in your tool, or a bad record in your database, could produce the wrong fields, the wrong types, or missing required data, and the LLM will treat that corrupt output as fact when it answers.

Python
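import logging
from datetime import datetime

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)
# `database` below stands for your own data-access layer; it is not defined here.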

class DrugInfoOutput(BaseModel):
    """Expected shape of get_drug_info output."""
    drug_id: str
    name: str
    generic_name: str
    dosage_adult: str
    dosage_unit: str
    interactions: list[str]
    contraindications: list[str]
    last_updated: datetime

def get_drug_info(drug_name: str) -> dict:
    """Returns validated drug information."""
    # Fetch from DB
    raw = database.fetch_drug(drug_name)

    if raw is None:
        return {"error": f"Drug '{drug_name}' not found"}

    # Validate output before returning to the LLM
    try:
        validated_output = DrugInfoOutput(**raw)
        return validated_output.model_dump(mode="json")
    except ValidationError as e:
        # Our own data has a problem: log it and return a safe error
        logger.error(
            "Drug info output validation failed",
            extra={"drug_name": drug_name, "errors": e.errors()}
        )
        return {
            "error": "Data quality issue — drug record is incomplete",
            "drug_name": drug_name,
            "action": "Contact the pharmacy data team"
        }
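
To see output validation catch a bad record, here is a runnable sketch with an in-memory stand-in for the database. `FAKE_DB`, `DrugSummary`, `get_drug_summary`, and both records are invented for this example; Pydantic v2 is assumed.

```python
import logging
from datetime import datetime

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)


class DrugSummary(BaseModel):  # trimmed-down, illustrative output model
    drug_id: str
    name: str
    interactions: list[str]
    last_updated: datetime


# Hypothetical stand-in for a database: the second record is missing fields.
FAKE_DB = {
    "aspirin": {
        "drug_id": "D-001",
        "name": "Aspirin",
        "interactions": ["warfarin"],
        "last_updated": "2026-01-10T09:30:00",
    },
    "brokenol": {"drug_id": "D-002", "name": "Brokenol"},
}


def get_drug_summary(drug_name: str) -> dict:
    raw = FAKE_DB.get(drug_name.lower())
    if raw is None:
        return {"error": f"Drug '{drug_name}' not found"}
    try:
        # mode="json" turns the datetime into an ISO string that json.dumps accepts
        return DrugSummary(**raw).model_dump(mode="json")
    except ValidationError as exc:
        missing = [str(e["loc"][0]) for e in exc.errors() if e["type"] == "missing"]
        logger.error("Output validation failed for %s: missing %s", drug_name, missing)
        return {"error": "Data quality issue: drug record is incomplete"}


print(get_drug_summary("aspirin")["last_updated"])  # 2026-01-10T09:30:00
print(get_drug_summary("brokenol"))
```

The incomplete record never reaches the LLM; it gets a short, safe error while the details go to the log.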

Layer 3: Re-prompting on Validation Failure

When validation fails, the best response is to return the error as a tool result — not to crash the loop. The LLM reads the error and either:

  • Corrects its arguments and calls the tool again
  • Asks the user for the missing information
  • Explains why the request can't be completed

Python
import json
import openai
from pydantic import ValidationError

client = openai.OpenAI()

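# Every entry takes the raw argument dict parsed from the tool call.
TOOL_MAP = {
    "book_appointment": book_appointment_tool,
    # get_drug_info expects a plain string, so unpack it from the dict here.
    "get_drug_info": lambda args: get_drug_info(str(args.get("drug_name", ""))),
}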

def run_agent_with_validation(user_message: str, tools: list) -> str:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a clinical scheduling assistant. "
                "If a tool returns a validation error, correct the parameters "
                "and retry the tool call. Ask the user for clarification only if "
                "you cannot determine the correct value yourself."
            )
        },
        {"role": "user", "content": user_message}
    ]

    for _ in range(5):  # Allow up to 5 iterations (tool calls + retries)
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        msg = response.choices[0].message

        if not msg.tool_calls:
            return msg.content or ""

        messages.append(msg)

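        for tc in msg.tool_calls:
            fn_name = tc.function.name
            try:
                raw_args = json.loads(tc.function.arguments)
            except json.JSONDecodeError as e:
                # Malformed JSON is just another validation failure to report back
                result = {"error": f"Arguments were not valid JSON: {e}"}
            else:
                if fn_name in TOOL_MAP:
                    result = TOOL_MAP[fn_name](raw_args)
                else:
                    result = {"error": f"Unknown tool: {fn_name}"}

            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": json.dumps(result)
            })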

    return "Unable to complete request after multiple attempts."

When book_appointment_tool returns a validation error like:

JSON
{
  "success": false,
  "error": "Invalid appointment parameters",
  "validation_errors": ["appointment_date: Appointment date must be in the future. Got: 2026-05-01"],
  "hint": "Please correct the parameters and try again."
}

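The LLM can read this, recognize that the date was in the past, and in most cases retry with a corrected date without user intervention. If it cannot infer a valid date, it should ask the user, as the system prompt instructs.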
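
The retry behavior can be exercised without calling an API. In this sketch, `scripted_model` is a hard-coded stand-in for the LLM (it is not a real model call), and `book_tool` is a toy tool that uses only the standard library; both names are hypothetical.

```python
import json
from datetime import date, timedelta


def book_tool(args: dict) -> dict:
    """Toy tool: rejects malformed or past dates with a structured error."""
    try:
        when = date.fromisoformat(str(args.get("appointment_date", "")))
    except ValueError:
        return {"success": False, "validation_errors": ["appointment_date: use YYYY-MM-DD"]}
    if when <= date.today():
        return {"success": False, "validation_errors": ["appointment_date: must be in the future"]}
    return {"success": True, "scheduled_for": when.isoformat()}


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for the LLM: sends a past date first, then fixes it after an error."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool_args": {"appointment_date": "2020-01-01"}}
    last = json.loads(tool_results[-1]["content"])
    if last["success"]:
        return {"final": f"Booked for {last['scheduled_for']}."}
    tomorrow = date.today() + timedelta(days=1)
    return {"tool_args": {"appointment_date": tomorrow.isoformat()}}


def run(max_iterations: int = 5) -> tuple[str, int]:
    """Same loop shape as the agent above: call, feed the result back, cap retries."""
    messages = [{"role": "user", "content": "Book me in."}]
    calls = 0
    for _ in range(max_iterations):
        reply = scripted_model(messages)
        if "final" in reply:
            return reply["final"], calls
        calls += 1
        result = book_tool(reply["tool_args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Unable to complete request after multiple attempts.", calls


answer, tool_calls = run()
print(tool_calls)  # 2
```

The first call fails validation, the error goes back as a tool message, and the second call succeeds. With `max_iterations=1` the loop gives up instead of retrying forever.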


Complete Example: Drug Prescription Validation

This example shows a more complete validation pattern for a prescription tool:

Python
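import logging
from datetime import date
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field, ValidationError, field_validator, model_validator

logger = logging.getLogger(__name__)
# get_drug_by_id, get_patient, and write_prescription_to_db are assumed to
# exist in your own data layer; they are not defined in this example.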

class RouteOfAdministration(str, Enum):
    oral = "oral"
    iv = "iv"
    topical = "topical"
    inhaled = "inhaled"
    subcutaneous = "subcutaneous"

class DosageUnit(str, Enum):
    mg = "mg"
    mcg = "mcg"
    ml = "ml"
    units = "units"
    puffs = "puffs"

class CreatePrescriptionInput(BaseModel):
    patient_id: str = Field(..., pattern=r"^P-\d{5}$")
    drug_id: str = Field(..., pattern=r"^D-\d{3,6}$")
    dose_amount: float = Field(..., gt=0, le=10000)
    dose_unit: DosageUnit
    frequency_per_day: int = Field(..., ge=1, le=8)
    route: RouteOfAdministration
    duration_days: Optional[int] = Field(default=None, ge=1, le=365)
    prescriber_id: str = Field(..., min_length=5, max_length=15)
    start_date: date

    @field_validator("start_date")
    @classmethod
    def start_date_not_too_far_future(cls, v: date) -> date:
        days_ahead = (v - date.today()).days
        if days_ahead > 90:
            raise ValueError(
                f"Start date cannot be more than 90 days in the future. Got {days_ahead} days ahead."
            )
        return v

    @model_validator(mode="after")
    def validate_iv_requires_duration(self) -> "CreatePrescriptionInput":
        """IV medications must have a specified duration for safety."""
        if self.route == RouteOfAdministration.iv and self.duration_days is None:
            raise ValueError(
                "IV medications require an explicit duration_days for patient safety."
            )
        return self

class PrescriptionOutput(BaseModel):
    prescription_id: str
    status: str
    patient_id: str
    drug_name: str
    instructions: str
    warnings: list[str]
    created_at: str

def create_prescription(raw_args: dict) -> dict:
    # Validate input
    try:
        inp = CreatePrescriptionInput(**raw_args)
    except ValidationError as e:
        return {
            "success": False,
            "error": "Prescription validation failed",
            "validation_errors": [
                {"field": ".".join(str(l) for l in err["loc"]), "message": err["msg"]}
                for err in e.errors()
            ]
        }

    # Business logic checks beyond format validation
    drug = get_drug_by_id(inp.drug_id)
    if not drug:
        return {"success": False, "error": f"Drug ID {inp.drug_id} not found in formulary"}

    patient = get_patient(inp.patient_id)
    if not patient:
        return {"success": False, "error": f"Patient {inp.patient_id} not found"}

    # Check allergy cross-reference
    if drug["generic_name"].lower() in [a.lower() for a in patient.get("allergies", [])]:
        return {
            "success": False,
            "error": "ALLERGY ALERT",
            "message": f"Patient {inp.patient_id} has a documented allergy to {drug['generic_name']}",
            "action": "Do not prescribe. Contact prescriber."
        }

    # Create the prescription
    raw_output = write_prescription_to_db(inp, drug, patient)

    # Validate output
    try:
        output = PrescriptionOutput(**raw_output)
        return {"success": True, **output.model_dump()}
    except ValidationError as e:
        logger.error("Prescription output validation failed: %s", e)
        return {
            "success": False,
            "error": "Internal error creating prescription record",
            "action": "Contact IT support"
        }
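
One detail worth checking in the error formatting above: in Pydantic v2, an error raised by a `model_validator` on the top-level model carries an empty `loc`, so joining it yields an empty field name. The sketch below shows one possible fallback label; `InfusionInput`, `describe_errors`, `errors_for`, and the `"(model)"` label are hypothetical choices for illustration.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class InfusionInput(BaseModel):  # illustrative, trimmed-down model
    route: str
    duration_days: Optional[int] = None

    @model_validator(mode="after")
    def iv_requires_duration(self) -> "InfusionInput":
        if self.route == "iv" and self.duration_days is None:
            raise ValueError("IV medications require an explicit duration_days.")
        return self


def describe_errors(exc: ValidationError) -> list[dict]:
    """Label each error with its field path, or '(model)' for cross-field rules."""
    return [
        {
            "field": ".".join(str(part) for part in err["loc"]) or "(model)",
            "message": err["msg"],
        }
        for err in exc.errors()
    ]


def errors_for(**kwargs) -> list[dict]:
    """Validate kwargs and return the described errors, or [] on success."""
    try:
        InfusionInput(**kwargs)
    except ValidationError as exc:
        return describe_errors(exc)
    return []


print(errors_for(route="iv")[0]["field"])           # (model)
print(errors_for(duration_days=3)[0]["field"])      # route
print(errors_for(route="iv", duration_days=3))      # []
```

A non-empty label gives the LLM something to anchor on when the problem spans several fields rather than one.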

Validation Checklist

For inputs:

  • [ ] Use Pydantic models with Field constraints (min/max, pattern, gt/lt)
  • [ ] Use Enum for any finite set of valid values
  • [ ] Add field-level validators for domain rules (dates must be future, IDs must match format)
  • [ ] Add model-level validators for cross-field rules
  • [ ] Return structured error dicts on failure — never raise unhandled exceptions to the agent loop

For outputs:

  • [ ] Define an output Pydantic model for every tool
  • [ ] Validate your tool's own data before returning it
  • [ ] Log validation failures — they indicate data quality issues in your system
  • [ ] Return a safe error dict if output validation fails

For the agent loop:

  • [ ] Give the LLM a system prompt instruction to retry on validation errors
  • [ ] Cap iterations (5 is usually enough)
  • [ ] Log every validation failure with the raw args for debugging
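
For the last checklist item, a minimal logging sketch using only the standard library. `log_validation_failure` and the logger name are hypothetical. In a clinical system the raw arguments may contain patient data, so treat the unredacted payload here as an assumption to revisit.

```python
import json
import logging

logger = logging.getLogger("tool_validation")


def log_validation_failure(tool_name: str, raw_args: dict, errors: list[str]) -> str:
    """Emit one JSON line per failed tool call and return it (handy in tests)."""
    record = json.dumps(
        {"tool": tool_name, "raw_args": raw_args, "errors": errors},
        default=str,     # fall back to str() for values json cannot encode
        sort_keys=True,  # stable key order makes log lines easier to diff
    )
    logger.warning("tool validation failed: %s", record)
    return record


line = log_validation_failure(
    "book_appointment",
    {"patient_id": "12345", "appointment_date": "15-05-2026"},
    ["patient_id: does not match P-NNNNN", "appointment_date: invalid date"],
)
print(json.loads(line)["tool"])  # book_appointment
```

Logging the raw arguments next to the errors shows which mistakes the model makes repeatedly, which in turn suggests where the tool description or schema needs tightening.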

Summary

| Layer | What to Validate | Tool |
|---|---|---|
| LLM arguments | Types, formats, ranges, enums, domain rules | Pydantic input model |
| Business logic | Entity existence, allergy checks, permissions | Custom code after input validation |
| Tool output | Expected fields and types before returning | Pydantic output model |
| Agent loop | Retry on structured errors, cap iterations | System prompt + loop logic |
