Getting Reliable JSON Output

The Problem

LLMs generate free text. For production pipelines that need structured data, unstructured output breaks downstream code:

Expected: {"diagnosis": "Atrial fibrillation", "medications": ["Warfarin"]}

What the model might return:
  "The patient has atrial fibrillation and is taking Warfarin."
  "Diagnosis: Atrial fibrillation\nMedications: Warfarin"
  {"diagnosis": "Atrial fibrillation", "medications": "Warfarin"}  ← wrong type
  ```json\n{"diagnosis": "AF", ...}\n```  ← markdown code block wrapper
  {"diagnosis": "Atrial fibrillation",  ← truncated (no closing brace)
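To make these failure modes concrete, the sketch below runs sample outputs through json.loads and a simple type check. The classify helper and its labels are illustrative names chosen for this example, not part of any library.

```python
import json

def classify(output: str) -> str:
    """Label a raw model output by how it would fail in a JSON pipeline."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return "not parseable"
    if not isinstance(data, dict):
        return "not an object"
    if not isinstance(data.get("medications"), list):
        return "wrong type"
    return "ok"

samples = [
    'The patient has atrial fibrillation and is taking Warfarin.',
    '{"diagnosis": "Atrial fibrillation", "medications": "Warfarin"}',
    '```json\n{"diagnosis": "AF", "medications": []}\n```',
    '{"diagnosis": "Atrial fibrillation",',
    '{"diagnosis": "Atrial fibrillation", "medications": ["Warfarin"]}',
]
for sample in samples:
    print(classify(sample))
```

Only the last sample passes both checks. The wrong-type case is the most dangerous one because it parses cleanly and fails later, somewhere downstream.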

Prompt-Level Techniques

1. Explicit schema in the prompt:

"Respond ONLY with valid JSON matching this exact schema. No explanation,
 no markdown, no code blocks. Just the raw JSON object:

{
  'diagnosis': string,
  'medications': [string],
  'urgency': 'low' | 'medium' | 'high'
}"

2. Show the output format with a filled-in example:

"Return a JSON object like this:

{
  'diagnosis': 'Atrial fibrillation',
  'medications': ['Warfarin 5mg'],
  'urgency': 'medium'
}

Now process the note below and return the JSON:"

3. Start the assistant turn with {:

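Some APIs allow prefilling the start of the assistant's response. Starting with { strongly steers the model toward emitting a JSON object, though it does not guarantee that the output is complete or valid: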

Python

# Anthropic API: prefill the assistant turn
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,  # required by the Messages API
    messages=[
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": "{"}  # prefill steers output toward JSON
    ]
)
result = "{" + response.content[0].text  # prepend the { we prefilled
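The three techniques can be combined in one helper. The following is a minimal sketch, assuming the schema and example are kept as Python dicts; build_messages, complete_prefill, SCHEMA_HINT, and EXAMPLE are hypothetical names introduced here. Rendering the schema with json.dumps also means the prompt shows double-quoted keys, which is what valid JSON requires.

```python
import json

# Informal schema description shown to the model (not a formal JSON Schema)
SCHEMA_HINT = {
    "diagnosis": "string",
    "medications": ["string"],
    "urgency": "low | medium | high",
}
EXAMPLE = {
    "diagnosis": "Atrial fibrillation",
    "medications": ["Warfarin 5mg"],
    "urgency": "medium",
}

def build_messages(note: str) -> list:
    """Combine explicit schema, filled-in example, and prefill in one message list."""
    prompt = (
        "Respond ONLY with valid JSON matching this schema. "
        "No explanation, no markdown, no code blocks.\n\n"
        f"Schema:\n{json.dumps(SCHEMA_HINT, indent=2)}\n\n"
        f"Example:\n{json.dumps(EXAMPLE, indent=2)}\n\n"
        f"Note:\n{note}"
    )
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": "{"},  # prefilled start of the reply
    ]

def complete_prefill(generated: str) -> dict:
    """Re-attach the prefilled brace before parsing the model's continuation."""
    return json.loads("{" + generated)
```

The returned list can be passed as the messages argument of an API that supports assistant prefill, and the generated continuation goes through complete_prefill.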

JSON Mode (OpenAI)

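OpenAI's JSON mode constrains the model to emit syntactically valid JSON. The API requires the word "JSON" to appear somewhere in the messages, and the output can still be cut off if the token limit is reached: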

Python

import json

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract structured data as JSON."},
        {"role": "user", "content": "Patient has AF and takes Warfarin 5mg."}
    ]
)
data = json.loads(response.choices[0].message.content)

Note: JSON mode guarantees valid JSON syntax but not schema compliance — the model may still return {"medications": "Warfarin"} instead of {"medications": ["Warfarin"]}. You still need schema validation.
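A small guard can catch both gaps before the data moves on. This is a sketch under assumptions: parse_json_mode is a hypothetical helper, and the only field it checks is medications. It relies on the finish_reason value "length", which the Chat Completions API reports when generation stopped at the token limit.

```python
import json

def parse_json_mode(content: str, finish_reason: str) -> dict:
    """Parse a JSON-mode reply, rejecting truncated or wrongly typed output."""
    if finish_reason == "length":
        # Token limit reached: the JSON is probably cut off mid-object
        raise ValueError("response truncated; raise max_tokens or shorten the input")
    data = json.loads(content)  # raises json.JSONDecodeError on bad syntax
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("medications"), list):
        raise ValueError("'medications' must be a list")
    return data
```

With the response object from the call above, usage would look like parse_json_mode(response.choices[0].message.content, response.choices[0].finish_reason).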

Structured Outputs (OpenAI) and Tool Use (Anthropic)

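More reliable than JSON mode: these features tie the output to a specific schema rather than just valid syntax. The example below uses OpenAI Structured Outputs with a Pydantic model: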

Python

# OpenAI Structured Outputs (schema enforced at sampling level)
from typing import Literal

from pydantic import BaseModel
from openai import OpenAI

class ClinicalSummary(BaseModel):
    diagnosis: str
    medications: list[str]
    urgency: Literal["low", "medium", "high"]  # same allowed values as the prompt schema

client = OpenAI()
response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Patient has AF, takes Warfarin 5mg."}],
    response_format=ClinicalSummary,
)
summary = response.choices[0].message.parsed  # typed ClinicalSummary instance
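For the Anthropic side, the usual approach is to declare a tool whose input_schema is the desired JSON Schema and force the model to call it with tool_choice. The following is a minimal sketch, not a definitive implementation: the tool name record_clinical_summary and the helper extract_with_tool are hypothetical, and the client is passed in as a parameter (assumed to be an anthropic.Anthropic instance).

```python
from typing import Any

CLINICAL_TOOL = {
    "name": "record_clinical_summary",  # hypothetical tool name
    "description": "Record structured data extracted from a clinical note.",
    "input_schema": {
        "type": "object",
        "properties": {
            "diagnosis": {"type": "string"},
            "medications": {"type": "array", "items": {"type": "string"}},
            "urgency": {"type": "string", "enum": ["low", "medium", "high"]},
        },
        "required": ["diagnosis", "medications", "urgency"],
    },
}

def extract_with_tool(client: Any, note: str) -> dict:
    """Force a call to the tool and return its input as the structured result."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        tools=[CLINICAL_TOOL],
        # Require this specific tool instead of a free-text answer
        tool_choice={"type": "tool", "name": CLINICAL_TOOL["name"]},
        messages=[{"role": "user", "content": note}],
    )
    for block in response.content:
        if block.type == "tool_use":
            return block.input  # already a dict, no json.loads needed
    raise ValueError("no tool_use block in response")
```

The tool input is still model-generated, so it is worth passing the returned dict through schema validation before use.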

Parsing and Fallback

Always wrap JSON parsing in error handling with a retry fallback:

Python

import json
import re
from anthropic import Anthropic

client = Anthropic()

def extract_json_from_response(text: str) -> dict | None:
    """Try to extract JSON even if wrapped in markdown or has trailing text."""
    # Strip markdown code blocks
    text = re.sub(r"```(?:json)?\n?(.*?)\n?```", r"\1", text, flags=re.DOTALL)

    # Try direct parse first
    try:
        return json.loads(text.strip())
    except json.JSONDecodeError:
        pass

    # Fall back to the span from the first { to the last } (greedy match)
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group())
        except json.JSONDecodeError:
            pass

    return None

def get_structured_output(prompt: str, schema_description: str, retries: int = 2) -> dict:
    # Keep the whole conversation so retries still include the original input
    messages = [{"role": "user", "content": prompt}]
    for attempt in range(retries + 1):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=512,
            messages=messages,
        )
        text = response.content[0].text
        result = extract_json_from_response(text)

        if result is not None:
            return result

        if attempt < retries:
            # Show the model its failed reply, then ask for a corrected one
            messages.append({"role": "assistant", "content": text})
            messages.append({"role": "user", "content": (
                "Your previous response was not valid JSON. "
                f"Return ONLY the JSON object with schema: {schema_description}\n"
                "No explanation, no markdown. Just the JSON."
            )})

    raise ValueError("Failed to get valid JSON after retries")
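One limitation of the regex fallback is that the greedy pattern spans from the first { to the last }, so it fails when prose after the JSON contains braces. As an alternative sketch, the standard library's json.JSONDecoder.raw_decode can parse one value starting at a given index and ignore whatever follows. The helper name first_json_object is an assumption for this example.

```python
import json
from typing import Optional

def first_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # Parse a single JSON value beginning at `start`; trailing text is ignored
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass
        else:
            if isinstance(obj, dict):
                return obj
        # This brace did not start a valid object; try the next one
        start = text.find("{", start + 1)
    return None
```

It could replace the regex step in extract_json_from_response, or run after it as one more fallback.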

Schema Validation

Even with JSON mode, validate the schema:

Python

from pydantic import BaseModel, ValidationError

class MedicationEntry(BaseModel):
    name: str
    dose: str
    frequency: str

class ClinicalSummary(BaseModel):
    primary_diagnosis: str
    medications: list[MedicationEntry]
    urgency: str

def validate_clinical_summary(raw: dict) -> ClinicalSummary | None:
    try:
        return ClinicalSummary(**raw)
    except ValidationError as e:
        print(f"Schema validation failed: {e}")
        return None
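Validation errors are also useful as retry feedback, since they name the exact fields that failed. The sketch below turns a Pydantic ValidationError into a correction message for the next attempt. Summary is a simplified stand-in for ClinicalSummary, and correction_message is a hypothetical helper, not part of Pydantic.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError

class Summary(BaseModel):  # simplified stand-in for ClinicalSummary
    primary_diagnosis: str
    medications: list[str]

def correction_message(raw: dict) -> Optional[str]:
    """Return a re-prompt listing schema errors, or None if raw is valid."""
    try:
        Summary(**raw)
        return None
    except ValidationError as e:
        lines = []
        for err in e.errors():
            # err["loc"] is a tuple path to the failing field
            field = ".".join(str(part) for part in err["loc"])
            lines.append(f"- {field}: {err['msg']}")
        return "Your JSON did not match the schema. Fix these fields:\n" + "\n".join(lines)
```

The returned string can be appended as the next user message in a retry loop, in place of a generic "not valid JSON" prompt.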

Interview Answer

"Getting reliable JSON from LLMs requires layers. In the prompt, specify the exact schema, include a filled-in example, and say 'respond only with valid JSON.' API-level JSON mode (OpenAI) guarantees syntactic validity but not schema compliance. Structured Outputs (OpenAI, via Pydantic) enforces the schema at the sampling level, and tool use (Anthropic) ties the output to a declared input schema; these are the most reliable options. Always parse with error handling, strip markdown code blocks, and attempt regex extraction as a fallback. Validate the parsed object against a Pydantic schema. For critical pipelines, add a retry loop that re-prompts with the error on failure."
