Types of Hallucinations
A taxonomy of LLM hallucination types — factual, entity, logical, and instruction hallucinations — with real before/after examples and detection strategies for each.
Why a Taxonomy Matters
Not all hallucinations are the same. A wrong date in a document summary is a different failure mode from a model ignoring part of your prompt. Understanding the type of hallucination tells you which mitigation to apply.
This lesson covers four major categories:
- Factual hallucinations — incorrect dates, names, statistics
- Entity hallucinations — invented people, companies, citations
- Logical hallucinations — correct facts, wrong reasoning
- Instruction hallucinations — prompt constraints ignored or forgotten
Category 1: Factual Hallucinations
Factual hallucinations occur when a model states a verifiable fact incorrectly. The fact has a known correct answer; the model simply got it wrong.
Common subtypes
- Date errors: wrong year, wrong century
- Statistic errors: wrong percentage, wrong count
- Name errors: real person, wrong detail (wrong role, wrong institution)
- Geographic errors: wrong country, wrong capital, wrong measurement
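Each of these subtypes has a checkable reference value, so a first-pass detector can be a plain comparison. The sketch below is a minimal illustration, assuming a small hand-maintained reference table. `REFERENCE`, `check_claim`, and the `tolerance` parameter are hypothetical names, not part of any library.

```python
from typing import Union

Number = Union[int, float]

# Hypothetical reference table; in practice this would be a curated knowledge base.
REFERENCE = {
    "python_1_0_release_year": 1994,
    "python_0_9_0_release_year": 1991,
    "python_creator": "Guido van Rossum",
}


def check_claim(key: str, claimed: Union[str, Number], tolerance: float = 0.0) -> dict:
    """Compare one claimed value against the reference table."""
    if key not in REFERENCE:
        # No reference value: the claim is unverified, not necessarily wrong.
        return {"key": key, "status": "UNVERIFIED", "expected": None}

    expected = REFERENCE[key]
    if isinstance(expected, (int, float)) and isinstance(claimed, (int, float)):
        # Numeric claims may be allowed a tolerance (useful for statistics).
        ok = abs(claimed - expected) <= tolerance
    else:
        # Everything else is compared as case-insensitive text.
        ok = str(claimed).strip().lower() == str(expected).strip().lower()

    return {"key": key, "status": "MATCH" if ok else "MISMATCH", "expected": expected}


print(check_claim("python_1_0_release_year", 1995))
print(check_claim("python_creator", "guido van rossum"))
print(check_claim("python_market_share", 48))
```

A claim with no reference entry is reported as `UNVERIFIED` rather than wrong, which keeps missing coverage in the table separate from an actual mismatch.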
Before/After Examples
Example 1 — Date error
BEFORE (hallucination):
Prompt: "When was the Python programming language first released publicly?"
Model: "Python was first released publicly in 1995 by Guido van Rossum."
← WRONG. Python's first public release, version 0.9.0, appeared in
February 1991. Python 1.0 followed in January 1994.
AFTER (with explicit grounding instruction):
Prompt: "When was the Python programming language first released publicly?
If you are not certain, say so and offer to look it up."
Model: "Python's first public release was version 0.9.0, in February 1991.
Guido van Rossum had begun the project in late 1989, and version 1.0
followed in January 1994. Please verify with the official Python
history page for authoritative dates."
Example 2 — Statistic error
BEFORE (hallucination):
Prompt: "What percentage of the world uses Python?"
Model: "Python is used by approximately 48% of all developers worldwide,
making it the most popular programming language."
← PLAUSIBLE-SOUNDING but likely fabricated. Survey data varies
significantly by source and year.
AFTER (uncertainty-aware prompt):
Prompt: "What percentage of developers use Python? Cite your source
and note your training cutoff."
Model: "According to the Stack Overflow Developer Survey from around
my training data (cutoff ~2023), Python was used by roughly
45-49% of respondents, though this varies by survey methodology.
I cannot guarantee this is the most current figure — please check
the latest Stack Overflow or JetBrains surveys directly."
Detection Strategy
# Factual hallucination detection using entity extraction + knowledge base lookup
from anthropic import Anthropic

client = Anthropic()

# Reference values to compare extracted claims against.
KNOWN_FACTS = {
    "python_release_year": 1994,  # Python 1.0; version 0.9.0 appeared in 1991
    "python_creator": "Guido van Rossum",
    "python_first_version": "1.0",  # first 1.x release; 0.9.0 was the first public release
}


def extract_and_verify_facts(model_output: str) -> dict:
    """
    Use a second LLM call to extract claims for checking against known facts.
    This is a simplified pattern; in production you'd use a knowledge base.
    """
    extraction_prompt = f"""
Extract all factual claims from the following text as a JSON list.
Each claim should be: {{"claim": "...", "subject": "...", "value": "..."}}
Text: {model_output}
Return only valid JSON, no markdown.
"""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        messages=[{"role": "user", "content": extraction_prompt}],
    )
    # This function only extracts; comparing against KNOWN_FACTS is left to the caller.
    return {
        "raw_claims": response.content[0].text,
        "note": "Compare extracted claims against your knowledge base",
    }
Category 2: Entity Hallucinations
Entity hallucinations involve the model inventing non-existent entities: people who do not exist, papers that were never written, companies that were never founded, URLs that lead nowhere.
This is especially prevalent in:
- Academic citation generation
- Legal case citation
- Software library documentation
- Medical study references
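For citations, the check has to cover the whole record and not just the title, because a fabricated reference can pair a real title with the wrong authors or year. The sketch below assumes a small trusted index held in memory. `TRUSTED_INDEX`, `normalize_title`, and `verify_citation` are hypothetical names, and a real system would query a bibliographic database instead.

```python
import re

# Hypothetical trusted index keyed by normalized title.
TRUSTED_INDEX = {
    "attention is all you need": {"first_author": "vaswani", "year": 2017},
}


def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def verify_citation(title: str, first_author: str, year: int) -> str:
    """Return VERIFIED, METADATA_MISMATCH, or NOT_FOUND for one citation."""
    record = TRUSTED_INDEX.get(normalize_title(title))
    if record is None:
        return "NOT_FOUND"
    if record["first_author"] != first_author.strip().lower() or record["year"] != year:
        # The title exists, but the cited authors or year do not match it.
        return "METADATA_MISMATCH"
    return "VERIFIED"


print(verify_citation("Attention Is All You Need", "Vaswani", 2017))
print(verify_citation("Attention Is All You Need", "Chen", 2019))
print(verify_citation("Multi-Scale Attention for NLP Tasks", "Chen", 2019))
```

`NOT_FOUND` only means the index has no such title, so it should prompt a manual lookup and not be treated as proof of fabrication.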
Before/After Examples
Example 1 — Invented academic paper
BEFORE (hallucination):
Prompt: "Give me three academic papers on transformer attention mechanisms."
Model: "1. Vaswani et al. (2017) 'Attention Is All You Need' — NIPS 2017
2. Chen, Liu & Zhang (2019) 'Multi-Scale Attention for NLP Tasks'
— ACL 2019, pp. 1842-1851
3. Rodriguez, Kim & Park (2021) 'Hierarchical Attention Networks
for Document Classification' — EMNLP 2021"
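← Paper 1 is real. Citations 2 and 3 appear to be fabricated.
The authors, page numbers, and venues sound plausible but do not
correspond to real publications. Fabrications can also borrow real
titles: "Hierarchical Attention Networks for Document Classification"
is a real paper (Yang et al., NAACL 2016), attributed here to the
wrong authors, venue, and year.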
AFTER (with grounding instruction):
Prompt: "Give me academic papers on transformer attention mechanisms.
ONLY include papers you are highly confident are real.
For any paper, note: 'Please verify on Google Scholar or
Semantic Scholar before citing.'"
Model: "Here are papers I am fairly confident exist, though please
verify before citing:
- 'Attention Is All You Need' (Vaswani et al., 2017) — very
well known, I am highly confident this is accurate
- 'BERT: Pre-training of Deep Bidirectional Transformers'
(Devlin et al., 2018) — highly confident
I strongly recommend using Semantic Scholar (semanticscholar.org)
to find verified citations rather than relying on my output."
Example 2 — Invented library function
BEFORE (hallucination):
Prompt: "How do I parse JSON with the pandas json_normalize_schema function?"
Model: "You can use pandas.json_normalize_schema() as follows:
import pandas as pd
df = pd.json_normalize_schema(data, schema_path='records')"
← json_normalize_schema does not exist in pandas.
The real function is pd.json_normalize().
AFTER (code-verified response):
Prompt: "How do I parse nested JSON into a DataFrame with pandas?
Only use functions that exist in the pandas public API."
Model: "Use pd.json_normalize() — this is the correct function:
import pandas as pd
data = [{'id': 1, 'address': {'city': 'Oslo', 'zip': '0150'}}]
df = pd.json_normalize(data, sep='.')
# Result: columns ['id', 'address.city', 'address.zip']"
Detection Strategy
import importlib
import inspect


def verify_python_function_exists(module_name: str, function_name: str) -> dict:
    """
    Check if a function actually exists in a Python module.
    Use this to catch hallucinated API references.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Cannot decide either way without the module installed.
        return {
            "exists": None,
            "module": module_name,
            "error": "Module not installed in this environment",
        }

    func = getattr(module, function_name, None)
    if func is None:
        return {
            "exists": False,
            "module": module_name,
            "function": function_name,
            "verdict": "HALLUCINATION LIKELY — function not found in module",
        }

    try:
        signature = str(inspect.signature(func))
    except (TypeError, ValueError):
        # Some builtins and non-callables do not expose a signature.
        signature = None

    return {
        "exists": True,
        "module": module_name,
        "function": function_name,
        "signature": signature,
        "verdict": "VERIFIED",
    }


# Example usage
result = verify_python_function_exists("pandas", "json_normalize_schema")
print(result)
# {'exists': False, 'module': 'pandas', 'function': 'json_normalize_schema',
#  'verdict': 'HALLUCINATION LIKELY — function not found in module'}

result2 = verify_python_function_exists("pandas", "json_normalize")
print(result2)
# {'exists': True, 'module': 'pandas', 'function': 'json_normalize',
#  'signature': '(data, ...)', 'verdict': 'VERIFIED'}  (signature abbreviated)
Category 3: Logical Hallucinations
Logical hallucinations are subtler. The model states individual facts correctly but draws wrong conclusions from them. The reasoning chain contains an error even though the premises are accurate.
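When the reasoning is arithmetic, one way to catch a faulty chain is to recompute the conclusion independently and compare it with what the model claimed. The sketch below assumes annual compounding and a single withdrawal after the final period. `future_value` and `conclusion_matches` are hypothetical helper names.

```python
def future_value(principal: float, rate: float, years: int) -> float:
    """Annual compounding: P * (1 + r) ** t."""
    return principal * (1 + rate) ** years


def conclusion_matches(
    claimed: float,
    principal: float,
    rate: float,
    years: int,
    withdrawal: float,
    tol: float = 0.01,
) -> bool:
    """Check a model's claimed balance against an independent recomputation."""
    expected = future_value(principal, rate, years) - withdrawal
    return abs(claimed - expected) <= tol


# $10,000 at 7% for 10 years, then a $5,000 withdrawal.
print(round(future_value(10_000, 0.07, 10), 2))
print(conclusion_matches(14_671.51, 10_000, 0.07, 10, 5_000))
print(conclusion_matches(15_000.00, 10_000, 0.07, 10, 5_000))
```

Recomputation only covers numeric conclusions. An invalid inference in prose, such as a wrong drug-safety conclusion, still needs step-by-step reasoning and expert review.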
Before/After Examples
Example 1 — Wrong inference from correct facts
BEFORE (logical hallucination):
Prompt: "If a patient has a penicillin allergy, can they take amoxicillin?"
Model: "Penicillin and amoxicillin are both antibiotics. Penicillin is
derived from Penicillium mold. Amoxicillin is a widely used
antibiotic. Therefore, amoxicillin is generally safe for patients
with penicillin allergies."
← DANGEROUS LOGICAL HALLUCINATION.
The individual facts are accurate, but the conclusion is wrong.
Amoxicillin IS a penicillin-type antibiotic (aminopenicillin).
Patients with penicillin allergy have significant cross-reactivity risk.
AFTER (forced chain-of-thought with domain flag):
Prompt: "If a patient has a penicillin allergy, can they take amoxicillin?
Show your reasoning step by step. Flag if this is a medical question
requiring clinical judgment."
Model: "⚠️ Medical question — a pharmacist or physician should confirm.
Reasoning:
Step 1: What is amoxicillin? Amoxicillin is an aminopenicillin —
a subclass of penicillin antibiotics.
Step 2: What does penicillin allergy mean? Allergy to the core
beta-lactam ring shared by penicillins.
Step 3: Does amoxicillin share this ring? Yes — amoxicillin
contains the same beta-lactam ring.
Conclusion: Patients with penicillin allergy are at risk of
cross-reactivity with amoxicillin. This is generally
contraindicated. Consult a clinician."
Example 2 — Math reasoning error
BEFORE (logical hallucination):
Prompt: "If I invest $10,000 at 7% annual return for 10 years,
then withdraw $5,000, how much do I have?"
Model: "At 7% for 10 years, your investment grows to $19,672.
After withdrawing $5,000, you have $14,672."
← The model presents a single answer without stating its assumptions.
The question is ambiguous and should trigger clarification:
is the withdrawal at the END of year 10 or at some point during?
The compound growth figure also depends on the compounding frequency.
AFTER:
Prompt: "Same question — assume annual compounding, withdrawal at end of
year 10 AFTER compounding. Show each step."
Model: "Step 1: Compound growth formula: P × (1 + r)^t
= $10,000 × (1.07)^10
= $10,000 × 1.96715
= $19,671.51
Step 2: Withdrawal at end of year 10: $19,671.51 - $5,000
= $14,671.51
Answer: $14,671.51 assuming annual compounding and end-of-period
withdrawal."
Category 4: Instruction Hallucinations
Instruction hallucinations occur when the model ignores or partially ignores parts of the prompt. This is especially common with:
- Long prompts where constraints appear late
- Multiple competing instructions
- Format requirements
- Negative instructions ("do not include X")
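Format violations are mechanical to detect, so a common recovery pattern is validate-and-retry: check the output and, if it fails, ask again with the constraint restated. The sketch below is a minimal illustration. `generate` stands in for whatever function calls your model, and `retry_until_json`, the reminder text, and the attempt limit are all assumptions.

```python
import json
from typing import Callable, Optional

REMINDER = "\nYour entire response must be valid JSON. No text before or after it."


def retry_until_json(
    generate: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> Optional[object]:
    """Call generate() until its output parses as JSON, or give up and return None."""
    current_prompt = prompt
    for _ in range(max_attempts):
        output = generate(current_prompt)
        try:
            return json.loads(output.strip())
        except json.JSONDecodeError:
            # Restate the constraint at the end of the prompt and try again.
            current_prompt = prompt + REMINDER
    return None


# Stub model for illustration: adds prose unless the reminder is present.
def fake_model(prompt: str) -> str:
    if REMINDER in prompt:
        return '["Python", "JavaScript"]'
    return 'Here you go: ["Python", "JavaScript"]'


print(retry_until_json(fake_model, "List two languages as a JSON array."))
```

Capping the attempts matters: a model that keeps violating the constraint should surface as a failure to the caller and not loop indefinitely.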
Before/After Examples
Example 1 — Ignored format constraint
BEFORE (instruction hallucination):
Prompt: "List the top 5 programming languages in 2024.
Respond ONLY as a JSON array of strings. No prose."
Model: "Here are the top 5 programming languages in 2024:
['Python', 'JavaScript', 'Java', 'C++', 'TypeScript']
These languages are widely used across..."
← The model included prose before and after the JSON,
violating the "ONLY as a JSON array" constraint.
AFTER (constraint reinforcement):
Prompt: "List the top 5 programming languages in 2024.
YOUR ENTIRE RESPONSE must be a single JSON array of strings.
Example of valid output: [\"A\", \"B\", \"C\", \"D\", \"E\"]
Do not add any other text before or after the array."
Model: ["Python", "JavaScript", "Java", "C++", "TypeScript"]
Example 2 — Negative instruction ignored
BEFORE (instruction hallucination):
Prompt: "Summarize this article. Do NOT include any medical advice
or recommendations. Only describe what the study did."
Model: "The study examined 500 patients. The researchers found that...
Based on these findings, patients with condition X should
consider consulting their doctor about..."
← The model added a recommendation despite explicit prohibition.
AFTER (with output schema check):
System: "You summarize research articles. You NEVER give medical advice.
If you find yourself about to write a recommendation, stop and
instead write '[CLINICAL JUDGMENT REQUIRED]'."
Model: "The study examined 500 patients over 12 months. Researchers
measured outcomes using validated scales. Results showed a 23%
improvement in the intervention group. [CLINICAL JUDGMENT REQUIRED]"
Detection and Recovery
import json
import re
from typing import Optional


def check_instruction_compliance(
    prompt: str,
    model_output: str,
    required_format: Optional[str] = None,
    forbidden_patterns: Optional[list[str]] = None,
) -> dict:
    """
    Check if model output complies with prompt instructions.
    `prompt` is accepted for context and logging; the checks do not parse it.
    """
    violations = []

    # Check format requirement (e.g., must be JSON)
    if required_format == "json":
        try:
            json.loads(model_output.strip())
        except json.JSONDecodeError:
            violations.append("OUTPUT_NOT_JSON: model was asked for JSON but output is not valid JSON")

    # Check forbidden content patterns (case-insensitive regex search)
    if forbidden_patterns:
        for pattern in forbidden_patterns:
            if re.search(pattern, model_output, re.IGNORECASE):
                violations.append(f"FORBIDDEN_PATTERN_FOUND: '{pattern}' appeared in output")

    # Word count is reported so callers can apply their own length limits
    word_count = len(model_output.split())

    return {
        "compliant": len(violations) == 0,
        "violations": violations,
        "word_count": word_count,
        "action": "RETRY with stronger constraints" if violations else "PASS",
    }


# Example usage
output = '["Python", "JavaScript", "Java", "C++", "TypeScript"]'
result = check_instruction_compliance(
    prompt="Return JSON array only",
    model_output=output,
    required_format="json",
    forbidden_patterns=["should", "recommend", "consult"],
)
print(result)
# {'compliant': True, 'violations': [], 'word_count': 5, 'action': 'PASS'}
Summary: Hallucination Type Comparison
| Type | What Goes Wrong | Most Dangerous In | Primary Mitigation |
|---|---|---|---|
| Factual | Wrong date, stat, name | Healthcare, legal, finance | RAG with citations |
| Entity | Invented paper, person, function | Academic, code generation | External lookup, verification |
| Logical | Wrong conclusion from right facts | Medical advice, math, legal | Chain-of-thought prompting |
| Instruction | Ignores prompt constraints | Format-sensitive apps, safety constraints | Output validation, schema enforcement |
Each type requires a different defence. The next lessons cover mitigation techniques in depth.
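The mapping from type to defence can be made explicit in code so that a detector's verdict routes to the right follow-up. The sketch below simply encodes the "Primary Mitigation" column of the summary table. The enum, dictionary, and function names are hypothetical.

```python
from enum import Enum


class HallucinationType(Enum):
    FACTUAL = "factual"
    ENTITY = "entity"
    LOGICAL = "logical"
    INSTRUCTION = "instruction"


# Values copied from the "Primary Mitigation" column of the summary table.
PRIMARY_MITIGATION = {
    HallucinationType.FACTUAL: "RAG with citations",
    HallucinationType.ENTITY: "External lookup, verification",
    HallucinationType.LOGICAL: "Chain-of-thought prompting",
    HallucinationType.INSTRUCTION: "Output validation, schema enforcement",
}


def mitigation_for(kind: HallucinationType) -> str:
    """Look up the primary mitigation for one hallucination type."""
    return PRIMARY_MITIGATION[kind]


for kind in HallucinationType:
    print(f"{kind.value}: {mitigation_for(kind)}")
```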