How RAG Reduces Hallucinations
Understand how Retrieval-Augmented Generation grounds LLM answers in real documents, enforces citations, and handles missing knowledge gracefully, and learn how to evaluate faithfulness.
The Core Problem RAG Solves
A bare LLM generates text based on patterns learned during training. It has no access to:
- Your organization's internal documents
- Data published after its training cutoff
- Private databases or knowledge bases
When asked about any of these, the model either says "I don't know" (good) or, more dangerously, fabricates a plausible-sounding answer (hallucination).
RAG (Retrieval-Augmented Generation) addresses this by retrieving relevant documents at query time and injecting them into the model's context. The model is then instructed to answer only from those documents.
WITHOUT RAG:
  User: "What is our company's vacation policy?"
  LLM:  [searches internal weights]
        "Most companies offer 15-20 days of paid vacation..." ← hallucinated/generic

WITH RAG:
  User: "What is our company's vacation policy?"
  Step 1: Embed query → search vector DB → retrieve vacation_policy.pdf chunk
  Step 2: Inject retrieved text into context
  LLM:  [reads actual policy from context]
        "According to your policy document: 'Full-time employees accrue
        20 days of paid vacation per year...' [Source: vacation_policy.pdf]"

Ground Truth: Anchoring Answers to Retrieved Documents
The fundamental RAG mechanism:
from anthropic import Anthropic
from sentence_transformers import SentenceTransformer
import numpy as np
import json

client = Anthropic()
embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Simplified in-memory vector store for illustration
class SimpleVectorStore:
    def __init__(self):
        self.documents = []
        self.embeddings = []

    def add(self, text: str, metadata: dict):
        embedding = embedder.encode(text)
        self.documents.append({"text": text, "metadata": metadata})
        self.embeddings.append(embedding)

    def search(self, query: str, top_k: int = 3) -> list[dict]:
        query_emb = embedder.encode(query)
        scores = [
            np.dot(query_emb, doc_emb) /
            (np.linalg.norm(query_emb) * np.linalg.norm(doc_emb))
            for doc_emb in self.embeddings
        ]
        top_indices = np.argsort(scores)[::-1][:top_k]
        return [
            {**self.documents[i], "score": float(scores[i])}
            for i in top_indices
        ]

store = SimpleVectorStore()

# Add company documents
store.add(
    "Employees accrue 20 days of paid vacation per calendar year. "
    "Unused vacation days can be carried over up to a maximum of 10 days.",
    {"source": "hr_policy.pdf", "section": "Vacation"}
)
store.add(
    "Remote work is permitted up to 3 days per week with manager approval. "
    "Fully remote arrangements require VP-level approval.",
    {"source": "hr_policy.pdf", "section": "Remote Work"}
)

def rag_answer(user_question: str) -> dict:
    # Step 1: Retrieve relevant chunks
    results = store.search(user_question, top_k=2)

    # Step 2: Filter by relevance threshold
    MIN_RELEVANCE = 0.3
    relevant = [r for r in results if r["score"] >= MIN_RELEVANCE]
    if not relevant:
        return {
            "answer": "I don't have information about that in the available documents.",
            "sources": [],
            "grounded": False
        }

    # Step 3: Build context string
    context = "\n\n".join([
        f"[Source: {r['metadata']['source']}, Section: {r['metadata']['section']}]\n{r['text']}"
        for r in relevant
    ])

    # Step 4: Prompt the model to answer ONLY from context
    system_prompt = """You are a helpful HR assistant.
Answer ONLY using the provided document excerpts.
If the excerpts do not contain the answer, say "This information is not in the provided documents."
Always cite the source document."""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=400,
        system=system_prompt,
        messages=[{
            "role": "user",
            "content": f"Documents:\n{context}\n\nQuestion: {user_question}"
        }]
    )
    return {
        "answer": response.content[0].text,
        "sources": [r["metadata"]["source"] for r in relevant],
        "relevance_scores": [round(r["score"], 3) for r in relevant],
        "grounded": True
    }

result = rag_answer("How many vacation days do I get?")
print(json.dumps(result, indent=2))

Citation Enforcement: Requiring the Model to Quote Its Source
Asking the model to answer from context is not enough. You must force citation so answers can be verified. Two approaches:
Approach 1: Inline citation markers
CITATION_SYSTEM_PROMPT = """
You are a document assistant. Rules:
1. Answer ONLY from the provided document excerpts.
2. After EVERY factual claim, add a citation: [Doc: <filename>, Para: <n>]
3. If a claim cannot be cited from the documents, do not make it.
4. End your answer with a "Sources" section listing all cited documents.
Example of correct citation format:
"Employees receive 20 vacation days per year [Doc: hr_policy.pdf, Para: 1].
These can roll over up to 10 days [Doc: hr_policy.pdf, Para: 1]."
"""

def rag_with_citations(question: str, context_chunks: list[dict]) -> str:
    # Each chunk is expected to be a flat dict with "source" and "text" keys.
    # Paragraphs are numbered so the model can refer to them in its citations.
    numbered_context = "\n\n".join([
        f"[Para {i+1}, Source: {chunk['source']}]\n{chunk['text']}"
        for i, chunk in enumerate(context_chunks)
    ])
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=600,
        system=CITATION_SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Documents:\n{numbered_context}\n\nQuestion: {question}"
        }]
    )
    return response.content[0].text

Approach 2: Structured output with source field
import json

STRUCTURED_SYSTEM_PROMPT = """
You are a document assistant. Respond ONLY with valid JSON matching this schema:
{
  "answer": "your answer here",
  "confidence": "high|medium|low",
  "citations": [
    {"source": "filename", "quote": "exact quote from document supporting this answer"}
  ],
  "not_in_documents": true/false
}
Answer only from the provided documents. Set not_in_documents=true if you cannot answer.
"""

def rag_structured_output(question: str, context: str) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=600,
        system=STRUCTURED_SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Documents:\n{context}\n\nQuestion: {question}"
        }]
    )
    raw = response.content[0].text.strip()

    # Strip markdown code fences if present
    if raw.startswith("```"):
        raw = raw.split("```")[1]
        if raw.startswith("json"):
            raw = raw[4:]

    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {"error": "Model did not return valid JSON", "raw": raw}

"I Don't Know" Behavior: When No Relevant Document Is Found
One of the most important safety behaviors in a RAG system: if the retrieved context does not answer the question, the model should refuse to answer rather than hallucinate.
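In practice the refusal decision often rests on a similarity cutoff, and a sensible value depends on the embedding model and the corpus. The sketch below shows one hedged way to choose it: collect the best retrieval score for a handful of questions you know are answerable and a handful you know are not, then sweep candidate thresholds and keep the one with the best balanced accuracy. The function name `pick_threshold`, the score lists, and the candidate values are all illustrative assumptions, not measurements from a real system.

```python
def pick_threshold(
    answerable_scores: list[float],
    unanswerable_scores: list[float],
    candidates: list[float],
) -> tuple[float, float]:
    """Return (threshold, balanced_accuracy) for the best candidate.

    answerable_scores:   best retrieval score for questions the corpus CAN answer
    unanswerable_scores: best retrieval score for questions it CANNOT answer
    """
    best_threshold, best_accuracy = candidates[0], -1.0
    for t in candidates:
        # Answerable questions should clear the threshold (pipeline answers)
        answered = sum(s >= t for s in answerable_scores) / len(answerable_scores)
        # Unanswerable questions should fall below it (pipeline refuses)
        refused = sum(s < t for s in unanswerable_scores) / len(unanswerable_scores)
        accuracy = (answered + refused) / 2
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy


# Made-up scores for illustration only
answerable = [0.62, 0.48, 0.55, 0.41, 0.70]
unanswerable = [0.12, 0.28, 0.33, 0.19, 0.38]

threshold, accuracy = pick_threshold(answerable, unanswerable, [0.2, 0.3, 0.4, 0.5])
print(threshold, accuracy)
```

On these invented scores the sweep selects 0.4 because the two groups separate cleanly. Real score distributions usually overlap, so expect a trade-off between refusing answerable questions and answering unanswerable ones.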
def rag_with_fallback(question: str, store: SimpleVectorStore) -> dict:
    """
    RAG pipeline with explicit 'I don't know' behavior.
    """
    results = store.search(question, top_k=3)

    # Hard threshold: if best match score is below this, declare "no answer"
    CONFIDENCE_THRESHOLD = 0.35
    best_score = results[0]["score"] if results else 0.0
    if best_score < CONFIDENCE_THRESHOLD:
        return {
            "answer": (
                "I don't have reliable information about that in the "
                "available documents. Please consult the relevant team "
                "or check the source documentation directly."
            ),
            "grounded": False,
            "best_retrieval_score": round(best_score, 3),
            "action": "ESCALATE_TO_HUMAN"
        }

    context = "\n\n".join([
        f"[{r['metadata']['source']}]\n{r['text']}"
        for r in results
        if r["score"] >= CONFIDENCE_THRESHOLD
    ])

    system = """Answer only from the provided documents.
If the documents don't contain enough information to answer confidently,
respond with: "The available documents do not contain sufficient information
to answer this question reliably."
Never invent information not present in the documents."""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=400,
        system=system,
        messages=[{"role": "user", "content": f"Documents:\n{context}\n\nQuestion: {question}"}]
    )
    return {
        "answer": response.content[0].text,
        "grounded": True,
        "best_retrieval_score": round(best_score, 3),
        "action": "ANSWERED"
    }

Remaining Risks: RAG Does Not Eliminate All Hallucinations
RAG significantly reduces hallucinations but does not eliminate them. Remaining risks:
Risk 1: Context window hallucination
The model can still hallucinate within the provided context — misquoting, paraphrasing incorrectly, or blending two separate passages.
Document says: "Vacation accrual begins after 90 days of employment."
Model says: "Vacation accrual begins after 60 days of employment."
← The context was correctly retrieved, but the model misread it.

Risk 2: Context contamination
If the retrieved chunks contain conflicting information (old policy vs new policy), the model may blend them incorrectly.
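One way to limit this, sketched here under stated assumptions, is to drop superseded chunks before they reach the prompt. The sketch assumes each chunk's metadata carries an `effective_date` field and that chunks sharing a `section` value are versions of the same policy. Neither assumption holds for the simple store shown earlier, so treat `keep_latest_versions` and the sample data as hypothetical.

```python
from datetime import date


def keep_latest_versions(chunks: list[dict]) -> list[dict]:
    """Keep only the newest chunk for each policy section.

    Assumes metadata["effective_date"] is a datetime.date (hypothetical field)
    and that chunks with the same metadata["section"] are competing versions.
    """
    latest: dict[str, dict] = {}
    for chunk in chunks:
        section = chunk["metadata"]["section"]
        current = latest.get(section)
        is_newer = (
            current is None
            or chunk["metadata"]["effective_date"] > current["metadata"]["effective_date"]
        )
        if is_newer:
            latest[section] = chunk
    return list(latest.values())


# Invented sample data: an outdated and a current vacation policy, plus one other section
chunks = [
    {"text": "Employees accrue 15 days of paid vacation per calendar year.",
     "metadata": {"source": "hr_policy_2022.pdf", "section": "Vacation",
                  "effective_date": date(2022, 1, 1)}},
    {"text": "Employees accrue 20 days of paid vacation per calendar year.",
     "metadata": {"source": "hr_policy_2024.pdf", "section": "Vacation",
                  "effective_date": date(2024, 1, 1)}},
    {"text": "Remote work is permitted up to 3 days per week with manager approval.",
     "metadata": {"source": "hr_policy_2024.pdf", "section": "Remote Work",
                  "effective_date": date(2024, 1, 1)}},
]

deduped = keep_latest_versions(chunks)
for chunk in deduped:
    print(chunk["metadata"]["source"], "|", chunk["metadata"]["section"])
```

Filtering before generation keeps the conflict out of the prompt entirely. When versions cannot be identified from metadata, an alternative is to show the model the dates and instruct it to prefer the most recent excerpt.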
Risk 3: Out-of-context generalization
The model may use retrieved context as a springboard to add general knowledge not in the documents.
# Detecting out-of-context additions using NLI
from transformers import pipeline

nli_classifier = pipeline("text-classification", model="cross-encoder/nli-deberta-v3-small")

def check_answer_stays_in_context(answer: str, context: str) -> dict:
    """
    Use NLI to check if every sentence in the answer is entailed by the context.
    A sentence that contradicts or is not entailed by context may be hallucinated.
    """
    # Naive split on "."; it also breaks on decimals and abbreviations,
    # so use a proper sentence splitter for real text
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    results = []
    for sentence in sentences:
        # Pass premise and hypothesis as a text pair so the tokenizer inserts
        # the separator itself, instead of embedding a literal "[SEP]" string
        result = nli_classifier({"text": context, "text_pair": sentence})
        if isinstance(result, list):  # return shape can vary with input type
            result = result[0]
        # Normalize casing: label names come from the model config
        label = result["label"].upper()
        score = result["score"]
        results.append({
            "sentence": sentence,
            "nli_label": label,
            "confidence": round(score, 3),
            "flag": label == "CONTRADICTION" or (label == "NEUTRAL" and score > 0.7)
        })
    flagged = [r for r in results if r["flag"]]
    return {
        "total_sentences": len(sentences),
        "flagged_sentences": len(flagged),
        "details": results,
        "verdict": "POSSIBLE_HALLUCINATION" if flagged else "APPEARS_FAITHFUL"
    }

Faithfulness Evaluation: Does the Answer Match the Context?
Faithfulness measures whether the model's answer is fully supported by the retrieved context. It is distinct from:
- Relevance — did we retrieve the right documents?
- Correctness — is the answer factually true in the world?
An answer can be faithful (grounded in context) but still wrong (if the context itself was wrong). Faithfulness evaluation only checks that the answer is consistent with the retrieved context, not that the context is true.
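One inexpensive faithfulness signal, assuming the answer was produced in the structured format from Approach 2 (a `citations` list of `source`/`quote` pairs), is to check that every quoted span actually appears in the retrieved context. The sketch below is a lexical check only: the helper names are hypothetical, it catches invented or altered quotes, and it says nothing about whether the surrounding answer text follows from those quotes.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wraps don't cause false misses."""
    return re.sub(r"\s+", " ", text).strip().lower()


def verify_quotes(structured_answer: dict, context: str) -> dict:
    """Report citations whose quote is empty or not found in the context."""
    norm_context = _normalize(context)
    citations = structured_answer.get("citations", [])
    unsupported = []
    for citation in citations:
        quote = _normalize(citation.get("quote", ""))
        # An empty quote supports nothing, so treat it as unsupported too
        if not quote or quote not in norm_context:
            unsupported.append(citation)
    return {
        "has_citations": bool(citations),
        "all_quotes_found": bool(citations) and not unsupported,
        "unsupported_citations": unsupported,
    }


# Invented example: the second quote does not occur in the context
context = "Employees accrue 20 days of paid vacation per calendar year."
structured_answer = {
    "answer": "Employees get 20 vacation days, and unused days roll over.",
    "citations": [
        {"source": "hr_policy.pdf", "quote": "Employees accrue 20 days of paid vacation"},
        {"source": "hr_policy.pdf", "quote": "unused days roll over indefinitely"},
    ],
}

report = verify_quotes(structured_answer, context)
print(report["all_quotes_found"], len(report["unsupported_citations"]))
```

A check like this is deterministic and cheap to run on every response, which makes it a reasonable first filter ahead of a model-based judge.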
import json
from anthropic import Anthropic

client = Anthropic()

def evaluate_faithfulness(
    question: str,
    context: str,
    answer: str
) -> dict:
    """
    LLM-as-judge faithfulness evaluation.
    Score: 1 (fully faithful) to 5 (highly unfaithful).
    """
    eval_prompt = f"""
You are a faithfulness evaluator for a RAG system.
QUESTION: {question}
RETRIEVED CONTEXT:
{context}
GENERATED ANSWER:
{answer}
Evaluate whether the generated answer is faithful to the retrieved context.
A faithful answer:
- Makes only claims that are supported by the context
- Does not add information not present in the context
- Does not contradict the context
- Correctly quotes or paraphrases the context
Rate faithfulness on a scale of 1 to 5:
1 = Fully faithful: every claim is directly supported
2 = Mostly faithful: minor paraphrasing that is still accurate
3 = Partially faithful: some claims supported, some added
4 = Mostly unfaithful: significant additions or distortions
5 = Not faithful: answer contradicts or ignores the context
Respond with JSON: {{"score": <1-5>, "reasoning": "...", "problematic_claims": [...]}}
"""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=400,
        messages=[{"role": "user", "content": eval_prompt}]
    )
    raw = response.content[0].text.strip()
    try:
        result = json.loads(raw)
        # Scores of 1 or 2 count as faithful
        result["faithful"] = result["score"] <= 2
        return result
    except (json.JSONDecodeError, KeyError, TypeError):
        # Non-JSON output, or JSON without a usable "score" field
        return {"raw": raw, "parse_error": True}

# Example evaluation
eval_result = evaluate_faithfulness(
    question="How many vacation days do employees get?",
    context="Employees accrue 20 days of paid vacation per calendar year.",
    answer="Employees receive 20 vacation days annually, and unused days roll over."
)
print(eval_result)
# Illustrative output (actual model responses vary):
# {"score": 3, "reasoning": "The rollover claim is not in the context", ...}

Summary
| RAG Property | Effect on Hallucination |
|---|---|
| Retrieval grounding | Reduces knowledge-gap hallucinations |
| Citation enforcement | Makes hallucinations detectable |
| "I don't know" threshold | Prevents hallucinations when knowledge is absent |
| Faithfulness evaluation | Catches hallucinations within retrieved context |
RAG is one of the most effective techniques for reducing hallucinations in production AI systems, but retrieval alone is not enough: it must be paired with explicit citation requirements, relevance thresholds, and faithfulness evaluation.