Query Rewriting: Improve Before Retrieving
Why Raw Queries Fail
User queries are often poor retrieval queries:
User query (conversational): "the drug my mum takes for her heart"
→ Low information density, ambiguous, won't match "Warfarin anticoagulation AF"
User query (abbreviation): "INR target for AF pts on VKA therapy?"
→ Abbreviations may not match spelled-out terms in the knowledge base
User query (vague): "What about the interaction?"
→ Follow-up in a conversation — requires context to understand
User query (wrong terminology): "blood thinner for stroke prevention"
→ Lay term; medical literature uses "anticoagulation" or "antithrombotic"
Query rewriting transforms the user's query into a better retrieval query before searching.
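To make the vocabulary mismatch concrete, the sketch below lists the terms a query shares with a document title. The `tokenize` and `term_overlap` helpers, the naive tokenizer, and the sample title are illustrative assumptions, not part of any retrieval library. A real sparse retriever such as BM25 weights terms rather than just counting them, but it still cannot reward a document for terms the query never mentions.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase and split on non-alphanumeric characters (deliberately naive)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def term_overlap(query: str, document: str) -> set[str]:
    """Return the terms that appear in both the query and the document."""
    return tokenize(query) & tokenize(document)

# Hypothetical document title, made up for this illustration.
document = "Warfarin anticoagulation for stroke prevention in atrial fibrillation"

raw_query = "the drug my mum takes for her heart"
rewritten_query = "anticoagulant for atrial fibrillation such as Warfarin"

print(sorted(term_overlap(raw_query, document)))
# → ['for']
print(sorted(term_overlap(rewritten_query, document)))
# → ['atrial', 'fibrillation', 'for', 'warfarin']
```

The raw query shares only a function word with the title, while the rewritten query shares the clinically meaningful terms.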
Rewriting Strategies
1. Expansion: add synonyms, related terms, formal equivalents
"blood thinner" → "anticoagulant, antithrombotic, Warfarin, apixaban"
2. Abbreviation expansion:
"AF VKA INR" → "atrial fibrillation vitamin K antagonist international normalised ratio"
3. Keyword extraction (for BM25):
Conversational → keyword form
"What is the recommended dose of Warfarin for an elderly patient?"
→ "Warfarin dose elderly patient recommendation"
4. Question decomposition:
"Does Warfarin interact with NSAIDs and does it affect INR?"
→ Query 1: "Warfarin NSAID drug interaction"
→ Query 2: "Warfarin effect on INR monitoring"
5. Contextual resolution:
"What about the interaction?" + previous context about Warfarin
→ "Warfarin drug-drug interactions"
LLM-Based Query Rewriting
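Not every strategy in the list above needs a model call. As a baseline to compare an LLM rewriter against, strategy 3 (keyword extraction) can be approximated by dropping stop words. The `extract_keywords` name and the hand-picked stop-word list below are illustrative assumptions; a production system would more likely take both the tokenizer and the stop-word list from an NLP library.

```python
import re

# A tiny, hand-picked stop-word list for illustration only (an assumption,
# not a standard list).
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "what", "which", "who", "how",
    "of", "for", "in", "on", "to", "and", "or", "does", "do", "it", "with",
}

def extract_keywords(query: str) -> str:
    """Keep content words in their original order; drop stop words and punctuation."""
    tokens = re.findall(r"[A-Za-z0-9]+", query)
    return " ".join(t for t in tokens if t.lower() not in STOP_WORDS)

print(extract_keywords(
    "What is the recommended dose of Warfarin for an elderly patient?"
))
# → recommended dose Warfarin elderly patient
```

Unlike the hand-written example in the list, this rule-based version keeps the original word order and word forms. It cannot expand abbreviations or resolve references, which is where an LLM rewriter earns its cost.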
from anthropic import Anthropic

client = Anthropic()

def rewrite_query(
    original_query: str,
    conversation_history: list[dict] | None = None,
    domain: str = "clinical medicine",
) -> str:
    """Rewrite a user query for better retrieval."""
    context = ""
    if conversation_history:
        context = "\n".join(
            f"{msg['role'].upper()}: {msg['content']}"
            for msg in conversation_history[-3:]  # last 3 messages for context
        )

    # The prompt body is left unindented so no stray whitespace reaches the model.
    prompt = f"""Rewrite the following user query for retrieval in a {domain} knowledge base.
Goals:
- Expand abbreviations (e.g., AF → atrial fibrillation, INR → international normalised ratio)
- Replace lay terms with medical terminology (e.g., blood thinner → anticoagulant)
- Resolve pronouns and references using conversation context
- Make the query self-contained and specific
Return ONLY the rewritten query. No explanation.
{f"Conversation context:{chr(10)}{context}{chr(10)}" if context else ""}
Original query: {original_query}
Rewritten query:"""

    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text.strip()
# Example
query = "What's the target for AF patients on VKA?"
rewritten = rewrite_query(query, domain="clinical pharmacology")
# Illustrative output (model responses vary between runs):
# "What is the recommended INR (international normalised ratio) target range
# for patients with atrial fibrillation treated with vitamin K antagonists?"
HyDE: Hypothetical Document Embeddings
An alternative approach: instead of rewriting the query, generate a hypothetical answer and embed that:
import numpy as np

def hyde_query(query: str, client, embedder) -> np.ndarray:
    """Generate a hypothetical document, then embed it for retrieval."""
    # Step 1: Generate a hypothetical answer
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content":
                   f"Write a short factual paragraph answering: {query}"}],
    )
    hypothetical_doc = response.content[0].text
    # Step 2: Embed the hypothetical document
    # This embedding is in the "document space" rather than "query space"
    # (embedder is assumed to expose encode(), as sentence-transformers models do)
    return embedder.encode(hypothetical_doc)
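To see what the returned vector is for, the sketch below ranks two documents against a raw query and against a hypothetical answer. Everything in it is a stand-in: `toy_embed` is a bag-of-words counter rather than a real embedding model, `hypothetical_doc` is hard-coded where the LLM output would go, and the two documents are made up. Because bag-of-words vectors capture no semantics at all, this exaggerates the gap a dense embedder would show; it is only meant to illustrate the direction of the effect.

```python
import math
import re
from collections import Counter

def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query_vec: Counter, doc_vecs: list[Counter]) -> list[int]:
    """Return document indices sorted by similarity to the query, best first."""
    return sorted(range(len(doc_vecs)),
                  key=lambda i: cosine(query_vec, doc_vecs[i]),
                  reverse=True)

# Hypothetical knowledge-base entries.
docs = [
    "Warfarin anticoagulation in atrial fibrillation: INR target range and monitoring",
    "Dietary advice for heart health and blood pressure",
]
doc_vecs = [toy_embed(d) for d in docs]

raw_query = "what blood test number should my mum aim for with her heart tablets"
# Hard-coded stand-in for the paragraph the LLM would generate in Step 1.
hypothetical_doc = ("Patients with atrial fibrillation on warfarin have their "
                    "INR monitored against a target range.")

print(rank(toy_embed(raw_query), doc_vecs))         # → [1, 0]
print(rank(toy_embed(hypothetical_doc), doc_vecs))  # → [0, 1]
```

The raw query lands on the dietary document because of incidental words like "heart" and "blood", while the hypothetical answer lands on the anticoagulation document.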
The hypothetical document uses the vocabulary and style of actual documents, which gives better alignment with the embedding space of the knowledge base.
Query Expansion with Synonyms
For clinical domains, a controlled medical vocabulary helps:
import re

MEDICAL_SYNONYMS = {
    "heart attack": ["myocardial infarction", "MI", "acute coronary syndrome", "NSTEMI", "STEMI"],
    "blood thinner": ["anticoagulant", "antithrombotic", "Warfarin", "apixaban", "heparin"],
    "af": ["atrial fibrillation", "AF"],
    "inr": ["international normalised ratio", "prothrombin time"],
    "stroke": ["cerebrovascular accident", "CVA", "TIA", "transient ischaemic attack"],
}

def expand_with_synonyms(query: str) -> str:
    """Append controlled-vocabulary synonyms for any term found in the query."""
    expansions = []
    for term, synonyms in MEDICAL_SYNONYMS.items():
        # Match whole words only, so "af" does not fire on "after"
        if re.search(rf"\b{re.escape(term)}\b", query, flags=re.IGNORECASE):
            expansions.extend(synonyms)
    if expansions:
        # dict.fromkeys de-duplicates while keeping a stable order (set() does not)
        return query + " " + " ".join(dict.fromkeys(expansions))
    return query
Interview Answer
"Query rewriting transforms a user's natural language query into a better retrieval query before searching. Key techniques: abbreviation expansion (AF → atrial fibrillation), lay-to-medical terminology conversion (blood thinner → anticoagulant), contextual reference resolution (what about the interaction? → Warfarin NSAID drug interaction), and keyword extraction for sparse retrieval. I use a fast small LLM (Haiku, GPT-4o mini) to perform the rewrite — it's cheap and adds ~50ms latency. For multi-part questions, query decomposition generates multiple sub-queries and retrieves results for each independently. HyDE is an alternative: generate a hypothetical answer and embed it rather than the query — this aligns with the document embedding space."