Query Rewriting
How rewriting user queries before retrieval improves RAG recall: expanding abbreviations, correcting spelling, converting to keyword form, and step-back prompting.
Why Raw Queries Fail
User queries are often poor retrieval queries:
User query (conversational): "the drug my mum takes for her heart"
→ Low information density, ambiguous, won't match "Warfarin anticoagulation AF"
User query (abbreviation): "INR target for AF pts on VKA therapy?"
→ Abbreviations may not match spelled-out terms in the knowledge base
User query (vague): "What about the interaction?"
→ Follow-up in a conversation; requires context to understand
User query (wrong terminology): "blood thinner for stroke prevention"
→ Lay term; medical literature uses "anticoagulation" or "antithrombotic"

Query rewriting transforms the user's query into a better retrieval query before searching.
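The vocabulary gap can be made concrete with a quick lexical-overlap check. The sketch below is illustrative only: `tokenize` and `token_overlap` are hypothetical helpers, not part of any retrieval library, and the rewritten query is hand-written for the example. Real sparse retrievers such as BM25 weight terms rather than counting them, but a query that shares no terms with a document scores zero either way.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_overlap(query: str, document: str) -> set[str]:
    """Return the tokens shared by the query and the document."""
    return tokenize(query) & tokenize(document)


document = "Warfarin anticoagulation AF"
raw_query = "blood thinner for stroke prevention"
# Hand-written rewrite, assumed for illustration
rewritten_query = "warfarin anticoagulation for stroke prevention in atrial fibrillation"

raw_shared = token_overlap(raw_query, document)              # no shared tokens
rewritten_shared = token_overlap(rewritten_query, document)  # shares two tokens
```

The raw query shares nothing with the document, while the rewritten one matches on "warfarin" and "anticoagulation". Note that "AF" in the document still goes unmatched by "atrial fibrillation", which is why abbreviation handling may need to work in both directions.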
Rewriting Strategies
1. Expansion: add synonyms, related terms, formal equivalents
   "blood thinner" → "anticoagulant, antithrombotic, Warfarin, apixaban"
2. Abbreviation expansion:
   "AF VKA INR" → "atrial fibrillation vitamin K antagonist international normalised ratio"
3. Keyword extraction (for BM25):
   Conversational → keyword form
   "What is the recommended dose of Warfarin for an elderly patient?"
   → "Warfarin dose elderly patient recommendation"
4. Question decomposition:
   "Does Warfarin interact with NSAIDs and does it affect INR?"
   → Query 1: "Warfarin NSAID drug interaction"
   → Query 2: "Warfarin effect on INR monitoring"
5. Contextual resolution:
   "What about the interaction?" + previous context about Warfarin
   → "Warfarin drug-drug interactions"

LLM-Based Query Rewriting
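Not every strategy needs a model. As a baseline for comparison, keyword extraction (strategy 3 above) can be approximated with a stopword filter. This is a minimal sketch: `STOPWORDS` and `extract_keywords` are hypothetical names, and the stopword list is a small illustrative sample rather than a standard one.

```python
import re

# Small illustrative stopword list (an assumption; real systems use a fuller list)
STOPWORDS = {
    "what", "is", "the", "of", "for", "an", "a", "in", "on",
    "to", "does", "do", "and", "it", "with",
}


def extract_keywords(query: str) -> str:
    """Drop stopwords and punctuation, keeping the remaining terms in order."""
    tokens = re.findall(r"[A-Za-z0-9]+", query)
    kept = [token for token in tokens if token.lower() not in STOPWORDS]
    return " ".join(kept)


keywords = extract_keywords(
    "What is the recommended dose of Warfarin for an elderly patient?"
)
# keywords == "recommended dose Warfarin elderly patient"
```

The result is close to the example in strategy 3 but not identical: a stopword filter keeps the original word order and word forms, so "recommended" is not normalised to "recommendation". Order does not matter to BM25, but word form does unless the index applies stemming.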
from anthropic import Anthropic

client = Anthropic()

def rewrite_query(
    original_query: str,
    conversation_history: list[dict] | None = None,
    domain: str = "clinical medicine"
) -> str:
    """Rewrite a user query for better retrieval."""
    context = ""
    if conversation_history:
        context = "\n".join(
            f"{msg['role'].upper()}: {msg['content']}"
            for msg in conversation_history[-3:]  # last 3 messages for context
        )
    prompt = f"""Rewrite the following user query for retrieval in a {domain} knowledge base.
Goals:
- Expand abbreviations (e.g., AF → atrial fibrillation, INR → international normalised ratio)
- Replace lay terms with medical terminology (e.g., blood thinner → anticoagulant)
- Resolve pronouns and references using conversation context
- Make the query self-contained and specific
Return ONLY the rewritten query. No explanation.
{f"Conversation context:{chr(10)}{context}{chr(10)}" if context else ""}
Original query: {original_query}
Rewritten query:"""
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text.strip()
# Example
query = "What's the target for AF patients on VKA?"
rewritten = rewrite_query(query, domain="clinical pharmacology")
# → "What is the recommended INR (international normalised ratio) target range
#    for patients with atrial fibrillation treated with vitamin K antagonists?"

HyDE: Hypothetical Document Embeddings
An alternative approach: instead of rewriting the query, generate a hypothetical answer and embed that:
import numpy as np

def hyde_query(query: str, client, embedder) -> np.ndarray:
    """Generate a hypothetical document, then embed it for retrieval."""
    # Step 1: Generate a hypothetical answer
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content":
                   f"Write a short factual paragraph answering: {query}"}]
    )
    hypothetical_doc = response.content[0].text
    # Step 2: Embed the hypothetical document
    # This embedding is in the "document space" rather than "query space"
    # (embedder is assumed to expose an encode() method, as in sentence-transformers)
    return embedder.encode(hypothetical_doc)

# The hypothetical doc uses the vocabulary and style of actual documents
# → better alignment with the embedding space of the knowledge base

Query Expansion with Synonyms
For clinical domains, a controlled medical vocabulary helps:
import re

MEDICAL_SYNONYMS = {
    "heart attack": ["myocardial infarction", "MI", "acute coronary syndrome", "NSTEMI", "STEMI"],
    "blood thinner": ["anticoagulant", "antithrombotic", "Warfarin", "apixaban", "heparin"],
    "af": ["atrial fibrillation", "AF"],
    "inr": ["international normalised ratio", "prothrombin time"],
    "stroke": ["cerebrovascular accident", "CVA", "TIA", "transient ischaemic attack"],
}

def expand_with_synonyms(query: str) -> str:
    """Append known synonyms for any vocabulary term found in the query."""
    expansions = []
    for term, synonyms in MEDICAL_SYNONYMS.items():
        # Match whole words only, so "af" does not fire on "after" or "leaf"
        if re.search(rf"\b{re.escape(term)}\b", query, flags=re.IGNORECASE):
            expansions.extend(synonyms)
    if expansions:
        # dict.fromkeys de-duplicates while preserving order (set() would not)
        return query + " " + " ".join(dict.fromkeys(expansions))
    return query

Interview Answer
"Query rewriting transforms a user's natural language query into a better retrieval query before searching. Key techniques: abbreviation expansion (AF ā atrial fibrillation), lay-to-medical terminology conversion (blood thinner ā anticoagulant), contextual reference resolution (what about the interaction? ā Warfarin NSAID drug interaction), and keyword extraction for sparse retrieval. I use a fast small LLM (Haiku, GPT-4o mini) to perform the rewrite ā it's cheap and adds ~50ms latency. For multi-part questions, query decomposition generates multiple sub-queries and retrieves results for each independently. HyDE is an alternative: generate a hypothetical answer and embed it rather than the query ā this aligns with the document embedding space."