Query Rewriting: Improve Before Retrieving

Why Raw Queries Fail

User queries are often poor retrieval queries:

User query (conversational): "the drug my mum takes for her heart"
  → Low information density, ambiguous, won't match "Warfarin anticoagulation AF"

User query (abbreviation): "INR target for AF pts on VKA therapy?"
  → Abbreviations may not match spelled-out terms in the knowledge base

User query (vague): "What about the interaction?"
  → Follow-up in a conversation — requires context to understand

User query (wrong terminology): "blood thinner for stroke prevention"
  → Lay term; medical literature uses "anticoagulation" or "antithrombotic"

Query rewriting transforms the user's query into a better retrieval query before searching.


Rewriting Strategies

1. Expansion: add synonyms, related terms, formal equivalents
   "blood thinner" → "anticoagulant, antithrombotic, Warfarin, apixaban"

2. Abbreviation expansion:
   "AF VKA INR" → "atrial fibrillation vitamin K antagonist international normalised ratio"

3. Keyword extraction (for BM25):
   Conversational → keyword form
   "What is the recommended dose of Warfarin for an elderly patient?"
   → "Warfarin dose elderly patient recommendation"

4. Question decomposition:
   "Does Warfarin interact with NSAIDs and does it affect INR?"
   → Query 1: "Warfarin NSAID drug interaction"
   → Query 2: "Warfarin effect on INR monitoring"

5. Contextual resolution:
   "What about the interaction?" + previous context about Warfarin
   → "Warfarin drug-drug interactions"
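Question decomposition (strategy 4) implies running retrieval once per sub-query and then combining the results. The following is a minimal sketch of that merge step, not a definitive implementation. The `retrieve` callable, the `Retriever` alias, and the `fake_retrieve` stand-in with its made-up scores are all hypothetical names introduced for illustration. Merging here simply keeps each document's best score.

```python
from typing import Callable

# A retriever takes a query string and returns (doc_id, score) pairs.
# Hypothetical interface: adapt it to whatever search backend is in use.
Retriever = Callable[[str], list[tuple[str, float]]]

def retrieve_for_subqueries(
    sub_queries: list[str],
    retrieve: Retriever,
    k: int = 5,
) -> list[tuple[str, float]]:
    """Run each sub-query separately and merge, keeping each doc's best score."""
    best: dict[str, float] = {}
    for sub_query in sub_queries:
        for doc_id, score in retrieve(sub_query):
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Stand-in retriever with hard-coded, made-up results (illustration only).
def fake_retrieve(query: str) -> list[tuple[str, float]]:
    canned = {
        "Warfarin NSAID drug interaction": [("doc_nsaid", 0.91), ("doc_general", 0.40)],
        "Warfarin effect on INR monitoring": [("doc_inr", 0.88), ("doc_general", 0.55)],
    }
    return canned.get(query, [])

merged = retrieve_for_subqueries(
    ["Warfarin NSAID drug interaction", "Warfarin effect on INR monitoring"],
    fake_retrieve,
)
# merged lists doc_nsaid first, then doc_inr, then doc_general (best score 0.55)
```

Keeping the maximum score assumes scores are comparable across sub-queries, which usually holds when one retriever serves all of them. If it does not hold, a rank-based merge is a safer choice.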

LLM-Based Query Rewriting

Python
from anthropic import Anthropic

client = Anthropic()

def rewrite_query(
    original_query: str,
    conversation_history: list[dict] | None = None,
    domain: str = "clinical medicine"
) -> str:
    """Rewrite a user query for better retrieval."""
    context = ""
    if conversation_history:
        context = "\n".join(
            f"{msg['role'].upper()}: {msg['content']}"
            for msg in conversation_history[-3:]  # last 3 messages (not 3 full turns)
        )

    prompt = f"""Rewrite the following user query for retrieval in a {domain} knowledge base.
Goals:
  - Expand abbreviations (e.g., AF → atrial fibrillation, INR → international normalised ratio)
  - Replace lay terms with medical terminology (e.g., blood thinner → anticoagulant)
  - Resolve pronouns and references using conversation context
  - Make the query self-contained and specific

Return ONLY the rewritten query. No explanation.

{f"Conversation context:{chr(10)}{context}{chr(10)}" if context else ""}
Original query: {original_query}
Rewritten query:"""

    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text.strip()

# Example
query = "What's the target for AF patients on VKA?"
rewritten = rewrite_query(query, domain="clinical pharmacology")
# Illustrative output (actual wording varies between runs):
#   "What is the recommended INR (international normalised ratio) target range
#    for patients with atrial fibrillation treated with vitamin K antagonists?"

HyDE: Hypothetical Document Embeddings

An alternative approach: instead of rewriting the query, generate a hypothetical answer and embed that:

Python
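import numpy as np

def hyde_query(query: str, client, embedder) -> np.ndarray:
    """Generate a hypothetical document, then embed it for retrieval.

    `client` is an Anthropic client; `embedder` is assumed to expose an
    `encode(text)` method that returns a NumPy vector.
    """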
    # Step 1: Generate a hypothetical answer
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content":
            f"Write a short factual paragraph answering: {query}"}]
    )
    hypothetical_doc = response.content[0].text

    # Step 2: Embed the hypothetical document
    # This embedding is in the "document space" rather than "query space"
    return embedder.encode(hypothetical_doc)

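# The hypothetical doc uses the vocabulary and style of actual documents
# → better alignment with the embedding space of the knowledge base
# The generated text may contain factual errors: it serves only as a
# retrieval key and should not be returned as an answer.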
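Once the hypothetical document is embedded, the resulting vector is used like any other query vector: documents are ranked by similarity to it. Below is a minimal sketch of that ranking step in pure Python, under stated assumptions. The `cosine_similarity` and `top_k` helpers and the toy 3-dimensional vectors are invented for illustration; a real system would use actual embeddings and, most likely, a vector index.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(
    query_vec: list[float],
    doc_vecs: dict[str, list[float]],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Rank documents by similarity to the (HyDE) query vector."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings (made up for illustration).
doc_vecs = {
    "warfarin_inr": [0.9, 0.1, 0.0],
    "nsaid_bleeding": [0.5, 0.5, 0.0],
    "statin_dosing": [0.0, 0.1, 0.9],
}
hyde_vec = [1.0, 0.2, 0.0]  # in practice: hyde_query(...) converted to a list
results = top_k(hyde_vec, doc_vecs, k=2)
# results ranks "warfarin_inr" first and "nsaid_bleeding" second
```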

Query Expansion with Synonyms

For clinical domains, a controlled medical vocabulary helps:

Python
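import re

MEDICAL_SYNONYMS = {
    "heart attack": ["myocardial infarction", "MI", "acute coronary syndrome", "NSTEMI", "STEMI"],
    "blood thinner": ["anticoagulant", "antithrombotic", "Warfarin", "apixaban", "heparin"],
    "af": ["atrial fibrillation", "AF"],
    # INR is derived from prothrombin time: a related term, not a strict synonym
    "inr": ["international normalised ratio", "prothrombin time"],
    "stroke": ["cerebrovascular accident", "CVA", "TIA", "transient ischaemic attack"],
}

def expand_with_synonyms(query: str) -> str:
    """Append known synonyms for any vocabulary term found in the query."""
    query_lower = query.lower()
    expansions = []
    for term, synonyms in MEDICAL_SYNONYMS.items():
        # Match whole words only, so "af" does not fire on "after" or "safe"
        if re.search(rf"\b{re.escape(term)}\b", query_lower):
            expansions.extend(synonyms)
    if expansions:
        # dict.fromkeys removes duplicates while keeping a stable order
        return query + " " + " ".join(dict.fromkeys(expansions))
    return query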
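The same rule-based approach extends to two of the other strategies: abbreviation expansion and keyword extraction for BM25. The sketch below is illustrative only. The `ABBREVIATIONS` table and the `STOPWORDS` set are small hypothetical examples, not standard vocabularies, and a production system would load curated lists instead.

```python
import re

# Hypothetical lookup table; a real system would load a curated vocabulary.
ABBREVIATIONS = {
    "af": "atrial fibrillation",
    "vka": "vitamin K antagonist",
    "inr": "international normalised ratio",
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {
    "what", "is", "the", "of", "for", "an", "a", "on", "in", "to",
    "does", "do", "and", "it", "with",
}

def expand_abbreviations(query: str) -> str:
    """Replace whole-word abbreviations with their spelled-out form."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        return ABBREVIATIONS.get(word.lower(), word)
    # [A-Za-z]+ matches complete alphabetic words, so "after" is never
    # mistaken for the abbreviation "af".
    return re.sub(r"[A-Za-z]+", replace, query)

def extract_keywords(query: str) -> str:
    """Drop stopwords and punctuation, keeping the remaining terms in order."""
    tokens = re.findall(r"[A-Za-z0-9]+", query)
    return " ".join(t for t in tokens if t.lower() not in STOPWORDS)

print(expand_abbreviations("AF VKA INR"))
# atrial fibrillation vitamin K antagonist international normalised ratio
print(extract_keywords("What is the recommended dose of Warfarin for an elderly patient?"))
# recommended dose Warfarin elderly patient
```

Unlike an LLM rewrite, this version keeps the original word order and does not normalise word forms, which BM25 tolerates since it scores terms independently of their position.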

Interview Answer

"Query rewriting transforms a user's natural language query into a better retrieval query before searching. Key techniques: abbreviation expansion (AF → atrial fibrillation), lay-to-medical terminology conversion (blood thinner → anticoagulant), contextual reference resolution (what about the interaction? → Warfarin drug-drug interactions), and keyword extraction for sparse retrieval such as BM25. I use a fast small LLM (Haiku, GPT-4o mini) to perform the rewrite: it's cheap, but it is an extra model call on the critical path, so I measure the added latency rather than assume it is negligible. For multi-part questions, query decomposition generates multiple sub-queries, retrieves results for each independently, and merges them. HyDE is an alternative: generate a hypothetical answer and embed it rather than the query, so the search vector sits closer to the document embedding space."