RAG Systems · Lesson 18 of 24
Reciprocal Rank Fusion (RRF) Explained
The Fusion Problem
Hybrid retrieval runs two or more search systems in parallel and must combine their results:
BM25 results: Dense results:
rank 1: Doc A rank 1: Doc B
rank 2: Doc C rank 2: Doc A
rank 3: Doc B rank 3: Doc D
rank 4: Doc D rank 4: Doc C
Naive merge: hard — BM25 scores and cosine similarities are on different scales
BM25 score 12.3 vs cosine similarity 0.87 — incomparable
Linear combination: score = α × bm25 + (1 - α) × cosine
Requires normalisation (what's the max BM25 score? varies per corpus)
α is a fragile hyperparameter that may need retuning when corpus changesRRF solves this by using only ranks, not raw scores.
RRF Formula
RRF_score(doc) = Σ_retriever 1 / (k + rank(doc, retriever))
Where:
k = 60 (constant — reduces sensitivity to very top ranks)
rank(doc, retriever) = 1-based rank in that retriever's result list
Σ = sum over all retrieval systems
If a document is not in a retriever's results, its contribution is 0.
Example (k=60):
Doc A: BM25 rank 1 + dense rank 2 = 1/61 + 1/62 = 0.01639 + 0.01613 = 0.03252
Doc B: BM25 rank 3 + dense rank 1 = 1/63 + 1/61 = 0.01587 + 0.01639 = 0.03226
Doc C: BM25 rank 2 + dense rank 4 = 1/62 + 1/64 = 0.01613 + 0.01563 = 0.03175
Doc D: BM25 rank 4 + dense rank 3 = 1/64 + 1/63 = 0.01563 + 0.01587 = 0.03149
Final ranking: A, B, C, DImplementation
from collections import defaultdict
def reciprocal_rank_fusion(
result_lists: list[list[dict]],
top_k: int = 10,
k: int = 60,
id_key: str = "id", # unique document identifier key
) -> list[dict]:
"""
result_lists: list of ranked result lists from different retrievers
Each result is a dict with at least an id_key field.
"""
scores = defaultdict(float)
doc_map = {} # id → full document dict
for results in result_lists:
for rank, doc in enumerate(results, start=1):
doc_id = doc[id_key]
scores[doc_id] += 1.0 / (k + rank)
doc_map[doc_id] = doc
# Sort by RRF score descending
sorted_ids = sorted(scores.keys(), key=lambda d: scores[d], reverse=True)
return [
{**doc_map[doc_id], "rrf_score": scores[doc_id]}
for doc_id in sorted_ids[:top_k]
]
# Full hybrid retrieval example
def hybrid_retrieve(
query: str,
dense_collection,
bm25_retriever,
embedder,
top_k: int = 5,
fetch_k: int = 20,
) -> list[dict]:
# Run both retrievers
query_embedding = embedder.encode([query])[0].tolist()
dense_results = dense_collection.query(
query_embeddings=[query_embedding],
n_results=fetch_k,
include=["documents", "metadatas", "distances"],
)
dense_docs = [
{
"id": meta["chunk_id"],
"content": doc,
"metadata": meta,
"similarity": 1 - dist,
}
for doc, meta, dist in zip(
dense_results["documents"][0],
dense_results["metadatas"][0],
dense_results["distances"][0],
)
]
bm25_docs = bm25_retriever.retrieve(query, top_k=fetch_k)
for i, doc in enumerate(bm25_docs):
doc["id"] = doc["metadata"]["chunk_id"]
# Fuse with RRF
fused = reciprocal_rank_fusion(
result_lists=[dense_docs, bm25_docs],
top_k=top_k,
k=60,
)
return fusedRRF with Three Systems
# Add a third retriever (e.g., title-specific BM25 on document titles)
title_bm25 = BM25Retriever(all_titles, all_metas)
title_results = title_bm25.retrieve(query, top_k=fetch_k)
for doc in title_results:
doc["id"] = doc["metadata"]["chunk_id"]
fused_3 = reciprocal_rank_fusion(
result_lists=[dense_docs, bm25_docs, title_results],
top_k=top_k,
k=60,
)RRF vs Linear Combination
RRF advantages:
No score normalisation needed — uses ranks only
Robust to scale differences between retrievers
k=60 is a good default that rarely needs tuning
Naturally handles documents missing from one retriever
Linear combination advantages:
Can weight one retriever more heavily than another
Requires score normalisation (max-min or z-score)
α is a tunable hyperparameter
When to use each:
RRF: default choice, production systems, when corpus changes frequently
Linear: when you have offline evaluation data to tune α precisely
Azure AI Search, Qdrant, and Weaviate all support RRF nativelyWhy k=60
k controls sensitivity to top ranks:
Without k: 1/rank — rank 1 = 1.0, rank 2 = 0.5, rank 3 = 0.33
Very sensitive to being rank 1 vs rank 2
With k=60: rank 1 = 1/61 = 0.0164, rank 2 = 1/62 = 0.0161
Differences are small — rewards consensus across retrievers
A document at rank 10 in both systems beats rank 1 in only one
Typical range: 20–100
Lower k: winner-takes-all behaviour
Higher k: more equal weighting across ranksInterview Answer
"Reciprocal Rank Fusion combines ranked result lists from multiple retrievers using position only — each document scores 1/(k+rank) per retriever, summed across all retrievers. The k=60 constant dampens the winner-takes-all effect so that a document ranking highly in both BM25 and dense search beats one ranking first in only one. The key advantage over linear score combination is that no normalisation is needed — BM25 and cosine similarity are on incompatible scales, but their ranks are directly comparable. I use RRF as the default fusion strategy for hybrid retrieval: it requires no per-corpus tuning and naturally handles documents missing from one retriever."