Cosine vs Dot Product Similarity
When to use cosine similarity versus dot product in vector search, how they differ mathematically, and which embedding models require which metric.
The Two Common Metrics
Cosine similarity:
cos(A, B) = (A · B) / (‖A‖ × ‖B‖)
Range: [-1, 1] (scores for typical text embeddings mostly fall in [0, 1])
Measures: ANGLE between vectors; ignores magnitude
Dot product:
A · B = Σ aᵢ × bᵢ
Range: unbounded
Measures: magnitude AND angle combined
For unit-normalised vectors, they are identical: if ‖A‖ = ‖B‖ = 1, then cos(A, B) = A · B.
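As a quick check of the definitions above, here is a minimal sketch using only the Python standard library. The helper names (`dot`, `cosine`, `unit`) and the two example vectors are illustrative assumptions, not part of any library.

```python
import math

def dot(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product divided by the product of the two Euclidean norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def unit(a):
    """Scale a vector to length 1."""
    length = math.sqrt(dot(a, a))
    return [x / length for x in a]

a = [3.0, 4.0]  # Euclidean norm 5
b = [4.0, 3.0]  # Euclidean norm 5

print(dot(a, b))              # 24.0
print(cosine(a, b))           # 24 / 25, i.e. 0.96 up to floating-point rounding
print(dot(unit(a), unit(b)))  # matches cosine(a, b) up to floating-point rounding
```

The last line is the unit-vector identity in action: once both vectors are scaled to length 1, the plain dot product already is the cosine.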
When They Differ
Cosine is magnitude-invariant:
embed("cat") = [0.5, 0.3, ...] (small magnitude)
embed("The cat sat on the mat. It was a large cat.") = [1.0, 0.6, ...]
Cosine similarity between them: high (the vectors point in the same direction, so same topic)
Dot product: scales with both magnitudes, so the low-magnitude vector pulls the score down
Use cosine when: magnitude should not affect ranking
Use cosine for: most RAG applications
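The effect can be worked through numerically. The sketch below keeps only the first two components of the example vectors above (the elided "..." components are dropped purely for illustration) and scores both against a hypothetical query vector; `query` and the helper names are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# First two components of the example vectors; the remaining ones are omitted.
short_doc = [0.5, 0.3]
long_doc = [1.0, 0.6]   # same direction, twice the magnitude
query = [0.8, 0.5]      # hypothetical query vector

cos_short, cos_long = cosine(query, short_doc), cosine(query, long_doc)
dot_short, dot_long = dot(query, short_doc), dot(query, long_doc)

print(math.isclose(cos_short, cos_long))      # True: cosine ignores the scaling
print(math.isclose(dot_long, 2 * dot_short))  # True: dot product doubles with the magnitude
```

Under cosine the two documents tie; under dot product the higher-magnitude one ranks first even though both point the same way.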
Dot product rewards high-magnitude vectors:
If a model is trained with a dot product objective and long documents
deliberately get higher magnitude, the dot product captures
"importance" (the document is strongly about this topic)
Use dot product when: the embedding model was explicitly
trained with a dot product objective and its model card recommends it
Which Metric for Which Model
Model | Recommended metric | Why
----------------------------|--------------------|---------------------------------
text-embedding-3-small | cosine or dot | normalised by default
text-embedding-3-large | cosine or dot | normalised by default
all-MiniLM-L6-v2 | cosine | trained with cosine objective
all-mpnet-base-v2 | cosine | normalised embeddings
MedCPT-Query-Encoder | cosine | FAISS cosine during training
BGE models (BAAI) | dot product | trained with inner product
E5 models | cosine | normalised embeddings
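Whether a model's output is already unit-normalised can also be checked empirically, which tells you whether cosine and dot product are interchangeable for it. The sketch below is one way to do that with NumPy; `is_unit_normalised` and its tolerance are assumptions, and random vectors stand in for real model output.

```python
import numpy as np

def is_unit_normalised(embeddings: np.ndarray, tol: float = 1e-3) -> bool:
    """Return True if every row has an L2 norm within `tol` of 1."""
    norms = np.linalg.norm(embeddings, axis=1)
    return bool(np.all(np.abs(norms - 1.0) < tol))

# Stand-in data: random vectors in place of embeddings from a real model.
rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 8))
unit = raw / np.linalg.norm(raw, axis=1, keepdims=True)

print(is_unit_normalised(raw))   # False for these random vectors
print(is_unit_normalised(unit))  # True
```

If the check passes on a sample of real embeddings, cosine and dot product produce the same ranking for that model; if it fails, the choice of metric matters.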
Rule: check the model card. Using the wrong metric degrades retrieval quality.
Implementation
import numpy as np
from numpy.linalg import norm

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Undefined (division by zero) if either vector has zero norm
    return float(np.dot(a, b) / (norm(a) * norm(b)))

def dot_product(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b))

# Batch cosine (normalise then dot product)
def cosine_similarity_batch(
    query: np.ndarray,  # shape: (d,)
    docs: np.ndarray,   # shape: (n, d)
) -> np.ndarray:
    query_norm = query / norm(query)
    docs_norm = docs / norm(docs, axis=1, keepdims=True)
    return docs_norm @ query_norm  # shape: (n,)

# Pre-normalise at index time (recommended)
def normalise(embeddings: np.ndarray) -> np.ndarray:
    norms = norm(embeddings, axis=1, keepdims=True)
    return embeddings / norms

# Then at query time: dot product == cosine (faster)
# doc_embeddings: (n, d) array of raw document embeddings; query_embedding: (d,) array
doc_embeddings_normalised = normalise(doc_embeddings)  # done once, at index time
query_norm = normalise(query_embedding.reshape(1, -1))[0]
scores = doc_embeddings_normalised @ query_norm

Chroma / FAISS Configuration
import chromadb
import faiss
import numpy as np

# Chroma: specify space in collection metadata
client = chromadb.Client()  # in-memory client
collection = client.get_or_create_collection(
    name="docs",
    metadata={"hnsw:space": "cosine"},  # or "ip" (inner product) or "l2"
)

# FAISS: choose index type
d = 768
# embeddings: (n, d) array of document vectors; FAISS expects contiguous float32
embeddings = np.ascontiguousarray(embeddings, dtype=np.float32)

# Cosine: normalise vectors, use inner product index
index_cosine = faiss.IndexFlatIP(d)  # inner product on normalised = cosine
# normalize_L2 works in place, so normalise a copy and keep the raw array intact
normalised = embeddings.copy()
faiss.normalize_L2(normalised)
index_cosine.add(normalised)

# Pure dot product (no normalisation)
index_ip = faiss.IndexFlatIP(d)
index_ip.add(embeddings)  # raw embeddings

# L2 (Euclidean): rarely used for text
index_l2 = faiss.IndexFlatL2(d)

Distance vs Similarity
Vector databases often return distance, not similarity:
Cosine distance = 1 - cosine_similarity
cosine_sim = 0.9 → distance = 0.1 (very similar)
cosine_sim = 0.5 → distance = 0.5 (moderately similar)
cosine_sim = 0.0 → distance = 1.0 (unrelated)
L2 distance: Euclidean distance, not the same as cosine distance
For normalised vectors: L2² = 2 × (1 - cosine_sim)
So L2 ranking == cosine ranking on normalised vectors
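The identity follows from expanding ‖a − b‖² = ‖a‖² + ‖b‖² − 2 a·b with both norms equal to 1. A small numerical sketch, assuming NumPy and using random unit vectors as stand-ins for real embeddings, confirms both the identity and the matching rank order:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in embeddings: random vectors scaled to unit length.
docs = rng.normal(size=(100, 16))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
query = rng.normal(size=16)
query /= np.linalg.norm(query)

cosine_sim = docs @ query                         # dot product == cosine on unit vectors
l2_squared = np.sum((docs - query) ** 2, axis=1)  # squared Euclidean distance

# Identity for unit vectors: L2 squared = 2 * (1 - cosine_sim)
identity_holds = np.allclose(l2_squared, 2 * (1 - cosine_sim))
print(identity_holds)  # True

# Ascending L2 distance should give the same order as descending cosine similarity.
same_order = np.array_equal(np.argsort(l2_squared), np.argsort(-cosine_sim))
print(same_order)
```

The equivalence holds only after normalisation; on raw vectors of varying magnitude, L2 and cosine can rank documents differently.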
Chroma returns "distances": lower is more similar (cosine distance when the collection is created with "hnsw:space": "cosine")
Convert: similarity = 1 - distance
def retrieve_with_similarity(query_embedding, collection, top_k=5):
    # Assumes the collection uses cosine space, so distance = 1 - cosine similarity
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=top_k,
        include=["documents", "metadatas", "distances"],
    )
    return [
        {
            "content": doc,
            "metadata": meta,
            "similarity": 1 - dist,  # convert distance to similarity
        }
        for doc, meta, dist in zip(
            results["documents"][0],
            results["metadatas"][0],
            results["distances"][0],
        )
    ]

Interview Answer
"Cosine similarity measures the angle between vectors (magnitude-invariant), while dot product measures both angle and magnitude. For normalised unit vectors they are identical. In RAG, cosine is the default choice because it's robust to varying text lengths. The important thing is to match the metric to the embedding model's training objective: BGE models use dot product; MiniLM and E5 use cosine. Mismatching degrades retrieval quality. Practically, pre-normalising embeddings at index time and then using dot product is faster than computing cosine at query time."