PostgreSQL Deep Dive · Lesson 1 of 4
Indexing & Query Performance
Most PostgreSQL performance problems are caused by the same small set of issues: missing indexes, wrong index types, stale statistics, or queries that accidentally bypass the indexes that exist. This guide gives you the mental model and the tools to diagnose and fix all of them.
How B-Tree Indexes Actually Work
A B-tree index is a balanced tree where each node is a page (8KB by default). Leaf nodes contain (key_value, heap_tuple_id) pairs pointing to the actual row location in the table (the heap).
B-tree for orders(customer_id):
                            [500]
                           /     \
               [200, 350]           [650, 800]
              /    |     \         /     |     \
  [1..199] [200..349] [350..499] [500..649] [650..800] [801..]
    Leaf      Leaf       Leaf       Leaf       Leaf      Leaf
   pages     pages      pages      pages      pages     pages
  (each leaf entry holds a ctid pointer to the row in the heap)

What this means for queries:
- WHERE customer_id = 42 → 3 page reads (root → internal → leaf), then a heap fetch for the row
- WHERE customer_id BETWEEN 100 AND 200 → walk down once, then scan contiguous leaf pages (very fast)
- WHERE customer_id IS NOT NULL → can use the index (B-tree stores NULLs in their own range), though the planner often prefers a seq scan when nearly every row qualifies
- WHERE (customer_id % 100) = 42 → misses the index entirely (the index stores customer_id, not the expression)
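You can see those page reads directly with EXPLAIN's BUFFERS option — a minimal sketch, assuming an orders table with a B-tree index on customer_id:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
-- "Buffers: shared hit=N read=M" counts the index pages walked (root → leaf)
-- plus the heap pages fetched for the matching rows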
Fill Factor and Bloat
B-tree pages have a configurable fillfactor (default 90%). During the initial build and ordinary inserts, pages are filled only to that percentage, leaving headroom for later entries. When a page does fill completely, it splits into two half-full pages. Heavy UPDATE workloads cause frequent splits → bloat.
-- Reduce fill factor for heavily updated indexes (more free space per page)
CREATE INDEX idx_orders_status ON orders(status) WITH (fillfactor = 70);
-- Check index size and usage (a rough signal; see pgstattuple below for a precise bloat check)
SELECT
indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
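The view above shows size and usage, not bloat itself; for a precise measurement, the pgstattuple contrib extension inspects leaf density — a sketch, assuming the idx_orders_status index from above:
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_orders_status');
-- avg_leaf_density far below the fillfactor (90 by default) indicates bloat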
-- Rebuild a bloated index without locking the table
REINDEX INDEX CONCURRENTLY idx_orders_status;
Every Index Type — When to Use Each
B-Tree (Default) — Ordered Comparisons
Best for: =, <, >, <=, >=, BETWEEN, IN, IS NULL, LIKE 'prefix%' (with text_pattern_ops or the C collation), ORDER BY
CREATE INDEX idx_orders_created ON orders(created_at);
CREATE UNIQUE INDEX idx_users_email ON users(email);
-- Expression index: index the computed value, not the raw column
CREATE INDEX idx_users_email_lower ON users(lower(email));
-- Enables: WHERE lower(email) = 'alice@example.com' (uses index)
-- Without this: WHERE lower(email) = ... → seq scan
Hash Index — Exact Equality Only
Can be faster than a B-tree for pure equality lookups (an O(1) bucket lookup rather than O(log n), and each entry stores only a 4-byte hash code, so the index stays small even for long values), but useless for ranges or sorting.
CREATE INDEX idx_sessions_token ON sessions USING HASH (session_token);
-- Use only for: WHERE session_token = '...' (equality only)
-- Cannot use for: ORDER BY, BETWEEN, range queries
Use hash when you have a high-cardinality column used only in equality checks (session tokens, UUIDs as lookup keys).
GIN (Generalised Inverted Index) — Multi-Value Data
GIN indexes every element inside a composite value and stores an inverted mapping: element → set of rows containing it. Perfect for full-text search and JSONB.
-- Full-text search
ALTER TABLE articles ADD COLUMN tsv tsvector
GENERATED ALWAYS AS (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))) STORED;
CREATE INDEX idx_articles_fts ON articles USING GIN(tsv);
-- Query
SELECT title, ts_rank(tsv, query) AS rank
FROM articles, plainto_tsquery('english', 'postgresql performance') query
WHERE tsv @@ query
ORDER BY rank DESC
LIMIT 20;
-- JSONB containment
CREATE INDEX idx_products_attrs ON products USING GIN(attributes);
-- Enables:
-- WHERE attributes @> '{"color": "red"}'
-- WHERE attributes ? 'weight_kg'
-- WHERE attributes->'tags' @> '["sale"]'
-- JSONB with jsonb_path_ops (smaller, faster for @> only)
CREATE INDEX idx_products_attrs_path ON products USING GIN(attributes jsonb_path_ops);
GIN vs B-tree for JSONB: Never use B-tree on a JSONB column for containment queries — it cannot help. Always use GIN.
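One narrow alternative worth knowing: if a query only ever tests a single JSONB key for equality, a B-tree expression index on that extracted key also works and is smaller — a sketch, assuming the products.attributes column above:
CREATE INDEX idx_products_color ON products ((attributes->>'color'));
-- Enables: WHERE attributes->>'color' = 'red'
-- (equality on one key only — containment with @> still needs the GIN index)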
GiST (Generalised Search Tree) — Geometric and Range Data
GiST is a framework for indexing data with overlap — geometric types, ranges, network addresses.
-- Range overlap queries
CREATE TABLE bookings (
id SERIAL PRIMARY KEY,
resource_id INT,
period tstzrange
);
CREATE INDEX idx_bookings_period ON bookings USING GiST(period);
-- Find overlapping bookings (crucial for calendar/scheduling systems)
SELECT * FROM bookings
WHERE period && '[2026-04-18 09:00, 2026-04-18 11:00)'::tstzrange;
-- PostGIS spatial index (also uses GiST)
CREATE INDEX idx_locations_geo ON locations USING GiST(coordinates);
SELECT * FROM locations WHERE ST_DWithin(coordinates, ST_Point(-73.9, 40.7), 1000);
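GiST also powers exclusion constraints, which let the database itself reject overlapping rows — a minimal sketch building on the bookings table above (the btree_gist extension lets the scalar resource_id participate in the GiST constraint):
CREATE EXTENSION IF NOT EXISTS btree_gist;
-- No two bookings for the same resource may have overlapping periods
ALTER TABLE bookings
ADD CONSTRAINT bookings_no_overlap
EXCLUDE USING GIST (resource_id WITH =, period WITH &&);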
BRIN (Block Range Index) — Naturally Ordered Large Tables
BRIN stores min/max values per block range (default 128 blocks ≈ 1MB). Tiny index, but only useful when data is physically sorted by the indexed column — like a created_at timestamp on an insert-only table.
-- Perfect for time-series: rows are inserted in timestamp order
CREATE INDEX idx_events_ts ON events USING BRIN(occurred_at);
-- Index is often hundreds to thousands of times smaller than a B-tree on the same column
-- But only useful because new rows have larger timestamps (physical order matches logical order)
-- Useless for: frequently updated columns, UUIDs, random data
pg_trgm — Contains Search and Fuzzy Matching
The pg_trgm extension enables GIN/GiST indexes for LIKE '%substring%' and fuzzy similarity search — things B-tree cannot handle.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm ON products USING GIN(name gin_trgm_ops);
-- Now these all use the index:
WHERE name ILIKE '%widget%' -- contains search
WHERE name % 'wiget' -- fuzzy match (similarity > threshold)
WHERE similarity(name, 'widget') > 0.3 -- explicit similarity
ORDER BY name <-> 'widget' LIMIT 10 -- nearest neighbours by trigram distance (needs a GiST trgm index: USING GIST(name gist_trgm_ops))
Composite Indexes — Column Order Rules
A composite index on (a, b, c) is ordered first by a, then by b within each a value, then by c. This determines what queries it can accelerate.
CREATE INDEX idx_orders_compound ON orders(status, customer_id, created_at);
Can use this index:
- WHERE status = 'pending'
- WHERE status = 'pending' AND customer_id = 42
- WHERE status = 'pending' AND customer_id = 42 AND created_at > '2026-01-01'
- ORDER BY status, customer_id, created_at
Cannot efficiently use this index:
- WHERE customer_id = 42 (skips the leading column)
- WHERE created_at > '2026-01-01' (skips two leading columns)
Column ordering rules:
- Equality conditions first (most selective first among equals)
- Range conditions last (only one range condition can use the index efficiently)
- ORDER BY columns last if you want index-ordered scans
-- Query: WHERE status = 'shipped' AND created_at > '2026-04-01' ORDER BY created_at
-- Best index: (status, created_at) — equality on status, range on created_at matches ORDER BY
CREATE INDEX idx_orders_status_created ON orders(status, created_at);
Covering Indexes (INCLUDE)
An index-only scan reads data entirely from the index — never touching the heap. Add frequently read non-key columns with INCLUDE:
-- Without INCLUDE: Index Scan uses index to find row, then heap fetch for email/name
CREATE INDEX idx_users_id ON users(id);
-- With INCLUDE: Index Only Scan — email and name served from index, no heap fetch
CREATE INDEX idx_users_id_covering ON users(id) INCLUDE (email, name);
-- Query that benefits:
SELECT id, email, name FROM users WHERE id = 42;
-- → Index Only Scan (0 heap fetches)
INCLUDE columns are not part of the sort key — they do not help ORDER BY or range queries. They just store values alongside the index entry to eliminate heap fetches.
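Whether a scan stays index-only depends on the visibility map; EXPLAIN reports any fallback visits to the heap as Heap Fetches — a quick check, assuming the covering index above:
EXPLAIN (ANALYZE)
SELECT id, email, name FROM users WHERE id = 42;
-- Look for "Heap Fetches: N" under the Index Only Scan node;
-- a persistently high value means the visibility map is stale
VACUUM users; -- refreshes the visibility map so heap fetches drop back toward 0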
EXPLAIN ANALYZE — Reading the Full Output
EXPLAIN ANALYZE actually executes the query and shows both estimated and actual performance. Read it bottom-up (leaf nodes execute first).
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 100;
Limit (cost=1234.56..1235.06 rows=100 width=48)
(actual time=12.345..12.456 rows=100 loops=1)
-> Sort (cost=1234.56..1259.56 rows=10000 width=48)
(actual time=12.344..12.389 rows=100 loops=1)
Sort Key: o.created_at DESC
Sort Method: top-N heapsort Memory: 35kB
-> Hash Join (cost=890.23..1099.56 rows=10000 width=48)
(actual time=5.123..11.234 rows=8432 loops=1)
Hash Cond: (o.customer_id = c.id)
Buffers: shared hit=2341 read=456
-> Index Scan using idx_orders_status_created on orders
(cost=0.43..123.45 rows=10000 width=32)
(actual time=0.023..4.567 rows=8432 loops=1)
Index Cond: ((status = 'pending') AND (created_at > ...))
Buffers: shared hit=1234 read=234
-> Hash (cost=456.78..456.78 rows=50000 width=24)
(actual time=3.456..3.456 rows=50000 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2847kB
Buffers: shared hit=1107 read=222
-> Seq Scan on customers
(actual time=0.012..2.345 rows=50000 loops=1)
How to read each node:
| Field | What it means | Warning sign |
|-------|--------------|--------------|
| cost=X..Y | Planner's estimate: X=startup, Y=total | Always compare to actual |
| actual time=X..Y | Real execution: X=first row, Y=last row | |
| rows=N (estimated) | Planner's row count guess | |
| rows=N (actual) | Real rows returned | Estimate and actual differing by more than ~10× (either way) means statistics are stale |
| loops=N | How many times this node ran | Multiply actual time by loops |
| Buffers: shared hit=X read=Y | X pages from cache, Y from disk | High read = cache miss |
| Sort Method: external | Sort spilled to disk | Increase work_mem |
| Batches: N > 1 (Hash Join) | Hash table didn't fit in memory | Increase work_mem |
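Two of the warning signs above point at work_mem. It can be raised for a single session while investigating — a sketch (the value is illustrative, and it applies per sort/hash node, so keep the global setting conservative):
SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, SUM(total) AS lifetime_total
FROM orders
GROUP BY customer_id
ORDER BY lifetime_total DESC;
RESET work_mem;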
Diagnosing Row Estimate Errors
-- Estimated 10000 rows, actual 8432 rows — good (within 2×)
-- Estimated 10000 rows, actual 8 rows — bad (planner chose wrong plan)
-- Fix: refresh statistics
ANALYZE orders;
-- Increase statistics target for poorly-estimated columns
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
-- Check current statistics
SELECT n_distinct, correlation, histogram_bounds
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';
-- correlation close to 1 or -1 = data is physically sorted (BRIN useful)
-- correlation near 0 = random order
Explaining Join Strategies
| Join type | When PostgreSQL chooses it | Cost characteristic |
|-----------|----------------------------|---------------------|
| Nested Loop | Small inner side with an index on the join column | O(outer × log inner) |
| Hash Join | Large tables, no useful index on the join column | O(n + m), needs memory |
| Merge Join | Both sides already sorted (or cheap to sort) | O(n log n + m log m) |
-- Force a specific join strategy for testing
SET enable_nestloop = OFF;
SET enable_hashjoin = ON;
EXPLAIN ANALYZE SELECT ...;
-- Reset after testing
RESET enable_nestloop;
Slow Query Patterns — Diagnosis and Fix
Pattern 1: Index Bypass via Function
-- SLOW — function wrapping the column breaks index usage
WHERE DATE(created_at) = '2026-04-18'
WHERE EXTRACT(year FROM created_at) = 2026
WHERE UPPER(name) = 'ALICE'
-- FAST — rewrite without function OR create expression index
-- Option A: rewrite the predicate
WHERE created_at >= '2026-04-18' AND created_at < '2026-04-19'
-- Option B: expression index (index stores the computed value)
CREATE INDEX idx_orders_date ON orders(DATE(created_at));
-- Now WHERE DATE(created_at) = '2026-04-18' can use the index
Pattern 2: Leading Wildcard
-- SLOW — leading wildcard prevents B-tree use entirely
WHERE email LIKE '%@gmail.com'
-- Option A: reverse the string (if querying by suffix)
CREATE INDEX idx_users_email_reversed ON users(reverse(email) text_pattern_ops);
WHERE reverse(email) LIKE reverse('@gmail.com') || '%' -- = 'moc.liamg@%' (prefix match, B-tree friendly)
-- Option B: pg_trgm for contains search
CREATE EXTENSION pg_trgm;
CREATE INDEX idx_users_email_trgm ON users USING GIN(email gin_trgm_ops);
WHERE email ILIKE '%@gmail.com' -- now uses the trgm index
Pattern 3: OR with Different Columns
-- SLOW — OR across different columns often forces a seq scan
WHERE status = 'pending' OR customer_id = 42
-- FAST — UNION ALL (each branch uses its own index)
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE customer_id = 42 AND status != 'pending'
Pattern 4: N+1 Problem
-- SLOW — N+1: one query per order to get customer
FOR order IN orders:
SELECT * FROM customers WHERE id = order.customer_id
-- FAST — single JOIN
SELECT o.*, c.name, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > NOW() - INTERVAL '7 days'
Pattern 5: Implicit Type Coercion
-- Table: user_id is BIGINT, but the driver binds the parameter as NUMERIC (common with some ORMs)
WHERE user_id = 42::numeric -- column is cast to numeric for every row: index skipped
WHERE user_id = 42 -- matching type: index used
-- (a plain quoted literal like user_id = '42' is usually harmless: unknown-typed
-- literals are resolved to the column's type)
-- Check with EXPLAIN:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42::numeric;
-- If you see: Filter: ((customer_id)::numeric = '42'::numeric) → type mismatch
Pattern 6: COUNT(*) on Large Tables
-- SLOW for large tables — must visit every row (an index-only scan helps only if the table was vacuumed recently)
SELECT COUNT(*) FROM orders;
-- FAST approximation (from statistics, instant)
SELECT reltuples::bigint FROM pg_class WHERE relname = 'orders';
-- For filtered count, ensure the filter column is indexed:
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- uses an index if a (status) or (status, ...) index exists
Statistics, Autovacuum, and Bloat
Understanding pg_stat_statements
The most important production tool for finding slow queries:
-- Enable (add to postgresql.conf, restart required)
-- shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 20 slowest queries by mean execution time
SELECT
LEFT(query, 100) AS query_snippet,
calls,
ROUND(mean_exec_time::numeric, 2) AS mean_ms,
ROUND(total_exec_time::numeric, 2) AS total_ms,
ROUND(stddev_exec_time::numeric, 2) AS stddev_ms,
rows / calls AS avg_rows
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 20;
-- Most frequently executed queries (high call counts amplify even small per-call costs)
SELECT
LEFT(query, 100) AS query_snippet,
calls,
rows / calls AS actual_avg_rows
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;
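Counters accumulate across releases, so long-gone query shapes can dominate the view; resetting after a deploy keeps the numbers meaningful:
-- Reset pg_stat_statements counters (e.g. right after a deploy)
SELECT pg_stat_statements_reset();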
Autovacuum: When It's Not Keeping Up
Autovacuum reclaims dead rows from UPDATEs and DELETEs. If it can't keep up, tables bloat, statistics go stale, and queries slow down.
-- Tables with most dead rows (candidates for manual VACUUM)
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup,
ROUND(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
last_autovacuum,
last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
-- Manual vacuum + analyze for a specific table
VACUUM (ANALYZE, VERBOSE) orders;
-- For extreme bloat (takes an ACCESS EXCLUSIVE lock for the whole rewrite):
VACUUM FULL orders; -- rewrites the table, reclaims all dead space
Tune autovacuum for write-heavy tables:
-- Lower thresholds for the orders table (vacuum more aggressively)
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01, -- vacuum when 1% of rows are dead (default 20%)
autovacuum_analyze_scale_factor = 0.005, -- analyze when 0.5% rows changed
autovacuum_vacuum_cost_delay = 2 -- milliseconds; lower = faster vacuum I/O (default 2ms since PostgreSQL 12, 20ms before)
);
Finding and Removing Unused Indexes
Every index has a write cost (INSERT, UPDATE, DELETE must update it). Unused indexes slow writes for no benefit:
-- Indexes never used since last statistics reset
SELECT
schemaname,
relname AS table_name,
indexrelname AS index_name,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan AS times_used
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
-- pg_stat_user_indexes resets on pg_stat_reset() or server restart
-- Give it at least 2 weeks before dropping "unused" indexes
-- Safe drop — CONCURRENTLY avoids blocking concurrent queries (available since PostgreSQL 9.2):
DROP INDEX CONCURRENTLY idx_name;
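One caveat: an index that backs a PRIMARY KEY, UNIQUE, or EXCLUDE constraint cannot be removed with DROP INDEX — check first (idx_name is the placeholder index from above):
-- Returns a row if the index implements a constraint; drop those with
-- ALTER TABLE ... DROP CONSTRAINT instead
SELECT conname, contype
FROM pg_constraint
WHERE conindid = 'idx_name'::regclass;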
Finding Missing Indexes on Foreign Keys
PostgreSQL does not automatically index foreign key columns (only the referenced side's primary key is indexed). Missing FK indexes cause slow JOIN performance and lock contention during DELETE cascades:
-- Find FK columns without indexes
SELECT
tc.table_name,
kcu.column_name,
ccu.table_name AS referenced_table,
ccu.column_name AS referenced_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu
ON ccu.constraint_name = tc.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
AND NOT EXISTS (
SELECT 1 FROM pg_indexes
WHERE tablename = tc.table_name
AND indexdef LIKE '%' || kcu.column_name || '%'
); -- note: the LIKE match on indexdef is approximate — verify candidates before creating indexes
Creating Indexes Safely in Production
Never run CREATE INDEX on a live table without CONCURRENTLY:
-- BLOCKS writes for entire duration (avoid on production)
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Non-blocking: takes longer but does not lock the table
CREATE INDEX CONCURRENTLY idx_orders_customer ON orders(customer_id);
-- Monitor progress:
SELECT phase, blocks_done, blocks_total,
ROUND(blocks_done * 100.0 / NULLIF(blocks_total, 0), 1) AS pct
FROM pg_stat_progress_create_index
WHERE relid = 'orders'::regclass;
CONCURRENTLY makes two passes: it first builds the index, then catches up on changes made during the build. It typically takes 2–3× longer but causes zero write downtime.
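One failure mode worth knowing: if a CONCURRENTLY build is cancelled or fails, it leaves behind an INVALID index that still adds write overhead — a sketch of how to find and clean it up:
-- Find leftover invalid indexes from failed concurrent builds
SELECT indexrelid::regclass AS invalid_index
FROM pg_index
WHERE NOT indisvalid;
-- Drop the leftover, then re-run CREATE INDEX CONCURRENTLY
DROP INDEX CONCURRENTLY idx_orders_customer;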
Connection Pooling: PgBouncer
PostgreSQL forks a new backend process for every connection (roughly a few MB of RAM and tens of milliseconds of startup). Past a few hundred concurrent connections, connection overhead and context switching start to eat into query throughput.
Application (1000 connections)
│
▼
PgBouncer (pool: 20–50 connections)
│
▼
PostgreSQL (handles 20–50 efficient connections)
; pgbouncer.ini
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
listen_port = 6432
pool_mode = transaction ; most efficient: connection returned after each transaction
max_client_conn = 1000 ; total client connections to PgBouncer
default_pool_size = 20 ; actual connections to PostgreSQL
reserve_pool_size = 5 ; extra connections for spikes
Pool modes:
- transaction: the connection is returned to the pool after each transaction. Most efficient. Prepared statements and SET session variables do not carry across transactions.
- session: the connection is held for the entire client session. Safer but less efficient.
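The only application-side change is pointing connections at PgBouncer's port instead of PostgreSQL's — for example (host and credentials are placeholders):
postgresql://app_user:secret@localhost:6432/mydb (port 6432 is PgBouncer, not PostgreSQL's 5432)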
Quick Diagnostic Workflow
Query is slow?
│
├── EXPLAIN ANALYZE
│ ├── Seq Scan on large table → add index
│ ├── Rows estimate >> actual → ANALYZE table; ALTER COLUMN SET STATISTICS
│ ├── Sort Method: external → increase work_mem
│ └── Index exists but not used?
│ ├── Function on column → expression index or rewrite query
│ ├── Leading wildcard LIKE → pg_trgm index
│ ├── Type mismatch → fix application to pass correct type
│ └── Query matches a large share of rows → seq scan is actually cheaper (indexes pay off below roughly 5–10% of rows)
│
├── pg_stat_statements → find top slow queries by mean_exec_time
│
├── pg_stat_user_tables → find tables with high n_dead_tup → VACUUM
│
└── pg_stat_user_indexes → find idx_scan = 0 → consider dropping
Related: PostgreSQL Advanced Features — Window functions, JSONB, pgvector
Related: SQL Advanced Guide — CTEs, analytical queries
Related: dotnet-efcore-performance — EF Core query optimisation