This is not a theoretical guide to database optimization. This is the exact process I use when a query is slow and I need to find out why — and then make it fast.
The numbers are real. The worst query I fixed went from 12 seconds to 280ms. Here's how.
Summary
Start with pg_stat_statements to find what's actually slow. Use EXPLAIN (ANALYZE, BUFFERS) to understand why. Then apply the minimum-intervention fix: usually a missing index, a wrong column order, or a missing partial index. Don't guess. Measure.
Step 1: Find What's Actually Slow
Intuition is wrong. The query you think is slow might not be in your top 10 by database time. Start with data:
-- Enable pg_stat_statements if not already active
-- Add to postgresql.conf: shared_preload_libraries = 'pg_stat_statements'
SELECT
LEFT(query, 100) AS query_preview,
calls,
ROUND(total_exec_time::numeric / 1000, 2) AS total_seconds,
ROUND(mean_exec_time::numeric, 2) AS mean_ms,
ROUND(stddev_exec_time::numeric, 2) AS stddev_ms,
rows
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat%'
ORDER BY total_exec_time DESC
LIMIT 20;
Look at total_exec_time first, not mean_exec_time. A query that takes 100ms but runs 10,000 times is more impactful than a query that takes 1 second but runs twice.
Step 2: Understand Why With EXPLAIN ANALYZE
Once you have the worst offenders, run them with EXPLAIN:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM listings
WHERE category_id = 5
AND city_id = 42
AND is_active = true
AND price BETWEEN 1000 AND 10000
ORDER BY created_at DESC
LIMIT 20;
Always use ANALYZE — without it, EXPLAIN shows estimated costs, not real execution. Always include BUFFERS — this shows cache hits vs. disk reads, which often reveals the real problem.
Reading the Output
Limit (cost=1234.56..1234.89 rows=20 width=412)
(actual time=4218.332..4218.461 rows=20 loops=1)
Buffers: shared hit=0 read=48234 ← all disk reads, no cache
-> Sort (cost=... rows=...)
Sort Key: created_at DESC
Sort Method: quicksort Memory: 312kB
-> Seq Scan on listings ← PROBLEM: sequential scan
(cost=0.00..14832.12 rows=892 width=412)
(actual time=0.021..4198.124 rows=1847 loops=1)
Filter: (category_id = 5 AND city_id = 42 AND is_active)
Rows Removed by Filter: 478153 ← scanning 478K rows for 1847
Key signals:
- Seq Scan — almost always bad on large tables; indicates missing index
- Rows Removed by Filter — high ratio means the index isn't helping with selectivity
- Buffers: shared read (vs. hit) — reads from disk are 100× slower than cache
- Sort Method: quicksort — if memory exceeds
work_mem, this spills to disk
Step 3: Choose the Right Index Type
B-tree Indexes (Default)
For range queries, equality, and ORDER BY:
-- Bad: indexes each column independently
CREATE INDEX ON listings (category_id);
CREATE INDEX ON listings (city_id);
-- Better: composite index matching the query pattern
-- Column order: equality conditions first, then range, then ORDER BY
CREATE INDEX CONCURRENTLY idx_listings_cat_city_active_created
ON listings (category_id, city_id, created_at DESC)
WHERE is_active = true; -- Partial: only active rows
Column order in composite indexes matters critically:
- Equality conditions first (
category_id = 5,city_id = 42) - Then range conditions (
price BETWEEN ...) - Then ORDER BY column (
created_at)
A composite index (category_id, city_id, created_at) can be used for queries filtering on category_id alone, or category_id + city_id, but not for queries filtering only on city_id.
Partial Indexes
When a significant fraction of rows always satisfy a condition:
-- If 60% of listings are inactive, this index only covers active listings
-- Half the size, better cache utilization, faster index scans
CREATE INDEX CONCURRENTLY idx_listings_active
ON listings (category_id, created_at DESC)
WHERE is_active = true;
-- Index only pending orders (perhaps 5% of total)
CREATE INDEX CONCURRENTLY idx_orders_pending
ON orders (created_at, user_id)
WHERE status = 'pending';
Covering Indexes (INCLUDE)
When the query reads columns not needed for filtering:
-- Without covering: index scan + heap fetch for each row
-- (heap fetch is random I/O — slow)
SELECT user_id, amount, status, created_at
FROM transactions
WHERE user_id = 42
ORDER BY created_at DESC;
-- With covering index: index-only scan (no heap access)
CREATE INDEX CONCURRENTLY idx_transactions_user_covering
ON transactions (user_id, created_at DESC)
INCLUDE (amount, status, reference_id);
After adding this index, run VACUUM ANALYZE transactions to update visibility map — required for index-only scans to work.
GIN Indexes for JSONB and Arrays
-- Searching inside JSONB columns
CREATE INDEX CONCURRENTLY idx_products_attrs_gin
ON products USING gin (attributes);
-- Now this query uses the index:
SELECT * FROM products
WHERE attributes @> '{"color": "blue"}';
Common Mistakes
Wrong: Adding Too Many Indexes
Every index has a cost: slower writes, more disk space, more memory pressure. After an optimization sprint:
-- Find indexes that have never been used
SELECT
schemaname,
tablename,
indexname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
Drop unused indexes ruthlessly. In one audit, I dropped 11 indexes that had never been used in production — write throughput improved measurably.
Wrong: Not Using CONCURRENTLY
-- This locks the table for the duration of index build
CREATE INDEX idx_listings_cat ON listings (category_id);
-- This doesn't lock — always use CONCURRENTLY in production
CREATE INDEX CONCURRENTLY idx_listings_cat ON listings (category_id);
On a 500K-row table, CREATE INDEX (without CONCURRENTLY) can take 5–10 minutes and block all reads and writes.
Wrong: Stale Statistics
The planner's cost estimates depend on table statistics. After large bulk inserts/deletes:
-- Update statistics
ANALYZE listings;
-- Or for all tables
ANALYZE;
Stale statistics can cause the planner to choose a seq scan even when a perfectly good index exists.
The Full Process
1. pg_stat_statements → top 20 by total_exec_time
2. EXPLAIN (ANALYZE, BUFFERS) on each
3. Identify: Seq Scan? Wrong index? Bad column order? Heap fetch?
4. Create CONCURRENTLY the minimum index that fixes it
5. Re-run EXPLAIN to verify index is being used
6. Monitor pg_stat_user_indexes for unused indexes
7. ANALYZE after bulk operations
Results From a Real Audit
| Query | Before | After | Index Type |
|---|---|---|---|
| Listing by category + city + price | 4.2s | 280ms | Composite partial |
| Transaction history | 6.1s | 145ms | Covering |
| JSONB attribute search | 380ms | 28ms | GIN |
| Active sessions | 8.2s | 18ms | Partial |
| Admin report | 30s+ | 2.1s | Composite + VACUUM |
Key Takeaways
- pg_stat_statements by total_exec_time, not mean — frequent slow queries matter more than rare very slow ones
- EXPLAIN ANALYZE BUFFERS always — buffers reveal cache vs disk; estimated vs actual reveals plan quality
- Composite index column order: equality first, then range, then ORDER BY
- Partial indexes are underused — dramatically smaller and faster for queries with constant filter conditions
- Drop unused indexes — they cost writes and memory; use pg_stat_user_indexes to audit