Database Indexing Guide
Index Types
| Type | Best For | Supports | Notes |
|---|---|---|---|
| B-tree (default) | Equality, range, sorting | =, <, >, BETWEEN, LIKE 'foo%', ORDER BY | Works for 90%+ use cases |
| Hash | Equality only | = only | Faster for exact match; no range support |
| GIN | Multi-valued (arrays, JSONB, tsvector) | @>, ?, @@ full-text | Larger, slower to build, fast to query |
| GiST | Geometric, range, proximity | PostGIS, tsrange, nearest neighbor | Lossy — recheck needed |
| BRIN | Very large sequential tables | Range queries on naturally ordered data | Tiny index; good for time-series data |
| Partial | Subset of rows (WHERE clause) | Any type, smaller and faster | Only indexes matching rows |
| Expression | Function result, LOWER(email) | Query must use same expression | Must match query expression exactly |
| Covering | Include extra columns (INCLUDE) | Index-only scans | Avoids heap fetch entirely |
Composite Index Column Order
-- Rule: Equality columns first, then range, then ORDER BY
-- Scenario: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY user_id
-- Optimal index order:
CREATE INDEX idx_status_created_user ON orders(status, created_at, user_id);
-- Leftmost prefix rule examples:
-- Can use index:
-- WHERE status = 'active' ✓ (prefix)
-- WHERE status = 'active' AND created_at > X ✓ (prefix + range)
-- Cannot use index:
-- WHERE created_at > '2024-01-01' ✗ (skips first column)
-- WHERE user_id = 42 ✗ (skips first two)
-- Covering index (INCLUDE for non-filter columns)
CREATE INDEX idx_orders_cover
ON orders(user_id, status)
INCLUDE (amount, created_at);
-- Query: SELECT amount, created_at FROM orders WHERE user_id=42 AND status='paid'
-- → Index-only scan, zero heap access
-- Partial index (index only active records)
CREATE INDEX idx_active_orders ON orders(user_id, created_at)
WHERE status = 'active';
-- Smaller index, faster queries for active orders
-- Expression index
CREATE INDEX idx_email_lower ON users(LOWER(email));
-- Use with: WHERE LOWER(email) = 'user@example.com'
Index Selectivity & Maintenance
-- Check index selectivity (PostgreSQL)
SELECT attname,
n_distinct,
correlation -- 1.0 = perfectly ordered (BRIN-friendly)
FROM pg_stats
WHERE tablename = 'orders';
-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexname NOT LIKE '%pkey';
-- Check index bloat (rebuild if bloat > 30%)
SELECT indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE tablename = 'orders';
-- Rebuild index (PostgreSQL)
REINDEX INDEX CONCURRENTLY idx_orders_user_id;
-- MySQL: check index usage
SELECT * FROM sys.schema_unused_indexes;
SELECT * FROM sys.schema_redundant_indexes;
-- When NOT to index:
-- • Small tables (full scan is faster)
-- • Low-cardinality columns (boolean, gender)
-- • Columns rarely used in WHERE/JOIN/ORDER BY
-- • Write-heavy tables (each index slows INSERT/UPDATE/DELETE)