Database Indexing Guide

Index Types

Type	Best For	Supports	Notes
B-tree (default)	Equality, range, sorting	=, <, >, BETWEEN, LIKE 'foo%', ORDER BY	Works for 90%+ use cases
Hash	Equality only	= only	Faster for exact match; no range support
GIN	Multi-valued (arrays, JSONB, tsvector)	@>, ?, @@ full-text	Larger, slower to build, fast to query
GiST	Geometric, range, proximity	PostGIS, tsrange, nearest neighbor	Lossy — recheck needed
BRIN	Very large sequential tables	Range queries on naturally ordered data	Tiny index; good for time-series data
Partial	Subset of rows (WHERE clause)	Any type, smaller and faster	Only indexes matching rows
Expression	Function result, LOWER(email)	Query must use same expression	Must match query expression exactly
Covering	Include extra columns (INCLUDE)	Index-only scans	Avoids heap fetch entirely

Composite Index Column Order

-- Rule: Equality columns first, then range, then ORDER BY

-- Scenario: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY user_id
-- Optimal index order:
CREATE INDEX idx_status_created_user ON orders(status, created_at, user_id);

-- Leftmost prefix rule examples:
-- Can use index:
--   WHERE status = 'active'                    ✓ (prefix)
--   WHERE status = 'active' AND created_at > X ✓ (prefix + range)
-- Cannot use index:
--   WHERE created_at > '2024-01-01'            ✗ (skips first column)
--   WHERE user_id = 42                         ✗ (skips first two)

-- Covering index (INCLUDE for non-filter columns)
CREATE INDEX idx_orders_cover
ON orders(user_id, status)
INCLUDE (amount, created_at);
-- Query: SELECT amount, created_at FROM orders WHERE user_id=42 AND status='paid'
-- → Index-only scan, zero heap access

-- Partial index (index only active records)
CREATE INDEX idx_active_orders ON orders(user_id, created_at)
WHERE status = 'active';
-- Smaller index, faster queries for active orders

-- Expression index
CREATE INDEX idx_email_lower ON users(LOWER(email));
-- Use with: WHERE LOWER(email) = 'user@example.com'

Index Selectivity & Maintenance

-- Check index selectivity (PostgreSQL)
SELECT attname,
       n_distinct,
       correlation   -- 1.0 = perfectly ordered (BRIN-friendly)
FROM pg_stats
WHERE tablename = 'orders';

-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexname NOT LIKE '%pkey';

-- Check index bloat (rebuild if bloat > 30%)
SELECT indexname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE tablename = 'orders';

-- Rebuild index (PostgreSQL)
REINDEX INDEX CONCURRENTLY idx_orders_user_id;

-- MySQL: check index usage
SELECT * FROM sys.schema_unused_indexes;
SELECT * FROM sys.schema_redundant_indexes;

-- When NOT to index:
-- • Small tables (full scan is faster)
-- • Low-cardinality columns (boolean, gender)
-- • Columns rarely used in WHERE/JOIN/ORDER BY
-- • Write-heavy tables (each index slows INSERT/UPDATE/DELETE)

Database Indexing Guide

Index Types

Composite Index Column Order

Index Selectivity & Maintenance

Alat terkait