# Automatic Index Creation Benchmarks
From sequential scan to sorted. Five index strategies, five query patterns, and the numbers to show for it.
## Methodology
These benchmarks measure the performance impact of Gold Lapel's automatic index creation across five common query patterns. The baseline dataset has no secondary indexes — only primary keys. Gold Lapel observes query patterns during a warmup phase, creates indexes using CREATE INDEX CONCURRENTLY, and the benchmark measures steady-state performance after index creation.
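To make the warmup step concrete, here is an illustrative sketch of how an observed filter pattern maps to a `CREATE INDEX CONCURRENTLY` statement. This is not Gold Lapel's actual code, and the naming scheme and helper function are assumptions for illustration only; the real pattern detection is internal to the tool.

```python
# Illustrative only: mapping an observed filter pattern to an index
# statement. The gl_idx_ naming scheme here is a hypothetical stand-in.
def index_statement(table: str, columns: list[str]) -> str:
    """Build a CREATE INDEX CONCURRENTLY statement for an observed pattern."""
    # Drop sort modifiers like "DESC" when building the index name.
    name = f"gl_idx_{table}_" + "_".join(c.split()[0] for c in columns)
    return (f"CREATE INDEX CONCURRENTLY {name} "
            f"ON {table} ({', '.join(columns)});")

# The composite user_id + created_at pattern from the EXPLAIN example:
print(index_statement("events", ["user_id", "created_at"]))
# CREATE INDEX CONCURRENTLY gl_idx_events_user_id_created_at ON events (user_id, created_at);
```

The CONCURRENTLY keyword is the important part: it lets the build proceed without blocking writes, which is what makes automatic creation safe on a live table.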
| Parameter | Value |
|---|---|
| PostgreSQL | 16.2 |
| Gold Lapel | 0.1.0 |
| CPU | 4 vCPUs, AMD EPYC (c5a.xlarge) |
| RAM | 8 GB |
| Storage | gp3 SSD, 3000 IOPS |
| Dataset | 5M events, 200K users, 50K sessions |
| Concurrent clients | 16 |
| Duration | 5 minutes per test |
## What the EXPLAIN looks like
The most telling measure of index effectiveness is the EXPLAIN plan transition: from sequential scan (reading the entire table) to index scan (reading only matching rows).
```sql
-- Before: sequential scan over 5M rows
EXPLAIN ANALYZE
SELECT * FROM events WHERE user_id = 42 AND created_at > '2025-01-01';

--                     QUERY PLAN
-- -------------------------------------------------------
-- Seq Scan on events  (cost=0.00..198234.00 rows=847 ...)
--   (actual time=234.112..567.891 rows=823 loops=1)
--   Filter: ((user_id = 42) AND (created_at > ...))
--   Rows Removed by Filter: 4999177
-- Planning Time: 0.089 ms
-- Execution Time: 567.934 ms
```
```sql
-- After: Gold Lapel creates the composite index
-- CREATE INDEX CONCURRENTLY gl_idx_events_user_created
--   ON events (user_id, created_at);

--                     QUERY PLAN
-- -------------------------------------------------------
-- Index Scan using gl_idx_events_user_created on events
--   (cost=0.43..42.18 rows=847 ...)
--   (actual time=0.034..4.812 rows=823 loops=1)
--   Index Cond: ((user_id = 42) AND (created_at > ...))
-- Planning Time: 0.112 ms
-- Execution Time: 4.847 ms
```

## Results
| Query pattern | Index created | Without index (P50) | With index (P50) | Improvement |
|---|---|---|---|---|
| Single-column equality (status filter) | B-tree on status | 342ms | 2.1ms | 163x |
| Composite equality + range (user + date) | B-tree on (user_id, created_at) | 567ms | 4.8ms | 118x |
| Partial index (active records only) | B-tree on status WHERE active | 342ms | 0.9ms | 380x |
| Text prefix search (email LIKE) | B-tree on email varchar_pattern_ops | 891ms | 1.4ms | 636x |
| Multi-column sort (ORDER BY region, revenue DESC) | B-tree on (region, revenue DESC) | 234ms | 3.2ms | 73x |
The partial index shows the highest improvement (380x) because it indexes only 6% of rows (active records), making the index 17x smaller and faster to scan than a full B-tree on the same column.
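The arithmetic behind that claim follows directly from the row counts and index sizes reported in the tables in this article; a quick back-of-envelope check:

```python
# Back-of-envelope check of the partial-index numbers above.
total_rows = 5_000_000        # events table
active_rows = 300_000         # rows matching the partial-index predicate
full_index_mb = 107           # single-column B-tree on status
partial_index_mb = 6.4        # partial B-tree on status WHERE active

fraction_indexed = active_rows / total_rows
size_ratio = full_index_mb / partial_index_mb

print(f"fraction of rows indexed: {fraction_indexed:.0%}")   # 6%
print(f"full index is {size_ratio:.0f}x larger")             # 17x
```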
## Analysis
The improvement factor varies by pattern because it depends on selectivity — how many rows the query actually needs out of the total table size.
- High-selectivity filters (equality on a specific user, text prefix match) show the largest improvements because the index eliminates the most rows. The text prefix pattern improves 636x because the LIKE query without an index must scan every row's email column.
- Composite indexes show moderate improvements (73x–118x). For the sort pattern, the index satisfies both the filter and the ORDER BY, eliminating a separate Sort node in the query plan; for the equality-plus-range pattern, it narrows the scan to a single user's date range.
- Partial indexes combine selectivity with a smaller index size. When 94% of rows are in a terminal state, indexing only the active 6% produces an index that fits in a single index page — dramatically faster than scanning the full B-tree.
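Selectivity can be read straight off the EXPLAIN output earlier in the article. A rough model, ignoring I/O and planner overhead:

```python
# Rough selectivity model for the composite user_id + date query.
table_rows = 5_000_000
matching_rows = 823          # "rows=823" in the EXPLAIN output

selectivity = matching_rows / table_rows
rows_skipped = table_rows - matching_rows

print(f"selectivity: {selectivity:.4%}")          # 0.0165% of the table
print(f"rows the index skips: {rows_skipped:,}")  # 4,999,177
```

The skipped-row count matches the "Rows Removed by Filter" line in the sequential-scan plan: every one of those rows was read and discarded before the index existed.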
## Index creation overhead
Gold Lapel creates all indexes using CONCURRENTLY, which does not lock the table for writes during creation. Index build times for this dataset:
| Index type | Build time | Index size |
|---|---|---|
| Single-column B-tree (5M rows) | 3.2s | 107 MB |
| Composite B-tree (5M rows) | 4.8s | 172 MB |
| Partial B-tree (300K rows) | 0.4s | 6.4 MB |
| varchar_pattern_ops (5M rows) | 5.1s | 214 MB |
The one-time build cost is amortized across every subsequent query. For a pattern executing 100 times per hour, a 3-5 second build is paid back after roughly ten queries, about six minutes at that rate.
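To make the amortization concrete, a quick break-even estimate using the single-column numbers from the results tables:

```python
# Break-even estimate for the single-column status index.
build_seconds = 3.2        # one-time CONCURRENTLY build
p50_without = 0.342        # seconds per query, no index
p50_with = 0.0021          # seconds per query, with index
queries_per_hour = 100

saved_per_query = p50_without - p50_with
queries_to_break_even = build_seconds / saved_per_query
minutes_to_break_even = queries_to_break_even / queries_per_hour * 60

print(f"break-even after ~{queries_to_break_even:.0f} queries "
      f"(~{minutes_to_break_even:.0f} minutes at {queries_per_hour}/hour)")
# break-even after ~9 queries (~6 minutes at 100/hour)
```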
## How to reproduce
```bash
# Clone the benchmark suite
git clone https://github.com/goldlapel/benchmarks
cd benchmarks/index-auto-creation

# Start the test environment
docker compose up -d

# Seed the dataset (no indexes beyond primary keys)
./seed.sh --no-indexes

# Run the baseline
./run.sh --mode baseline --clients 16 --duration 300

# Run with Gold Lapel (indexes created during warmup)
./run.sh --mode goldlapel --clients 16 --duration 300

# Compare
./compare.sh results/baseline.json results/goldlapel.json
```

The seed script creates tables with no secondary indexes. Gold Lapel's warmup phase detects patterns and creates indexes before the timed benchmark begins.
If the index strategies behind these numbers interest you, I have written a comprehensive guide to PostgreSQL index types — B-tree, GIN, GiST, BRIN, and hash — that explains when each is selected and why the performance differences are as dramatic as you have just seen.