N+1 Query Detection Benchmarks
Measuring the improvement. ORM-generated N+1 patterns detected and batched at the wire protocol level, with reproducible results.
Methodology
These benchmarks measure Gold Lapel's ability to detect N+1 query patterns in real time and batch them into efficient IN-clause queries. The test suite simulates common ORM access patterns — Django, Rails, Prisma, SQLAlchemy — across varying depth levels (1+N, 1+N+N, 1+N+N+N).
The "before" measurements represent the raw ORM output: one query per related record. The "after" measurements show the same requests after Gold Lapel has detected the pattern and begun batching.
| Parameter | Value |
|---|---|
| PostgreSQL | 16.2 |
| Gold Lapel | 0.1.0 |
| CPU | 4 vCPU AMD EPYC (c5a.xlarge) |
| RAM | 8 GB |
| Storage | gp3 SSD, 3000 IOPS |
| Dataset | 100K users, 2M posts, 8M comments |
| Page size | 25 items per page (N=25) |
| Duration | 5 minutes per test |
The pattern
An N+1 query occurs when an ORM fetches a list of records, then issues a separate query for each record's related data. For a page of 25 items with one level of nesting, that means 26 queries. In these benchmarks, each additional nesting level adds another 25 queries: two levels means 51 queries, three levels means 76.
```sql
-- What the ORM generates: 1 + N queries (N=25)

-- Query 1: fetch the page of users
SELECT * FROM users ORDER BY created_at DESC LIMIT 25;

-- Queries 2-26: fetch posts for EACH user, one at a time
SELECT * FROM posts WHERE user_id = 1;
SELECT * FROM posts WHERE user_id = 2;
SELECT * FROM posts WHERE user_id = 3;
-- ... 22 more individual queries

-- What Gold Lapel rewrites it to: 2 queries
SELECT * FROM users ORDER BY created_at DESC LIMIT 25;
SELECT * FROM posts WHERE user_id IN (1, 2, 3, ..., 25);
```

Gold Lapel detects the repeating pattern — identical query structure with only the parameter changing — and batches the individual queries into a single IN-clause query. The application receives the same results in the same order.
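The detection idea can be sketched in a few lines. This is a hypothetical illustration, not Gold Lapel's actual implementation: each query is reduced to a fingerprint by stripping literal values, and a run of queries sharing one fingerprint is collapsed into a single IN-clause query. The function names (`fingerprint`, `batch_n_plus_one`) and the regex-based approach are assumptions for the sketch; a real wire-protocol proxy would work on parsed protocol messages, not SQL strings.

```python
import re

def fingerprint(sql: str) -> str:
    # Replace numeric literals with a placeholder so queries that differ
    # only in their parameter value share the same fingerprint.
    return re.sub(r"\b\d+\b", "?", sql)

def batch_n_plus_one(queries, threshold=3):
    """Collapse a run of structurally identical single-value lookups
    into one IN-clause query once `threshold` repeats are seen."""
    if len(queries) < threshold:
        return queries
    if len({fingerprint(q) for q in queries}) != 1:
        return queries  # not a uniform repeating pattern
    # Collect the varying parameter from each query in the run.
    params = [re.search(r"=\s*(\d+)", q).group(1) for q in queries]
    template = queries[0]
    column = re.search(r"(\w+)\s*=\s*\d+", template).group(1)
    # Rewrite the equality predicate into a single IN clause.
    return [re.sub(r"\w+\s*=\s*\d+",
                   f"{column} IN ({', '.join(params)})", template)]

qs = [f"SELECT * FROM posts WHERE user_id = {i};" for i in range(1, 26)]
print(batch_n_plus_one(qs)[0])
```

The `threshold` guard reflects a real design constraint: a proxy cannot batch until it has seen enough repeats to be confident a pattern exists, which is why the first few queries of a run still go through individually.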
Results
| Pattern | Queries (before) | Queries (after) | Latency (before) | Latency (after) | Speedup |
|---|---|---|---|---|---|
| Users + posts (1+N) | 26 | 2 | 48ms | 6ms | 8x |
| Users + posts + comments (1+N+N) | 51 | 3 | 94ms | 9ms | 10x |
| Users + posts + comments + likes (1+N+N+N) | 76 | 4 | 142ms | 11ms | 13x |
| Products + reviews + images (1+N+N) | 51 | 3 | 87ms | 8ms | 11x |
| Orders + items + products (1+N+N) | 51 | 3 | 91ms | 9ms | 10x |
Latency is the total round-trip time for all queries in the pattern. The improvement factor grows with nesting depth because each additional level adds N more queries that can be batched.
Analysis
The latency improvement is primarily driven by eliminating round trips, not by reducing query execution time. Each individual SELECT in an N+1 pattern executes in under 1ms — the query itself is fast. The cost is the network round trip: sending the query, waiting for PostgreSQL to parse, plan, and execute it, then receiving the result. At 25 round trips, this overhead dominates.
Batching converts 25 round trips into 1. The single batched query takes slightly longer to execute (an IN clause with 25 values vs a single equality check), but the savings in round-trip overhead dwarf the marginal execution cost.
At deeper nesting levels (1+N+N+N), the improvement is even more dramatic: 76 queries become 4, eliminating 72 round trips. The 13x latency improvement at this depth makes the difference between a 142ms page load and an 11ms page load.
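A back-of-envelope model makes the round-trip argument concrete. The constants below (per-round-trip cost, per-query execution time, IN-clause execution time) are illustrative assumptions chosen to land near the measured numbers above; they are not measurements.

```python
RTT_MS = 1.8         # assumed client <-> PostgreSQL round-trip cost
EXEC_MS = 0.05       # assumed execution time of a single-row lookup
BATCH_EXEC_MS = 1.5  # assumed execution time of one IN-clause query

def n_plus_one_latency(queries: int) -> float:
    # Every query pays a full round trip plus a tiny execution cost.
    return queries * (RTT_MS + EXEC_MS)

def batched_latency(batched_queries: int) -> float:
    # Far fewer round trips; each query is a bit slower to execute.
    return batched_queries * (RTT_MS + BATCH_EXEC_MS)

for depth, before, after in [("1+N", 26, 2), ("1+N+N", 51, 3),
                             ("1+N+N+N", 76, 4)]:
    print(f"{depth}: {n_plus_one_latency(before):.1f}ms -> "
          f"{batched_latency(after):.1f}ms")
```

Even this crude model shows the shape of the result: the round-trip term dominates the unbatched total, so cutting query count by an order of magnitude cuts latency by roughly the same factor.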
Scaling with page size
The benchmarks above use N=25 (a page of 25 items). As page size grows, the unbatched query count grows linearly while the batched count stays constant, so the reduction approaches 100%:
| Page size (N) | Queries without GL | Queries with GL | Reduction |
|---|---|---|---|
| 10 | 11 | 2 | 82% |
| 25 | 26 | 2 | 92% |
| 50 | 51 | 2 | 96% |
| 100 | 101 | 2 | 98% |
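The reduction column follows directly from the query counts: for a single nesting level, N+1 queries collapse to 2, so the reduction is 1 − 2/(N+1). A quick check reproduces the table:

```python
def reduction(n: int) -> float:
    # N+1 queries without batching, 2 with batching (one nesting level).
    return 1 - 2 / (n + 1)

for n in [10, 25, 50, 100]:
    print(f"N={n}: {reduction(n):.0%}")
```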
How to reproduce
All benchmark code, data generators, and ORM simulation scripts are open source.
```shell
# Clone the benchmark suite
git clone https://github.com/goldlapel/benchmarks
cd benchmarks/n-plus-one-detection

# Start the test environment
docker compose up -d

# Seed the dataset
./seed.sh

# Run the baseline (direct Postgres connection)
./run.sh --mode baseline --duration 300

# Run with Gold Lapel
./run.sh --mode goldlapel --duration 300

# Compare
./compare.sh results/baseline.json results/goldlapel.json
```

The benchmark suite includes simulated access patterns for Django, Rails (ActiveRecord), Prisma, and SQLAlchemy. Results are consistent across ORMs because the N+1 pattern at the wire protocol level is identical.
Related reading
These benchmarks measure detection and batching at the protocol level. For the application-level side of the story — how ORMs generate the N+1 pattern in the first place and what you can do about it in code — the ORM performance benchmark compares Django, Rails, Spring, and five other frameworks on the same queries.