Join and Aggregation Query Benchmarks
The numbers, if you please. Multi-table joins with GROUP BY on a 10-million-row dataset, with and without materialized view optimization.
Methodology
All benchmarks run on identical infrastructure with a fixed dataset. The "before" measurements use direct PostgreSQL connections. The "after" measurements use Gold Lapel as a transparent proxy, with pattern detection and materialized view creation already complete. Each test runs for 5 minutes at steady state with 16 concurrent clients.
| Parameter | Value |
|---|---|
| PostgreSQL | 16.2 |
| Gold Lapel | 0.1.0 |
| CPU | 4-core AMD EPYC (c5.xlarge) |
| RAM | 8 GB |
| Storage | gp3 SSD, 3000 IOPS |
| Dataset | 10M orders, 500K customers, 50K products |
| Concurrent clients | 16 |
| Duration | 5 minutes per test |
Test queries
The benchmark suite exercises five query patterns commonly found in analytics dashboards, reporting modules, and admin panels. Each involves at least two table joins and an aggregation function.
-- The most expensive pattern: revenue by region and product category
-- Five-table join (orders, customers, regions, products, categories),
-- GROUP BY two columns, SUM aggregation
SELECT r.region_name, c.category_name, SUM(o.total) AS revenue
FROM orders o
JOIN customers cu ON cu.id = o.customer_id
JOIN regions r ON r.id = cu.region_id
JOIN products p ON p.id = o.product_id
JOIN categories c ON c.id = p.category_id
WHERE o.created_at >= NOW() - INTERVAL '90 days'
GROUP BY r.region_name, c.category_name
ORDER BY revenue DESC;
-- Without GL: Seq Scan on orders (10M rows), Hash Joins, Sort
-- Execution: 487ms
-- With GL: Index Scan on gl_matview_revenue_region_category
-- Execution: 1.2ms
Results
| Query pattern | Without GL (P50) | With GL (P50) | Improvement |
|---|---|---|---|
| 5-table join + GROUP BY (revenue by region & category) | 487ms | 1.2ms | 406x |
| 2-table join + COUNT DISTINCT (active users by plan) | 234ms | 0.8ms | 293x |
| Subquery + SUM (monthly cohort retention) | 612ms | 1.5ms | 408x |
| Self-join + window function (running totals) | 891ms | 2.3ms | 387x |
| 4-table join + GROUP BY + HAVING (top customers by category) | 1,240ms | 2.8ms | 443x |
All times are P50 (median) latency. P99 improvements are more pronounced due to elimination of sequential scan variance. The 4-table join pattern shows the highest improvement factor because the original query is the most expensive.
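As a sanity check, each improvement factor follows directly from the two latency columns. A quick sketch (the pattern names below are shortened labels for readability, not identifiers from the benchmark suite):

```python
# Recompute each improvement factor from the P50 latencies in the
# results table above (values copied from the article).
p50_ms = [
    ("5-table join + GROUP BY",       487,  1.2),
    ("2-table join + COUNT DISTINCT", 234,  0.8),
    ("subquery + SUM",                612,  1.5),
    ("self-join + window function",   891,  2.3),
    ("4-table join + HAVING",        1240,  2.8),
]

for name, before, after in p50_ms:
    print(f"{name}: {before / after:.0f}x")
```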
Analysis
The improvement factor is consistent (293x–443x) because the underlying optimization is identical across all patterns: an expensive multi-table join with aggregation is replaced by a direct read from a precomputed materialized view.
The absolute "after" times (0.8–2.8ms) are dominated by three fixed costs: proxy wire protocol overhead (~0.3ms), materialized view index lookup (~0.2–1.0ms), and result serialization (~0.3–1.5ms depending on result set size). The complexity of the original query is irrelevant — the view has already done the work.
This means improvement factors scale with the complexity of the original query. A 100ms query becomes 1ms (100x). A 1,200ms query becomes 3ms (400x). The more expensive the raw query, the larger the speedup.
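That scaling can be made concrete with a small model. This is a sketch, not measured data: the fixed "after" cost is an assumption drawn from the 0.8–2.8ms range above, and the two inputs are the article's own worked examples.

```python
def improvement(baseline_ms: float, fixed_after_ms: float) -> float:
    # Post-optimization latency is a fixed cost independent of the
    # original query plan, so the speedup grows linearly with the
    # baseline latency.
    return baseline_ms / fixed_after_ms

# The article's two worked examples:
print(improvement(100, 1.0))    # 100.0 -> a 100x speedup
print(improvement(1200, 3.0))   # 400.0 -> a 400x speedup
```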
Throughput impact
At 16 concurrent clients, the database served 33 queries per second for the revenue-by-region join pattern without Gold Lapel. With Gold Lapel, the same setup served 13,300 queries per second, a 403x throughput improvement. Database CPU utilization dropped from 94% to 8%.
The freed resources allow PostgreSQL to serve other workloads — writes, transactional queries, ad-hoc reports — without contention from expensive analytical queries.
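These throughput figures are consistent with Little's law for a closed system: N concurrent clients, each blocked for one query latency at a time, yield roughly N / latency queries per second. A back-of-the-envelope sketch, ignoring client think time and proxy queuing:

```python
CLIENTS = 16

def qps(p50_ms: float) -> float:
    # Closed-loop throughput: each client issues its next query as soon
    # as the previous one returns, so lambda ~= N / W (Little's law).
    return CLIENTS / (p50_ms / 1000.0)

print(round(qps(487)))   # ~33 qps without Gold Lapel
print(round(qps(1.2)))   # ~13333 qps with Gold Lapel
```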
How to reproduce
All benchmark code, data generators, and infrastructure definitions are open source.
# Clone the benchmark suite
git clone https://github.com/goldlapel/benchmarks
cd benchmarks/join-aggregation
# Start the test environment
docker compose up -d
# Seed the dataset (10M orders, ~2 minutes)
./seed.sh
# Run without Gold Lapel (baseline)
./run.sh --mode baseline --clients 16 --duration 300
# Run with Gold Lapel (wait 60s for pattern detection)
./run.sh --mode goldlapel --clients 16 --duration 300
# Compare results
./compare.sh results/baseline.json results/goldlapel.json
Absolute times vary with hardware, but relative improvement factors should be consistent across environments. The benchmark suite includes a compare.sh script that generates a side-by-side report.
Terms referenced in this article
The materialized view optimization at the heart of these benchmarks has a companion piece that addresses its long-term maintenance. I have written about materialized view pitfalls — refresh costs, bloat accumulation, and the concurrency traps that appear once your views are in production rather than in a benchmark.