← Case Studies

A Data Analytics Startup That Declined a $40K/yr RDS Upgrade

A bootstrapped SaaS with 50 million rows of event data, reporting queries that were getting expensive, and an AWS bill that was about to double.

The Waiter of Gold Lapel · Updated Mar 20, 2026 Published Mar 5, 2026 · 6 min read
The r5.2xlarge was ready. The invoice was drafted. There turned out to be another option.
$38K Annual savings
r5.2xl Instance avoided
73% CPU reduction
0 Schema changes

The challenge

The company — a four-person analytics SaaS, bootstrapped, profitable but lean — provides event analytics for mobile apps. Their customers embed a tracking SDK, and the platform produces retention curves, funnel analyses, and cohort reports from the collected event data.

The database: a single RDS instance running PostgreSQL 16 on an r5.xlarge (4 vCPUs, 32 GB RAM). The event table had grown to 52 million rows. The reporting queries — the cohort analyses, the funnel calculations, the retention curves — were aggregations across date ranges, event types, and user segments.

By mid-quarter, the r5.xlarge was running at 87% CPU during business hours. The reporting queries were taking 4-8 seconds. The AWS bill for the RDS instance was $3,300/month. The natural next step was an r5.2xlarge at $6,600/month — doubling the database cost to $40K/year more.

The founder asked a question worth asking: were the queries actually that expensive, or could they be made more efficient?

The setup

Gold Lapel was added as a sidecar container in the existing ECS task definition. The application — a Python/FastAPI stack — was updated with a single environment variable change.

# Added to the existing ECS task definition
{
  "name": "goldlapel",
  "image": "goldlapel/goldlapel:latest",
  "command": [
    "--upstream",
    "postgresql://analytics:password@rds-primary.internal:5432/analytics"
  ],
  "portMappings": [{ "containerPort": 7932 }]
}

# App container DATABASE_URL updated
DATABASE_URL=postgresql://analytics:password@localhost:7932/analytics

Deployment was a single ECS task definition update. The application restarted with the new configuration. No Python code was modified.

What Gold Lapel found

The reporting workload was dominated by three patterns, all variations of the same underlying problem: expensive aggregations over large date ranges, executed repeatedly with identical parameters.

  • Retention cohort queries. Each retention report computed a 30-day cohort matrix by joining the events table against itself — once for the initial event, once for the return event. On 52M rows, this was a 6.8-second query. It ran every time a customer opened their retention dashboard, typically 40-60 times per hour across all tenants.
  • Funnel step calculations. Funnel reports counted distinct users at each step of a multi-event sequence. Each funnel query scanned 2-12 million rows depending on the date range. Average execution: 3.4 seconds, running 80-100 times per hour.
  • Missing composite indexes on event lookups. The event table had indexes on app_id and event_type separately, but the most common filter pattern — WHERE app_id = ? AND event_type = ? AND created_at BETWEEN ? AND ? — required a composite index that did not exist.

Gold Lapel created two materialized views (for the retention and funnel aggregations) and one composite index. The materialized views refreshed on a write-aware schedule — when new events arrived for a given app, only that app's view partitions were refreshed.

The results

Query patternBeforeAfterImprovement
Retention cohort6.8s42ms99.4%
Funnel calculation3.4s28ms99.2%
Event type lookups180ms4ms98%

RDS CPU utilization dropped from 87% to 23% during business hours. The r5.xlarge was no longer under pressure. The r5.2xlarge upgrade was cancelled.

The math

OptionMonthly costAnnual cost
Upgrade to r5.2xlarge$6,600$79,200
Stay on r5.xlarge + Gold Lapel$3,449$41,388
Savings$3,151/mo$37,812/yr

Gold Lapel's cost ($49/db/month) was included in the "stay" option. The net savings: approximately $38,000 per year. For a bootstrapped four-person company, that is not a rounding error.

Key takeaways

  • Optimization before scaling is not just a principle — it is arithmetic. The r5.2xlarge would have doubled the database cost and provided approximately 2x the CPU. The optimized queries used 73% less CPU. Optimization delivered a larger improvement at a fraction of the cost.
  • Reporting queries are ideal optimization targets. They are read-heavy, repetitive, and aggregate-intensive — precisely the pattern that materialized views address. The same query running 60 times per hour with the same parameters is doing the same work 60 times.
  • Infrastructure budget matters most for lean teams. $38K/year is a meaningful fraction of a four-person company's runway. The decision between "scale the instance" and "optimize the queries" has direct impact on what the company can build next.

Terms referenced in this article

The materialized views at the centre of this story deserve a closer look. I have written about the common pitfalls of materialized views — the refresh strategies, the bloat accumulation, and the concurrency traps that determine whether they remain a solution or become a new problem.