Appendix A: The Glossary

The Waiter of Gold Lapel · Updated Mar 30, 2026 Published Mar 11, 2026 · 10 min

Advisory lock: An application-level cooperative lock in PostgreSQL, identified by a 64-bit integer. Faster than row-level locks, creates no table bloat, and is automatically released when the session ends. pg_try_advisory_lock returns immediately without blocking, making it ideal for preventing concurrent materialized view refreshes across multiple application instances. (Chapter 5)
Autovacuum: PostgreSQL's automatic background process for reclaiming storage occupied by dead tuples and updating table statistics. Critical for materialized views refreshed with CONCURRENTLY. Tune autovacuum_vacuum_scale_factor to 0.01 for frequently refreshed views — the default of 0.20 is too conservative. (Chapter 5, Chapter 16)
Bloat: Accumulated dead tuples in a table or materialized view that have not been reclaimed by VACUUM. Causes increased storage usage, slower sequential scans, and degraded index performance. A common consequence of under-tuned autovacuum on write-heavy tables or frequently refreshed materialized views.
BRIN index: Block Range Index. An extremely compact index type that stores summary information for ranges of physical table blocks. Effective for naturally ordered data such as timestamps or sequential IDs. Suitable for very large materialized views where B-tree index size would be prohibitive. (Chapter 4)
B-tree index: The default PostgreSQL index type. Supports equality and range queries. The most common index type for materialized views and the required type for the UNIQUE index that enables REFRESH CONCURRENTLY. (Chapter 4)
Cache invalidation: The problem of ensuring cached data reflects the current state of the source data. When cache and database are separate systems (e.g., Redis beside PostgreSQL), invalidation is a distributed consistency problem with exponential complexity. When the cache is a materialized view inside the database, invalidation reduces to REFRESH MATERIALIZED VIEW — one atomic command. Phil Karlton identified this as one of the two hard problems in computer science, circa 1996. (Chapter 6)
Citus: A PostgreSQL extension (now part of Azure Cosmos DB for PostgreSQL) that provides distributed table sharding, distributed query execution, and reference tables. The least disruptive sharding path for applications that have genuinely exhausted single-server optimizations. (Chapter 16)
Cold start: The delay incurred when a serverless compute instance initializes from a stopped state. In serverless PostgreSQL providers like Neon, cold start latency ranges from approximately 500 milliseconds to 2 seconds as the compute node provisions and PostgreSQL starts. (Chapter 15)
Complexity tax: The cumulative operational cost of running services an application does not need. Includes infrastructure fees, monitoring overhead, on-call burden, credential rotation, version upgrades, and engineering time spent on maintenance rather than product development. A 2025 survey by Port (State of Internal Developer Portals) found 75% of developers lose 6 to 15 hours per week to tool sprawl. (Chapter 2, Chapter 19)
CONCURRENTLY: A modifier for REFRESH MATERIALIZED VIEW that allows reads to continue during refresh. Acquires an ExclusiveLock rather than an AccessExclusiveLock. Requires a UNIQUE index on column names only — no expressions, no WHERE clause. Uses a FULL OUTER JOIN diff algorithm that produces dead tuples requiring VACUUM. (Chapter 4)
Connection pinning: A behavior in AWS RDS Proxy where the use of prepared statements causes the proxy to bind a client to a specific PostgreSQL backend for the entire session, negating the pooling benefit. Particularly problematic with Prisma, which uses prepared statements by default. (Chapter 15)
Connection pooler: A service that maintains persistent connections to PostgreSQL and multiplexes client connections across them. Converts hundreds or thousands of application connections into a small pool of real database connections. PgBouncer, PgCat, Supavisor, and Odyssey are the four major options. (Chapter 15, Chapter 17)
Covering index: An index that includes additional columns beyond the indexed ones via the INCLUDE clause, enabling index-only scans where the query is answered entirely from the index without touching the table heap. Reduces I/O for read-heavy queries against materialized views. (Chapter 4, Chapter 16)
Curriculum effect: The phenomenon where architecture decisions from one era (PostgreSQL + Redis + Elasticsearch, circa 2010) become unquestioned industry assumptions in the next, not because anyone reaffirmed them, but because nobody questioned them. Bootcamps, tutorials, and starter templates propagate patterns long after the underlying tools have evolved past them. (Chapter 2)
Database-native caching: Using PostgreSQL's own features — materialized views for complex aggregated data, UNLOGGED tables for simple key-value storage — instead of external caching services. Eliminates cache invalidation as a distributed consistency problem. (Chapter 6)
Dead tuple: A row version that is no longer visible to any transaction but has not yet been reclaimed by VACUUM. Produced by UPDATE and DELETE operations, including those from REFRESH MATERIALIZED VIEW CONCURRENTLY's diff algorithm. Monitor via n_dead_tup in pg_stat_user_tables. (Chapter 4, Chapter 5)
Debezium: A Change Data Capture platform that watches the PostgreSQL write-ahead log and emits structured events for every data change. Appropriate for propagating changes to external systems such as Elasticsearch and data warehouses. Not necessary when the consumer is the same PostgreSQL database — materialized views handle that case with zero new infrastructure. (Chapter 6)
Debounce: A technique that delays action until a burst of events has settled. In LISTEN/NOTIFY-driven materialized view refresh, a debounce timer — typically 5 seconds — prevents rapid writes from triggering rapid refreshes. The view refreshes once after write activity settles, rather than once per write. (Chapter 5)
Eager loading: An ORM technique that loads related objects in bulk, in advance, rather than individually on access. The primary fix for N+1 query patterns. Implementations vary by framework: select_related / prefetch_related in Django, includes in Rails, @EntityGraph in Hibernate, with() in Laravel, include in Prisma, Preload in GORM. (Chapter 3)
EXPLAIN ANALYZE: A PostgreSQL command that executes a query and reports the actual execution plan, including row counts, timing, and buffer usage at each node. The primary diagnostic tool for individual query optimization. Read the output from bottom to top. (Chapter 18)
Flag table pattern: A materialized view refresh strategy where a lightweight trigger marks a boolean flag when source data changes. A periodic job checks the flag and refreshes only when it is set. Eliminates unnecessary refreshes when underlying data has not changed. (Chapter 5)
GIN index: Generalized Inverted Index. The standard index type for full-text search (tsvector), JSONB containment queries, array operations, and trigram similarity (pg_trgm). Essential for search-oriented materialized views. Without a GIN index, full-text search queries perform sequential scans. (Chapter 4)
GiST index: Generalized Search Tree. Supports geometric data, range types, and certain full-text search scenarios. Also supports pg_trgm trigram operations as an alternative to GIN. (Chapter 4)
Good Job: A PostgreSQL-backed background job library for Ruby on Rails with built-in cron scheduling, a web dashboard, and advisory lock-based concurrency control. An alternative to Sidekiq that requires no Redis. (Chapter 9)
Heap swap: The internal mechanism for non-concurrent REFRESH MATERIALIZED VIEW. PostgreSQL creates a new heap file with fresh data and atomically replaces the old one by updating pg_class. Produces zero dead tuples. Blocks reads during the swap. (Chapter 4)
Idle in transaction: A connection state in PostgreSQL where a transaction has been opened but neither committed nor rolled back. Indicates an application code bug — typically a long-running external operation (API call, file processing) occurring inside an open transaction. Blocks autovacuum, holds locks, and wastes connection slots. Monitor via pg_stat_activity. (Chapter 18)
Invalidation trap: The exponentially growing web of cache-to-data dependencies that transforms a "simple caching layer" into a distributed consistency nightmare. The number of invalidation triggers grows with the number of relationships in the data model, not the number of tables. Structurally eliminated when cache and database are the same system. (Chapter 6)
JSONB: PostgreSQL's binary JSON column type, available since version 9.4 (2014). Supports indexing with GIN, containment queries, path operations, and full CRUD. Provides native document storage without MongoDB. (Chapter 1, Chapter 19)
Lazy loading: An ORM pattern where related objects are loaded from the database only when accessed, via proxy objects or deferred references. Efficient for single-object operations. Produces N+1 query patterns when used in loops. The default behavior in Django, Rails, Hibernate, Laravel, Sequelize, and GORM. (Chapter 3)
LISTEN/NOTIFY: PostgreSQL's built-in publish/subscribe mechanism. Notifications are transactional — held until COMMIT, discarded on ROLLBACK — and support payloads up to 8,000 bytes. Requires no external message broker. Used for event-driven materialized view refresh and as a Redis pub/sub replacement. Rails 8's SolidCable is built on it. (Chapter 5, Chapter 6)
Materialized view: A database object that stores the results of a query as a physical heap table with its own storage, statistics, and indexes. Provides index-scan read performance at the cost of data freshness — data is a snapshot from the last REFRESH. Supports every SQL feature PostgreSQL offers, including window functions, CTEs, LATERAL joins, and recursive queries. Available since PostgreSQL 9.3 (2013). The central technique of this book. (Chapter 4)
max_connections: The PostgreSQL configuration parameter that sets the maximum number of concurrent connections. Default is 100. Each connection spawns a dedicated backend process consuming approximately 5–10MB of shared memory. Connection 101 receives FATAL: too many connections for role. The parameter at the center of the serverless connection crisis. (Chapter 15)
N+1 query: A performance anti-pattern where retrieving N items requires 1 initial query plus N additional queries for related data. Caused by ORM lazy loading in loop contexts. A page displaying 25 posts with authors generates 26 queries where a single JOIN would suffice. The single most common cause of ORM-related performance problems. (Chapter 3)
Odyssey: A multi-threaded C connection pooler developed by Yandex. Distinguished by mutual TLS (mTLS) authentication support and sophisticated prepared statement deduplication. Appropriate for enterprise environments with strict security compliance requirements. (Chapter 17)
ORM ceiling: The point at which eager loading, query optimization, and ORM-specific workarounds are no longer sufficient for complex read queries — multi-table aggregations, window functions, CTEs. The developer faces a choice between raw SQL (effective but inelegant) and materialized views (which serve pre-computed results through the ORM's normal query interface). (Chapter 3)
ORM Tax: The cumulative performance cost of ORM-generated SQL versus hand-optimized queries. Includes N+1 patterns, inefficient JOINs, unnecessary data retrieval, and the inability to express complex aggregations. Every application using an ORM pays it. Most do not know they are paying it. (Chapter 3)
Partial index: An index that covers only a subset of a table's rows, defined by a WHERE clause. Indexes only the rows queries actually touch. If 90% of orders are archived and queries only target active ones, a partial index on WHERE status = 'active' is smaller, faster to maintain, and faster to scan. (Chapter 16, Chapter 18)
Partition pruning: PostgreSQL's automatic optimization that excludes irrelevant partitions from query execution based on the WHERE clause. A query for January 2026 data on a time-partitioned table touches only the January partition. Requires no application code changes. (Chapter 16)
pg_cron: A PostgreSQL extension that provides cron-style job scheduling inside the database as a background worker process. The recommended approach for scheduling materialized view refresh. Supported on AWS RDS, Google Cloud SQL, Azure Database for PostgreSQL, and Supabase. Three lines of SQL; no application code. (Chapter 5)
pg_ivm: The Incremental View Maintenance extension for PostgreSQL (version 1.12, September 2025). Installs AFTER triggers that capture affected rows and compute incremental changes to the materialized view, rather than recomputing the entire view. Supports SELECT, WHERE, JOINs, and common aggregates. Not available on major managed cloud platforms as of early 2026. Compatible with PostgreSQL 13 through 18. (Chapter 5)
pg_stat_activity: A PostgreSQL system view showing current activity for each connection: state (active, idle, idle in transaction), the current query, wait events, and connection duration. The diagnostic tool for connection exhaustion and stuck transactions. (Chapter 18)
pg_stat_statements: A PostgreSQL extension that tracks planning and execution statistics for every SQL statement. Sort by total_exec_time descending to find the queries consuming the most cumulative database resources. The first extension to enable on any PostgreSQL instance. (Chapter 1, Chapter 18)
pg_stat_user_tables: A PostgreSQL system view showing per-table statistics including live tuple count, dead tuple count, and the timestamps of the last autovacuum and autoanalyze. The diagnostic tool for table bloat and autovacuum health. (Chapter 18)
pg_trgm: A PostgreSQL extension providing trigram-based similarity matching. Enables fuzzy search, typo tolerance, autocomplete, and "did you mean" suggestions. Indexable with GIN and GiST.
PgBouncer: The original PostgreSQL connection pooler, in production since 2007. Written in C with a single-threaded event-loop architecture. Peaks at approximately 44,000 TPS with 50 concurrent clients (Tembo 2024 benchmarks). The safest default choice for most deployments. (Chapter 17)
PgCat: A multi-threaded connection pooler written in Rust. Reaches 59,000 TPS at 1,250 concurrent clients — 34% higher peak throughput than PgBouncer at 25 times the concurrency. Provides built-in read/write splitting, replica failover, and prepared statement support in transaction mode. Originally developed at PostgresML. (Chapter 17)
Refresh strategy: The approach used to keep a materialized view current. Options include pg_cron scheduling (simplest), application-level scheduling (Celery, SolidQueue, @Scheduled), flag-table patterns (conditional), LISTEN/NOTIFY event-driven refresh (near-real-time), and pg_ivm incremental maintenance (immediate). Start with pg_cron at 15 minutes and adjust. (Chapter 5)
relkind: The pg_class column that identifies the type of a database object. Tables are r, regular views are v, materialized views are m, and indexes are i. A materialized view's relkind = 'm' distinguishes it from both tables and views in the system catalog. (Chapter 4)
Scenic: A Ruby gem, originally by thoughtbot, for managing database views and materialized views in Rails. Provides versioned SQL definitions, migration generators, cascade refresh, and a zero-downtime side_by_side update strategy. Approximately 3,600 GitHub stars. Used by Mastodon, Code.org, and Lobsters. (Chapter 9)
Session mode: A connection pooler mode where the client receives a dedicated PostgreSQL connection for the entire session. Useful for long-lived application servers but does nothing for the serverless connection crisis. Contrast with transaction mode. (Chapter 15)
SolidCache / SolidCable / SolidQueue: Rails 8's PostgreSQL-backed replacements for Redis. SolidQueue handles background jobs (37signals processes 20 million jobs per day with it), SolidCache handles caching, and SolidCable handles WebSocket connections via database polling. Together, they eliminate Redis from the default Rails stack. (Chapter 2, Chapter 9, Chapter 19)
Stale: The industry-standard term for a materialized view whose data does not reflect the current state of the underlying tables. Freshness depends on the refresh strategy and interval. Not "dirty" — stale. (Chapter 5)
Supavisor: A connection pooler written in Elixir, designed for multi-tenant architectures. Built for Supabase's platform, handling connection multiplexing for thousands of tenant databases simultaneously. Peaks at approximately 21,700 TPS with higher per-query latency than PgBouncer or PgCat, reflecting its multi-tenant coordination overhead. (Chapter 17)
Table partitioning: Splitting a table into child tables on the same server, organized by a partition key. Declarative partitioning (available since PostgreSQL 10) supports range, list, and hash strategies. Not sharding — partitioning keeps all data on one server. Enables partition pruning, simpler maintenance, and efficient time-series queries. (Chapter 16)
Three-layer cache model: A caching architecture using PostgreSQL alone: materialized views for aggregated, multi-table, computed data (the hard cache invalidation problems); UNLOGGED tables for simple key-value caching (session data, rate limiting); and framework caching for individual hot records. Covers every caching need without external services. (Chapter 6)
Transaction mode: A connection pooler mode where the client receives a database connection only for the duration of a single transaction. The moment the transaction completes, the connection returns to the pool. The correct mode for serverless workloads. Constraints: prepared statements may not persist across transactions, session-level SET commands are lost, and LISTEN/NOTIFY requires a persistent connection. (Chapter 15, Chapter 17)
tsvector: PostgreSQL's built-in full-text search data type. Stores a sorted list of lexemes (normalized words) with positional information. Supports tokenization, stemming, stop word removal, boolean operators, phrase matching, and relevance ranking via ts_rank. Available since PostgreSQL 8.3 (2008). Indexable with GIN. (Chapter 4)
UNLOGGED table: A PostgreSQL table that bypasses the Write-Ahead Log for faster writes — approximately 2.9 times faster than regular tables. Data persists across clean restarts but is truncated after crashes. The correct durability profile for session storage, temporary caching, and ephemeral data. The Redis SET/GET replacement. (Chapter 1, Chapter 6, Chapter 16)
VACUUM: The PostgreSQL operation that reclaims storage from dead tuples and updates visibility information. VACUUM ANALYZE additionally updates table statistics for the query planner. Essential for materialized views refreshed with CONCURRENTLY. Schedule explicitly after refresh if autovacuum cannot keep pace. (Chapter 4, Chapter 5)
Write-Ahead Log (WAL): PostgreSQL's transaction durability mechanism. Every data change is written to the WAL before being applied to the data files, ensuring crash recovery. WAL entries also drive streaming replication to read replicas. Both refresh methods generate WAL; CONCURRENTLY may generate more WAL than standard refresh when many rows change. (Chapter 4)
Zombie connection: A database connection held open by a frozen serverless container (e.g., AWS Lambda) that is no longer executing queries. Occupies a PostgreSQL connection slot and consumes memory while waiting for queries that will never arrive. Zombie connections accumulate during traffic spikes faster than they expire, contributing to connection exhaustion. (Chapter 15)