SQL · Programming · Senior Database Engineer

Which specific property of PostgreSQL advisory locks allows session-scoped mutexes to prevent duplicate business key ingestion without creating row-level contention or table bloat?


Answer to the question.

History of the question.

Advisory locks first appeared in PostgreSQL 8.2 to provide lightweight, application-level synchronization primitives that operate outside the MVCC tuple visibility system. They were designed for workflows such as queue processing and idempotent ingestion where table-based locking would be semantically inappropriate or performance-prohibitive. Unlike row-level locks which are tied to specific table tuples and recorded in the xmax system column, advisory locks reside entirely within the shared-memory lock manager, offering a mechanism to marshal access to abstract resources without generating dead tuples or WAL traffic.

The problem.

In high-concurrency idempotent ingestion pipelines, enforcing uniqueness on business keys (e.g., external UUIDs) via traditional INSERT ... ON CONFLICT or SELECT FOR UPDATE creates severe bottlenecks. Row-level approaches require writing to the heap to set lock bits, which bloats tables, accelerates VACUUM pressure, and causes hotspots in unique indexes during conflict resolution. The challenge is to provide mutual exclusion for logical entities—such as a hashed business key—without touching the storage layer, while ensuring that lock failures do not leak resources into persistent connection pools.

The solution.

The critical property is that advisory locks live exclusively in the shared-memory lock manager's hash table, tagged with LOCKTAG_ADVISORY, and thus never modify the underlying relation pages. By employing pg_advisory_xact_lock(hashtext(business_key)), the application acquires a transaction-scoped mutex that is released automatically on COMMIT or ROLLBACK, avoiding the lock leakage associated with session-level pg_advisory_lock. This approach eliminates table bloat and index contention because the lock exists only as a lightweight in-memory entry, as demonstrated below:

BEGIN;
-- Acquire a transaction-bound lock on the hashed business key.
-- hashtext() returns int4, which is implicitly cast to the bigint lock key.
SELECT pg_advisory_xact_lock(hashtext('a1b2c3d4'));
-- Safe to insert: any other session hashing the same key blocks above,
-- so the unique index never sees concurrent conflicting inserts.
INSERT INTO events (business_key, payload)
VALUES ('a1b2c3d4', '{"event":"click"}')
ON CONFLICT (business_key) DO NOTHING;
COMMIT;
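While the transaction above is open, the lock is observable only in the pg_locks system view, never on disk; a second session can verify this:

```sql
-- In another session, while the transaction above is still open:
SELECT locktype, classid, objid, mode, granted
FROM pg_locks
WHERE locktype = 'advisory';
-- The hashed key appears as an in-memory entry (classid/objid encode the
-- bigint lock key); no heap or index page is touched by the lock itself.
```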

A real-world scenario

The data platform team at a telemetry company needed to guarantee exactly-once processing for 50,000 events per second ingested from Kafka into PostgreSQL, where each event carried a client-generated UUID that served as the idempotency key. Initial load tests using INSERT ... ON CONFLICT DO NOTHING on a unique UUID column caused severe tail latency due to buffer-lock contention on the unique B-tree index's leaf pages, along with bloat accumulating from the dead tuples that aborted speculative insertions leave behind. The WAL generation rate doubled during peak hours, threatening replication lag and storage capacity.

One proposed fix involved pre-checking for key existence using SELECT * FROM events WHERE business_key = $1 FOR UPDATE, then inserting only if the result was empty. While this prevented duplicates, it forced every writer to acquire a row lock on either the existing row or a surrogate reservation row, creating a massive hotspot on the reservation table’s pages. The approach generated substantial table bloat—requiring VACUUM to reclaim dead tuples every fifteen minutes—and could not prevent race conditions between the check and the insert without holding the lock for the entire transaction duration, severely limiting throughput.
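The rejected reservation pattern looked roughly like the following sketch, where $1 and $2 stand for the bound business key and payload:

```sql
BEGIN;
-- Rejected approach: check-then-insert guarded by a row lock.
-- FOR UPDATE can only lock rows that already exist, so two sessions that
-- both see an empty result can still race into the INSERT below.
SELECT 1 FROM events WHERE business_key = $1 FOR UPDATE;
-- The application inserts only if the SELECT returned no row:
INSERT INTO events (business_key, payload) VALUES ($1, $2);
COMMIT;
```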

The architecture team suggested moving coordination to an external Redis cache using SETNX operations to gate inserts. This eliminated database bloat and reduced PostgreSQL load, but introduced critical failure modes: network partitions between the Redis cluster and the database could allow duplicate inserts when the Redis lock expired but the PostgreSQL transaction had not yet committed. Furthermore, maintaining consistency across two distributed systems added operational complexity and required implementing Redlock or similar algorithms, increasing latency by approximately 5 milliseconds per operation.

The chosen design leveraged PostgreSQL's native advisory locks via pg_advisory_xact_lock(hashtext(business_key)), acquiring a transaction-bound lock on the hashed UUID before attempting the insert. Because these locks live only in shared memory and do not touch the heap, they impose zero storage overhead and release automatically on transaction termination, preventing the lock leakage observed with session-level locks. To avoid deadlocks when a batch locked several keys in one transaction, the application layer sorted all UUIDs in each batch by their hashed integer value before acquiring locks, ensuring a global lock-ordering protocol across all concurrent workers.
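The batch acquisition can be sketched entirely in SQL (the sample keys are illustrative) by sorting the hashed keys before locking; here the sorted subquery is expected to feed the lock calls in ascending hash order, though many teams instead sort in the application and issue one call per key:

```sql
-- Acquire transaction-scoped locks on a batch of keys in ascending hash
-- order, so every concurrent worker follows the same global ordering.
SELECT pg_advisory_xact_lock(h)
FROM (
    SELECT DISTINCT hashtext(k) AS h
    FROM unnest(ARRAY['uuid-1', 'uuid-2', 'uuid-3']) AS k
    ORDER BY h
) AS hashed_keys;
```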

Advisory locks were selected because they provided the lowest latency (sub-millisecond acquisition) and zero storage side-effects while maintaining strict correctness without external dependencies. Unlike the Redis approach, the lock’s lifetime was bound to the database transaction, guaranteeing atomicity between lock acquisition and insert commit. Unlike SELECT FOR UPDATE, no table bloat was generated, and unlike raw ON CONFLICT, the unique index was never stressed by conflicting concurrent inserts because serialization occurred before the heap access.

After deployment, the ingestion pipeline sustained 80,000 events per second with p99 latency under 10 milliseconds, compared to previous 200ms spikes during contention peaks. Table bloat dropped to negligible levels, allowing autovacuum to run only during off-peak hours, and WAL volume decreased by 40%, significantly reducing archival storage costs and replica lag. The system maintained exactly-once semantics through multiple database restarts and connection pool churns without a single duplicate event or deadlock-induced timeout.

What candidates often miss

Why does using pg_advisory_lock (session-scoped) instead of pg_advisory_xact_lock risk connection pool exhaustion and duplicate ingestion in a high-throughput worker architecture?

Candidates frequently fail to recognize that pg_advisory_lock persists until explicitly unlocked or the session disconnects, even if the transaction aborts. In a pooled environment where workers reuse long-lived connections, a logic error or exception that bypasses the unlock call leaves the lock held indefinitely, causing subsequent workers processing the same business key to wait forever. pg_advisory_xact_lock should be used instead because it ties the lock lifetime to the transaction boundary, ensuring automatic release on ROLLBACK and preventing mutex leakage that would otherwise starve the worker pool and stall the ingestion pipeline.
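The difference in lock lifetime is easy to demonstrate in psql (the key value is arbitrary):

```sql
-- Session-scoped: survives transaction abort.
BEGIN;
SELECT pg_advisory_lock(hashtext('k1'));
ROLLBACK;
-- The session STILL holds the lock; it must be released explicitly:
SELECT pg_advisory_unlock(hashtext('k1'));

-- Transaction-scoped: released automatically.
BEGIN;
SELECT pg_advisory_xact_lock(hashtext('k1'));
ROLLBACK;  -- lock released here; nothing to clean up
```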

How does acquiring multiple advisory locks without a total ordering guarantee lead to deadlocks, and what specific application pattern eliminates this hazard?

A common misconception is that advisory-lock deadlocks are invisible to the engine. In fact, blocking acquisitions such as pg_advisory_xact_lock go through the regular lock manager, so the deadlock detector aborts one victim with an error after deadlock_timeout, exactly as it does for row-level deadlocks. The cost is still real: if Worker A locks resource X then Y while Worker B locks Y then X, both stall for deadlock_timeout (default one second) before one transaction is killed and its batch must be retried, and retry loops built on the non-blocking pg_try_advisory_lock can livelock with no error at all. The mandatory pattern is to sort all resource identifiers (e.g., hashtext(uuid) values) in a strict monotonic order (ascending or descending) across the entire application before issuing any lock requests. This global ordering keeps wait-for graphs acyclic, making circular waits impossible rather than merely detectable.
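A two-session timeline (lock keys 1 and 2 are arbitrary) shows both the hazard and the detector firing:

```sql
-- Session A                          -- Session B
BEGIN;                                BEGIN;
SELECT pg_advisory_xact_lock(1);      SELECT pg_advisory_xact_lock(2);
SELECT pg_advisory_xact_lock(2);      -- Session A now waits on B
                                      SELECT pg_advisory_xact_lock(1);
-- After deadlock_timeout elapses, one session fails with
-- ERROR: deadlock detected; the other proceeds normally.
```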

What shared memory limitation restricts the number of advisory locks a single transaction can hold, and how does exceeding max_locks_per_transaction manifest compared to row-level lock exhaustion?

Many candidates assume advisory locks are unlimited, but each one consumes an entry in the shared-memory lock table, which is sized at server start to roughly max_locks_per_transaction (default 64) multiplied by max_connections plus max_prepared_transactions. The parameter is an average per backend, not a hard per-transaction cap: a single transaction may hold more, but once the shared pool is exhausted the next acquisition raises ERROR: out of shared memory (SQLSTATE 53200), aborting the transaction. Row-level locks behave differently: PostgreSQL records them in tuple headers (xmax) rather than in the shared lock table and never performs lock escalation, so even millions of row locks cannot exhaust that pool. The mitigation involves batching operations into smaller transactions or aggregating multiple logical resources under a single advisory lock key via composite hashing, rather than attempting to lock thousands of individual keys simultaneously.
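One composite-hashing sketch (the 'events' namespace and the 16-bucket fan-out are arbitrary choices) uses the two-integer lock form to shard an unbounded key space into a fixed number of locks:

```sql
BEGIN;
-- Two-int form: the first key namespaces the lock, the second selects one
-- of 16 buckets. hashtext() can be negative, so mask with & 15 rather than
-- using abs(), which errors on INT_MIN.
SELECT pg_advisory_xact_lock(hashtext('events'), hashtext('a1b2c3d4') & 15);
INSERT INTO events (business_key, payload)
VALUES ('a1b2c3d4', '{"event":"click"}')
ON CONFLICT (business_key) DO NOTHING;
COMMIT;
```

Coarser buckets trade some contention (distinct keys sharing a bucket serialize against each other) for a strictly bounded lock-table footprint.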