History of the question: Temporal interval analysis has challenged relational databases since the 1970s, as SQL was originally designed without native interval types. Early solutions relied on cursor-based iteration or Cartesian products between intervals, resulting in quadratic complexity. The introduction of window functions in SQL:2003 and the ROWS BETWEEN frame specification enabled efficient running aggregates, establishing the foundation for modern event-based concurrency calculations.
The problem: Determining maximum concurrency requires understanding the precise moments when state changes occur—specifically when a reservation begins or ends. The naive approach expands every interval into discrete time units (time-slicing), which is computationally prohibitive for long-duration stays. The core challenge lies in calculating how many intervals overlap at any specific point without materializing every moment of the timeline.
The solution: Employ a discrete event simulation pattern. Transform the interval table into an event stream using UNION ALL, assigning a weight of +1 to each check-in (start) and -1 to each check-out (end). By sorting these events chronologically (breaking ties so departures precede arrivals) and applying a running sum via SUM() OVER (ORDER BY ...) window functions, you derive the active count at every transition point. The maximum value of this running sum is the peak concurrency.
WITH events AS (
    SELECT check_in  AS event_time,  1 AS delta FROM reservations
    UNION ALL
    SELECT check_out AS event_time, -1 AS delta FROM reservations
),
concurrency AS (
    SELECT event_time,
           SUM(delta) OVER (
               ORDER BY event_time, delta
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS active_count
    FROM events
)
SELECT MAX(active_count) AS peak_concurrency
FROM concurrency;
Problem description: A luxury resort chain experienced mysterious overbooking incidents during holiday weekends despite their availability system reporting vacancies. The legacy query calculated daily occupancy by exploding each reservation into individual nightly rows using a recursive CTE, generating millions of rows for month-long stays. During New Year's Eve analysis, this approach required 12 seconds to execute and deadlocked the booking transaction table, preventing real-time reservations.
Solution A: Time-slice expansion with tally tables. The operations team initially suggested pre-generating a calendar table and joining it against reservations using event_date BETWEEN check_in AND check_out. This method provides intuitive daily aggregates compatible with standard GROUP BY clauses. Pros: Conceptually simple for business analysts, easy to integrate with existing BI tools. Cons: Generates O(N × D) rows, where D is the average stay duration, so row counts balloon multiplicatively; fails catastrophically with minute-level granularity or long-term leases; consumes excessive tempdb space.
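To make the row growth concrete, here is a minimal sketch of the calendar-table join run against SQLite; the reservations schema follows the article, while the calendar table contents and sample stays are invented for illustration. Note that BETWEEN treats check_out as inclusive, so the checkout day itself is still counted as occupied:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER, check_in TEXT, check_out TEXT)")
conn.executemany("INSERT INTO reservations VALUES (?, ?, ?)", [
    (1, "2024-12-28", "2025-01-03"),  # 6-night stay -> 7 joined rows (inclusive end)
    (2, "2024-12-30", "2025-01-02"),  # 3-night stay -> 4 joined rows
])
conn.execute("CREATE TABLE calendar (event_date TEXT)")
dates = [f"2024-12-{d:02d}" for d in range(26, 32)] + [f"2025-01-{d:02d}" for d in range(1, 6)]
conn.executemany("INSERT INTO calendar VALUES (?)", [(d,) for d in dates])

# Each reservation joins to one calendar row per day of its stay: O(N * D) rows.
rows = conn.execute("""
    SELECT c.event_date, COUNT(*) AS occupied
    FROM calendar c
    JOIN reservations r ON c.event_date BETWEEN r.check_in AND r.check_out
    GROUP BY c.event_date
    ORDER BY c.event_date
""").fetchall()
print(rows)
```

Even with two reservations, the join materializes eleven intermediate rows before aggregation; a month-long stay contributes roughly thirty rows on its own, which is exactly the O(N × D) blow-up listed under Cons.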
Solution B: Interval tree with materialized paths. A senior architect proposed implementing a segment tree structure using nested sets to index interval boundaries, enabling logarithmic-time overlap queries. Pros: Optimal theoretical complexity for frequent updates and point queries. Cons: Requires complex triggers to maintain tree balance; violates ANSI SQL portability by relying on procedural extensions; introduces write amplification that harms the OLTP workload during booking spikes.
Solution C: Chronological event stream with running sums (chosen). The database team adopted the event-based approach, treating each reservation boundary as a delta operation. This reduced the dataset from millions of exploded rows to exactly 2N events (start and end for each reservation). Pros: O(N log N) complexity dominated by the sort operation, constant memory footprint, and pure ANSI SQL compatibility across PostgreSQL, Oracle, and SQL Server. Cons: Requires careful handling of simultaneous events and does not inherently identify which specific reservations contributed to the peak without additional joins.
Result: Query latency dropped from 12 seconds to 45 milliseconds. The analysis revealed that the true bottleneck was not room inventory (500 units) but elevator capacity, as 320 guests attempted simultaneous 6 PM check-ins. This insight prompted implementing staggered check-in tiers rather than constructing a new wing, saving $2M in capital expenditure while eliminating deadlocks.
Why does the solution require ORDER BY event_time, delta (departures before arrivals) specifically, and what happens if you omit the secondary sort on delta?
Candidates frequently ignore boundary-condition semantics at shared timestamps. When one guest checks out at exactly 10:00 AM and another checks in at 10:00 AM, processing order determines whether the room appears occupied by two guests simultaneously. Sorting the secondary key in ascending delta order ensures -1 (departure) processes before +1 (arrival) at identical timestamps. Without this tie-break, an arrival may be summed before the simultaneous departure, so the running sum briefly jumps before dropping back, registering a phantom peak one higher than true occupancy. This subtle ordering prevents off-by-one errors that lead to overbooking vulnerabilities in production systems.
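The tie-break can be demonstrated with a tiny pure-Python sweep over hypothetical hour-granularity intervals: one guest leaves at 12:00 exactly as the next arrives.

```python
# Hypothetical back-to-back stays: guest A occupies [10, 12), guest B [12, 14).
reservations = [(10, 12), (12, 14)]

def peak_concurrency(tie_break):
    """Sweep boundary events in time order; tie_break orders deltas at
    equal timestamps (ascending => departures before arrivals)."""
    events = [(s, +1) for s, _ in reservations] + [(e, -1) for _, e in reservations]
    events.sort(key=lambda ev: (ev[0], tie_break(ev[1])))
    running = best = 0
    for _, delta in events:
        running += delta
        best = max(best, running)
    return best

# Departures first: the 12:00 check-out frees the room before the 12:00
# check-in is counted, so concurrency never exceeds 1.
print(peak_concurrency(lambda d: d))   # -> 1
# Arrivals first: both guests are momentarily counted at 12:00,
# producing a phantom peak of 2.
print(peak_concurrency(lambda d: -d))  # -> 2
```

The same two-event dataset yields two different answers depending solely on the tie-break, which is why the secondary sort belongs in the window's ORDER BY rather than being left to chance.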
How would you modify this query to identify which specific reservations were active during the peak concurrency moment, not merely the count?
Most candidates attempt to filter within the same CTE, failing to recognize that the peak might span a continuous interval rather than a single point. The robust approach requires a two-pass strategy: first, isolate the timestamp where active_count equals the maximum using a subquery or CTE, then join back to the original reservations table using the overlap predicate r.check_in <= peak.event_time AND r.check_out > peak.event_time. Candidates miss that multiple timestamps might share the same maximum value, necessitating a DISTINCT or EXISTS clause to avoid duplicate reservation listings when the peak plateau persists across several event transitions.
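The two-pass strategy can be sketched end to end. The schema matches the article's reservations table; the integer timestamps and sample rows are invented for illustration, and the runner is SQLite (3.25+ for window-function support):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER, check_in INTEGER, check_out INTEGER)")
conn.executemany("INSERT INTO reservations VALUES (?, ?, ?)",
                 [(1, 1, 5), (2, 2, 6), (3, 4, 8), (4, 9, 10)])

# Pass 1 finds every timestamp at the maximum; pass 2 joins back with the
# half-open overlap predicate. DISTINCT guards against peak plateaus that
# persist across several event transitions.
rows = conn.execute("""
    WITH events AS (
        SELECT check_in  AS event_time,  1 AS delta FROM reservations
        UNION ALL
        SELECT check_out AS event_time, -1 AS delta FROM reservations
    ),
    concurrency AS (
        SELECT event_time,
               SUM(delta) OVER (ORDER BY event_time, delta
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS active_count
        FROM events
    ),
    peaks AS (
        SELECT event_time
        FROM concurrency
        WHERE active_count = (SELECT MAX(active_count) FROM concurrency)
    )
    SELECT DISTINCT r.id
    FROM reservations r
    JOIN peaks p ON r.check_in <= p.event_time AND r.check_out > p.event_time
    ORDER BY r.id
""").fetchall()
peak_ids = [r[0] for r in rows]
print(peak_ids)  # -> [1, 2, 3]: reservation 4 is not active at the peak
```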
What modifications are necessary if business rules change such that check-out time is inclusive (guest occupies room until 11:59 PM) rather than exclusive, and how does this affect the event weighting?
Candidates often overlook interval boundary semantics. Inclusive end-points [start, end] create genuine overlaps when one reservation ends and another begins on the same day. The solution requires converting inclusive boundaries to exclusive by adding one discrete time unit (or an infinitesimal epsilon) to check-out times before generating the -1 event. Alternatively, adjust the join-back logic to use check_out >= event_time while keeping the running-sum logic intact. Failing to adjust this leaves the model treating closed intervals as half-open ones: each -1 event fires one time unit early, so the algorithm releases rooms while they are still occupied, undercounts true concurrency during high-turnover periods, and silently reintroduces the overbooking vulnerability.
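A minimal sketch of the inclusive-to-exclusive conversion, using SQLite with invented day-granularity data; shifting the inclusive end forward by one day via DATE(check_out, '+1 day') moves the -1 event past the last occupied night, restoring half-open semantics:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER, check_in TEXT, check_out TEXT)")
# check_out is INCLUSIVE here: guest 1 still occupies the room on the Jan 2 night.
conn.executemany("INSERT INTO reservations VALUES (?, ?, ?)",
                 [(1, "2025-01-01", "2025-01-02"),
                  (2, "2025-01-02", "2025-01-03")])

peak_count = conn.execute("""
    WITH events AS (
        SELECT check_in AS event_time, 1 AS delta FROM reservations
        UNION ALL
        -- Shift the inclusive end forward one unit before emitting the -1 event.
        SELECT DATE(check_out, '+1 day') AS event_time, -1 AS delta FROM reservations
    ),
    concurrency AS (
        SELECT SUM(delta) OVER (ORDER BY event_time, delta
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS active_count
        FROM events
    )
    SELECT MAX(active_count) FROM concurrency
""").fetchone()[0]
print(peak_count)  # -> 2: under inclusive rules the Jan 2 overlap is a real conflict
```

Without the one-day shift, the -1 for guest 1 would fire at the Jan 2 check-in of guest 2 and the reported peak would drop to 1, hiding a genuine double occupancy.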