This challenge originates from capacity planning and resource allocation domains, particularly in systems like hotel reservation platforms, cloud infrastructure auto-scaling, and medical facility scheduling. Early solutions relied on cursors or external application logic to walk the timeline row by row, suffering severe performance penalties on large datasets. The advent of ANSI SQL:2003 window functions enabled purely relational approaches to temporal analysis, allowing databases to handle complex interval arithmetic efficiently within the engine.
Given a table of resource bookings with start_time and end_time timestamps, the objective is to determine the maximum number of concurrent reservations active at any single moment, and identify the specific time window(s) when this peak occurred. The complexity arises because standard aggregation collapses temporal data, while simple joins create Cartesian explosion when intervals overlap. A robust solution must treat interval start and end as discrete events, calculating a running tally of active resources at every transition point.
The canonical approach transforms intervals into discrete events using UNION ALL to separate starts (weight +1) and ends (weight -1), then applies a cumulative sum via SUM() OVER (ORDER BY timestamp) to track concurrency. To handle simultaneous starts and ends deterministically, end events must be processed before start events at the same timestamp (using a secondary sort key); this treats intervals as half-open, so a reservation ending at an instant does not overlap one starting at that same instant. Finally, wrap the running sum in a CTE and filter for its maximum value.
WITH events AS (
    SELECT start_time AS ts,  1 AS delta FROM reservations
    UNION ALL
    SELECT end_time   AS ts, -1 AS delta FROM reservations
),
concurrency AS (
    SELECT ts,
           SUM(delta) OVER (ORDER BY ts, delta
                            ROWS UNBOUNDED PRECEDING) AS concurrent_count
    FROM events
)
SELECT MAX(concurrent_count) AS peak_concurrency
FROM concurrency;

Note the secondary sort key: because end events carry delta = -1 and start events delta = +1, ORDER BY ts, delta processes terminations before new starts at a tied timestamp, as required above.
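To see the query in action, here is a minimal, self-contained sketch using Python's sqlite3 (SQLite 3.25+ supports window functions); the table and column names follow the article, the four sample reservations are hypothetical:

```python
import sqlite3

# Toy dataset: three reservations overlap between 10:45 and 11:00.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (start_time TEXT, end_time TEXT)")
conn.executemany("INSERT INTO reservations VALUES (?, ?)", [
    ("2023-10-01 10:00", "2023-10-01 11:00"),
    ("2023-10-01 10:30", "2023-10-01 12:00"),
    ("2023-10-01 10:45", "2023-10-01 11:15"),
    ("2023-10-01 11:30", "2023-10-01 13:00"),
])

peak = conn.execute("""
    WITH events AS (
        SELECT start_time AS ts,  1 AS delta FROM reservations
        UNION ALL
        SELECT end_time   AS ts, -1 AS delta FROM reservations
    ),
    concurrency AS (
        SELECT SUM(delta) OVER (ORDER BY ts, delta
                                ROWS UNBOUNDED PRECEDING) AS concurrent_count
        FROM events
    )
    SELECT MAX(concurrent_count) FROM concurrency
""").fetchone()[0]
print(peak)  # 3
```

ISO-8601 text timestamps sort correctly as strings, which is why plain ORDER BY ts works here without casting.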
To find the specific time windows of peak usage, join back to identify periods between consecutive timestamps where the count equals the maximum.
A SaaS platform tracked millions of video transcoding jobs in a table jobs with started_at and completed_at timestamps. The operations team needed to identify exact periods when GPU utilization hit 100% to optimize queue scheduling.
One approach considered was using a cursor to iterate chronologically, incrementing a counter on starts and decrementing on ends. While straightforward for developers familiar with imperative languages, this method processed rows sequentially, taking over 45 minutes on production data and locking tables. It also required complex transaction management to ensure read consistency.
Another alternative involved generating a time-series table with one row per minute and joining it against intervals using BETWEEN predicates. This produced accurate results but required billions of rows for minute-level precision over a year, consuming terabytes of temporary storage and failing to capture sub-minute peak spikes.
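For comparison, the rejected minute-grid approach looks roughly like this sketch, with a recursive CTE standing in for the generated time-series table and minutes-since-midnight integers standing in for real timestamps (the jobs data is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (started_at INTEGER, completed_at INTEGER)")
# Three jobs; all three run simultaneously from minute 645 to 659.
conn.executemany("INSERT INTO jobs VALUES (?, ?)",
                 [(600, 660), (630, 720), (645, 675)])

row = conn.execute("""
    WITH RECURSIVE minutes(m) AS (
        SELECT 600 UNION ALL SELECT m + 1 FROM minutes WHERE m < 720
    )
    SELECT m, COUNT(*) AS active
    FROM minutes
    JOIN jobs ON m BETWEEN started_at AND completed_at - 1
    GROUP BY m
    ORDER BY active DESC, m
    LIMIT 1
""").fetchone()
print(row)  # (645, 3): first peak minute and its job count
```

Even on this toy scale the grid-times-intervals join is visible; at minute granularity over a year it is exactly the row explosion the team rejected.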
The team selected the event-based UNION ALL approach with ANSI SQL window functions. By treating starts and ends as +1/-1 events, the query executed in 12 seconds using standard B-tree indexes on the timestamp columns. This method correctly handled edge cases where jobs ended exactly as others began.
The analysis revealed that peak concurrency occurred during nightly batch processing between 02:00 and 02:07 UTC, reaching 847 simultaneous jobs. By implementing dynamic queue throttling specifically for this window, they prevented cascade failures and reduced infrastructure over-provisioning by 30%.
How do you handle zero-duration intervals (start_time = end_time) without incorrectly inflating the concurrency count?
Zero-duration intervals represent instantaneous events that should not contribute to concurrent load. Without careful tie-breaking, their +1 event can register before their -1 at the same timestamp, briefly inflating the running count. The solution is twofold. First, exclude zero-duration intervals from the event stream entirely (or assign them a delta of 0), depending on business logic. Second, enforce a strict ordering for timestamp collisions so that end events (-1) are processed before start events (+1): with the +1/-1 encoding, ORDER BY ts, delta achieves this, since -1 sorts before +1, ensuring terminations decrement the count before new allocations increment it at the identical timestamp.
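A minimal sketch of the exclusion variant, assuming half-open [start, end) semantics and hypothetical sample times: zero-duration rows are filtered out before the event stream is built, and the delta sort key handles the remaining collision at 11:00.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (start_time TEXT, end_time TEXT)")
conn.executemany("INSERT INTO reservations VALUES (?, ?)", [
    ("10:00", "11:00"),
    ("11:00", "11:00"),   # zero-duration: must not inflate the count
    ("11:00", "12:00"),   # starts exactly as the first one ends
])

peak = conn.execute("""
    WITH events AS (
        SELECT start_time AS ts,  1 AS delta FROM reservations
        WHERE start_time < end_time          -- drop instantaneous rows
        UNION ALL
        SELECT end_time AS ts, -1 AS delta FROM reservations
        WHERE start_time < end_time
    )
    SELECT MAX(running) FROM (
        SELECT SUM(delta) OVER (ORDER BY ts, delta
                                ROWS UNBOUNDED PRECEDING) AS running
        FROM events
    )
""").fetchone()[0]
print(peak)  # 1: the handoff at 11:00 never shows 2 concurrent
```

Because -1 sorts before +1, the 11:00 termination is applied before the 11:00 start, so the back-to-back reservations register as a clean handoff rather than an overlap.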
Why does the event-based approach potentially return incorrect results if you use UNION instead of UNION ALL when combining start and end events?
UNION implicitly performs a DISTINCT operation, collapsing duplicate timestamps. If two reservations start at exactly 2023-10-01 10:00:00, UNION reduces this to a single row, causing the cumulative sum to miss one +1 increment. This results in under-counting concurrency. UNION ALL preserves every individual interval boundary as a separate event, which is mathematically required because each reservation contributes independently to the total load. Candidates often overlook this distinction, assuming timestamp uniqueness where multiplicity is essential for correct aggregation.
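The under-count is easy to demonstrate. In this sketch (hypothetical data: two reservations starting at the same instant), the only difference between the two runs is the set operator:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (start_time TEXT, end_time TEXT)")
# Two reservations begin at exactly the same timestamp.
conn.executemany("INSERT INTO reservations VALUES (?, ?)", [
    ("10:00", "11:00"),
    ("10:00", "12:00"),
])

def peak(set_op):
    return conn.execute(f"""
        WITH events AS (
            SELECT start_time AS ts,  1 AS delta FROM reservations
            {set_op}
            SELECT end_time   AS ts, -1 AS delta FROM reservations
        )
        SELECT MAX(running) FROM (
            SELECT SUM(delta) OVER (ORDER BY ts, delta
                                    ROWS UNBOUNDED PRECEDING) AS running
            FROM events
        )
    """).fetchone()[0]

print(peak("UNION ALL"))  # 2 -- both +1 events survive
print(peak("UNION"))      # 1 -- the duplicate ('10:00', +1) row is collapsed
```

UNION silently removes one of the two identical (+1, '10:00') events, so the running sum never reaches the true peak of 2.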
When calculating the specific time windows of peak concurrency (not just the maximum value), how do you avoid gaps in the output if multiple consecutive time periods share the same peak value?
After identifying the maximum concurrency value, joining back to find all timestamps where this occurs yields discrete points. To reconstruct continuous duration blocks, you must apply the Gaps and Islands technique: use LAG() to check if the previous row was also at peak, and LEAD() to check if the next row is at peak. Only output rows where the previous value differs (island starts) or the next value differs (island ends). Then pair these using ROW_NUMBER() to create start-end pairs. Candidates frequently output raw timestamp lists or use GROUP BY on the count value, which loses the temporal adjacency information needed to distinguish separate peak incidents from one continuous peak period.
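One way to sketch the full pipeline end to end, using a running-sum variant of the island numbering (a LAG() flag marks each island start, and a cumulative sum of the flags serves as the island id, in place of the ROW_NUMBER() pairing described above; sample data is hypothetical, with one continuous peak built from two adjacent windows plus one separate peak incident):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (start_time TEXT, end_time TEXT)")
conn.executemany("INSERT INTO reservations VALUES (?, ?)", [
    ("10:00", "11:00"),
    ("10:30", "11:30"),
    ("11:00", "12:00"),   # peak of 2 holds continuously 10:30-11:30
    ("13:00", "14:00"),
    ("13:15", "13:45"),   # second, separate peak incident
])

rows = conn.execute("""
    WITH events AS (
        SELECT start_time AS ts,  1 AS delta FROM reservations
        UNION ALL
        SELECT end_time   AS ts, -1 AS delta FROM reservations
    ),
    conc AS (
        SELECT ts,
               LEAD(ts) OVER (ORDER BY ts, delta) AS next_ts,
               SUM(delta) OVER (ORDER BY ts, delta
                                ROWS UNBOUNDED PRECEDING) AS c
        FROM events
    ),
    peaks AS (
        SELECT ts, next_ts FROM conc
        WHERE c = (SELECT MAX(c) FROM conc)
    ),
    flagged AS (   -- a window touching the previous one continues its island
        SELECT ts, next_ts,
               CASE WHEN LAG(next_ts) OVER (ORDER BY ts) = ts
                    THEN 0 ELSE 1 END AS new_island
        FROM peaks
    ),
    islands AS (   -- running sum of start flags numbers the islands
        SELECT ts, next_ts,
               SUM(new_island) OVER (ORDER BY ts
                                     ROWS UNBOUNDED PRECEDING) AS island_id
        FROM flagged
    )
    SELECT MIN(ts) AS peak_start, MAX(next_ts) AS peak_end
    FROM islands
    GROUP BY island_id
    ORDER BY peak_start
""").fetchall()
print(rows)  # [('10:30', '11:30'), ('13:15', '13:45')]
```

The two adjacent peak windows (10:30-11:00 and 11:00-11:30) merge into a single continuous block, while the 13:15 incident remains separate, which is exactly the distinction a raw timestamp list loses.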