History of the question
Time-weighted averages emerged as a critical metric in industrial IoT and financial time-series analysis: a simple arithmetic mean distorts reality because each sensor value persists until the next measurement arrives. Before the ANSI SQL:2003 standard introduced window functions, calculating these averages required procedural cursors or expensive self-joins, both of which run in O(n²) time.
The LEAD and LAG window functions revolutionized this domain by enabling single-pass, set-based interval calculations that run in O(n) time, making real-time analytics on billions of rows feasible within the database layer.
The problem
Given a table readings with columns device_id, ts (timestamp), and value, the goal is to compute a time-weighted average in which each row contributes proportionally to the time delta until the next reading. Mathematically, this is $\frac{\sum_i value_i \times (ts_{i+1} - ts_i)}{\sum_i (ts_{i+1} - ts_i)}$.
The final row presents a boundary condition: it has no subsequent timestamp, so its interval must be defined as zero, extrapolated to the current time, or capped at a known end time. The solution must avoid cursors, user-defined functions, and self-joins to remain purely declarative.
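As a quick arithmetic check of the formula, a minimal sketch in Python (the readings below are hypothetical sample data, with the terminal row assigned zero weight):

```python
# Hypothetical readings: (epoch_seconds, value); the final reading gets zero weight.
readings = [(0, 10.0), (10, 20.0), (40, 30.0)]

# Pair each reading with the next timestamp -- the equivalent of SQL's LEAD.
durations = [nxt - ts for (ts, _), (nxt, _) in zip(readings, readings[1:])]
values = [v for _, v in readings[:-1]]

twa = sum(v * d for v, d in zip(values, durations)) / sum(durations)
print(twa)  # 17.5 -- the value 20 held for 30 s outweighs 10 held for 10 s
```

A plain average of the three values would give 20.0; weighting by duration pulls the result toward the value that persisted longest.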
The solution
Use the LEAD window function to project the next timestamp into the current row, calculate the epoch difference to derive the weight, and then apply the standard weighted-mean formula.
WITH weighted AS (
    SELECT
        device_id,
        value,
        ts,
        COALESCE(
            EXTRACT(EPOCH FROM (LEAD(ts) OVER (PARTITION BY device_id ORDER BY ts) - ts)),
            0
        ) AS duration_seconds
    FROM readings
)
SELECT
    device_id,
    SUM(value * duration_seconds) / NULLIF(SUM(duration_seconds), 0) AS time_weighted_avg
FROM weighted
GROUP BY device_id;
PARTITION BY ensures the window resets per device, preventing intervals that span timestamps from different sensors. The COALESCE handles the terminal row by assigning it zero weight, effectively excluding it from both numerator and denominator.
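The query can be exercised end to end with a small script. This sketch uses SQLite (3.25+ supports the same window functions) and assumes ts is stored as integer epoch seconds, so the PostgreSQL-specific EXTRACT(EPOCH ...) step becomes a plain subtraction; the sample rows are hypothetical:

```python
import sqlite3  # SQLite 3.25+ ships LEAD and PARTITION BY

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (device_id TEXT, ts INTEGER, value REAL);
INSERT INTO readings VALUES
  ('r1', 0, 10.0), ('r1', 10, 20.0), ('r1', 40, 30.0),
  ('r2', 0, 5.0),  ('r2', 60, 5.0);
""")

rows = conn.execute("""
WITH weighted AS (
    SELECT
        device_id,
        value,
        -- ts is already epoch seconds, so no EXTRACT is needed here
        COALESCE(LEAD(ts) OVER (PARTITION BY device_id ORDER BY ts) - ts,
                 0) AS duration_seconds
    FROM readings
)
SELECT
    device_id,
    SUM(value * duration_seconds) / NULLIF(SUM(duration_seconds), 0)
FROM weighted
GROUP BY device_id
ORDER BY device_id;
""").fetchall()
print(rows)  # [('r1', 17.5), ('r2', 5.0)]
```

Note that each device's average is computed independently: r2's constant value yields 5.0 regardless of its sampling cadence.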
A pharmaceutical manufacturing line monitors 200 bioreactors. Each emits temperature data at irregular intervals—every 10 seconds during heating phases, but every 30 minutes during idle holds. The quality team required a daily time-weighted average to ensure compliance. A simple average would overweight the rapid-fire heating samples and underweight the stable holds, potentially masking dangerous temperature excursions.
One proposed solution involved extracting all data into a Python pandas DataFrame. Engineers would calculate diff() on timestamps and compute the weighted mean. While flexible, this approach transferred gigabytes of data across the network. It also crashed the analytics workstation when processing quarter-end reports spanning 90 days of high-frequency data.
Another alternative used a correlated subquery to find the MIN(ts) greater than the current row for each device. This executed correctly on test sets of 1,000 rows. However, it exhibited quadratic degradation, taking 45 minutes for a single reactor's full history.
The team selected the ANSI SQL window function approach. By keeping computation inside the PostgreSQL cluster, the query leveraged parallel sequential scans and avoided network overhead. The final implementation processed 50 million rows across all reactors in under 12 seconds. This enabled real-time dashboard updates that allowed operators to spot thermal drift within minutes rather than hours.
How do you handle the final observation in each partition where no subsequent timestamp exists to define the interval weight?
Candidates frequently omit the boundary condition. This causes the terminal row's interval to evaluate as NULL, which SQL aggregates ignore. Consequently, the final reading's contribution is dropped, skewing the average. The correct approach uses COALESCE to substitute either zero or EXTRACT(EPOCH FROM (boundary_time - ts)) if the average must extend to a known end time like CURRENT_TIMESTAMP.
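A sketch of the boundary-capping variant, again in SQLite with epoch-second timestamps for determinism (the literal boundary of 40 stands in for a known end time such as CURRENT_TIMESTAMP; all sample values are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (device_id TEXT, ts INTEGER, value REAL);
INSERT INTO readings VALUES ('r1', 0, 10.0), ('r1', 10, 20.0);
""")

# Cap the terminal interval at a known end time instead of assigning zero
# weight, so the last reading contributes until the end of the window.
boundary = 40
row = conn.execute("""
WITH weighted AS (
    SELECT value,
           COALESCE(LEAD(ts) OVER (ORDER BY ts), ?) - ts AS duration_seconds
    FROM readings
)
SELECT SUM(value * duration_seconds) / NULLIF(SUM(duration_seconds), 0)
FROM weighted;
""", (boundary,)).fetchone()
print(row[0])  # (10*10 + 20*30) / 40 = 17.5
```

With the zero-weight convention instead, the final reading of 20 would only count for its preceding interval, giving a different result whenever the tail interval is long.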
Why does the formula SUM(value * duration) / SUM(duration) mathematically represent the time-weighted average, and what happens if you use AVG(value) instead?
This computes the weighted arithmetic mean where duration serves as the weight $w_i$. Candidates often confuse this with a geometric mean, or attempt AVG(value * duration), which divides the sum of products by the row count instead of by the total duration. Using AVG(value) treats each row equally, assuming uniform time steps, which violates the requirement that longer-lasting values exert greater influence.
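The distortion is easy to quantify with a hypothetical spike: a value of 100 held for 1 second, followed by 0 held for 99 seconds.

```python
# Hypothetical spike: value 100 held for 1 s, then value 0 held for 99 s.
samples = [(0, 100.0), (1, 0.0), (100, 0.0)]  # final row gets zero weight

naive = sum(v for _, v in samples) / len(samples)  # AVG(value): rows count equally
durations = [nxt - ts for (ts, _), (nxt, _) in zip(samples, samples[1:])]
weighted = sum(v * d for (_, v), d in zip(samples, durations)) / sum(durations)

print(round(naive, 1))  # 33.3 -- inflated by the brief spike
print(weighted)         # 1.0 -- the spike lasted only 1 of 100 seconds
```

The naive mean reports a process running at 33.3 on average when it spent 99% of the window at 0; the duration-weighted mean reports 1.0.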
How do temporal gaps or duplicate timestamps within the same partition affect the LEAD function's behavior, and why is PARTITION BY device_id essential?
Candidates sometimes forget that LEAD follows the logical order defined by the window's ORDER BY clause. Without PARTITION BY device_id, the function computes the interval to another sensor's next reading, attributing time to the wrong device and producing spurious durations. Additionally, if duplicate timestamps exist, LEAD still returns the next row, creating zero-second intervals that silently drop those readings' contributions. Candidates must decide whether to deduplicate first, using DISTINCT or ROW_NUMBER() filtering, before applying the interval logic.
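The ROW_NUMBER() deduplication can be sketched as a CTE layered under the weighting logic. As before, this demo uses SQLite with hypothetical epoch-second data, and the tie-break rule (lowest value wins) is an arbitrary choice that should be replaced by the actual business rule:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (device_id TEXT, ts INTEGER, value REAL);
-- Duplicate timestamp at ts = 10: two conflicting readings.
INSERT INTO readings VALUES
  ('r1', 0, 10.0), ('r1', 10, 20.0), ('r1', 10, 99.0), ('r1', 40, 30.0);
""")

rows = conn.execute("""
WITH deduped AS (
    SELECT device_id, ts, value,
           -- keep one row per (device_id, ts); ORDER BY value is a
           -- hypothetical tie-break and should reflect real requirements
           ROW_NUMBER() OVER (PARTITION BY device_id, ts ORDER BY value) AS rn
    FROM readings
),
weighted AS (
    SELECT device_id, value,
           COALESCE(LEAD(ts) OVER (PARTITION BY device_id ORDER BY ts) - ts,
                    0) AS duration_seconds
    FROM deduped
    WHERE rn = 1
)
SELECT device_id,
       SUM(value * duration_seconds) / NULLIF(SUM(duration_seconds), 0)
FROM weighted
GROUP BY device_id;
""").fetchall()
print(rows)  # [('r1', 17.5)] -- the duplicate at ts = 10 no longer distorts the result
```

Without the deduplication layer, the row ('r1', 10, 99.0) would receive a zero-second interval and vanish from the average, or worse, displace the reading that should carry the 30-second weight.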