To effectively build aggregations with "level" filtering (for example, first filter, then calculate sums by groups, then by the entire dataset), window functions (OVER()) and nested GROUP BY are used.
Example: Let's find the maximum order amount by each manager, but only among orders with the status 'paid', and then display the manager's name with the absolute maximum among all.
WITH PaidOrders AS ( SELECT ManagerID, SUM(OrderAmount) AS TotalPaid FROM Orders WHERE Status = 'paid' GROUP BY ManagerID ), WithMax AS ( SELECT *, MAX(TotalPaid) OVER() AS MaxTotalPaid FROM PaidOrders ) SELECT ManagerID, TotalPaid FROM WithMax WHERE TotalPaid = MaxTotalPaid;
This approach (CTE + window functions) allows for multi-level filtering and aggregation.
Trick: "What is the difference between applying the WHERE filter before grouping (GROUP BY) and using HAVING after? What does this often lead to in reports?"
Answer: WHERE removes rows before grouping, thus providing a strict input set. HAVING filters aggregated groups — therefore, it may temporarily "retain" extraneous rows if the filter is logically inconsistent. An incorrect filter location often leads to errors in final aggregates or incorrect report results.
-- Getting the sum only for 'paid', through WHERE SELECT ManagerID, SUM(OrderAmount) FROM Orders WHERE Status = 'paid' GROUP BY ManagerID; -- Or calculating sums for all, then trimming with HAVING SELECT ManagerID, SUM(OrderAmount) FROM Orders GROUP BY ManagerID HAVING SUM(OrderAmount) > 1000;
Story
Project: Sales reports, verification audit.
Mistake: The developer applied HAVING Status='paid' instead of WHERE, leading to aggregates incorrectly including unpaid orders, resulting in inaccurate annual KPIs for staff.
Story
Project: Banking analytics.
Mistake: An attempt to apply a window function to a complex aggregation without PARTITION BY led to aggregates being calculated over the entire table instead of the group. The department's budget was incorrectly calculated — it had to be restored manually.
Story
Project: Online store, order statistics.
Mistake: The nesting of window functions in a subquery was overlooked during optimization — causing the server to process data multiple times, the query became 20 times slower than just using double GROUP BY.