Conditional aggregation is a classic reporting task. Initially, separate queries with filters were created to count different metrics. Soon, a more compact solution emerged — conditional aggregation using CASE inside aggregate functions (e.g., SUM(CASE WHEN ...)). Problems arise when combining filters, groupings, and summary headers: it's easy to get incorrect sums or misinterpret the result.
Solution: use conditional aggregation inside aggregate functions. For example, to display the number of orders "in progress" and "completed" for all employees:
SELECT employee_id, SUM(CASE WHEN status = 'processing' THEN 1 ELSE 0 END) as processing_count, SUM(CASE WHEN status = 'done' THEN 1 ELSE 0 END) as done_count FROM Orders GROUP BY employee_id;
Key features:
What happens if the CASE condition returns NULL instead of 0?
The SUM aggregate function ignores NULL. Therefore, if you write CASE WHEN ... THEN 1 END, the missed rows will not be counted. It's better to always explicitly set ELSE 0.
SUM(CASE WHEN status = 'processing' THEN 1 ELSE 0 END)
Can a WHERE filter change the final sum by statuses during conditional aggregation?
Yes: if the main WHERE limits the selection (e.g., WHERE region = 'west'), then the counts will be made only on the filtered data. For global totals, use a subquery or remove the filter.
Can HAVING be used to filter rows before grouping?
No. HAVING filters already grouped data by aggregates. Filter the initial rows through WHERE.
In an analytical report, used:
SUM(CASE WHEN status = 'approved' THEN 1 END)
There were many NULLs, the final sums were understated. Due to the WHERE filter, some necessary rows were lost.
Pros:
Cons:
Used:
SUM(CASE WHEN status = 'approved' THEN 1 ELSE 0 END)
The code was maintained in a single query, with filters on totals through HAVING.
Pros:
Cons: