Conditional aggregation is used when you need to aggregate values (e.g., sum or count) based on different conditions within a single query. Historically, to achieve such goals, SQL developers had to write multiple subqueries or perform joins and groupings separately for each category, resulting in less readable and less efficient code.
The problem lay in the difficulty of simultaneously calculating multiple aggregates with different conditions — each individual aggregate usually requires filtering, and simply applying SUM() or COUNT() does not account for the necessary condition within the aggregate function.
Solution — using the CASE construct within an aggregate function, which allows "separating" aggregates based on different conditions on the fly, without resorting to multiple joins:
Example code:
SELECT department, SUM(CASE WHEN status = 'approved' THEN amount ELSE 0 END) AS approved_sum, SUM(CASE WHEN status = 'pending' THEN amount ELSE 0 END) AS pending_sum, COUNT(CASE WHEN status = 'rejected' THEN 1 END) AS rejected_count FROM payments GROUP BY department;
Key features:
Can I do without the CASE construct for conditional aggregation?
An incorrect answer would be to use WHERE directly in the query along with the aggregate function. In fact, WHERE filters rows before aggregation, not within each aggregate column.
Example code (incorrect way):
SELECT COUNT(*) FROM payments WHERE status = 'approved'; SELECT COUNT(*) FROM payments WHERE status = 'pending';
These queries cannot be concatenated into a single result line based on different conditions without CASE.
What happens if you use NULL within the conditional expression?
If CASE does not return an alternative value, it will default to NULL, and the aggregate function will ignore NULL.
Example code:
SUM(CASE WHEN status = 'approved' THEN amount END) -- If the status is not 'approved', then NULL, and the row is ignored in SUM
Can I use IF instead of CASE?
In some SQL dialects (e.g., MySQL) this is possible, but such code becomes non-portable between different databases. In universal queries, always use CASE.
An analyst tried to calculate sums for each status by writing several separate subqueries. The external report became complex, and adding new statuses required rewriting the code each time.
Pros:
A developer used CASE in one general query, creating a universal report with the ability to accommodate new statuses by simply editing one query.
Pros: