ProgrammingBackend Developer, BI Analyst

How to implement reliable conditional aggregation in SQL when building reports with multiple filters? What nuances exist when mixing HAVING, CASE, and aggregate functions?

Pass interviews with Hintsage AI assistant

Answer.

Conditional aggregation is a classic reporting task. Initially, separate queries with filters were created to count different metrics. Soon, a more compact solution emerged — conditional aggregation using CASE inside aggregate functions (e.g., SUM(CASE WHEN ...)). Problems arise when combining filters, groupings, and summary headers: it's easy to get incorrect sums or misinterpret the result.

Solution: use conditional aggregation inside aggregate functions. For example, to display the number of orders "in progress" and "completed" for all employees:

SELECT employee_id, SUM(CASE WHEN status = 'processing' THEN 1 ELSE 0 END) as processing_count, SUM(CASE WHEN status = 'done' THEN 1 ELSE 0 END) as done_count FROM Orders GROUP BY employee_id;

Key features:

  • Regardless of the WHERE filter, all necessary sums are calculated in a single query.
  • CASE inside SUM allows constructing complex multi-metric reports.
  • HAVING applies to already aggregated data, for post-filtering the grouping result.

Trick questions.

What happens if the CASE condition returns NULL instead of 0?

The SUM aggregate function ignores NULL. Therefore, if you write CASE WHEN ... THEN 1 END, the missed rows will not be counted. It's better to always explicitly set ELSE 0.

SUM(CASE WHEN status = 'processing' THEN 1 ELSE 0 END)

Can a WHERE filter change the final sum by statuses during conditional aggregation?

Yes: if the main WHERE limits the selection (e.g., WHERE region = 'west'), then the counts will be made only on the filtered data. For global totals, use a subquery or remove the filter.

Can HAVING be used to filter rows before grouping?

No. HAVING filters already grouped data by aggregates. Filter the initial rows through WHERE.

Common mistakes and anti-patterns

  • Missing ELSE, CASE returns NULL, totals are incorrect.
  • Mixing WHERE and HAVING, getting the wrong rows.
  • Using multiple queries instead of a single one with CASE.

Real-life example

Negative case

In an analytical report, used:

SUM(CASE WHEN status = 'approved' THEN 1 END)

There were many NULLs, the final sums were understated. Due to the WHERE filter, some necessary rows were lost.

Pros:

  • Short code.

Cons:

  • Incorrect results, the report was explained incorrectly to the business.

Positive case

Used:

SUM(CASE WHEN status = 'approved' THEN 1 ELSE 0 END)

The code was maintained in a single query, with filters on totals through HAVING.

Pros:

  • Accurate, transparent metrics.
  • Easy to add a new status.

Cons:

  • Queries become longer, carefulness with filters is needed.