Background:
Standard SQL was originally designed as a declarative language without elements of classical programming (loops, branching, direct jumps), but extensions like T-SQL/PLSQL have introduced constructs such as WHILE, CASE, and even an imitation of GOTO.
Problem:
Iterative operations (processing batches of records one by one) often lead to performance degradation, especially when dealing with large volumes, if they are not replaced with a "set-based" approach. Branching, CASE, and IF are very convenient, but excessive nesting can hinder code readability and predictability.
Solution:
The use of control structures should be justified only when it is impossible to rely on bulk (set) processing! For complex calculations, a small loop, trigger, or CASE is permissible. For mass processing, it is better to use window functions or UPDATE with subqueries.
Example code (T-SQL):
DECLARE @i INT = 1 WHILE (@i <= 5) BEGIN IF @i % 2 = 0 PRINT CONCAT('Even: ', @i) ELSE PRINT CONCAT('Odd: ', @i) SET @i = @i + 1 END
CASE expression:
SELECT num, CASE WHEN num % 2 = 0 THEN 'even' ELSE 'odd' END AS parity FROM numbers
Key features:
Can CASE be used for filtering rows, like WHERE?
No! CASE returns different values but does not filter rows. A common mistake is to "filter" through CASE instead of WHERE, resulting in an incorrect outcome.
What is the difference between WHILE and CURSOR — are they not the same?
WHILE is a basic loop with user-controlled variable management; CURSOR works through table records, holding a reference to a row. CURSOR is more resource-intensive and often operates much slower for large datasets.
Which approach is faster for bulk processing: WHILE loop with UPDATE or one set-based UPDATE?
In 99% of cases, a single set-based UPDATE (or INSERT) is significantly faster than a loop through individual records (even if it seems "more flexible").
-- Incorrect approach DECLARE @id INT = 1 WHILE (@id <= 100000) BEGIN UPDATE t SET flag=1 WHERE id=@id SET @id = @id + 1 END -- Correct UPDATE t SET flag=1 WHERE id BETWEEN 1 AND 100000
In a project for updating the status of a million orders, a loop with UPDATE for each id was written. The processing lasted ~8 hours. When it failed halfway, everything was lost, and it had to be fixed manually.
Pros:
Cons:
Refactored to set-based UPDATE with a single expression. Execution time dropped to 6 minutes, processing is atomic.
Pros:
Cons: