Bulk UPDATE is a critical operation when modifying a large number of rows in related tables. The typical implementation is an UPDATE with a subquery or a JOIN. The problem is that a bulk update that does not control its execution order locks many rows at once, triggers lock escalation, and can deadlock when several updates run concurrently.
Solution:
Example code (PostgreSQL):
UPDATE Orders o
SET status = 'archived'
FROM Customers c
WHERE o.customer_id = c.id
  AND c.closed = TRUE
  AND o.status != 'archived';
Or in batches (repeat the statement until it reports 0 updated rows; the ORDER BY makes batch contents, and therefore lock order, deterministic):

WITH upd AS (
    SELECT o.id
    FROM Orders o
    JOIN Customers c ON o.customer_id = c.id
    WHERE c.closed = TRUE AND o.status != 'archived'
    ORDER BY o.id
    LIMIT 10000
)
UPDATE Orders
SET status = 'archived'
WHERE id IN (SELECT id FROM upd);
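A single batched statement only archives one chunk, so a driver loop has to rerun it until nothing is left. Below is a minimal sketch of that loop in Python against an in-memory SQLite database; the schema and data are illustrative, but the loop logic carries over unchanged to a PostgreSQL driver:

```python
import sqlite3

# Illustrative schema mirroring the article's Orders/Customers tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, closed INTEGER);
CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
INSERT INTO Customers VALUES (1, 1), (2, 0);
INSERT INTO Orders VALUES (10, 1, 'open'), (11, 1, 'open'), (12, 2, 'open');
""")

BATCH = 1  # tiny batch for the demo; the article uses 10000

def archive_batch(conn, batch_size):
    """Archive one batch of orders for closed customers; return rows changed."""
    cur = conn.execute("""
        UPDATE Orders SET status = 'archived'
        WHERE id IN (
            SELECT o.id FROM Orders o
            JOIN Customers c ON o.customer_id = c.id
            WHERE c.closed = 1 AND o.status != 'archived'
            ORDER BY o.id
            LIMIT ?
        )""", (batch_size,))
    conn.commit()  # commit per batch so locks are released between chunks
    return cur.rowcount

total = 0
while True:
    n = archive_batch(conn, BATCH)
    if n == 0:  # nothing left to archive -> stop
        break
    total += n
```

Committing after every batch is the point of the exercise: each chunk holds its row locks only briefly, instead of one long transaction locking everything at once.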
Possible follow-up questions:
What will happen if you run UPDATEs on the same tables simultaneously, without separating ranges or using mutually exclusive filters?
A deadlock is likely: the processes lock the same rows and wait for each other. To avoid this, batches must not overlap, or they must be executed strictly sequentially.
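One way to make "batches must not overlap" concrete is to partition the key space up front, so each worker owns a disjoint id interval and concurrent runs can never lock the same rows. A sketch with SQLite and illustrative names (here the ranges are run sequentially; in production each range would go to its own worker):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO Orders VALUES (?, 'open')",
                 [(i,) for i in range(1, 9)])

# Disjoint id intervals: no two workers ever touch the same rows,
# so overlapping locks (and hence deadlocks) are impossible by construction.
RANGES = [(1, 4), (5, 8)]

for lo, hi in RANGES:
    conn.execute(
        "UPDATE Orders SET status = 'archived' WHERE id BETWEEN ? AND ?",
        (lo, hi),
    )
    conn.commit()
```

In PostgreSQL the same idea is often combined with SELECT ... FOR UPDATE SKIP LOCKED, which lets concurrent workers silently skip rows another worker already holds.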
Is there a difference between UPDATE via JOIN and via a subquery for a bulk status change?
With appropriate indexes in place, the difference comes down to readability and, in some DBMSs, to performance: the JOIN form usually lets the optimizer build a better plan, so it is often faster.
When is it appropriate to use TRUNCATE/DELETE instead of UPDATE?
When business logic allows it — for example, when archived records must be physically removed or a table reset entirely, rather than just having a status flag changed. For bulk status updates, however, only UPDATE is suitable.
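The distinction can be shown side by side (SQLite in-memory, illustrative names): UPDATE flips a flag and keeps the rows, while DELETE removes them physically (TRUNCATE in PostgreSQL resets the whole table at once, without row-by-row deletion):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO Orders VALUES (?, ?)",
                 [(1, 'archived'), (2, 'open'), (3, 'archived')])

# Bulk status change: the rows survive, only the flag changes.
conn.execute("UPDATE Orders SET status = 'closed' WHERE status = 'open'")

# Physical removal: the archived rows are gone for good.
conn.execute("DELETE FROM Orders WHERE status = 'archived'")
conn.commit()
```

Which one is right depends entirely on whether the business ever needs the "archived" rows again.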
In a large online store, multiple UPDATEs changing order and customer statuses were run simultaneously, without dividing the work into ranges. The result: deadlocks, repeated forced rollbacks, and lost uncommitted data.
Pros: trivial to launch — no batching or scheduling logic is needed.
Cons: deadlocks, repeated rollbacks, lost data, unpredictable lock contention.
Large result sets were split into batches, the batches were executed strictly sequentially, and only the rows matching the filter were touched.
Pros: no overlapping locks and therefore no deadlocks; predictable load; a failed batch can be retried without redoing the rest.
Cons: longer total runtime and extra orchestration code for batching and sequencing.