ProgrammingBackend Developer

Explain the differences between the UPDATE, MERGE, and INSERT ... ON DUPLICATE KEY UPDATE commands in SQL. In which cases is each approach preferable, and how do these tools affect performance and data integrity?

Pass interviews with Hintsage AI assistant

Answer

SQL provides various ways to update or insert data into a table:

  • UPDATE — modifies existing records that match the condition.
  • MERGE (or UPSERT) — combines the logic of inserting and updating in one command: if the row is found — it updates; if not — it adds.
  • INSERT ... ON DUPLICATE KEY UPDATE (MySQL) or INSERT ... ON CONFLICT DO UPDATE (PostgreSQL) — inserts a new row and updates the existing one on key conflicts.

When and what to use:

  • If it is certain that records do not exist yet — use INSERT.
  • For updating existing records — UPDATE with conditions.
  • If it is unknown whether a record exists, it is more efficient to use MERGE or UPSERT — this saves complexity and reduces the number of SQL queries.
  • UPSERT operations are usually faster with a large number of records and minimize race conditions.

Example (PostgreSQL):

INSERT INTO employees(id, name, salary) VALUES (1, 'Ivan', 100000) ON CONFLICT (id) DO UPDATE SET salary = EXCLUDED.salary;

Impact on performance and integrity:

  • Multiple separate UPDATE/INSERT may lead to race conditions and blocking.
  • MERGE/UPSERT is more efficient during bulk migration and maintains atomicity.

Trick Question

What happens if you try to perform an UPSERT on a unique field while simultaneously inserting and updating the same row within the same transaction?

Correct answer: A locking/conflict error may occur because the transaction attempts to change the same row twice. The operation will either not complete or will result in a deadlock error depending on the DBMS.

Example:

BEGIN; INSERT INTO users(id, name) VALUES (1, 'Oleg') ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name; UPDATE users SET name = 'Petr' WHERE id = 1; COMMIT;

With incorrect order and isolation level, a conflict of changes may arise.


Story #1

In one data migration logic, a combination of SELECT followed by INSERT or UPDATE was used, which sometimes resulted in another transaction setting data earlier, causing a "duplicate key" error. This led to frequent failures of nightly data exports, requiring the logic to be replaced with UPSERT.


Story #2

In a MySQL project, ON DUPLICATE KEY UPDATE was incorrectly applied, which did not correctly account for composite unique keys. As a result, some data was not updated and duplicates appeared. The issue was only detected on the production stand.


Story #3

In a project, UPDATE was used instead of UPSERT for synchronizing user data. In case of connection failures, part of the data was lost as the update was skipped for new users. After analysis, MERGE/UPSERT was fully implemented.