Java.DBMigrationTools.What is upsert ?

Upsert = UPDATE + INSERT

An upsert is a database operation that:

  • INSERTS a row if it does not exist
  • UPDATES the existing row if it already exists

👉 It is typically based on a unique key or primary key.


Why upserts exist (say this at interview)

Upserts solve:

  • Duplicate inserts on retries
  • Race conditions between check-then-insert
  • Idempotent writes
  • Concurrent writes in distributed systems

Instead of:

SELECT ...;
IF NOT EXISTS THEN INSERT;

which is unsafe),

you do one atomic operation.


Upserts in different databases

PostgreSQL (most common)

INSERT INTO users (id, email, name)
VALUES (1, 'a@mail.com', 'Stanley')
ON CONFLICT (id)
DO UPDATE SET
    email = EXCLUDED.email,
    name  = EXCLUDED.name;

EXCLUDED = the row you tried to insert

Conflict target must have a unique constraint

Upsert vs plain INSERT (important)

❌ Plain insert:

INSERT INTO orders (id, amount) VALUES (1, 100);

Retry → error or duplicate

✅ Upsert:

INSERT INTO orders (id, amount)
VALUES (1, 100)
ON CONFLICT (id) DO NOTHING;

Retry → safe, idempotent ✔


Common upsert strategies

1. Insert or do nothing

ON CONFLICT (id) DO NOTHING

Used when:

  • You only care about first write
  • Deduplication

2. Insert or overwrite

DO UPDATE SET status = EXCLUDED.status;

Used when:

  • You want last write wins
  • Sync from external system

3. Insert or update conditionally

Very senior-level pattern.


Upserts and idempotence (connect the dots)

Upserts are one of the main tools to achieve idempotence:

RetryResult
First callInsert
Second callUpdate or no-op
Final stateSame

Perfect for:

  • REST POST retries
  • Kafka consumers
  • Payment processing
  • Event sourcing projections

Typical interview pitfalls ❌

  • “Upsert is select-then-insert” → ❌ race condition
  • “Upsert doesn’t need unique index” → ❌ wrong
  • “Transaction makes it safe” → ❌ still unsafe under concurrency

Short interview-ready answer

An upsert is a single atomic database operation that inserts a row if it doesn’t exist or updates it if it does, usually based on a unique constraint.
It’s commonly used to make writes idempotent and safe under retries and concurrency.

This entry was posted in Без рубрики. Bookmark the permalink.