Upsert = UPDATE + INSERT
An upsert is a database operation that:
- INSERTS a row if it does not exist
- UPDATES the existing row if it already exists
👉 It is typically based on a unique key or primary key.
Why upserts exist (say this at interview)
Upserts solve:
- Duplicate inserts on retries
- Race conditions between check-then-insert
- Idempotent writes
- Concurrent writes in distributed systems
Instead of:
SELECT ...;
IF NOT EXISTS THEN INSERT;
which is unsafe),
you do one atomic operation.
Upserts in different databases
PostgreSQL (most common)
INSERT INTO users (id, email, name)
VALUES (1, 'a@mail.com', 'Stanley')
ON CONFLICT (id)
DO UPDATE SET
email = EXCLUDED.email,
name = EXCLUDED.name;
EXCLUDED = the row you tried to insert
Conflict target must have a unique constraint
Upsert vs plain INSERT (important)
❌ Plain insert:
INSERT INTO orders (id, amount) VALUES (1, 100);
Retry → error or duplicate
✅ Upsert:
INSERT INTO orders (id, amount)
VALUES (1, 100)
ON CONFLICT (id) DO NOTHING;
Retry → safe, idempotent ✔
Common upsert strategies
1. Insert or do nothing
ON CONFLICT (id) DO NOTHING
Used when:
- You only care about first write
- Deduplication
2. Insert or overwrite
DO UPDATE SET status = EXCLUDED.status;
Used when:
- You want last write wins
- Sync from external system
3. Insert or update conditionally
Very senior-level pattern.
Upserts and idempotence (connect the dots)
Upserts are one of the main tools to achieve idempotence:
| Retry | Result |
|---|---|
| First call | Insert |
| Second call | Update or no-op |
| Final state | Same |
Perfect for:
- REST POST retries
- Kafka consumers
- Payment processing
- Event sourcing projections
Typical interview pitfalls ❌
- “Upsert is select-then-insert” → ❌ race condition
- “Upsert doesn’t need unique index” → ❌ wrong
- “Transaction makes it safe” → ❌ still unsafe under concurrency
Short interview-ready answer
An upsert is a single atomic database operation that inserts a row if it doesn’t exist or updates it if it does, usually based on a unique constraint.
It’s commonly used to make writes idempotent and safe under retries and concurrency.