Java.DBMigrationTools.How can you ensure a schema migration is idempotent?

Idempotent means: running the migration twice leaves the DB in the same final state (or the second run is a no-op). In strict tools like Flyway, versioned migrations normally run once, but idempotency still matters for: reruns after partial failures, manual re-application, multi-tenant/shard rollouts, hotfix scripts, and repeatables.

1) Prefer “design-level idempotency” over “IF EXISTS everywhere”

Best approach: expand–migrate–contract

  • Expand (additive, safe if already present)
  • Migrate (backfill in batches, safe to resume)
  • Contract (drop old stuff later, guarded)

This naturally supports reruns because each step is either additive or resumable.

2) Use conditional DDL (DB-specific)

PostgreSQL examples

Create table

CREATE TABLE IF NOT EXISTS users (...);

Add column (PG 11+ supports IF NOT EXISTS)

ALTER TABLE users ADD COLUMN IF NOT EXISTS email_verified boolean;

Create index (PG supports IF NOT EXISTS; CONCURRENTLY cannot be in a txn)

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email_verified
ON users (email_verified);

Add constraint (no IF NOT EXISTS for all constraint types)
Pattern:

  • check catalog first, then add via DO block:
DO $$
BEGIN
  IF NOT EXISTS (
    SELECT 1
    FROM pg_constraint
    WHERE conname = 'chk_users_email_verified'
  ) THEN
    ALTER TABLE users
      ADD CONSTRAINT chk_users_email_verified
      CHECK (email_verified IN (true, false));
  END IF;
END $$;

MySQL examples

MySQL idempotency is trickier because ALTER TABLE ... ADD COLUMN IF NOT EXISTS depends on version (8.0.29+ supports some IF NOT EXISTS forms; behavior varies). Safer pattern:

  • query information_schema then conditionally execute in your migration framework (or use tool-specific preconditions).

3) Make data migrations resumable + safe

This is where most “non-idempotent” pain comes from.

Use deterministic updates

  • Avoid “update all rows blindly” without a guard.
  • Use predicates like:
UPDATE users
SET email_verified = false
WHERE email_verified IS NULL;

Second run becomes a no-op.

Upserts instead of inserts

  • Postgres: INSERT ... ON CONFLICT DO NOTHING/UPDATE
  • MySQL: INSERT ... ON DUPLICATE KEY UPDATE

Backfill in batches with checkpoints

For large tables:

  • backfill with LIMIT windows
  • store progress in a small “migration checkpoint” table, or use last processed id.

4) Separate transactional vs non-transactional steps

If you’re on Postgres:

  • CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
    So an “idempotent” approach also means structuring migrations correctly:
  • migration A: schema changes (transactional)
  • migration B: concurrently build index (non-transactional)

If you retry, each step must be safe on its own.

5) Use tool features: Preconditions and “mark ran”

Liquibase (best support here)

  • preConditions like tableExists, columnExists, indexExists
  • Use onFail="MARK_RAN" when appropriate (careful: it can hide real problems)
    This is the cleanest way to encode idempotency without DB-specific SQL hacks.

Flyway

Flyway doesn’t have rich preconditions for versioned migrations; you typically:

  • write conditional SQL (DB-specific)
  • keep migrations small and fix-forward
  • use repeatables for objects that should converge (views/functions)

6) Build “idempotency tests”

In CI:

  1. apply migrations to an empty DB
  2. run migrate again
  3. assert:
    • no changes (or tool reports “0 migrations applied”)
    • repeatables behave as expected
    • no failures due to “already exists”

This catches a ton of real-world issues.

7) What I’d say in an interview

“I make migrations idempotent by designing them as additive steps and resumable backfills: use expand–migrate–contract, guard DDL with existence checks or Liquibase preconditions, and write DML so re-running becomes a no-op (e.g., only update NULLs, use upserts). I also split transactional and non-transactional operations like Postgres concurrent indexes into separate migrations and test by applying migrations twice in CI.”

This entry was posted in Без рубрики. Bookmark the permalink.