Java / DB Migration Tools: How do you test migrations during pull request validation?

To validate migrations in a PR, you want two different guarantees:

  1. Correctness: migrations apply cleanly and produce the expected schema/data.
  2. Safety: they won’t brick prod (checksum edits, blocking DDL, non-idempotent backfills, etc.).

Here’s what a solid PR pipeline looks like.

1) Fast checks (every PR, always)

A. Lint + policy checks (static)

  • Enforce naming/version conventions (Flyway version uniqueness, Liquibase id uniqueness).
  • Block “editing old migrations” (append-only rule).
  • SQL linting (optional) + forbidden patterns:
    • Postgres: CREATE INDEX without CONCURRENTLY for “large-table” migrations
    • ALTER TABLE ... SET NOT NULL without a backfill step
    • DROP COLUMN without a two-step contract pattern
    • BEGIN; ... CREATE INDEX CONCURRENTLY ... COMMIT; (invalid in Postgres)
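The forbidden-pattern check can be a naive, line-oriented lint over the migration SQL — a sketch only (real projects often reach for a dedicated SQL linter instead), with made-up violation messages:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Naive lint for risky Postgres DDL patterns in migration files.
// This is a sketch of the shape of the check, not a full SQL parser.
class MigrationLint {

    private static final Pattern CREATE_INDEX =
            Pattern.compile("(?i)\\bCREATE\\s+(UNIQUE\\s+)?INDEX\\b");
    private static final Pattern CONCURRENTLY =
            Pattern.compile("(?i)\\bCONCURRENTLY\\b");
    private static final Pattern SET_NOT_NULL =
            Pattern.compile("(?i)\\bSET\\s+NOT\\s+NULL\\b");
    private static final Pattern DROP_COLUMN =
            Pattern.compile("(?i)\\bDROP\\s+COLUMN\\b");
    private static final Pattern BEGIN_TX =
            Pattern.compile("(?i)^\\s*BEGIN\\s*;");

    static List<String> lint(String sql) {
        List<String> violations = new ArrayList<>();
        boolean inExplicitTx = false;
        for (String line : sql.split("\n")) {
            if (BEGIN_TX.matcher(line).find()) inExplicitTx = true;
            boolean createIndex = CREATE_INDEX.matcher(line).find();
            boolean concurrently = CONCURRENTLY.matcher(line).find();
            if (createIndex && !concurrently)
                violations.add("CREATE INDEX without CONCURRENTLY: " + line.trim());
            if (concurrently && inExplicitTx)
                violations.add("CONCURRENTLY inside explicit transaction: " + line.trim());
            if (SET_NOT_NULL.matcher(line).find())
                violations.add("SET NOT NULL without documented backfill: " + line.trim());
            if (DROP_COLUMN.matcher(line).find())
                violations.add("DROP COLUMN without expand/contract steps: " + line.trim());
            if (line.matches("(?i).*\\bCOMMIT\\s*;.*")) inExplicitTx = false;
        }
        return violations;
    }
}
```

Wire this into a unit test that scans `src/main/resources/db/migration` and fails the PR on any non-empty result.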

B. Tool validation

  • Flyway: flyway validate (checksum, ordering, locations)
  • Liquibase: liquibase validate

C. Dry-run artifact (reviewable)

  • Flyway: generate dry-run SQL (-dryRunOutput=..., a Flyway Teams feature)
  • Liquibase: updateSQL
    Publish as a PR artifact so reviewers can see what will run.

2) Apply migrations on a fresh ephemeral database (every PR)

Spin up DB via Docker/Testcontainers and run migrations “from empty”.

What you verify

  • Migration tool exits 0
  • Schema history table reflects expected state
  • Basic smoke queries succeed (tables exist, indexes exist, constraints exist)

Implementation options (pick one)

  • Testcontainers in a JUnit test (my favorite in Java projects)
  • Docker Compose in CI job
  • Ephemeral DB service provided by CI

Key requirement: same DB engine/version as prod.
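The Testcontainers option can look like the sketch below — assuming Flyway with migrations under `classpath:db/migration`, the Testcontainers PostgreSQL module, and JUnit 5 on the test classpath. The image tag is an example; pin it to whatever prod actually runs.

```java
// Sketch of a "migrate from empty" PR check with Testcontainers + Flyway.
import org.flywaydb.core.Flyway;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import static org.junit.jupiter.api.Assertions.assertTrue;

@Testcontainers
class MigrationsApplyCleanlyTest {

    @Container  // same engine/version as prod
    static final PostgreSQLContainer<?> postgres =
            new PostgreSQLContainer<>("postgres:16.3");

    @Test
    void migrationsApplyFromEmpty() throws Exception {
        Flyway flyway = Flyway.configure()
                .dataSource(postgres.getJdbcUrl(),
                            postgres.getUsername(),
                            postgres.getPassword())
                .locations("classpath:db/migration")
                .load();
        flyway.migrate(); // throws on any failed migration => test fails

        // Smoke check: the schema history table reflects a successful run.
        try (Connection c = DriverManager.getConnection(
                     postgres.getJdbcUrl(), postgres.getUsername(), postgres.getPassword());
             ResultSet rs = c.createStatement().executeQuery(
                     "SELECT bool_and(success) FROM flyway_schema_history")) {
            assertTrue(rs.next() && rs.getBoolean(1));
        }
    }
}
```

Add your own smoke queries after `migrate()` — tables, indexes, constraints — against the same connection.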

3) Apply migrations on a “baseline snapshot” (nightly or for risky PRs)

A from-empty run doesn’t catch:

  • lock duration
  • table rewrites
  • big backfills
  • collation/encoding edge cases
  • real data violations (duplicates/nulls)

So add a heavier job that:

  • restores a sanitized prod snapshot (or a representative dataset)
  • runs the new migrations
  • measures duration and checks for blocking/timeout failures

Many teams run this nightly, or only when db/** changed.
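The "measures duration" part can be as simple as wrapping each migration step in a stopwatch with a failure threshold — a stdlib-only sketch (the step name and budget are made-up examples; in the real job the `Runnable` would invoke the migration tool against the restored snapshot):

```java
import java.time.Duration;

// Minimal "perf budget" wrapper for the snapshot job: run a migration step,
// measure how long it took, and fail the build when it exceeds the budget.
class MigrationBudget {

    /** Runs the step and returns its duration; throws if it exceeds the budget. */
    static Duration runWithBudget(String step, Duration budget, Runnable migration) {
        long start = System.nanoTime();
        migration.run();
        Duration took = Duration.ofNanos(System.nanoTime() - start);
        if (took.compareTo(budget) > 0) {
            throw new IllegalStateException(step + " took " + took.toMillis()
                    + " ms, budget " + budget.toMillis() + " ms");
        }
        return took;
    }
}
```

Pair this with a short `lock_timeout` / `statement_timeout` set on the migration session so a blocking DDL fails fast instead of queueing behind real traffic.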

4) Assertions: schema + data expectations

After migration, run checks:

  • Expected tables/columns exist
  • Indexes exist and are valid
  • Critical constraints exist
  • For data migrations: spot-check counts, null rates, uniqueness

In Java, this can be:

  • a small set of SQL assertions
  • or integration tests that load Spring context and hit a repository/query
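The data-migration spot-checks reduce to a few plain predicates; a sketch where the inputs (row counts, null counts, candidate-key values) would come from ordinary `SELECT` queries against the migrated database — the thresholds are illustrative:

```java
import java.util.Collection;
import java.util.HashSet;

// Post-migration data-quality checks as plain predicates. In the real test,
// feed these from SQL: SELECT count(*), count(*) FILTER (WHERE col IS NULL), etc.
class DataChecks {

    /** Fails if the null rate of a backfilled column exceeds the tolerance. */
    static void assertNullRateBelow(long nulls, long total, double maxRate) {
        double rate = total == 0 ? 0.0 : (double) nulls / total;
        if (rate > maxRate)
            throw new AssertionError("null rate " + rate + " exceeds " + maxRate);
    }

    /** Fails if values that must be unique (e.g. a new natural key) repeat. */
    static <T> void assertUnique(Collection<T> values) {
        if (new HashSet<>(values).size() != values.size())
            throw new AssertionError("duplicate values found");
    }
}
```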

5) “Rollback testing” (what it means in Flyway world)

Not actual rollback. Instead:

  • verify forward-only safety
  • verify compatibility (expand–migrate–contract):
    • run migrations
    • run app tests against the migrated schema
    • (optional advanced) run the previous app version against the expanded schema to ensure rollback compatibility

That last one is a very senior move.

6) CI pipeline outline (what I’d implement)

PR job (fast):

  1. Policy lint: no edits to applied migrations + naming uniqueness
  2. Tool validate
  3. Dry-run SQL artifact
  4. Testcontainers DB → apply migrations from empty → smoke checks
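Step 1's append-only rule can be sketched as a checksum comparison against a committed manifest (filename → sha256): a changed hash for an already-recorded migration means someone edited a shipped file, while new files are fine. The manifest mechanism here is an assumption — Flyway's own `validate` does an equivalent checksum comparison against the schema history table.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;
import java.util.Map;

// Policy check: no edits to previously applied migrations (append-only rule).
class AppendOnlyCheck {

    static String sha256(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(md.digest(content.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the names of previously recorded migrations whose content changed. */
    static List<String> editedMigrations(Map<String, String> manifest,   // name -> recorded hash
                                         Map<String, String> current) {  // name -> file content
        List<String> edited = new ArrayList<>();
        for (Map.Entry<String, String> e : manifest.entrySet()) {
            String content = current.get(e.getKey());
            if (content != null && !sha256(content).equals(e.getValue()))
                edited.add(e.getKey());
        }
        return edited;
    }
}
```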

Optional heavy job (nightly / db-changes only):
  5. Restore snapshot → apply migrations → perf/lock thresholds → smoke queries

Interview-ready answer

“In PR validation, we run validate + generate dry-run SQL for review, then apply migrations to an ephemeral DB (same engine/version as prod) and run smoke assertions. We also enforce append-only migrations and block risky patterns like transactional + non-transactional DDL mixes. For more confidence, we run a heavier job on a sanitized prod snapshot nightly or when db files change, to catch lock/perf and data-quality issues.”