To validate migrations in a PR, you want two different guarantees:
- Correctness: migrations apply cleanly and produce the expected schema/data.
- Safety: they won’t brick prod (checksum edits, blocking DDL, non-idempotent backfills, etc.).
Here’s what a solid PR pipeline looks like.
1) Fast checks (every PR, always)
A. Lint + policy checks (static)
- Enforce naming/version conventions (Flyway version uniqueness, Liquibase id uniqueness).
- Block “editing old migrations” (append-only rule).
- SQL linting (optional) + forbidden patterns:
- Postgres:
  - `CREATE INDEX` without `CONCURRENTLY` for large-table migrations
  - `ALTER TABLE ... SET NOT NULL` without a backfill step
  - `DROP COLUMN` without a two-step contract pattern
  - `BEGIN; ... CREATE INDEX CONCURRENTLY ... COMMIT;` (invalid in Postgres)
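A minimal sketch of such a policy check, assuming migrations are plain `.sql` files scanned in CI. The class and method names are illustrative, not part of any tool's API; a real linter (e.g. squawk) parses SQL instead of pattern-matching:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative regex-based lint for risky Postgres DDL patterns.
public class MigrationLint {

    private static final Pattern CREATE_INDEX =
            Pattern.compile("(?i)\\bCREATE\\s+(UNIQUE\\s+)?INDEX\\b(?!\\s+CONCURRENTLY)");
    private static final Pattern SET_NOT_NULL =
            Pattern.compile("(?i)\\bSET\\s+NOT\\s+NULL\\b");
    private static final Pattern DROP_COLUMN =
            Pattern.compile("(?i)\\bDROP\\s+COLUMN\\b");

    public static List<String> findViolations(String sql) {
        List<String> violations = new ArrayList<>();
        if (CREATE_INDEX.matcher(sql).find()) {
            violations.add("CREATE INDEX without CONCURRENTLY");
        }
        if (SET_NOT_NULL.matcher(sql).find()) {
            violations.add("SET NOT NULL (needs a prior backfill step)");
        }
        if (DROP_COLUMN.matcher(sql).find()) {
            violations.add("DROP COLUMN (use expand/contract instead)");
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(findViolations("CREATE INDEX idx_a ON t (a);"));
        System.out.println(findViolations("CREATE INDEX CONCURRENTLY idx_a ON t (a);"));
    }
}
```

Running this in CI over every new migration file and failing the build on any violation is usually enough to stop the most common foot-guns.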
B. Tool validation
- Flyway: `flyway validate` (checksums, ordering, locations)
- Liquibase: `liquibase validate`
C. Dry-run artifact (reviewable)
- Flyway: generate dry-run SQL (`-dryRunOutput=...`)
- Liquibase: `updateSQL`
Publish as a PR artifact so reviewers can see what will run.
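Wired together, the fast checks might look like this in a GitHub Actions job (a sketch; job and path names are placeholders, and note that Flyway's dry-run output is a Teams/Enterprise feature):

```yaml
# Hypothetical PR job: validate migrations and publish a reviewable dry run.
migration-checks:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Validate migrations
      run: flyway validate -locations=filesystem:db/migration
    - name: Generate dry-run SQL            # requires Flyway Teams/Enterprise
      run: flyway migrate -dryRunOutput=dryrun.sql
    - name: Publish dry run for reviewers
      uses: actions/upload-artifact@v4
      with:
        name: migration-dry-run
        path: dryrun.sql
```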
2) Apply migrations on a fresh ephemeral database (every PR)
Spin up DB via Docker/Testcontainers and run migrations “from empty”.
What you verify
- Migration tool exits 0
- Schema history table reflects expected state
- Basic smoke queries succeed (tables exist, indexes exist, constraints exist)
Implementation options (pick one)
- Testcontainers in a JUnit test (my favorite in Java projects)
- Docker Compose in CI job
- Ephemeral DB service provided by CI
Key requirement: same DB engine/version as prod.
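The Testcontainers option could look roughly like this (a sketch, assuming the Testcontainers and Flyway dependencies are on the test classpath and a Postgres image pinned to prod's version):

```java
import org.flywaydb.core.Flyway;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.PostgreSQLContainer;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import static org.junit.jupiter.api.Assertions.assertTrue;

// Sketch: apply all migrations to an empty Postgres matching prod's version,
// then smoke-check the resulting schema history.
class MigrationFromEmptyTest {

    @Test
    void migrationsApplyCleanlyFromEmpty() throws Exception {
        try (PostgreSQLContainer<?> pg = new PostgreSQLContainer<>("postgres:16")) {
            pg.start();

            Flyway.configure()
                  .dataSource(pg.getJdbcUrl(), pg.getUsername(), pg.getPassword())
                  .load()
                  .migrate();   // fails the test if any migration errors

            try (Connection c = DriverManager.getConnection(
                     pg.getJdbcUrl(), pg.getUsername(), pg.getPassword());
                 ResultSet rs = c.createStatement().executeQuery(
                     "SELECT COUNT(*) FROM flyway_schema_history WHERE success")) {
                rs.next();
                assertTrue(rs.getInt(1) > 0, "at least one migration applied");
            }
        }
    }
}
```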
3) Apply migrations on a “baseline snapshot” (nightly or for risky PRs)
From-empty doesn’t catch:
- lock duration
- table rewrites
- big backfills
- collation/encoding edge cases
- real data violations (duplicates/nulls)
So add a heavier job that:
- restores a sanitized prod snapshot (or a representative dataset)
- runs the new migrations
- measures duration and checks for blocking/timeout failures
Many teams run this nightly, or only when files under `db/**` change.
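One cheap guardrail for this heavier job: run the migrations under explicit session timeouts so a blocking DDL fails fast instead of hanging CI. These are standard Postgres settings; the thresholds are illustrative:

```sql
-- Fail fast if a migration can't acquire its locks or runs too long.
SET lock_timeout = '5s';          -- don't queue behind long-lived locks
SET statement_timeout = '15min';  -- cap per-statement runtime
```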
4) Assertions: schema + data expectations
After migration, run checks:
- Expected tables/columns exist
- Indexes exist and are valid
- Critical constraints exist
- For data migrations: spot-check counts, null rates, uniqueness
In Java, this can be:
- a small set of SQL assertions
- or integration tests that load Spring context and hit a repository/query
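The SQL-assertion flavor can be as simple as queries whose results CI compares against expectations (object names here are hypothetical):

```sql
-- Schema expectations: each should return 1.
SELECT COUNT(*) FROM information_schema.tables
 WHERE table_name = 'orders';
SELECT COUNT(*) FROM pg_indexes
 WHERE indexname = 'orders_customer_id_idx';

-- Data expectation after a backfill: no rows left unmigrated (expect 0).
SELECT COUNT(*) FROM orders WHERE customer_ref IS NULL;
```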
5) “Rollback testing” (what it means in Flyway world)
Not actual rollback. Instead:
- verify forward-only safety
- verify compatibility (expand–migrate–contract):
- run migrations
- run app tests against the migrated schema
- (optional advanced) run the previous app version against the expanded schema to ensure rollback compatibility
That last one is a very senior move.
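For reference, the expand–migrate–contract shape for replacing a column looks like this (hypothetical names; each step ships in a separate release):

```sql
-- Expand (release N): add the new column; the old app keeps working.
ALTER TABLE users ADD COLUMN email_normalized text;

-- Migrate (release N, batched backfill outside the DDL transaction):
UPDATE users SET email_normalized = lower(email)
 WHERE email_normalized IS NULL;

-- Contract (release N+1, once no running app version reads the old column):
ALTER TABLE users DROP COLUMN email;
```

Because the old column survives through release N, rolling the app back during that window is always safe.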
6) CI pipeline outline (what I’d implement)
PR job (fast):
- Policy lint: no edits to applied migrations + naming uniqueness
- Tool validate
- Dry-run SQL artifact
- Testcontainers DB → apply migrations from empty → smoke checks
Optional heavy job (nightly / db-changes only):
- Restore snapshot → apply migrations → perf/lock thresholds → smoke queries
Interview-ready answer
“In PR validation, we run validate + generate dry-run SQL for review, then apply migrations to an ephemeral DB (same engine/version as prod) and run smoke assertions. We also enforce append-only migrations and block risky patterns like transactional + non-transactional DDL mixes. For more confidence, we run a heavier job on a sanitized prod snapshot nightly or when db files change, to catch lock/perf and data-quality issues.”