To simulate a migration safely before prod, you want repeatability, prod-like data, and the same execution path your deployment will use. Here’s the playbook I expect a senior backend engineer to describe in an interview.
1) Use a prod-like environment, not “my laptop”
Goal: catch perf + locking issues, not just syntax.
- Restore a recent prod snapshot (sanitized if needed) into a dedicated dev/staging DB.
- Match DB version, extensions, collation/locale, parameter group, and hardware class (at least roughly).
- Ensure the same schemas, roles, search_path, and privileges exist.
If you can’t use prod data, at least generate data volume & distribution (skew, NULL rates, hot keys) close to prod.
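When prod data is off-limits, the distribution-shaping idea can be sketched like this (a minimal illustration using SQLite; the table, the 15% NULL rate, and the 80/20 hot-key split are made-up stand-ins for whatever your prod stats say):

```python
import random
import sqlite3

random.seed(42)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, note TEXT)")

NUM_ROWS = 10_000
NULL_RATE = 0.15       # fraction of NULL notes, matched to prod column stats
HOT_KEYS = [1, 2, 3]   # a few customers receive most of the traffic (skew)

rows = []
for i in range(NUM_ROWS):
    # ~80% of rows hit the hot keys; the rest spread across a long tail
    customer = random.choice(HOT_KEYS) if random.random() < 0.8 else random.randint(4, 5000)
    note = None if random.random() < NULL_RATE else f"note-{i}"
    rows.append((customer, note))

conn.executemany("INSERT INTO orders (customer_id, note) VALUES (?, ?)", rows)
conn.commit()

null_count = conn.execute("SELECT COUNT(*) FROM orders WHERE note IS NULL").fetchone()[0]
hot_count = conn.execute("SELECT COUNT(*) FROM orders WHERE customer_id IN (1, 2, 3)").fetchone()[0]
print(null_count, hot_count)
```

The point is not the generator itself but that index builds and query plans behave very differently on skewed data than on uniform toy data, so the shape matters more than the exact numbers.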
2) Run migrations in “dry-run / plan” mode first
This catches ordering mistakes and shows what will execute.
Flyway
- Generate SQL without applying (note: dry runs are a Flyway Teams-edition feature; use an equivalent approach if you're on Community):
flyway migrate -dryRunOutput=flyway-dryrun.sql
- Always run:
- flyway validate (checksum/order problems)
- flyway info (what would run)
Liquibase
- Render SQL:
liquibase updateSQL
- Validate:
liquibase validate
- Useful sanity:
liquibase status --verbose
Why this matters: you review the actual DDL/DML that will hit prod and catch “oops, wrong schema / wrong default / wrong index name / wrong order”.
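That review can be made systematic with a small lint pass over the generated dry-run SQL. A sketch (the patterns here are illustrative examples, not an exhaustive or engine-specific list):

```python
import re

# Patterns that deserve a human look before they hit prod (illustrative only)
RISKY_PATTERNS = [
    (r"\bCREATE\s+INDEX\b(?!\s+CONCURRENTLY)", "index build without CONCURRENTLY"),
    (r"\bALTER\s+TABLE\b.*\bTYPE\b", "column type change (possible table rewrite)"),
    (r"\bDROP\s+(TABLE|COLUMN)\b", "destructive drop"),
    (r"\bUPDATE\b(?!.*\bWHERE\b)", "UPDATE without WHERE"),
]

def lint_migration_sql(sql: str) -> list[str]:
    """Return one warning per risky statement found in the dry-run output."""
    warnings = []
    for stmt in sql.split(";"):
        for pattern, reason in RISKY_PATTERNS:
            if re.search(pattern, stmt, re.IGNORECASE | re.DOTALL):
                warnings.append(f"{reason}: {stmt.strip()[:60]}")
    return warnings

dryrun = """
CREATE INDEX idx_orders_customer ON orders (customer_id);
UPDATE orders SET status = 'MIGRATED';
"""
for warning in lint_migration_sql(dryrun):
    print(warning)
```

Wiring something like this into the same CI stage that generates the dry-run SQL turns "someone should eyeball the DDL" into an enforced gate.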
3) Execute the real migration against the cloned dev DB
Dry-run ≠ truth. You need real execution to surface:
- long-running index builds
- table rewrites (e.g., in PostgreSQL, adding a column with a volatile default forces a full table rewrite)
- blocking locks / lock escalation
- statement timeouts
- deadlocks with background traffic
Important: run it via the same container/image and config as prod (same Flyway/Liquibase version, same JVM, same flags). Don’t run “manually” with a different toolchain.
4) Measure the impact (perf + locks), not just “it succeeded”
On PostgreSQL, for example:
- EXPLAIN (ANALYZE, BUFFERS) for heavy DML queries
- watch locks during migration: pg_stat_activity, pg_locks
- watch IO/CPU and duration
- if you build indexes: verify you used CREATE INDEX CONCURRENTLY for big tables (and remember: it can’t run inside a transaction block)
On MySQL:
- check whether operations are online (InnoDB online DDL), and measure metadata locks.
5) Automate the simulation in CI (ephemeral DB)
Best practice: a pipeline stage that:
- spins up a DB (Docker/Testcontainers)
- restores a snapshot or seed dataset
- runs migration tool
- runs smoke queries / integration tests
- fails the build if migration time/locks exceed thresholds (optional but great)
For Java/Spring Boot teams, this is often:
- Testcontainers Postgres/MySQL
- @SpringBootTest with Liquibase/Flyway enabled
- extra scripts to restore a dump (for “big-data” runs, do this in nightly builds)
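A stripped-down version of that gate, with an in-memory SQLite database standing in for the ephemeral container and two placeholder scripts standing in for your real Flyway/Liquibase migrations directory:

```python
import sqlite3
import time

# Ordered migration scripts; in a real pipeline these come from your
# Flyway/Liquibase migrations directory (these two are placeholders).
MIGRATIONS = [
    ("V1__create_orders", "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)"),
    ("V2__add_index", "CREATE INDEX idx_orders_status ON orders (status)"),
]

MAX_SECONDS_PER_MIGRATION = 5.0  # the threshold that fails the build

def run_migrations(conn):
    """Apply migrations in order, timing each one; raise if any is over budget."""
    timings = {}
    for name, sql in MIGRATIONS:
        start = time.perf_counter()
        conn.execute(sql)
        elapsed = time.perf_counter() - start
        timings[name] = elapsed
        if elapsed > MAX_SECONDS_PER_MIGRATION:
            raise RuntimeError(f"{name} took {elapsed:.1f}s, over budget")
    return timings

conn = sqlite3.connect(":memory:")  # stand-in for the ephemeral container DB
timings = run_migrations(conn)
# Smoke query: the schema the application expects must exist afterwards
conn.execute("SELECT id, status FROM orders")
print(sorted(timings))
```

The real pipeline swaps the in-memory DB for a Testcontainers Postgres/MySQL and the placeholder list for the actual tool invocation, but the shape — apply in order, time each step, smoke-test the result, fail loudly — is the same.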
6) Validate rollback strategy (even for “forward-only” teams)
Even if you don’t roll back migrations:
- ensure you can restore from backup quickly
- ensure “expand-migrate-contract” is used for zero-downtime:
- expand schema (nullable column, new table)
- deploy code that writes both / reads new with fallback
- backfill in controlled batches
- contract (drop old column/index) later
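The “backfill in controlled batches” step above can be sketched like this (SQLite stand-in; the users table, batch size, and the email-normalization backfill are made-up examples — in prod you’d tune the batch size and pause against observed load and replica lag):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Expand phase already done: email_norm exists as a nullable column
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, email_norm TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"User{i}@Example.com",) for i in range(1000)],
)
conn.commit()

BATCH_SIZE = 100  # small batches keep lock durations and replication lag bounded

def backfill_email_norm(conn):
    """Populate the new column one batch at a time, committing between
    batches so concurrent writers are never blocked for long."""
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, email FROM users WHERE email_norm IS NULL LIMIT ?",
            (BATCH_SIZE,),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE users SET email_norm = ? WHERE id = ?",
            [(email.lower(), row_id) for row_id, email in rows],
        )
        conn.commit()   # release locks between batches
        total += len(rows)
        time.sleep(0)   # in prod: a real pause, and check replica lag here

backfilled = backfill_email_norm(conn)
print(backfilled)
```

Only after the backfill is complete and the new read path is verified do you run the contract migration that drops the old column.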
7) What I’d say in an interview (tight answer)
“I simulate migrations by restoring a recent prod snapshot into a staging DB with the same engine/version and config, then I run the migrations using the exact same Flyway/Liquibase version and container we use in prod. First I generate SQL (dry run / updateSQL) and validate checksums, then I execute for real while monitoring locks, duration, and IO. Finally I run smoke tests and verify the schema and key queries. This is automated in CI with an ephemeral DB, and for risky changes we use expand–migrate–contract to keep deployments zero-downtime.”