For zero-downtime migrations with Flyway/Liquibase, the tool is not the magic; the migration strategy is. The standard interview answer is Expand → Migrate → Contract (aka parallel change), plus operational guardrails.
Core pattern: Expand → Migrate → Contract
1) Expand (backward compatible DB change)
Goal: deploy DB changes that old app + new app can both live with.
Typical actions:
- Add new nullable columns / tables
- Add new indexes (concurrently where supported)
- Add triggers / views to keep old/new in sync (optional)
- Add new constraints as NOT VALID / disabled first (Postgres) and validate later
Rules:
- Never drop/rename stuff the old version still uses
- Avoid locking operations during peak (e.g., big table ALTERs)
Example (rename column safely):
- Add new_col
- Keep old_col for now
- (Optional) trigger to mirror writes
- Release app that writes both
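A minimal sketch of the Expand step for the rename example, assuming a hypothetical users table and using SQLite only so it runs anywhere (trigger syntax differs in Postgres and MySQL). It adds the nullable column and a pair of triggers that mirror writes from an old app version that only knows old_col.

```python
import sqlite3

# Hypothetical names (users, old_col, new_col); SQLite stands in for the real DB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, old_col TEXT);
    INSERT INTO users (id, old_col) VALUES (1, 'alice');

    -- Expand: additive and nullable, so the old app version is unaffected.
    ALTER TABLE users ADD COLUMN new_col TEXT;

    -- Optional: mirror writes made by the old app version into new_col.
    CREATE TRIGGER users_mirror_insert AFTER INSERT ON users
    WHEN NEW.new_col IS NULL
    BEGIN
        UPDATE users SET new_col = NEW.old_col WHERE id = NEW.id;
    END;

    CREATE TRIGGER users_mirror_update AFTER UPDATE OF old_col ON users
    BEGIN
        UPDATE users SET new_col = NEW.old_col WHERE id = NEW.id;
    END;
""")

# The old app version still writes only old_col.
conn.execute("INSERT INTO users (id, old_col) VALUES (2, 'bob')")
conn.execute("UPDATE users SET old_col = 'alice2' WHERE id = 1")
rows = conn.execute("SELECT id, old_col, new_col FROM users ORDER BY id").fetchall()
```

The triggers only cover rows that are written after the Expand step. Rows that are never touched again still need the backfill in the Migrate phase.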
2) Migrate (data backfill + dual-write / dual-read)
Goal: move data gradually while both versions run.
Approaches:
- Backfill in chunks (job/batch), not in a single huge migration
- Application does:
  - dual-write (write to old + new)
  - read-new-fallback-old (or the opposite), depending on risk
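The application side can be sketched roughly as below. The Flags class, the save_name/load_name helpers and the users table are hypothetical, and the sketch assumes a NULL in new_col means "not migrated yet" rather than a legitimate value.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Flags:
    # Hypothetical flags; a real service would load these from its flag system.
    dual_write: bool = True
    read_new: bool = True


def save_name(conn, flags, user_id, name):
    """Always write old_col; also write new_col while dual-write is enabled."""
    if flags.dual_write:
        conn.execute(
            "UPDATE users SET old_col = ?, new_col = ? WHERE id = ?",
            (name, name, user_id),
        )
    else:
        conn.execute("UPDATE users SET old_col = ? WHERE id = ?", (name, user_id))
    conn.commit()


def load_name(conn, flags, user_id):
    """Read new_col first; fall back to old_col for rows not yet migrated."""
    old, new = conn.execute(
        "SELECT old_col, new_col FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    # Assumption: NULL in new_col means "not migrated yet", not a real value.
    if flags.read_new and new is not None:
        return new
    return old


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, old_col TEXT, new_col TEXT)")
conn.execute("INSERT INTO users (id, old_col) VALUES (1, 'alice')")
conn.commit()

flags = Flags()
before = load_name(conn, flags, 1)   # row not migrated yet, so old_col is used
save_name(conn, flags, 1, "alicia")  # dual-write fills both columns
after = load_name(conn, flags, 1)
```

Writing both columns in one statement keeps them consistent within a single transaction. Turning read_new off is the quick way back if the new path misbehaves.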
Operational tips:
- Make backfill idempotent
- Track progress (marker table / cursor / timestamps)
- Throttle to protect DB (sleep, limited batch size)
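The tips above could look like this in a backfill job. It is a sketch under assumptions: the backfill_progress marker table, the job name and the users columns are made up for illustration, and SQLite stands in for the real database.

```python
import sqlite3
import time

JOB = "users_new_col"  # hypothetical job name used as the progress key


def backfill(conn, batch_size=2, pause_seconds=0.0, max_batches=None):
    """Copy old_col into new_col in small batches; return True when finished."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backfill_progress "
        "(job TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO backfill_progress (job, last_id) VALUES (?, 0)", (JOB,))
    conn.commit()
    batches = 0
    while max_batches is None or batches < max_batches:
        (last_id,) = conn.execute(
            "SELECT last_id FROM backfill_progress WHERE job = ?", (JOB,)
        ).fetchone()
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?", (last_id, batch_size)
        )]
        if not ids:
            return True
        with conn:  # one short transaction per batch: data and cursor move together
            # Idempotent: rows already written by the new app version are skipped.
            conn.execute(
                "UPDATE users SET new_col = old_col "
                "WHERE id BETWEEN ? AND ? AND new_col IS NULL",
                (ids[0], ids[-1]),
            )
            conn.execute("UPDATE backfill_progress SET last_id = ? WHERE job = ?", (ids[-1], JOB))
        batches += 1
        time.sleep(pause_seconds)  # throttle between batches
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, old_col TEXT, new_col TEXT)")
conn.executemany(
    "INSERT INTO users (id, old_col, new_col) VALUES (?, ?, ?)",
    [(1, "a", None), (2, "b", None), (3, "c", "c-new"), (4, "d", None), (5, "e", None)],
)
conn.commit()

interrupted = backfill(conn, batch_size=2, max_batches=1)  # simulate a stop after one batch
finished = backfill(conn, batch_size=2)                    # resumes from the saved cursor
```

Because the cursor is committed in the same transaction as the batch, a crash between batches loses no progress, and a rerun repeats no work.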
3) Contract (remove old stuff)
Only when:
- All nodes are on the new version
- Backfill is complete
- Monitoring confirms no reads/writes to old paths
Actions:
- Drop old columns/tables
- Remove triggers
- Enforce constraints fully (NOT NULL, FK validation)
- Cleanup indexes
This step is often a separate release days later.
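One way to make the "backfill is complete" condition explicit is a guard that refuses to hand out the cleanup statements while rows are still pending. This is a sketch with hypothetical names. It checks the data condition only: node versions and traffic to old paths still have to be confirmed through deployment state and monitoring.

```python
import sqlite3


def pending_rows(conn):
    """Count rows whose data still lives only in old_col."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE old_col IS NOT NULL AND new_col IS NULL"
    ).fetchone()
    return count


def contract_statements(conn):
    """Return the cleanup SQL only when the backfill check passes."""
    remaining = pending_rows(conn)
    if remaining:
        raise RuntimeError(f"{remaining} rows not backfilled; refusing to contract")
    # Illustrative statements; exact syntax depends on the database dialect.
    return [
        "DROP TRIGGER IF EXISTS users_mirror_update",
        "ALTER TABLE users DROP COLUMN old_col",
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, old_col TEXT, new_col TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'a', 'a'), (2, 'b', NULL)")

try:
    contract_statements(conn)
    blocked = False
except RuntimeError:
    blocked = True  # row 2 is not backfilled yet

conn.execute("UPDATE users SET new_col = old_col WHERE new_col IS NULL")
plan = contract_statements(conn)
```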
Tooling: how Flyway/Liquibase fits
Use migrations for schema, not heavy backfills
Best practice:
- Schema changes in Flyway/Liquibase migrations
- Large data backfills in a controlled job (app job, one-off worker, or admin service)
Why heavy backfills don't belong inside a migration:
- Deploy pipelines time out
- DB locks / long transactions
- Harder to retry safely
- Rollback is messy
Small data fixes are OK in migrations if fast + idempotent.
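As an illustration of "fast + idempotent", here is a small data fix whose WHERE clause only matches rows that still need it, so running it a second time changes nothing. The orders table and the fix itself are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(1, None), (2, "paid"), (3, None)],
)

# Hypothetical fix: only rows still missing a status are touched,
# so a retry after a partial failure is harmless.
FIX = "UPDATE orders SET status = 'unknown' WHERE status IS NULL"

first_run = conn.execute(FIX).rowcount   # rows changed by the first run
second_run = conn.execute(FIX).rowcount  # rows changed by a retry
```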
Concurrency-safe DDL practices (especially Postgres)
- Create indexes with minimal locking:
  - Postgres: CREATE INDEX CONCURRENTLY ... (note: cannot run inside a transaction block)
- Add constraints in phases:
  - Postgres: ADD CONSTRAINT ... NOT VALID, then VALIDATE CONSTRAINT
- Prefer additive changes:
  - add column/table first, enforce later
If you have a DB that treats many ALTERs as table rebuilds (e.g., some MySQL ops), you plan around that: online DDL / pt-online-schema-change / gh-ost, etc.
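For the Postgres case, the phased statements might be generated like this. The helper names and the example identifiers are hypothetical. The sketch only builds SQL strings and does not run them, and it assumes the identifiers come from the migration author, not from user input.

```python
def phased_fk(table, constraint, column, ref_table, ref_column="id"):
    """Return Postgres statements that add a foreign key in two phases."""
    return [
        # Phase 1: enforced for new writes; existing rows are not scanned.
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column}) NOT VALID",
        # Phase 2: checks existing rows later, as a separate step.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {constraint}",
    ]


def concurrent_index(table, index, column):
    """Return a Postgres index build that does not block writes.

    It has to be executed outside a transaction block, and a failed build
    leaves an invalid index behind that should be dropped before retrying.
    """
    return f"CREATE INDEX CONCURRENTLY {index} ON {table} ({column})"


statements = phased_fk("orders", "orders_user_fk", "user_id", "users")
index_sql = concurrent_index("orders", "orders_user_id_idx", "user_id")
```

Keeping the two constraint phases as separate statements makes it easy to ship them in separate migrations, with the validation scheduled off-peak.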
Deployment sequencing (what actually happens)
A safe rollout often looks like:
- Deploy DB Expand migration (Flyway/Liquibase)
- Deploy app vNext (works with both old and new columns, dual-writes)
- Run backfill job (can be continuous)
- Flip feature flag to read new path
- Observe
- Contract migration later (drop old)
With Kubernetes:
- ensure old + new pods overlap (rolling update)
- readiness probes should fail fast if a pod sees an incompatible schema (after a proper Expand step the schema is compatible with both versions, so pods pass)
- avoid “migration at startup on every pod” unless you’ve designed for single-run (leader lock)
Guardrails you should mention in interviews
1) One writer for migrations
Avoid multiple instances applying migrations simultaneously:
- Flyway has locking via its schema history table
- Liquibase uses DATABASECHANGELOGLOCK
Still: decide where it runs (CI/CD step, init container, or one pod).
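The single-writer idea can be illustrated with a lock row that only one instance manages to claim. This is a simplified sketch with a hypothetical migration_lock table. It is not how Flyway or Liquibase implement their locking internally.

```python
import sqlite3


def try_acquire(conn, owner):
    """Atomically claim the single lock row; only one caller can succeed."""
    with conn:
        cur = conn.execute(
            "UPDATE migration_lock SET locked = 1, locked_by = ? "
            "WHERE id = 1 AND locked = 0",
            (owner,),
        )
    return cur.rowcount == 1


def release(conn, owner):
    """Release the lock, but only if this owner holds it."""
    with conn:
        conn.execute(
            "UPDATE migration_lock SET locked = 0, locked_by = NULL "
            "WHERE id = 1 AND locked_by = ?",
            (owner,),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE migration_lock "
    "(id INTEGER PRIMARY KEY, locked INTEGER NOT NULL, locked_by TEXT)"
)
conn.execute("INSERT INTO migration_lock (id, locked) VALUES (1, 0)")
conn.commit()

pod_a = try_acquire(conn, "pod-a")        # first caller wins the lock
pod_b = try_acquire(conn, "pod-b")        # second caller is turned away
release(conn, "pod-a")
pod_b_retry = try_acquire(conn, "pod-b")  # lock is free again
```

A lock like this prevents concurrent runs, but a crashed holder leaves it set. That is one more reason to run migrations from a single, well-defined place.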
2) Idempotency and retries
- Make backfill resumable
- Design migrations to be restartable where possible
3) Observability
Track:
- backfill progress
- query error rates (old/new)
- DB locks & slow queries
- replication lag (if any)
4) Feature flags
Use flags for:
- dual-write enablement
- read switch
- contract cleanup timing
Interview-ready 30-second answer
“For zero-downtime, I use Expand–Migrate–Contract. First I apply backward-compatible schema changes (additive, no drops/renames), then deploy code that can work with both schemas and do dual-write / safe read strategies. Large backfills run as resumable jobs, not inside the migration tool. Once all instances are on the new version and data is migrated, I do a final contract migration to remove old columns and enforce constraints.”