You version-control large changelog histories by treating them as immutable audit logs, then adding structure + baselines so new environments don’t have to replay 5 years of history every time.
1) Immutable history + “append-only” rules
- Never edit applied migrations; fix-forward with new migrations.
- Enforce in CI:
  - changes under `db/migration` (or `db/changelog/changes`) must be new files only
  - block modifications to old files unless explicitly approved
This keeps history trustworthy and makes debugging sane.
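One way to enforce the append-only rule in CI is to inspect the output of `git diff --name-status` between the base branch and the PR head. The sketch below assumes migrations live under the two paths named above; the function name `find_violations` is hypothetical, and the "explicitly approved" override is left out.

```python
# Hypothetical CI guard: migration directories are append-only.
MIGRATION_DIRS = ("db/migration/", "db/changelog/changes/")


def find_violations(name_status_output: str) -> list[str]:
    """Return migration paths that were modified, deleted, or renamed.

    Input is the text printed by `git diff --name-status <base>...<head>`:
    one line per file, made of a status code, a tab, and one or two paths.
    """
    violations = []
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status[0] in ("A", "C"):
            continue  # added or copied: only a new file appears, which is allowed
        # Renames (R100, R087, ...) list the old and the new path; flag both.
        violations.extend(p for p in paths if p.startswith(MIGRATION_DIRS))
    return violations
```

In a pipeline, the job would pass in the diff against the target branch and fail when the returned list is non-empty.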
2) Split history into time-based folders (and include by directory)
Huge repositories become navigable when the tree is predictable.
Flyway:
db/migration/
  2024/
  2025/
  2026/
Liquibase:
db/changelog/changes/
  2024/...
  2025/...
  2026/...
The Liquibase master changelog should include whole directories (via `includeAll`) rather than listing 2,000 files by hand.
3) “Baseline” / “snapshot” so you don’t replay everything forever
This is the big one.
Flyway: baseline
- You can start tracking from a known schema state using `flyway baseline` and `baselineVersion`.
- Typical approach:
  - For new environments, create schema via dump/snapshot
  - Mark baseline at the corresponding version
  - Run only newer migrations
What it buys you: onboarding a new env is minutes, not hours.
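To make the effect of a baseline concrete: once a schema is baselined at some version, Flyway ignores versioned migrations up to and including that version and applies only newer ones. The sketch below mimics that selection in plain Python. It is not Flyway's code, the helper names are hypothetical, and it assumes the default `V<version>__<description>.sql` naming with purely numeric version parts.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Default Flyway versioned-migration name: V<version>__<description>.sql
_NAME = re.compile(r"^V(\d+(?:[._]\d+)*)__.+\.sql$")


def _version_key(text: str) -> tuple[int, ...]:
    """Convert '2.14' or '2_14' into (2, 14) for comparison."""
    parts = [int(p) for p in re.split(r"[._]", text)]
    while parts and parts[-1] == 0:
        parts.pop()  # treat 2, 2.0 and 2.0.0 as the same version
    return tuple(parts)


def parse_version(filename: str) -> Optional[tuple[int, ...]]:
    """Return the version of a versioned migration, or None for other files."""
    match = _NAME.match(filename)
    return _version_key(match.group(1)) if match else None


def pending_after_baseline(paths: list[str], baseline_version: str) -> list[str]:
    """Return migrations strictly newer than the baseline, in version order.

    Repeatable migrations (R__*.sql) and non-migration files are skipped.
    """
    baseline = _version_key(baseline_version)
    newer = []
    for path in paths:
        version = parse_version(PurePosixPath(path).name)
        if version is not None and version > baseline:
            newer.append((version, path))
    return [path for _, path in sorted(newer)]
```

Note that ordering follows the version number, not the year folder a file sits in, so the folder layout stays purely organizational.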
Liquibase: “squash” via a new baseline changelog
Common approach:
- Periodically create a baseline changelog that represents current schema
- New envs apply baseline + recent incremental changes
- Old envs keep their `DATABASECHANGELOG` history; you don’t rewrite it
Important: baseline generation must be controlled and reviewed (it’s easy to miss grants, functions, sequences).
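Part of that review can be automated by comparing an inventory of objects in a reference database against a database built from the baseline alone. The sketch below assumes both inventories have already been collected (for example from the database's catalog views) into dictionaries keyed by object kind. The list of kinds and the function name are illustrative assumptions, not a Liquibase feature.

```python
# Object kinds that are easy to miss in a generated baseline (illustrative list).
KINDS = ("table", "index", "sequence", "view", "function", "grant", "extension")


def missing_from_baseline(
    reference: dict[str, set[str]],
    baseline: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Report objects present in the reference schema but absent from a
    database built from the baseline alone, grouped by object kind."""
    report = {}
    for kind in KINDS:
        gap = reference.get(kind, set()) - baseline.get(kind, set())
        if gap:
            report[kind] = sorted(gap)
    return report
```

An empty report does not prove the baseline is complete (column types and defaults are not compared here), but a non-empty one is a cheap signal that the baseline should not be merged yet.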
4) Use DB dumps for environment creation, migrations for evolution
A mature pattern:
- Provisioning: restore from a schema dump (or an image) + optionally seed data
- Evolution: apply incremental migrations from that baseline forward
This aligns with GitOps and speeds up ephemeral envs.
5) Archive and “freeze” ancient migrations
You can keep everything in git (best for audit), but you can also:
- move migrations older than N years into an `archive/` folder
- keep them still readable, but not part of the standard include path for new envs (because you baseline)
Rule: only do this after you have a baseline strategy; otherwise you’ll break fresh installs.
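The rule can be built into whatever script performs the move, so that archiving is refused when no baseline exists. This sketch assumes the year-based folder layout from section 2; the function name, the `has_baseline` flag, and the `archive/<year>/` destination are hypothetical.

```python
from pathlib import PurePosixPath


def plan_archive(paths, cutoff_year, has_baseline, root="db/migration"):
    """Return (source, destination) moves for files in year folders older
    than cutoff_year. Refuses to plan anything without a baseline."""
    if not has_baseline:
        raise ValueError("no baseline: archiving would break fresh installs")
    root_path = PurePosixPath(root)
    moves = []
    for path in paths:
        candidate = PurePosixPath(path)
        if root_path not in candidate.parents:
            continue  # not a migration file
        relative = candidate.relative_to(root_path)
        year = relative.parts[0]
        # Skip files that sit directly in the root or are already archived.
        if year.isdigit() and int(year) < cutoff_year:
            moves.append((path, str(root_path / "archive" / relative)))
    return moves
```

The function only plans the moves; performing them (for example with `git mv`) and updating the include path are left to the caller.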
6) Tagging and releases
Tie DB state to app releases:
- git tags like `release-2.14.0` correspond to:
  - app artifact version
  - expected schema version (Flyway) / last changeset (Liquibase)
Then you can answer quickly:
- “What DB version is prod supposed to be on for release X?”
7) CI checks that scale with large history
- Run `validate` on every PR (fast)
- Run full “migrate from scratch” on:
  - nightly builds, or
  - only when `db/` changes (and maybe using a baseline dump, not truly scratch)
- Add lint checks:
  - naming convention
  - forbidden statements (e.g., Postgres `CREATE INDEX` without `CONCURRENTLY` on large tables)
  - prohibit `DROP COLUMN` unless ticket/approval label
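The lint checks above can be sketched as a small regex-based linter. It assumes Flyway-style file names with lowercase descriptions, flags every non-concurrent `CREATE INDEX` because a static check cannot know table size, and models the approval label as a hypothetical `drop_approved` flag. It does not strip SQL comments or string literals, so a real linter would need a proper parser.

```python
import re

# Assumed naming convention: V<version>__<lowercase_description>.sql
NAME_RE = re.compile(r"^V\d+(?:[._]\d+)*__[a-z0-9_]+\.sql$")
# CREATE [UNIQUE] INDEX that is not immediately followed by CONCURRENTLY.
CREATE_INDEX_RE = re.compile(
    r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\b(?!\s+CONCURRENTLY\b)", re.IGNORECASE
)
DROP_COLUMN_RE = re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE)


def lint_migration(filename: str, sql: str, drop_approved: bool = False) -> list[str]:
    """Return a list of human-readable problems found in one migration."""
    problems = []
    if not NAME_RE.match(filename):
        problems.append("name does not match V<version>__<description>.sql")
    if CREATE_INDEX_RE.search(sql):
        problems.append("CREATE INDEX without CONCURRENTLY")
    if DROP_COLUMN_RE.search(sql) and not drop_approved:
        problems.append("DROP COLUMN without approval")
    return problems
```

Because it only reads the files changed in a PR, a check like this stays fast no matter how long the migration history grows.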
8) Interview answer (what I’d want to hear)
“We keep migrations immutable and append-only in git, organize them into year-based folders, and enforce that old migrations can’t be edited. To avoid replaying years of history for new environments, we baseline: Flyway baseline or a Liquibase baseline changelog plus a schema dump for provisioning. CI runs validate on every PR and periodic full migration tests; releases are tagged so app version maps to an expected schema version.”
A nuance I’ll probe you on (senior signal)
- Squashing/baselining is not “deleting history.” It’s about speeding up fresh installs while preserving auditability in existing environments.
- The hard part is making sure the baseline captures everything (schema, indexes, sequences, views, functions, grants, extensions).