Java.DBMigrationTools.What is Vacuum in postgres?

The short, correct definition (interview-ready)

VACUUM removes dead tuples created by updates/deletes so PostgreSQL can reuse space and keep queries fast.


Why VACUUM exists (MVCC in 30 seconds)

Postgres uses MVCC:

  • UPDATE doesn’t overwrite a row → it creates a new version
  • DELETE marks rows as invisible
  • Old versions (dead tuples) remain on disk

Without VACUUM:

  • tables bloat
  • indexes bloat
  • queries slow down
  • transaction ID wraparound risk appears (catastrophic)

What VACUUM actually does

  • Marks dead tuples as reusable
  • Updates visibility maps
  • Helps planner statistics (partially)
  • Prevents transaction ID wraparound

⚠️ It does not shrink the table file (that’s a common misconception)


Types of VACUUM

1) AUTOVACUUM (the default, should always be ON)

  • Background workers
  • Runs automatically based on thresholds
  • Non-blocking for normal queries
SHOW autovacuum;

✅ Safe for production
✅ Always enabled in sane setups


2) VACUUM

Manual, lightweight cleanup.

VACUUM orders;
  • Frees space inside the table
  • Doesn’t lock reads/writes
  • Does not reduce disk size

3) VACUUM FULL (dangerous)

VACUUM FULL orders;
  • Rewrites the entire table
  • Exclusive lock (blocks reads & writes)
  • Physically shrinks table on disk

❌ Rarely used in production
❌ Often causes downtime


What VACUUM is not

❌ Not a migration
❌ Not transactional
❌ Not reversible
❌ Not something you run inside Flyway/Liquibase migrations

VACUUM cannot run inside a transaction block.


Why migrations care about VACUUM

  • Large data migrations create many dead tuples
  • Index rebuilds + updates can cause bloat
  • VACUUM is often scheduled after heavy migrations
  • Never mix it into schema migrations

Correct pattern:

  1. Deploy migration
  2. Let autovacuum do its job
  3. Manually tune or vacuum off-hours if needed

Common interview pitfalls ❌

  • “VACUUM deletes rows” → wrong
  • “VACUUM frees disk space” → usually wrong
  • “You should VACUUM in migrations” → very wrong

When you do think about VACUUM explicitly

  • Huge update/delete jobs
  • Tables with frequent churn
  • Long-running transactions blocking autovacuum
  • Sudden table bloat

One-sentence senior takeaway

VACUUM is PostgreSQL’s garbage collector for MVCC, reclaiming dead row versions so space and performance don’t degrade over time.

This entry was posted in Без рубрики. Bookmark the permalink.