Database.Beginner.Keystones: Why Fragmentation of leaf pages in b-tree is Bad

1. More Pages to Read

  • Fragmentation = each page holds fewer rows (wasted space).
  • To find or scan the same number of rows, the database needs to read more pages.
  • More pages = more I/O operations.

👉 Disk I/O is the slowest part of most databases!


2. Cache Inefficiency (Wasted Memory)

  • Databases use memory (buffer pool, page cache) to store pages in RAM.
  • Fragmented pages waste memory — more pages are needed for the same data.
  • Cache misses happen more often.

👉 RAM is precious — more pages = less effective caching.


3. Slower Index Traversal

  • In a B-tree/B+ tree:
    • More leaf pages mean longer leaf scans.
    • Even if the tree height stays the same, reading through fragmented leaves is slower.

👉 Especially for range queries (e.g., BETWEEN, ORDER BY).


4. Increased Page Splits

  • Fragmentation increases the likelihood of future page splits.
  • When you insert new data into a fragmented index:
    • Less available space → new splits → more fragmentation.

👉 Fragmentation feeds even more fragmentation — a bad cycle.

5. Higher Storage Consumption

  • Fragmented indexes use more disk space than necessary.
  • Instead of tightly packed data, you have lots of wasted gaps.

👉 Higher storage costs and backups take longer.

6. Slower Backups and Maintenance

  • More fragmented pages → more data to scan during:
    • Backups
    • Index statistics gathering
    • Database consistency checks

👉 Maintenance operations take longer and affect uptime.

In Short

KeystoneWhy It’s Bad
More pages to readMore I/O = slower reads and queries.
Cache inefficiencyWasted RAM space; more cache misses.
Slower index traversalScanning fragmented leaves is slower.
Increased page splitsInserts cause more splits, worsening fragmentation.
Higher storage consumptionWastes disk space; larger database size.
Slower maintenanceBackups, reindexing, and checks become slower.

🔥 Bottom Line

Fragmentation makes everything slower and more expensive — both for day-to-day queries and long-term maintenance.

This entry was posted in Без рубрики. Bookmark the permalink.