Fragmentation happens when the physical storage of data (or indexes) on disk becomes disorganized.
- In an ideal world: Data is stored contiguously — nice and neat.
- In reality (especially after lots of INSERTs, UPDATEs, DELETEs): Data gets scattered — small pieces in different places.
- This scattering is called fragmentation.
Analogy 📚
Think of a bookshelf:
- Initially: Books are neatly arranged.
- Over time: You add new books, remove old ones, reorganize — but now books are placed randomly with gaps everywhere.
- When you search for a book, you have to jump around — it’s slower.
Same in a database:
- Reading fragmented data = more disk I/O = slower performance.
Types of Fragmentation
1. Table Fragmentation
- The actual table’s rows are stored non-sequentially.
- Reading a single table scan takes more time because the disk head jumps all over to find the next piece of data.
2. Index Fragmentation
- The B-tree (index structure) becomes unbalanced:
- Lots of empty space inside pages (internal fragmentation).
- Pages are no longer physically ordered (external fragmentation).
- Index pages can be half-empty due to deletes and updates.
- The index becomes less efficient — more jumps during a lookup.
How Does Fragmentation Happen?
- INSERT new rows randomly.
- UPDATE rows, and the new value doesn’t fit in place (it gets moved).
- DELETE rows, leaving “holes” or empty spots.
Over time, your once neatly packed pages become:
| Row1 | Row2 | FREE | Row3 | FREE | Row4 | Row5 | FREE |
Why is Fragmentation Bad?
- More Disk I/O:
- Database has to fetch scattered blocks.
- Slower Reads:
- Reading sequentially is fast.
- Jumping around is slow, especially on spinning disks (HDDs).
- Bigger Indexes:
- More empty space = more pages to scan = slower index traversal.
On SSDs, the performance hit is less dramatic, but still real.
How to Fix Fragmentation?
Databases provide ways to rebuild or reorganize tables and indexes:
- REBUILD INDEX: Drops and recreates the index — removes fragmentation completely.
- REORGANIZE INDEX: Rearranges the index pages to be more efficient without fully dropping it.
- VACUUM (in PostgreSQL) or OPTIMIZE TABLE (in MySQL): Repack the table and remove dead space.
Think of it like defragmenting a hard drive — making all the files contiguous again.
Quick Example: SQL Server
Check fragmentation:
SELECT avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('YourTableName'), NULL, NULL, 'LIMITED');
Rebuild the index:
ALTER INDEX ALL ON YourTableName REBUILD;
In Short
| Fragmentation | What Happens | Why It’s Bad |
|---|---|---|
| Data/Indexes are scattered | Rows or index pages not in order | Slower read performance, more I/O |
Real-world tip:
- OLTP systems (many small updates/deletes) suffer fragmentation faster than read-only systems.
- Monitoring and periodically rebuilding indexes is critical for high-performance databases!