A B-tree index (short for Balanced Tree) is the most common type of index used in relational databases (e.g., PostgreSQL, MySQL InnoDB, Oracle). It organizes data in a way that allows fast lookup, insertion, deletion, and range queries, all in logarithmic time.
Let’s break down how it works in simple terms:
🌳 1. Structure of a B-tree
A B-tree is a balanced multi-level tree with these key properties:
- ✅ Sorted: Keys in each node are ordered.
- ✅ Balanced: All leaf nodes are at the same level.
- ✅ Multi-way: Each node can have many children, not just 2 (unlike binary trees).
- ✅ Disk-optimized: Designed to reduce disk reads/writes by grouping many keys in each node.
📚 Example (Simplified)
Suppose you have a B-tree index on column email
:
[anna, dave, kate]
/ | | \
<anna anna–dave dave–kate >kate
Each node has:
- Keys (sorted)
- Pointers to children (where to go based on key comparison)
🔍 2. How Lookup Works
SELECT * FROM users WHERE email = 'dave@example.com';
- Start at root node
- Find the range where
'dave@example.com'
fits - Follow pointer to the correct child node
- Repeat until leaf node
- Get pointer to actual row in the table (if non-clustered)
This takes O(log n) time.
📈 3. How Insert Works
- Navigate to the correct leaf node
- Insert key in sorted order
- If node is full, split and promote middle key to parent
- Tree remains balanced
🔁 4. How Range Queries Work
For:
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-01-31';
- B-tree quickly locates the starting key
- Then performs a sequential scan across leaf nodes (they’re linked like a doubly linked list)
- Stops when upper bound is reached
This is why B-tree indexes are great for BETWEEN
, >
, <
, >=
, <=
, and ORDER BY`.
🔧 Summary of Operations
Operation | Time Complexity |
---|---|
Lookup (point) | O(log n) |
Insert/Delete | O(log n) |
Range scan | O(log n + k) (where k = rows returned) |
🧠 Why B-trees Work So Well
- Nodes are designed to match disk block sizes, minimizing disk I/O
- Balance ensures predictable performance even with millions of rows
- Self-adjusting on insert/delete