🧠 Hash Table Implementations (per Knuth & CLRS)
1. Chaining
Each bucket contains a list (or tree) of entries. Collisions are handled by appending to that list.
2. Open Addressing
All entries live in the array itself. On collision, we probe (linear, quadratic, or double hashing) to find the next free slot.
🔎 How is Java’s HashMap
implemented?
✅ Java’s HashMap
uses chaining.
- Internally it uses:
- An array of buckets
- Each bucket is a linked list (or a red-black tree if it gets too long)
✅ Java also uses some open addressing in HashMap
‘s bucket index calculation:
index = (n - 1) & hash
— which spreads keys better across the array — but this is not full open addressing.
💡 Why Chaining in Java?
✅ Pros of Chaining (and why Java chose it):
Feature | Why it matters for Java |
---|---|
Simpler deletion | Just remove from linked list — fast and clean |
Handles high load factors well | Performance degrades linearly, not catastrophically |
Resilient to clustering | Less sensitive to poor hash functions |
Easy to upgrade | Java 8+: linked list → tree for long chains |
More flexible entries | Java stores full objects — trees, linked lists, metadata |
Java’s object model (GC, object headers, indirection) makes chaining more natural and flexible.
🔁 Comparing Open Addressing vs Chaining
Feature | Chaining | Open Addressing |
---|---|---|
Storage | Buckets with lists/trees | Single array only |
Deletion | Easy | Tricky (needs tombstones) |
Cache friendliness | ❌ Poorer (linked structures) | ✅ Better (data is in contiguous memory) |
Performance at high load | ✅ Still works (longer chains) | ❌ Degrades quickly |
Load factor recommendation | Up to ~0.75 is OK | Must stay below ~0.5–0.7 |
Rehashing cost | Moderate | Potentially very costly |
Java suitability | ✅ Excellent | ❌ Complex with Java object references |
📌 When is Open Addressing better?
- In low-memory, high-performance systems (e.g., embedded, C/C++ code).
- In languages with value types and arrays (not object references).
- When cache locality is crucial (e.g., CPU-level optimization).
🧠 TL;DR:
✅ **Java’s
HashMap
uses chaining (with list/tree in each bucket).
It’s chosen for its simplicity, robustness, and fit with Java’s object model. ❌ Open addressing offers better cache performance, but it’s harder to manage, especially with Java’s GC and references.