🚀 Why Is 31 * x + y Preferable to x + y for hashCode() in a Point Class?
When implementing hashCode(), 31 * x + y is preferred over x + y because it produces better hash distribution, reducing collisions in hash-based collections like HashMap and HashSet.
1️⃣ Problem With x + y
If we define hashCode() as:
@Override
public int hashCode() {
return x + y; // ❌ Not a good hash function
}
📌 Why Is This Bad?
- Different
(x, y)pairs can result in the same sum, leading to hash collisions. - Example:
(2, 3) → 2 + 3 = 5(3, 2) → 3 + 2 = 5(4, 1) → 4 + 1 = 5
- Many different points get the same hash code!
- This causes poor performance in
HashMapandHashSet.
- This causes poor performance in
2️⃣ Why 31 * x + y Works Better
Using a prime multiplier (31) improves hash code distribution:
@Override
public int hashCode() {
return 31 * x + y; // ✅ Better hash function
}
📌 Why Is This Better?
- It spreads out hash codes more evenly across the integer space.
- Prime numbers help reduce patterns where different inputs map to the same value.
- Preserves order sensitivity (e.g.,
(2, 3) ≠ (3, 2)).
Example: Improved Hash Distribution
(x, y) | x + y (❌ Bad) | 31 * x + y (✅ Good) |
|---|---|---|
| (2, 3) | 5 | 31 * 2 + 3 = 65 |
| (3, 2) | 5 | 31 * 3 + 2 = 95 |
| (4, 1) | 5 | 31 * 4 + 1 = 125 |
| (1, 4) | 5 | 31 * 1 + 4 = 35 |
📌 Now, each pair has a unique hash code, reducing collisions!
3️⃣ Why Use 31?
🔹 Prime Number Benefits
- Prime numbers reduce hash collisions because they evenly distribute values.
31is chosen because:- It’s an odd prime, reducing even-number bias.
- It’s computationally fast (
31 * xcan be replaced by(x << 5) - x, avoiding multiplication).
4️⃣ Best Practice for hashCode() in Point
The best way to implement hashCode() is using Objects.hash():
import java.util.Objects;
class Point {
int x, y;
@Override
public int hashCode() {
return Objects.hash(x, y); // ✅ Best practice
}
}
📌 Why?
Objects.hash(x, y)automatically handles collisions well.- It uses prime-based hashing internally.
- Handles null values safely (not needed here, but useful for other classes).
📌 Summary
| Hash Function | Collision Risk | Performance | Best Practice? |
|---|---|---|---|
x + y | ❌ High (many pairs sum to the same value) | ✅ Fast but inefficient in large datasets | ❌ No |
31 * x + y | ✅ Lower (spreads values better) | ✅ Fast (bitwise optimization possible) | ✅ Yes |
Objects.hash(x, y) | ✅ Very low (built-in hashing) | ✅ Optimized internally | ✅ Best |
✅ Final Recommendation
✔ Use 31 * x + y if writing a custom hash function.
✔ Prefer Objects.hash(x, y) for simplicity and efficiency.
This ensures optimal performance and minimal hash collisions in Java collections! 🚀