🚀 Why Is 31 * x + y
Preferable to x + y
for hashCode()
in a Point
Class?
When implementing hashCode()
, 31 * x + y
is preferred over x + y
because it produces better hash distribution, reducing collisions in hash-based collections like HashMap
and HashSet
.
1️⃣ Problem With x + y
If we define hashCode()
as:
@Override
public int hashCode() {
return x + y; // ❌ Not a good hash function
}
📌 Why Is This Bad?
- Different
(x, y)
pairs can result in the same sum, leading to hash collisions. - Example:
(2, 3) → 2 + 3 = 5
(3, 2) → 3 + 2 = 5
(4, 1) → 4 + 1 = 5
- Many different points get the same hash code!
- This causes poor performance in
HashMap
andHashSet
.
- This causes poor performance in
2️⃣ Why 31 * x + y
Works Better
Using a prime multiplier (31
) improves hash code distribution:
@Override
public int hashCode() {
return 31 * x + y; // ✅ Better hash function
}
📌 Why Is This Better?
- It spreads out hash codes more evenly across the integer space.
- Prime numbers help reduce patterns where different inputs map to the same value.
- Preserves order sensitivity (e.g.,
(2, 3) ≠ (3, 2)
).
Example: Improved Hash Distribution
(x, y) | x + y (❌ Bad) | 31 * x + y (✅ Good) |
---|---|---|
(2, 3) | 5 | 31 * 2 + 3 = 65 |
(3, 2) | 5 | 31 * 3 + 2 = 95 |
(4, 1) | 5 | 31 * 4 + 1 = 125 |
(1, 4) | 5 | 31 * 1 + 4 = 35 |
📌 Now, each pair has a unique hash code, reducing collisions!
3️⃣ Why Use 31?
🔹 Prime Number Benefits
- Prime numbers reduce hash collisions because they evenly distribute values.
31
is chosen because:- It’s an odd prime, reducing even-number bias.
- It’s computationally fast (
31 * x
can be replaced by(x << 5) - x
, avoiding multiplication).
4️⃣ Best Practice for hashCode()
in Point
The best way to implement hashCode()
is using Objects.hash()
:
import java.util.Objects;
class Point {
int x, y;
@Override
public int hashCode() {
return Objects.hash(x, y); // ✅ Best practice
}
}
📌 Why?
Objects.hash(x, y)
automatically handles collisions well.- It uses prime-based hashing internally.
- Handles null values safely (not needed here, but useful for other classes).
📌 Summary
Hash Function | Collision Risk | Performance | Best Practice? |
---|---|---|---|
x + y | ❌ High (many pairs sum to the same value) | ✅ Fast but inefficient in large datasets | ❌ No |
31 * x + y | ✅ Lower (spreads values better) | ✅ Fast (bitwise optimization possible) | ✅ Yes |
Objects.hash(x, y) | ✅ Very low (built-in hashing) | ✅ Optimized internally | ✅ Best |
✅ Final Recommendation
✔ Use 31 * x + y
if writing a custom hash function.
✔ Prefer Objects.hash(x, y)
for simplicity and efficiency.
This ensures optimal performance and minimal hash collisions in Java collections! 🚀