Java.Core.There is a class Point{int x, y;}. Why is a hash code of the form 31 * x + y preferable to x + y?

🚀 Why Is 31 * x + y Preferable to x + y for hashCode() in a Point Class?

When implementing hashCode(), 31 * x + y is preferred over x + y because it produces better hash distribution, reducing collisions in hash-based collections like HashMap and HashSet.


1️⃣ Problem With x + y

If we define hashCode() as:

@Override
public int hashCode() {
    return x + y; // ❌ Not a good hash function
}

📌 Why Is This Bad?

  • Different (x, y) pairs can result in the same sum, leading to hash collisions.
  • Example:
    • (2, 3) → 2 + 3 = 5
    • (3, 2) → 3 + 2 = 5
    • (4, 1) → 4 + 1 = 5
  • Many different points get the same hash code!
    • This causes poor performance in HashMap and HashSet.

2️⃣ Why 31 * x + y Works Better

Using a prime multiplier (31) improves hash code distribution:

@Override
public int hashCode() {
    return 31 * x + y; // ✅ Better hash function
}

📌 Why Is This Better?

  • It spreads out hash codes more evenly across the integer space.
  • Prime numbers help reduce patterns where different inputs map to the same value.
  • Preserves order sensitivity (e.g., (2, 3) ≠ (3, 2)).

Example: Improved Hash Distribution

(x, y)x + y (❌ Bad)31 * x + y (✅ Good)
(2, 3)531 * 2 + 3 = 65
(3, 2)531 * 3 + 2 = 95
(4, 1)531 * 4 + 1 = 125
(1, 4)531 * 1 + 4 = 35

📌 Now, each pair has a unique hash code, reducing collisions!


3️⃣ Why Use 31?

🔹 Prime Number Benefits

  • Prime numbers reduce hash collisions because they evenly distribute values.
  • 31 is chosen because:
    • It’s an odd prime, reducing even-number bias.
    • It’s computationally fast (31 * x can be replaced by (x << 5) - x, avoiding multiplication).

4️⃣ Best Practice for hashCode() in Point

The best way to implement hashCode() is using Objects.hash():

import java.util.Objects;

class Point {
    int x, y;

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // ✅ Best practice
    }
}

📌 Why?

  • Objects.hash(x, y) automatically handles collisions well.
  • It uses prime-based hashing internally.
  • Handles null values safely (not needed here, but useful for other classes).

📌 Summary

Hash FunctionCollision RiskPerformanceBest Practice?
x + y❌ High (many pairs sum to the same value)✅ Fast but inefficient in large datasets❌ No
31 * x + y✅ Lower (spreads values better)✅ Fast (bitwise optimization possible)✅ Yes
Objects.hash(x, y)✅ Very low (built-in hashing)✅ Optimized internally✅ Best

Final Recommendation

✔ Use 31 * x + y if writing a custom hash function.
✔ Prefer Objects.hash(x, y) for simplicity and efficiency.

This ensures optimal performance and minimal hash collisions in Java collections! 🚀

This entry was posted in Без рубрики. Bookmark the permalink.