Why ORMs (JPA / Hibernate) struggle with composite primary keys
First: what ORM expects
An ORM assumes:
- One column = one identity
- Identity is simple, immutable, comparable
- Identity is used for:
- Persistence context (1st-level cache)
- Equality checks
- Lazy loading
- Proxies
- Hash-based collections
A composite key breaks the first assumption outright and complicates everything that depends on it.
1️⃣ @EmbeddedId / @IdClass — what’s the problem?
With a composite PK, you must create a separate ID class. There are two ways to map it.
Option A: @EmbeddedId
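@Embeddable
class UserRoleId implements Serializable { // java.io.Serializable, required for key classes
    Long userId;
    Long roleId;
    // equals() and hashCode() over both fields are also required (omitted here)
}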
@Entity
class UserRole {
@EmbeddedId
UserRoleId id;
}
Option B: @IdClass
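class UserRoleId implements Serializable { // java.io.Serializable, required for key classes
    Long userId; // name and type must match the @Id field in the entity
    Long roleId;
    // equals() and hashCode() over both fields are also required (omitted here)
}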
@Entity
@IdClass(UserRoleId.class)
class UserRole {
@Id Long userId;
@Id Long roleId;
}
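Neither snippet above is complete as a key class. The JPA specification expects a composite key class to be Serializable, to have a public no-arg constructor, and to define equals() and hashCode(). Here is a minimal sketch of what that adds, written as plain Java with the JPA annotations left out so it compiles on its own. The constructor and getters are illustrative additions, not part of the original snippets.

```java
import java.io.Serializable;
import java.util.Objects;

// Plain-Java sketch of a composite key class. In a real mapping it would be
// public and carry @Embeddable (or be named in @IdClass); both are left out
// so the snippet compiles without a JPA dependency.
class UserRoleId implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long userId;
    private Long roleId;

    // No-arg constructor, expected by the JPA provider.
    public UserRoleId() {}

    public UserRoleId(Long userId, Long roleId) {
        this.userId = userId;
        this.roleId = roleId;
    }

    public Long getUserId() { return userId; }
    public Long getRoleId() { return roleId; }

    // Two keys are equal only when both fields match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UserRoleId)) return false;
        UserRoleId other = (UserRoleId) o;
        return Objects.equals(userId, other.userId)
            && Objects.equals(roleId, other.roleId);
    }

    // Must be derived from the same fields as equals().
    @Override
    public int hashCode() {
        return Objects.hash(userId, roleId);
    }
}
```

That is roughly thirty lines of supporting code before the entity itself does anything.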
👉 Already:
- Extra class
- Extra boilerplate
- Extra cognitive load
Now the real problems start.
2️⃣ equals() / hashCode() — this is the killer
The JPA specification requires a composite ID class to implement equals() and hashCode(). Those methods, together with the entity's own, come into play in:
- Persistence context lookups by ID
- Sets
- Maps
With surrogate key
equals → id
hashCode → id
Simple, and stable once the id has been assigned.
With composite key
You must implement:
equals(userId, roleId)
hashCode(userId, roleId)
Problems:
- What if one field is null before persist?
- What if entities are put into a Set before flush?
- What if ID fields are mutable?
Result:
- Entities disappear from Sets
- Duplicate entities appear
- Cache behaves incorrectly
These bugs are subtle and brutal.
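The "disappearing entity" case can be reproduced without Hibernate at all. The sketch below uses a hypothetical stand-in class, UserRoleEntity, whose equals() and hashCode() are built from the key fields, and assumes the key is filled in only after the object has already been added to a HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class LostInSetDemo {
    // Returns whether the set still finds the entity after its key changes.
    static boolean stillFoundAfterKeyAssigned() {
        Set<UserRoleEntity> roles = new HashSet<>();
        UserRoleEntity role = new UserRoleEntity();
        roles.add(role);   // hashed while both key fields are still null
        role.userId = 1L;  // key assigned later, e.g. just before persist
        role.roleId = 2L;
        return roles.contains(role); // lookup now uses a different hash
    }

    public static void main(String[] args) {
        System.out.println(stillFoundAfterKeyAssigned()); // prints false
    }
}

// Hypothetical stand-in for an entity; not a real JPA mapping.
class UserRoleEntity {
    Long userId; // null until the key is assigned
    Long roleId;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UserRoleEntity)) return false;
        UserRoleEntity other = (UserRoleEntity) o;
        return Objects.equals(userId, other.userId)
            && Objects.equals(roleId, other.roleId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(userId, roleId);
    }
}
```

The object is still physically in the set, but the set looks for it under the new hash and misses. Adding it again would store it a second time, which is how the duplicates appear.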
3️⃣ First-level cache confusion (Persistence Context)
Hibernate cache key:
(EntityClass, PrimaryKey)
With composite PK:
(UserRole, (userId, roleId))
Every lookup must:
- Instantiate composite ID
- Compare multiple fields
- Rely on correct equals/hashCode
This:
- Adds allocation and comparison work to each lookup
- Makes bugs harder to trace
- Breaks identity guarantees if equals/hashCode is wrong
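To see why key equality matters for a cache keyed by ID, here is a toy identity map built on a plain HashMap. It is only a stand-in: Hibernate's real persistence context is more involved and may compare key fields differently. BrokenId and GoodId are hypothetical classes that differ only in whether they define equals() and hashCode().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class IdentityMapDemo {
    // Toy cache keyed by ID only; a stand-in, not Hibernate's real structure.
    static boolean isHit(Object storedId, Object lookupId) {
        Map<Object, String> cache = new HashMap<>();
        cache.put(storedId, "cached UserRole");
        return cache.containsKey(lookupId);
    }

    public static void main(String[] args) {
        // Same field values, different instances, as in every real lookup.
        System.out.println(isHit(new BrokenId(1L, 2L), new BrokenId(1L, 2L))); // false
        System.out.println(isHit(new GoodId(1L, 2L), new GoodId(1L, 2L)));     // true
    }
}

// Key class that forgets equals/hashCode, so it falls back to object identity.
class BrokenId {
    final Long userId;
    final Long roleId;
    BrokenId(Long userId, Long roleId) { this.userId = userId; this.roleId = roleId; }
}

// Key class with value-based equality over both fields.
class GoodId {
    final Long userId;
    final Long roleId;
    GoodId(Long userId, Long roleId) { this.userId = userId; this.roleId = roleId; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GoodId)) return false;
        GoodId other = (GoodId) o;
        return Objects.equals(userId, other.userId)
            && Objects.equals(roleId, other.roleId);
    }

    @Override
    public int hashCode() { return Objects.hash(userId, roleId); }
}
```

With a Long key none of this is your responsibility, because Long already has value-based equals() and hashCode().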
4️⃣ Lazy loading & proxies get awkward
Example:
UserRole role = em.getReference(
UserRole.class,
new UserRoleId(1L, 2L)
);
Compared to:
em.getReference(UserRole.class, 42L);
You now:
- Allocate objects just to reference entities
- Leak DB structure into service code
- Increase verbosity everywhere
5️⃣ Repositories become ugly
Spring Data repository:
Surrogate PK
interface UserRoleRepo extends JpaRepository<UserRole, Long> {}
Composite PK
interface UserRoleRepo extends JpaRepository<UserRole, UserRoleId> {}
Every call now requires:
repo.findById(new UserRoleId(userId, roleId));
This is not wrong, but it’s noisy and error-prone.
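One concrete way the error-proneness shows up: both key parts are Long, so swapping the arguments compiles without complaint and silently addresses a different row. A small sketch, using a hypothetical UserRoleKey class as a stand-in for the ID class:

```java
import java.util.Objects;

class SwappedArgsDemo {
    // Builds the key twice, once with the arguments in the wrong order,
    // and reports whether the two keys are considered equal.
    static boolean sameKey(long userId, long roleId) {
        UserRoleKey intended = new UserRoleKey(userId, roleId);
        UserRoleKey swapped = new UserRoleKey(roleId, userId); // compiles fine
        return intended.equals(swapped);
    }

    public static void main(String[] args) {
        System.out.println(sameKey(1L, 2L)); // prints false
    }
}

// Hypothetical stand-in for the composite ID class.
class UserRoleKey {
    final Long userId;
    final Long roleId;

    UserRoleKey(Long userId, Long roleId) {
        this.userId = userId;
        this.roleId = roleId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof UserRoleKey)) return false;
        UserRoleKey other = (UserRoleKey) o;
        return Objects.equals(userId, other.userId)
            && Objects.equals(roleId, other.roleId);
    }

    @Override
    public int hashCode() { return Objects.hash(userId, roleId); }
}
```

The compiler cannot catch the swap; the lookup simply returns nothing, or the wrong row.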
6️⃣ Associations are harder to map
Mapping relations:
@ManyToOne
@JoinColumns({
@JoinColumn(name = "user_id"),
@JoinColumn(name = "role_id")
})
Instead of:
@ManyToOne
@JoinColumn(name = "user_id")
More annotations → more bugs → harder refactors.
7️⃣ Why this matters in real teams
Composite PK issues:
- Slow down development
- Confuse mid-level devs
- Increase code review overhead
- Cause production-only bugs
This is why experienced teams avoid them unless forced.
Mental model (important)
ORMs are optimized for object identity, not relational purity.
Composite PKs are relationally elegant
Surrogate PKs are operationally practical
Clean alternative (best of both worlds)
CREATE TABLE user_roles (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL,
    role_id BIGINT NOT NULL,
    UNIQUE (user_id, role_id)
);
ORM:
@Entity
class UserRole {
@Id @GeneratedValue(strategy = GenerationType.IDENTITY) Long id; // id is generated by the BIGSERIAL column
Long userId;
Long roleId;
}
✔ Simple identity
✔ Business rule enforced
✔ ORM-friendly
✔ Evolvable
One-sentence interview explanation (perfect)
ORMs rely on a simple, stable single-column identity, and composite primary keys complicate equality, caching, associations, and repository usage, which is why teams usually avoid them in ORM-based systems.