Java.Hibernate.Middle.Dublicating parents when fetch join

🔎 The root cause: how SQL JOINs work with one-to-many

Imagine you have:

  • Author (id=1) with 3 Books (Book A, Book B, Book C).

When you do a JOIN to get authors and their books:

SELECT a.*, b.* FROM authors a
JOIN books b ON a.id = b.author_id

Your SQL result will look like:

| author_id | author_name | book_id | book_title |
|----------|-------------|---------|------------|
| 1        | John Smith  | 101     | Book A     |
| 1        | John Smith  | 102     | Book B     |
| 1        | John Smith  | 103     | Book C     |

Notice what happens:

  • The same author appears 3 times — once for each book — because that’s how SQL JOINs work: they produce a row per combination of parent-child.

🚨 Without DISTINCT in JPQL

Hibernate reads those rows and reconstructs objects.
But if your query is:

SELECT a FROM Author a JOIN FETCH a.books

Hibernate sees 3 rows, each with an Author, and by default adds all of them to the result list → so your List<Author> will have 3 elements, each seemingly the same Author (but possibly different object instances pointing to the same DB row).
That’s why you get duplicates.

With DISTINCT in JPQL

When you write:

SELECT DISTINCT a FROM Author a JOIN FETCH a.books

Hibernate collects all Authors from the result rows, but then deduplicates them in-memory based on their IDs, so you only get one Author instance in your List<Author>, no matter how many rows the JOIN produced.

🔎 Key point:

  • The duplication isn’t a bug — it’s the nature of SQL’s one-to-many joins.
  • But it becomes a problem when you want a clean list of unique parents in your Java collection.

Always use DISTINCT when JOIN FETCHing a collection
It avoids surprises like:

  • lists containing dozens or hundreds of duplicates of the same parent entity,
  • bugs where List.size() is misleading,
  • unexpected UI issues (e.g., table rows showing duplicates).
This entry was posted in Без рубрики. Bookmark the permalink.