SQL.What is the problem in using null in in operator ?

This is a classic SQL trap and a favorite interview question. Let’s break it down cleanly and precisely.


Short answer (what interviewers want to hear)

IN breaks when the subquery contains NULL because SQL uses three-valued logic.
Any comparison with NULL becomes UNKNOWN, which can make the whole predicate evaluate to false.


The core problem: SQL is NOT boolean

SQL has three-valued logic:

  • TRUE
  • FALSE
  • UNKNOWN ← caused by NULL

And WHERE keeps only rows where the condition is TRUE.


What people expect vs what actually happens

Example data

orders
-------
user_id
-------
1
2
NULL

Query

SELECT *
FROM users
WHERE id IN (SELECT user_id FROM orders);

Logical expansion (important!)

id = 1 OR id = 2 OR id = NULL

Now remember:

id = NULL  --> UNKNOWN

So the expression becomes:

(TRUE OR FALSE OR UNKNOWN)

Depending on values, this can still work sometimes.

The REAL killer: NOT IN

SELECT *
FROM users
WHERE id NOT IN (SELECT user_id FROM orders);

Expansion

id <> 1 AND id <> 2 AND id <> NULL

But:

id <> NULL --> UNKNOWN

So the whole condition becomes:

TRUE AND TRUE AND UNKNOWN --> UNKNOWN

And UNKNOWN is filtered out by WHERE.

Result: ❌ ZERO ROWS

Even though logically you expected some users.

Why this happens (formal explanation)

  • IN / NOT IN is syntactic sugar for comparisons
  • NULL means “unknown value”, not “empty”
  • SQL cannot decide truth → returns UNKNOWN
  • WHERE only keeps TRUE

Why EXISTS works correctly

SELECT *
FROM users u
WHERE EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.user_id = u.id
);

Key difference:

  • EXISTS checks row existence
  • No equality comparison with NULL
  • No three-valued logic explosion

This is why seniors say:

“Use EXISTS, not IN, for subqueries.”

Safe alternatives (this is important)

✅ Option 1: Filter NULLs explicitly

WHERE id IN (
  SELECT user_id
  FROM orders
  WHERE user_id IS NOT NULL
)

✅ Option 2: Use EXISTS (preferred)

WHERE EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.user_id = u.id
)

❌ Avoid

NOT IN (subquery_that_may_return_null)

Interview-ready crisp answer (2–3 sentences)

“The problem with IN is that if the subquery returns NULL, SQL’s three-valued logic turns comparisons into UNKNOWN.
This is especially dangerous with NOT IN, which can filter out all rows unexpectedly.
That’s why EXISTS is safer for correlated subqueries.”


Senior-level bonus insight ⭐

This bug often appears when:

  • Foreign keys are nullable (bad schema design)
  • Legacy tables violate relational integrity
  • Developers rely on ORM-generated SQL

👉 Another reason why proper foreign keys + NOT NULL constraints matter.

This entry was posted in Без рубрики. Bookmark the permalink.