Selectivity = the percentage of rows that match a given condition.
- High selectivity: few rows match
→ e.g.WHERE email = 'alice@example.com'
→ ✅ index is efficient - Low selectivity: many rows match
→ e.g.WHERE gender = 'M'
in a table where 60% are'M'
→ ❌ index might be inefficient
🧠 Why Is Low Selectivity Bad for Index Use?
When many rows match, using an index means:
- The database finds matching row IDs in the index
- Then it has to jump to each row individually (called random I/O)
- This is slower than just reading the whole table sequentially in memory or disk
✅ Example:
Assume this table has 1 million rows:
SELECT * FROM Users WHERE is_active = true;
If 950,000 rows have is_active = true
, that’s very low selectivity
The index on is_active
matches most of the table
The database says:
❌ “Too many matches — don’t waste time jumping around.”
✅ “Just scan the whole table — it’s faster.”
Visual Analogy:
- 📇 Index access is like flipping through a card catalog to find books, but running back and forth in the library for each one.
- 📚 Full table scan is like walking down every shelf and grabbing all relevant books in order — more efficient if you’re taking most of them.
🧪 Rule of Thumb:
- If a condition matches >10–20% of the table, many databases prefer a full scan over index access.