Great question — the answer is:
Usually, no, it does not make sense to index columns with a small number of distinct values (i.e., low cardinality), unless there’s a very specific reason.
📉 Why Not?
Columns like:
gender = 'M' or 'F'
status = 'active' or 'inactive'
is_deleted = true or false
Have very few possible values, so:
- Each index entry matches many rows
- The database will likely do a full table scan anyway, because the index isn’t selective
- Using the index can be slower than just scanning the table
🔍 When Might It Still Make Sense?
✅ 1. In combination with other columns (composite index)
CREATE INDEX idx_status_created ON Users(status, created_at);
Useful if you often query:
SELECT * FROM Users
WHERE status = 'active' AND created_at > '2024-01-01';
Here, status
helps narrow the data before applying the range on created_at
.
✅ 2. In large, filtered queries or partitions
If:
- The table is huge, and
- The low-cardinality column is highly skewed (e.g., 90%
'active'
, 10%'inactive'
), and - You often query the rare value:
SELECT * FROM Users WHERE status = 'inactive';
Then an index might help — but it depends on the query planner and row distribution.
✅ 3. Bitmap Indexes (in data warehouses)
In OLAP databases (e.g., Oracle, Redshift), low-cardinality columns are perfect for bitmap indexes, which compress well and are efficient for complex filtering.
🧠 Rule of Thumb:
Cardinality | Should You Index? |
---|---|
🔴 Very low (e.g., true/false) | ❌ No — often worse performance |
🟡 Medium | ⚠️ Depends on query patterns and size |
🟢 High (e.g., email, ID) | ✅ Yes — ideal for indexing |