Database.Middle.Explain how index can contain data ?

🧠 Think of an Index Like a Smart Table of Contents

In a book:

  • A normal table of contents gives page numbers — you still have to go to the page to read it.
  • A super table of contents not only gives the page number but also shows the content right there.

That’s the idea behind an index that contains data: it’s not just a pointer — it may store actual column values too.

🔍 So How Does That Work in a Database?

In most databases, an index is a tree (like a B-tree):

  • It’s sorted by the indexed column(s).
  • Each entry stores:
    • The key (e.g., email)
    • A pointer to the full row in the table (called a TID or rowid)

But some indexes also store extra column values — this is how Index-Only Scans become possible.


✅ PostgreSQL Example

Let’s say we run:

CREATE INDEX idx_email_name ON users(email, name);

In this index, each entry might look like:

(email = 'alice@example.com', name = 'Alice', row_pointer = 42)

So now if you run:

SELECT name FROM users WHERE email = 'alice@example.com';

PostgreSQL can say:

“I already have name in the index. No need to go to the table!”

This works only if:

  1. All needed columns (email, name) are already in the index.
  2. The row is visible (checked using PostgreSQL’s visibility map).

🔧 MySQL Example (InnoDB)

In InnoDB, secondary indexes can store:

  • The index key
  • The primary key value
  • But not all the columns — so it may still need to fetch the full row

✅ But with covering indexes, you add all needed columns:

CREATE INDEX idx_email_name ON users(email, name);

Now if your query only asks for email and name, it’s covered by the index — no extra lookup needed.


🔬 Summary

ConceptExplanation
Index contains dataIt stores not just the key but also extra column values
Used in Index-Only ScansBecause all the needed data is already in the index
Benefits✅ Less disk I/O, ✅ Faster reads
SQL TipAdd extra columns to index to make it covering