📚 What is Normalization in Databases?
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing large, complex tables into smaller, related tables and defining relationships between them.
🎯 Goals of Normalization:
- Eliminate duplicate data (e.g., repeating groups or columns)
- Ensure data dependencies make sense (each column is stored in the table whose key determines it)
- Simplify updates, inserts, and deletes
- Prevent anomalies (update, insert, delete anomalies)
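To make the anomalies concrete, here is a minimal sketch in plain Python. The rows, field names, and the replacement email address are hypothetical and chosen only for illustration.

```python
# Hypothetical denormalized rows: the customer's email is repeated on every order.
orders = [
    {"order_id": 101, "customer": "Alice", "email": "a@mail.com", "product": "Laptop"},
    {"order_id": 102, "customer": "Alice", "email": "a@mail.com", "product": "Phone"},
]

# Update anomaly: changing the email on one row leaves the other row stale,
# so the data now holds two different emails for the same customer.
orders[0]["email"] = "alice@new.example"
emails_for_alice = {row["email"] for row in orders if row["customer"] == "Alice"}

# Delete anomaly: removing all of Alice's orders also removes
# the only record that Alice exists at all.
remaining = [row for row in orders if row["customer"] != "Alice"]
known_customers = {row["customer"] for row in remaining}
```

After the update, `emails_for_alice` contains two conflicting values, and after the delete, `known_customers` is empty. An insert anomaly is the mirror image: with this layout there is no place to record a customer who has not ordered yet.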
🧱 Normal Forms (NF) – The Stages of Normalization:
Each level is stricter than the previous. The first three are most commonly applied.
✅ 1NF (First Normal Form)
- Eliminate repeating groups
- Ensure atomic values (each field contains only one value)
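✅ Good (one value per field, one row per hobby):
PersonID | Hobby |
---|---|
1 | reading |
1 | biking |
❌ Bad (several values packed into one field):
Hobbies: "reading, biking"
A list such as ["reading", "biking"] stored in a single field is also non-atomic.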
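One way to get a multi-valued field into 1NF is to store one value per row. The sketch below uses Python's built-in `sqlite3` module. The `person_hobbies` table and the `person_id` column are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person_hobbies ("
    "person_id INTEGER, hobby TEXT, PRIMARY KEY (person_id, hobby))"
)

# Hypothetical non-atomic input: several hobbies packed into one string.
raw = {1: "reading, biking"}

# Split the packed value and insert one atomic value per row.
for person_id, hobbies in raw.items():
    for hobby in hobbies.split(","):
        conn.execute(
            "INSERT INTO person_hobbies VALUES (?, ?)", (person_id, hobby.strip())
        )

rows_1nf = conn.execute(
    "SELECT person_id, hobby FROM person_hobbies ORDER BY hobby"
).fetchall()
```

Each hobby can now be queried, indexed, or deleted on its own, without string parsing.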
✅ 2NF (Second Normal Form)
- Must be in 1NF
- Remove partial dependencies (i.e., no non-key column depends on part of a composite key)
✅ Example: In a table keyed by (OrderID, ProductID), ProductName depends only on ProductID, so it moves to a separate Products table.
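A possible 2NF decomposition, sketched with `sqlite3`. The table names (`order_items_flat`, `products`, `order_items`), the columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: the key is (order_id, product_id),
    -- but product_name depends on product_id alone (a partial dependency).
    CREATE TABLE order_items_flat (
        order_id INTEGER, product_id INTEGER, product_name TEXT, quantity INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO order_items_flat VALUES
        (101, 1, 'Laptop', 1), (102, 1, 'Laptop', 2), (102, 2, 'Phone', 1);

    -- After: product_name lives with the part of the key it depends on.
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE order_items (
        order_id INTEGER,
        product_id INTEGER REFERENCES products(product_id),
        quantity INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO products SELECT DISTINCT product_id, product_name FROM order_items_flat;
    INSERT INTO order_items SELECT order_id, product_id, quantity FROM order_items_flat;
""")

products = conn.execute(
    "SELECT product_id, product_name FROM products ORDER BY product_id"
).fetchall()
items = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
```

The product name is now stored once per product instead of once per order line, while all three order lines are preserved.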
✅ 3NF (Third Normal Form)
- Must be in 2NF
- Remove transitive dependencies (i.e., non-key columns should not depend on other non-key columns)
✅ Example: If city_name is determined by city_zip, do not store both in the same table as the rest of the record. Keep city_zip there and move city_name to a separate table keyed by city_zip.
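A rough sketch of removing that transitive dependency, again with `sqlite3`. The `zip_codes` and `customers` tables and all sample values are hypothetical, and the example assumes each zip code maps to exactly one city, which is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- city_name depends on city_zip, so it is stored once per zip code.
    CREATE TABLE zip_codes (city_zip TEXT PRIMARY KEY, city_name TEXT NOT NULL);
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city_zip TEXT REFERENCES zip_codes(city_zip)
    );
    INSERT INTO zip_codes VALUES ('10001', 'New York');
    INSERT INTO customers VALUES (1, 'Alice', '10001'), (2, 'Bob', '10001');
""")

# Correcting the city name touches one row, and every customer sees the change.
conn.execute("UPDATE zip_codes SET city_name = 'New York City' WHERE city_zip = '10001'")
cities = conn.execute("""
    SELECT c.name, z.city_name
    FROM customers AS c JOIN zip_codes AS z ON z.city_zip = c.city_zip
    ORDER BY c.customer_id
""").fetchall()
```

Because the city name exists in only one place, two customers in the same zip code cannot end up with different spellings of it.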
🔄 Example Before and After Normalization:
❌ Unnormalized Table
OrderID | CustomerName | CustomerEmail | Product |
---|---|---|---|
101 | Alice | a@mail.com | Laptop |
102 | Alice | a@mail.com | Phone |
✅ After Normalization
Customers Table
CustomerID | Name | Email |
---|---|---|
1 | Alice | a@mail.com |
Orders Table
OrderID | CustomerID | Product |
---|---|---|
101 | 1 | Laptop |
102 | 1 | Phone |
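The two normalized tables above could be created along these lines with `sqlite3`. The snake_case column names and the constraints are assumptions, and the data comes from the example tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        product TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Alice', 'a@mail.com');
    INSERT INTO orders VALUES (101, 1, 'Laptop'), (102, 1, 'Phone');
""")

# A join rebuilds the original wide rows on demand.
wide_rows = conn.execute("""
    SELECT o.order_id, c.name, c.email, o.product
    FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
```

Nothing from the unnormalized table is lost: the join returns the same four columns, but Alice's name and email are stored only once.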
🧠 Why Normalize?
Benefit | Explanation |
---|---|
Reduces redundancy | Data is not unnecessarily duplicated |
Improves consistency | Changes happen in one place only |
Prevents anomalies | Avoids inconsistent or lost data on insert, update, and delete |
Makes relationships clear | Easier to manage and query |