Normalization up to 3NF (Third Normal Form)
What is Normalization?
Normalization is a process in DBMS to organize data in a database to reduce redundancy and improve data integrity.
Normal Forms Overview
-
1NF (First Normal Form): Eliminate repeating groups; ensure atomicity of data.
-
2NF (Second Normal Form): Remove partial dependency (only applicable if relation is in 1NF and has composite keys).
-
3NF (Third Normal Form): Remove transitive dependency.
Step-by-step Normalization
Given Relation (Unnormalized)
Suppose we have this table (relation) StudentDetails:
| StudentID | StudentName | Course | Instructor | InstructorPhone |
|---|---|---|---|---|
| 101 | Ravi Kumar | DBMS | Prof. Sharma | 9876543210 |
| 101 | Ravi Kumar | OS | Prof. Gupta | 9123456789 |
| 102 | Priya Singh | DBMS | Prof. Sharma | 9876543210 |
Step 1: 1NF (First Normal Form)
-
Rule: Each column should have atomic values (no multiple values in a single field), and each row should be unique.
-
Check: Our table already has atomic values.
-
Remove repeating groups: The table is okay for 1NF.
Step 2: 2NF (Second Normal Form)
-
Rule: Must be in 1NF.
-
No partial dependency (meaning: no attribute should depend on only part of a composite key).
-
Find candidate key: Here,
StudentID + Courseuniquely identifies a row. -
Check partial dependency:
-
StudentNamedepends only onStudentID(part of the key) → Partial dependency. -
InstructorandInstructorPhonedepend onCourse(other part of key).
-
-
Solution: Split the table into two tables to remove partial dependency.
Table 1: Student
| StudentID | StudentName |
|---|---|
| 101 | Ravi Kumar |
| 102 | Priya Singh |
Table 2: CourseDetails
| Course | Instructor | InstructorPhone |
|---|---|---|
| DBMS | Prof. Sharma | 9876543210 |
| OS | Prof. Gupta | 9123456789 |
Table 3: StudentCourses
| StudentID | Course |
|---|---|
| 101 | DBMS |
| 101 | OS |
| 102 | DBMS |
Step 3: 3NF (Third Normal Form)
-
Rule: Must be in 2NF.
-
No transitive dependency (i.e., non-key attributes should not depend on other non-key attributes).
Check Table 2 (CourseDetails):
-
InstructorPhonedepends onInstructor(not on the keyCourse) → This is a transitive dependency.
Solution: Split further into:
Table 2a: Course
| Course | Instructor |
|---|---|
| DBMS | Prof. Sharma |
| OS | Prof. Gupta |
Table 2b: Instructor
| Instructor | InstructorPhone |
|---|---|
| Prof. Sharma | 9876543210 |
| Prof. Gupta | 9123456789 |
Final Tables after 3NF
-
Student
| StudentID | StudentName |
|-----------|-------------| -
Instructor
| Instructor | InstructorPhone | -
Course
| Course | Instructor | -
StudentCourses
| StudentID | Course |
Summary:
| Normal Form | What is Removed | Result |
|---|---|---|
| 1NF | Non-atomic values | Atomic columns only |
| 2NF | Partial dependency | Attributes depend on full primary key |
| 3NF | Transitive dependency | No attribute depends on non-key attributes |
Why Normalize?
-
Avoid data redundancy (no duplicate data).
-
Avoid update anomalies (changing data at one place only).
-
Data integrity and consistency.