DBMS NORMALIZATION
First Normal Form-
A given relation is called in First Normal
Form (1NF) if each cell of the table contains only an atomic value.
OR
A relation will be 1NF if it contains an atomic
value.
- It states that an Attribute of
a table cannot hold multiple values. It must hold only single-valued attribute.
- First normal form disallows the multi-valued attribute, composite attribute, and their combinations.
Example-
The following relation is not in 1NF-
Student_id |
Name |
Subjects |
100 |
Ak |
Computer Networks,
Designing |
101 |
Aman |
Database Management
System |
102 |
Anjali |
Automata,
Compiler Design |
Relation is not in 1NF
However,
·
This relation can be brought into 1NF.
·
This can be done by rewriting the relation such that each
cell of the table contains only one value.
Student_id |
Name |
Subjects |
100 |
Aka |
Computer Networks |
100 |
Akash |
Designing |
101 |
Aman |
Database Management System |
102 |
Anjali |
Automata |
102 |
Anjali |
Compiler Design |
This relation is in First Normal Form (1NF).
NOTE-
· By default, every relation is in 1NF.
· This is because formal definition of a relation states that
value of all the attributes must be atomic....{SINGLE VLUE IN E
Second normal form (2NF)
A table is said to be
in 2NF if both the following conditions hold:
· Table is in 1NF (First
normal form)
· A relation is in 2NF if it
has No Partial Dependency .
· All the non prime attribute should be fully functionl dependent on CAndidAte key.
Partial Dependency
A partial
dependency is a dependency where few attributes
of the candidate key determines non-prime attribute(s).
OR
A partial
dependency is a dependency where a portion/part of the candidate key or incomplete candidate key determines non-prime attribute(s).
Example 1 – Consider the table as following below.
STUD_NO
COURSE_NO COURSE_FEE
101 C1 1000
102 C2 1500
103 C4 2000
104 C3 1000
104 C1 1000
102 C5 2000
{Note that, there are many courses
having the same course fee. }
c1= physics....
c2=chemistry
c3=math
c4=it
c5=bio
1.COURSE_FEE cannot alone decide the
value of COURSE_NO or STUD_NO;
2,COURSE_FEE together with STUD_NO cannot decide the value of
COURSE_NO;
3.COURSE_FEE together with COURSE_NO cannot
decide the value of STUD_NO;
Hence,
COURSE_FEE would be a non-prime attribute, as it
does not belong to the one only candidate key {STUD_NO, COURSE_NO} .
But, COURSE_NO -> COURSE_FEE , i.e., COURSE_FEE is dependent on COURSE_NO, which is a proper
subset of the candidate key.
· Non-prime attribute COURSE_FEE is dependent on a proper subset
of the candidate key, which is a partial
dependency and so this relation is not in 2NF.
To convert the
above relation to 2NF,
we need to split the table into two tables such
as :
Table 1: STUD_NO, COURSE_NO
STUD_NO
COURSE_NO
101 C1
102 C2
103 C4
104 C3
104 C1
102 C5
Table 2: COURSE_NO,
COURSE_FEE
COURSE_NO
COURSE_FEE
C1 1000
C2 1500
C4 2000
C3 1000
C1 1000
C5 2000
NOTE-
· To avoid partial dependency, incomplete candidate key must not
determine any non-prime attribute.
· However, incomplete candidate key can determine prime
attributes.
· Partial dependency,will be from prime to non prime.
AB
From here,
v Prime attributes = { AB } ( Candidate Key)
v Non-prime attributes = { C,D }
Now, if we observe the given dependencies-
·
There is no partial
dependency.
Example 2 – Consider following functional dependencies in relation R (A, B , C, D ).
AB
-> C [A and B together determine C]
BC
-> D [B and C together determine D]
In the above relation, AB is the only candidate key and there is no partial dependency, i.e., any proper subset of AB doesn’t determine any non-prime attribute.
Third Normal Form (3NF)
1. A relation will be in 3NF if it is in 2NF and not contain any
transitive partial dependency.
2. 3NF is used to reduce the data duplication. It is also used to
achieve the data integrity.
3. If there is no transitive dependency for non-prime attributes,
then the relation must be in third normal form.
4.Transitive functional dependency may exist, only if there are more than one non key attribute.
Definition:-
Transitive functional dependency arises when one non key attribute is functionally dependent on another non key attribute.
Condition:-
- X
is a super key.
- Y is a prime attribute, i.e., each element of Y is part of some candidate key.
EMPLOYEE_DETAIL
table
EMP_ID |
EMP_NAME |
EMP_ZIP |
EMP_STATE |
EMP_CITY |
222 |
Rahul |
201010 |
UP |
Noida |
333 |
S |
02228 |
UK |
Dehradun |
444 |
Lalit |
60007 |
US |
Chicago |
555 |
Katharine |
06389 |
UK |
Norwich |
666 |
John |
|
MP |
Bhopal |
Super key
in the table above:
1. {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}....so on.
2. Candidate
key: {EMP_ID}
3. Non-prime
attributes: In the given table, all attributes except EMP_ID are
non-prime
Note:- Here, EMP_STATE & EMP_CITY dependent on
EMP_ZIP.
EMP_ZIP dependent on EMP_ID.
The non-prime attributes (EMP_STATE, EMP_CITY) transitively dependent on super key(EMP_ID). It violates the rule of third normal form.
That's why we need to move the EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary key.
EMPLOYEE table:
EMP_ID |
EMP_NAME |
EMP_ZIP |
222 |
Rahul |
201010 |
333 |
S |
02228 |
444 |
Lalit |
60007 |
555 |
Katharine |
06389 |
666 |
John |
462007 |
EMPLOYEE_ZIP table:
EMP_ZIP |
EMP_STATE |
EMP_CITY |
201010 |
UP |
Noida |
02228 |
UK |
dehradun |
60007 |
US |
Chicago |
06389 |
UK |
Norwich |
462007 |
MP |
Bhopal |
Notes by:- edukosl
0 Comments
If you have any doubts let me know