Normalization
- Refine data
- Attain a good DB design
- Reduce redundancy
- Minimize storage space requirements
- Eliminate data anomalies

Non-loss decomposition vs. lossy decomposition
Unnormalized table

Functional dependency
A → B
A: DETERMINANT
B: DETERMINED
Example value pairs (A-B): 1-a 1-a 1-a 1-a 1-b 2-a 2-b 1-a
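
A functional dependency A → B can be checked against actual rows: whenever two rows agree on the determinant, they must also agree on the determined attribute. The sketch below is a minimal illustration of that test. The function name `fd_holds` and the sample rows are hypothetical and are not taken from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)   # determinant values
        val = tuple(row[a] for a in rhs)   # determined values
        # setdefault stores the first value seen for this determinant;
        # any later, different value breaks the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample data (assumption, not from the slides).
ok_rows = [{"A": 1, "B": "a"}, {"A": 1, "B": "a"}, {"A": 2, "B": "b"}]
bad_rows = [{"A": 1, "B": "a"}, {"A": 1, "B": "b"}]

print(fd_holds(ok_rows, ["A"], ["B"]))
print(fd_holds(bad_rows, ["A"], ["B"]))
```

A check like this can only refute a dependency on the data at hand. Whether an FD truly holds is a statement about the meaning of the attributes, not about one sample.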

1NF
- Identify and remove repeating groups
- Identify and remove non-atomic attributes
- Identify the keys for the table

FIRST(sid, status, city, pid, qty)

Sid   Status   City   Pid   Qty
s1    20       Jal    p1    100
                      p2    150
s2    10       Asr    p1    200
                            250
s3                          300
s4                    p3
               Jal    p4

2NF
A relation is in 2NF iff:
- It is in 1NF
- Every non-key attribute is fully functionally dependent on the primary key (remove partial dependencies)
Anomalies: INSERT, UPDATE
Decompose the relation FIRST into:
SECOND(sid, status, city)
SP(sid, pid, qty)
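
The split of FIRST into SECOND and SP can be tried out on a few rows to see that it is non-loss: joining the two projections on sid gives back exactly the original rows. The sketch below is a minimal illustration. The helper names `project` and `natural_join` are hypothetical, and the three sample rows are an assumption loosely based on the first rows of the FIRST table.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each row becomes a tuple of (attr, value) pairs."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Natural join of two sets of (attr, value) tuples on their common attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                merged = {**ld, **rd}
                out.add(tuple(sorted(merged.items())))
    return out


# Assumed sample rows for FIRST(sid, status, city, pid, qty).
first = [
    {"sid": "s1", "status": 20, "city": "Jal", "pid": "p1", "qty": 100},
    {"sid": "s1", "status": 20, "city": "Jal", "pid": "p2", "qty": 150},
    {"sid": "s2", "status": 10, "city": "Asr", "pid": "p1", "qty": 200},
]

second = project(first, ["sid", "status", "city"])   # SECOND(sid, status, city)
sp = project(first, ["sid", "pid", "qty"])           # SP(sid, pid, qty)

original = {tuple(sorted(r.items())) for r in first}
rejoined = natural_join(second, sp)
print(rejoined == original)
```

Note that SECOND stores the status and city of s1 once instead of once per part, which is where the redundancy of FIRST goes away.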

3NF
Eliminate transitive dependencies.
SECOND still has anomalies (DELETE, UPDATE), because status depends on sid only transitively, through city.
That is why SECOND is further decomposed into:
SC(sid, city)
CS(city, status)

BCNF
A relation is said to be in BCNF iff:
- It is in 3NF
- All its determinants (i.e. the attributes on which other attributes depend) are candidate keys
FIRST(sid, status, city, pid, qty): not in BCNF
SECOND(sid, status, city): not in BCNF (the determinant city is not a candidate key)
SP(sid, pid, qty), SC(sid, city), CS(city, status): in BCNF
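
The BCNF condition can be tested mechanically for small schemas: every attribute set that determines something new inside the relation must determine the whole relation. The sketch below is one possible brute-force check, not a standard library routine. The FD list (sid -> city, city -> status, {sid, pid} -> qty) is an assumption read off the decomposition into SC, CS and SP, and the names `closure` and `bcnf_violations` are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Determinants inside schema that determine something new but are not superkeys."""
    schema = set(schema)
    bad = []
    for size in range(1, len(schema)):
        for lhs in combinations(sorted(schema), size):
            determined = closure(lhs, fds) & schema
            if determined - set(lhs) and determined != schema:
                bad.append(lhs)
    return bad


# Assumed FDs for the supplier example.
fds = [(["sid"], ["city"]), (["city"], ["status"]), (["sid", "pid"], ["qty"])]

print(bcnf_violations(["sid", "status", "city"], fds))   # SECOND
print(bcnf_violations(["sid", "pid", "qty"], fds))       # SP
print(bcnf_violations(["sid", "city"], fds))             # SC
print(bcnf_violations(["city", "status"], fds))          # CS
```

The check enumerates every subset of the schema, so it is only practical for relations with a handful of attributes.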

Trivial: If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.
Non-trivial: If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.
Examples:
A → A (trivial)
AB → A (trivial)
AB → C (non-trivial)
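
The trivial versus non-trivial distinction is just a subset test, which a one-line function can express. The sketch below assumes single-letter attribute names so that a string such as "AB" stands for the set {A, B}; the name `is_trivial` is hypothetical.

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)


print(is_trivial("A", "A"))
print(is_trivial("AB", "A"))
print(is_trivial("AB", "C"))
```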
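
More examples
R = {A, B, C, D, E}
{A->B, B->C, C->D, D->E, E->A}
The FDs form a cycle, so each single attribute is a candidate key and every attribute is prime.
{AB->C, C->D, D->E, A->B}
A is the only candidate key.
A: prime attribute
B, C, D, E: non-prime attributes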
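
Prime and non-prime attributes follow from the candidate keys, and the candidate keys follow from attribute closures. The sketch below works through both FD sets over R = {A, B, C, D, E} by brute force. It assumes single-letter attribute names, and the helper names `closure` and `candidate_keys` are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    # Smaller sets are tried first, so any superset of a found key is skipped.
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) >= schema:
                keys.append(s)
    return keys


R = "ABCDE"
F1 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
F2 = [("AB", "C"), ("C", "D"), ("D", "E"), ("A", "B")]

keys1 = candidate_keys(R, F1)
keys2 = candidate_keys(R, F2)
prime2 = set().union(*keys2)      # attributes that occur in some candidate key
non_prime2 = set(R) - prime2

print(keys1)
print(keys2, sorted(non_prime2))
```

For the second FD set, A never appears on a right-hand side, so every key must contain A, and A alone already determines all of R.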

4NF
A relation is said to be in 4NF iff:
- It is in BCNF (and therefore in 3NF)
- It does not contain any non-trivial multivalued dependencies, other than those whose determinant is a superkey

Multivalued dependency
A dependency in which one attribute's value is a multivalued fact about another attribute.
a) {customer_name, address}
   customer_name ->-> address
   address <-<- customer_name
b) {Course, student_name, Text_book}
   Course ->-> student_name
   Course ->-> Text_book
c) {Emp_id, Language, Skill}
   Emp_id ->-> Language
   Emp_id ->-> Skill
   Keeping both facts in one relation will result in spurious tuples.

EMPID   LANGUAGE   SKILL
101     English    Teaching
        Hindi      Conversation
202                Singing

Rule to transform a relation into 4NF
A relation R having A, B, C as attributes can be non-loss decomposed into two projections R1(A, B) and R2(A, C) iff:
the MVD A ->-> B | C holds in R
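
The rule can be observed on a small example: when the MVD holds, joining the projections R1(A, B) and R2(A, C) returns R exactly; when it does not hold, the join produces extra (spurious) tuples. The sketch below is a minimal illustration. The function names `mvd_holds` and `join_of_projections` are hypothetical, and the rows are assumed sample data built from the values in the EMPID / LANGUAGE / SKILL table.

```python
def mvd_holds(rows, a, b, c):
    """Check A ->-> B | C: for any two rows with the same A, the row
    combining B from the first and C from the second must also exist."""
    tuples = {(r[a], r[b], r[c]) for r in rows}
    for (a1, b1, _c1) in tuples:
        for (a2, _b2, c2) in tuples:
            if a1 == a2 and (a1, b1, c2) not in tuples:
                return False
    return True


def join_of_projections(rows, a, b, c):
    """Natural join of the projections R1(A, B) and R2(A, C) on A."""
    r1 = {(r[a], r[b]) for r in rows}
    r2 = {(r[a], r[c]) for r in rows}
    return {(x, y, z) for (x, y) in r1 for (x2, z) in r2 if x == x2}


def as_set(rows, a, b, c):
    return {(r[a], r[b], r[c]) for r in rows}


# Assumed data: every language of employee 101 paired with every skill.
full = [
    {"EMPID": 101, "LANGUAGE": lang, "SKILL": skill}
    for lang in ("English", "Hindi")
    for skill in ("Teaching", "Conversation")
]
# Assumed data: only two of the four combinations are stored.
partial = [
    {"EMPID": 101, "LANGUAGE": "English", "SKILL": "Teaching"},
    {"EMPID": 101, "LANGUAGE": "Hindi", "SKILL": "Conversation"},
]

cols = ("EMPID", "LANGUAGE", "SKILL")
print(mvd_holds(full, *cols), join_of_projections(full, *cols) == as_set(full, *cols))
print(mvd_holds(partial, *cols), len(join_of_projections(partial, *cols)))
```

In the second case the join yields four tuples from two stored rows, so the decomposition would be lossy for that data.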

5NF
A relation is said to be in 5NF iff:
- It is in 4NF
- It cannot be non-loss decomposed any further