4NF (Multivalued Dependency), and 5NF (Join Dependency)
Review
Superkey – a set of attributes that uniquely identifies each tuple in a relation
Candidate key – a minimal superkey
Primary key – a chosen candidate key
Secondary key – any candidate key not chosen as the primary key
Prime attribute – an attribute that is part of a candidate key (key column)
Non-prime attribute – an attribute that is not part of any candidate key (non-key column)
Narotama University
Review (cont.)
1NF: Eliminate repeating groups. Make a separate table for each set of related attributes, and give each table a primary key.
2NF: Eliminate redundant data. Each non-key attribute must be functionally dependent on the whole primary key. If an attribute depends on only part of a multi-attribute (composite) key, move it to a separate table.
3NF: Eliminate columns not dependent on the key. If attributes do not contribute to a description of the key, move them to a separate table. Any transitive dependencies are moved into a smaller table.
BCNF: Every determinant in the table is a candidate key. If there are non-trivial dependencies between candidate key attributes, separate them out into distinct tables.
The normal forms are cumulative: if a model is in 3NF, it is by definition also in 2NF and 1NF.
4NF Definition
A relation R is in 4NF if and only if, for every one of its nontrivial multivalued dependencies X ->-> Y, X is a superkey; that is, X is either a candidate key or a superset thereof.
A nontrivial MVD X ->-> Y means that:
  Y is not a subset of X, and
  X and Y together are not all the attributes of R.
Decomposition into 4NF
If X ->-> Y is a 4NF violation for relation R, we can decompose R using the same technique as for BCNF.
  XY (the union of the attributes in X and Y) is one of the decomposed relations.
  All attributes of R except Y – X form the other.
Decomposition into 4NF: Method
Find a 4NF violation in R, say A ->-> B, where A is not a superkey.
If there is such a 4NF violation, break the schema of R into two schemas:
  R1, whose schema is the A's and B's.
  R2, whose schema is the A's and all attributes of R that are not among the A's or B's.
Find the FDs and MVDs that hold in R1 and R2.
Recursively decompose R1 and R2 with respect to their projected dependencies.
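The schema split in the method above can be sketched in Python. This is a minimal illustration only: the function name `split_on_mvd` and the set-based representation of a schema are assumptions made here, and the sketch covers just the split itself, not the projection of FDs and MVDs onto R1 and R2.

```python
def split_on_mvd(schema, lhs, rhs):
    """Split `schema` on a violating MVD lhs ->-> rhs.

    Returns (R1, R2), where R1 holds the lhs and rhs attributes and
    R2 holds the lhs attributes plus everything in neither lhs nor rhs.
    (Hypothetical helper, not taken from the slides.)
    """
    schema, lhs, rhs = frozenset(schema), frozenset(lhs), frozenset(rhs)
    r1 = lhs | rhs
    r2 = lhs | (schema - lhs - rhs)
    return r1, r2


r1, r2 = split_on_mvd({"A", "B", "C", "D"}, {"A"}, {"B"})
print(sorted(r1))  # ['A', 'B']
print(sorted(r2))  # ['A', 'C', 'D']
```

The recursive step would call the same function again on R1 and R2 once their projected dependencies are known.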
4NF Decomposition Example
Drinkers(name, addr, phones, beersLiked)
FD: name -> addr
MVDs: name ->-> phones
      name ->-> beersLiked
Key is {name, phones, beersLiked}.
All dependencies violate 4NF.
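The claim that every dependency of Drinkers violates 4NF can be checked mechanically against the definition. The sketch below assumes the key and dependencies exactly as listed on this slide; the helper names (`is_superkey`, `is_trivial_mvd`, `violates_4nf`) are illustrative, not from the slides.

```python
def is_superkey(attrs, candidate_keys):
    """True if `attrs` contains at least one candidate key."""
    return any(set(key) <= set(attrs) for key in candidate_keys)


def is_trivial_mvd(schema, lhs, rhs):
    """X ->-> Y is trivial if Y is a subset of X or X and Y cover the schema."""
    return set(rhs) <= set(lhs) or (set(lhs) | set(rhs)) == set(schema)


def violates_4nf(schema, lhs, rhs, candidate_keys):
    """A nontrivial MVD whose left side is not a superkey violates 4NF."""
    return (not is_trivial_mvd(schema, lhs, rhs)
            and not is_superkey(lhs, candidate_keys))


schema = {"name", "addr", "phones", "beersLiked"}
candidate_keys = [{"name", "phones", "beersLiked"}]
# The FD name -> addr is included because every FD is also an MVD.
dependencies = [
    ({"name"}, {"addr"}),
    ({"name"}, {"phones"}),
    ({"name"}, {"beersLiked"}),
]
results = [violates_4nf(schema, lhs, rhs, candidate_keys)
           for lhs, rhs in dependencies]
print(results)  # [True, True, True]
```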
4NF Decomposition Example (cont.)
Decompose using name -> addr:
Drinkers1(name, addr)
  In 4NF; the only dependency is name -> addr.
Drinkers2(name, phones, beersLiked)
  Not in 4NF. The MVDs name ->-> phones and name ->-> beersLiked apply.
  There are no FDs, so all three attributes form the key.
4NF Decomposition Example (cont.)
Decompose Drinkers2.
Either MVD, name ->-> phones or name ->-> beersLiked, tells us to decompose into:
Drinkers3(name, phones)
Drinkers4(name, beersLiked)
MVD Example
Drinkers(name, addr, phones, beersLiked)
A drinker's phones are independent of the beers they like, so name ->-> phones and name ->-> beersLiked.
Thus, each of a drinker's phones appears with each of the beers they like, in all combinations.
MVD Example (cont.): Tuples Implied by name ->-> phones
If we have these tuples:
name  addr  phones  beersLiked
sue   a     p1      b1
sue   a     p2      b2
then these tuples must also be in the relation:
sue   a     p2      b1
sue   a     p1      b2
Example
Drinkers(name, addr, phones, beersLiked) with MVD name ->-> phones.
If Drinkers has the two tuples:
name  addr  phones  beersLiked
sue   a     p1      b1
sue   a     p2      b2
it must also have the same tuples with the phones components swapped:
sue   a     p2      b1
sue   a     p1      b2
Note: we must check this condition for all pairs of tuples that agree on name, not just one pair.
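The swap condition, checked over all pairs of tuples as the note requires, can be written as a short Python test. This is a sketch under the assumption that a relation instance is a list of dictionaries with identical keys; `mvd_holds` is a hypothetical helper name.

```python
from itertools import product


def mvd_holds(rows, lhs, rhs):
    """Check lhs ->-> rhs on a relation instance given as a list of dicts.

    For every ordered pair (t, u) that agrees on lhs, the tuple that takes
    its rhs values from t and all remaining values from u must be present.
    """
    present = {tuple(sorted(r.items())) for r in rows}
    for t, u in product(rows, repeat=2):
        if all(t[a] == u[a] for a in lhs):
            swapped = {**u, **{a: t[a] for a in rhs}}
            if tuple(sorted(swapped.items())) not in present:
                return False
    return True


def row(name, addr, phones, beers):
    return {"name": name, "addr": addr, "phones": phones, "beersLiked": beers}


two_tuples = [row("sue", "a", "p1", "b1"), row("sue", "a", "p2", "b2")]
four_tuples = two_tuples + [row("sue", "a", "p2", "b1"),
                            row("sue", "a", "p1", "b2")]

print(mvd_holds(two_tuples, ["name"], ["phones"]))   # False
print(mvd_holds(four_tuples, ["name"], ["phones"]))  # True
```

With only the first two tuples the MVD fails, because the swapped tuples are missing; adding them makes it hold.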
Example (cont’d)
Violates 4NF because CUSTOMER is not a superkey (and CUSTOMER ->-> LOAN_NO is non-trivial).
Solution: [decomposition figure not recovered]
HIGHER NORMAL FORMS
1NF, 2NF, 3NF, BCNF: functional dependencies
4NF: multivalued dependencies
5NF: join dependencies
ASSUMPTIONS
24-HOUR FLIGHT-TIMETABLE, ALL FLIGHTS EVERY DAY
ALL PLANES TAKE-OFF AND LAND (BUT DO NOT CRASH)
NO AIRPORT IS LANDING-ONLY & NO AIRPORT IS TAKE-OFF-ONLY
  TTAB (ORG) = TTAB (DST)
THERE IS AT LEAST ONE TIME DELAY ENTRY FOR EVERY FLIGHT
SIMILARLY IN WEATHER REPORT HISTORY
IF NO TWO FLIGHTS CAN TAKE OFF AT THE SAME TIME IN THE SAME AIRPORT, WES CAN BE POSTED TO FLIGHT AND WEATHER@ORIGIN ELIMINATED
AWARD
[figure: universities (Old Town, New City), disciplines (Computing, Mathematics), degrees (BSc, PhD)]
teaches (UNIVERSITY, DISCIPLINE)
is_read_for (DISCIPLINE, DEGREE)
awards (UNIVERSITY, DEGREE)
teaches (NewCity, Computing) = true
awards (NewCity, PhD) = true
is_read_for (Computing, BSc) = true
FROM (NewCity teaches Computing) and (Computing is_read_for BSc)
IT DOES NOT FOLLOW THAT NewCity awards BSc for_reading Computing
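The illogical conjunction can be made concrete with the three facts stated on this slide. The sketch below joins teaches and is_read_for on DISCIPLINE and compares the resulting (UNIVERSITY, DEGREE) pairs with awards; the variable names are illustrative.

```python
# Only the three facts given on the slide are used.
teaches = {("NewCity", "Computing")}
is_read_for = {("Computing", "BSc")}
awards = {("NewCity", "PhD")}

# Natural join of teaches and is_read_for on the shared DISCIPLINE column.
joined = {(univ, disc, degree)
          for (univ, disc) in teaches
          for (disc2, degree) in is_read_for
          if disc == disc2}

# (UNIVERSITY, DEGREE) pairs the join suggests, minus those actually awarded.
implied_awards = {(univ, degree) for (univ, _, degree) in joined}
spurious = implied_awards - awards

print(joined)    # {('NewCity', 'Computing', 'BSc')}
print(spurious)  # {('NewCity', 'BSc')}
```

The join produces a NewCity/BSc combination that awards does not contain, which is why the two binary facts alone cannot stand in for the three-way fact.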
Join Dependency
JD * (R1, R2, R3, ..., Rm) holds in R iff R = join (R1, R2, R3, ..., Rm), where each Ri is a projection of R.
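The definition can be tested directly on a relation instance: project onto each component, natural-join the projections, and compare with the original. The sketch below is one possible encoding, with rows stored as frozensets of (attribute, value) pairs; the function names are assumptions, and the sample data reuses the Drinkers tuples from the MVD example.

```python
from functools import reduce


def project(rel, attrs):
    """Project a set of rows (frozensets of (attr, value) pairs) onto attrs."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def natural_join(left, right):
    """Join two relations on whatever attributes they have in common."""
    out = set()
    for l_row in left:
        l_dict = dict(l_row)
        for r_row in right:
            # Rows combine only if they agree on every shared attribute.
            if all(l_dict.get(a, v) == v for a, v in r_row):
                out.add(frozenset({**l_dict, **dict(r_row)}.items()))
    return out


def jd_holds(rows, components):
    """True if the join of the projections onto `components` equals rows."""
    rel = {frozenset(r.items()) for r in rows}
    joined = reduce(natural_join, (project(rel, c) for c in components))
    return joined == rel


def row(name, addr, phones, beers):
    return {"name": name, "addr": addr, "phones": phones, "beersLiked": beers}


components = [{"name", "addr"}, {"name", "phones"}, {"name", "beersLiked"}]
two_tuples = [row("sue", "a", "p1", "b1"), row("sue", "a", "p2", "b2")]
four_tuples = two_tuples + [row("sue", "a", "p2", "b1"),
                            row("sue", "a", "p1", "b2")]

print(jd_holds(four_tuples, components))  # True
print(jd_holds(two_tuples, components))   # False
```

In the two-tuple case the join regenerates all four phone/beer combinations, so it is strictly larger than the original instance and the JD does not hold.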
Fifth Normal Form
Prevents illogical conjunction of facts.
A relation R is in 5NF iff for every nontrivial JD * (R1, R2, R3, ..., Rm) in R, every Ri is a superkey for R.
If JD * ( , ) holds for R, then R is not in 5NF if [...] does not contain a key.
If JD * ( , ) holds for R, then R is in 5NF if [...]
AWARD
[tables: AWARD shown as tables with columns (UNIVERSITY, DISCIPLINE), (DISCIPLINE, DEGREE) and (UNIVERSITY, DEGREE); values: Old Town, New City; Computing, Mathematics; BSc, PhD]
Candidate keys: NAME or CODE
Join dependencies:
  JD1 * ((NAME, CODE, TEACHING), (NAME, RESEARCH))
  JD2 * ((NAME, CODE, RESEARCH), (NAME, TEACHING))
  JD3 * ((NAME, CODE, TEACHING), (CODE, RESEARCH))
  JD4 * ((NAME, CODE, RESEARCH), (CODE, TEACHING))
  JD5 * ((NAME, CODE), (NAME, TEACHING), (CODE, RESEARCH))
All projections in JD1 to JD5 are superkeys for RANKING, so RANKING is in 5NF.
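The superkey condition for RANKING can be verified component by component. The sketch below encodes the candidate keys and the five join dependencies as listed on this slide; the data structure and the helper name `is_superkey` are illustrative choices.

```python
candidate_keys = [{"NAME"}, {"CODE"}]

join_dependencies = {
    "JD1": [("NAME", "CODE", "TEACHING"), ("NAME", "RESEARCH")],
    "JD2": [("NAME", "CODE", "RESEARCH"), ("NAME", "TEACHING")],
    "JD3": [("NAME", "CODE", "TEACHING"), ("CODE", "RESEARCH")],
    "JD4": [("NAME", "CODE", "RESEARCH"), ("CODE", "TEACHING")],
    "JD5": [("NAME", "CODE"), ("NAME", "TEACHING"), ("CODE", "RESEARCH")],
}


def is_superkey(attrs):
    """A component is a superkey if it contains some candidate key."""
    return any(key <= set(attrs) for key in candidate_keys)


# Collect every component that fails the 5NF condition.
non_superkey_components = [
    (jd_name, component)
    for jd_name, components in join_dependencies.items()
    for component in components
    if not is_superkey(component)
]

print(non_superkey_components)  # []
```

An empty list means every component contains NAME or CODE, which matches the slide's conclusion for the join dependencies listed.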
Summary
A multivalued dependency is a statement that two sets of attributes in a relation have sets of values that appear in all possible combinations.
If a relation is in 4NF, then every nontrivial MVD is really an FD with a superkey on the left.
References
http://www.cs.sjsu.edu/~lee/cs157b/Spring09Presentation/4NF_and_Multivalued_Dependency_by_Kristina_Miguel.ppt
http://www.cs.sjsu.edu/faculty/lee/cs157/4NF%20and%205NF.ppt