Multivalued Dependencies
What functional dependencies are obeyed?

name      area_code  phone     email                   birthdate
Josh      253        549-6382  nahumjos@msu.edu        1988-11-01
          517        352-8593
Hancheng  382        543-5436  grading_sux@aol.com     1999-12-31
                               best_teacher@gmail.com
Tyler                123-4567  josh_is_silly@msu.edu   1965-03-14

1. name → area_code phone
2. name → email
3. name → birthdate
4. None of the above

Note: the data is made up, so don't get any presents.
name →→ area_code phone
name →→ email

name      area_code  phone     email
Josh      253        123-4567  nahumjos@msu.edu
          517        111-1111  best_teacher@gmail.com
Hancheng  382        543-5436  grading_sux@aol.com
Tyler                          josh_is_silly@msu.edu

name      birthdate
Josh      1988-11-01
Hancheng  1999-12-31
Tyler     1965-03-14

But there is still redundancy! There are multiple combinations of phone numbers and emails.

name →→ area_code phone
name →→ email

name      email
Josh      nahumjos@msu.edu
          best_teacher@gmail.com
Hancheng  grading_sux@aol.com
Tyler     josh_is_silly@msu.edu

name      area_code  phone
Josh      253        123-4567
          517        111-1111
Hancheng  382        543-5436
Tyler
Multivalued Dependency

A multivalued dependency (MVD) is a statement about some relation R that when you fix the values for one set of attributes, the values in certain other attributes are independent of the values of all the other attributes in the relation.

A1, A2, ..., An →→ B1, B2, ..., Bm holds for a relation R if, when we restrict ourselves to the tuples of R that have particular values for each of the A's, the set of values we find among the B's is independent of the set of values we find among the attributes of R that are not among the A's or B's.

This MVD holds if, for each pair of tuples t and u of R that agree on all the A's, we can find in R some tuple v that agrees:
1. With both t and u on the A's
2. With t on the B's
3. With u on all attributes of R that are not among the A's or B's
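The tuple-based definition above can be checked mechanically on a small instance. The following is a minimal sketch, not a definitive implementation: the function name mvd_holds, the list-of-dicts representation, and the sample rows are all assumptions made for illustration. It tests one instance only, so it can refute an MVD but cannot prove that the MVD holds for every legal instance of R.

```python
from itertools import product


def mvd_holds(rows, lhs, rhs):
    """Return True if the MVD lhs ->> rhs holds in this instance.

    rows: list of dicts that all share the same keys (the attributes of R).
    lhs, rhs: lists of attribute names (the A's and the B's).
    """
    if not rows:
        return True
    attrs = list(rows[0].keys())
    # Attributes of R that are not among the A's or B's.
    rest = [a for a in attrs if a not in lhs and a not in rhs]
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t, u in product(rows, repeat=2):
        if any(t[a] != u[a] for a in lhs):
            continue  # t and u do not agree on the A's
        # Build the required tuple v: like t on the A's and B's,
        # like u on everything else.
        v = dict(t)
        for a in rest:
            v[a] = u[a]
        if tuple(v[a] for a in attrs) not in present:
            return False
    return True


# Hypothetical sample data: Josh has two phones and two emails,
# and every phone/email combination is present.
rows = [
    {"name": "Josh", "area_code": "253", "phone": "123-4567", "email": "nahumjos@msu.edu"},
    {"name": "Josh", "area_code": "253", "phone": "123-4567", "email": "best_teacher@gmail.com"},
    {"name": "Josh", "area_code": "517", "phone": "111-1111", "email": "nahumjos@msu.edu"},
    {"name": "Josh", "area_code": "517", "phone": "111-1111", "email": "best_teacher@gmail.com"},
    {"name": "Hancheng", "area_code": "382", "phone": "543-5436", "email": "grading_sux@aol.com"},
]

print(mvd_holds(rows, ["name"], ["area_code", "phone"]))  # True
# Drop two of Josh's combinations and the MVD no longer holds.
print(mvd_holds([rows[0], rows[3], rows[4]], ["name"], ["email"]))  # False
```

The second call fails because, with only two of the four Josh tuples present, the tuple v that pairs one phone with the other email is missing.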
X →→ Y

X   Y   rest
X1  Y1  R1
X1  Y2  R2
...
Rules regarding MVD's

Trivial MVD's: A1, A2, ..., An →→ B1, B2, ..., Bm holds in any relation if {B1, B2, ..., Bm} is a subset of {A1, A2, ..., An}.

Transitive Rule: if A1, A2, ..., An →→ B1, B2, ..., Bm and B1, B2, ..., Bm →→ C1, C2, ..., Ck, then A1, A2, ..., An →→ C1, C2, ..., Ck.

FD Promotion: every FD is an MVD. If A1, A2, ..., An → B1, B2, ..., Bm, then A1, A2, ..., An →→ B1, B2, ..., Bm.

Complementation Rule: if A1, A2, ..., An →→ B1, B2, ..., Bm, then A1, A2, ..., An →→ C1, C2, ..., Ck, where the C's are all attributes of R that are not among the A's and B's.
No Splitting Rule for MVD's

name →→ area_code phone

cannot be split into:

name →→ phone
name →→ area_code

Why? Because area_code and phone together form a unit. If you broke them apart, you would make area_code and phone independent of each other, and all possible combinations of the two would need to be present.
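To see the failed split on concrete data, here is a small sketch under stated assumptions: the helper mvd_holds, the tuple layout, and the sample rows are hypothetical. It uses an equivalent formulation of the MVD condition: within each group of tuples sharing the same left-side values, the (B values, remaining values) pairs must form a full cross product.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, attrs, lhs, rhs):
    """Check lhs ->> rhs on rows given as tuples ordered like attrs."""
    index = {a: i for i, a in enumerate(attrs)}
    rest = [a for a in attrs if a not in lhs and a not in rhs]

    def pick(row, names):
        return tuple(row[index[a]] for a in names)

    # Group the (B values, rest values) pairs by their A values.
    groups = defaultdict(set)
    for row in rows:
        groups[pick(row, lhs)].add((pick(row, rhs), pick(row, rest)))

    for pairs in groups.values():
        b_values = {b for b, _ in pairs}
        rest_values = {r for _, r in pairs}
        # Independence means every B value appears with every rest value.
        if pairs != set(product(b_values, rest_values)):
            return False
    return True


attrs = ["name", "area_code", "phone", "email"]
rows = [
    ("Josh", "253", "123-4567", "nahumjos@msu.edu"),
    ("Josh", "253", "123-4567", "best_teacher@gmail.com"),
    ("Josh", "517", "111-1111", "nahumjos@msu.edu"),
    ("Josh", "517", "111-1111", "best_teacher@gmail.com"),
]

print(mvd_holds(rows, attrs, ["name"], ["area_code", "phone"]))  # True
print(mvd_holds(rows, attrs, ["name"], ["phone"]))               # False
print(mvd_holds(rows, attrs, ["name"], ["area_code"]))           # False
```

For name →→ phone to hold here, the relation would also need tuples such as (Josh, 253, 111-1111, ...), pairing each phone with every area code, which is exactly the nonsense combination the slide warns about.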
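Proving All FD's are MVD's

If X → Y then X →→ Y

X   Y         rest
X1  Y1        R1
X1  Y2 == Y1  R2
...

Take any two tuples t and u that agree on X. The FD forces them to agree on Y as well (Y2 == Y1), so the tuple v required by the MVD can be u itself: it agrees with t and u on X, with t on Y, and with u on the rest.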
Fourth Normal Form

This form avoids redundancy regarding multivalued dependencies, and is basically identical in approach to Boyce-Codd Normal Form (BCNF), with MVD's in place of FD's.

A relation R is in fourth normal form (4NF) if whenever A1, A2, ..., An →→ B1, B2, ..., Bm is a nontrivial MVD, {A1, A2, ..., An} is a superkey.
Decomposition into 4NF

Input: A relation R0 with a set of functional and multivalued dependencies S0.

Output: A decomposition of R0 into relations, all of which are in 4NF. The decomposition has the lossless-join property.

Method: Do the following steps, with R = R0:
1. Find a 4NF violation in R, say A1, A2, ..., An →→ B1, B2, ..., Bm, where {A1, A2, ..., An} is not a superkey. Note that this MVD could be a true MVD, or it could be an FD (A1, A2, ..., An → B1, B2, ..., Bm), since every FD is an MVD. If there is none, return R.
2. If there is such a 4NF violation, break the schema for the relation R that has the 4NF violation into two schemas:
   R1, whose schema is the A's and B's.
   R2, whose schema is the A's and all attributes of R that are not among the A's and B's.
3. Find the FD's and MVD's that hold in R1 and R2. Recursively decompose R1 and R2 with respect to their respective dependencies.
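The schema-splitting step can be sketched on the contact data from the earlier slides. This is an illustrative sketch only: the helpers split_on_mvd, project, and natural_join and the sample rows are hypothetical, and the violating dependencies are supplied by hand rather than discovered, since finding violations and projecting dependencies onto R1 and R2 is the hard part of the real algorithm.

```python
def split_on_mvd(attrs, lhs, rhs):
    """Return the schemas R1 (A's and B's) and R2 (A's and everything else)."""
    r1 = [a for a in attrs if a in lhs or a in rhs]
    r2 = [a for a in attrs if a in lhs or a not in rhs]
    return r1, r2


def project(rows, attrs, keep):
    """Project tuples onto the attributes in keep, removing duplicates."""
    positions = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in positions) for row in rows}


def natural_join(rows1, attrs1, rows2, attrs2):
    """Join two relations on their shared attributes."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    joined = set()
    for r in rows1:
        for s in rows2:
            if all(r[attrs1.index(a)] == s[attrs2.index(a)] for a in common):
                joined.add(r + tuple(s[attrs2.index(a)] for a in extra))
    return joined, attrs1 + extra


attrs = ["name", "area_code", "phone", "email", "birthdate"]
rows = {
    ("Josh", "253", "123-4567", "nahumjos@msu.edu", "1988-11-01"),
    ("Josh", "253", "123-4567", "best_teacher@gmail.com", "1988-11-01"),
    ("Josh", "517", "111-1111", "nahumjos@msu.edu", "1988-11-01"),
    ("Josh", "517", "111-1111", "best_teacher@gmail.com", "1988-11-01"),
    ("Hancheng", "382", "543-5436", "grading_sux@aol.com", "1999-12-31"),
}

# First violation: name -> birthdate (an FD, hence an MVD); name is not a superkey.
birth_attrs, contact_attrs = split_on_mvd(attrs, ["name"], ["birthdate"])
births = project(rows, attrs, birth_attrs)
contacts = project(rows, attrs, contact_attrs)

# Second violation, inside the contact relation: name ->> area_code phone.
phone_attrs, email_attrs = split_on_mvd(contact_attrs, ["name"], ["area_code", "phone"])
phones = project(contacts, contact_attrs, phone_attrs)
emails = project(contacts, contact_attrs, email_attrs)

# Lossless join: joining the three pieces gives back the original tuples.
rebuilt, rebuilt_attrs = natural_join(phones, phone_attrs, emails, email_attrs)
rebuilt, rebuilt_attrs = natural_join(rebuilt, rebuilt_attrs, births, birth_attrs)
print(rebuilt == rows and rebuilt_attrs == attrs)  # True
```

The five original tuples shrink to three phone tuples, three email tuples, and two birthdate tuples, and the join reproduces the original relation exactly.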