Presentation is loading. Please wait.

Presentation is loading. Please wait.

4 TH NORMAL FORM & Lossless Decomposition By: Karen McVay CS 157B.

Similar presentations


Presentation on theme: "4 TH NORMAL FORM & Lossless Decomposition By: Karen McVay CS 157B."— Presentation transcript:

1 4 TH NORMAL FORM & Lossless Decomposition By: Karen McVay CS 157B

2 REVIEW OF NFs 1NF  All values of the columns are atomic. That is, they contain no repeating values. 1NF  All values of the columns are atomic. That is, they contain no repeating values. 2NF  it is in 1NF and every non-key column is fully dependent upon the primary key (avoid partial dependencies) 2NF  it is in 1NF and every non-key column is fully dependent upon the primary key (avoid partial dependencies)

3 REVIEW OF NF Cont… 3NF  it is in 2NF and every non-key column is non transitively dependent upon its primary key. In other words, all non-key attributes are functionally dependent only upon the primary key. 3NF  it is in 2NF and every non-key column is non transitively dependent upon its primary key. In other words, all non-key attributes are functionally dependent only upon the primary key. BCNF  A relation is in BCNF if every determinant is a candidate key. This is an improved form of third normal form. BCNF  A relation is in BCNF if every determinant is a candidate key. This is an improved form of third normal form. Determinant: an attribute on which some other attribute is fully functionally dependent

4 4NF and Multivalued Dependencies Some relations can exist that are in BCNF but they have redundant data and have update anomalies The next highest normal form is 4NF 4NF is based on multivalued dependencies

5 Multivalued Dependencies Consider a relation R with attributes X, Y, Z where X, Y, Z are sets of attributes The multivalued dependency, X Y, exists if when two tuples exist having the same X values: T1(x, y1, z1) and T2(x, y2, z2), implies the two tuples – –T4(x, y2, z1) and T3(x, y1, z2) also exist

6 Example Suppose we have two one-to-many relationships: Each employee may have many dependants Each employee may work on many projects For any employee, the dependents are completely independent of the projects – –For a given value of ename, the values of pname are only determined by ename and not dname – –For a given value of ename, the values of dname are only determined by ename and not pname – –So, each dname is repeated for each pname, and viceversa

7 Consider the relation EMP enamepnamedname EMP If (Smith, X, John) and (Smith, Y, Anna) exist, then (Smith, Y, John) and (Smith, X, Anna) exist The MVD ename pname | dname exists in EMP Note that EMP is BCNF, and there is a lot of redundancy in EMP

8 We might have liked to have: enamepnamedname EMP SmithX, YJohn, Anna But 1NF does not permit multivalued attributes

9 enamepnamedname EMP SmithXJohn SmithYAnna So, instead of : enamepnamedname EMP SmithX, YJohn, Anna We have: SmithYJohn SmithXAnna

10 Note that if X Y | Z exists, then R can be decomposed into (X,Y) and (R-Y) XYZ R XY Ra XZ Rb And this is a lossless decomposition Decomposing a MVD without loss of information

11 As ename pname | dname exists, EMP can be decomposed into enamepnamedname EMP enamepname EMPa enamedname EMPb This is a lossless decomposition

12 4th Normal Form A Boyce Codd normal form relation is in fourth normal form if (a) there is no multi value dependency in the relation or (b) there are multi value dependency but the attributes, which are multi value dependent on a specific attribute, are dependent between themselves.

13 4 th Normal Form Cont… This is best discussed through mathematical notation. Assume the following relation R(a:pk1, b:pk2, c:pk3) Recall that a relation is in BCNF if all its determinant are candidate keys, in other words each determinant can be used as a primary key. Because relation R has only one determinant (a, b, c), which is the composite primary key and since the primary is a candidate key therefore R is in BCNF.

14 4 th Normal Form Cont… Now R may or may not be in fourth normal form. 1. If R contains no multi value dependency then R will be in Fourth normal form. 2. Assume R has the following two-multi value dependencies: a --->> b and a --->> c In this case R will be in the fourth normal form if b and c dependent on each other. However if b and c are independent of each other then R is not in fourth normal form and the relation has to be projected to two non-loss projections.

15 Consider a case of class enrollment. Each student can be enrolled in one or more classes and each class can contain one or more students. Clearly, there is a many-to-many relationship between classes and students. This relationship can be represented by a Student/Class cross- reference table: {StudentID, ClassID} Example

16 Example Cont… The key for this table is the combination of StudentID and ClassID. To avoid violation of 2NF, all other information about each student and each class is stored in separate Student and Class tables, respectively. The key for this table is the combination of StudentID and ClassID. To avoid violation of 2NF, all other information about each student and each class is stored in separate Student and Class tables, respectively. Note that each StudentID determines not a unique ClassID, but a well-defined, finite set of values. This kind of behavior is referred to as multi-valued dependency of ClassID on StudentID. Note that each StudentID determines not a unique ClassID, but a well-defined, finite set of values. This kind of behavior is referred to as multi-valued dependency of ClassID on StudentID.

17 Consider another example with two many-to-many relationships, between students and classes and between classes and teachers. Consider another example with two many-to-many relationships, between students and classes and between classes and teachers. Example 2 StudentsClasses ** Also, a many-to-many relationship between students and teachers is implied. Also, a many-to-many relationship between students and teachers is implied. ClassesTeachers **

18 The combination of StudentID and TeacherID does not contain any additional information beyond the information implied by the student/class and class/teacher relationships. The combination of StudentID and TeacherID does not contain any additional information beyond the information implied by the student/class and class/teacher relationships. Consequentially, the student/class and class/teacher relationships are independent of each other—these relationships have no additional constraints. The following table is, then, in violation of 4NF: Consequentially, the student/class and class/teacher relationships are independent of each other—these relationships have no additional constraints. The following table is, then, in violation of 4NF: {StudentID, ClassID, TeacherID} Example 2 Cont…

19 As an example of the anomalies that can occur, realize that it is not possible to add a new class taught by some teacher without adding at least one student who is enrolled in this class. As an example of the anomalies that can occur, realize that it is not possible to add a new class taught by some teacher without adding at least one student who is enrolled in this class. 4 th NF and Anomalies

20 4 th Normal Form and anomalies Cont… Case 1: Assume the following relation: Employee (Eid:pk1, Language:pk2, Skill:pk3) No multi value dependency, therefore R is in fourth normal form. No multi value dependency, therefore R is in fourth normal form.

21 case 2: Assume the following relation with multi-value dependency: Employee (Eid:pk1, Languages:pk2, Skills:pk3) Eid --->> LanguagesEid --->> Skills Languages and Skills are dependent. This says an employee speak several languages and has several skills. However for each skill a specific language is used when that skill is practiced. 4th Normal Form and anomalies Cont…

22 Thus employee 100 when he/she teaches speaks English but when he cooks speaks French. This relation is in fourth normal form and does not suffer from any anomalies. EidLanguageSkill 100EnglishTeaching 100KurdishPolitic 100FrenchCooking 200EnglishCooking 200ArabicSinging

23 case 3: Assume the following relation with multi- value dependency: Employee (Eid:pk1, Languages:pk2, Skills:pk3) Eid --->> LanguagesEid --->> Skills Languages and Skills are independent. 4th Normal Form and anomalies Cont…

24 EidLanguageSkill 100EnglishTeaching 100KurdishPolitic 100EnglishPolitic 100KurdishTeaching 200ArabicSinging This relation is not in fourth normal form and suffers from all three types of anomalies.

25 Insertion anomaly: To insert row (200 English Cooking) we have to insert two extra rows (200 Arabic cooking), and (200 English Singing) otherwise the database will be inconsistent. Note the table will be as follow: EidLanguageSkill 100EnglishTeaching 100KurdishPolitics 100EnglishPolitics 100KurdishTeaching 200ArabicSinging 200EnglishCooking 200ArabicCooking 200EnglishSinging

26 Deletion anomaly: If employee 100 discontinue politic skill we have to delete two rows: Deletion anomaly: If employee 100 discontinue politic skill we have to delete two rows: (100 Kurdish Politic), and (100 English Politic) otherwise the database will be inconsistent. EidLanguageSkill 100EnglishTeaching 100KurdishPolitics 100EnglishPolitics 100KurdishTeaching 200ArabicSinging 200EnglishCooking 200ArabicCooking 200EnglishSinging

27 More anomalies Update anomaly: If employee 200 changes his skill from singing to dancing we have to make changes in more than one place. Update anomaly: If employee 200 changes his skill from singing to dancing we have to make changes in more than one place.

28 The relation is projected to the following two non-loss projections which are in forth normal form Emplyee_Language(Eid:pk1, Languages:pk2) EidLanguage 100English 100Kurdish 200Arabic

29 Emplyee_skill(Eid:pk1, Skills:pk2) EidSkill 100Teaching 100Politic 200Singing Cont…

30 References Functional Dependency (Normalization) http://www.emunix.emich.edu/~khaila ny/files/Normalization.htm http://www.emunix.emich.edu/~khaila ny/files/Normalization.htm http://www.emunix.emich.edu/~khaila ny/files/Normalization.htm Multivalued Dependencies (Ozmar Zaine): http://www.cs.sfu.ca/CC/354/zaiane/materi al/notes/Chapter7/node13.html http://www.cs.sfu.ca/CC/354/zaiane/materi al/notes/Chapter7/node13.html http://www.cs.sfu.ca/CC/354/zaiane/materi al/notes/Chapter7/node13.html


Download ppt "4 TH NORMAL FORM & Lossless Decomposition By: Karen McVay CS 157B."

Similar presentations


Ads by Google