Chapter 15 Relational Design Algorithms and Further Dependencies

Slides:



Advertisements
Similar presentations
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Advertisements

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
METU Department of Computer Eng Ceng 302 Introduction to DBMS Further Dependencies by Pinar Senkul resources: mostly froom Elmasri, Navathe and other books.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Database Management COP4540, SCS, FIU Functional Dependencies (Chapter 14)
C.1 Appendix C: Advanced Relational Database Design Reasoning with MVDs Higher normal forms Join dependencies and PJNF DKNF.
YV - Relational DB Design and Normalization 236 Κεφάλαιο 6 ΣΧΕΔΙΑΣΜΟΣ ΒΑΣΕΩΝ ΔΕΔΟΜΕΝΩΝ ΚΑΙ ΚΑΝΟΝΙΚΟΠΟΙΗΣΗ (Normalization)
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
1 Functional Dependency and Normalization Informal design guidelines for relation schemas. Functional dependencies. Normal forms. Normalization.
1 Multi-valued Dependencies. 2 Multivalued Dependencies There are database schemas in BCNF that do not seem to be sufficiently normalized. Consider a.
METU Department of Computer Eng Ceng 302 Introduction to DBMS Relational Database Design Algorithms by Pinar Senkul resources: mostly froom Elmasri, Navathe.
Chapter 14 Advanced Normalization Transparencies © Pearson Education Limited 1995, 2005.
Chapter 11 Relational Database Design Algorithms and Further Dependencies Copyright © 2004 Ramez Elmasri and Shamkant Navathe.
©Silberschatz, Korth and Sudarshan7.1Database System Concepts Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Normal Forms1. 2 The Problems of Redundancy Redundancy is at the root of several problems associated with relational schemas: Wastes storage Causes problems.
Lecture 6 Normalization: Advanced forms. Objectives How inference rules can identify a set of all functional dependencies for a relation. How Inference.
Instructor: Churee Techawut Functional Dependencies and Normalization for Relational Databases Chapter 4 CS (204)321 Database System I.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide DESIGNING A SET OF RELATIONS (2) Goals: Lossless join property (a must). Dependency.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Chapter 11 Relational Database Design Algorithms and Further Dependencies.
Relational Database Design Algorithms and Further Dependencies.
Chapter Functional Dependencies and Normalization for Relational Databases.
Chapter 11: Relational Database Design Algorithms and Further Dependencies Chapter 11: Relational Database Design Algorithms and Further Dependencies 1.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Deanship of Distance Learning Avicenna Center for E-Learning 1 Session - 7 Sequence - 2 Normalization Functional Dependencies Presented by: Dr. Samir Tartir.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Relational Database Design Algorithms and Further Dependencies.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
Functional Dependencies and Normalization for Relational Databases 1 Chapter 15 تنبيه : شرائح العرض (Slides) هي وسيلة لتوضيح الدرس واداة من الادوات في.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
10/3/2017.
10/3/2017.
Relational Database Design Algorithms and Further Dependencies
Normalization Database Management Systems, 3rd ed., Ramakrishnan and Gehrke, Chapter 19.
Functional Dependency and Normalization
CSE202 Database Management Systems
Advanced Normalization
CHAPTER 14 Basics of Functional Dependencies and Normalization for Relational Databases.
Functional Dependencies and Normalization for Relational Databases
Schema Refinement and Normal Forms
Normalization Functional Dependencies Presented by: Dr. Samir Tartir
Module 5: Overview of Database Design -- Normalization
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Relational Database Design by Dr. S. Sridhar, Ph. D
Chapter 8: Relational Database Design
Relational Database Design
Chapter 8: Relational Database Design
3.1 Functional Dependencies
Advanced Normalization
11/12/2018.
Functional Dependencies and Normalization for Relational Databases
Database Management systems Subject Code: 10CS54 Prepared By:
Module 5: Overview of Normalization
Chapter 7: Relational Database Design
Normalization.
Appendix C: Advanced Normalization Theory
Relational Database Design Algorithms and Further Dependencies
Relational Database Design
Chapter Outline 1 Informal Design Guidelines for Relational Databases
Join Dependencies and Fifth Normal Form
Appendix C: Advanced Relational Database Design
Chapter 28: Advanced Relational Database Design
Chapter 7a: Overview of Database Design -- Normalization
Presentation transcript:

Chapter 15 Relational Design Algorithms and Further Dependencies Relational Databases Design Algorithms Relational Decomposition The Dependency Preservation Property The Loossless Join Property Null Values and Dangling Tuples Further Dependencies and Normal Forms Multivalued Dependencies and 4NF Join Dependencies and 5NF Inclusion Dependencies

(cont.) Template Dependencies Domain-Key Normal Form

1 Relational Databases Design Algorithms Normal forms are not sufficient criteria for a good design Example: Any relation with two attributes is always in BCNF-so we can create 2-attribute relations arbitrarily and get BCNF Additional conditions are needed to ensure a good design Lossless join property Dependency preserving property

1.1 Relational Decomposition We start with a universal relation schema R containing all the database attributes: R= {A1, A2, …, An} The design goal is a decomposition D of R into m relation schemas: D = {R1, R2, …, Rm} Each relation schema Ri contains a subset of the attributes of R Every attribute in R should appear in at least one Ri

1.2 The Dependency Preservation Property The database designers define a set F of functional dependencies that should hold on the attributes of R D should preserve the dependencies; informally, the collection of all dependencies that hold on the individual relations Ri should be equivalent to F Formally: Define the projection of F on Ri, denoted by F(Ri), to be the set of FDs X  Y in F+ such that (X  Y)  Ri

(cont.) A decomposition D = {R1, R2, …, Rm} is dependency preserving if (F(R1)  F(R2)  …  F(Rm))+ = F+ This property makes it possible to ensure that the FDs in F hold simply by ensuring that the dependencies on each relation Ri hold individually There is an algorithm to decompose R into a dependency preserving decomposition D = {R1, R2, …, Rm} with respect to F such that each Ri is in 3NF

(cont.) Called the relational synthesis algorithm Find a minimal set of FDs G equivalent to F For each X of an FD X  A in G Create a relation schema Ri in D with the attributes {X  A1  A2  …  Ak} where the Aj’s are all the attributes appearing in an FD in G with X as left hand side If any attributes in R are not placed in any Ri, create another relation in D for these attributes It can be proven that all dependencies in F are preserved and that all relation schemas in D are in 3NF

(cont.) Problems: Must find a minimal cover G for F No efficient algorithm for finding a minimal cover Several minimal covers can exist for F; the result of the algorithm can be different depending on which is chosen

1.3 The Lossless(Non-Additive) Join Property Informally, this property ensures that no spurious tuples appear when the relations in the decomposition are JOINed Formally: A decomposition D = {R1, R2, …, Rm} of R has the lossless join property with respect to a set F of FDs if, for every relation instance r® whose tuples satisfy all the FDs in F, we have: (R1(r(R))  R2(r(R))  …  Rm(r(R))) = r(R) This condition ensures that whenever a relation instance r(R) satisfies F, no spurious tuples are generated by joining the decomposed relations r(Ri)

(cont.) Since we actually store the decomposed relations as base relations, this condition is necessary to generate meaningful results for queries involving JOINs There is an algorithm for testing whether a decomposition D satisfies the lossless join property with respect to a set F of FDs

Figure 15.1

(cont.) There is an algorithm for decomposing R into BCNF relations such that the decomposition has the lossless join property with respect to a set of FDs F on R Set D  {R} While there is a relation schema Q in D that is not in BCNF do Begin Choose one Q in D that is not in BCNF; Find a FD X Y in Q that violates BCNF; Replace Q in D by two relation schemas(Q-Y) and (XY) end;

(cont.) This is based on two properties of lossless join decomposition: The decomposition D = {R1, R2} of R has the lossless join property w.r.t F if and only if either: The FD (R1R2)  (R1  R2) is in F+, or The FD (R1R2)  (R2  R1) is in F+ If D = {R1, R2, …, Rm} of R has the lossless join property w.r.t a set F, and D1 = {Q1,Q2, …, Qk} of Ri has the lossless join property w.r.t. F(Ri) then D = {R1, R2, …, Ri-1, Q1, Q2, …, Qk. Ri+1,… Rm} has the lossless join property w.r.t F

(cont.) Meaning: if D is already lossless in join,then any decomposition Di of Ri that is also lossless in join will also make D lossless in join. There is no algorithm for decomposition into BCNF relations that is dependency preserving A modification of the synthesis algorithm guarantees both the lossless join and dependency preserving properties but into 3NF relations (not BCNF) Fortunately, many 3NF relations are also in BCNF Lossless join and dependency preserving decomposition into 3NF relations:

(cont.) Find a minimal set of FDs G equivalent to F For each X of an FD X  Y in G Create a relation schema Ri in D with the attributes {X  A1  A2  …  Ak} where the Aj’s are all the attributes appearing in an FD in G with X as left hand side If any attributes in R are not placed in any Ri, create another relation in D for these attributes If none of the relations in D contain a key of R, create a relation that contains a key of R and add it to D

1.4 Null Values and Dangling Tuples Null values may create problems if they appear as join attributes The distinction between the result of a regular join and an outer join becomes important when specifying queries Some queries require regular join and other queries require outer join Dangling tuples: Sometimes the attributes of a relation are “partitioned” into several relations with the primary key repeated in each of the relations

(cont.) Tuples whose primary key does not appear in all of the relations are called “dangling tuples” If a regular join is taken on the relations on the primary key to rebuild the tuples, the dangling tuples do not appear in the result

Figure 15.2

Figure 15.3

2. Further Dependencies and Normal Forms Functional dependencies(FDs) are used to specify one very common type of constraint Other types of constraints cannot be specified by FDs alone Additional dependencies include multivalued dependencies(MVDs), join dependencies(JDs), inclusion dependencies Some dependencies lead to normal forms beyond 3NF and BCNF

2.1Multivalued Dependencies and 4NF Informally, a set of attributes X multidetermines a set of attributes Y if the value of X determines a set of values for Y(independently of any other attributes) A multi-valued dependency (MVD) is written as X  Y Specifies a constraint on all relation instances r(R) Formally: Let R be a relation schema; let X and Y be subsets of the attributes in R; and let X = R  (XY) (the remaining attributes)

(cont.) An MVD X  Y holds in R if whenever two tuples t1 and t2 exist in a relation instance r(R) with t1[X] = t2[X], then two tuples t3 and t4 must also exist in r(R) such that the follwing holds: t3[X] = t4[X] = t1[X] = t2[X] t3[Y] = t1[Y] and t4[Y] = t2[Y] t3[Z] = t2[Z] and t4[Z] = t1[Z] The MVD constraint implies that a value of X determines a set of values of Y independently from the values of Z Property of MVD: If X  Y holds, then X  Z also holds t1: x y1 z1 t2: x y2 z2 t3: x y1 z2 t4: x y2 z1  xy x  y/z

(cont.) An MVD X  Y is called a trivial MVD if either: (a) Y  X or, (b) (X  Y) = R A trivial MVD always holds according to the formal MVD definition

Figure 15.4

(cont.) Sound and complete set of inference rules for FDs and MVDs: Given a set F of FDs and MVDs, we can infer additional FDs and MVDs that hold whenever the dependencies in F hold Sound and complete set of inference rules for FDs and MVDs: (Reflexive for FDs) If Y  X, then X  Y (Augmentation for FDs) If X  Y, then XZ  YZ (Notation: XZ stands for X  Z) (Transitive for FDs) If X  Y and Y  Z, then X  Z

(cont.) (Complementation for MVDs) If X  Y, then X  Z (where Z = R  (X  Y)) (Augmentation for MVDs) If X  Y and Z  W, then WX  YZ (Transitive rule for MVDs) If X  Y and Y  Z, then X  (Z  Y) (Relication rule FD to MVD) If X  Y, then X  Y (Coalescence rule for MVDs) If X  Y and there exists W such that (a) Z  Y, (b) W  Z, and (c) W  Y is empty, then X  Z

(cont.) Notes: Fourth Normal Form(4NF): By rule I7, every FD is also an MVD I1 to I8 can derive the closure F+ of a set of dependencies F Fourth Normal Form(4NF): 3NF and BCNF do not deal with multivalued dependencies A relation schema with some non-trivial MVDs may not be a good design (see Figures 15.4 and 15.5) 4NF takes care of these problems, and implies BCNF (every relation in 4NF is also in BCNF)

(cont.) Formal definition of 4NF: Notes: A relation schema R is in 4NF with respect to a set of (functional and multivalued) dependencies F if for every nontrivial multivalued dependency X  Y in F+, X is a superkey of R Check Fig15.4(a) is not in 4NF; ENAME  PNAME/DNAME But ENAME is not the superkey Notes: Since every FD is an MVD, 4NF implies BCNF If all dependencies in F are FDs, the definition for 4NF becomes the definition for BCNF

(cont.) There is an algorithm for decomposing R into 4NF relations such that the decomposition has the lossless join property with respect to a set F of FDs and MVDs on R Algorithm 4.5: Set D  {R} While there is a relation schema Q in D that is not in 4NF do begin Choose one Q in D that is not in 4NF; Find a nontrivial MVD X  Y in Q that violates 4NF; Replace Q in D by two relation schemas (Q  Y) and (X  Y) end;

(cont.) Algorithm 15.5 is based on property LJ1: A decomposition D = {R1, R2} of R has the lossless join property with respect to F if and only if either: (a) (R1 R2)  (R1  R2) holds in F+, or (b) (R1 R2)  (R2  R1) holds in F+

Figure 15.5

2.2 Join Dependencies and 5NF A join dependency JD(R1, R2, …, Rn) is a constraint on R Specifies that every legal instance r(R) should have a lossless join decomposition into R1, R2, …, Rn An MVD is a special case of a JD where n= 2 A JD(R1, R2, …, Rn) is a trivial JD if some Ri = R Fifth normal form (5NF): A relation schema R is in 5NF with respect to a set F of FDs, MVDs, and JDs if for every nontrivial JD(R1, R2, …, Rn), each Ri is a superkey of R

(cont.) 5NF is also called PJNF (project-join normal form) Note: Unfortunately, the theory of MVD and JD lossless join decomposition does not take NULL values into account

2.3 Inclusion Dependencies Used to specify referential integrity and class/subclass constraints between two relations R and S An inclusion dependency R.X < S.Y specifies that at any point in time, if r(R) and s(S) are relation instances of R and S, then: X(r(R))  Y(s(S)) The attributes X of R and Y of S must be compatible

(cont.) Sound and complete inference rules for inclusion dependencies: (ID1) R.X < R.X (ID2) If R.X < S.Y, where X = {A1, A2, …, An}, Y = {B1, B2, …, Bn}, and Ai corresponds to Bi, then Ai < Bi for 1  i  n (ID3) If R.X < S.Y and S.Y < T.Z, then R.X < T.Z So far, there are no normal forms based on inclusion dependencies Example: Department.DMGSSN < EMPLOYEE.SSN WORKS_ON.SSN < EMPLOYEE.SSN WORKS_ON.PNUMBER < Project.PNUMBER