Chapter 15 Relational Design Algorithms and Further Dependencies

Chapter 15 Relational Design Algorithms and Further Dependencies
Relational Databases Design Algorithms Relational Decomposition The Dependency Preservation Property The Loossless Join Property Null Values and Dangling Tuples Further Dependencies and Normal Forms Multivalued Dependencies and 4NF Join Dependencies and 5NF Inclusion Dependencies

(cont.) Template Dependencies Domain-Key Normal Form

1 Relational Databases Design Algorithms
Normal forms are not sufficient criteria for a good design Example: Any relation with two attributes is always in BCNF-so we can create 2-attribute relations arbitrarily and get BCNF Additional conditions are needed to ensure a good design Lossless join property Dependency preserving property

1.1 Relational Decomposition
We start with a universal relation schema R containing all the database attributes: R= {A1, A2, …, An} The design goal is a decomposition D of R into m relation schemas: D = {R1, R2, …, Rm} Each relation schema Ri contains a subset of the attributes of R Every attribute in R should appear in at least one Ri

1.2 The Dependency Preservation Property
The database designers define a set F of functional dependencies that should hold on the attributes of R D should preserve the dependencies; informally, the collection of all dependencies that hold on the individual relations Ri should be equivalent to F Formally: Define the projection of F on Ri, denoted by F(Ri), to be the set of FDs X  Y in F+ such that (X  Y)  Ri

(cont.) A decomposition D = {R1, R2, …, Rm} is dependency preserving if (F(R1)  F(R2)  …  F(Rm))+ = F+ This property makes it possible to ensure that the FDs in F hold simply by ensuring that the dependencies on each relation Ri hold individually There is an algorithm to decompose R into a dependency preserving decomposition D = {R1, R2, …, Rm} with respect to F such that each Ri is in 3NF

(cont.) Called the relational synthesis algorithm
Find a minimal set of FDs G equivalent to F For each X of an FD X  A in G Create a relation schema Ri in D with the attributes {X  A1  A2  …  Ak} where the Aj’s are all the attributes appearing in an FD in G with X as left hand side If any attributes in R are not placed in any Ri, create another relation in D for these attributes It can be proven that all dependencies in F are preserved and that all relation schemas in D are in 3NF

(cont.) Problems: Must find a minimal cover G for F
No efficient algorithm for finding a minimal cover Several minimal covers can exist for F; the result of the algorithm can be different depending on which is chosen

1.3 The Lossless(Non-Additive) Join Property
Informally, this property ensures that no spurious tuples appear when the relations in the decomposition are JOINed Formally: A decomposition D = {R1, R2, …, Rm} of R has the lossless join property with respect to a set F of FDs if, for every relation instance r® whose tuples satisfy all the FDs in F, we have: (R1(r(R))  R2(r(R))  …  Rm(r(R))) = r(R) This condition ensures that whenever a relation instance r(R) satisfies F, no spurious tuples are generated by joining the decomposed relations r(Ri)

(cont.) Since we actually store the decomposed relations as base relations, this condition is necessary to generate meaningful results for queries involving JOINs There is an algorithm for testing whether a decomposition D satisfies the lossless join property with respect to a set F of FDs

Figure 15.1

(cont.) There is an algorithm for decomposing R into BCNF relations such that the decomposition has the lossless join property with respect to a set of FDs F on R Set D  {R} While there is a relation schema Q in D that is not in BCNF do Begin Choose one Q in D that is not in BCNF; Find a FD X Y in Q that violates BCNF; Replace Q in D by two relation schemas(Q-Y) and (XY) end;

(cont.) This is based on two properties of lossless join decomposition: The decomposition D = {R1, R2} of R has the lossless join property w.r.t F if and only if either: The FD (R1R2)  (R1  R2) is in F+, or The FD (R1R2)  (R2  R1) is in F+ If D = {R1, R2, …, Rm} of R has the lossless join property w.r.t a set F, and D1 = {Q1,Q2, …, Qk} of Ri has the lossless join property w.r.t. F(Ri) then D = {R1, R2, …, Ri-1, Q1, Q2, …, Qk. Ri+1,… Rm} has the lossless join property w.r.t F

(cont.) Meaning: if D is already lossless in join,then any decomposition Di of Ri that is also lossless in join will also make D lossless in join. There is no algorithm for decomposition into BCNF relations that is dependency preserving A modification of the synthesis algorithm guarantees both the lossless join and dependency preserving properties but into 3NF relations (not BCNF) Fortunately, many 3NF relations are also in BCNF Lossless join and dependency preserving decomposition into 3NF relations:

(cont.) Find a minimal set of FDs G equivalent to F
For each X of an FD X  Y in G Create a relation schema Ri in D with the attributes {X  A1  A2  …  Ak} where the Aj’s are all the attributes appearing in an FD in G with X as left hand side If any attributes in R are not placed in any Ri, create another relation in D for these attributes If none of the relations in D contain a key of R, create a relation that contains a key of R and add it to D

1.4 Null Values and Dangling Tuples
Null values may create problems if they appear as join attributes The distinction between the result of a regular join and an outer join becomes important when specifying queries Some queries require regular join and other queries require outer join Dangling tuples: Sometimes the attributes of a relation are “partitioned” into several relations with the primary key repeated in each of the relations

(cont.) Tuples whose primary key does not appear in all of the relations are called “dangling tuples” If a regular join is taken on the relations on the primary key to rebuild the tuples, the dangling tuples do not appear in the result

Figure 15.2

Figure 15.3

2. Further Dependencies and Normal Forms
Functional dependencies(FDs) are used to specify one very common type of constraint Other types of constraints cannot be specified by FDs alone Additional dependencies include multivalued dependencies(MVDs), join dependencies(JDs), inclusion dependencies Some dependencies lead to normal forms beyond 3NF and BCNF

2.1Multivalued Dependencies and 4NF
Informally, a set of attributes X multidetermines a set of attributes Y if the value of X determines a set of values for Y(independently of any other attributes) A multi-valued dependency (MVD) is written as X  Y Specifies a constraint on all relation instances r(R) Formally: Let R be a relation schema; let X and Y be subsets of the attributes in R; and let X = R  (XY) (the remaining attributes)

(cont.) An MVD X  Y holds in R if whenever two tuples t1 and t2 exist in a relation instance r(R) with t1[X] = t2[X], then two tuples t3 and t4 must also exist in r(R) such that the follwing holds: t3[X] = t4[X] = t1[X] = t2[X] t3[Y] = t1[Y] and t4[Y] = t2[Y] t3[Z] = t2[Z] and t4[Z] = t1[Z] The MVD constraint implies that a value of X determines a set of values of Y independently from the values of Z Property of MVD: If X  Y holds, then X  Z also holds t1: x y1 z1 t2: x y2 z2 t3: x y1 z2 t4: x y2 z1  xy x  y/z

(cont.) An MVD X  Y is called a trivial MVD if either: (a) Y  X or,
(b) (X  Y) = R A trivial MVD always holds according to the formal MVD definition

Figure 15.4

(cont.) Sound and complete set of inference rules for FDs and MVDs:
Given a set F of FDs and MVDs, we can infer additional FDs and MVDs that hold whenever the dependencies in F hold Sound and complete set of inference rules for FDs and MVDs: (Reflexive for FDs) If Y  X, then X  Y (Augmentation for FDs) If X  Y, then XZ  YZ (Notation: XZ stands for X  Z) (Transitive for FDs) If X  Y and Y  Z, then X  Z

(cont.) (Complementation for MVDs) If X  Y, then X  Z (where Z = R  (X  Y)) (Augmentation for MVDs) If X  Y and Z  W, then WX  YZ (Transitive rule for MVDs) If X  Y and Y  Z, then X  (Z  Y) (Relication rule FD to MVD) If X  Y, then X  Y (Coalescence rule for MVDs) If X  Y and there exists W such that (a) Z  Y, (b) W  Z, and (c) W  Y is empty, then X  Z

(cont.) Notes: Fourth Normal Form(4NF):
By rule I7, every FD is also an MVD I1 to I8 can derive the closure F+ of a set of dependencies F Fourth Normal Form(4NF): 3NF and BCNF do not deal with multivalued dependencies A relation schema with some non-trivial MVDs may not be a good design (see Figures 15.4 and 15.5) 4NF takes care of these problems, and implies BCNF (every relation in 4NF is also in BCNF)

(cont.) Formal definition of 4NF: Notes:
A relation schema R is in 4NF with respect to a set of (functional and multivalued) dependencies F if for every nontrivial multivalued dependency X  Y in F+, X is a superkey of R Check Fig15.4(a) is not in 4NF; ENAME  PNAME/DNAME But ENAME is not the superkey Notes: Since every FD is an MVD, 4NF implies BCNF If all dependencies in F are FDs, the definition for 4NF becomes the definition for BCNF

(cont.) There is an algorithm for decomposing R into 4NF relations such that the decomposition has the lossless join property with respect to a set F of FDs and MVDs on R Algorithm 4.5: Set D  {R} While there is a relation schema Q in D that is not in 4NF do begin Choose one Q in D that is not in 4NF; Find a nontrivial MVD X  Y in Q that violates 4NF; Replace Q in D by two relation schemas (Q  Y) and (X  Y) end;

(cont.) Algorithm 15.5 is based on property LJ1:
A decomposition D = {R1, R2} of R has the lossless join property with respect to F if and only if either: (a) (R1 R2)  (R1  R2) holds in F+, or (b) (R1 R2)  (R2  R1) holds in F+

Figure 15.5

2.2 Join Dependencies and 5NF
A join dependency JD(R1, R2, …, Rn) is a constraint on R Specifies that every legal instance r(R) should have a lossless join decomposition into R1, R2, …, Rn An MVD is a special case of a JD where n= 2 A JD(R1, R2, …, Rn) is a trivial JD if some Ri = R Fifth normal form (5NF): A relation schema R is in 5NF with respect to a set F of FDs, MVDs, and JDs if for every nontrivial JD(R1, R2, …, Rn), each Ri is a superkey of R

(cont.) 5NF is also called PJNF (project-join normal form) Note: Unfortunately, the theory of MVD and JD lossless join decomposition does not take NULL values into account

2.3 Inclusion Dependencies
Used to specify referential integrity and class/subclass constraints between two relations R and S An inclusion dependency R.X < S.Y specifies that at any point in time, if r(R) and s(S) are relation instances of R and S, then: X(r(R))  Y(s(S)) The attributes X of R and Y of S must be compatible

(cont.) Sound and complete inference rules for inclusion dependencies:
(ID1) R.X < R.X (ID2) If R.X < S.Y, where X = {A1, A2, …, An}, Y = {B1, B2, …, Bn}, and Ai corresponds to Bi, then Ai < Bi for 1  i  n (ID3) If R.X < S.Y and S.Y < T.Z, then R.X < T.Z So far, there are no normal forms based on inclusion dependencies Example: Department.DMGSSN < EMPLOYEE.SSN WORKS_ON.SSN < EMPLOYEE.SSN WORKS_ON.PNUMBER < Project.PNUMBER

Chapter 15 Relational Design Algorithms and Further Dependencies

Similar presentations

Presentation on theme: "Chapter 15 Relational Design Algorithms and Further Dependencies"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Chapter 15 Relational Design Algorithms and Further Dependencies

Similar presentations

Presentation on theme: "Chapter 15 Relational Design Algorithms and Further Dependencies"— Presentation transcript:

Similar presentations

About project

Feedback