Published by Tamara Towsley. Modified over 10 years ago.
1
Logical Database Design (3 of 3) John Ortiz
2
Lecture 7: Logical Database Design (2)
Normalization
If a relation is not in BCNF or 3NF, we refine it by decomposing it into two or more smaller relation schemas that are in the desired normal form.
Decomposition must be used carefully, because it can introduce problems.
What are the desirable properties of a decomposition, and how do we test for them?
How do we obtain a decomposition with these properties?
3
Decomposition of a Relation
Let R be a relation schema. A decomposition of R, denoted by D = {R1, R2, ..., Rn}, is a set of relation schemas such that R = R1 ∪ R2 ∪ ... ∪ Rn.
If {R1, R2, ..., Rn} is a decomposition of R and r is an instance of R, then
  r ⊆ π_R1(r) ⋈ π_R2(r) ⋈ ... ⋈ π_Rn(r).
Information may be lost due to a decomposition: the natural join of the projections may contain spurious tuples that were not in r.
4
An Example of Information Loss
[Figure: relation instances before and after the decomposition; table labels in the original: SC, SRSG, SR.]
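The slide's tables did not survive extraction, so the following is a minimal sketch of the same effect on made-up data. The relation over attributes S, C, G and its two tuples are hypothetical and are not the slide's example. Projecting onto (S, G) and (C, G) and joining back produces tuples that were never in the original instance.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two relations given as sets of (attr, value) tuples."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            # Tuples combine only if they agree on every shared attribute.
            if all(ld[a] == rd[a] for a in common):
                merged = {**ld, **rd}
                joined.add(tuple(sorted(merged.items())))
    return joined


# Hypothetical instance r over R(S, C, G); the values are invented.
r = [
    {"S": "s1", "C": "c1", "G": "A"},
    {"S": "s2", "C": "c2", "G": "A"},
]
original = {tuple(sorted(row.items())) for row in r}
rejoined = natural_join(project(r, ["S", "G"]), project(r, ["C", "G"]))
spurious = rejoined - original

print(len(original), len(rejoined), len(spurious))
```

The join contains every original tuple plus the spurious ones, which is exactly the containment r ⊆ π_R1(r) ⋈ π_R2(r) with strict inclusion.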
5
Lossless Join Decomposition
Let R be a relation schema, and D = {R1, R2, ..., Rn} be a decomposition of R. D is a lossless (non-additive) join decomposition of R if for every legal instance r of R, we have
  r = π_R1(r) ⋈ π_R2(r) ⋈ ... ⋈ π_Rn(r).
Theorem: Let F be a set of FDs over R, and D = {R1, R2} be a decomposition of R. D is a lossless-join decomposition if and only if
  R1 ∩ R2 → R1 − R2 is in F+; or
  R1 ∩ R2 → R2 − R1 is in F+.
6
Lossless Join: An Example
Consider F = {B → AH, L → CAt} over Bank-Loans(Bank, Assets, Headquarter, Loan#, Customer, Amount).
Let D = {Banks(B, A, H), Loans(B, L, C, At)}.
Here Banks ∩ Loans = B and Banks − Loans = AH. Since B → AH is in F+ (it is already in F), D is a lossless-join decomposition.
What if the decomposition contains more than two relations?
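The two-schema theorem can be checked mechanically with an attribute closure, since R1 ∩ R2 → R1 − R2 is in F+ exactly when R1 − R2 is contained in the closure of R1 ∩ R2. The sketch below assumes FDs are represented as (lhs, rhs) pairs of Python sets; the function names `closure` and `is_lossless_binary` are illustrative and do not come from the slides.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_lossless_binary(r1, r2, fds):
    """Two-schema lossless-join test: (R1 ∩ R2)+ must cover R1 - R2 or R2 - R1."""
    reach = closure(set(r1) & set(r2), fds)
    return (set(r1) - set(r2)) <= reach or (set(r2) - set(r1)) <= reach


# The Bank-Loans example from the slide.
F = [({"B"}, {"A", "H"}), ({"L"}, {"C", "At"})]
banks = {"B", "A", "H"}
loans = {"B", "L", "C", "At"}
print(is_lossless_binary(banks, loans, F))
```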
7
Test for Lossless Join *
Algorithm TestLJ (Chase)
Input: A relation schema R(A_1, …, A_m), a set of FDs F, and a decomposition D = {R_1, …, R_n}.
Output: Yes, if D is a lossless-join decomposition; no, otherwise.
Method:
  1. Create an n × m table T, with one row for each schema R_i and one column for each attribute A_j.
  2. If R_i contains A_j, place a_j at T[i, j]. Otherwise, place b_ij at T[i, j].
8
TestLJ (cont.) *
  3. Repeat
       for each FD X → Y in F do
         for all rows with identical symbols on X do
           make the symbols on Y identical (choose a_j over b_ij whenever possible)
     until no more changes can be made.
  4. Return yes if there is a row consisting entirely of a_j's. Otherwise, return no.
9
TestLJ: An Example
Continue with the previous example. Set up the table T:

          B     A     H     L     C     At
  BAH     a1    a2    a3    b14   b15   b16
  BLCAt   a1    b22   b23   a4    a5    a6

Enforce B → AH. Both rows agree on B, so their A and H symbols are made identical:

          B     A     H     L     C     At
  BAH     a1    a2    a3    b14   b15   b16
  BLCAt   a1    a2    a3    a4    a5    a6

In general, we need to repeat until no more changes occur. Here the BLCAt row already consists entirely of a's, so the decomposition is lossless-join.
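The TestLJ steps can be sketched as follows. This is one possible rendering under assumed conventions, not the lecture's own code: symbols are Python tuples, `("a", j)` for a_j and `("b", i, j)` for b_ij, and FDs are (lhs, rhs) pairs of sets. When two symbols are equated, the sketch renames the losing symbol everywhere in its column, so rows that were merged earlier stay consistent.

```python
def chase_lossless(attributes, fds, decomposition):
    """Chase-based lossless-join test for a decomposition of any size."""
    # Steps 1-2: one row per schema; a_j if the schema has attribute j, else b_ij.
    table = []
    for i, schema in enumerate(decomposition):
        table.append({attr: ("a", j) if attr in schema else ("b", i, j)
                      for j, attr in enumerate(attributes)})

    # Step 3: enforce the FDs until the table stops changing.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            groups = {}
            for row in table:
                key = tuple(row[a] for a in sorted(lhs))
                groups.setdefault(key, []).append(row)
            for rows in groups.values():
                for attr in rhs:
                    symbols = {row[attr] for row in rows}
                    if len(symbols) > 1:
                        # ("a", ...) sorts before ("b", ...), so a_j wins if present.
                        target = min(symbols)
                        for row in table:
                            if row[attr] in symbols:
                                row[attr] = target
                        changed = True

    # Step 4: lossless iff some row consists only of a symbols.
    return any(all(row[a][0] == "a" for a in attributes) for row in table)


attrs = ["B", "A", "H", "L", "C", "At"]
F = [({"B"}, {"A", "H"}), ({"L"}, {"C", "At"})]
print(chase_lossless(attrs, F, [{"B", "A", "H"}, {"B", "L", "C", "At"}]))
```

The same function handles more than two schemas, for instance the three-way decomposition {BAH, LCAt, BL} of the same relation.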
10
Dependency-Preserving Decomposition
Let F be a set of FDs over R, and D = {R1, R2, ..., Rn} be a decomposition of R. D is a dependency-preserving decomposition if
  F+ = (π_R1(F) ∪ π_R2(F) ∪ ... ∪ π_Rn(F))+
where, for i = 1, …, n,
  π_Ri(F) = { X → Y | X → Y ∈ F+ and XY ⊆ Ri }.
That is, restrict the FDs to the local relations. If all "global" FDs can be derived from the "local" FDs, all dependencies are preserved.
11
Dependency Preservation: An Example
Consider F = {CS → Z, Z → C} over R(City, Street, Zipcode), and D = {R1(S, Z), R2(C, Z)}.
Then π_R1(F) = {} and π_R2(F) = {Z → C} (considering non-trivial FDs only).
Since CS → Z ∈ F+ but CS → Z ∉ (π_R1(F) ∪ π_R2(F))+, D is not dependency-preserving.
12
Test for Dependency Preservation
Algorithm TestDP
Input: A relation schema R, a set of FDs F over R, and a decomposition D = {R_1, R_2, ..., R_n} of R.
Output: Yes, if D is dependency-preserving; no, otherwise.
Method:
  for every X → Y ∈ F
    if ∃ R_i such that XY ⊆ R_i then X → Y is preserved;
13
TestDP (cont.)
    else
      W := X;
      repeat
        for i from 1 to n do
          W := W ∪ ((W ∩ R_i)+ ∩ R_i);
      until there is no change to W;
      if Y ⊆ W then X → Y is preserved;
  if every X → Y is preserved then return yes; else return no.
The closure (W ∩ R_i)+ is taken with respect to F, so the loop derives global FDs using only local FDs.
14
TestDP: An Example
Consider F = {A → B, B → C, C → D, D → A} over R(A, B, C, D), and D = {R1(A, B), R2(B, C), R3(C, D)}.
Is D a dependency-preserving decomposition?
Since AB ⊆ R1, A → B is preserved.
Since BC ⊆ R2, B → C is preserved.
Since CD ⊆ R3, C → D is preserved.
Since DA is not contained in any one of the three relations, we need to compute W for D → A.
15
TestDP: An Example (cont.) *
Initialization: W = D.
First iteration:
  W = D ∪ ((D ∩ AB)+ ∩ AB) = D;
  W = D ∪ ((D ∩ BC)+ ∩ BC) = D;
  W = D ∪ ((D ∩ CD)+ ∩ CD) = D ∪ (D+ ∩ CD) = D ∪ (ABCD ∩ CD) = CD.
W changed from D to CD.
16
TestDP: An Example (cont.) *
Second iteration:
  W = CD ∪ ((CD ∩ AB)+ ∩ AB) = CD;
  W = CD ∪ ((CD ∩ BC)+ ∩ BC) = CD ∪ (C+ ∩ BC) = BCD;
  W = BCD ∪ ((BCD ∩ CD)+ ∩ CD) = BCD.
W changed from CD to BCD.
17
TestDP: An Example (cont.) *
Third iteration:
  W = BCD ∪ ((BCD ∩ AB)+ ∩ AB) = ABCD.
W changed from BCD to ABCD and will change no more, although the algorithm still runs a fourth iteration to confirm this.
Since A ∈ W, D → A is also preserved. Hence, D is a dependency-preserving decomposition.
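A compact sketch of TestDP, assuming FDs are (lhs, rhs) pairs of Python sets and schemas are sets of attribute names (the function names are illustrative). It omits the shortcut for XY ⊆ R_i, because the W loop reaches the same answer in that case. Closures are computed against the full F, as in the algorithm.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_dependency_preserving(fds, decomposition):
    """TestDP: every X -> Y in F must be derivable via the local schemas."""
    for lhs, rhs in fds:
        w = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in decomposition:
                # W := W ∪ ((W ∩ R_i)+ ∩ R_i)
                gained = closure(w & schema, fds) & schema
                if not gained <= w:
                    w |= gained
                    changed = True
        if not rhs <= w:
            return False
    return True


# The cyclic example from the slides: preserved.
F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C"}, {"D"}), ({"D"}, {"A"})]
print(is_dependency_preserving(F, [{"A", "B"}, {"B", "C"}, {"C", "D"}]))

# The City/Street/Zipcode example: CS -> Z is not preserved.
F2 = [({"C", "S"}, {"Z"}), ({"Z"}, {"C"})]
print(is_dependency_preserving(F2, [{"S", "Z"}, {"C", "Z"}]))
```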
18
Normalization
It is good to have BCNF relation schemas.
If a relation schema is not in BCNF, decompose it into a set of relation schemas such that:
  every new schema is in BCNF;
  the decomposition is lossless-join (can be guaranteed);
  the decomposition is dependency-preserving (cannot be guaranteed).
If it is not possible to have all of these properties, settle for a lossless-join, dependency-preserving 3NF decomposition (can be guaranteed).
19
Normalization to BCNF
Algorithm LLJD-BCNF
Input: R: a relation schema; F: a set of FDs satisfied by R.
Output: A lossless-join decomposition D = {R_1, …, R_n} such that each R_i is in BCNF.
20
Normalization to BCNF (cont.)
Method:
  D := {R};
  while ∃ R_i ∈ D that is not in BCNF do
  begin
    Find an FD X → Y such that
      (1) R_i is not in BCNF because of X → Y, and
      (2) XY ⊆ R_i;
    D := (D − {R_i}) ∪ {R_i − Y, XY}
  end;
21
Normalization to BCNF (cont.) *
Theorem: Algorithm LLJD-BCNF is correct.
Proof (sketch):
Every schema in D is in BCNF, because the algorithm does not stop otherwise.
D is a lossless-join decomposition because in each iteration R_i is decomposed into two smaller schemas, (R_i − Y) and XY, which satisfy the condition of the two-schema theorem (assuming X and Y are disjoint):
  (R_i − Y) ∩ XY = X → Y = (XY − (R_i − Y)).
22
Normalization to BCNF: An Example
Consider F = {B → AH, L → CAt} over Bank-Loans(Bank, Assets, Headquarter, Loan#, Customer, Amount).
Candidate key: LB.
Initialization: D = {BAHLCAt}.
23
Normalize to BCNF: An Example *
1st iteration: R_i = BAHLCAt is not in BCNF because B → AH is not a trivial FD and B is not a superkey.
Replace BAHLCAt by BAH and BLCAt. Hence, D = {BAH, BLCAt}.
BAH is in BCNF because in BAH, B is a candidate key.
24
Normalize to BCNF: An Example *
2nd iteration: R_i = BLCAt is not in BCNF because L → CAt is not a trivial FD and L is not a superkey in BLCAt.
Replace BLCAt by CLAt and BL. Hence, D = {BAH, CLAt, BL}.
CLAt is in BCNF because in CLAt, L is a candidate key.
BL is in BCNF (by the theorem below).
Final result: D = {BAH, CLAt, BL}. D happens to be dependency-preserving.
Theorem: Any relation schema with exactly two attributes is in BCNF.
25
Normalize to BCNF: Another Example *
Consider R(City, Street, Zipcode), and F = {CS → Z, Z → C}.
Candidate keys: CS, ZS.
Initialization: D = {CSZ}.
1st iteration: R = CSZ is not in BCNF because Z → C is not a trivial FD and Z is not a superkey.
D = {ZC, ZS}.
D is not dependency-preserving because CS → Z is not preserved.
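The decomposition loop can be sketched in Python under some stated assumptions. LLJD-BCNF leaves open how a violating FD is found; this sketch swaps in a brute-force search that tries every proper subset X of a schema and splits on Y = (X+ ∩ R_i) − X. That search is exponential in the schema size and is only meant for small classroom examples. The names `find_violation` and `bcnf_decompose` are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of a set of attributes under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def find_violation(schema, fds):
    """Return (X, Y) for some non-trivial X -> Y on schema where X is not a superkey."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            implied = closure(x, fds) & schema
            # Non-trivial (implied != x) and X is not a superkey (implied != schema).
            if implied != x and implied != schema:
                return x, implied - x
    return None


def bcnf_decompose(schema, fds):
    """Split schemas on BCNF violations until none remain."""
    done, todo = [], [frozenset(schema)]
    while todo:
        current = todo.pop()
        violation = find_violation(current, fds)
        if violation is None:
            done.append(current)
        else:
            x, y = violation
            todo.append(current - y)         # R_i - Y
            todo.append(frozenset(x | y))    # XY
    return set(done)


F = [({"B"}, {"A", "H"}), ({"L"}, {"C", "At"})]
print(bcnf_decompose({"B", "A", "H", "L", "C", "At"}, F))
```

On the two slide examples this reproduces {BAH, CLAt, BL} and {ZC, ZS}. On other inputs the result can depend on which violation is found first, since BCNF decompositions are not unique.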
26
Equivalence of FD Sets
Let F and G be two sets of FDs satisfied by R. F and G are equivalent, denoted by F ≡ G, if F+ = G+.
Example: F = {B → CD, AD → E, B → A} and G = {B → CDE, B → ABC, AD → E} are equivalent.
Check that every FD in F is also in G+ and that every FD in G is also in F+.
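The check described above does not require materializing F+ or G+: an FD X → Y is in G+ exactly when Y is contained in the closure of X under G. A minimal sketch, assuming FDs are (lhs, rhs) pairs of Python sets and with illustrative function names:

```python
def closure(attrs, fds):
    """Closure of a set of attributes under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in the closure of fds."""
    return rhs <= closure(lhs, fds)


def equivalent(f, g):
    """F ≡ G iff each set implies every FD of the other."""
    return (all(implies(g, lhs, rhs) for lhs, rhs in f)
            and all(implies(f, lhs, rhs) for lhs, rhs in g))


# The example from the slide.
F = [({"B"}, {"C", "D"}), ({"A", "D"}, {"E"}), ({"B"}, {"A"})]
G = [({"B"}, {"C", "D", "E"}), ({"B"}, {"A", "B", "C"}), ({"A", "D"}, {"E"})]
print(equivalent(F, G))
```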
27
Extraneous Attributes
Let F be a set of FDs. F contains an extraneous attribute A if there is an FD X → Y in F such that either
  A ∈ X, and [(F − {X → Y}) ∪ {(X − {A}) → Y}] ≡ F; or
  A ∈ Y, and [(F − {X → Y}) ∪ {X → (Y − {A})}] ≡ F.
An extraneous attribute is a "useless" attribute on either the left side or the right side of an FD.
28
Summary
A good schema should have three properties:
  BCNF (or 3NF if BCNF cannot be obtained)
  Lossless join
  Dependency preservation
A lossless-join BCNF decomposition is guaranteed to exist, but dependency preservation must be checked.
A lossless-join, dependency-preserving 3NF decomposition is guaranteed to exist (it requires finding the minimal cover).
29
Look Ahead
Next topic: SQL Overview & DDL
Read textbook: Chapter 8, 10.1-10.6