Normalization Part II cs3431.


1 Normalization Part II cs3431

2 Decomposing Relations
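StudentProf (sNumber, sName, pNumber, pName):
  s1  Dave  p1  MM
  s2  Greg  p2  MM
FDs: pNumber → pName
Option 1: Student (sNumber, sName, pNumber) and Professor (pNumber, pName)
  Student: (s1, Dave, p1), (s2, Greg, p2)
  Professor: (p1, MM), (p2, MM)
Option 2: Student (sNumber, sName, pName) and Professor (pNumber, pName)
  Student: (s1, Dave, MM), (s2, Greg, MM)
  Professor: (p1, MM), (p2, MM)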

3 Decomposition
Decomposition: must be lossless (no spurious tuples).

4 Decomposition: Lossless Join
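StudentProf (sNumber, sName, pNumber, pName):
  s1  Dave  p1  MM
  s2  Greg  p2  MM
Decomposed into Student (sNumber, sName, pName) and Professor (pNumber, pName):
  Student: (s1, Dave, MM), (s2, Greg, MM)
  Professor: (p1, MM), (p2, MM)
Joining Student and Professor gives StudentProf (sNumber, sName, pNumber, pName):
  s1  Dave  p1  MM
  s1  Dave  p2  MM   (spurious tuple)
  s2  Greg  p1  MM   (spurious tuple)
  s2  Greg  p2  MM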
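
The spurious tuples can be reproduced with a small sketch. It assumes, as the slide's tables suggest, that both professors are named MM and that the lossy split keeps pName in both relations. The helper names (`project`, `natural_join`, `as_set`) are illustrative, not part of the course material.

```python
def project(rel, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicate rows."""
    rows = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in rows:
            rows.append(row)
    return rows


def natural_join(r1, r2):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            shared = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in shared):
                joined.append({**t1, **t2})
    return joined


def as_set(rel):
    """Order-independent view of a relation, used only for comparisons."""
    return {tuple(sorted(t.items())) for t in rel}


# Assumed instance: both professors share the name MM.
student_prof = [
    {"sNumber": "s1", "sName": "Dave", "pNumber": "p1", "pName": "MM"},
    {"sNumber": "s2", "sName": "Greg", "pNumber": "p2", "pName": "MM"},
]

# Lossy split: the shared attribute pName is not a key of either relation.
student = project(student_prof, ["sNumber", "sName", "pName"])
professor = project(student_prof, ["pNumber", "pName"])
lossy = natural_join(student, professor)
spurious = as_set(lossy) - as_set(student_prof)

# Lossless split: the shared attribute pNumber is a key of Professor.
student_by_number = project(student_prof, ["sNumber", "sName", "pNumber"])
lossless = natural_join(student_by_number, professor)
```

Joining on pName pairs every student with every professor named MM, which is where the extra rows come from; joining on pNumber recovers exactly the original relation.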

5 Normalization
Once we have decided to decompose, what is the algorithm for a lossless decomposition?

6 Normalization Step : Decompose
Consider a relation R with attribute set AR, and an FD A → B such that no other attribute in (AR – A – B) is functionally determined by A.
If A is not a superkey for R, we decompose R as follows:
Create R’ with attributes (AR – B).
Create R’’ with attributes A ∪ B.
Key for R’’ = A.
Foreign key: R’ (A) references R’’ (A).
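
A minimal sketch of this step, assuming attributes are plain strings and FDs are (lhs, rhs) pairs of attribute collections. The function names `closure` and `decompose_step` are hypothetical. Taking B as everything A determines (A+ minus A) is one way to satisfy the slide's condition that A determines no attribute outside A and B.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def decompose_step(attrs, fds, lhs):
    """Split R on A -> B with A = lhs and B = A+ minus A.

    Returns (R', R'') as frozensets, or None when A is a superkey
    or determines nothing beyond itself (no split needed).
    """
    attrs = frozenset(attrs)
    lhs = frozenset(lhs)
    lhs_closure = closure(lhs, fds) & attrs
    if lhs_closure == attrs or lhs_closure == lhs:
        return None
    b = lhs_closure - lhs      # maximal right-hand side B
    r_prime = attrs - b        # R'  = AR - B
    r_double_prime = lhs | b   # R'' = A ∪ B, with key A
    return r_prime, r_double_prime


student_prof = {"sNumber", "sName", "pNumber", "pName"}
fds = [({"pNumber"}, {"pName"})]
split = decompose_step(student_prof, fds, {"pNumber"})
```

For StudentProf with pNumber → pName, `split` holds the attribute sets of Student (sNumber, sName, pNumber) and Professor (pNumber, pName).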

7 Example Decomposition Revisited
StudentProf (sNumber, sName, pNumber, pName):
  s1  Dave  p1  MM
  s2  Greg  p2  MM
FDs: pNumber → pName
Student (sNumber, sName, pNumber):
  s1  Dave  p1
  s2  Greg  p2
Professor (pNumber, pName):
  p1  MM
  p2  MM
FOREIGN KEY: Student (pNumber) references Professor (pNumber)

8 Schema Refinement : Normal Forms
Question: How do we decide whether any refinement of the schema is needed?
Idea: If a relation is in a certain normal form, then certain kinds of problems are known to be avoided or minimized.

9 Normal Form: BCNF
Boyce-Codd Normal Form (BCNF): For every non-trivial FD X → B in R, X is a superkey of R.

10 BCNF Example
Relation: SCI (student, course, instructor)
FDs: instructor → course
Decomposition: SI (student, instructor) and Instructor (instructor, course)
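
The BCNF condition can be checked mechanically with attribute closures. The sketch below is illustrative only: `closure` and `bcnf_violations` are hypothetical helper names, and the FD student, course → instructor is taken from the later dependency-preservation slide. Checking only the listed FDs is sufficient for a relation together with its own FD set; a sub-relation produced by decomposition would need its projected FDs instead.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the listed FDs that are non-trivial and whose LHS is not a superkey."""
    attrs = set(attrs)
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = attrs <= closure(lhs, fds)
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations


sci = {"student", "course", "instructor"}
sci_fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]
sci_bad = bcnf_violations(sci, sci_fds)

# The Instructor relation on its own, with its single FD.
instructor_bad = bcnf_violations({"instructor", "course"}, [sci_fds[1]])
```

Here `sci_bad` contains only instructor → course, since instructor alone does not determine student, while `instructor_bad` is empty.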

11 Decomposition Algo into BCNF
Algorithm: Apply the lossless decomposition step repeatedly until all relations are in BCNF.
Result: relations that are in BCNF, and a lossless-join decomposition.
Note: the algorithm is guaranteed to terminate.
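
One possible rendering of the repeated decomposition, as a sketch rather than the course's reference implementation. To find violations inside a sub-relation it tries every attribute subset X and looks at X+ restricted to that relation, which is exponential and only suitable for small classroom schemas. All function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(rel, fds):
    """Return (X, X+ within rel) for some X that is neither trivial nor a superkey of rel."""
    rel = frozenset(rel)
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = frozenset(combo)
            x_plus = frozenset(closure(x, fds)) & rel
            if x_plus != x and x_plus != rel:
                return x, x_plus
    return None


def bcnf_decompose(rel, fds):
    """Recursively split rel until no sub-relation has a BCNF violation."""
    rel = frozenset(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return [rel]
    x, x_plus = violation
    r_prime = rel - (x_plus - x)   # R'  = AR - B
    r_double_prime = x_plus        # R'' = A ∪ B
    return bcnf_decompose(r_prime, fds) + bcnf_decompose(r_double_prime, fds)


sci_fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]
sci_result = bcnf_decompose({"student", "course", "instructor"}, sci_fds)
```

Each split produces two strictly smaller relations, which is why the recursion terminates. For SCI the result is SI (student, instructor) and Instructor (instructor, course), matching the BCNF example slide.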

12 Decomposition Example
Consider relation CSJDPQV, where C is the key, JP → C, and SD → P.
Decomposition: CSJDQV and SDP.
Is it lossless? Yes!
Is it in BCNF? Yes!
Problem: Can JP → C be checked? It requires a join of the decomposed relations!
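
Both answers can be checked with a short sketch. It relies on the standard test for a two-way split, which the slide does not state: the decomposition is lossless if the shared attributes functionally determine all of one side. Attributes are single letters here, and the helper names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """A two-way split is lossless if the shared attributes determine r1 or r2."""
    shared_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= shared_closure or set(r2) <= shared_closure


# C is the key, so C determines every attribute.
fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
parts = ["CSJDQV", "SDP"]

lossless = is_lossless_binary(parts[0], parts[1], fds)

# JP -> C can be checked inside one table only if some table holds J, P and C.
jp_c_local = any(set("JPC") <= set(part) for part in parts)
```

The shared attributes are S and D, and SD → P makes SD a key of SDP, so `lossless` is true. No single relation contains J, P and C together, so `jp_c_local` is false, which is the problem the slide points out.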

13 Decomposition : Dependency Preserving
Intuition: Can we check the functional dependencies locally in each decomposed relation, and thereby ensure that all constraints are enforced globally?

14 Dependency Preserving Decompositions
A decomposition of R into X and Y is dependency preserving if (FX ∪ FY)+ = F+.
Is the following decomposition dependency preserving? ABC with A → B, B → C, C → A, decomposed into AB and BC. Is C → A preserved?

15 Dependency Preserving Decompositions
A decomposition of R into X and Y is dependency preserving if (FX ∪ FY)+ = F+.
It is important to consider F+, not F, in this definition.
Projection of a set of FDs F: If R is decomposed into X, Y, ..., then the projection of F onto X (denoted FX) is the set of FDs U → V in F+ (the closure of F) such that all attributes of U and V are in X.
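
Whether a given FD is preserved can be tested without writing out the projections, using a well-known iterative closure test (not shown on the slides): grow the left-hand side by repeatedly taking closures restricted to each decomposed relation. The sketch below applies it to the ABC example and to JP → C from the CSJDPQV example; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """Check whether lhs -> rhs follows from the projections of fds onto the relations."""
    reached = set(lhs)
    changed = True
    while changed:
        changed = False
        for rel in decomposition:
            rel = set(rel)
            # Attributes derivable using only FDs that live inside rel.
            gained = (closure(reached & rel, fds) & rel) - reached
            if gained:
                reached |= gained
                changed = True
    return set(rhs) <= reached


abc_fds = [("A", "B"), ("B", "C"), ("C", "A")]
c_to_a = is_preserved("C", "A", ["AB", "BC"], abc_fds)

csj_fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
jp_to_c = is_preserved("JP", "C", ["CSJDQV", "SDP"], csj_fds)
```

C → A turns out to be preserved: F+ contains C → B (projected onto BC) and B → A (projected onto AB), and together they imply C → A. This is why the definition has to use F+ rather than F. JP → C, by contrast, is not preserved.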

16 BCNF and Dependency Preservation
In general, a dependency-preserving decomposition into BCNF may not exist!
Example: CSZ with CS → Z and Z → C is not in BCNF. It cannot be decomposed while keeping the functional dependency CS → Z within one relation.

17 Example: Dependency Preservation
BCNF does not necessarily preserve FDs.
FDs: student, course → instructor; instructor → course
SI (student, instructor):
  Dave  MM
  Dave  ER
Instructor (instructor, course):
  MM  DB 1
  ER  DB 1
SCI (from SI and Instructor) (student, instructor, course):
  Dave  MM  DB 1
  Dave  ER  DB 1
SCI violates the FD student, course → instructor.

18 Dependency Preservation
BCNF does not necessarily preserve FDs.
But: a 3NF decomposition that preserves all FDs can always be found.

19 Normal Form : 3NF
Third Normal Form (3NF): For every FD X → B in R, either
it is a trivial FD, or
X is a superkey of R, or
B is a prime attribute (B is part of a key).
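
The three conditions translate directly into a check, sketched below under the assumption that candidate keys are found by brute force over attribute subsets (fine for small schemas only). The helper names are hypothetical. The two test relations are SCI from the BCNF example and R (A, B, C, D) with B → D and C → D from the "Extreme Example" slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure covers attrs (brute force)."""
    attrs = frozenset(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            x = frozenset(combo)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so x is not minimal
            if closure(x, fds) >= attrs:
                keys.append(x)
    return keys


def violates_3nf(attrs, fds):
    """Return (lhs, attribute) pairs from the listed FDs that break 3NF."""
    attrs = frozenset(attrs)
    prime = frozenset().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        for b in sorted(set(rhs) - set(lhs)):   # ignore the trivial part
            if closure(lhs, fds) >= attrs:      # lhs is a superkey
                continue
            if b in prime:                      # b is part of some key
                continue
            bad.append((lhs, b))
    return bad


sci_fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]
sci_bad = violates_3nf({"student", "course", "instructor"}, sci_fds)

r_bad = violates_3nf("ABCD", [("B", "D"), ("C", "D")])
```

SCI passes because course is part of the key (student, course), even though instructor → course breaks BCNF. For R (A, B, C, D) the only key is ABC, D is not prime, and both FDs are reported.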

20 3NF
Important: A lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations is always possible!

21 3NF vs BCNF?
If R is in BCNF, then R is also in 3NF.
If R is in 3NF, R may not be in BCNF.
If R is in 3NF, some redundancy is possible.
3NF is a compromise to use when BCNF with good constraint enforcement is not achievable.

22 Algorithm : Decomposition into 3NF
The decomposition algorithm is used again, but it can typically stop earlier.
But how do we ensure dependency preservation?
Possible idea: If X → Y is not preserved, add relation XY.

23 3NF - example
Lot (propNo, county, lotNum, area, price, taxRate)
Candidate key: <county, lotNum>
FDs: county → taxRate; area → price
Decomposition: Lot (propNo, county, lotNum, area, price) and County (county, taxRate)

24 3NF - example
Lot (propNo, county, lotNum, area, price)
County (county, taxRate)
Candidate key for Lot: <county, lotNum>
FDs: county → taxRate; area → price
Decomposition: Lot (propNo, county, lotNum, area), County (county, taxRate), Area (area, price)

25 Extreme Example
Consider relation R (A, B, C, D) with primary key (A, B, C) and FDs B → D and C → D. 3NF or not?
R violates 3NF. Decomposing it, we get 2 relations: R1 (A, B, C) and R2 (B, D).
But what about C → D? Add R3 (C, D).

26 Extreme Example
Consider relation R (A, B, C, D) with primary key (A, B, C) and FDs B → D and C → D. R violates 3NF. Decomposing it, we get 3 relations: R1 (A, B, C), R2 (B, D), R3 (C, D).
Let us consider an instance where we need all 3 relations, and how the natural join (⋈) behaves.
R1 (A, B, C):
  a1  b1  c1
  a2  b2  c1
  a3  b1  c2
R2 (B, D):
  b1  d1
  b2  d2
R3 (C, D):
  c1  d1
  c2  d2
R1 ⋈ R2 (A, B, C, D): violates C → D
  a1  b1  c1  d1
  a2  b2  c1  d2
  a3  b1  c2  d1
R1 ⋈ R2 ⋈ R3 (A, B, C, D): no FD is violated
  a1  b1  c1  d1

27 Algorithm : Decomposition into 3NF
Idea 1: To ensure dependency preservation, if X → Y is not preserved, add relation XY. The problem is that XY may violate 3NF!
Idea 2: Instead of the given set of FDs F, use a minimal cover for F.

28 Minimal Cover for a Set of FDs
Minimal cover G for a set of FDs F:
Closure of F = closure of G.
The right-hand side of each FD in G is a single attribute.
If we modify G by deleting an FD or by deleting attributes from an FD in G, the closure changes.
Example: If both J → C and JP → C hold, then keep only the first one.

29 Minimal Cover for a Set of FDs
Theorem: Using a minimal cover of F+ in the decomposition guarantees that the decomposition is lossless-join and dependency-preserving.

30 Algorithm for Minimal Cover
Decompose each FD so that it has one attribute on the RHS.
Minimize the left side of each FD: check each attribute on the LHS to see whether it can be deleted while still preserving equivalence to F+.
Delete redundant FDs.
Note: Several minimal covers may exist.

31 Example : Minimal Cover
Given: A → B, ABCD → E, EF → GH, ACDF → EG

32 Example : Minimal Cover
Given: A → B, ABCD → E, EF → GH, ACDF → EG
Then the minimal cover is: A → B, ACD → E, EF → G, and EF → H
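
The three steps of the minimal cover algorithm can be sketched as follows and checked against this example. Attributes are single letters and each FD is an (lhs, rhs) pair of strings; `closure` and `minimal_cover` are hypothetical names. Since several minimal covers may exist, the order in which attributes and FDs are tried can affect which one comes out.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return a minimal cover as a list of (frozenset lhs, single attribute) pairs."""
    # Step 1: one attribute on each right-hand side.
    g = []
    for lhs, rhs in fds:
        for a in rhs:
            fd = (frozenset(lhs), a)
            if fd not in g:
                g.append(fd)

    def implies(fd_list, lhs, a):
        return a in closure(lhs, [(l, {r}) for l, r in fd_list])

    # Step 2: drop LHS attributes that are not needed.
    for i in range(len(g)):
        lhs, a = g[i]
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and implies(g, smaller, a):
                lhs = smaller
                g[i] = (lhs, a)

    # Step 3: drop FDs implied by the remaining ones.
    i = 0
    while i < len(g):
        lhs, a = g[i]
        rest = g[:i] + g[i + 1:]
        if implies(rest, lhs, a):
            g = rest
        else:
            i += 1
    return g


given = [("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")]
cover = minimal_cover(given)
```

On the given FDs, B drops out of ABCD → E because A → B, ACDF → E shrinks to ACD → E, and ACDF → G is redundant because ACD → E and EF → G already imply it.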

33 3NF Decomposition Algorithm
Compute a minimal cover G of F.
Decompose R, using the minimal cover G, into a lossless decomposition of R in which each Ri is in 3NF; Fi is the projection of F onto Ri.
Identify the dependencies in F that are not preserved, say X → A.
Create relation XA: the new relation XA preserves X → A.
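
The last two steps (find the FDs that are not preserved, then add a relation XA for each) can be sketched as below. The sketch assumes a lossless decomposition and a minimal cover are already available, and uses a standard iterative closure test for preservation. The function names are hypothetical, and the example is R (A, B, C, D) with B → D and C → D from the "Extreme Example" slides.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """Check whether lhs -> rhs follows from the projections of fds onto the relations."""
    reached = set(lhs)
    changed = True
    while changed:
        changed = False
        for rel in decomposition:
            rel = set(rel)
            gained = (closure(reached & rel, fds) & rel) - reached
            if gained:
                reached |= gained
                changed = True
    return set(rhs) <= reached


def add_unpreserved(decomposition, cover):
    """For each FD X -> A in the cover that is not preserved, add relation XA."""
    result = [frozenset(r) for r in decomposition]
    for lhs, rhs in cover:
        if not is_preserved(lhs, rhs, result, cover):
            result.append(frozenset(lhs) | frozenset(rhs))
    return result


cover = [("B", "D"), ("C", "D")]   # already a minimal cover
relations = add_unpreserved(["ABC", "BD"], cover)
```

Starting from R1 (A, B, C) and R2 (B, D), the FD B → D is already preserved by R2, while C → D is not, so R3 (C, D) is added.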

34 3NF Decomposition Algorithm
Why does it work?
Create relation XA: the new relation XA preserves X → A.
X is a key of XA because G is a minimal cover; hence no Y ⊂ X exists with Y → A.
We also know that A is a single attribute.
If another dependency exists in XA, its right-hand side can only be an attribute of X, which is prime, so 3NF still holds.

35 Summary
Step 1: BCNF is a good form for a relation. If a relation is in BCNF, it is free of redundancies that can be detected using FDs.
Step 2: If a relation is not in BCNF, we can try to decompose it into a collection of BCNF relations.
Step 3: If a lossless-join, dependency-preserving decomposition into BCNF is not possible (or is unsuitable given typical queries), consider decomposition into 3NF.
Note: Decompositions should be carried out while keeping performance requirements in mind.

