Download presentation
Presentation is loading. Please wait.
Published byIsabella Fletcher Modified over 8 years ago
1
Database Management Systems Course 236363 Faculty of Computer Science Technion – Israel Institute of Technology Lecture 7: Schema Normalization
2
Schema Anomalies Redundant storage –Repeatedly storing the same information Update anomaly –To change a repeated item, every occurrence should be changed Insertion anomaly –Some information cannot be stored without additional (possibly unavailable) information Deletion anomaly –Some information cannot be deleted without deleting additional (possibly desired) information studentemailcourseyearsemesterlecturer Almaalma@csDB2013SpringEldar Amiramir@eeDB2014WinterRoy
3
From ERD to Normalization We have learned how to design schemas using ERDs But it is often not enough for a proper translation into well designed relations ERD is limited in constraint representation; we need a more careful design to enforce such constraints It may be challenging to avoid anomalies when dependencies are complicated 3
4
Example name id addressStudent Enroll Track date name faculty campus Consult Consultant
5
Example A track has at most one consultant per faculty A track is contained in a single campus A consultant belongs to a single campus and faculty A faculty is in a single campus trackconsultantcampusfaculty track,facultyconsultant trackcampus consultantcampus,faculty trackconsultantfaculty facultycampus trackcampus What makes it “good”? Are there principles to follow? Can design be automated? Trackname faculty campus Consult Consultant facultycampus
6
The Refined Design Process (Normalization) 6 1. Define the involved attributes 2. Determine what dependencies hold in real life 3. Decide on desired properties 4. Decompose into multiple good (“normalized”) schemas 2a. Determine implied dependencies Which FDs allowed? Should all dependencies be enforced?
7
Notation During this lecture, we focus on schemas of a special type: a single relation over a set U of attributes, and a set F of FDs So, during this lecture a schema is simply a pair (U,F) where: – U is a finite set of attributes – F is a set of FDs over U –(In particular, we ignore the relation name and order among attributes) 7
8
Basic Terminology Let (U,F) be a schema Recall: A superkey is a set K of attributes such that K + contains every attribute in U Recall: A key is a superkey K that does not contain any other superkey –That is, if Y ⊊ K then Y is not a superkey Attributes of keys are called prime “Schema normalization” deals with the relationship between keys, prime attributes and nonprime attributes 8
9
History of Normal Forms 9 1970 1NF 2NF Nonprime attributes are not dependent on strict parts of keys Codd 3NF DB looks like a logical structure; assumed by default “Standard” normal form: a nonprime attribute can be determined only by a superkey 1971 BCNF 1974 Boyce & Codd 1977 4NF Fagin 1979 5NF A relation does not involve any “implicit” joins No nontrivial MVDs except for superkey No nontrivial FDs except for superkeys
10
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 10
11
Our Focus We mainly focus on BCNF and 3NF –Historically BCNF came after 3NF, but we start with BCNF since it is simpler In the end we will briefly review 4NF
12
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 12
13
Boyce-Codd Normal Form (BCNF) A schema (U,F) is in BCNF if every nontrivial FD implied by F has a superkey on its premise (lhs) That is, every XY in F + is such that – X is a superkey; or – Y ⊆ X
14
Examples 14 name, symbol,dean namesymbol, symboldean, deanname Faculty: BCNF follows,followed,fid follows,followedfid, fidfollows,followed Social network: BCNF track,faculty,consultant,campus Tracks: not BCNF state,city,street,zip state,city,streetzip, zipstate Address: not BCNF track,facultyconsultant, consultantfaculty, trackcampus, facultycampus
15
Can BCNF be Tested Efficiently? On the face of it, we need to consider every derived FD (exponentially many); however: T HEOREM : The following are equivalent: 1.The schema (U,F) is in BCNF (i.e., every nontrivial XY in F + is such that X is a superkey) 2.In every nontrivial XY in F, X is a superkey Hence, it suffices to check F Proof not given –But which direction is straightforward? So what would be an efficient BCNF testing? 15 Answer: For each XY in F, test whether U=Closure(F,X)
16
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 16
17
Third Normal Form (3NF) Recall: an attribute A is prime if it is a part of some key A schema is in 3NF if for every nonprime A and nontrivial derived XA, the set X is a superkey Equivalently, for every XA in F + at least one of the following holds: – X is a superkey –A∈X–A∈X – A is prime 17
18
Examples 18 name, symbol,dean namesymbol, symboldean, deanname Faculty: BCNF follows,followed,fid follows,followedfid, fidfollows,followed Social network: BCNF state,city,street,zip state,city,streetzip, zipstate Address: not BCNF 3NF not 3NF track,faculty,consultant,campus Tracks: not BCNF track,facultyconsultant, consultantfaculty, trackcampus, facultycampus
19
Testing 3NF The following algorithm works: For every nontrivial FD XY in F 1.Check whether X is a superkey 2.Check whether every attribute in Y\X is prime As we know, (1) can be tested efficiently What about (2)? –It is NP-complete! (hence, it is unlikely that it is solvable in polynomial time) And in fact, testing whether a schema is in 3NF is an NP-complete problem [Jou, Fischer1982] 19
20
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 20
21
Decomposition We can fix a “badly designed” schema by decomposing it into several smaller schemas But we need to do so correctly! –Do not change our intended information –Do not violate the FDs –Get a “well designed” fixed schema In this part, we will make the above formal First, we need a notation 21
22
Restricting a Set of FDs Let (U,F) be a schema, and let W be a subset of U We denote by F[W] the set of all the FDs XY in F + such that XY ⊆ W 22
23
Formal Definition A decomposition of a schema (U,F) is a collection (X 1,F 1 ), …, (X k,F k ) of schemas such that: – U = X 1 ∪⋯∪ X k That is, the X i cover all the attributes in U –For i=1,…,k we have (F i ) + =F + [X i ]= F [X i ] That is, each F i consists of the FDs imposed by F on X i 23
24
Decomposing and Composing Relations 24 A π X1X1 X2X2 X3X3 ⋯ XkXk r r1r1 r2r2 r3r3 rkrk (U,F)(X 1,F 1 )(X 2,F 2 )(X 3,F 3 )(X k,F k )
25
Representing F i Given the schema (U,F), it suffices to represent a decomposition using the collection {X 1,…, X k } without mentioning the FDs F i Since F i can be F [X i ] up to equivalence Problem: naively constructing F i as F + [X i ] can be impractical, since F + and F + [X i ] can be exponentially larger than U –This problem is unavoidable: It may be that F + [X i ] is not equivalent to any sub-exponential #FDs! –We keep this problem in mind – we will not assume that F + [X i ] can be instantiated efficiently 25
26
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 26
27
Obtaining Normal Forms Let N be a normal form (e.g., 3NF, BCNF) An N decomposition of a schema (U,F) is a decomposition {X 1,…,X k } of (U,F) such that each (X i,F[X i ]) is in N We will discuss 3NF decompositions and BCNF decompositions 27
28
Examples 28 ABCD AB, BC, ABCD, DB BC BCBC BD AD DBDBADAD 3NF decomposition? BCNF decomposition? ABCD ABC AD ABC CB ABC CB ABCD ABC ACD ABBCACABBCAC AB BC CD CD ACD Answer: BCNF, 3NF Answer: 3NF, not BCNF Answer: not 3NF, not BCNF
29
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 29
30
Good Decomposition? 30 personbuildingroom AlmaTaub152 AmirMeyer35 AhuvaMeyer246 personbuilding,room personbuilding AlmaTaub AmirMeyer AhuvaMeyer buildingroom Taub152 Meyer35 Meyer246
31
Lossless Decomposition Let {X 1,...,X k } be a decomposition of (U,F) We say that {X 1,...,X k } is a lossless decomposition of (U,F) if for all relations r over (U,F) we have: π X 1 (r) ⋯ π X k (r) = r Containment in one direction always holds: π X 1 (r) ⋯ π X k (r) ⊇ r What about the other direction? Depends on F ! 31
32
Example 1 32 personbuildingroom AlmaTaub152 AmirMeyer35 AhuvaMeyer246 personbuilding,room personbuilding AlmaTaub AmirMeyer AhuvaMeyer buildingroom Taub152 Meyer35 Meyer246 =? personbuilding AlmaTaub AmirMeyer AhuvaMeyer =? personroom Alma152 Amir35 Ahuva246 =? personroom Alma152 Amir35 Ahuva246 buildingroom Taub152 Meyer35 Meyer246
33
Example 2 33 personbuildingroom AlmaTaubt152 AmirMeyerm35 AhuvaMeyerm246 personbuilding,room roombuilding personbuilding AlmaTaub AmirMeyer AhuvaMeyer buildingroom Taubt152 Meyerm35 Meyerm246 =? personbuilding AlmaTaub AmirMeyer AhuvaMeyer =? personroom Almat152 Amirm35 Ahuvam246 =? personroom Almat152 Amirm35 Ahuvam246 buildingroom Taubt152 Meyerm35 Meyerm246
34
Decision Algorithm The definition of lossless is not constructive (check every possible relation) Next, we present a polynomial-time algorithm for this decision problem 34 Losslessness Testing Given:Goal: U, F, X 1,...,X k {X 1,...,X k } is a decomposition of (U,F) Determine whether {X 1,...,X k } is a lossless decomposition
35
The Case of Binary Decomposition T HEOREM : Let {X 1,X 2 } be a decomposition of (U,F). The following are equivalent: 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition 35 So what would be a decision algorithm in this case? Answer: test whether Closure(F, X 1 ∩X 2 ) contains either X 1 or X 2
36
Proof: 1 ⇒ 2 36 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition A1A1 A2A2 A3A3 A4A4 A5A5 A1A1 A2A2 A3A3 a1a1 a2a2 a3a3 A2A2 A3A3 A4A4 A5A5 a2a2 a3a3 a4a4 a5a5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 a4a4 a5a5 t A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 x 14 x 15 x 21 a2a2 a3a3 a4a4 a5a5 We know that this is a subset of r, for some x’s F ⊨ A 2 A 3 A 4 A 5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 a4a4 a5a5 x 21 a2a2 a3a3 a4a4 a5a5 F ⊨ A 2 A 3 A 1 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 x 14 x 15 a1a1 a2a2 a3a3 a4a4 a5a5 In any case, we had t to begin with... hence lossless π t t We need to prove that t is here!
37
Proof: not 1 ⇒ not 2 37 Let Y=(X 1 ∩X 2 ) + and suppose that X 1 ⊈ Y, X 2 ⊈ Y Construct a relation r={t,u} over U : – t[Y]=u[Y]=(0,...,0) – t[U\Y]=(1,...,1) ; u[U\Y]=(2,...,2) Claim 1: r ⊨ F –Proof similar to completeness of Armstrong’s axioms Claim 2: π X 1 (r) π X 2 (r) ≠ r –The join contains a row with both 1 s and 2 s 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition A1A1 A2A2 A3A3 A4A4 A5A5 10001 20002 X1X1 X2X2 Y t u
38
Proof of Claim 1 Suppose not(r ⊨ F ). There exists X A such that F ⊨ X A that is violated. For a violation, it must be Y ⊇ X. But Y=Y +. So Y ⊇ X +. So A in Y. Thus there is no violation. Contradiction. 38
39
Illustration: not 1 ⇒ not 2 39 A1A1 A2A2 A3A3 A4A4 A5A5 10001 20002 A1A1 A2A2 A3A3 100 200 A2A2 A3A3 A4A4 A5A5 0001 0002 A1A1 A2A2 A3A3 A4A4 A5A5 10001 20002 10002 20001 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition 1.F ⊨ X 1 ∩X 2 X 1 or F ⊨ X 1 ∩X 2 X 2 2.{X 1,X 2 } is a lossless decomposition A 2 A 4, A 2 A 5 A 1 π
40
The General Case Next, we handle the general case of a decomposition (2 #schemas) 40 Losslessness Testing Given:Goal: U, F, X 1,...,X k {X 1,...,X k } is a decomposition of (U,F) Determine whether {X 1,...,X k } is a lossless decomposition
41
The Idea 41 A1A1 A2A2 A3A3 A4A4 A5A5 A1A1 A3A3 a1a1 a3a3 A2A2 A3A3 A4A4 a2a2 a3a3 a4a4 A4A4 A5A5 a4a4 a5a5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 a4a4 a5a5 t We need to prove that t is here! A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 x 12 a3a3 x 14 x 15 x 21 a2a2 a3a3 a4a4 x 25 x 31 x 32 x 33 a4a4 a5a5 We know that this is a subset of r, for some x’s r But some of the x’s may be known due to the FDs! F={A 3 A 5, A 4 A 5, A 5 A 2 A 4 } A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 x 12 a3a3 x 14 x 25 x 21 a2a2 a3a3 a4a4 x 25 x 31 x 32 x 33 a4a4 a5a5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 x 12 a3a3 x 14 a5a5 x 21 a2a2 a3a3 a4a4 a5a5 x 31 x 32 x 33 a4a4 a5a5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 a4a4 a5a5 x 21 a2a2 a3a3 a4a4 a5a5 x 31 x 32 x 33 a4a4 a5a5 t π A3A5A3A5 A4A5A4A5 A5A2A4A5A2A4
42
The General Case 1 st step: create the “known subset” –A table over U, one tuple t i for each X i : t i [A j ]=a j if X i contains A j, and t i [A j ]=x ij otherwise 2 nd step: chase –While the table changes do: Look for an FD violation and equate the conclusions “Equate” = change every occurrence of one to the other When equating a j with x ij, change x ij to a j 3 nd step: Return true iff there is a row of a i ’s 42 Losslessness Testing Given:Goal: U, F, X 1,...,X k {X 1,...,X k } is a decomposition of (U,F) Determine whether {X 1,...,X k } is a lossless decomposition
43
Example 43 A1A1 A2A2 A3A3 A4A4 A5A5 A1A1 A3A3 A2A2 A3A3 A4A4 A4A4 A5A5 Step 1: construct the known subset A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 x 12 a3a3 x 14 x 15 x 21 a2a2 a3a3 a4a4 x 25 x 31 x 32 x 33 a4a4 a5a5 F = {A 3 A 5, A 4 A 5, A 5 A 2 A 4 } A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 x 12 a3a3 x 14 x 25 x 21 a2a2 a3a3 a4a4 x 25 x 31 x 32 x 33 a4a4 a5a5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 x 12 a3a3 x 14 a5a5 x 21 a2a2 a3a3 a4a4 a5a5 x 31 x 32 x 33 a4a4 a5a5 A1A1 A2A2 A3A3 A4A4 A5A5 a1a1 a2a2 a3a3 a4a4 a5a5 x 21 a2a2 a3a3 a4a4 a5a5 x 31 x 32 x 33 a4a4 a5a5 Step 2: chase Step 3: return true
44
Think Why is this algorithm terminating in polynomial time? 44 Answer: Each iteration eliminates one symbol, and we have a polynomial #symbols
45
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 45
46
Preserving Dependencies 46 A π X1X1 X2X2 X3X3 ⋯ XkXk r r1r1 r2r2 r3r3 rkrk (U,F)(X 1,F 1 )(X 2,F 2 )(X 3,F 3 )(X k,F k ) Is F preserved given that each F i is preserved in each relation?
47
Example 1 47 ABCD AB, BC, ABCD, DB BC BCBC BD AD DBDB ADAD Are dependencies preserved in this decomposition? {AD,BD,BC} Answer: Yes!
48
Example 2 48 ABC BC AC CBCB ABC CB Are dependencies preserved in this decomposition? Is there any decomposition into binary schemas where dependencies are preserved? {BC,AC} Answer: No!
49
Formal Definition A decomposition X 1, …,X k of (U,F) is dependency preserving if for all relations r 1,...,r k over (X 1,F [X 1 ]), …, (X k,F [X k ]), respectively, r 1 ⋯ r k satisfies F Can we test whether a given decomposition has this property? T HEOREM : The following are equivalent: 1.For all r 1,...,r k over (X 1,F [X 1 ]),…,(X k,F [X k ]), respectively, the join r 1 ⋯ r k satisfies F 2.F + = (F [X 1 ] ∪⋯∪ F [X k ]) + 49
50
Testing for Dependency Preservation We need to test whether F + = (F 1 ∪⋯∪ F k ) + F + ⊇ F 1 ∪⋯∪ F k, so F + ⊇ (F 1 ∪⋯∪ F k ) + So, need to test whether F + ⊆ (F 1 ∪⋯∪ F k ) + It suffices to test whether each XY in F is implied by F 1 ∪⋯∪ F k –Or in other words, whether Y is a subset of the closure of X under F 1 ∪⋯∪ F k Next slide: efficient computation of the closure of X under F 1 ∪⋯∪ F k 50 Without explicitly calculating the F i ’s
51
Basic Claim If Z ⊆ X i then ( Z + F ∩ X i ) = Z + Fi 51
52
Closure w.r.t. a Decomposition 52 ClosureDecomp(X,F,X 1,...,X k ) { Y := X while(Y changes) for(i=1,...k) Y := Y ∪ ( Closure(Y∩X i,F) ∩ X i ) return Y } ClosureDecomp(X,F,X 1,...,X k ) { Y := X while(Y changes) for(i=1,...k) Y := Y ∪ ( Closure(Y∩X i,F) ∩ X i ) return Y } Given:Goal: U, F, X 1,...,X k, X {X 1,...,X k } is a decomposition of (U,F) X ⊆ U Compute the closure of X under F[X 1 ] ∪⋯∪ F[X k ]
53
Testing for Dependency Preservation 53 DepPreserving(X 1,...,X k,F) { for all (XY in F) if(Y ⊈ ClosureDecomp(X,F,X 1,...,X k )) return false return true } DepPreserving(X 1,...,X k,F) { for all (XY in F) if(Y ⊈ ClosureDecomp(X,F,X 1,...,X k )) return false return true } Dependency Preservation Testing Given:Goal: U, F, X 1,...,X k {X 1,...,X k } is a decomposition of (U,F) Determine whether {X 1,...,X k } is dependency preserving
54
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 54
55
Decomposition Algorithms Given a normal form N, we ask: –Is there always a lossless N decomposition? –Is there always a lossless & dependency preserving N decomposition? –Is there an efficient decomposition? We discuss 2 decomposition algorithms –BCNF decomposition Lossless –3NF decomposition Lossless, dependency preserving, p-time
56
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 56
57
Key Insight Recall: BCNF means that in every nontrivial XY, the set X is a superkey C LAIM : If (U,F) is not in BCNF, then there is a lossless decomposition {X 1,X 2 } with X 1,X 2 ⊊ U Proof: –Let XY be a BCNF violation ( X is not a superkey and Y is not a subset of X ) –Take X 1 =X + and X 2 =X ∪ (U\X + ) –Why are X 1 and X 2 strict subsets of U ? –Why lossless? Recall the theorem on binary lossless decompositions... 57
58
BCNF Decomposition 58 BCNFDec(U,F) { if ((U,F) in BCNF) return {U} Find a BCNF violation XY X 1 := Closure(X,F) F 1 := F + [X 1 ] X 2 := X ∪ (U\X 1 ) F 2 := F + [X 2 ] return BCNFDec(X 1,F 1 ) ∪ BCNFDec(X 2,F 2 ) } BCNFDec(U,F) { if ((U,F) in BCNF) return {U} Find a BCNF violation XY X 1 := Closure(X,F) F 1 := F + [X 1 ] X 2 := X ∪ (U\X 1 ) F 2 := F + [X 2 ] return BCNFDec(X 1,F 1 ) ∪ BCNFDec(X 2,F 2 ) }
59
Execution Example 59 ABCD AB, BC, ABCD, DB BCBC BCBC ABDABD ABDABD BCBCABD, DB BDBD BDBD ADAD ADAD DBDBADAD Are dependencies preserved in this decomposition? {AD,BD,BC} Answer: Yes, we already saw that previously
60
About the Algorithm Lossless –Proof idea: every step is lossless Exponential time in the worst case There is a polynomial-time algorithm for BCNF decomposition –[Tsou, Fischer, Decomposition of a relation scheme into Boyce-Codd Normal Form, 1982] The algorithm does not preserve dependencies! –But the problem is not with the algorithm... 60
61
Can Dependencies be Preserved? 61 ABC BC AC CBCB No BCNF decomposition of this schema preserves both dependencies (why?) Conclusion: Lossless BCNF decomposition is always possible; lossless & dependency-preserving BCNF decomposition may be impossible ABC CB
62
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 62
63
Algorithm for 3NF Decomposition We next describe an algorithm for 3NF decomposition First, some intuition 63
64
Intuition F={ AB, ABC, CB, DC } Idea: for dependency preservation, each XA becomes a schema AB ABC CD ABABABC CB BC CBCB DCDC Problem: not in 3NF Solution: minimal cover instead of F Problem: lossy Solution: add a key AD ABCD 1001 2002 ABC 100 200 CD 01 02 64
65
Reminder: Minimal Cover Let F be a set of FDs A minimal cover of F is a set G of FDs such that G + =F + with the following properties: –FDs in G have a single attribute on the right hand side; that is, they have the form XA –All FDs are required: no FD XA in G is such that G\{XA} ⊨ XA –All attributes are required: no FD XBA in G is such that G ⊨ XA 65
66
Producing A Minimal Cover Transform the FDs in F - a single attribute on the r.h.s., let G be the result Do 1.If there is a FD X A in G such that A in CLOSURE(X, G \ {X A} ) delete X A from G. 2.If there is XB A in G such that A in CLOSURE(X, G), replace XB A with X A in G. Until there are no more changes to G G is a minimal cover for the original F. Can show (1 2)* equivalent to 2* 1* but not to 1* 2* 66
67
Revised Example F={ AC, CB, DC } AC CD ACAC BC CBCB DCDC AD ABCD 1001 2002 { AB, ABC, CB, DC } min-cover AC 10 20 BC 00 AD 11 22 CD 01 02
68
68
69
Algorithm for 3NF Decomposition 69 3NFDec(U,F) { D = G := MinCover(F) for all (XA in G) do D := D ∪ {XA} if (no set in D is a superkey) D := D ∪ {FindKey(U,F)} D := RemoveConained(D) return D } 3NFDec(U,F) { D = G := MinCover(F) for all (XA in G) do D := D ∪ {XA} if (no set in D is a superkey) D := D ∪ {FindKey(U,F)} D := RemoveConained(D) return D } No need for schemas contained in others
70
About the Proof We will not prove the correctness here Still, what needs to be proved? –Resulting schemas are all in 3NF Follows from minimality of the cover –Dependencies are preserved Straightforward: all dependencies of the minimal cover are presented –Lossless What would the lossless-testing algorithm do when one X i is a key and dependencies are preserved? 70
71
Example Revisited 71 min-cover track,facultyconsultant trackcampus consultantfaculty facultycampus trackcampustrackconsultantfacultyconsultantfaculty removed contained trackconsultantfaculty trackconsultantcampusfaculty track,facultyconsultant trackcampus consultantcampus,faculty facultycampus facultycampus trackcampusfacultycampus
72
72
73
73
74
A problem? 74
75
Introduction Normal Forms BCNF 3NF Decomposition NF Decompositions Preserving Data Preserving Dependencies Decomposition Algorithms BCNF 3NF Note on 4NF Outline 75
76
Fourth Normal Form (4NF) Recall: An MVD has the form X ↠ Y where X and Y are disjoint sets of attributes –For every two tuples that agree on X, swapping their Y component doesn’t change the relation Recall: An MVD X ↠ Y is trivial (always holds) if and only if Y= ∅ or Y=U\X Recall: an FD XY can be viewed as a special type of the MVD X ↠ Y (why?) A schema (U,F), where F contains both FDs and MVDs, is in 4NF if every nontrivial FD/MVD has a superkey in its premise (lhs) –When all dependencies are FDs, same as BCNF
77
4NF Decomposition T HEOREM : Let (U,F) be a schema, where F contains both FDs and MVDs. Then F satisfies X ↠ Y iff for all relations r over U we have: r = π X ∪ Y (r) π X ∪ (U\Y) (r) Hence, the recursive decomposition algorithm for BCNF decomposition works here – Decompose(X ∪ Y) ∪ Decompose(X ∪ (U\Y)) –A polynomial time is known for special cases In particular, there is always a lossless 4NF decomposition –What about dependency preserving? –Answer: No! Even if there are only FDs (recall BCNF) 77
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.