CPSC-310 Database Systems


CPSC-310 Database Systems Professor Jianer Chen Room 315C HRBB Lecture #7

Boyce-Codd Normal Form (BCNF)
Definition. A relation R is in Boyce-Codd Normal Form (BCNF) if every nontrivial FD X→A (i.e., A ∉ X) has a superkey X as its left side.
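The definition can be checked mechanically with attribute closures. The following is a minimal sketch, not part of the lecture: the function names, the representation of an FD as a pair of frozensets, and the relation R(A, B, C) with the single FD A→B are all assumptions made for illustration. It tests only the listed FDs, which is enough for a relation given together with its own FD set.

```python
def closure(attrs, fds):
    """Return attrs+ : every attribute determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs of frozensets (a representation
    chosen for this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_superkey(attrs, schema, fds):
    """X is a superkey when X+ contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


def bcnf_violations(schema, fds):
    """Return the listed FDs that are nontrivial and whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_superkey(lhs, schema, fds)
    ]


# Hypothetical relation R(A, B, C) with the single FD A -> B.
schema = frozenset("ABC")
fds = [(frozenset("A"), frozenset("B"))]
violations = bcnf_violations(schema, fds)
```

Here {A}+ = {A, B} does not cover C, so A is not a superkey and A→B is reported as a violation; the only key of this hypothetical relation is {A, C}.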

Decomposition into BCNF
Algorithm BCNF(R, T)
Input: A relation R and its FDs T
Output: A collection C of relations in BCNF
1. C = ∅; C’ = {(R, T)};
2. While C’ ≠ ∅ Do
   2.1 Pick (R’, T’) in C’ (and remove it from C’);
   2.2 If (R’, T’) has a BCNF violator X
   2.3 Then call Decomposition(X) to construct the two relations (R1, T1) and (R2, T2); add (R1, T1) and (R2, T2) to C’;
   2.4 Else add (R’, T’) to C;

Why Does the Algorithm BCNF Work?
Reasons:
* The algorithm Decomposition(X) always breaks a relation R into two relations with fewer attributes. X is not a superkey, so S1 = X+ misses at least one attribute of R; X→A is nontrivial, so S2 = X ∪ W misses A.
* A relation with only two attributes is always in BCNF.
Hence the loop in Algorithm BCNF terminates, and every relation it places in C is in BCNF.

Algorithm Decomposition(X)
1. Compute S1 = X+ using the FDs in R;
2. Let W be the set of attributes that are not in S1;
3. Make a relation R1 of schema S1;
4. Make a relation R2 with schema S2 = X ∪ W;
5. Compute the FDs for R1 and R2.
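The loop and the Decomposition(X) step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the lecture's implementation: instead of carrying projected FD sets (T1, T2) with each relation, it recomputes closures under the original FDs and intersects them with the current schema, and it finds a violator by brute force over attribute subsets, which is exponential in the number of attributes. All function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return attrs+ under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def find_violator(schema, fds):
    """Return some X inside schema that violates BCNF there, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = frozenset(combo)
            x_plus = closure(x, fds) & schema
            # X determines something new (nontrivial) but not everything.
            if x_plus != x and x_plus != schema:
                return x
    return None


def decompose(schema, x, fds):
    """Decomposition(X): S1 = X+ within the schema, S2 = X union W."""
    s1 = closure(x, fds) & schema
    w = schema - s1
    return s1, x | w


def bcnf(schema, fds):
    """Algorithm BCNF: split pending schemas until none has a violator."""
    done, pending = [], [frozenset(schema)]
    while pending:
        current = pending.pop()
        x = find_violator(current, fds)
        if x is None:
            done.append(current)
        else:
            pending.extend(decompose(current, x, fds))
    return done


# The Drinkers relation used in the lecture's example.
drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
drinkers_fds = [
    (frozenset({"name"}), frozenset({"addr"})),
    (frozenset({"name"}), frozenset({"favBeer"})),
    (frozenset({"beersLiked"}), frozenset({"manf"})),
]
result = bcnf(drinkers, drinkers_fds)
```

On the Drinkers relation this sketch picks its violators in a different order than the worked example, yet ends with the same three schemas. In general the final decomposition can depend on which violator is chosen at each step.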

Example
R = Drinkers(name, addr, beersLiked, manf, favBeer)
T = {name→addr, name→favBeer, beersLiked→manf}
The only key is {name, beersLiked}. Initially C = ∅ and C’ = {(R, T)}.

Step 1. Pick (R, T) from C’, and pick a BCNF violation name→addr in R.
Close the left side: {name}+ = {name, addr, favBeer}.
Decomposed relations:
   R1 = Drinkers1(name, addr, favBeer)
   R2 = Drinkers2(name, beersLiked, manf)
Add R1 and R2 to C’; C’ = {(R1, T1), (R2, T2)}. We are not done yet, since C’ ≠ ∅.

Step 2. Pick R1 = Drinkers1(name, addr, favBeer), with the FDs T1 = {name→addr, name→favBeer}.
{name} is the only key, so R1 = Drinkers1 is in BCNF. Add it to C; C’ = {(R2, T2)}.

Step 3. Pick R2 = Drinkers2(name, beersLiked, manf) from C’, with FDs T2 = {beersLiked→manf}. The only key is {name, beersLiked}, so beersLiked→manf is a BCNF violation.
Close the left side: {beersLiked}+ = {beersLiked, manf}.
Decompose Drinkers2:
   R3 = Drinkers3(beersLiked, manf), T3 = {beersLiked→manf}
   R4 = Drinkers4(name, beersLiked), no nontrivial FDs
Add R3 and R4 to C’. They are in BCNF, so they will be moved to C.

The resulting decomposition of Drinkers:
   R1 = Drinkers1(name, addr, favBeer)
   R3 = Drinkers3(beersLiked, manf)
   R4 = Drinkers4(name, beersLiked)
Notice: Drinkers1 tells about drinkers, Drinkers3 tells about beers, and Drinkers4 tells the relationship between drinkers and the beers they like.
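The closures and the key claimed in this example can be double-checked with a few lines of Python. This is only a sketch for verification; the helper names and the brute-force key search are assumptions of this illustration, not part of the lecture.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return attrs+ under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure covers the schema (brute force)."""
    keys = []
    attrs = sorted(schema)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            cand = frozenset(combo)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(cand, fds) >= schema:
                keys.append(cand)
    return keys


drinkers = frozenset({"name", "addr", "beersLiked", "manf", "favBeer"})
T = [
    (frozenset({"name"}), frozenset({"addr"})),
    (frozenset({"name"}), frozenset({"favBeer"})),
    (frozenset({"beersLiked"}), frozenset({"manf"})),
]

name_plus = closure({"name"}, T)          # used in Step 1
beers_plus = closure({"beersLiked"}, T)   # used in Step 3
keys = candidate_keys(drinkers, T)
```

The search visits candidates by increasing size, so any set that contains an already found key is skipped and only minimal keys are collected.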

Boyce-Codd Normal Form (BCNF)
Remarks on Algorithm BCNF.
* The algorithm constructs relations in BCNF.
* No information in the relation R is changed by the algorithm. This is because the algorithm Decomposition does not change information: R is the natural join of R1 and R2.

Eliminating the bad FD X → A (for all A)
Remarks. Algorithm Decomposition(X) eliminates the bad FD X→A:
* X→A is still an FD in R1 = (X+), but now X is a superkey for R1;
* X→A is not an FD in R2 = (X ∪ W) because A is not in X ∪ W.
Does Decomposition(X) change (i.e., lose or add extra) information for R? No.
Can R1 and R2 still have bad FDs? Yes.
However, Decomposition(X) may not preserve the FDs in R!

Step 5 of Decomposition(X) computes the FDs of R1 (and likewise of R2) as follows.
Algorithm Decomposed-FDs(R1)
1. T1 = ∅; // T1 is the set of FDs for R1
2. For each subset Y of S1 Do
   2.1 Compute Y+ using the FDs in R;
   2.2 For each A in S1 ∩ (Y+ \ Y), add Y → A to T1;
3. While changes Do
   3.1 Drop from T1 those FDs that are derivable from the others;
   3.2 For each XB → A in T1, if X → A is implied by T1, then replace XB → A in T1 by X → A.
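Decomposed-FDs can be sketched as two small functions, one for the projection in step 2 and one for the clean-up in step 3. This is a hedged sketch: the names project_fds and minimize are hypothetical, and the clean-up shrinks left sides first and removes redundant FDs afterwards (the usual textbook order) rather than looping over steps 3.1 and 3.2 until nothing changes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return attrs+ under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def project_fds(sub_schema, fds):
    """Step 2: every nontrivial Y -> A with Y and A inside sub_schema."""
    sub_schema = frozenset(sub_schema)
    attrs = sorted(sub_schema)
    projected = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            y = frozenset(combo)
            for a in sorted((closure(y, fds) & sub_schema) - y):
                projected.append((y, frozenset({a})))
    return projected


def minimize(fds):
    """Step 3: shrink left sides, then drop FDs implied by the others."""
    fds = list(fds)
    reduced = []
    for lhs, rhs in fds:
        for b in sorted(lhs):
            # Step 3.2: B is unnecessary if the smaller left side still gives A.
            if len(lhs) > 1 and rhs <= closure(lhs - {b}, fds):
                lhs = lhs - {b}
        reduced.append((lhs, rhs))
    reduced = list(dict.fromkeys(reduced))  # remove exact duplicates, keep order
    result = list(reduced)
    for fd in reduced:
        rest = [g for g in result if g != fd]
        # Step 3.1: drop an FD that follows from the remaining ones.
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result


# FDs of the Drinkers relation, projected onto Drinkers1(name, addr, favBeer).
T = [
    (frozenset({"name"}), frozenset({"addr"})),
    (frozenset({"name"}), frozenset({"favBeer"})),
    (frozenset({"beersLiked"}), frozenset({"manf"})),
]
T1 = minimize(project_fds({"name", "addr", "favBeer"}, T))
```

Step 2 enumerates every subset of S1, so the sketch is only practical for small schemas.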

Decomposition Does Not Preserve FDs
Example. Relation R(A,B,C) with FDs = {AB→C, C→B}, e.g., A = street address, B = city, C = zip code.
There are two keys, {A,B} and {A,C}, so C→B is a BCNF violation.
The algorithm Decomposition breaks R(A,B,C) into R1(A,C) with no nontrivial FDs, and R2(B,C) with FDs {C→B}.
However, neither R1 nor R2 can enforce AB→C because neither contains all of A, B, C.

Valid R1(A,C):
   Street Addr     Zip Code
   545 Tech Sq.    02138
   545 Tech Sq.    02139

Valid R2(B,C):
   City         Zip Code
   Cambridge    02138
   Cambridge    02139

However, the natural join R1(A,C) ⋈ R2(B,C) is invalid for R(A,B,C):
   Street Addr     City         Zip Code
   545 Tech Sq.    Cambridge    02138
   545 Tech Sq.    Cambridge    02139
{Street Addr, City} → Zip Code (i.e., AB → C) is violated.
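The same counterexample can be replayed in code. The sketch below is illustrative only; the row representation (a list of dicts) and the helper names natural_join and satisfies_fd are assumptions of this example.

```python
def natural_join(rows1, rows2):
    """Join two lists of dict rows on the column names they share."""
    joined = []
    for r1 in rows1:
        for r2 in rows2:
            shared = r1.keys() & r2.keys()
            if all(r1[c] == r2[c] for c in shared):
                joined.append({**r1, **r2})
    return joined


def satisfies_fd(rows, lhs, rhs):
    """True if rows that agree on all lhs columns also agree on all rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# R1(A, C): street address and zip code; no nontrivial FDs to check.
r1 = [
    {"street": "545 Tech Sq.", "zip": "02138"},
    {"street": "545 Tech Sq.", "zip": "02139"},
]
# R2(B, C): city and zip code; must satisfy zip -> city.
r2 = [
    {"city": "Cambridge", "zip": "02138"},
    {"city": "Cambridge", "zip": "02139"},
]

joined = natural_join(r1, r2)
r2_ok = satisfies_fd(r2, ["zip"], ["city"])                   # C -> B holds in R2
join_ok = satisfies_fd(joined, ["street", "city"], ["zip"])   # AB -> C fails in the join
```

Each table passes the checks that can be stated on it alone, yet the joined relation has two rows with the same street address and city but different zip codes.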

Boyce-Codd Normal Form (BCNF)
This leads to a further remark about Algorithm BCNF: the algorithm may not preserve FDs!

The 3rd Normal Form (3NF)
Thus, BCNF may be too restrictive. We are looking for a constraint that is less restrictive but preserves FDs.
An attribute A is prime if A is in some key.
Definition. A relation R is in 3NF if for every nontrivial FD X→A of R, either X is a superkey or A is prime.
* R is in BCNF ⟹ R is in 3NF
* X→A violates 3NF ⟹ X is not a superkey and A is not prime.
Re-look at the example: R(A,B,C) with FDs {AB→C, C→B}. R has keys AB and AC.
* R is not in BCNF because of C→B.
* However, R is in 3NF because in C→B, B is prime.
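The difference between the two normal forms on this example can be made concrete with a small checker. This is a sketch under assumptions: the function names are hypothetical, keys are found by brute force, and only the listed FDs are tested.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return attrs+ under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure covers the schema (brute force)."""
    keys = []
    attrs = sorted(schema)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            cand = frozenset(combo)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(cand, fds) >= schema:
                keys.append(cand)
    return keys


def violations(schema, fds, normal_form):
    """Listed FDs that violate "BCNF" or "3NF" for the given schema."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs or closure(lhs, fds) >= schema:
            continue  # trivial FD, or the left side is a superkey
        if normal_form == "3NF" and (rhs - lhs) <= prime:
            continue  # 3NF escape clause: the right side is prime
        bad.append((lhs, rhs))
    return bad


schema = frozenset("ABC")
fds = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]
keys = candidate_keys(schema, fds)
bcnf_bad = violations(schema, fds, "BCNF")
third_nf_bad = violations(schema, fds, "3NF")
```

Both keys AB and AC are found, every attribute is prime, and C→B is flagged only under the BCNF test.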

Recoverability and Dependency Preservation
Two important properties of a decomposition:
* Recoverability: The original relation can be reconstructed from the decomposed relations (by natural join).
* Dependency Preservation: The FDs in the original relation can be derived from the FDs in the decomposed relations.
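Both properties can be tested from the FDs alone. The sketch below uses two standard textbook tests that the slides do not spell out, so treat them as assumptions of this illustration: a split into two schemas S1 and S2 is recoverable when (S1 ∩ S2)+ contains S1 or S2, and an FD X→Y is preserved when Y can be reached from X by repeatedly taking closures restricted to one sub-schema at a time. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Return attrs+ under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_recoverable_binary(s1, s2, fds):
    """Two-schema test: the shared attributes determine all of S1 or all of S2."""
    common = closure(s1 & s2, fds)
    return common >= s1 or common >= s2


def is_preserved(fd, parts, fds):
    """Can fd be derived using only FDs that live inside a single part?"""
    lhs, rhs = fd
    reach = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            gained = (closure(reach & part, fds) & part) - reach
            if gained:
                reach |= gained
                changed = True
    return rhs <= reach


# R(A, B, C) with AB -> C and C -> B, split into R1(A, C) and R2(B, C).
fds = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]
parts = [frozenset("AC"), frozenset("BC")]

recoverable = is_recoverable_binary(parts[0], parts[1], fds)
keeps_c_to_b = is_preserved(fds[1], parts, fds)
keeps_ab_to_c = is_preserved(fds[0], parts, fds)
```

For this split the shared attribute C determines all of R2, so the decomposition is recoverable, while AB→C cannot be derived inside either part.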

Decomposition into 3NF
Algorithm 3NF(R, T)
Input: A relation R and its FDs T
Output: A collection C of relations in 3NF
1. While changes Do
   1.1 If X→A in T is implied by the other FDs in T Then remove X→A from T;
   1.2 If XB→A is in T and X→A is implied by T Then replace XB→A in T by X→A;
2. For each X→A in T Do construct a table R_{X,A} = π_{X ∪ {A}}(R), the projection of R onto the attributes of X and A, with FD {X→A};
3. If none of the relations in step 2 contains a key of R Then add another relation R’ whose schema is a key of R and with no nontrivial FDs.
It can be formally proved that the algorithm 3NF produces a set of relations in 3NF that is both recoverable and FD preserving.
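The three steps can be sketched in Python as follows. This is a minimal sketch with hypothetical function names, not the lecture's code; step 1 shrinks left sides first and removes redundant FDs afterwards, which is one common way to reach a minimal set, and relations whose schema is contained in another are not merged.

```python
def closure(attrs, fds):
    """Return attrs+ under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def minimal_basis(fds):
    """Step 1: singleton right sides, reduced left sides, no redundant FDs."""
    basis = [(lhs, frozenset({a})) for lhs, rhs in fds for a in sorted(rhs - lhs)]
    reduced = []
    for lhs, rhs in basis:
        for b in sorted(lhs):
            # Step 1.2: drop B when the smaller left side still determines A.
            if len(lhs) > 1 and rhs <= closure(lhs - {b}, basis):
                lhs = lhs - {b}
        reduced.append((lhs, rhs))
    reduced = list(dict.fromkeys(reduced))  # remove exact duplicates, keep order
    result = list(reduced)
    for fd in reduced:
        rest = [g for g in result if g != fd]
        # Step 1.1: drop an FD implied by the remaining ones.
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result


def some_key(schema, fds):
    """Shrink the full schema to one minimal key."""
    key = set(schema)
    for a in sorted(schema):
        if closure(key - {a}, fds) >= schema:
            key.discard(a)
    return frozenset(key)


def synthesize_3nf(schema, fds):
    """Steps 2 and 3: one schema per FD, plus a key schema if none contains a key."""
    schema = frozenset(schema)
    relations = [lhs | rhs for lhs, rhs in minimal_basis(fds)]
    if not any(closure(rel, fds) >= schema for rel in relations):
        relations.append(some_key(schema, fds))
    return relations


drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
T = [
    (frozenset({"name"}), frozenset({"addr"})),
    (frozenset({"name"}), frozenset({"favBeer"})),
    (frozenset({"beersLiked"}), frozenset({"manf"})),
]
relations = synthesize_3nf(drinkers, T)
```

On Drinkers the three FDs are already minimal, so step 2 yields three two-attribute schemas; none of them contains the key {name, beersLiked}, so step 3 adds it as a fourth schema.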

Summary
Two important properties of a decomposition:
* Recoverability: The original relation can be reconstructed from the decomposed relations.
* Dependency Preservation: The FDs in the original relation can be derived from the FDs in the decomposed relations.
BCNF decomposition is recoverable but not necessarily FD preserving.
3NF decomposition is both recoverable and FD preserving (however, 3NF relations may still contain some redundancy).