

3 Decompositions
- Do we need to decompose a relation?
  - There are several normal forms for relations. If a schema is in one of these normal forms, certain problems do not arise.
  - The two commonly used normal forms are third normal form (3NF) and Boyce-Codd normal form (BCNF).
- What problems does decomposition cause?
  - Lossless-join property: we must be able to get the original relation back by joining the resulting relations.
  - Dependency-preservation property: we must be able to enforce the constraints on the original relation by enforcing some constraints on the resulting relations.
  - Queries may require a join of the decomposed relations!

4 Finding All Implied FD’s in a Projected Schema
- Motivation: “normalization,” the process where we break a relation schema into two or more schemas.
- Example: ABCD with FD’s AB ->C, C ->D, and D ->A.
  - Decompose into ABC, AD.
  - What FD’s hold in ABC? Not only AB ->C, but also C ->A (because C ->D and D ->A)!
- Same as the F+ method, but restricted to those FD’s that involve only attributes of the projected schema.

5 Example: Projecting FD’s
- ABC with FD’s A ->B and B ->C. Project onto AC.
  - A+ = ABC; yields A ->B, A ->C. We do not need to compute AB+ or AC+.
  - B+ = BC; yields B ->C.
  - C+ = C; yields nothing.
  - BC+ = BC; yields nothing.
- Resulting FD’s: A ->B, A ->C, and B ->C.
- Projection onto AC: A ->C.
  - It is the only FD that involves a subset of {A, C}.
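The closure-based projection on the two slides above can be sketched in Python. This is a minimal illustration, not part of the original slides: the names `closure`, `project_fds`, and `fd` are hypothetical helpers, attributes are assumed to be single letters, and the subset enumeration is exponential, so it only suits small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project_fds(fds, target):
    """Nontrivial FDs X -> A with X and A inside target.

    Closes every nonempty subset X of target and keeps the attributes of
    X+ that lie in target but not in X. The result is not minimized.
    """
    target = frozenset(target)
    projected = set()
    for size in range(1, len(target) + 1):
        for subset in combinations(sorted(target), size):
            x = frozenset(subset)
            for a in (closure(x, fds) & target) - x:
                projected.add((x, a))
    return projected

def fd(lhs, rhs):
    """Helper (assumption): build an FD from strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

# Slide 5: ABC with A -> B and B -> C, projected onto AC.
ac_fds = project_fds([fd("A", "B"), fd("B", "C")], "AC")

# Slide 4: ABCD with AB -> C, C -> D, D -> A, projected onto ABC.
abc_fds = project_fds([fd("AB", "C"), fd("C", "D"), fd("D", "A")], "ABC")
```

Here `ac_fds` contains only A -> C, and `abc_fds` contains both AB -> C and the less obvious C -> A, along with other non-minimal consequences such as BC -> A.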

6 Boyce-Codd Normal Form
- We say a relation R is in BCNF if whenever X ->Y is a nontrivial FD that holds in R, X is a superkey.
  - Remember: nontrivial means Y is not contained in X.
  - Remember: a superkey is any superset of a key (not necessarily a proper superset).

7 Example
Drinkers(name, addr, beersLiked, manf, favBeer)
FD’s: name->addr favBeer, beersLiked->manf
- The only key is {name, beersLiked}.
- In each FD, the left side is not a superkey.
- Either one of these FD’s shows that Drinkers is not in BCNF.

8 Another Example
Beers(name, manf, manfAddr)
FD’s: name->manf, manf->manfAddr
- The only key is {name}.
- name->manf does not violate BCNF, but manf->manfAddr does.
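The BCNF test used in these two examples can be sketched as follows. This is an illustrative sketch only: `closure` and `bcnf_violations` are hypothetical names, and the check looks only at the FDs it is given, which is enough for the original relation but not for a projected one unless the FDs are projected first.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def bcnf_violations(attrs, fds):
    """Given FDs that are nontrivial and whose left side is not a superkey."""
    attrs = frozenset(attrs)
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not attrs <= closure(lhs, fds)]

drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
drinkers_fds = [
    (frozenset({"name"}), frozenset({"addr", "favBeer"})),
    (frozenset({"beersLiked"}), frozenset({"manf"})),
]

beers = {"name", "manf", "manfAddr"}
beers_fds = [
    (frozenset({"name"}), frozenset({"manf"})),
    (frozenset({"manf"}), frozenset({"manfAddr"})),
]

drinkers_bad = bcnf_violations(drinkers, drinkers_fds)  # both FDs violate BCNF
beers_bad = bcnf_violations(beers, beers_fds)           # only manf -> manfAddr
```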

9 Decomposition into BCNF
- Given: relation R with FD’s F.
- Look among the given FD’s for a BCNF violation X ->Y.
  - If any FD following from F violates BCNF, then there will surely be an FD in F itself that violates BCNF.
- Compute X+.
  - X+ is not all attributes, or else X would be a superkey.

10 Decompose R Using X -> Y
- Replace R by relations with schemas:
  1. R1 = X+
  2. R2 = X ∪ (R – X+)
- Project the given FD’s F onto the two new relations.

11 Example: BCNF Decomposition
Drinkers(name, addr, beersLiked, manf, favBeer)
F = name->addr, name->favBeer, beersLiked->manf
- Pick BCNF violation name->addr.
- Close the left side: {name}+ = {name, addr, favBeer}.
- Decomposed relations:
  1. Drinkers1(name, addr, favBeer)
  2. Drinkers2(name, beersLiked, manf)

12 Example -- Continued
- We are not done; we need to check Drinkers1 and Drinkers2 for BCNF.
- Projecting FD’s is easy here.
- For Drinkers1(name, addr, favBeer), the relevant FD’s are name->addr and name->favBeer.
  - Thus, {name} is the only key and Drinkers1 is in BCNF.

13 Example -- Continued
- For Drinkers2(name, beersLiked, manf), the only FD is beersLiked->manf, and the only key is {name, beersLiked}.
  - This is a violation of BCNF.
- {beersLiked}+ = {beersLiked, manf}, so we decompose Drinkers2 into:
  1. Drinkers3(beersLiked, manf)
  2. Drinkers4(name, beersLiked)

14 Example -- Concluded
- The resulting decomposition of Drinkers:
  1. Drinkers1(name, addr, favBeer)
  2. Drinkers3(beersLiked, manf)
  3. Drinkers4(name, beersLiked)
- Notice: Drinkers1 tells us about drinkers, Drinkers3 tells us about beers, and Drinkers4 tells us the relationship between drinkers and the beers they like.
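The whole Drinkers walkthrough can be reproduced with a short recursive sketch. It is one possible reading of the procedure on the slides, not the authors' code: `closure`, `project_fds`, and `bcnf_decompose` are hypothetical names, the violating FD is simply the first one found, and FD projection enumerates all subsets, so it is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project_fds(fds, target):
    """Nontrivial FDs X -> A over target, as (X, {A}) pairs in a stable order."""
    out = []
    for size in range(1, len(target) + 1):
        for subset in combinations(sorted(target), size):
            x = frozenset(subset)
            for a in sorted((closure(x, fds) & target) - x):
                out.append((x, frozenset([a])))
    return out

def bcnf_decompose(attrs, fds):
    """Split on the first violation X -> Y into X+ and X ∪ (R - X+), then recurse."""
    attrs = frozenset(attrs)
    for lhs, rhs in fds:
        x_plus = closure(lhs, fds) & attrs
        if not rhs <= lhs and x_plus != attrs:
            r1 = x_plus
            r2 = lhs | (attrs - x_plus)
            return (bcnf_decompose(r1, project_fds(fds, r1)) +
                    bcnf_decompose(r2, project_fds(fds, r2)))
    return [attrs]  # no violation: already in BCNF

drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
drinkers_fds = [
    (frozenset({"name"}), frozenset({"addr"})),
    (frozenset({"name"}), frozenset({"favBeer"})),
    (frozenset({"beersLiked"}), frozenset({"manf"})),
]
result = bcnf_decompose(drinkers, drinkers_fds)
```

With these inputs, `result` holds the three schemas named Drinkers1, Drinkers3, and Drinkers4 on the slides. A different choice of violating FD can produce a different, equally valid BCNF decomposition.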

15 Third Normal Form -- Motivation
- There is one structure of FD’s that causes trouble when we decompose: AB ->C and C ->B.
  - E.g.: A = street address, B = city, C = zip code.
- There are two keys, {A, B} and {A, C} (why?).
- C ->B is a BCNF violation, so we must decompose into AC, BC.

16 We Cannot Enforce FD’s
- Decomposed relations AC (street, zip) and BC (city, zip):
    street        | zip
    545 Tech Sq.  | 02138
    545 Tech Sq.  | 02139
    city          | zip
    Cambridge     | 02138
    Cambridge     | 02139
- Join tuples with equal zip codes:
    street        | city       | zip
    545 Tech Sq.  | Cambridge  | 02138
    545 Tech Sq.  | Cambridge  | 02139
- Although no FD’s were violated in the decomposed relations (C ->B holds in BC), the FD street city -> zip is violated by the database as a whole.
- We cannot enforce the FD AB ->C by checking FD’s in these decomposed relations.
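A small Python sketch makes the problem on this slide concrete. The tuples follow the slide's street/city/zip example; the zip values in the street-zip relation are an assumption where the transcript is garbled, and `satisfies` is a hypothetical helper that checks one FD against a set of tuples by column position.

```python
def satisfies(rows, lhs_idx, rhs_idx):
    """True if the FD (columns lhs_idx -> columns rhs_idx) holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        val = tuple(row[i] for i in rhs_idx)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, val) != val:
            return False
    return True

# Decomposed relations AC (street, zip) and BC (city, zip).
street_zip = {("545 Tech Sq.", "02138"), ("545 Tech Sq.", "02139")}
city_zip = {("Cambridge", "02138"), ("Cambridge", "02139")}

# Natural join on zip gives tuples (street, city, zip).
joined = {(s, c, z) for s, z in street_zip for c, z2 in city_zip if z == z2}

zip_to_city_ok = satisfies(city_zip, [1], [0])          # C -> B, checked in BC
street_city_to_zip_ok = satisfies(joined, [0, 1], [2])  # AB -> C, needs the join
```

Each decomposed relation passes every check that can be stated on it alone, yet `street_city_to_zip_ok` is false: the violation is only visible after joining.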

17 3NF Lets Us Avoid This Problem
- 3rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation.
- An attribute is prime if it is a member of any key.
- X ->A violates 3NF if and only if X is not a superkey and A is not prime.
- Equivalently, X ->A satisfies 3NF if and only if X is a superkey or A is prime.

18 Example: 3NF
- In our problem situation with FD’s AB ->C and C ->B, we have keys AB and AC.
- Thus A, B, and C are each prime.
- Although C ->B violates BCNF, it does not violate 3NF.
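The 3NF condition can be checked mechanically once the keys are known. The sketch below is illustrative: `candidate_keys` and `nf3_violations` are hypothetical names, keys are found by brute force over all attribute subsets, attributes are assumed to be single letters, and only the given FDs are tested.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(attrs, fds):
    """All minimal superkeys, smallest first (exponential; small schemas only)."""
    attrs = frozenset(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(sorted(attrs), size):
            x = frozenset(subset)
            if attrs <= closure(x, fds) and not any(k <= x for k in keys):
                keys.append(x)
    return keys

def nf3_violations(attrs, fds):
    """Pairs (X, A) from the given FDs where X is not a superkey and A is not prime."""
    attrs = frozenset(attrs)
    prime = frozenset().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        if attrs <= closure(lhs, fds):
            continue  # X is a superkey, so the FD is fine
        bad.extend((lhs, a) for a in sorted(rhs - lhs) if a not in prime)
    return bad

def fd(lhs, rhs):
    """Helper (assumption): build an FD from strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

zip_fds = [fd("AB", "C"), fd("C", "B")]      # street-city-zip pattern
zip_keys = candidate_keys("ABC", zip_fds)    # AB and AC
zip_bad = nf3_violations("ABC", zip_fds)     # empty: the schema is in 3NF

chain_bad = nf3_violations("ABC", [fd("A", "B"), fd("B", "C")])  # B -> C violates
```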

19 What 3NF and BCNF Give You
- There are two important properties of a decomposition:
  1. Lossless Join: it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original.
  2. Dependency Preservation: it should be possible to check in the projected relations whether all the given FD’s are satisfied.

20 3NF and BCNF -- Continued
- We can get (1) with a BCNF decomposition.
- We can get both (1) and (2) with a 3NF decomposition.
- But we can’t always get both (1) and (2) with a BCNF decomposition.
  - street-city-zip is an example.

21 3NF Synthesis Algorithm
- We can always construct a decomposition into 3NF relations with a lossless join and dependency preservation.
- We need a minimal basis for the FD’s:
  1. Right sides are single attributes.
  2. No attribute can be removed from a left side.
  3. No FD can be removed.

22 Computing Minimal Basis
- Step 1: Make the RHS of each FD a single attribute.
- Step 2: Eliminate unnecessary attributes from each LHS.
  - Algorithm: if the FD XB → A ∈ F (where B and A are single attributes) and X → A is entailed by F, then B is unnecessary.
- Step 3: Delete unnecessary FDs from F.
  - Algorithm: if F – {f} entails f, then f is unnecessary. If f is X → A, check whether A ∈ X+ when X+ is computed with respect to F – {f}.

23 3NF Synthesis – (2)
- One relation for each FD in the minimal basis.
  - Its schema is the union of the left and right sides.
- If no key is contained in the attributes of any one FD, then add one relation whose schema is some key.

24 Example
- {A → B, ABCD → E, EF → GH, ACDF → EG}
- Make each RHS a single attribute: {A → B, ABCD → E, EF → G, EF → H, ACDF → E, ACDF → G}
- Minimize LHS: ACD → E instead of ABCD → E (B follows from A).
- Eliminate redundant FDs:
  - Can ACDF → G be removed?
  - Can ACDF → E be removed?
- Final answer: {A → B, ACD → E, EF → G, EF → H}
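The three steps for computing a minimal basis can be sketched in Python and run on this example. This is a minimal sketch under stated assumptions: `minimal_basis` and `fd` are hypothetical names, attributes are single letters, and because the result of the procedure can depend on the order in which attributes and FDs are tried, other inputs may yield a different but equally valid minimal basis.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def minimal_basis(fds):
    # Step 1: split every right side into single attributes.
    basis = [(lhs, frozenset([a])) for lhs, rhs in fds for a in sorted(rhs)]

    # Step 2: drop an attribute B from a left side XB when X -> A
    # is already entailed by the current set of FDs.
    for i in range(len(basis)):
        original_lhs, rhs = basis[i]
        for b in sorted(original_lhs):
            smaller = basis[i][0] - {b}
            if smaller and rhs <= closure(smaller, basis):
                basis[i] = (smaller, rhs)

    # Step 3: drop exact duplicates, then every FD f entailed by the others.
    result = list(dict.fromkeys(basis))
    for f in list(result):
        rest = [g for g in result if g != f]
        if f[1] <= closure(f[0], rest):
            result = rest
    return result

def fd(lhs, rhs):
    """Helper (assumption): build an FD from strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

given = [fd("A", "B"), fd("ABCD", "E"), fd("EF", "GH"), fd("ACDF", "EG")]
basis = minimal_basis(given)
```

On this input the sketch reduces both ABCD → E and ACDF → E to ACD → E, removes ACDF → G because it follows from the remaining FDs, and ends with the four FDs listed as the final answer.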

25 Example
- {A → B, ABCD → E, EF → GH, ACDF → EG}

26 Synthesizing a 3NF Schema
- Step 1: Compute a minimal cover U of F. The decomposition is based on U, but since U+ = F+ the same functional dependencies apply.
- Step 2: Group the FDs in U by LHS, partitioning U into sets U1, U2, …, Un with one set per distinct LHS.
- Step 3: For each Ui, form a schema Ri = the set of attributes named in Ui.
  - Each FD of U applies to some Ri. Hence the decomposition is dependency preserving.
- Step 4: If no Ri is a superkey of R (that is, no Ri contains a key), add a schema R0 = a key of R.
  - R0 might be needed since not all attributes are necessarily contained in R1 ∪ R2 ∪ … ∪ Rn.
  - This guarantees a lossless decomposition.

27 Example
- Relation: R = ABCDEFGH
- FDs: {A → B, ABCD → E, EF → GH, ACDF → EG}
- Find the minimal cover: {A → B, ACD → E, EF → G, EF → H}
- Combine FDs with the same LHS: {A → B, ACD → E, EF → GH}
- New relations: AB, ACDE, EFGH
- Does any of these contain a key?
  - No, so add one (e.g., ACDF).
- Final tables: AB, ACDE, EFGH, ACDF
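Steps 2 to 4 of the synthesis can be sketched as follows, starting from a minimal cover that is already computed. The names `find_key`, `synthesize_3nf`, and `fd` are hypothetical, attributes are assumed to be single letters, and `find_key` returns just one key by greedily discarding attributes in alphabetical order.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def find_key(attrs, fds):
    """One key of attrs: discard each attribute whose removal keeps a superkey."""
    attrs = frozenset(attrs)
    key = set(attrs)
    for a in sorted(attrs):
        if attrs <= closure(key - {a}, fds):
            key.discard(a)
    return frozenset(key)

def synthesize_3nf(attrs, cover):
    """Group the minimal cover by LHS, one schema per group, plus a key if needed."""
    attrs = frozenset(attrs)
    groups = {}
    for lhs, rhs in cover:
        groups.setdefault(lhs, set(lhs)).update(rhs)
    schemas = [frozenset(s) for s in groups.values()]
    # Step 4: add a key schema when no schema is a superkey of R.
    if not any(attrs <= closure(s, cover) for s in schemas):
        schemas.append(find_key(attrs, cover))
    return schemas

def fd(lhs, rhs):
    """Helper (assumption): build an FD from strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

cover = [fd("A", "B"), fd("ACD", "E"), fd("EF", "G"), fd("EF", "H")]
schemas = synthesize_3nf("ABCDEFGH", cover)
```

For the relation on this slide the sketch produces AB, ACDE, EFGH and then adds ACDF, the key it finds by greedy reduction.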

28 Why It Works
- Preserves dependencies: each FD from a minimal basis is contained in a relation, thus preserved.
- Lossless Join: use the chase to show that the row for the relation that contains a key can be made all-unsubscripted variables.
- 3NF: hard part – a property of minimal bases.
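The chase argument mentioned above can be tried out with a small sketch. It is a simplified tableau chase, not a full implementation: `is_lossless` and `fd` are hypothetical names, an unsubscripted symbol is represented by the attribute name itself, a subscripted one by the pair (attribute, row index), and attributes are assumed to be single letters.

```python
def is_lossless(attrs, schemas, fds):
    """Chase test: True if some row becomes all-unsubscripted."""
    attrs = sorted(attrs)
    # One row per schema: unsubscripted where the schema has the attribute.
    rows = [{a: (a if a in s else (a, i)) for a in attrs}
            for i, s in enumerate(schemas)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if all(r1[a] == r2[a] for a in lhs):
                        for a in rhs:
                            x, y = r1[a], r2[a]
                            if x != y:
                                # Equate the two symbols, preferring the unsubscripted one.
                                keep, drop = (x, y) if not isinstance(x, tuple) else (y, x)
                                for row in rows:
                                    if row[a] == drop:
                                        row[a] = keep
                                changed = True
    return any(all(row[a] == a for a in attrs) for row in rows)

def fd(lhs, rhs):
    """Helper (assumption): build an FD from strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

cover = [fd("A", "B"), fd("ACD", "E"), fd("EF", "G"), fd("EF", "H")]
with_key = [frozenset("AB"), frozenset("ACDE"), frozenset("EFGH"), frozenset("ACDF")]
without_key = with_key[:3]

lossless_with_key = is_lossless("ABCDEFGH", with_key, cover)
lossless_without_key = is_lossless("ABCDEFGH", without_key, cover)
```

With the key schema ACDF included, its row is chased to all-unsubscripted symbols, so the join is lossless; without it, no row completes, which is why step 4 of the synthesis adds a key.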

29 BCNF via 3NF Synthesis
- Try 3NF synthesis to decompose the original relation R.
- This is lossless and dependency preserving.
- If all sub-relations are in BCNF, done!
- If not, use the BCNF decomposition on each violating sub-relation.
- If some FDs are not preserved, at least we tried hard!