1 Lecture 8 Design Theory for Relational Databases (part 2) Slides from

Slides:



Advertisements
Similar presentations
1 Lecture 7 Design Theory for Relational Databases (part 1) Slides based on
Advertisements

Schema Refinement: Normal Forms
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
1 Design Theory. 2 Minimal Sets of Dependancies A set of dependencies is minimal if: 1.Every right side is a single attribute 2.For no X  A in F and.
1 Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms.
1 Loss-Less Joins. 2 Decompositions uDependency-preservation property: enforce constraints on original relation by enforcing some constraints on resulting.
Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston.
1 Multivalued Dependencies Fourth Normal Form Source: Slides by Jeffrey Ullman.
1 The Relational Data Model Functional Dependencies.
Relational Design. DatabaseDesign Process Conceptual Modeling -- ER diagrams ER schema transformed to relational schema Designer may add additional integrity.
1 Multivalued Dependencies Fourth Normal Form. 2 Definition of MVD uA multivalued dependency (MVD) on R, X ->->Y, says that if two tuples of R agree on.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form Source: Slides by Jeffrey Ullman.
1 Functional Dependencies Meaning of FD’s Keys and Superkeys Inferring FD’s.
1 Functional Dependencies Meaning of FD’s Keys and Superkeys Inferring FD’s Source: slides by Jeffrey Ullman.
Normal Form Design addendum by C. Zaniolo. ©Silberschatz, Korth and Sudarshan7.2Database System Concepts Normal Form Design Compute the canonical cover.
Winter 2002Arthur Keller – CS 1804–1 Schedule Today: Jan. 15 (T) u Normal Forms, Multivalued Dependencies. u Read Sections Assignment 1 due. Jan.
1 Multivalued Dependencies Fourth Normal Form. 2 A New Form of Redundancy uMultivalued dependencies (MVD’s) express a condition among tuples of a relation.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form Source: Slides by Jeffrey Ullman.
Schema Refinement and Normalization Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus.
Cs3431 Normalization Part II. cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E.
Fall 2001Arthur Keller – CS 1804–1 Schedule Today Oct. 4 (TH) Functional Dependencies and Normalization. u Read Sections Project Part 1 due. Oct.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form.
1 Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Database Systems Normal Forms. Decomposition Suppose we have a relation R[U] with a schema U={A 1,…,A n } – A decomposition of U is a set of schemas.
Decompositions uDo we need to decompose a relation? wSeveral normal forms for relations. If schema in these normal forms certain problems don’t.
Database Management Systems Chapter 3 The Relational Data Model (III) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston.
Databases 1 Seventh lecture. Topics of the lecture Extended relational algebra Normalization Normal forms 2.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form.
Normalization Goal = BCNF = Boyce-Codd Normal Form = all FD’s follow from the fact “key  everything.” Formally, R is in BCNF if for every nontrivial FD.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Normalization for Relational Databases.
Lecture 6 Normalization: Advanced forms. Objectives How inference rules can identify a set of all functional dependencies for a relation. How Inference.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
CSC271 Database Systems Lecture # 28.
SCUJ. Holliday - coen 1784–1 Schedule Today: u Normal Forms. u Section 3.6. Next u Relational Algebra. Read chapter 5 to page 199 After that u SQL Queries.
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
IST 210 Normalization 2 Todd Bacastow IST 210. Normalization Methods Inspection Closure Functional dependencies are key.
1 Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms.
Third Normal Form (3NF) Zaki Malik October 23, 2008.
1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization.
1 Multivalued Dependencies Fourth Normal Form Reasoning About FD’s + MVD’s.
1 Multivalued Dependencies Fourth Normal Form Reasoning About FD’s + MVD’s.
Design Theory for RDB Normal Forms. Lu Chaojun, SJTU 2 Redundant because these info may be figured out by using FD s1  … What’s Bad Design? Redundancy.
1 Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms.
Databases 1 Sixth lecture. 2 Functional Dependencies X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes.
© D. Wong Functional Dependencies (FD)  Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X  Y means X functionally.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
1 CS 430 Database Theory Winter 2005 Lecture 8: Functional Dependencies Second, Third, and Boyce-Codd Normal Forms.
Chapter 7 Normalization Chapter 14 & 15 in Textbook.
Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms: BCNF, Third Normal Form Introduction to Multivalued Dependencies.
4NF & MULTIVALUED DEPENDENCY By Kristina Miguel. Review  Superkey – a set of attributes which will uniquely identify each tuple in a relation  Candidate.
1 Database Design: DBS CB, 2 nd Edition Physical RDBMS Model: Schema Design and Normalization Ch. 3.
Design Theory for Relational Databases
Schedule Today: Next After that Normal Forms. Section 3.6.
Normalization DBMS.
Normalization Dongsheng Lu Feb 21, 2003.
CPSC-310 Database Systems
Schedule Today: Jan. 23 (wed) Week of Jan 28
3.1 Functional Dependencies
BCNF and Normalization
Normalization.
Normalization.
Normalization and FD.
Normalization Dongsheng Lu Feb 21, 2003.
Multivalued Dependencies & Fourth Normal Form (4NF)
Functional Dependencies
Third Normal Form.
Multivalued Dependencies
Anomalies Boyce-Codd Normal Form 3rd Normal Form
Presentation transcript:

1 Lecture 8 Design Theory for Relational Databases (part 2) Slides from CS4416 Database Systems CS5122 Development of IS 2

2 First Normal Form (1NF) uFirst Normal Form wthe intersection of each row and column contains one and only one value.

3 clientNocNamepropert yNo rentrentStartownerNoownerName CR76John KayPG4 PG Jul-07 1-Sep-08 OW03 OW04 Tina Murphy Tony Shaw CR56Aline Stewart PG4 PG36 PG Sep Oct-07 1-Nov-09 OW03 OW04 Tina Murphy Tony Shaw Not in 1NF clientNocNamepropertyNorentrentStartownerNoownerName CR76John KayPG43501-Jul-07OW03Tina Murphy CR76John KayPG Sep-08OW04Tony Shaw CR56Aline StewartPG41501-Sep-06OW03Tina Murphy CR56Aline StewartPG Oct-07OW04Tony Shaw CR56Aline StewartPG Nov-09OW04Tony Shaw 1NF

4 Assume a property can be rented by only one client on a particular day and property owners don't change. FDs: propertyNo rentStart  clientNo rent clientNo  cName propertyNo rentStart  cName propertyNo  ownerNo propertyNo rentStart  ownerNo ownerNo  ownerName propertyNo rentStart  ownerName Key {properyNo, rentStart} Partial dependency clientNocNamepropertyNorentrentStartownerNoownerName CR76John KayPG43501-Jul-07OW03Tina Murphy CR76John KayPG Sep-08OW04Tony Shaw CR56Aline StewartPG41501-Sep-06OW03Tina Murphy CR56Aline StewartPG Oct-07OW04Tony Shaw CR56Aline StewartPG Nov-09OW04Tony Shaw Transitive dependency

5 Second Normal Form (2NF) uSecond Normal form wA relation R is in 2NF if it is in 1NF and every non-key attribute of R is fully functionally dependent on every key of R. wTransitive FDs can still be present. An FD X  A in relation R is a transitive FD if there exist a set of attributes Z in R that is neither a key nor a subset of any key of R and both X  Z and Z  A hold.

6 Not in 2NF clientNocNamepropertyNorentrentStartownerNoownerName CR76John KayPG43501-Jul-07OW03Tina Murphy CR76John KayPG Sep-08OW04Tony Shaw CR56Aline StewartPG41501-Sep-06OW03Tina Murphy CR56Aline StewartPG Oct-07OW04Tony Shaw CR56Aline StewartPG Nov-09OW04Tony Shaw propertyNoownerNoownerName PG4OW03Tina Murphy PG16OW04Tony Shaw PG4OW03Tina Murphy PG36OW04Tony Shaw PG16OW04Tony Shaw clientNocNamepropertyNorentrentStart CR76John KayPG43501-Jul-07 CR76John KayPG Sep-08 CR56Aline StewartPG41501-Sep-06 CR56Aline StewartPG Oct-07 CR56Aline StewartPG Nov-09 2NF

7 Boyce-Codd Normal Form uIntroduced to define a relation in 2NF without transitive FDs. uWe say a relation R is in BCNF if whenever X  Y is a nontrivial FD that holds in R, X is a superkey. wRemember: nontrivial means Y is not contained in X. wRemember, a superkey is any superset of a key (not necessarily a proper superset).

8 Example Drinkers(name, addr, beersLiked, manf, favBeer) FD’s: name  addr favBeer, beersLiked  manf uOnly key is {name, beersLiked}. uIn each FD, the left side is not a superkey. uAny one of these FD’s shows Drinkers is not in BCNF

9 Another Example Beers(name, manf, manfAddr) FD’s: name  manf, manf  manfAddr uOnly key is {name}. uname  manf does not violate BCNF, but manf  manfAddr does.

10 Decomposition into BCNF uGiven: relation R with FD’s F. uLook among the given FD’s for a BCNF violation X  Y. wIf any FD following from F violates BCNF, then there will surely be an FD in F itself that violates BCNF. uCompute X +. wNot all attributes, or else X is a superkey.

11 Decompose R Using X  Y uReplace R by relations with schemas: 1. R 1 = X R 2 = R – (X + – X ). uProject given FD’s F onto the two new relations.

12 Decomposition Picture R-X +R-X + XX +-XX +-X R2R2 R1R1 R

13 Example: BCNF Decomposition Drinkers(name, addr, beersLiked, manf, favBeer) F =name  addr, name  favBeer, beersLiked  manf uPick BCNF violation name  addr. uClose the left side: {name} + = {name, addr, favBeer}. uDecomposed relations: 1.Drinkers1(name, addr, favBeer) 2.Drinkers2(name, beersLiked, manf)

14 Example -- Continued uWe are not done; we need to check Drinkers1 and Drinkers2 for BCNF. uProjecting FD’s is easy here. uFor Drinkers1(name, addr, favBeer), relevant FD’s are name  addr and name  favBeer. wThus, {name} is the only key and Drinkers1 is in BCNF.

15 Example -- Continued uFor Drinkers2(name, beersLiked, manf), the only FD is beersLiked  manf, and the only key is {name, beersLiked}. wViolation of BCNF. ubeersLiked + = {beersLiked, manf}, so we decompose Drinkers2 into: 1.Drinkers3(beersLiked, manf) 2.Drinkers4(name, beersLiked)

16 Example -- Concluded uThe resulting decomposition of Drinkers : 1.Drinkers1(name, addr, favBeer) 2.Drinkers3(beersLiked, manf) 3.Drinkers4(name, beersLiked) uNotice: Drinkers1 tells us about drinkers, Drinkers3 tells us about beers, and Drinkers4 tells us the relationship between drinkers and the beers they like.

17 Third Normal Form -- Motivation uThere is one structure of FD’s that causes trouble when we decompose. uAB  C and C  B. wExample: A = street address, B = city, C = zip code. wAssuming there are no two streets with the same name in the same city. uThere are two keys, {A,B } and {A,C }. uC  B is a BCNF violation, so we must decompose into AC, BC.

18 We Cannot Enforce FD’s uThe problem is that if we use AC and BC as our database schema, we cannot enforce the FD AB  C by checking FD’s in these decomposed relations. uExample with A = street, B = city, and C = zip on the next slide.

19 An Unenforceable FD street (A)zip (C) 545 Tech Sq Tech Sq city (B)zip (C) Cambridge02138 Cambridge02139 Join tuples with equal zip codes. street (A) city (B)zip (C) 545 Tech Sq.Cambridge Tech Sq.Cambridge02139 Although no FD’s were violated in the decomposed relations, FD street city  zip is violated by the database as a whole.

20 3NF Let’s Us Avoid This Problem u3 rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation. uAn attribute is prime if it is a member of any key. uX  A violates 3NF if and only if X is not a superkey, and also A is not prime.

21 3NF Definitions uWe say a relation R is in 3NF if for each nontrivial FD X  A that holds in R, either X is a superkey of R or A is prime. uAlternative definition: R is in 3NF if every non-prime attribute of R is both fully functionally dependent on every key of R and non-transitively dependent on every key of R.

22 1NF > 2NF > 3NF > BCNF BCNF 1NF 2NF 3NF

23 Example: 3NF uIn our problem situation with FD’s AB  C and C  B, we have keys AB and AC. uThus A, B, and C are each prime. uAlthough C  B violates BCNF, it does not violate 3NF.

24 What 3NF and BCNF Give You uThere are two important properties of a decomposition: 1.Lossless Join : it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original. 2.Dependency Preservation : it should be possible to check in the projected relations whether all the given FD’s are satisfied.

25 3NF and BCNF -- Continued uWe can get (1) with a BCNF decomposition. uWe can get both (1) and (2) with a 3NF decomposition. uBut we can’t always get (1) and (2) with a BCNF decomposition. wstreet-city-zip is an example.

26 3NF Synthesis Algorithm uWe can always construct a decomposition into 3NF relations with a lossless join and dependency preservation. uNeed minimal basis for the FD’s: 1.Right sides are single attributes. 2.No FD can be removed. 3.No attribute can be removed from a left side.

27 Constructing a Minimal Basis 1.Repeatedly try to remove an FD and see if the remaining FD’s are equivalent to the original. 2.Repeatedly try to remove an attribute from a left side and see if the resulting FD’s are equivalent to the original.

28 3NF Synthesis – (2) Step 1. Find a minimal basis G. Step 2. For each FD, say X  A in G, use XA as the schema of one of the relations in the decomposition. Step 3. If none of the attribute sets of relations from Step 2 is a superkey for R, add another relation whose schema is a key for R.

29 Example: 3NF Synthesis uRelation R = ABCD. uFD’s A  B and A  C. uDecomposition: AB and AC from the FD’s, plus AD for a key.

30 Why It Works uPreserves dependencies: each FD from a minimal basis is contained in a relation, thus preserved. uLossless Join: can be proved. u3NF: hard part – a property of minimal bases.