Schedule Today: Next After that Normal Forms. Section 3.6.

Presentation transcript:

Schedule
Today: Normal Forms. Section 3.6.
Next: Relational Algebra. Read chapter 5 to page 199.
After that: SQL Queries. Read Sections 6.1-6.2.
SCU J. Holliday - coen 178

Normalization
Goal = BCNF = Boyce-Codd Normal Form = all FD’s follow from the fact “key → everything.”
Formally, R is in BCNF if for every nontrivial FD for R, say X → A, X is a superkey.
“Nontrivial” = right-side attribute not in left side.
Why?
1. Guarantees no redundancy due to FD’s.
2. Guarantees no update anomalies (update anomaly = one occurrence of a fact is updated, but not all).
3. Guarantees no deletion anomalies (deletion anomaly = a valid fact is lost when a tuple is deleted).

Example of Problems
Drinkers(name, addr, beerLiked, manf, favoriteBeer)
FD’s:
1. name → addr
2. name → favoriteBeer
3. beerLiked → manf
Table entries marked ??? are redundant, since we can figure them out from the FD’s.
Update anomalies: If Bill transfers to UC Berkeley, will we remember to change addr in each of his tuples?
Deletion anomalies: If nobody likes Bud, we lose track of Bud’s manufacturer.

Another Example
In Drinkers, the key is {name, beerLiked}. Each of the 3 given FD’s has a left side that is a proper subset of the key, so each is a BCNF violation.
Another Example
Beers(name, manf, manfAddr). FD’s = name → manf, manf → manfAddr. Only key is name.
manf → manfAddr violates BCNF with a left side unrelated to any key.
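The BCNF test on these two examples can be sketched in a few lines of Python. This is only an illustration: the function names `closure` and `bcnf_violations` and the encoding of FD's as (left side, right side) pairs of sets are assumptions made here, not part of the slides. The check looks only at the FD's that are given, which is enough to decide whether the relation itself is in BCNF.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the given nontrivial FDs whose left side is not a superkey."""
    relation = set(relation)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)              # nontrivial
        and not closure(lhs, fds) >= relation    # left side is not a superkey
    ]


drinkers = {"name", "addr", "beerLiked", "manf", "favoriteBeer"}
drinkers_fds = [
    ({"name"}, {"addr"}),
    ({"name"}, {"favoriteBeer"}),
    ({"beerLiked"}, {"manf"}),
]
beers = {"name", "manf", "manfAddr"}
beers_fds = [({"name"}, {"manf"}), ({"manf"}, {"manfAddr"})]

print(bcnf_violations(drinkers, drinkers_fds))  # all three FDs are reported
print(bcnf_violations(beers, beers_fds))        # only manf -> manfAddr is reported
```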

Decomposition to Reach BCNF
Given: relation R, and FD’s F.
If there is a nontrivial FD in F, X → B, and X is not a superkey, then R is not in BCNF.
Suppose relation R has BCNF violation X → B. We can decompose R into two or more relations so that each of the relations will be in BCNF.

1. Compute X+. Cannot be all attributes (why?).
2. Decompose R into X+ and (R − X+) ∪ X.
3. Find the FD’s for the decomposed relations. Project the FD’s from F = calculate all consequents of F that involve only attributes from X+ or only from (R − X+) ∪ X.

Example 1
R = (A, B, C, D), F = {A → BC, C → D}
The key is A (why?)
The functional dependency C → D violates BCNF (why?)
Decomposition:
1. Compute X+. C+ = CD
2. Decompose R: R1 = X+ and R2 = (R − X+) ∪ X. R1 = CD, R2 = ABC
3. Find the FD’s for the decomposed relations. (why?)

Example 2
R = Drinkers(name, addr, beerLiked, manf, favoriteBeer)
F =
1. name → addr
2. name → favoriteBeer
3. beerLiked → manf
Pick BCNF violation name → addr.
Close the left side: name+ = name addr favoriteBeer.
Decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beerLiked, manf)
Projected FD’s (skipping a lot of work):
For Drinkers1: name → addr and name → favoriteBeer.
For Drinkers2: beerLiked → manf.

BCNF violations?
(Repeating) Decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beerLiked, manf)
Projected FD’s:
For Drinkers1: name → addr and name → favoriteBeer.
For Drinkers2: beerLiked → manf.
BCNF violations?
For Drinkers1, name is key and all left sides of FD’s are superkeys.
For Drinkers2, {name, beerLiked} is the key, and beerLiked → manf violates BCNF.

Decompose Drinkers2
First set of decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beerLiked, manf)
Close beerLiked+ = beerLiked, manf.
Decompose Drinkers2 into:
Drinkers3(beerLiked, manf)
Drinkers4(name, beerLiked)
The resulting relations Drinkers1, Drinkers3, and Drinkers4 are all in BCNF.
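The repeated "find a violation, close its left side, split" procedure can be sketched as a short recursive function. This is a minimal sketch under stated assumptions: the names `closure`, `find_violation`, and `bcnf_decompose` are invented for illustration, and instead of projecting FD's explicitly it tests every subset X of a relation by intersecting X+ with that relation, which is exponential and only reasonable for small schemas. It may pick violations in a different order than the slides do, so the relations can come out in a different order.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(rel, fds):
    """Return (X, X+ within rel) for some X that violates BCNF in rel, else None."""
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            x_plus = closure(x, fds) & rel
            # X -> X+ is nontrivial (x_plus != x) and X is not a superkey of rel.
            if x_plus != x and x_plus != rel:
                return x, x_plus
    return None


def bcnf_decompose(rel, fds):
    """Split rel into X+ and (rel - X+) | X until no violation remains."""
    rel = set(rel)
    found = find_violation(rel, fds)
    if found is None:
        return [rel]
    x, x_plus = found
    return bcnf_decompose(x_plus, fds) + bcnf_decompose((rel - x_plus) | x, fds)


# Example 1: attributes are single letters, so strings double as attribute sets.
example1 = bcnf_decompose("ABCD", [("A", "BC"), ("C", "D")])

# Example 2: the Drinkers relation.
drinkers_fds = [
    ({"name"}, {"addr"}),
    ({"name"}, {"favoriteBeer"}),
    ({"beerLiked"}, {"manf"}),
]
example2 = bcnf_decompose(
    {"name", "addr", "beerLiked", "manf", "favoriteBeer"}, drinkers_fds
)
print(example1)
print(example2)
```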

Why Decompose This Way?
Eliminate unnecessary redundancy (update and delete anomalies).
Lossless join decomposition (recover original information with join on equality).
Dependency preserving (efficient checking of constraints).

Lossless Join Decomposition
If the decomposition of a schema to avoid redundant info is not done carefully, we can lose information and generate extra tuples when we try to reconstruct the original table from the decomposed ones.
Consider the decomposition of
emp-dept (ename, ssn, bdate, address, dnumber, dname, dmgrssn)
into
emp-mgr (ename, ssn, bdate, address, dmgrssn)
dept (dnumber, dname, dmgrssn)
This decomposition solves the redundant info problem. However, there can be problems joining the tables emp-mgr and dept.

Lossless Join Decomposition

emp-dept
ename  ssn          bdate    address          dnumber  dname  dmgrssn
John   222-33-4444  2/12/78  123 4th Street   5        CS     432-32-1234
Sue                 3/22/71  500 5th Street   1        EX     119-99-7883
Mike   111-22-3333  5/25/75  23 A Street      3        TS     432-32-1234
Bob                 9/21/62  568 Main Street

emp-mgr
ename  ssn          bdate    address          dmgrssn
John   222-33-4444  2/12/78  123 4th Street   432-32-1234
Sue                 3/22/71  500 5th Street   119-99-7883
Mike   111-22-3333  5/25/75  23 A Street      432-32-1234
Bob                 9/21/62  568 Main Street

dept
dnumber  dname  dmgrssn
1        EX     119-99-7883
3        TS     432-32-1234
5        CS     432-32-1234

If we try to recover the original info by doing a join of emp-mgr and dept, we get:

emp-mgr join dept
ename  ssn          bdate    address          dmgrssn      dnumber  dname
John   222-33-4444  2/12/78  123 4th Street   432-32-1234  3        TS
John   222-33-4444  2/12/78  123 4th Street   432-32-1234  5        CS
Sue                 3/22/71  500 5th Street   119-99-7883  1        EX
Mike   111-22-3333  5/25/75  23 A Street      432-32-1234  3        TS
Mike   111-22-3333  5/25/75  23 A Street      432-32-1234  5        CS
Bob                 9/21/62  568 Main Street

There are some extra rows here!! Another way of looking at it is that we lost some information when we decomposed emp-dept into emp-mgr and dept.
Why did this happen? We have the functional dependency (dnumber → dname dmgrssn), but not (dmgrssn → dnumber), and we joined on dmgrssn.
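The extra rows can be reproduced with a small sketch. It is an illustration under assumptions, not the slide's exact data: only the ename, dnumber, dname, and dmgrssn columns are kept, Bob is left out because his department values do not survive in this transcript, and Mike's dmgrssn is taken from the dept row for department 3.

```python
# Simplified emp-dept rows: (ename, dnumber, dname, dmgrssn).
emp_dept = [
    ("John", 5, "CS", "432-32-1234"),
    ("Sue", 1, "EX", "119-99-7883"),
    ("Mike", 3, "TS", "432-32-1234"),
]

# Project onto the two decomposed schemas.
emp_mgr = {(ename, mgr) for ename, _, _, mgr in emp_dept}
dept = {(dnumber, dname, mgr) for _, dnumber, dname, mgr in emp_dept}

# Natural join on the only shared attribute, dmgrssn.
joined = {
    (ename, dnumber, dname, mgr)
    for ename, mgr in emp_mgr
    for dnumber, dname, dept_mgr in dept
    if mgr == dept_mgr
}

# Rows the join produces that were never in the original table.
spurious = joined - set(emp_dept)
print(sorted(spurious))
```

Because the same manager runs departments 3 and 5, the join cannot tell which of the two John or Mike belongs to, so each of them is paired with both.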

Lossless Join Decomposition
In a lossless join decomposition into R1 and R2, at least one of the following dependencies is in F+:
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
Example: R = (A, B, C), F = {A → B, B → C}
Decomposition is: R1 = (A, B), R2 = (B, C)
This is a lossless join decomposition because R1 ∩ R2 = {B} and B → BC, so R1 ∩ R2 → R2.
What about the decomposition R1 = (A, C), R2 = (B, C)?
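The test stated above translates directly into a closure computation on the shared attributes. The following is a minimal sketch; the function names `closure` and `is_lossless` are assumptions made for illustration, and attributes are single letters so that a string can stand for an attribute set.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


F = [("A", "B"), ("B", "C")]
first = is_lossless("AB", "BC", F)   # shared attribute B, and B+ = BC covers R2
second = is_lossless("AC", "BC", F)  # shared attribute C, and C+ = C covers neither
print(first, second)
```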

Dependency Preserving Decomposition
This property ensures that checking updates for violation of FD’s is efficient.
Dependency preservation: Let Fi be the set of dependencies in F+ that include only attributes in Ri. The decomposition is dependency preserving if (∪ Fi)+ = F+, that is, for a 2-relation decomposition, (F1 ∪ F2)+ = F+.
Example: R = (A, B, C), F = {A → B, B → C}
R1 = (A, B), R2 = (B, C)
This is dependency preserving.
What about the decomposition R1 = (A, C), R2 = (B, C)?
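One way to check this property without listing every Fi is sketched below. It swaps in a standard textbook shortcut rather than the definition itself: for each FD X → Y in F, it grows X using only what each Ri can see (the closure under F, intersected with Ri) and then checks whether Y was reached. The function names are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(fds, parts):
    """True if every FD in fds follows from the FDs projected onto the parts."""
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                part = set(part)
                # What this part alone lets us derive from the attributes reached so far.
                gained = closure(reached & part, fds) & part
                if not gained <= reached:
                    reached |= gained
                    changed = True
        if not set(rhs) <= reached:
            return False
    return True


F = [("A", "B"), ("B", "C")]
first = preserves_dependencies(F, ["AB", "BC"])
second = preserves_dependencies(F, ["AC", "BC"])  # A -> B cannot be checked locally
print(first, second)
```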

3NF
One FD structure causes problems:
If you decompose, you can’t check all the FD’s only in the decomposed relations.
If you don’t decompose, you violate BCNF.
Structure: AB → C and C → B.
Example 1: title city → theatre and theatre → city.
Example 2: street city → zip, zip → city.
Keys: {A, B} and {A, C}, but C → B has a left side that is not a superkey.
Decompose into BC and AC. But you can’t check the FD AB → C in only these relations.

“Elegant” Workaround
Define the problem away.
A relation R is in 3NF iff (if and only if) for every nontrivial FD X → A, either:
1. X is a superkey, or
2. A is prime = member of at least one key.
Thus, the canonical problem goes away: you don’t have to decompose because all attributes are prime.

What 3NF Gives You
There are two important properties of a decomposition:
(1) We should be able to recover from the decomposed relations the data of the original. Recovery involves projection and join.
(2) We should be able to check that the FD’s for the original relation are satisfied by checking the projections of those FD’s in the decomposed relations.
You can always decompose into BCNF and satisfy (1).
We can decompose into 3NF and satisfy both (1) and (2).
But it is not always possible to decompose into BCNF and get both (1) and (2). Street-city-zip is an example of this point.

BCNF and 3NF
BCNF: Whenever a nontrivial functional dependency X → A holds in R, then X is a superkey of R.
3NF: Whenever a nontrivial functional dependency X → A holds in R, then X is a superkey of R OR each attribute of A is a member of a candidate key (prime).
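The difference between the two definitions shows up on the street-city-zip example. The sketch below is an illustration only: `candidate_keys`, `is_bcnf`, and `is_3nf` are names invented here, keys are found by brute force over attribute subsets, and only the given FD's are checked.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(rel, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    rel = set(rel)
    keys = []
    for size in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), size):
            s = set(combo)
            if closure(s, fds) >= rel and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def is_bcnf(rel, fds):
    """Every nontrivial given FD has a superkey on the left."""
    return all(
        set(rhs) <= set(lhs) or closure(lhs, fds) >= set(rel) for lhs, rhs in fds
    )


def is_3nf(rel, fds):
    """Every nontrivial given FD has a superkey on the left or only prime attributes on the right."""
    prime = set().union(*candidate_keys(rel, fds))
    return all(
        closure(lhs, fds) >= set(rel) or (set(rhs) - set(lhs)) <= prime
        for lhs, rhs in fds
    )


address = {"street", "city", "zip"}
address_fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]
beers = {"name", "manf", "manfAddr"}
beers_fds = [({"name"}, {"manf"}), ({"manf"}, {"manfAddr"})]

print(candidate_keys(address, address_fds))
print(is_bcnf(address, address_fds), is_3nf(address, address_fds))
print(is_bcnf(beers, beers_fds), is_3nf(beers, beers_fds))
```

Under this encoding, street-city-zip passes the 3NF test but not the BCNF test, while Beers fails both because manfAddr is not prime.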

Exercise
Consider the schema and 2 sets of FD’s F and E:
emp-dept (ename, ssn, bdate, address, dnumber, dname, dmgrssn)
F = { ssn → ename bdate address dnumber, dnumber → dname dmgrssn }
E = { ssn → ename address dnumber, ssn → dname bdate, dnumber → dname dmgrssn }
Are F and E equivalent?
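One way to approach the exercise mechanically: two sets of FD's are equivalent when each FD of one follows from the other, which reduces to closure computations. The helper names `covers` and `equivalent` below are assumptions made for this sketch.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g follows from the FDs in f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if f and g have the same closure, i.e. each covers the other."""
    return covers(f, g) and covers(g, f)


F = [
    ({"ssn"}, {"ename", "bdate", "address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgrssn"}),
]
E = [
    ({"ssn"}, {"ename", "address", "dnumber"}),
    ({"ssn"}, {"dname", "bdate"}),
    ({"dnumber"}, {"dname", "dmgrssn"}),
]

print(equivalent(F, E))
```

Working the closures by hand (ssn+ and dnumber+ under each set) is a good way to confirm what the script prints.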

3NF Decomposition
Find the canonical cover of F.
A canonical cover of a set of dependencies, F, has the following properties:
· No functional dependency contains an extraneous attribute, that is, an attribute that can be removed from the dependency without changing the closure of F.
· Each left side of a functional dependency in F is unique.
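A canonical cover can be computed with closure tests alone. The sketch below is one common way to do it, not necessarily the procedure used in the course: split right sides into single attributes, drop extraneous left-side attributes, drop FD's implied by the rest, then merge FD's with the same left side. The first input, {A → BC, B → C, A → B, AB → C}, is a hypothetical set chosen for illustration; the second is the set F = {A → BC, C → DE, DE → A} used elsewhere in these slides. Attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def canonical_cover(fds):
    """Return {left side: set of right-side attributes} for a canonical cover of fds."""
    # 1. Split right sides into single attributes.
    single = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]

    # 2. Remove extraneous left-side attributes.
    reduced = []
    for lhs, a in single:
        lhs = set(lhs)
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and a in closure(smaller, fds):
                lhs = smaller
        reduced.append((frozenset(lhs), a))
    reduced = list(dict.fromkeys(reduced))  # drop exact duplicates, keep order

    # 3. Remove FDs that already follow from the remaining ones.
    kept = list(reduced)
    for fd in reduced:
        rest = [g for g in kept if g != fd]
        if fd[1] in closure(fd[0], [(l, {r}) for l, r in rest]):
            kept = rest

    # 4. Merge FDs that share a left side.
    merged = {}
    for lhs, a in kept:
        merged.setdefault(lhs, set()).add(a)
    return merged


hypothetical = canonical_cover([("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")])
from_slides = canonical_cover([("A", "BC"), ("C", "DE"), ("DE", "A")])
print(hypothetical)
print(from_slides)
```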

3NF Decomposition Algorithm
1. Calculate the canonical cover Fc of F.
2. Set j = 0.
3. For each FD A → B in Fc: if none of the current schemas contains AB, then j = j + 1 and Rj = (A B).
4. If none of the schemas in the result contains a candidate key for the original R, then j = j + 1 and Rj = (any candidate key).

3NF Example
Example: R = (A, B, C, D, E), F = {A → BC, C → DE, DE → A}
Soln: Candidate keys are: A, C, and DE.
F is in canonical form.
R1 = (A, B, C), R2 = (C, D, E), R3 = (A, D, E)
No natural join produces spurious tuples, so the decomposition is lossless. Dependencies are preserved. All relations are in 3NF.
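The 3NF decomposition algorithm applied to this example can be sketched as follows. The sketch assumes the canonical cover is already available and is passed in directly; `synthesize_3nf` and `some_candidate_key` are names invented here. The second call uses a hypothetical cover {A → B, C → D}, not from the slides, to show the step that adds a candidate key.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def some_candidate_key(rel, fds):
    """Shrink the full attribute set until no attribute can be dropped."""
    rel = set(rel)
    key = set(rel)
    for a in sorted(rel):
        if closure(key - {a}, fds) >= rel:
            key -= {a}
    return key


def synthesize_3nf(rel, cover):
    """Build one schema per FD of the cover, then add a key schema if needed."""
    rel = set(rel)
    schemas = []
    for lhs, rhs in cover:
        schema = set(lhs) | set(rhs)
        if not any(schema <= existing for existing in schemas):
            schemas.append(schema)
    # A schema contains a candidate key exactly when it is a superkey of rel.
    if not any(closure(s, cover) >= rel for s in schemas):
        schemas.append(some_candidate_key(rel, cover))
    return schemas


slide_result = synthesize_3nf("ABCDE", [("A", "BC"), ("C", "DE"), ("DE", "A")])
hypothetical_result = synthesize_3nf("ABCD", [("A", "B"), ("C", "D")])
print(slide_result)
print(hypothetical_result)
```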

Example - BCNF
R = (b-name, b-city, assets, c-name, loan#, amount)
F = {b-name → b-city assets, loan# → amount b-name}
Primary key = {loan#, c-name}
** R is not in BCNF: "b-name → b-city assets" is a nontrivial FD that holds on R, and "b-name → R" is not in F+ (that is, b-name is not a superkey).
Split R into R1 and R2:
R1 = (b-name, b-city, assets)
R2 = (b-name, c-name, loan#, amount)

Arthur Keller – CS 180 Winter 2002
Continued
R1 = (b-name, b-city, assets)
R2 = (b-name, c-name, loan#, amount)
** Now R1 is in BCNF, but R2 is not (why? loan# → amount b-name, but loan# is not a superkey of R2).
So, we split R2 into R3 and R4:
R3 = (b-name, loan#, amount)
R4 = (c-name, loan#)
The final decomposition is R1, R3, R4.

Exercise
Find the keys. How should this be decomposed?
R = (A, B, C, D, E), F = {A → BC, C → DE, DE → A}

Answers
R = (A, B, C, D, E), F = {A → BC, C → DE, DE → A}
Keys are: A, C, and DE.
R1 = (A, B, C), R2 = (C, D, E), R3 = (A, D, E)
The decomposition is lossless. Dependencies are preserved. All relations are in 3NF.