Lecture 09: Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)

Presentation transcript:

Lecture 09: Functional Dependencies

Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)

Relational Schema Design
Conceptual Model: [E/R diagram: Person buys Product; attribute labels name, price, name, ssn, date]
Relational Model: plus FD’s
Normalization: eliminates anomalies

Functional Dependencies
Definition: A1, ..., Am → B1, ..., Bn holds in R if:
∀ t, t’ ∈ R, (t.A1 = t’.A1 ∧ ... ∧ t.Am = t’.Am ⇒ t.B1 = t’.B1 ∧ ... ∧ t.Bn = t’.Bn)
[Figure: relation R with two rows t and t’; if t, t’ agree on columns A1 ... Am, then t, t’ agree on columns B1 ... Bn]

Important Point! Functional dependencies are part of the schema! They constrain the possible legal data instances. At any point in time, the actual database may satisfy additional FD’s.

Examples
EmpID → Name, Phone, Position
Position → Phone, but not Phone → Position
EmpID  Name         Phone  Position
E0045  John Smith   2376   HR
E1847  Zheng Li     2376   QA
E3123  Lucy Larsen  1264   Developer
E9923  Zheng Li     1264   Developer
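
The table above can be checked mechanically. The following is a minimal sketch, not part of the lecture: the helper name `fd_holds` and the dictionary layout of the rows are assumptions made for illustration. It tests whether one particular instance satisfies an FD, which can refute an FD but cannot establish it as a schema constraint.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}  # maps lhs values -> the rhs values first seen with them
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # two rows agree on lhs but differ on rhs
    return True


# The employee instance from the slide (hypothetical dict encoding).
employees = [
    {"EmpID": "E0045", "Name": "John Smith", "Phone": "2376", "Position": "HR"},
    {"EmpID": "E1847", "Name": "Zheng Li", "Phone": "2376", "Position": "QA"},
    {"EmpID": "E3123", "Name": "Lucy Larsen", "Phone": "1264", "Position": "Developer"},
    {"EmpID": "E9923", "Name": "Zheng Li", "Phone": "1264", "Position": "Developer"},
]

print(fd_holds(employees, ["EmpID"], ["Name", "Phone", "Position"]))  # True
print(fd_holds(employees, ["Position"], ["Phone"]))                   # True
print(fd_holds(employees, ["Phone"], ["Position"]))                   # False
```

Phone 2376 appears with both HR and QA, which is why the last check fails.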

Formal definition of a key
A key is a set of attributes A1, ..., An s.t. for any other attribute B, A1, ..., An → B
A minimal key is a set of attributes which is a key and for which no subset is a key
Note: book calls them superkey and key

Examples of Keys
Product(name, price, category, color)
  name, category → price
  category → color
  Keys are: {name, category} and all supersets
Enrollment(student, address, course, room, time)
  student → address
  room, time → course
  student, course → room, time
  Keys are: [in class]
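
One way to explore these examples is a brute-force search for minimal keys. The sketch below is an illustration under assumptions, not the lecture's method: the function names `closure` and `minimal_keys` are hypothetical, FDs are encoded as (left side, right side) pairs of sets, and the search relies on the attribute closure that the lecture defines a few slides later. It is exponential in the number of attributes, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_keys(schema, fds):
    """Enumerate attribute sets by size; keep those that determine everything
    and contain no smaller key already found."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # a subset is already a key, so this is not minimal
            if closure(candidate, fds) >= schema:
                keys.append(candidate)
    return keys


product_fds = [({"name", "category"}, {"price"}), ({"category"}, {"color"})]
print(minimal_keys({"name", "price", "category", "color"}, product_fds))

enrollment_fds = [
    ({"student"}, {"address"}),
    ({"room", "time"}, {"course"}),
    ({"student", "course"}, {"room", "time"}),
]
print(minimal_keys({"student", "address", "course", "room", "time"}, enrollment_fds))
```

For Product the search returns the single minimal key {name, category}, matching the slide. For Enrollment it returns two minimal keys, {student, course} and {student, room, time}.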

Inference Rules for FD’s
Splitting rule and Combining rule:
A1, A2, ..., An → B1, B2, ..., Bm
is equivalent to
A1, A2, ..., An → B1 AND A1, A2, ..., An → B2 AND ... AND A1, A2, ..., An → Bm

Inference Rules for FD’s (continued)
Trivial Rule: A1, A2, ..., An → Ai, where i = 1, 2, ..., n
Why?

Inference Rules for FD’s (continued)
Transitive Closure Rule:
If A1, A2, ..., An → B1, B2, ..., Bm
and B1, B2, ..., Bm → C1, C2, ..., Ck
then A1, A2, ..., An → C1, C2, ..., Ck
Why?

[Figure: table with column groups A1 ... An, B1 ... Bm, C1 ... Ck]

Enrollment(student, major, course, room, time)
student → major
major, course → room
course → time
What else can we infer? [in class]

Closure of a set of Attributes
Given a set of attributes {A1, ..., An} and a set of dependencies S.
Problem: find all attributes B such that any relation which satisfies S also satisfies A1, A2, ..., An → B
The closure of {A1, ..., An}, denoted {A1, ..., An}+, is the set of all such attributes B

Closure Algorithm
Start with X = {A1, ..., An}.
Until X doesn’t change do:
  if B1, B2, ..., Bm → C is in S, and B1, B2, ..., Bm are all in X, and C is not in X
  then add C to X

Example
The schema: R(A, B, C, D, E, F)
A, B → C
A, D → E
B → D
A, F → B
Closure of {A, B}: X = {A, B, }
Closure of {A, F}: X = {A, F, }
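
The closure algorithm translates almost line for line into code. This is a minimal sketch, assuming single-letter attribute names so that a string such as "AB" can stand for the set {A, B}; the function name `closure` and the pair encoding of FDs are illustrative choices, not notation from the lecture.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds.

    fds is a list of (lhs, c) pairs meaning lhs -> c, where lhs is a string
    of single-letter attributes and c is a single attribute.
    """
    x = set(attrs)
    changed = True
    while changed:  # "until X doesn't change"
        changed = False
        for lhs, c in fds:
            # all left-side attributes are in X, and C is not yet in X
            if set(lhs) <= x and c not in x:
                x.add(c)
                changed = True
    return x


# FDs of the example schema R(A, B, C, D, E, F)
fds = [("AB", "C"), ("AD", "E"), ("B", "D"), ("AF", "B")]

print(sorted(closure("AB", fds)))  # ['A', 'B', 'C', 'D', 'E']
print(sorted(closure("AF", fds)))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Starting from {A, B}, the loop adds C (from A, B → C), D (from B → D), and then E (from A, D → E); F is never reached. Starting from {A, F}, it first adds B and then proceeds as before, so {A, F} determines every attribute of R.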

Why Is the Algorithm Correct?
Show the following by induction: for every B in X, A1, A2, ..., An → B
Base case: initially X = {A1, A2, ..., An}, so the claim holds by the trivial rule
Induction step: suppose B1, ..., Bm are in X
  This implies A1, A2, ..., An → B1, B2, ..., Bm
  We also have B1, B2, ..., Bm → C
  By transitivity we have A1, A2, ..., An → C
This shows that the algorithm is sound; need to show it is complete

Relational Schema Design (or Logical Design) Main idea: Start with some relational schema Find out its FD’s Use them to design a better relational schema

Relational Schema Design
Recall set attributes (persons with several phones):
Name  SSN  PhoneNumber  City
Fred                    Seattle
Fred                    Seattle
Joe                     Westfield
Joe                     Westfield
[SSN and PhoneNumber values are missing from the transcript]
SSN → Name, City, but not SSN → PhoneNumber
Anomalies:
  Redundancy = repeated data
  Update anomalies = Fred moves to “Bellevue”
  Deletion anomalies = Fred drops all phone numbers: what is his city?

Relation Decomposition
Break the relation into two:
Name  SSN  City
Fred       Seattle
Joe        Westfield
SSN  PhoneNumber
[SSN and PhoneNumber values are missing from the transcript]

Relational Schema Design
Conceptual Model: [E/R diagram: Person buys Product; attribute labels name, price, name, ssn, date]
Relational Model: plus FD’s
Normalization: eliminates anomalies

Decompositions in General
Given R(A1, ..., An), create two relations R1(B1, ..., Bm) and R2(C1, ..., Ck) such that:
{B1, ..., Bm} ∪ {C1, ..., Ck} = {A1, ..., An}
and:
R1 = projection of R on B1, ..., Bm
R2 = projection of R on C1, ..., Ck

Incorrect Decomposition
Sometimes it is incorrect:
Name         Price  Category
Gizmo        19.99  Gadget
OneClick     24.99  Camera
DoubleClick  29.99  Camera
Decompose on: (Name, Category) and (Price, Category)

Incorrect Decomposition
Name         Category
Gizmo        Gadget
OneClick     Camera
DoubleClick  Camera
Price  Category
19.99  Gadget
24.99  Camera
29.99  Camera
When we put it back:
Name         Price  Category
Gizmo        19.99  Gadget
OneClick     24.99  Camera
OneClick     29.99  Camera
DoubleClick  24.99  Camera
DoubleClick  29.99  Camera
Cannot recover information
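
The loss of information can be reproduced by projecting the Product instance and joining the projections back together. The following is a small sketch with hypothetical helpers `project` and `natural_join` (a plain nested-loop join); prices are kept as strings only to avoid floating-point noise.

```python
def project(rows, attrs):
    """Project dict rows onto attrs; the result is a set, so duplicates vanish."""
    return {tuple(row[a] for a in attrs) for row in rows}


def natural_join(left, left_attrs, right, right_attrs):
    """Nested-loop natural join of two sets of tuples on their shared attributes."""
    common = [a for a in left_attrs if a in right_attrs]
    out_attrs = list(left_attrs) + [a for a in right_attrs if a not in common]
    result = set()
    for l in left:
        l_row = dict(zip(left_attrs, l))
        for r in right:
            r_row = dict(zip(right_attrs, r))
            if all(l_row[a] == r_row[a] for a in common):
                merged = {**l_row, **r_row}
                result.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, result


products = [
    {"Name": "Gizmo", "Price": "19.99", "Category": "Gadget"},
    {"Name": "OneClick", "Price": "24.99", "Category": "Camera"},
    {"Name": "DoubleClick", "Price": "29.99", "Category": "Camera"},
]

r1 = project(products, ["Name", "Category"])
r2 = project(products, ["Price", "Category"])
attrs, joined = natural_join(r1, ["Name", "Category"], r2, ["Price", "Category"])
original = project(products, attrs)

print(len(original), len(joined))  # 3 5
print(original < joined)           # True: the join contains spurious tuples
```

The join returns every original tuple plus two extra Camera rows, so the original instance can no longer be told apart from the spurious one.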

Normal Forms
First Normal Form = all attributes are atomic
Second Normal Form (2NF) = old and obsolete
Third Normal Form (3NF) = this lecture
Boyce Codd Normal Form (BCNF) = this lecture
Others...

Boyce-Codd Normal Form
A simple condition for removing anomalies from relations:
A relation R is in BCNF if: whenever there is a nontrivial dependency A1, ..., An → B in R, {A1, ..., An} is a key for R (“key” here includes superkeys)
In English: whenever a set of attributes of R determines one more attribute, it should determine all the attributes of R.
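
The BCNF condition can be tested with the attribute closure. The sketch below is illustrative: `closure` and `bcnf_violations` are hypothetical names, and FDs are encoded as (left side, right side) pairs of sets. It inspects only the FDs that are listed, which is enough to decide whether the relation they are stated for is in BCNF; projections of that relation need their own projected FDs.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the listed FDs that are nontrivial and whose left side is not a
    key (in the lecture's sense, i.e. superkeys count) of schema."""
    schema = set(schema)
    violations = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_key = closure(lhs, fds) >= schema
        if nontrivial and not is_key:
            violations.append((lhs, rhs))
    return violations


fds = [({"SSN"}, {"Name", "City"})]

# The person/phone relation from the anomalies slide: SSN is not a key.
print(bcnf_violations({"Name", "SSN", "PhoneNumber", "City"}, fds))

# Without PhoneNumber, SSN determines every attribute.
print(bcnf_violations({"Name", "SSN", "City"}, fds))  # []
```

An empty result means no listed FD breaks the condition; for the four-attribute relation the single FD SSN → Name, City is reported.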

Example
Name  SSN  PhoneNumber  City
Fred                    Seattle
Fred                    Seattle
Joe                     Westfield
Joe                     Westfield
[SSN and PhoneNumber values are missing from the transcript]
What are the dependencies? SSN → Name, City
What are the keys? {SSN, PhoneNumber}
Is it in BCNF?

Decompose it into BCNF
Name  SSN  City
Fred       Seattle
Joe        Westfield
SSN → Name, City
SSN  PhoneNumber
[SSN and PhoneNumber values are missing from the transcript]

Summary of BCNF Decomposition
Find a dependency that violates the BCNF condition: A1, A2, ..., An → B1, B2, ..., Bm
Decompose: R1 = the A’s and the B’s; R2 = the A’s and all other attributes
Continue until there are no BCNF violations left.
Heuristics: choose B1, B2, ..., Bm “as large as possible”
Exercise: Is there a 2-attribute relation that is not in BCNF?
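
The decomposition loop can be sketched as a short recursive function. This is one possible reading of the summary, not the lecture's reference implementation: the names `closure`, `find_violation` and `bcnf_decompose` are hypothetical, violations are searched by brute force over attribute subsets (exponential, so small schemas only), and the "as large as possible" heuristic is realised by taking the whole closure of the left side as R1. Restricting a closure computed from the original FDs to the attributes of a sub-relation is used here to stand in for projecting the FDs.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(rel, fds):
    """Return (X, X+ restricted to rel) for some X that determines more than
    itself but not all of rel, or None if rel is in BCNF."""
    rel = set(rel)
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            x_plus = closure(x, fds) & rel
            if x_plus != x and x_plus != rel:
                return x, x_plus
    return None


def bcnf_decompose(rel, fds):
    """Split rel on a violating X into R1 = X+ and R2 = X plus the rest."""
    rel = frozenset(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return [rel]
    x, x_plus = violation
    r1 = x_plus                # the A's and the B's
    r2 = x | (rel - x_plus)    # the A's and the other attributes
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)


fds = [({"SSN"}, {"Name", "City"})]
parts = bcnf_decompose({"Name", "SSN", "PhoneNumber", "City"}, fds)
print([sorted(p) for p in parts])
# [['City', 'Name', 'SSN'], ['PhoneNumber', 'SSN']]
```

Both sub-relations are strictly smaller than the relation being split, so the recursion terminates. The order in which violations are found can change the final result.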

Example Decomposition
Person(name, SSN, age, hairColor, phoneNumber)
SSN → name, age
age → hairColor
Decompose in BCNF (in class):
Step 1: find all keys
Step 2: now decompose

Other Example
R(A, B, C, D)
A → B, B → C
Key: {A, D}
Violations of BCNF: A → B; B → C; A → C; A → B, C
Pick A → B, C: split R into R1(A, B, C), R2(A, D)
What happens if we pick A → B first?

Correct Decompositions
A decomposition is lossless if we can recover:
R(A, B, C)
  Decompose: R1(A, B), R2(A, C)
  Recover: R’(A, B, C), which should be the same as R(A, B, C)
R’ is in general larger than R. Must ensure R’ = R

Correct Decompositions Given R(A,B,C) s.t. A→B, the decomposition into R1(A,B), R2(A,C) is lossless
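
This statement is an instance of a standard closure-based test for a two-way split: if the attributes shared by R1 and R2 functionally determine all of R1 or all of R2, the decomposition is lossless. A minimal sketch follows, with hypothetical names `closure` and `is_lossless_binary` and single-letter attributes so that strings can stand for attribute sets.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if the common attributes determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    common_plus = closure(common, fds)
    return set(r1) <= common_plus or set(r2) <= common_plus


# R(A, B, C) with A -> B, split into R1(A, B) and R2(A, C)
print(is_lossless_binary("AB", "AC", [("A", "B")]))  # True

# Same split with no FDs: the test gives no guarantee
print(is_lossless_binary("AB", "AC", []))            # False
```

In the first call the shared attribute A determines B, so A+ covers R1. The second call mirrors the Product example, where Category determines neither Name nor Price.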

3NF: A Problem with BCNF
Table: (Film, Actor, Character)
FD’s: Character → Film; Film, Actor → Character
So, there is a BCNF violation, and we decompose:
  (Actor, Character): no FDs
  (Film, Character): Character → Film

So What’s the Problem?
Film           Character
Austin Powers  Austin Powers
Austin Powers  Dr. Evil
Actor       Character
Mike Myers  Austin Powers
Mike Myers  Dr. Evil
No problem so far. All local FD’s are satisfied.
Let’s put all the data back into a single table again:
Film           Actor       Character
Austin Powers  Mike Myers  Austin Powers
Austin Powers  Mike Myers  Dr. Evil
Violates the dependency: Film, Actor → Character!
What happened here?

Solution: 3rd Normal Form (3NF)
A simple condition for removing anomalies from relations:
A relation R is in 3rd normal form if: whenever there is a nontrivial dependency A1, A2, ..., An → B for R, then {A1, A2, ..., An} is a key for R (“key” here includes superkeys), or B is part of a key.
3NF has a dependency-preserving decomposition
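
The Film/Actor/Character relation shows the difference between the two conditions. The sketch below is illustrative only: `closure`, `minimal_keys`, `is_bcnf` and `is_3nf` are hypothetical names, keys are found by brute force, "part of a key" is read as membership in some minimal key, and only the listed FDs are inspected.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_keys(schema, fds):
    """Brute-force search for attribute sets that determine the whole schema
    and have no proper subset that does."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue
            if closure(candidate, fds) >= schema:
                keys.append(candidate)
    return keys


def is_bcnf(schema, fds):
    """Every nontrivial listed FD must have a (super)key on the left."""
    schema = set(schema)
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) >= schema
               for lhs, rhs in fds)


def is_3nf(schema, fds):
    """Like BCNF, but a right-side attribute that is part of some minimal key
    is also allowed."""
    schema = set(schema)
    prime = set().union(*minimal_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue  # left side is a (super)key
        if any(b not in prime for b in set(rhs) - set(lhs)):
            return False
    return True


schema = {"Film", "Actor", "Character"}
fds = [({"Character"}, {"Film"}), ({"Film", "Actor"}, {"Character"})]

print(is_bcnf(schema, fds))  # False: Character is not a key
print(is_3nf(schema, fds))   # True: Film is part of the key {Film, Actor}
```

The minimal keys here are {Film, Actor} and {Actor, Character}, so every attribute is part of a key and the relation can stay undecomposed in 3NF, keeping Film, Actor → Character enforceable in one table.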

Decomposition into 3NF
Easy when we know BCNF decomposition (how?)
Not-so-easy: if we want a dependency-preserving decomposition… → In the book (3.5)