Functional Dependencies and Relational Schema Design.

Slides:



Advertisements
Similar presentations
ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala N ATIONAL I NSTITUTE OF T ECHNOLOGY A GARTALA Aug-Dec,2010 Normalization 2 CSE-503 :: D ATABASE.
Advertisements

Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Lecture 6: Design Constraints and Functional Dependencies January 21st, 2004.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
1 Lecture 10: Database Design XML Wednesday, October 20, 2004.
Functional Dependencies Definition: If two tuples agree on the attributes A, A, … A 12n then they must also agree on the attributes B, B, … B 12m Formally:
Nov 11, 2003Murali Mani Normalization B term 2004: lecture 7, 8, 9.
1 Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design April 16 & 18, 2008.
Lecture #3 Functional Dependencies Normalization Relational Algebra Thursday, October 12, 2000.
The principal problem that we encounter is redundancy, where a fact is repeated in more than one tuple. Most common cause: attempts to group into one relation.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
1 Lecture 06 The Relational Data Model. 2 Outline Relational Data Model Functional Dependencies FDs in ER Logical Schema Design Reading Chapter 8.
M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #7 Matthew P. Johnson Stern School of Business, NYU Spring,
Databases 6: Normalization
Boyce-Codd NF & Lossless Decomposition Professor Sin-Min Lee.
M.P. Johnson, DBMS, Stern/NYU, Sp20041 C : Database Management Systems Lecture #6 Matthew P. Johnson Stern School of Business, NYU Spring, 2004.
Chapter 14 Advanced Normalization Transparencies © Pearson Education Limited 1995, 2005.
Relational Schema Design (end) Relational Algebra Finally, querying the database!
Relation Decomposition A, A, … A 12n Given a relation R with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such.
Functional Dependencies and Relational Schema Design.
1 Lecture 7: End of Normal Forms Outerjoins, Schema Creation and Views Wednesday, January 28th, 2004.
Introduction to Schema Refinement
Normalization Goal = BCNF = Boyce-Codd Normal Form = all FD’s follow from the fact “key  everything.” Formally, R is in BCNF if for every nontrivial FD.
Copyright © Curt Hill Schema Refinement III 4 th NF and 5 th NF.
1 Schema Design & Refinement (aka Normalization).
Database Management COP4540, SCS, FIU Relation Normalization (Chapter 14)
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Normalization for Relational Databases.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Lecture 09: Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
1 Lecture 7: Normal Forms, Relational Algebra Monday, 10/15/2001.
Tallahassee, Florida, 2015 COP4710 Database Systems Relational Design Fall 2015.
1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization.
1 Lecture 10: Database Design Wednesday, January 26, 2005.
1 Multivalued Dependencies Fourth Normal Form Reasoning About FD’s + MVD’s.
Design Theory for RDB Normal Forms. Lu Chaojun, SJTU 2 Redundant because these info may be figured out by using FD s1  … What’s Bad Design? Redundancy.
Lecture 3: Conceptual Database Design and Schema Design April 12 th, 2004.
Lecture 13: Relational Decomposition and Relational Algebra February 5 th, 2003.
1 Lecture 10: Database Design and Relational Algebra Monday, October 20, 2003.
Multivalued Dependencies and 4th NF CIS 4301 Lecture Notes Lecture /21/2006.
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
COMP 430 Intro. to Database Systems Normal Forms Slides use ideas from Chris Ré.
Final Review Zaki Malik November 20, Basic Operators Covered.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
1 Lecture 9: Database Design Wednesday, January 25, 2006.
Normalization Or theoretical and common sense approaches to redesigning a database.
4NF & MULTIVALUED DEPENDENCY By Kristina Miguel. Review  Superkey – a set of attributes which will uniquely identify each tuple in a relation  Candidate.
More on Decompositions and Third Normal Form CIS 4301 Lecture Notes Lecture /16/2006.
Formal definition of a key A key is a set of attributes A 1,..., A n such that for any other attribute B: A 1,..., A n  B A minimal key is a set of attributes.
Lecture 11: Functional Dependencies
Advanced Normalization
COP4710 Database Systems Relational Design.
3.1 Functional Dependencies
Advanced Normalization
Problems in Designing Schema
Relational Design Theory
Cse 344 May 16th – Normalization.
Functional Dependencies and Relational Schema Design
Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design October 12 & 15, 2007.
Lecture 09: Functional Dependencies, Database Design
Relational Design Theory
Lecture 8: Database Design
Lecture 07: E/R Diagrams and Functional Dependencies
CSE544 Data Modeling, Conceptual Design
Terminology Product Attribute names Name Price Category Manufacturer
Relational Schema Design (end) Relational Algebra SQL (maybe)
Lecture 6: Functional Dependencies
Lecture 11: Functional Dependencies
Lecture 09: Functional Dependencies
Presentation transcript:

Functional Dependencies and Relational Schema Design

Decompositions in General A, A, … A 12n Let R be a relation with attributes Create two relations R1 and R2 with attributes B, B, … B 12m C, C, … C 12l Such that: B, B, … B 12m C, C, … C 12l  A, A, … A 12n And -- R1 is the projection of R on -- R2 is the projection of R on B, B, … B 12m C, C, … C 12l

Incorrect Decomposition NameCategory GizmoGadget OneClickCamera DoubleClickCamera PriceCategory 19.99Gadget 24.99Camera 29.99Camera NamePriceCategory Gizmo19.99Gadget OneClick24.99Camera OneClick29.99Camera DoubleClick24.99Camera DoubleClick29.99Camera When we put it back: Cannot recover information NamePriceCategory Gizmo19.99Gadget OneClick24.99Camera DoubleClick29.99Camera Decompose on : Name, Category and Price, Category

Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Third Normal Form (3NF) = this lecture Boyce Codd Normal Form (BCNF) = this lecture Others...

Boyce-Codd Normal Form A simple condition for removing anomalies from relations: A relation R is in BCNF if and only if: Whenever there is a nontrivial dependency for R, it is the case that { } a super-key for R. A, A, … A 12n B 12n In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R.

Example Name SSN Phone Number Fred (201) Fred (206) Joe (908) Joe (212) What are the dependencies? SSN Name What are the keys? Is it in BCNF?

Decompose it into BCNF SSN Name Fred Joe SSN Phone Number (201) (206) (908) (212) SSN Name

What About This? Name Price Category Gizmo $19.99 gadgets OneClick $24.99 camera Name Price, Category

BCNF Decomposition Find a dependency that violates the BCNF condition: A, A, … A 12n B, B, … B 12m A’s Others B’s R1R2 Heuristics: choose B, B, … B “as large as possible” 12m Decompose: Find a 2-attribute relation that is not in BCNF. Continue until there are no BCNF violations left.

Example Decomposition Name SSN Age EyeColor PhoneNumber Functional dependencies: SSN Name, Age, Eye Color What if we also had an attribute Draft-worthy, and the FD: Age Draft-worthy Person: BNCF: Person1(SSN, Name, Age, EyeColor), Person2(SSN, PhoneNumber)

Correct Decompositions A decomposition is lossless if we can recover: R(A,B,C) { R1(A,B), R2(A,C) } R’(A,B,C) = R(A,B,C) R’ is in general larger than R. Must ensure R’ = R Decompose Recover

Decomposition Based on BCNF is Necessarily Lossless R(A, B, C), A  C BCNF: R1(A,B), R2(A,C) Some tuple (a,b,c) in R (a,b’,c’) also in R decomposes into (a,b) in R1 (a,b’) also in R1 and (a,c) in R2 (a,c’) also in R2 Recover tuples in R: (a,b,c), (a,b,c’), (a,b’,c), (a,b’,c’) also in R ? Can (a,b,c’) be a bogus tuple? What about (a,b’,c’) ?

3NF: A Problem with BCNF Unit Company Product Unit Company Unit Product FD’s: Unit  Company; Company, Product  Unit So, there is a BCNF violation, and we decompose. Unit  Company No FDs

So What’s the Problem? Unit Company Product Unit CompanyUnit Product Galaga99 UW Galaga99 databases Bingo UW Bingo databases No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again: Galaga99 UW databases Bingo UW databases Violates the dependency: company, product -> unit!

Solution: 3rd Normal Form (3NF) A simple condition for removing anomalies from relations: A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key. A relation R is in 3rd normal form if : Whenever there is a nontrivial dependency A 1, A 2,..., A n  B for R, then {A 1, A 2,..., A n } a super-key for R, or B is part of a key.

Multi-valued Dependencies SSN Phone Number Course (206) CSE (206) CSE (206) CSE (206) CSE-341 The multi-valued dependencies are: SSN Phone Number SSN Course

Definition of Multi-valued Dependecy Given R(A1,…,An,B1,…,Bm,C1,…,Cp) the MVD A1,…,An B1,…,Bm holds if: for any values of A1,…,An the “set of values” of B1,…,Bm is “independent” of those of C1,…Cp Given R(A1,…,An,B1,…,Bm,C1,…,Cp) the MVD A1,…,An B1,…,Bm holds if: for any values of A1,…,An the “set of values” of B1,…,Bm is “independent” of those of C1,…Cp

Definition of MVDs Continued Equivalently: the decomposition into R1(A1,…,An,B1,…,Bm), R2(A1,…,An,C1,…,Cp) is lossless Note: an MVD A1,…,An B1,…,Bm Implicitly talks about “the other” attributes C1,…Cp

Rules for MVDs If A1,…An B1,…,Bm then A1,…,An B1,…,Bm Other rules in the book

4 th Normal Form (4NF) R is in 4NF if whenever: A1,…,An B1,…,Bm is a nontrivial MVD, then A1,…,An is a superkey R is in 4NF if whenever: A1,…,An B1,…,Bm is a nontrivial MVD, then A1,…,An is a superkey Same as BCNF with FDs replaced by MVDs

Confused by Normal Forms ? 3NF BCNF 4NF In practice: (1) 3NF is enough, (2) don’t overdo it !