CS 405G: Introduction to Database Systems 16. Functional Dependency.



9/8/2015 Chen Univ of Kentucky

Today’s Topic
• Functional Dependency
• Normalization
• Decomposition
• BCNF

Motivation
• How do we tell if a design is bad, e.g., Enroll(SID, Sname, CID, Cname, grade)?
• This design has redundancy, because the name of a student is recorded multiple times, once for each course the student is taking

SID   CID  Sname         Cname  grade
1234  10   John Smith    DB     A
1123  9    Ben Liu       NET    A
1234  9    John Smith    NET    B
1123  10   Ben Liu       DB     C
1023  10   Susan Sidhuk  DB     B

SID   Sname
1234  John Smith
1123  Ben Liu
1023  Susan Sidhuk

SID   CID  grade
1234  10   A
1123  9    A
1234  9    B
1123  10   C
1023  10   B

CID  Cname
9    NET
10   DB

Why is redundancy bad?
• It wastes disk space.
• What if we want to perform update operations on the relation?
  • INSERT a new course that no student has enrolled in yet.
  • UPDATE the name of “John Smith” to “John L. Smith”.
  • DELETE the last student enrolled in a certain course.

SID   CID  Sname         Cname  grade
1234  10   John Smith    DB     A
1123  9    Ben Liu       NET    A
1234  9    John Smith    NET    B
1123  10   Ben Liu       DB     C
1023  10   Susan Sidhuk  DB     B

Functional Dependency
• Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs
• FDs are constraints that are derived from the meaning and interrelationships of the data attributes
• FDs and keys are used to define normal forms for relations

Functional dependencies
• A functional dependency (FD) has the form X -> Y, where X and Y are sets of attributes in a relation R
• X -> Y means that whenever two tuples in R agree on all the attributes in X, they must also agree on all attributes in Y:
  t1[X] = t2[X] ⇒ t1[Y] = t2[Y]

If X -> Y holds, what can the second tuple contain?

X  Y  Z
a  b  c
a  ?  ?

X  Y  Z
a  b  c
a  b  ?      (Y must be “b”; Z could be anything, e.g. d)

X  Y  Z
a  b  c
a  b  d
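The definition can be turned into a small check on a relation instance. The following is only a sketch: the function name `satisfies` and the list-of-dicts representation of tuples are assumptions made for illustration, not part of the lecture.

```python
def satisfies(rows, lhs, rhs):
    """Return True if the instance `rows` does not violate the FD lhs -> rhs."""
    seen = {}  # maps each lhs value combination to the rhs values seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The two-tuple instance from the slide: (a, b, c) and (a, b, d).
rows = [
    {"X": "a", "Y": "b", "Z": "c"},
    {"X": "a", "Y": "b", "Z": "d"},
]
x_determines_y = satisfies(rows, ["X"], ["Y"])
x_determines_z = satisfies(rows, ["X"], ["Z"])
```

Here `x_determines_y` is True and `x_determines_z` is False. Note that a single instance can only refute an FD. Whether an FD really holds is a statement about every legal instance of the schema.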

FD examples
Address (street_address, city, state, zip)
• street_address, city, state -> zip
• zip -> city, state
• zip, state -> zip?
  • This is a trivial FD
  • Trivial FD: LHS ⊇ RHS
• zip -> state, zip?
  • This is non-trivial, but not completely non-trivial
  • Completely non-trivial FD: LHS ∩ RHS = ∅
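The three categories depend only on how LHS and RHS overlap, so they can be sketched in a few lines. The function name `classify_fd` is a hypothetical helper introduced for illustration.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs by the overlap of its two sides."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # LHS contains RHS
    if lhs & rhs:
        return "non-trivial"             # sides overlap, but RHS adds something
    return "completely non-trivial"      # LHS and RHS are disjoint


kind_1 = classify_fd({"zip", "state"}, {"zip"})
kind_2 = classify_fd({"zip"}, {"state", "zip"})
kind_3 = classify_fd({"zip"}, {"city", "state"})
```

The three results correspond to the three Address examples: trivial, non-trivial, and completely non-trivial.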

Functional Dependencies
• An FD is a property of the attributes in the schema R
• The constraint must hold on every relation instance r(R)
• If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1[K] = t2[K])

Keys redefined using FD’s
Let attr(R) be the set of all attributes of R. A set of attributes K is a (candidate) key for a relation R if
• K -> attr(R) - K, and
  • That is, K is a “super key”
• No proper subset of K satisfies the above condition
  • That is, K is minimal (fully functionally dependent)

Address (street_address, city, state, zip)
• {street_address, city, state, zip}: super key
• {street_address, city, zip}: super key
• {street_address, zip}: key
• {zip}: non-key

Reasoning with FDs
Given a relation R and a set of FDs F
• Does another FD follow from F?
• Are some of the FDs in F redundant (i.e., they follow from the others)?
• Is K a key of R?
• What are all the keys of R?
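For a small relation, the last question can be answered by brute force. The sketch below is an illustration only: the helper names `closure` and `candidate_keys` and the representation of FDs as (LHS set, RHS set) pairs are assumptions, and the search is exponential in the number of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Enumerate all minimal super keys, smallest first."""
    all_attrs = set(all_attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= all_attrs:
                keys.append(candidate)
    return keys


ADDRESS_ATTRS = {"street_address", "city", "state", "zip"}
ADDRESS_FDS = [
    ({"street_address", "city", "state"}, {"zip"}),
    ({"zip"}, {"city", "state"}),
]
keys = candidate_keys(ADDRESS_ATTRS, ADDRESS_FDS)
```

For the Address FDs this yields two keys: {street_address, zip} and {street_address, city, state}.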

Attribute closure
Given R, a set of FDs F that hold in R, and a set of attributes Z in R:
• The closure of Z (denoted Z+) with respect to F is the set of all attributes {A1, A2, …} functionally determined by Z (that is, Z -> A1 A2 …)
• Algorithm for computing the closure
  • Start with closure = Z
  • If X -> Y is in F and all attributes of X are already in the closure, then also add Y to the closure
  • Repeat until no more attributes can be added
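The algorithm above can be sketched directly as a fixpoint loop. The function name `attribute_closure` and the encoding of each FD as an (LHS set, RHS set) pair are assumptions chosen for illustration. The example reuses the Address FDs from the earlier slide.

```python
def attribute_closure(attrs, fds):
    """Return the closure of `attrs` with respect to the FDs in `fds`."""
    closure = set(attrs)          # start with closure = Z
    changed = True
    while changed:                # repeat until nothing more can be added
        changed = False
        for lhs, rhs in fds:
            # Apply an FD only when its whole left-hand side is in the closure.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


ADDRESS_FDS = [
    ({"street_address", "city", "state"}, {"zip"}),
    ({"zip"}, {"city", "state"}),
]

zip_closure = attribute_closure({"zip"}, ADDRESS_FDS)
street_zip_closure = attribute_closure({"street_address", "zip"}, ADDRESS_FDS)
```

Here {zip}+ is {zip, city, state}, while {street_address, zip}+ contains all four attributes of Address.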

A more complex example
WorkOn(EID, Ename, , PID, Pname, Hours)
• EID -> Ename, -> EID
• PID -> Pname
• EID, PID -> Hours
(Not a good design, and we will see why later)

Example of computing closure
F includes:
• EID -> Ename, -> EID
• PID -> Pname
• EID, PID -> Hours
{ PID, }+ = ?
• Starting from: closure = { PID, }
• -> EID: add EID; closure is now { PID, , EID }
• EID -> Ename, : add Ename, ; closure is now { PID, , EID, Ename }
• PID -> Pname: add Pname; closure is now { PID, Pname, , EID, Ename }
• EID, PID -> Hours: add Hours; closure is now all the attributes in WorkOn

Using attribute closure
Given a relation R and set of FDs F
• Does another FD X -> Y follow from F?
  • Compute X+ with respect to F
  • If Y ⊆ X+, then X -> Y follows from F
• Is K a super key of R?
  • Compute K+ with respect to F
  • If K+ contains all the attributes of R, K is a super key
• Is a super key K a key of R?
  • Test whether K’ = K - {a} is a super key of R for each a ∈ K
  • K is a key if and only if no such K’ is a super key
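All three tests reduce to closure computations, as the following sketch shows. The helper names (`closure`, `follows`, `is_superkey`, `is_key`) are hypothetical and the FD encoding as (LHS set, RHS set) pairs is an assumption. The data is the Address example.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def follows(lhs, rhs, fds):
    """X -> Y follows from F exactly when Y is a subset of X+."""
    return set(rhs) <= closure(lhs, fds)


def is_superkey(k, all_attrs, fds):
    """K is a super key when K+ contains every attribute of R."""
    return closure(k, fds) >= set(all_attrs)


def is_key(k, all_attrs, fds):
    """K is a key when it is a super key and no K - {a} is a super key."""
    k = set(k)
    if not is_superkey(k, all_attrs, fds):
        return False
    return not any(is_superkey(k - {a}, all_attrs, fds) for a in k)


ATTRS = {"street_address", "city", "state", "zip"}
FDS = [
    ({"street_address", "city", "state"}, {"zip"}),
    ({"zip"}, {"city", "state"}),
]
superkey_not_key = {"street_address", "city", "zip"}
minimal_key = {"street_address", "zip"}
```

Removing one attribute at a time is enough for the minimality test: any superset of a super key is itself a super key, so if some smaller subset of K were a super key, one of the sets K - {a} would be too.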

Rules of FDs
• Armstrong’s axioms
  • Reflexivity: If Y ⊆ X, then X -> Y
  • Augmentation: If X -> Y, then XZ -> YZ for any Z
  • Transitivity: If X -> Y and Y -> Z, then X -> Z
• Rules derived from axioms
  • Splitting: If X -> YZ, then X -> Y and X -> Z
  • Combining: If X -> Y and X -> Z, then X -> YZ
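As a worked illustration of how the two derived rules follow from the axioms, one possible derivation is sketched below. It uses only reflexivity, augmentation, and transitivity as stated on the slide, and the set identity XX = X.

```latex
\begin{align*}
&\textbf{Combining: given } X \to Y \text{ and } X \to Z \\
&\quad X \to XY   && \text{augment } X \to Y \text{ with } X \text{ (note } XX = X\text{)} \\
&\quad XY \to YZ  && \text{augment } X \to Z \text{ with } Y \\
&\quad X \to YZ   && \text{transitivity} \\[1ex]
&\textbf{Splitting: given } X \to YZ \\
&\quad YZ \to Y   && \text{reflexivity, since } Y \subseteq YZ \\
&\quad X \to Y    && \text{transitivity} \\
&\quad YZ \to Z   && \text{reflexivity, since } Z \subseteq YZ \\
&\quad X \to Z    && \text{transitivity}
\end{align*}
```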

Using rules of FD’s
Given a relation R and set of FDs F
• Does another FD X -> Y follow from F?
  • Use the rules to come up with a proof
• Example: F includes: EID -> Ename, ; -> EID; EID, PID -> Hours; PID -> Pname
  PID, -> Hours?
  • -> EID (given in F)
  • PID, -> PID, EID (augmentation)
  • PID, EID -> Hours (given in F)
  • PID, -> Hours (transitivity)

Example of redundancy
WorkOn (EID, Ename, , PID, hour)
• We say X -> Y is a partial dependency if there exists an X’ ⊂ X such that X’ -> Y
  • e.g. EID, -> Ename,
• Otherwise, X -> Y is a full dependency
  • e.g. EID, PID -> hours

[Table: sample WorkOn rows; columns include EID, PID, Ename, Pname, Hours]

Database Normalization
• Database normalization relates to the level of redundancy in a relational database’s structure.
• The key idea is to reduce the chance of having multiple different versions of the same data.
• Well-normalized databases have a schema that reflects the true dependencies between tracked quantities.
• Any increase in normalization generally involves splitting existing tables into multiple ones, which must be re-joined whenever a query needs attributes from more than one of them.

Normalization
• Normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
• A normal form is a certification that tells whether a relation schema is in a particular state.

Normal Forms
• Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF.
• 3NF is widely considered to be sufficient.
• Normalizing beyond 3NF can be tricky with current SQL technology as of 2005.
• Full normalization is considered a good exercise to help discover all potential internal database consistency problems.

First Normal Form (1NF)
• A normal form characterizes a relation (not an attribute, a key, etc.)
  • We can only say “this relation or table is in 1NF”
• A relation is in first normal form if the domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain.


2nd Normal Form
• An attribute A of a relation R is a nonprimary attribute if it is not part of any key in R; otherwise, A is a primary attribute.
• R is in (general) 2nd normal form if no nonprimary attribute A in R is partially functionally dependent on any key of R.

Redundancy Example
• Redundancy arises when a nonprimary attribute is only partially dependent on a key.
  • e.g. EID, PID -> Ename
• In this case, the attribute (Ename) should be moved, together with the part of the key it fully depends on (EID), into a new table.
• So, to check whether a table includes this kind of redundancy, try every nonprimary attribute and check whether it fully depends on every key.
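The check described above can be sketched with attribute closures. This is an illustration under stated assumptions: the helper names `closure` and `partial_dependencies` are hypothetical, and the relation is a simplified WorkOn(EID, Ename, PID, Pname, Hours) with only the FDs EID -> Ename, PID -> Pname, and EID, PID -> Hours.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, nonprimary, fds):
    """List (subset, attribute) pairs where a proper, non-empty subset
    of `key` already determines a nonprimary attribute."""
    key = sorted(key)
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            determined = closure(subset, fds) & set(nonprimary)
            for attr in sorted(determined):
                found.append((set(subset), attr))
    return found


# Simplified WorkOn relation (assumed for illustration).
FDS = [
    ({"EID"}, {"Ename"}),
    ({"PID"}, {"Pname"}),
    ({"EID", "PID"}, {"Hours"}),
]
violations = partial_dependencies(
    key={"EID", "PID"},
    nonprimary={"Ename", "Pname", "Hours"},
    fds=FDS,
)
```

In this simplified relation, Ename depends on EID alone and Pname on PID alone, so both are reported, while Hours needs the whole key. An empty result for every key would mean the relation satisfies the 2nd normal form condition.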


Second Normal Form (2NF)
• 2NF prescribes full functional dependency on the primary key.
• It most commonly applies to tables that have composite primary keys, where two or more attributes comprise the primary key.
• It requires that there are no non-trivial functional dependencies of a non-key attribute on a part (proper subset) of a candidate key.
• A table is said to be in 2NF if and only if it is in 1NF and every non-key attribute is irreducibly dependent on the primary key.

Decomposition
• Decomposition eliminates redundancy
• To get back to the original relation, use natural join.

[Table: original WorkOn rows; columns include EID, PID, Ename, Pname, Hours]

Decomposition:

EID   Ename
1234  John
1123  Ben
1023  Susan

[Table: remaining relation with columns EID, PID, Pname, Hours; marked with a foreign key]

Decomposition
• Decomposition may be applied recursively

[Table: relation with columns EID, PID, Pname, Hours]

is decomposed into:

PID  Pname
10   B2B platform
9    CRM

[Table: relation with columns EID, PID, Hours]

Unnecessary decomposition
• Fine: join returns the original relation
• Unnecessary: no redundancy is removed, and now EID is stored twice

[Tables: a relation keyed by EID split into two relations that each repeat EID]

Bad decomposition
• Association between PID and hours is lost
• Join returns more rows than the original relation

[Tables: (EID, PID, Hours) decomposed into (EID, PID) and (EID, Hours)]

Lossless join decomposition
• Decompose relation R into relations S and T
  • attrs(R) = attrs(S) ∪ attrs(T)
  • S = π_attrs(S)(R)
  • T = π_attrs(T)(R)
• The decomposition is a lossless join decomposition if, given known constraints such as FD’s, we can guarantee that R = S ⋈ T
• Any decomposition gives R ⊆ S ⋈ T (why?)
• A lossy decomposition is one with R ⊂ S ⋈ T
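The difference between a lossless and a lossy decomposition can be observed on a tiny instance. The sketch below is illustrative only: the helpers `relation`, `project`, and `natural_join` are hypothetical, and the rows (including the Hours values) are made up for the example rather than taken from the slides.

```python
def relation(rows):
    """Build a relation (a set of tuples) from a list of dicts."""
    return {frozenset(row.items()) for row in rows}


def project(rel, attrs):
    """Projection onto `attrs`; duplicate tuples collapse automatically."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(s, t):
    """Natural join: combine tuples that agree on all shared attributes."""
    joined = set()
    for x in s:
        for y in t:
            dx, dy = dict(x), dict(y)
            shared = dx.keys() & dy.keys()
            if all(dx[a] == dy[a] for a in shared):
                joined.add(frozenset({**dx, **dy}.items()))
    return joined


# Made-up instance in which EID -> Ename holds.
R = relation([
    {"EID": 1234, "Ename": "John", "PID": 10, "Hours": 20},
    {"EID": 1234, "Ename": "John", "PID": 9, "Hours": 5},
    {"EID": 1023, "Ename": "Susan", "PID": 10, "Hours": 40},
])

# Split along EID -> Ename: the join gives back exactly R.
lossless = natural_join(project(R, {"EID", "Ename"}),
                        project(R, {"EID", "PID", "Hours"}))

# Split that separates PID from Hours: the join invents extra rows.
lossy = natural_join(project(R, {"EID", "Ename", "PID"}),
                     project(R, {"EID", "Hours"}))
```

On this instance the first join returns the 3 original rows, while the second returns 5 rows: employee 1234 has two projects and two hour values, and the join pairs them in all four ways. The original rows are still present in the lossy result, but they can no longer be told apart from the spurious ones.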

Loss? But I got more rows
• “Loss” refers not to the loss of tuples, but to the loss of information
  • Or, the ability to distinguish different original tuples

[Tables: (EID, PID, Hours) decomposed into (EID, PID) and (EID, Hours)]

Questions about decomposition
• When to decompose
• How to come up with a correct decomposition (i.e., lossless join decomposition)