Spring 2007, Chapter 3.6-7: Normalization of Database Tables

Presentation transcript:

Normalization
Normalization is a process for assigning attributes to entities. It:
- Reduces data redundancies
- Helps eliminate data anomalies
- Produces controlled redundancies to link tables
Normalization stages:
- 1NF: First normal form
- 2NF: Second normal form
- 3NF: Third normal form
- 4NF: Fourth normal form

Example (E/R diagram, figure not reproduced): entity sets Stars, Movies, and Studios; relationships Stars-in and Owns; attribute labels title, length, year, address, name, address.

Problem Example: Movies
Update anomalies:
- If Harrison Ford's phone # changes, we must change it in each of his tuples.
- If the length value of Star Wars needs to be changed, we must change all occurrences.
Deletion anomalies:
- If we delete the Wayne's World entries from the database, we also lose all info about Dana Carvey and Mike Myers.

Conversion to 1NF
- Repeating groups must be eliminated
  - A proper primary key is developed; it uniquely identifies each tuple
- Dependencies can be identified; undesirable dependencies are still allowed:
  - Partial: based on part of a composite primary key
  - Transitive: one nonprime attribute depends on another nonprime attribute

Conversion to 1NF, Cont.
An attribute that is part (or all) of a key is known as a prime attribute or key attribute.

Example
Projects are assigned to employees.
- Each project has a number and a name.
- Each employee has a number, a name, and a job class.
- For each employee working on a project, we need to keep the number of hours spent on the project and the hourly rate.
Project Assignments table: (PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)
What is the key for this relation?

Data Organization: 1NF

Dependency Diagram (1NF)
PROJ_NUM, EMP_NUM --> PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS
PROJ_NUM --> PROJ_NAME
EMP_NUM --> EMP_NAME, JOB_CLASS, CHG_HOUR
JOB_CLASS --> CHG_HOUR
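
One way to answer "what is the key?" from the dependency diagram is to compute an attribute closure. The following is a minimal sketch, not part of the original slides; the function name `closure` and the encoding of FDs as (left side, right side) pairs of sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

ALL_ATTRS = {"PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
             "JOB_CLASS", "CHG_HOUR", "HOURS"}

# The four dependencies listed in the dependency diagram.
FDS = [
    ({"PROJ_NUM", "EMP_NUM"},
     {"PROJ_NAME", "EMP_NAME", "JOB_CLASS", "CHG_HOUR", "HOURS"}),
    ({"PROJ_NUM"}, {"PROJ_NAME"}),
    ({"EMP_NUM"}, {"EMP_NAME", "JOB_CLASS", "CHG_HOUR"}),
    ({"JOB_CLASS"}, {"CHG_HOUR"}),
]

# The composite (PROJ_NUM, EMP_NUM) determines every attribute, so it is a key.
print(closure({"PROJ_NUM", "EMP_NUM"}, FDS) == ALL_ATTRS)
# Neither component alone reaches all attributes.
print(sorted(closure({"EMP_NUM"}, FDS)))
print(sorted(closure({"PROJ_NUM"}, FDS)))
```

A set of attributes is a superkey exactly when its closure is the full attribute set, which is why the same helper is useful for the normal-form checks later in the chapter.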

1NF Summarized
- All key attributes defined
- Primary key identified
- No repeating groups in table
- All attributes dependent on primary key

2NF Summarized
A table is in 2NF if it is in 1NF and includes no partial dependencies.
Partial dependency: an attribute is functionally dependent on a portion of the primary key.
Examples:
- PROJ_NUM --> PROJ_NAME
- EMP_NUM --> EMP_NAME, JOB_CLASS, CHG_HOUR
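
The definition of a partial dependency can be turned into a small check. This is a sketch under stated assumptions: the helper name `partial_dependencies` and the tuple encoding of FDs are hypothetical, and the key is assumed to be (PROJ_NUM, EMP_NUM) as in the example.

```python
KEY = ("PROJ_NUM", "EMP_NUM")

# FDs from the Project Assignments example, as (left side, right side) tuples.
FDS = [
    (("PROJ_NUM", "EMP_NUM"), ("HOURS",)),
    (("PROJ_NUM",), ("PROJ_NAME",)),
    (("EMP_NUM",), ("EMP_NAME", "JOB_CLASS", "CHG_HOUR")),
    (("JOB_CLASS",), ("CHG_HOUR",)),
]

def partial_dependencies(key, fds):
    """List FDs whose left side is a proper subset of the key
    and whose right side contains at least one non-key attribute."""
    found = []
    for lhs, rhs in fds:
        non_key = set(rhs) - set(key)
        if set(lhs) < set(key) and non_key:
            found.append((sorted(lhs), sorted(non_key)))
    return found

for lhs, rhs in partial_dependencies(KEY, FDS):
    print(", ".join(lhs), "-->", ", ".join(rhs))
```

JOB_CLASS --> CHG_HOUR is not reported because JOB_CLASS is not part of the key; that dependency is transitive rather than partial.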

Conversion to 2NF
1. Start with the 1NF format.
2. Write each key component on a separate line.
3. Write the dependent attributes after each key component.
4. Write the original key on the last line.
5. Write any remaining attributes after the original key.
6. Each component is a new table.
Result:
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
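
The six steps can be sketched in a few lines of code. This is only an illustration of the procedure for a two-part key; the function name `to_2nf` and the tuple-based schema encoding are assumptions, and the input lists only the partial dependencies.

```python
ATTRS = ("PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
         "JOB_CLASS", "CHG_HOUR", "HOURS")
KEY = ("PROJ_NUM", "EMP_NUM")

# Partial dependencies: each left side is a single key component.
PARTIAL_FDS = [
    (("PROJ_NUM",), ("PROJ_NAME",)),
    (("EMP_NUM",), ("EMP_NAME", "JOB_CLASS", "CHG_HOUR")),
]

def to_2nf(key, attrs, fds):
    """Split a 1NF relation into one table per key component plus a table for the full key."""
    tables = []
    placed = set()
    for component in key:                          # step 2: one line per key component
        dependents = []
        for lhs, rhs in fds:                       # step 3: attributes that depend on it
            if lhs == (component,):
                dependents.extend(rhs)
        if dependents:
            tables.append((component,) + tuple(dependents))
            placed.update(dependents)
    # steps 4-5: original key followed by whatever has not been placed yet
    remaining = tuple(a for a in attrs if a not in key and a not in placed)
    tables.append(tuple(key) + remaining)
    return tables                                  # step 6: each entry is a new table

for table in to_2nf(KEY, ATTRS, PARTIAL_FDS):
    print(table)
```

The three tuples produced correspond to the PROJECT, EMPLOYEE, and ASSIGN tables listed in the result.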

2NF Conversion Results

2NF Summarized
- In 1NF
- Includes no partial dependencies
- Still possible to exhibit transitive dependency: attributes may be functionally dependent on non-key attributes

Conversion to 3NF
Decompose table(s) to eliminate transitive functional dependencies:
PROJECT (PROJ_NUM, PROJ_NAME)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)

3NF Summarized
- In 2NF
- Contains no transitive dependencies

Additional DB Enhancements
- Associate a manager with each project
- Job_Code is more concise and less error prone than job class
- Use a naming convention to name attributes, e.g. job_code, job_description, job_chg_hour

Boyce-Codd Normal Form (BCNF)
Formally, R is in BCNF if for every nontrivial FD for R, say X --> A, X is a superkey.
- "Nontrivial" = right-side attribute not in left side.
- Examples of trivial FDs: A --> A, AB --> A
Informally: the only arrows in the FD diagram are arrows out of superkeys.
- Note: a relation that is in 3NF can violate BCNF when it has more than one candidate key and those keys overlap.
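
The formal definition translates directly into a test: for each nontrivial FD, check whether the closure of its left side covers the whole relation. The sketch below is illustrative only; `closure` and `bcnf_violations` are hypothetical helper names, and the EMPLOYEE table from the 2NF step is used as input.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Return the nontrivial FDs X --> Y whose left side X is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) == set(attrs)
        if nontrivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad

EMPLOYEE = ("EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR")
FDS = [
    (("EMP_NUM",), ("EMP_NAME", "JOB_CLASS", "CHG_HOUR")),
    (("JOB_CLASS",), ("CHG_HOUR",)),
]

# JOB_CLASS is not a superkey of EMPLOYEE, so its FD is reported.
print(bcnf_violations(EMPLOYEE, FDS))
```

For a relation with a given FD set F, it is enough to test the FDs in F rather than every FD they imply.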

3NF Table Not in BCNF
What normal form?

Decomposition of Table Structure to Meet BCNF

Decomposition to Reach BCNF
Setting: relation R with FDs F. Suppose relation R has a BCNF violation X --> B, where X is not a superkey.

1. Compute X+.
   - It cannot be all attributes. Why?
2. Decompose R into X+ and (R - X+) ∪ X.
3. Find the FD's for the decomposed relations.
   - Project the FD's from F: calculate all consequents of F that involve only attributes from X+ or only attributes from (R - X+) ∪ X.

Decomposition to Reach BCNF
- Identify the violating FD, e.g. X1 X2 … Xn --> B1 B2 … Bm
- Add to the right-hand side of the FD as many attributes as are functionally determined by (X1 X2 … Xn)
- Decompose relation R into two relations:
  - One relation has all the attributes Xs and Bs
  - The second relation has the Xs plus any remaining attributes of R other than the Bs
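
The decomposition steps above can be sketched as a recursive procedure. This is a minimal illustration rather than a production algorithm: the names `closure`, `project_fds`, and `bcnf_decompose` are assumptions, and FD projection is done by brute force over all attribute subsets, which is only practical for small schemas. The input is a relation R(S, J, T) with FDs S J --> T and T --> J.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project_fds(attrs, fds):
    """FDs implied by fds that mention only attrs (brute force over subsets)."""
    projected = []
    for size in range(1, len(attrs)):
        for xs in combinations(sorted(attrs), size):
            rhs = (closure(xs, fds) & set(attrs)) - set(xs)
            if rhs:
                projected.append((frozenset(xs), frozenset(rhs)))
    return projected

def bcnf_decompose(attrs, fds):
    """Split on the first violating FD X --> B into X+ and (R - X+) U X, then recurse."""
    attrs = frozenset(attrs)
    for lhs, rhs in fds:
        x_plus = frozenset(closure(lhs, fds)) & attrs
        if not set(rhs) <= set(lhs) and x_plus != attrs:   # nontrivial, X not a superkey
            r1 = x_plus
            r2 = (attrs - x_plus) | frozenset(lhs)
            return (bcnf_decompose(r1, project_fds(r1, fds))
                    + bcnf_decompose(r2, project_fds(r2, fds)))
    return [attrs]

FDS = [(frozenset("SJ"), frozenset("T")), (frozenset("T"), frozenset("J"))]
print([sorted(r) for r in bcnf_decompose("SJT", FDS)])
```

Each split produces two strictly smaller relations, so the recursion terminates.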

BCNF--Example
Assume R(S, J, T), where S = student, J = subject, T = teacher. Student S is taught subject J by teacher T.
Constraints:
- For each subject, each student of that subject is taught by only one teacher
- Each teacher teaches only one subject (but each subject is taught by several teachers)

BCNF--Example
S      J        T
Smith  Math     Prof. White
Smith  Physics  Prof. Green
Jane   Math     Prof. White
Jane   Physics  Prof. Brown
Functional dependencies:
S, J --> T
T --> J

BCNF--Example
Candidate keys: {S, J} and {S, T}
The relation is in 3NF but not in BCNF.
Deletion anomaly: if we delete the info that Jane is studying Physics, we also lose the info that Prof. Brown teaches Physics.
Solution: two relations, R1{S, T} and R2{T, J}

3 Spring Decomposition Based on BCNF is Necessarily Correct Attributes A, B, C. FD: B  C Relations R1[A,B] R2[B, C] Tuples in R: (a, b, c) Tuples in R1: (a, b) Tuples in R2: (b, c) Natural join of R1 and R2 = (a, b, c)  original relation R can be reconstructed by forming the natural join of R1 and R2.

3 Spring Decomposition Based on BCNF is Necessarily Correct Attributes A, B, C. FD: B  C Relations R1[A,B] R2[B, C] Tuples in R: (a, b, c), (d, b, e) Tuples in R1: (a, b), (d, b) Tuples in R2: (b, c), (b, e) Tuples in the natural join of R1 and R2: (a,b,c), (a,b, e) (d, b, c), (d, b, e) Can (a,b,e), (d, b, c) be a bogus tuples?

Decomposition Based on BCNF is Necessarily Correct
Answer: No.
Because B --> C: if 2 tuples have the same B attribute then they must have the same C attribute.
So c = e, which gives (b, c) = (b, e), and therefore (a, b, e) = (a, b, c) and (d, b, c) = (d, b, e).
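
The argument can be replayed on concrete tuples. The sketch below is illustrative: the helper names `project` and `join_on_b` and the two sample instances are assumptions. One instance satisfies B --> C and the other does not, to show what the FD rules out.

```python
def project(rows, cols):
    """Project a set of tuples onto the given column positions."""
    return {tuple(row[c] for c in cols) for row in rows}

def join_on_b(r1, r2):
    """Natural join of R1(A, B) and R2(B, C) on the shared attribute B."""
    return {(a, b, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

# Instance that satisfies B --> C: both tuples with B = "b" carry C = "c".
r_ok = {("a", "b", "c"), ("d", "b", "c")}
# Instance that violates B --> C: B = "b" appears with both "c" and "e".
r_bad = {("a", "b", "c"), ("d", "b", "e")}

for r in (r_ok, r_bad):
    r1 = project(r, (0, 1))      # R1[A, B]
    r2 = project(r, (1, 2))      # R2[B, C]
    rejoined = join_on_b(r1, r2)
    print(rejoined == r, sorted(rejoined - r))
```

When the FD holds, the join returns exactly the original tuples; when it does not, the join adds tuples that were never in R.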

Theorem
Any two-attribute relation is in BCNF.

Decomposition Theorem
Suppose we decompose a relation R(X, Y, Z) into R1(X, Y) and R2(X, Z) and project R onto R1 and R2. Then join(R1, R2) is guaranteed to reconstruct R if and only if X --> Y or X --> Z.
Notice that whenever we decompose because of a BCNF violation, one of the above FDs holds.

3NF
One FD structure causes problems in BCNF:
- If you decompose, you can't recover all of the original FD's.
- If you don't decompose, you violate BCNF.
Abstractly: AB --> C and C --> B.
Example: street city --> zip, and zip --> city.
BCNF violation: C --> B has a left side that is not a superkey.
Based on the previous algorithm, decompose into BC and AC.
- But the FD AB --> C can no longer be checked in the new tables, because neither table contains A, B, and C together.

Example
A = street, B = city, C = zip.
zip --> city (BCNF violation)
street city --> zip
It is a bad idea to decompose the relation because you lose the ability to check the dependency street city --> zip.
Decompose: (city, zip) and (street, zip)

Example
zip --> city
street city --> zip
It is a bad idea to decompose the relation because you lose the ability to check the dependency street city --> zip.
Decompose: (city, zip) and (street, zip)
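
The loss of the dependency can be seen on a small instance. The sketch below is illustrative only: the helper `satisfies_fd` and the sample rows (one street, one city, two zip codes) are assumptions, not data from the slides. Each decomposed table looks fine on its own, yet the joined data violates street city --> zip.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if the rows (a list of dicts) satisfy the FD lhs --> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical contents of the two decomposed tables.
street_zip = [
    {"street": "545 Tech Sq.", "zip": "02138"},
    {"street": "545 Tech Sq.", "zip": "02139"},
]
city_zip = [
    {"city": "Cambridge", "zip": "02138"},
    {"city": "Cambridge", "zip": "02139"},
]

# Natural join on zip.
joined = [{**s, **c} for s in street_zip for c in city_zip if s["zip"] == c["zip"]]

print(satisfies_fd(city_zip, ["zip"], ["city"]))             # local FD is respected
print(satisfies_fd(joined, ["street", "city"], ["zip"]))     # lost FD is violated
```

No check that looks at only one of the two tables can catch this, which is the sense in which the dependency is not preserved.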

"Elegant" Workaround
Define the problem away. A relation R is in 3NF iff (if and only if) for every nontrivial FD X --> A, either:
1. X is a superkey, or
2. A is prime = member of at least one key.
Thus, if we just normalize to 3NF, the problem goes away.
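
The difference between the two definitions can be checked mechanically on the street-city-zip relation. This is a sketch under stated assumptions: `closure`, `candidate_keys`, and `violations` are hypothetical helpers, and candidate keys are found by brute force, which is fine for three attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for xs in combinations(sorted(attrs), size):
            minimal = not any(set(k) <= set(xs) for k in keys)
            if minimal and closure(xs, fds) == set(attrs):
                keys.append(xs)
    return keys

def violations(attrs, fds, form):
    """FDs X --> A that break the given normal form ("BCNF" or "3NF")."""
    prime = {a for key in candidate_keys(attrs, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(attrs)
        for a in sorted(set(rhs) - set(lhs)):
            excused = form == "3NF" and a in prime   # the extra 3NF escape clause
            if not is_superkey and not excused:
                bad.append((lhs, a))
    return bad

ATTRS = ("street", "city", "zip")
FDS = [(("street", "city"), ("zip",)), (("zip",), ("city",))]

print(candidate_keys(ATTRS, FDS))
print(violations(ATTRS, FDS, "BCNF"))
print(violations(ATTRS, FDS, "3NF"))
```

zip --> city fails the BCNF test because zip is not a superkey, but it passes the 3NF test because city belongs to the key {street, city}.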

What 3NF and BCNF Give You
There are two important properties of a decomposition:
1. Recovery: it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original.
2. Dependency preservation: it should be possible to check in the projected relations whether all the given FD's are satisfied.

3NF and BCNF, Continued
- We can get (1) with a BCNF decomposition.
- We can get both (1) and (2) with a 3NF decomposition.
- But we can't always get both (1) and (2) with a BCNF decomposition.
  - street-city-zip is an example.

Multi-valued Dependencies
Fourth Normal Form

Definition of MVD
A multivalued dependency is an assertion that two attributes (or sets of attributes) are independent of one another.
Formally: a multivalued dependency (MVD) on R, X ->-> Y, says that if two tuples of R agree on all the attributes of X, then their components in Y may be swapped, and the result will be two tuples that are also in the relation.

Example
Actors(name, addr, phones, cars) with MVD name ->-> phones.
If Actors has the two tuples:
name  addr  phones  cars
sue   a     p1      b1
sue   a     p2      b2
it must also have the same tuples with phones components swapped:
name  addr  phones  cars
sue   a     p2      b1
sue   a     p1      b2
Note: we must check this condition for all pairs of tuples that agree on name, not just one pair.
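
The swap condition can be checked by brute force over all pairs of tuples. The following is a minimal sketch; the function name `satisfies_mvd` and the positional tuple encoding are assumptions made for illustration, and the data reuses the values from the example.

```python
def satisfies_mvd(rows, attrs, x, y):
    """Check X ->-> Y on a set of tuples by trying every swap of Y components."""
    x_idx = [attrs.index(a) for a in x]
    y_idx = [attrs.index(a) for a in y]
    rows = set(rows)
    for t in rows:
        for u in rows:
            if all(t[i] == u[i] for i in x_idx):
                # Take the Y components from u and everything else from t.
                swapped = tuple(u[i] if i in y_idx else t[i]
                                for i in range(len(attrs)))
                if swapped not in rows:
                    return False
    return True

ATTRS = ("name", "addr", "phones", "cars")
two_tuples = {("sue", "a", "p1", "b1"), ("sue", "a", "p2", "b2")}
four_tuples = two_tuples | {("sue", "a", "p2", "b1"), ("sue", "a", "p1", "b2")}

print(satisfies_mvd(two_tuples, ATTRS, ["name"], ["phones"]))
print(satisfies_mvd(four_tuples, ATTRS, ["name"], ["phones"]))
```

Because the loop visits every ordered pair, it covers the note above: the condition is tested for all pairs of tuples that agree on name.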

Example 2
An actor may have more than one address.
Key? What normal form? Note the redundancies.
MVD: name ->-> street city
Read: name determines 1 or more (street, city) pairs, independent of all other attributes.

MVD Rules
1. Every FD is an MVD.
   - Because if X --> Y, then swapping Y's between tuples that agree on X doesn't create new tuples.
   - Example, in Actors: name ->-> addr.
   - Note: the opposite is not true, i.e. not every MVD is an FD.
2. Complementation: if X ->-> Y, then X ->-> Z, where Z is all attributes not in X or Y.
   - Example: since name ->-> phones holds in Actors, so does name ->-> addr cars.

Splitting Doesn't Hold
name ->-> street city holds, but name ->-> street does not hold.
- Name does not determine 1 or more streets independent of city.
Likewise, name ->-> city does not hold.

Example 2
An actor may have more than one address.
MVD: name ->-> street city
Read: name determines 1 or more (street, city) pairs, independent of all other attributes.
Also (complement MVD): name ->-> title year

Fourth Normal Form
The redundancy that comes from MVD's is not removable by putting the database schema in BCNF. There is a stronger normal form, called 4NF, that (intuitively) treats MVD's as FD's when it comes to decomposition, but not when determining keys of the relation.

4NF
Eliminate redundancy due to the multiplicative effect of MVD's.
Roughly: treat MVD's as FD's for decomposition, but not for finding keys.
Formally: R is in Fourth Normal Form if whenever MVD X ->-> Y is nontrivial (Y is not a subset of X, and X ∪ Y is not all attributes), then X is a superkey.
- Remember, X --> Y implies X ->-> Y, so 4NF is more stringent than BCNF.
Decompose R, using 4NF violation X ->-> Y, into XY and X ∪ (R - Y).

Example
Drinkers(name, addr, phones, cars)
FD: name --> addr
Nontrivial MVD's: name ->-> phones and name ->-> cars.
Only key: {name, phones, cars}
All three dependencies above violate 4NF. Why?
Successive decomposition yields 4NF relations:
D1(name, addr)
D2(name, phones)
D3(name, cars)
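
The effect of the decomposition can be illustrated on a small instance. This sketch assumes a hypothetical Drinkers instance with one person, two phones, and two cars; the helper name `project` is also an assumption. It shows that the three projections store fewer facts and that joining them on name gives back the original tuples.

```python
ATTRS = ("name", "addr", "phones", "cars")

# Hypothetical instance: every phone is paired with every car (multiplicative effect).
drinkers = {
    ("sue", "a", "p1", "b1"),
    ("sue", "a", "p1", "b2"),
    ("sue", "a", "p2", "b1"),
    ("sue", "a", "p2", "b2"),
}

def project(rows, attrs, cols):
    """Project a set of tuples onto the named columns, dropping duplicates."""
    idx = [attrs.index(c) for c in cols]
    return {tuple(row[i] for i in idx) for row in rows}

d1 = project(drinkers, ATTRS, ("name", "addr"))
d2 = project(drinkers, ATTRS, ("name", "phones"))
d3 = project(drinkers, ATTRS, ("name", "cars"))

# Natural join of D1, D2, and D3 on name.
rejoined = {
    (n, addr, phone, car)
    for (n, addr) in d1
    for (n2, phone) in d2
    for (n3, car) in d3
    if n == n2 == n3
}

print(len(drinkers), len(d1), len(d2), len(d3))
print(rejoined == drinkers)
```

The address is stored once instead of four times, and each phone and each car is stored once instead of twice.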

Example 2
name ->-> street city
Decompose into:
R1(name, street, city)
R2(name, title, year)