Presentation is loading. Please wait.

Presentation is loading. Please wait.

3 Spring 20071 Chapter 3.6-7 Normalization of Database Tables.

Similar presentations


Presentation on theme: "3 Spring 20071 Chapter 3.6-7 Normalization of Database Tables."— Presentation transcript:

1 3 Spring 20071 Chapter 3.6-7 Normalization of Database Tables

2 3 Spring 20072 Normalization Normalization is process for assigning attributes to entities u Reduces data redundancies u Helps eliminate data anomalies u Produces controlled redundancies to link tables Normalization stages u 1NF - First normal form u 2NF - Second normal form u 3NF - Third normal form u 4NF - Fourth normal form

3 3 Spring 20073 Example: StarsMovies Owns Studios Starts-in title length year address Name address

4 3 Spring 20074 Problem Example Movies Update anomalies: If Harrison Ford’s phone # changes, must change it in each of his tuples. If Length value of Star Wars needs to be changed, must change all occurrences Deletion anomalies: If we delete Wayne’s World entries from database, we also loose all info about Dana Carvey & Mike Meyers

5 3 Spring 20075 Conversion to 1NF Repeating groups must be eliminated u Proper primary key developed Uniquely identifies each tuple u Dependencies can be identified undesirable dependencies allowed –Partial »based on part of composite primary key –Transitive »one nonprime attribute depends on another nonprime attribute

6 3 Spring 20076 Conversion to 1NF Cont. An attribute that is at least part of a key is known as a prime attribute or key attribute or primary key.

7 3 Spring 20077 Example Projects assigned to employees Each project has a number and a name Each employee has a number, a name, a job class Each employee working on a project, need to keep number of hours spent on project, and hourly rate. Project Assignments Table : ( PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS) What’s the Key for this relation?

8 3 Spring 20078 Data Organization: 1NF

9 3 Spring 20079 Dependency Diagram (1NF) PROJ_NUM, EMP_NUM --> PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS PROJ_NUM --> PROJ_NAME EMP_NUM-->EMP_NAME, JOB_CLASS, CHG_HOURS JOB_CLASS --> CHG_HOUR

10 3 Spring 200710 1NF Summarized All key attributes defined Primary Key identified No repeating groups in table All attributes dependent on primary key

11 3 Spring 200711 2NF Summarized In 1NF, but Includes no partial dependencies Partial dependency: u An attribute is functionally dependent on a portion of the primary key. Example: u PROJ_NUM  PROJ_NAME u EMP_NUM-->EMP_NAME, JOB_CLASS, CHG_HOURS

12 3 Spring 200712 Conversion to 2NF 1.Start with 1NF format: 2.Write each key component on a separate line 3.Write dependent attributes after each key component 4.Write original key on last line 5.Write any remaining attributes after original key 6.Each component is new table PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR) ASSIGN (PROJ_NUM, EMP_NUM, HOURS)

13 3 Spring 200713 2NF Conversion Results

14 3 Spring 200714 2NF Summarized In 1NF Includes no partial dependencies Still possible to exhibit transitive dependency u Attributes may be functionally dependent on non-key attributes

15 3 Spring 200715 Conversion to 3NF decompose table(s) to eliminate transitive functional dependencies PROJECT (PROJ_NUM, PROJ_NAME) ASSIGN (PROJ_NUM, EMP_NUM, HOURS) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS) JOB (JOB_CLASS, CHG_HOUR)

16 3 Spring 200716 3NF Summarized In 2NF Contains no transitive dependencies

17 3 Spring 200717 Additional DB Enhancements Associate Manager with each project Job_Code is more concise and less error prone that job class Use naming convention to name attributes e.g. job_code, job_description, Job_chg_hour etc.

18 3 Spring 200718 Boyce-Codd Normal Form (BCNF) Formally, R is in BCNF if for every nontrivial FD for R, say X  A, X is a superkey. u “Nontrivial” = right-side attribute not in left side. u Trivial FDs examples A  A AB  A Informally: the only arrows in the FD diagram are arrows out of superkeys u Note: BCNF violation when relation has more than one superkey that overlap

19 3 Spring 200719 3NF Table Not in BCNF What normal form?

20 3 Spring 200720 Decomposition of Table Structure to Meet BCNF

21 3 Spring 200721 Decomposition to Reach BCNF Setting: relation R with FDs F. Suppose relation R has BCNF violation X  B and X not a superkey.

22 3 Spring 200722 1. Compute X +. u Cannot be all attributes – why? 2. Decompose R into X + and (R–X + )  X. 3. Find the FD’s for the decomposed relations. u Project the FD’s from F = calculate all consequents of F that involve only attributes from X + or only from (R  X + )  X. R X + X

23 3 Spring 200723 Identify the violating FD E.g. : X 1 X 2 …X n  B 1 B 2 …B m Add to the right-hand side of FD as many attributes as are functionally determined by (X 1 X 2 …X n ) Decompose relation R into two relations: u One relation has all attributes Xs & Bs u Second relation has the Xs plus any other remaining attributes from R other than Bs Decomposition to Reach BCNF

24 3 Spring 200724 BCNF--Example Assume R(S, J, T) S: Student J: subject T: Teacher Student S is taught subject J by teacher T. Constraints: u For each subject, each student of that subject is taught by only one teacher u Each teacher teaches only one subject (but each subject is taught by several teachers)

25 3 Spring 200725 BCNF--Example SJT SmithMathProf. White SmithPhysicsProf. Green JaneMathProf. White JanePhysicsProf. Brown Functional Dependencies: S, J  T T  J

26 3 Spring 200726 BCNF--Example Candidate keys: {S, J} and {S, T} 3NF but not in BCNF Update anomaly: if we delete the info that Jane is studying Physics we also loose the info that Prof. Brown teaches Physics Solution: two relations R1{S, T} R2{T, J} S J T

27 3 Spring 200727 Decomposition Based on BCNF is Necessarily Correct Attributes A, B, C. FD: B  C Relations R1[A,B] R2[B, C] Tuples in R: (a, b, c) Tuples in R1: (a, b) Tuples in R2: (b, c) Natural join of R1 and R2 = (a, b, c)  original relation R can be reconstructed by forming the natural join of R1 and R2.

28 3 Spring 200728 Decomposition Based on BCNF is Necessarily Correct Attributes A, B, C. FD: B  C Relations R1[A,B] R2[B, C] Tuples in R: (a, b, c), (d, b, e) Tuples in R1: (a, b), (d, b) Tuples in R2: (b, c), (b, e) Tuples in the natural join of R1 and R2: (a,b,c), (a,b, e) (d, b, c), (d, b, e) Can (a,b,e), (d, b, c) be a bogus tuples?

29 3 Spring 200729 Decomposition Based on BCNF is Necessarily Correct Answer: No Because: B  C i.e. if 2 tuples have same B attribute then they must have the same C attribute.  (b,c) = (b,e)  (a, b,e) = (a, b,c) and (d, b, c) = (d, b, e)

30 3 Spring 200730 Theorem Any two-attribute relation is in BCNF.

31 3 Spring 200731 Decomposition Theorem Suppose we decompose a relation R(X, Y, Z) into R1(X, Y) and R2(X,Z) and project the R onto R1 and R2. Then join(R1, R2) is guaranteed to reconstruct R if and only if X  Y or X  Z Notice that whenever we decompose because of a BNCF violation, one of the above FDs holds.

32 3 Spring 200732 3NF One FD structure causes problems in BCNF: If you decompose, you can’t recover all of the original FD’s. If you don’t decompose, you violate BCNF. Abstractly: AB  C and C  B. Example : street city  zip, and zip  city. BCNF violation: C  B has a left side that is not a superkey. Based on previous algorithm, decompose into BC and AC. u But the FD AB  C does not hold in new tables.

33 3 Spring 200733 Example A = street, B = city, C = zip. zip  city BCNF violation street city  zip It is a bad idea to decompose relation because you loose the ability to check the dependency: Decompose:

34 3 Spring 200734 Example zip  city street city  zip It is a bad idea to decompose relation because you loose the ability to check the dependency: Decompose:

35 3 Spring 200735 “Elegant” Workaround Define the problem away. A relation R is in 3NF iff (if and only if) for every nontrivial FD X  A, either: 1. X is a superkey, or 2. A is prime = member of at least one key. Thus, if we just normalize to the 3NF, the problem goes away.

36 3 Spring 200736 What 3NF and BCNF Give You There are two important properties of a decomposition: 1. Recovery : it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original. 2. Dependency Preservation : it should be possible to check in the projected relations whether all the given FD’s are satisfied.

37 3 Spring 200737 3NF and BCNF, Continued We can get (1) with a BCNF decomposition. We can get both (1) and (2) with a 3NF decomposition. But we can’t always get (1) and (2) with a BCNF decomposition. u street-city-zip is an example.

38 3 Spring 200738 Mutli-valued Dependencies Fourth Normal Form

39 3 Spring 200739 Definition of MVD A multivalued dependency is an assertion that two attributes (sets of attributes) are independent of one another. Formally: A multivalued dependency (MVD) on R, X ->->Y, says that if two tuples of R agree on all the attributes of X, then their components in Y may be swapped, and the result will be two tuples that are also in the relation.

40 3 Spring 200740 Example Actors(name, addr, phones, cars) with MVD Name  phones. nameaddrphonescars sueap1b1 sueap2b2 it must also have the same tuples with phones components swapped: nameaddrphonescars sueap2b1 sueap1b2 Note: we must check this condition for all pairs of tuples that agree on name, not just one pair.

41 3 Spring 200741 Example 2 An actor may have more than one address Key? What normal form? Note the redundancies MVD: name  street city read: name determines 1 or more street & city independent of all other attributes

42 3 Spring 200742 MVD Rules 1.Every FD is an MVD. u Because if X  Y, then swapping Y’s between tuples that agree on X doesn’t create new tuples.  Example, in Actors: name  addr. Note: the opposite is not true i.e. not every MVD is a FD 2.Complementation: if X  Y, then X  Z, where Z is all attributes not in X or Y.  Example: since name  phones holds in Actors, the name  addr cars.

43 3 Spring 200743 Splitting Doesn’t Hold name  street city holds, but name  street does not hold u Name does not determine 1 or more street independent of city. name  city does not hold

44 3 Spring 200744 Example 2 An actor may have more than one address MVD: name  street city read: name determines 1 or more street & city independent of all other attributes Also (complement MVD): name  title year

45 3 Spring 200745 Fourth Normal Form The redundancy that comes from MVD’s is not removable by putting the database schema in BCNF. There is a stronger normal form, called 4NF, that (intuitively) treats MVD’s as FD’s when it comes to decomposition, but not when determining keys of the relation.

46 3 Spring 200746 4NF Eliminate redundancy due to multiplicative effect of MVD’s. Roughly: treat MVD’s as FD's for decomposition, but not for finding keys. Formally: R is in Fourth Normal Form if whenever MVD X  Y is nontrivial (Y is not a subset of X, and X  Y is not all attributes), then X is a superkey. u Remember, X  Y implies X  Y, so 4NF is more stringent than BCNF. Decompose R, using 4NF violation X  Y, into XY and X  (R—Y). R Y X

47 3 Spring 200747 Example Drinkers(name, addr, phones, cars) FD: name  addr Nontrivial MVD’s: name  phones name  cars. Only key: { name, phones, cars } All three dependencies above violate 4NF. Why? Successive decomposition yields 4NF relations: D1(name, addr) D2(name, phones) D3(name, cars)

48 3 Spring 200748 Example 2  name  street city  Decompose into: R1(name, street, city) R2(name, title, year)


Download ppt "3 Spring 20071 Chapter 3.6-7 Normalization of Database Tables."

Similar presentations


Ads by Google