1
Database Normalization
2
Designing Good Schemas
We know how to create schemas, but how do we create good schemas? What does "good" mean? Schema quality measures: clear semantics of the attributes; minimal redundancy; minimal frequency of null values.
3
Functional Dependencies
A column Y of relational table R is functionally dependent on column X of R if and only if each value of X in R is associated with exactly one value of Y at any given time.
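The definition above can be tested against a concrete table. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the function name `fd_holds` and the sample rows are hypothetical and not taken from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for this X; any later mismatch breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample data (assumption, not from the slides).
supply = [
    {"S#": "s1", "P#": "p1", "QTY": 300},
    {"S#": "s1", "P#": "p2", "QTY": 200},
    {"S#": "s2", "P#": "p1", "QTY": 300},
]

print(fd_holds(supply, ["S#", "P#"], ["QTY"]))  # True
print(fd_holds(supply, ["S#"], ["QTY"]))        # False
```

A check like this can only refute a dependency on the data at hand. Whether an FD holds "at any given time" is a statement about the schema's meaning, not about one snapshot.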
4
Functional Dependencies
"Y is functionally dependent on X" is the same as "values of X identify values of Y". If X → Y then XZ → YZ. If X → Y and Y → Z then X → Z. X → Y means that Y depends on X, or X determines Y.
5
Examples: S# → Ename; {S#, P#} → Hours
If for each value of S# there is exactly one corresponding value for Sname, State, and City, then: S# → Sname, State, City
6
Example: {S#, P#} → QTY, for a relation with columns S#, P#, QTY
7
Redundancy Example Where’s the redundancy?
8
Redundancy Example
9
Example FDs: proper FDs, transitive FDs, partial-key FDs
10
Normal Forms: Each normal form is a set of conditions on a schema that guarantees certain properties (relating to redundancy and update anomalies). The two commonly used normal forms are third normal form (3NF) and Boyce-Codd normal form (BCNF).
11
Normalization
0NF → 1NF: remove multi-valued attributes
1NF → 2NF: remove partial dependencies
2NF → 3NF: remove transitive dependencies
3NF → BCNF: remove remaining FD anomalies
BCNF → 4NF: remove multivalued dependencies
4NF → 5NF: remove remaining anomalies
12
1NF: First normal form means no multi-valued attributes,
no composite attributes, and no nested relations. To remove them, we create a new table or new fields (e.g., telephone, visiting).
13
1NF Normalization: Proper translation of multi-valued attributes from the ER model will achieve 1NF. This is still not a good solution, since we have redundancy in Dnumber and Dmgr_ssn. (This will be handled by 2NF.)
14
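2NF: A relation is in second normal form if it is in 1NF and no non-key attribute depends on only part of a multi-attribute primary key. A relation violates 2NF when the primary key has multiple attributes and some non-key attribute depends on just part of it. Example attributes: S# P# Hours Cname pname Loc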
15
2NF Normalization Move the partial key and dependent attributes to a new relation.
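The move described above is a projection. Here is a minimal sketch under stated assumptions: the attribute names come from the 2NF slide, but the dependencies S# → Cname and P# → {pname, Loc}, the helper `project`, and all sample rows are assumptions made for illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (set semantics)."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: row[a] for a in attrs})
    return out

# Hypothetical rows. Assumed FDs: {S#, P#} -> Hours, S# -> Cname, P# -> {pname, Loc}.
works_on = [
    {"S#": 1, "P#": 10, "Hours": 5, "Cname": "Ann", "pname": "X", "Loc": "NY"},
    {"S#": 1, "P#": 20, "Hours": 7, "Cname": "Ann", "pname": "Y", "Loc": "LA"},
    {"S#": 2, "P#": 10, "Hours": 3, "Cname": "Bob", "pname": "X", "Loc": "NY"},
]

# Each partial key moves out with the attributes that depend on it.
hours = project(works_on, ["S#", "P#", "Hours"])
people = project(works_on, ["S#", "Cname"])
projects = project(works_on, ["P#", "pname", "Loc"])

print(len(hours), len(people), len(projects))  # 3 2 2
```

After the split, "Ann" and the project details are each stored once instead of once per assignment.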
16
Transitive Dependencies
X → Y is a transitive dependency (TD) if there exists Z ⊈ any key such that X → Z → Y. TDs can cause redundancy: if there are multiple values of X that determine the same value of Z, the value of Y for that value of Z is stored multiple times. 3NF normalization: move (Z, Y) to a new relation in which Z is the primary key.
17
3NF: A relation is in 3NF if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
18
3NF Normalization Create new relation to hold the attributes in the transitive FD. LHS of transitive FD becomes PK of new relation.
19
Transitive Dependency Example
Schema: (DEPT, COURSE, SECTION, ROOM, INSTR, I_OFFICE). I_OFFICE (instructor's office) is determined by the non-PK attribute INSTR. Sample values by column: DEPT COURSE SECTION: COMP 51 1 2 163 53; ROOM: WPC122 WPC219 WPC130; INSTR: DOHERTY CLIBURN BOWRING CARMAN; I_OFFICE: CSB109 CSB107 CSB108 CSB104
20
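3NF Decomposition: Foreign Keys
Original: (DEPT, COURSE, SECTION, ROOM, INSTR, I_OFFICE). Decomposition: (DEPT, COURSE, SECTION, ROOM, INSTR) and (INSTR, I_OFFICE), where INSTR in the first relation is a foreign key referencing the second.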
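One way to convince yourself that this decomposition loses nothing is to project the original table onto the two schemas and join them back on INSTR. The sketch below assumes dictionary rows; the helpers `project`, `natural_join`, and `as_set` and the placeholder values (R1, I1, O1, ...) are hypothetical and are not the sample data from the slide.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right):
    """Join rows that agree on every attribute the two sides share."""
    out = []
    for a_row in left:
        for b_row in right:
            common = set(a_row) & set(b_row)
            if all(a_row[c] == b_row[c] for c in common):
                out.append({**a_row, **b_row})
    return out

def as_set(rows):
    """Order-independent view of a relation, for comparison."""
    return {frozenset(row.items()) for row in rows}

# Placeholder rows (assumption): instructor I1 teaches two sections.
course_section = [
    {"DEPT": "COMP", "COURSE": 51, "SECTION": 1, "ROOM": "R1", "INSTR": "I1", "I_OFFICE": "O1"},
    {"DEPT": "COMP", "COURSE": 51, "SECTION": 2, "ROOM": "R2", "INSTR": "I1", "I_OFFICE": "O1"},
    {"DEPT": "COMP", "COURSE": 53, "SECTION": 1, "ROOM": "R1", "INSTR": "I2", "I_OFFICE": "O2"},
]

section_tbl = project(course_section, ["DEPT", "COURSE", "SECTION", "ROOM", "INSTR"])
instr_tbl = project(course_section, ["INSTR", "I_OFFICE"])

print(len(instr_tbl))  # 2
print(as_set(natural_join(section_tbl, instr_tbl)) == as_set(course_section))  # True
```

The office of I1 is now stored once, and the join on the foreign key INSTR rebuilds the original rows.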
21
Normalization Goal = BCNF = Boyce-Codd Normal Form =
all FD’s follow from the fact “key -> everything.” Formally, R is in BCNF if for every nontrivial FD for R, say X -> A, X is a superkey. “Nontrivial” = right-side attribute not in left side. Why? 1. Guarantees no redundancy due to FD’s. 2. Guarantees no update anomalies = one occurrence of a fact is updated, not all. 3. Guarantees no deletion anomalies = valid fact is lost when tuple is deleted. Arthur Keller – CS 180
22
Boyce-Codd Normal Form
Sample data for Course Section table. Because Prefix → Department, we know that (Prefix, Num, SecNum) could also be a primary key for this table. Columns: Department, Prefix, Num, SecNum, CourseName, Instructor. Values: Mathematics Math 101 1 Algebra I Al Jeebra 2 201 Calculus I Kal Kuelus Philosophy Phil Greek Thought Arie Stottle 202 Euro Thought Mike Angelo Marketing Mktg 410 Marketing Strategy Marc Ekking SpMkg 401 Advanced Sports Marketing Hulk Hogan
23
Example Students(name, addr, phones, CarsLiked)
A student’s phones are independent of the cars they like. Thus, each of a student’s phones appears with each of the cars they like in all combinations. This repetition is unlike redundancy due to FD’s, of which name->addr is the only one.
24
Example
Students(name, addr, CarsLiked, manf, favCar) FD’s: name -> addr favCar, CarsLiked -> manf. Only key is {name, CarsLiked}. In each FD, the left side is not a superkey. Any one of these FD’s shows Students is not in BCNF.
25
Boyce-Codd Normal Form
We say a relation R is in BCNF if whenever X ->A is a nontrivial FD that holds in R, X is a superkey. Remember: nontrivial means A is not a member of set X. Remember, a superkey is any superset of a key (not necessarily a proper superset).
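The superkey test in this definition is usually done with attribute closure: X is a superkey exactly when the closure of X contains every attribute of R. Below is a minimal sketch, assuming FDs are given as (left side, right side) pairs of attribute tuples over the schema; the names `closure` and `bcnf_violations` are hypothetical. The Students schema and its FDs are the ones used in the neighboring examples.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds by repeated application."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the given FDs that are nontrivial and whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(schema)
        if nontrivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad

students = ("name", "addr", "CarsLiked", "manf", "favCar")
fds = [
    (("name",), ("addr", "favCar")),
    (("CarsLiked",), ("manf",)),
]

print(closure(("name", "CarsLiked"), fds) == set(students))  # True
print(len(bcnf_violations(students, fds)))                   # 2
```

Checking only the listed FDs is enough to decide whether the original relation is in BCNF; it is not enough for relations produced by a later decomposition, where implied FDs also have to be considered.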
26
Example Students(name, addr, CarsLiked, manf, favCar)
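F = name -> addr, name -> favCar, CarsLiked -> manf. Pick BCNF violation name -> addr. Close the left side: {name}+ = {name, addr, favCar}. Decomposed relations: Students1(name, addr, favCar), Students2(name, CarsLiked, manf). Students1 is in BCNF. Students2 is not, because CarsLiked -> manf holds and CarsLiked is not a superkey of Students2; decomposing again gives Students3(CarsLiked, manf) and Students4(name, CarsLiked).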
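The "pick a violation, close the left side, split" procedure can be sketched as a short recursive function. This is an illustrative brute-force version, not necessarily the algorithm as presented in the lecture: it tries every attribute subset as a left side, so it is exponential and only suitable for small schemas. The names `closure` and `bcnf_decompose` are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Compute the closure of attrs under fds by repeated application."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_decompose(schema, fds):
    """Split schema on the first violation found, then recurse on both parts."""
    schema = frozenset(schema)
    for size in range(1, len(schema)):
        for lhs in combinations(sorted(schema), size):
            # Closure restricted to this relation's own attributes.
            closed = closure(lhs, fds) & schema
            # Violation: lhs determines something new but not everything.
            if closed != set(lhs) and closed != schema:
                r1 = closed
                r2 = (schema - closed) | set(lhs)
                return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)
    return [schema]

fds = [
    (("name",), ("addr",)),
    (("name",), ("favCar",)),
    (("CarsLiked",), ("manf",)),
]
result = bcnf_decompose(("name", "addr", "CarsLiked", "manf", "favCar"), fds)
for relation in result:
    print(sorted(relation))
```

The sketch may pick violations in a different order than the slide does, but for this schema it ends with the same three relations: {name, addr, favCar}, {CarsLiked, manf}, and {name, CarsLiked}.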
27
3NF and BCNF: 3rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation. An attribute is prime if it is a member of some key. X -> A violates 3NF if and only if X is not a superkey, and also A is not prime.
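The difference between the two conditions shows up when the right side of an FD is a prime attribute. The sketch below is an assumption-laden illustration: the address schema (street, city, zip) with street, city -> zip and zip -> city is a commonly used textbook case chosen here for illustration, not necessarily the "problem situation" the slide refers to, and the function names are hypothetical. Keys are found by brute force.

```python
from itertools import combinations

def closure(attrs, fds):
    """Compute the closure of attrs under fds by repeated application."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Enumerate minimal attribute sets whose closure covers the schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # a smaller key is contained in combo, so it is not minimal
            if closure(combo, fds) >= schema:
                keys.append(combo)
    return keys

def violates_3nf(lhs, rhs, schema, fds):
    """X -> A violates 3NF iff X is not a superkey and some new A is not prime."""
    prime = {a for key in candidate_keys(schema, fds) for a in key}
    is_superkey = closure(lhs, fds) >= set(schema)
    return (not is_superkey) and any(a not in prime and a not in lhs for a in rhs)

# Assumed example schema (not from the slides).
addr_schema = ("street", "city", "zip")
addr_fds = [(("street", "city"), ("zip",)), (("zip",), ("city",))]

students = ("name", "addr", "CarsLiked", "manf", "favCar")
student_fds = [(("name",), ("addr", "favCar")), (("CarsLiked",), ("manf",))]

print(candidate_keys(addr_schema, addr_fds))
print(violates_3nf(("zip",), ("city",), addr_schema, addr_fds))      # False
print(violates_3nf(("name",), ("addr",), students, student_fds))     # True
```

In the address schema, zip -> city breaks BCNF because zip is not a superkey, but it does not break 3NF because city belongs to the key {street, city}. In Students, addr is not part of any key, so name -> addr breaks both.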
28
Exercises: The following relation schema is not in third normal form (3NF). Is this an example of a transitive dependency or a partial key dependency? Give an equivalent schema that is in 3NF. SID FROM_CITY TO_CITY DISTANCE SHIPMENT WEIGHT
29
Exercises This relation has been proposed to track Pacific alumni: Alumni( SID, LastName, FirstName, Degree, YearAwarded, Phone). Pacific allows students to receive multiple degrees, possibly in different years. Identify all FDs. Give a new schema that is in third normal form.
30
Exercises Consider the following relation schema: Movie(title, genre, length, actor, sag_id, studio, studio_addr) Every movie has a unique title. A movie may have multiple actors. Each actor has a unique sag_id. An actor may appear in multiple movies. A movie has exactly one studio, but a studio may produce more than one movie. Each studio has exactly one address. Identify all functional dependencies. Normalize the schema to 3NF.