1
COP 6726: New Directions in Database Systems
Normalization
2
Outline
Goal: Measure the quality of the logical model
Normalization
  First Normal Form (1NF)
  Second Normal Form (2NF)
  Third Normal Form (3NF)
  Boyce-Codd Normal Form (BCNF)
3
What is the problem?
Columns: z-no, name, address, c-id, c-name, c-loc, grade
z1 James 5th st 1 Database CM128 A 2 Data mining CM25 3 Graph theory CM100 A+ 4 Optimization CM21 C z2 Paul 11th ave B 5 Network EE221 z3 Jin 9th st D
5
Normalization
Normalization is the process of decomposing large tables into smaller tables in order to eliminate redundant (duplicate) data and to avoid problems with inserting, deleting, or updating data.
Goals:
  Information preservation
  Minimum redundancy
6
Top-Down Approach
z-no name address c-id c-name c-loc grade
z-no name
7
Informal Guidelines
  The semantics of the attributes is clear.
  Reduce redundant information in tuples.
  Reduce NULL values in tuples.
  Disallow the possibility of generating spurious tuples.
9
Insertion Anomalies
Insert a new employee
  The employee works for the ‘Research’ department.
  The employee does not work for any department.
Insert a new department tuple
  A department without an employee
10
Deletion Anomalies
Delete a department: the ‘Research’ department
11
Update Anomalies
Update a department.
Update the manager of the ‘Research’ department
12
Functional Dependency (FD)
Given two tuples t1 and t2 and two attributes (or sets of attributes) X and Y: if t1[X] = t2[X], then t1[Y] = t2[Y].
Notation: X → Y
Ssn → {Ename, Bdate, Address, Dnumber}
Dnumber → {Dname, Dmgr_ssn}
{Ssn, Pnumber} → {Hours}
Ssn → Ename
Pnumber → {Pname, Plocation}
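The tuple-based definition can be tried directly on a relation instance. The following is a minimal sketch, assuming rows are stored as Python dicts; the function name fd_holds and the sample rows are hypothetical and not from the slides. A sample instance can refute a functional dependency, but it cannot establish one, because the dependency is a statement about every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, else returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample rows for illustration only.
works_on = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 10.0, "Ename": "Smith"},
    {"Ssn": "111", "Pnumber": 2, "Hours": 7.5, "Ename": "Smith"},
    {"Ssn": "222", "Pnumber": 1, "Hours": 20.0, "Ename": "Wong"},
]

print(fd_holds(works_on, ["Ssn"], ["Ename"]))             # True
print(fd_holds(works_on, ["Ssn"], ["Hours"]))             # False
print(fd_holds(works_on, ["Ssn", "Pnumber"], ["Hours"]))  # True
```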
13
Functional Dependency (FD)
If X is a candidate key of a relation R, then X → R.
If X → Y holds in R, we cannot conclude that Y → X.
Ssn → {Ename, Bdate, Address, Dnumber}
Dnumber → {Dname, Dmgr_ssn}
{Ssn, Pnumber} → {Hours}
Ssn → Ename
Pnumber → {Pname, Plocation}
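One common way to test whether X → R follows from a set of FDs is to compute the attribute closure of X. The slides do not present this algorithm, so the sketch below is an illustration under that assumption; the function name closure and the combined employee/department schema R are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD whenever its left side is already covered
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Assumed schema: employee attributes plus department attributes.
R = {"Ssn", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
fds = [
    ({"Ssn"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]

print(closure({"Ssn"}, fds) == R)          # True: Ssn -> R
print("Ssn" in closure({"Dnumber"}, fds))  # False: Dnumber does not determine Ssn
```

The second check illustrates the asymmetry: Ssn → Dnumber holds, but Dnumber → Ssn does not follow.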
14
Functional Dependency (FD)
A functional dependency is a property of the semantics (meaning) of the attributes.
Ssn → Ename: Ssn uniquely determines the employee name.
Pnumber → {Pname, Plocation}: a project's number uniquely determines the project name and location.
{Ssn, Pnumber} → Hours: Ssn and Pnumber together uniquely determine the number of hours.
15
Normalization
The normalization process takes a relation schema through a series of tests to certify whether it satisfies a certain normal form.
  Minimize redundancy.
  Minimize insertion, deletion, and update anomalies.
Normal Forms (NF): 1NF, 2NF, 3NF, Boyce-Codd Normal Form (BCNF)
16
Normalization
Non-additive join (or lossless join) property
  No spurious tuples are generated when the decomposed relations are joined back together.
Dependency preservation property
  Each functional dependency is preserved after decomposition.
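The spurious-tuple problem can be reproduced with a small experiment: project a relation onto two schemas, join the projections, and compare with the original. This is a sketch under stated assumptions; the helper names (project, natural_join, as_set) and the two sample rows are hypothetical, and Pnumber → Plocation is assumed to hold.

```python
def project(rows, attrs):
    """Project rows (dicts) onto attrs, eliminating duplicates."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right):
    """Combine rows that agree on every attribute the two sides share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent view of a relation, for comparisons."""
    return {frozenset(row.items()) for row in rows}

# Hypothetical instance for illustration only.
r = [
    {"Ssn": "111", "Pnumber": 1, "Plocation": "Houston"},
    {"Ssn": "222", "Pnumber": 2, "Plocation": "Houston"},
]

# Lossy: the shared attribute Plocation is not a key of either piece.
bad = natural_join(project(r, ["Ssn", "Plocation"]),
                   project(r, ["Pnumber", "Plocation"]))
# Lossless: the shared attribute Pnumber is a key of (Pnumber, Plocation).
good = natural_join(project(r, ["Ssn", "Pnumber"]),
                    project(r, ["Pnumber", "Plocation"]))

spurious = as_set(bad) - as_set(r)
print(len(spurious))              # 2
print(as_set(good) == as_set(r))  # True
```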
17
Basic Concept
Super key
Candidate key
Primary key
A prime attribute is a member of some candidate key.
A non-prime attribute is an attribute that is not a member of any candidate key.
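Candidate keys and prime attributes can be found by brute force for small schemas: a candidate key is a minimal attribute set whose closure covers the whole relation. The sketch below assumes the project-assignment schema with attributes Ssn, Pnumber, Hours, Ename, Pname, Plocation; the function names are hypothetical, and the search is exponential in the number of attributes, so it is suitable only for classroom-sized examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Enumerate minimal superkeys, smallest subsets first."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys

attrs = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

keys = candidate_keys(attrs, fds)
prime = set().union(*keys)
non_prime = attrs - prime
print(keys == [{"Ssn", "Pnumber"}])                            # True
print(non_prime == {"Hours", "Ename", "Pname", "Plocation"})  # True
```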
18
Basic Concept
Dependency Preservation Property
  Every functional dependency should be preserved after decomposition.
Non-additive Join (or Lossless Join) Property
19
First Normal Form
Every attribute holds a single atomic value.
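One way to bring a relation into 1NF is to replace a multivalued attribute with one row per value. The sketch below is illustrative only; the department rows, the attribute names Dlocations and Dlocation, and the function to_1nf are assumptions rather than content from the slides.

```python
def to_1nf(rows, multivalued, atomic_name):
    """Emit one row per value of the multivalued attribute."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != multivalued}
        for value in row[multivalued]:
            flat.append({**rest, atomic_name: value})
    return flat

# Hypothetical non-1NF relation: Dlocations holds a list of values.
departments = [
    {"Dnumber": 5, "Dname": "Research",
     "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dnumber": 4, "Dname": "Administration", "Dlocations": ["Stafford"]},
]

flat = to_1nf(departments, "Dlocations", "Dlocation")
for row in flat:
    print(row)
print(len(flat))  # 4
```

After flattening, Dnumber alone no longer identifies a row; the pair (Dnumber, Dlocation) does, and Dname is repeated, which is the kind of redundancy the later normal forms address.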
20
First Normal Form
21
Second Normal Form
Every non-prime attribute is fully functionally dependent on every key.
FD1: {Ssn, Pnumber} → Hours
FD2: Ssn → Ename
FD3: Pnumber → {Pname, Plocation}
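A 2NF violation is a non-prime attribute that depends on a proper subset of a key. The sketch below searches for such partial dependencies using FD1 to FD3, assuming {Ssn, Pnumber} is the only candidate key; the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(attrs, fds, keys):
    """List (key subset, non-prime attributes it determines) pairs."""
    prime = set().union(*keys)
    non_prime = set(attrs) - prime
    found = []
    for key in keys:
        # Proper, non-empty subsets of the key only
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                dependent = closure(subset, fds) & non_prime
                if dependent:
                    found.append((set(subset), dependent))
    return found

attrs = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),        # FD1
    ({"Ssn"}, {"Ename"}),                   # FD2
    ({"Pnumber"}, {"Pname", "Plocation"}),  # FD3
]
keys = [{"Ssn", "Pnumber"}]  # assumed candidate key

violations = partial_dependencies(attrs, fds, keys)
for subset, dependent in violations:
    print(sorted(subset), "->", sorted(dependent))
```

With these inputs the search reports that Pnumber alone determines Pname and Plocation, and that Ssn alone determines Ename, so FD2 and FD3 are the partial dependencies while Hours depends on the full key.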
23
Third Normal Form
For every nontrivial FD X → A that holds in the relation, either
  (a) X is a superkey, or
  (b) A is a prime attribute.
Transitive dependency
FD1: Ssn → {Ename, Bdate, Address, Dnumber}
FD2: Dnumber → {Dname, Dmgr_ssn}
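The two-part 3NF condition translates into a short check over the listed FDs. This sketch assumes the combined employee/department schema with Ssn as its only candidate key and examines only the FDs supplied; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def violations_3nf(attrs, fds, keys):
    """Return (X, A) pairs where X is not a superkey and A is not prime."""
    prime = set().union(*keys)
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attrs):
            continue  # condition (a): X is a superkey
        for a in sorted(set(rhs) - set(lhs)):  # skip trivial parts
            if a not in prime:                 # condition (b) fails
                bad.append((set(lhs), a))
    return bad

attrs = {"Ssn", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
fds = [
    ({"Ssn"}, {"Ename", "Bdate", "Address", "Dnumber"}),  # FD1
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),                 # FD2
]
keys = [{"Ssn"}]  # assumed candidate key

bad = violations_3nf(attrs, fds, keys)
for lhs, a in bad:
    print(sorted(lhs), "->", a)
```

Both reported violations come from FD2: Dname and Dmgr_ssn depend on Ssn only transitively, through Dnumber.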
25
1NF, 2NF, 3NF
26
Example
FD1: Property_id# → {County_name, Lot#, Area, Price, Tax_rate}
FD2: {County_name, Lot#} → {Property_id#, Area, Price, Tax_rate}
FD3: County_name → Tax_rate
FD4: Area → Price
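The four FDs of this example can be classified mechanically. The sketch below assumes the candidate keys are {Property_id#} and {County_name, Lot#}, which follows from FD1 and FD2; the function classify and its labels are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def classify(lhs, rhs, attrs, fds, keys):
    """Label one FD as acceptable, a 2NF violation, or a 3NF violation."""
    prime = set().union(*keys)
    if closure(lhs, fds) == set(attrs):
        return "ok: left side is a superkey"
    if set(rhs) - set(lhs) <= prime:
        return "ok: right side is prime"
    if any(set(lhs) < key for key in keys):
        return "violates 2NF: partial dependency"
    return "violates 3NF: transitive dependency"

attrs = {"Property_id#", "County_name", "Lot#", "Area", "Price", "Tax_rate"}
fds = [
    ({"Property_id#"}, {"County_name", "Lot#", "Area", "Price", "Tax_rate"}),
    ({"County_name", "Lot#"}, {"Property_id#", "Area", "Price", "Tax_rate"}),
    ({"County_name"}, {"Tax_rate"}),
    ({"Area"}, {"Price"}),
]
keys = [{"Property_id#"}, {"County_name", "Lot#"}]  # assumed candidate keys

labels = [classify(lhs, rhs, attrs, fds, keys) for lhs, rhs in fds]
for number, label in enumerate(labels, start=1):
    print(f"FD{number}: {label}")
```

FD3 is flagged as a partial dependency because County_name is a proper subset of the key {County_name, Lot#}, and FD4 as a transitive dependency because Area is neither a superkey nor part of any key.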
29
Normalization (Top Down Approach)
30
BCNF (Boyce Codd Normal Form)
Given FD X → A, X is always a superkey.
FD1: Property_id# → {County_name, Lot#, Area}
FD2: {County_name, Lot#} → {Property_id#, Area}
FD5: Area → County_name
Compare with Third Normal Form: given FD X → A, (a) X is a superkey or (b) A is a prime attribute.
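The difference between the two definitions shows up when both checks are run on the same relation. This sketch assumes a relation over Property_id#, County_name, Lot#, Area with candidate keys {Property_id#} and {County_name, Lot#}; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def violations_bcnf(attrs, fds):
    """FDs whose left side is not a superkey."""
    return [(set(lhs), set(rhs)) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)          # ignore trivial FDs
            and closure(lhs, fds) != set(attrs)]

def violations_3nf(attrs, fds, keys):
    """BCNF violations that are not excused by a prime right side."""
    prime = set().union(*keys)
    return [(lhs, rhs) for lhs, rhs in violations_bcnf(attrs, fds)
            if not (rhs - lhs) <= prime]

attrs = {"Property_id#", "County_name", "Lot#", "Area"}
fds = [
    ({"Property_id#"}, {"County_name", "Lot#", "Area"}),  # FD1
    ({"County_name", "Lot#"}, {"Property_id#", "Area"}),  # FD2
    ({"Area"}, {"County_name"}),                          # FD5
]
keys = [{"Property_id#"}, {"County_name", "Lot#"}]  # assumed candidate keys

print(violations_bcnf(attrs, fds))       # [({'Area'}, {'County_name'})]
print(violations_3nf(attrs, fds, keys))  # []
```

FD5 breaks BCNF because Area is not a superkey, yet the relation still satisfies 3NF because County_name is a prime attribute.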
32
Take Home Message
Functional Dependency (FD)
Non-additive join property
Dependency preservation property
Normalization
  First Normal Form (1NF)
  Second Normal Form (2NF)
  Third Normal Form (3NF)
  Boyce-Codd Normal Form (BCNF)