COP 6726: New Directions in Database Systems

1 COP 6726: New Directions in Database Systems
Normalization

2 Outline
Goal: Measure the quality of the logical model
Normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

3 What is the problem?
Columns: z-no, name, address, c-id, c-name, c-loc, grade
z1 James 5th st 1 Database CM128 A 2 Data mining CM25 3 Graph theory CM100 A+ 4 Optimization CM21 C z2 Paul 11th ave B 5 Network EE221 z3 Jin 9th st D

4 What is the problem?
Columns: z-no, name, address, c-id, c-name, c-loc, grade
z1 James 5th st 1 Database CM128 A 2 Data mining CM25 3 Graph theory CM100 A+ 4 Optimization CM21 C z2 Paul 11th ave B 5 Network EE221 z3 Jin 9th st D

5 Normalization
Normalization is the process of decomposing large tables into smaller tables in order to eliminate redundant and duplicate data and to avoid problems with inserting, deleting, or updating data.
Goals: information preservation and minimum redundancy.

6 Top Down Approach z-no name address c-id c-name c-loc grade z-no name

7 Informal Guideline
Semantics of attributes is clear.
Reducing redundant information in tuples.
Reducing NULL values in tuples.
Disallowing the possibility of generating spurious tuples.

8

9 Insertion Anomalies
Insert a new employee:
Employee works for the ‘Research’ department.
Employee does not work for any department.
Insert a new department tuple:
Department without an employee.

10 Deletion Anomalies
Delete a department: the ‘Research’ department.

11 Update Anomalies
Update a department: update the manager in the ‘Research’ department.
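
The update and deletion anomalies on the preceding slides can be reproduced on a tiny example. This is a minimal sketch, assuming a single combined EMP_DEPT table held as a list of dictionaries; the rows, names, and Ssn values are invented for illustration and do not come from the slides.

```python
# Hypothetical rows of one EMP_DEPT table that mixes employee and department facts.
emp_dept = [
    {"Ssn": "111", "Ename": "Ann", "Dnumber": 5, "Dname": "Research", "Dmgr_ssn": "900"},
    {"Ssn": "222", "Ename": "Bob", "Dnumber": 5, "Dname": "Research", "Dmgr_ssn": "900"},
]


def managers_of(rows, dname):
    """Return the set of manager Ssn values recorded for one department."""
    return {r["Dmgr_ssn"] for r in rows if r["Dname"] == dname}


# Update anomaly: changing the manager in only one row leaves two conflicting answers.
emp_dept[0]["Dmgr_ssn"] = "901"
conflicting = managers_of(emp_dept, "Research")

# Deletion anomaly: removing every Research employee also removes the department itself.
remaining = [r for r in emp_dept if r["Dname"] != "Research"]
department_still_known = any(r["Dname"] == "Research" for r in remaining)
```

Storing the department facts once, in their own table keyed by Dnumber, would make both problems impossible by construction.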

12 Functional Dependency (FD)
Given any two tuples t1 and t2 and two attributes or sets of attributes (columns) X and Y: if t1[X] = t2[X], then t1[Y] = t2[Y].
Notation: X → Y
Ssn → {Ename, Bdate, Address, Dnumber}
Dnumber → {Dname, Dmgr_ssn}
{Ssn, Pnumber} → Hours
Ssn → Ename
Pnumber → {Pname, Plocation}
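
The definition above translates directly into a check over a relation instance. The following is a small sketch, assuming rows are stored as dictionaries; the function name `fd_holds` and the sample rows are hypothetical. Note that an instance can only refute an FD: a passing check does not prove that the dependency holds for all legal data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data for a works-on style relation.
works_on = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 10, "Ename": "Ann"},
    {"Ssn": "111", "Pnumber": 2, "Hours": 5, "Ename": "Ann"},
    {"Ssn": "222", "Pnumber": 1, "Hours": 8, "Ename": "Bob"},
]

ssn_to_ename = fd_holds(works_on, ["Ssn"], ["Ename"])
ssn_to_hours = fd_holds(works_on, ["Ssn"], ["Hours"])
pair_to_hours = fd_holds(works_on, ["Ssn", "Pnumber"], ["Hours"])
```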

13 Functional Dependency (FD)
If X is a candidate key in a relation R, then X → R.
If X → Y in R, we cannot conclude that Y → X.
Ssn → {Ename, Bdate, Address, Dnumber}
Dnumber → {Dname, Dmgr_ssn}
{Ssn, Pnumber} → Hours
Ssn → Ename
Pnumber → {Pname, Plocation}
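
Both statements above can be checked mechanically with the attribute closure X+, the set of all attributes that X determines. The sketch below uses the standard fixed-point closure algorithm, which is not shown on the slides; the schema name `EMP_DEPT` and the helper `closure` are labels chosen here for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


EMP_DEPT = {"Ssn", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
FDS = [
    ({"Ssn"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]

# Ssn determines every attribute, so it behaves as a key of EMP_DEPT.
ssn_is_key = closure({"Ssn"}, FDS) == EMP_DEPT
# Ssn -> Dnumber holds, but Dnumber does not determine Ssn.
dnumber_reaches_ssn = "Ssn" in closure({"Dnumber"}, FDS)
```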

14 Functional Dependency (FD)
A functional dependency is a property of the semantics or meaning of the attributes.
Ssn → Ename : Ssn uniquely determines the employee name.
Pnumber → {Pname, Plocation} : The project number uniquely determines the project name and location.
{Ssn, Pnumber} → Hours : Ssn and Pnumber together uniquely determine the number of hours.

15 Normalization
The normalization process takes a relation schema through a series of tests to certify whether it satisfies a certain normal form.
Minimize redundancy.
Minimize the insertion, deletion, and update anomalies.
Normal Form (NF): 1NF, 2NF, 3NF, Boyce-Codd Normal Form (BCNF)

16 Normalization
Non-additive join (or lossless join) property: the spurious tuple generation problem does not occur after decomposition.
Dependency preservation property: each functional dependency is preserved after decomposition.
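
For a decomposition into exactly two relations, the non-additive join property can be tested with the standard closure-based condition: the shared attributes must determine all of one side. This is a sketch of that textbook test, not something stated on the slides; the "bad" decomposition and all Python names are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Two-relation test: (R1 ∩ R2) must determine all of R1 or all of R2."""
    reach = closure(set(r1) & set(r2), fds)
    return set(r1) <= reach or set(r2) <= reach


FDS = [
    ({"Ssn"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]
EMP = {"Ssn", "Ename", "Bdate", "Address", "Dnumber"}
DEPT = {"Dnumber", "Dname", "Dmgr_ssn"}
# Hypothetical poor split that shares only Dname, which determines nothing here.
BAD1 = {"Ssn", "Ename", "Dname"}
BAD2 = {"Dname", "Dnumber", "Dmgr_ssn", "Bdate", "Address"}

lossless_ok = is_lossless_binary(EMP, DEPT, FDS)
lossless_bad = is_lossless_binary(BAD1, BAD2, FDS)
```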

17 Basic Concept
Super key
Candidate key
Primary key
Prime attribute: a member of some candidate key.
Non-prime attribute: an attribute that is not a member of any candidate key.
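
Candidate keys and prime attributes can be derived from a set of FDs by brute force on small schemas. The sketch below assumes the EMP_PROJ attributes and FDs used elsewhere in the deck; the exhaustive subset search and the function names are choices made here and would not scale to large schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Enumerate minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = set(combo)
            if any(k <= subset for k in keys):
                continue  # a smaller key is contained, so this is not minimal
            if closure(subset, fds) == set(attrs):
                keys.append(subset)
    return keys


EMP_PROJ = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
FDS = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

keys = candidate_keys(EMP_PROJ, FDS)
prime = set().union(*keys)
non_prime = EMP_PROJ - prime
```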

18 Basic Concept
Dependency Preservation Property: every functional dependency should be preserved after decomposition.
Non-additive Join (or Lossless Join) Property
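
What a failure of the non-additive join property looks like on actual tuples can be shown by projecting a small relation and joining the pieces back. This is an illustrative sketch; the two sample rows and the helper names `project` and `natural_join` are invented for the example.

```python
def project(rows, attrs):
    """Project rows onto attrs, returning a set of (attribute, value) tuples."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Join two projected relations on their shared attribute names."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(tuple(sorted({**ld, **rd}.items())))
    return out


# Hypothetical instance: two employees on two projects at the same location.
rows = [
    {"Ssn": "1", "Pnumber": 10, "Plocation": "Houston"},
    {"Ssn": "2", "Pnumber": 20, "Plocation": "Houston"},
]
original = {tuple(sorted(r.items())) for r in rows}

# Splitting on Plocation loses which employee worked on which project.
bad = natural_join(project(rows, ["Ssn", "Plocation"]),
                   project(rows, ["Pnumber", "Plocation"]))
# Splitting on Pnumber reconstructs exactly the original tuples.
good = natural_join(project(rows, ["Ssn", "Pnumber"]),
                    project(rows, ["Pnumber", "Plocation"]))
spurious = bad - original
```

In the bad split the join returns four tuples for two original rows; the two extra ones are the spurious tuples.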

19 First Normal Form
Every attribute holds a single atomic value.
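
One common way to reach 1NF is to expand a multi-valued attribute into one row per value. The sketch below assumes a department table whose `Dlocations` attribute holds a list; the data and the function name `to_1nf` are hypothetical and only illustrate the idea.

```python
def to_1nf(rows, multi_attr):
    """Emit one row per value of a list-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)
            new_row[multi_attr] = value  # replace the list with one atomic value
            flat.append(new_row)
    return flat


# Hypothetical non-1NF data: Dlocations is a list, not an atomic value.
departments = [
    {"Dnumber": 5, "Dname": "Research", "Dlocations": ["Bellaire", "Sugarland"]},
    {"Dnumber": 4, "Dname": "Administration", "Dlocations": ["Stafford"]},
]

flat = to_1nf(departments, "Dlocations")
```

The flattened table repeats Dname for each location, which is the kind of redundancy the higher normal forms then address.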

20 First Normal Form

21 Second Normal Form
Every non-prime attribute is fully functionally dependent on every key.
FD1 : {Ssn, Pnumber} → Hours
FD2 : Ssn → Ename
FD3 : Pnumber → {Pname, Plocation}

22 Second Normal Form
Every non-prime attribute is fully functionally dependent on every key.
FD1 : {Ssn, Pnumber} → Hours
FD2 : Ssn → Ename
FD3 : Pnumber → {Pname, Plocation}
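
The 2NF condition can be tested by looking for non-prime attributes that depend on only part of a key. This is a minimal sketch using FD1 to FD3 above and assuming {Ssn, Pnumber} is the only candidate key; the function name `partial_dependencies` is a label chosen here.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(attrs, fds, keys):
    """Map each proper subset of a key to the non-prime attributes it determines."""
    non_prime = set(attrs) - set().union(*keys)
    found = {}
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                dependent = closure(part, fds) & non_prime
                if dependent:
                    found[frozenset(part)] = dependent
    return found


EMP_PROJ = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
FDS = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

violations = partial_dependencies(EMP_PROJ, FDS, [{"Ssn", "Pnumber"}])
```

An empty result would mean the schema is in 2NF with respect to the given keys; here FD2 and FD3 both show up as partial dependencies.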

23 Third Normal Form
For every FD X → A, either (a) X is a superkey or (b) A is a prime attribute.
Transitive dependency:
FD1 : Ssn → {Ename, Bdate, Address, Dnumber}
FD2 : Dnumber → {Dname, Dmgr_ssn}

24 Third Normal Form
For every FD X → A, either (a) X is a superkey or (b) A is a prime attribute.
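
The two-part 3NF condition can be applied to each listed FD in turn. The sketch below assumes the EMP_DEPT schema with Ssn as its only candidate key and checks only the FDs given explicitly; the function name `violates_3nf` is hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(attrs, fds, keys):
    """List (X, A) pairs where X is not a superkey and A is not prime."""
    prime = set().union(*keys)
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(attrs)
        for a in sorted(set(rhs) - set(lhs)):  # skip trivial parts of the FD
            if not is_superkey and a not in prime:
                bad.append((frozenset(lhs), a))
    return bad


EMP_DEPT = {"Ssn", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
FDS = [
    ({"Ssn"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]

bad = violates_3nf(EMP_DEPT, FDS, [{"Ssn"}])
```

Only the FD with Dnumber on the left is reported, which matches the transitive dependency Ssn → Dnumber → {Dname, Dmgr_ssn}.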

25 1NF, 2NF, 3NF

26 Example
FD1 : Property_id# → {County_name, Lot#, Area, Price, Tax_rate}
FD2 : {County_name, Lot#} → {Property_id#, Area, Price, Tax_rate}
FD3 : County_name → Tax_rate
FD4 : Area → Price

27 Example
FD1 : Property_id# → {County_name, Lot#, Area, Price, Tax_rate}
FD2 : {County_name, Lot#} → {Property_id#, Area, Price, Tax_rate}
FD3 : County_name → Tax_rate
FD4 : Area → Price

28 Example
FD1 : Property_id# → {County_name, Lot#, Area, Price, Tax_rate}
FD2 : {County_name, Lot#} → {Property_id#, Area, Price, Tax_rate}
FD3 : County_name → Tax_rate
FD4 : Area → Price
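
The four FDs in this example can be classified against the 2NF and 3NF definitions from the earlier slides. This is a sketch under the assumption that the candidate keys are {Property_id#} and {County_name, Lot#}, as FD1 and FD2 imply; the relation name `LOTS` and the function `classify` are labels chosen for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def classify(lhs, rhs, attrs, fds, keys):
    """Label one FD as fine, a partial dependency, or a transitive dependency."""
    prime = set().union(*keys)
    if closure(lhs, fds) == set(attrs):
        return "ok: left side is a superkey"
    if set(rhs) - set(lhs) <= prime:
        return "ok for 3NF: right side is prime"
    if any(set(lhs) < set(k) for k in keys):
        return "violates 2NF: partial dependency"
    return "violates 3NF: transitive dependency"


LOTS = {"Property_id#", "County_name", "Lot#", "Area", "Price", "Tax_rate"}
FDS = [
    ({"Property_id#"}, {"County_name", "Lot#", "Area", "Price", "Tax_rate"}),
    ({"County_name", "Lot#"}, {"Property_id#", "Area", "Price", "Tax_rate"}),
    ({"County_name"}, {"Tax_rate"}),
    ({"Area"}, {"Price"}),
]
KEYS = [{"Property_id#"}, {"County_name", "Lot#"}]

labels = [classify(lhs, rhs, LOTS, FDS, KEYS) for lhs, rhs in FDS]
```

FD3 is flagged at the 2NF stage because County_name is part of a key, while FD4 survives until the 3NF test because Area is not part of any key.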

29 Normalization (Top Down Approach)

30 BCNF (Boyce-Codd Normal Form)
Given FD X → A, X is always a superkey.
FD1 : Property_id# → {County_name, Lot#, Area}
FD2 : {County_name, Lot#} → {Property_id#, Area}
FD5 : Area → County_name
Third Normal Form: given FD X → A, (a) X is a superkey or (b) A is a prime attribute.

31 BCNF (Boyce-Codd Normal Form)
FD1 : Property_id# → {County_name, Lot#, Area}
FD2 : {County_name, Lot#} → {Property_id#, Area}
FD5 : Area → County_name
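
The BCNF test and one decomposition step can be sketched for the FDs above. The split used here is the usual textbook step (X+ on one side, X plus the remaining attributes on the other), which the slides do not spell out; the relation name `LOTS1A` and both function names are assumptions.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the nontrivial listed FDs whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attrs)]


def split_on(attrs, lhs, fds):
    """One decomposition step: (X+ within R, X plus everything outside X+)."""
    reach = closure(lhs, fds) & set(attrs)
    return reach, set(lhs) | (set(attrs) - reach)


LOTS1A = {"Property_id#", "County_name", "Lot#", "Area"}
FDS = [
    ({"Property_id#"}, {"County_name", "Lot#", "Area"}),
    ({"County_name", "Lot#"}, {"Property_id#", "Area"}),
    ({"Area"}, {"County_name"}),
]

violations = bcnf_violations(LOTS1A, FDS)
r1, r2 = split_on(LOTS1A, {"Area"}, FDS)
```

FD5 passes the 3NF test because County_name is prime, yet it fails BCNF because Area is not a superkey. After the split, FD2 spans both relations, so this step trades dependency preservation for BCNF.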

32 Take Home Message
Functional Dependency (FD)
Non-additive join property
Dependency preservation property
Normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

