PMIT-6102 Advanced Database Systems

Presentation transcript:

PMIT-6102 Advanced Database Systems
By Jesmin Akhter, Assistant Professor, IIT, Jahangirnagar University

Lecture 02
Relational Database Design
Normalization

Outline
- Overview of Relational DBMS
- Normalization

Normalization
The aim of normalization is to eliminate various anomalies (undesirable aspects) of a relation in order to obtain “better” relations. The following four problems might exist in a relation scheme:
- Repetition anomaly
- Update anomaly
- Insertion anomaly
- Deletion anomaly

Repetition Anomaly
The ENAME, TITLE and SAL attribute values are repeated for each project that the employee is involved in. This:
- wastes space
- complicates updates
- is contrary to the spirit of databases
Table EMP (ENO, ENAME, TITLE, SAL, PNO, RESP, DUR): employees E1 to E8 assigned to projects P1 to P4.

Update Anomaly
If any attribute of an employee (say SAL) is updated, multiple tuples of EMP have to be updated to reflect the change.

Insertion Anomaly
It may not be possible to store information about a new project until an employee is assigned to it.

Deletion Anomaly
If an engineer who is the only employee on a project leaves the company, either his personal information cannot be deleted, or the information about that project is lost with it. Many tuples may have to be deleted.
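
The update anomaly can be sketched in a few lines of Python. This is only an illustration: the rows below are hypothetical and use a subset of the EMP columns, and the helper names `update_salary` and `salaries_consistent` are assumptions made for this sketch.

```python
# Hypothetical tuples of the flat EMP relation (subset of its columns).
emp = [
    {"ENO": "E2", "ENAME": "M. Smith", "SAL": 34000, "PNO": "P1"},
    {"ENO": "E2", "ENAME": "M. Smith", "SAL": 34000, "PNO": "P2"},
    {"ENO": "E3", "ENAME": "A. Lee", "SAL": 27000, "PNO": "P3"},
]


def update_salary(rows, eno, new_sal):
    """Set SAL in every tuple of employee `eno`; return the number of tuples changed."""
    touched = 0
    for row in rows:
        if row["ENO"] == eno:
            row["SAL"] = new_sal
            touched += 1
    return touched


def salaries_consistent(rows):
    """True if every ENO is associated with exactly one SAL value."""
    seen = {}
    for row in rows:
        if seen.setdefault(row["ENO"], row["SAL"]) != row["SAL"]:
            return False
    return True


# One logical change (a new salary for E2) has to touch one tuple per project.
touched = update_salary(emp, "E2", 36000)
```

If only some of the tuples of E2 were updated, `salaries_consistent` would report the inconsistency that normalization is meant to rule out by storing the salary once.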

What to do?
Take each relation individually and “improve” it in terms of the desired characteristics.
- Normal forms
  - Atomic values (1NF)
  - Can be defined according to keys and dependencies:
    - Functional dependencies (2NF, 3NF, BCNF)
    - Multivalued dependencies (4NF)
    - Projection-join dependencies (5NF)
- Normalization
  - Normalization is a process of concept separation which applies a top-down methodology for producing a schema by successive refinements and decompositions.
  - Do not combine unrelated sets of facts in one table; each relation should contain an independent set of facts.
  - Universal relation assumption

Normalization Issues
- How do we decompose a schema into a desirable normal form?
- What criteria should the decomposed schemas follow in order to preserve the semantics of the original schema?
  - Reconstructability: the original relation can be recovered, with no spurious joins
  - Lossless decomposition: no information loss
  - Dependency preservation: the constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations

A Lossy Decomposition

Example of Lossless-Join Decomposition
Decomposition of R = (A, B, C) into R1 = (A, B) and R2 = (B, C)

r:
A | B | C
α | 1 | A
β | 2 | B

Π_{A,B}(r):
A | B
α | 1
β | 2

Π_{B,C}(r):
B | C
1 | A
2 | B

Π_{A,B}(r) ⋈ Π_{B,C}(r):
A | B | C
α | 1 | A
β | 2 | B
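
The lossless-join property can be checked on a concrete relation instance by projecting it onto the two schemas, joining the projections, and comparing the result with the original. The following is a minimal sketch under stated assumptions: relations are lists of dictionaries, the function names are illustrative, and the sample values are hypothetical. It tests one instance only; it does not prove that a decomposition is lossless for every legal instance of the schema.

```python
def project(relation, attrs):
    """Projection onto attrs, with duplicate tuples removed."""
    seen, out = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(r1, r2):
    """Natural join on the attributes the two relations have in common."""
    common = [a for a in r1[0] if a in r2[0]] if r1 and r2 else []
    return [
        {**t1, **t2}
        for t1 in r1
        for t2 in r2
        if all(t1[a] == t2[a] for a in common)
    ]


def as_set(relation):
    """Order-independent view of a relation, used for comparison."""
    return {tuple(sorted(row.items())) for row in relation}


def is_lossless(relation, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original tuples."""
    joined = natural_join(project(relation, attrs1), project(relation, attrs2))
    return as_set(joined) == as_set(relation)


# B identifies each tuple, so splitting on B loses nothing.
r_ok = [{"A": "a1", "B": 1, "C": "c1"}, {"A": "a2", "B": 2, "C": "c2"}]
# Both tuples share B = 1, so the join produces spurious tuples.
r_bad = [{"A": "a1", "B": 1, "C": "c1"}, {"A": "a2", "B": 1, "C": "c2"}]

lossless_ok = is_lossless(r_ok, ["A", "B"], ["B", "C"])
lossless_bad = is_lossless(r_bad, ["A", "B"], ["B", "C"])
```

In the second instance the join returns four tuples instead of two, which is what a lossy decomposition looks like in practice.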

Stages of Normalization
Unnormalized form (UNF)
  ↓ Remove repeating groups
First normal form (1NF)
  ↓ Remove partial dependencies
Second normal form (2NF)
  ↓ Remove transitive dependencies
Third normal form (3NF)
  ↓ Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
  ↓ Remove multivalued dependencies
Fourth normal form (4NF)
  ↓ Remove remaining anomalies
Fifth normal form (5NF)

Repeating Groups
A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.
Example: the following relation contains staff and department details and a list of telephone contact numbers for each member of staff.
Table (staffNo, job, dept, dname, city, contactNumber): staff member SL10 (Salesman, dept 10, Sales, Stratford) has three contact numbers in a single cell (018111777, 018111888, 079311122); SA51 has one (017111777), DS40 has none (Null) and OS45 has one (079311555).
Repeating groups are not allowed in a relational design, since all attributes have to be ‘atomic’, i.e., there can only be one value per cell in a table.

Repeating Groups
Multivalued attributes (or repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified, directly or indirectly, by the value of the primary key or of a part of it (that is, they are not functionally dependent on it).
Table STUDENT (Stud_ID, Name, Course_ID, Units): student 101 (Lennon) has two Course_ID values, MSI 250 and MSI 415, in a single cell; student 125 (Johnson) has MSI 331.

Functional Dependency
Formal definition: Attribute B is functionally dependent upon attribute A (or a collection of attributes) if a value of A determines a single value of attribute B at any one time.
Formal notation: A → B
This should be read as ‘A determines B’ or ‘B is functionally dependent on A’. A is called the determinant and B is called the object of the determinant.

staffNo | job      | dept | dname
SL10    | Salesman | 10   | Sales
SA51    | Manager  | 20   | Accounts
DS40    | Clerk    | 20   | Accounts
OS45    | Clerk    | 30   | Operations

Example (functional dependencies):
staffNo → job
staffNo → dept
staffNo → dname
dept → dname
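
As a rough sketch, the definition can be tested against a relation instance: A → B is violated as soon as two tuples agree on A but differ on B. The function name `fd_holds` is an assumption of this sketch, and the data is the staff table from the slide. Note that an instance can only refute a dependency; whether it holds in general is a statement about the meaning of the data.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on the lhs attributes but differ on the rhs attributes."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]

staff_determines_job = fd_holds(staff, ["staffNo"], ["job"])
dept_determines_dname = fd_holds(staff, ["dept"], ["dname"])
# Two clerks work in different departments, so job does not determine dept.
job_determines_dept = fd_holds(staff, ["job"], ["dept"])
```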

Functional Dependency
Compound (composite) determinants: If more than one attribute is necessary to determine another attribute in an entity, then such a determinant is termed a composite determinant.
Full functional dependency: Only of relevance with composite determinants. This is the situation when it is necessary to use all the attributes of the composite determinant to identify its object uniquely.

order# | line# | qty | price
A001   | 001   | 10  | 200
A002   | 001   | 20  | 400
A002   | 002   | 20  | 800
A004   | 001   | 15  | 300

Example (full functional dependencies):
(order#, line#) → qty
(order#, line#) → price

Functional Dependency
Partial functional dependency: This is the situation that exists when only a subset of the attributes of the composite determinant is needed to identify its object uniquely.
Table (student#, unit#, room, grade), e.g. (9900100, A01, TH224, 2) and (9901011, A02, JS075, 3).
(student#, unit#) → grade is a full functional dependency.
unit# → room is a partial functional dependency, which leads to repetition of data.

Functional Dependency
Partial dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.
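
Given a primary key and a list of functional dependencies, partial dependencies can be picked out mechanically. This is a minimal sketch, assuming dependencies are written as (determinant, dependents) pairs of attribute names; the function name is illustrative, and the example uses the student#/unit# dependencies from the earlier slide.

```python
def partial_dependencies(primary_key, fds):
    """Return the FDs whose determinant is a proper subset of the primary key
    and which determine at least one non-key attribute."""
    key = frozenset(primary_key)
    found = []
    for lhs, rhs in fds:
        non_key = tuple(a for a in rhs if a not in key)
        if frozenset(lhs) < key and non_key:
            found.append((tuple(lhs), non_key))
    return found


fds = [
    (("student#", "unit#"), ("grade",)),  # needs the whole key: full dependency
    (("unit#",), ("room",)),              # needs only part of the key: partial
]

partials = partial_dependencies(("student#", "unit#"), fds)
# With a single-attribute key no proper, non-empty part exists, so nothing is reported.
no_partials = partial_dependencies(("ISBN",), [(("ISBN",), ("Title",))])
```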

Transitive Dependency
Definition: A transitive dependency exists when there is an intermediate functional dependency.
Formal notation: If A → B and B → C, then it can be stated that the following transitive dependency exists: A → B → C

staffNo | job      | dept | dname
SL10    | Salesman | 10   | Sales
SA51    | Manager  | 20   | Accounts
DS40    | Clerk    | 20   | Accounts
OS45    | Clerk    | 30   | Operations

Example (transitive dependency):
staffNo → dept
dept → dname
staffNo → dept → dname
Result: repetition of data.

Transitive Dependency
Transitive dependency: when a non-key attribute determines another non-key attribute.
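
The chain A → B → C can be followed with the standard attribute-closure computation: keep adding every attribute determined by what has been collected so far until nothing changes. The sketch below pairs it with the slide's simplified test for transitive dependencies (a non-key determinant with a non-key dependent). Function names and the (determinant, dependents) representation are assumptions of this sketch.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def transitive_dependencies(primary_key, fds):
    """FDs in which non-key attributes determine another non-key attribute."""
    key = set(primary_key)
    return [
        (tuple(lhs), tuple(rhs))
        for lhs, rhs in fds
        if not (set(lhs) & key) and set(rhs) - key
    ]


fds = [
    (("staffNo",), ("job",)),
    (("staffNo",), ("dept",)),
    (("dept",), ("dname",)),
]

staff_closure = closure(["staffNo"], fds)   # staffNo reaches dname through dept
dept_closure = closure(["dept"], fds)
transitive = transitive_dependencies(("staffNo",), fds)
```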

Normal Forms: Review
- Unnormalized: there are multivalued attributes or repeating groups
- 1NF: no multivalued attributes or repeating groups
- 2NF: 1NF plus no partial dependencies
- 3NF: 2NF plus no transitive dependencies

Example 1: Determine NF
Functional dependencies: ISBN → Title, ISBN → Publisher, Publisher → Address
All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Example 1: Determine NF
Functional dependencies: ISBN → Title, ISBN → Publisher, Publisher → Address
The relation is at least in 1NF. There is no COMPOSITE primary key, therefore there can't be partial dependencies. Therefore, the relation is at least in 2NF.

Example 1: Determine NF
Functional dependencies: ISBN → Title, ISBN → Publisher, Publisher → Address
Publisher is a non-key attribute, and it determines Address, another non-key attribute. Therefore, there is a transitive dependency, which means that the relation is NOT in 3NF.

Example 1: Determine NF
Functional dependencies: ISBN → Title, ISBN → Publisher, Publisher → Address
We know that the relation is at least in 2NF, and it is not in 3NF. Therefore, we conclude that the relation is in 2NF.

Example 1: Determine NF
Functional dependencies: ISBN → Title, ISBN → Publisher, Publisher → Address
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) No partial dependencies, therefore at least 2NF
3) There is a transitive dependency (Publisher → Address), therefore not 3NF
Conclusion: The relation is in 2NF

Example 2: Determine NF
Functional dependency: Product_ID → Description
All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Example 2: Determine NF
Functional dependency: Product_ID → Description
The relation is at least in 1NF. There is a COMPOSITE primary key (PK), (Order_No, Product_ID), therefore there can be partial dependencies. Product_ID, which is a part of the PK, determines Description; hence, there is a partial dependency. Therefore, the relation is not in 2NF. There is no point in checking for transitive dependencies.

Example 2: Determine NF
Functional dependency: Product_ID → Description
We know that the relation is at least in 1NF, and it is not in 2NF. Therefore, we conclude that the relation is in 1NF.

Example 2: Determine NF
Functional dependency: Product_ID → Description
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID → Description), therefore not in 2NF
Conclusion: The relation is in 1NF

Example 3: Determine NF
Relation attributes: Part_ID, Descr, Price, Comp_ID, No
Functional dependencies: Part_ID → Description, Part_ID → Price, (Part_ID, Comp_ID) → No
Comp_ID and No are not determined by the primary key; therefore, the relation is NOT in 1NF. There is no point in looking at partial or transitive dependencies.

Example 3: Determine NF
Functional dependencies: Part_ID → Description, Part_ID → Price, (Part_ID, Comp_ID) → No
In your solution you will write the following justification:
1) There are M/V attributes; therefore, not 1NF
Conclusion: The relation is not normalized.
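
The justification pattern used in the three examples can be written down as a small decision procedure. The sketch below follows the simplified rules of these slides only (repeating groups, then partial dependencies, then transitive dependencies) and stops at 3NF; the function name and the (determinant, dependents) representation are assumptions, and whether a relation has repeating groups must be supplied by the caller.

```python
def highest_normal_form(primary_key, fds, has_repeating_groups=False):
    """Classify a relation as 'UNF', '1NF', '2NF' or '3NF' using the slide rules."""
    if has_repeating_groups:
        return "UNF"  # multivalued attributes: not even 1NF
    key = set(primary_key)
    # Partial dependency: part of the key determines a non-key attribute.
    for lhs, rhs in fds:
        if set(lhs) < key and set(rhs) - key:
            return "1NF"
    # Transitive dependency: non-key attributes determine a non-key attribute.
    for lhs, rhs in fds:
        if lhs and not (set(lhs) & key) and set(rhs) - key:
            return "2NF"
    return "3NF"


example_1 = highest_normal_form(
    ("ISBN",),
    [(("ISBN",), ("Title",)), (("ISBN",), ("Publisher",)), (("Publisher",), ("Address",))],
)
example_2 = highest_normal_form(
    ("Order_No", "Product_ID"),
    [(("Product_ID",), ("Description",))],
)
example_3 = highest_normal_form(
    ("Part_ID",),
    [(("Part_ID",), ("Description",)), (("Part_ID",), ("Price",))],
    has_repeating_groups=True,
)
```

The three results match the conclusions written out above: 2NF for Example 1, 1NF for Example 2, and not normalized for Example 3.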

Bringing a Relation to 1NF

Bringing a Relation to 1NF
Option 1: Make the determinant of the repeating group (or of the multivalued attribute) a part of the primary key. The result is a composite primary key.

Bringing a Relation to 1NF
Option 2: Remove the entire repeating group from the relation. Create another relation that contains all the attributes of the repeating group, plus the primary key from the first relation. In this new relation, the primary key from the original relation and the determinant of the repeating group together form the primary key.

Bringing a Relation to 1NF

Bringing a Relation to 2NF
Figure labels: Composite Primary Key

Bringing a Relation to 2NF
Goal: remove partial dependencies.
Figure labels: Partial Dependencies, Composite Primary Key

Bringing a Relation to 2NF
- Remove from the original relation the attributes that depend on a part, but not the whole, of the primary key.
- For each partial dependency, create a new relation whose primary key is the corresponding part of the primary key of the original relation.

Bringing a Relation to 2NF

Bringing a Relation to 3NF
Goal: get rid of transitive dependencies.
Figure labels: Transitive Dependency

Bringing a Relation to 3NF
- Remove from the original relation the attributes that are dependent on a non-key attribute.
- For each transitive dependency, create a new relation with the non-key attribute that is the determinant in the transitive dependency as its primary key, and the dependent non-key attribute as a dependent.

Bringing a Relation to 3NF

Unnormalised Form (UNF)
ORDER (order-no, order-date, cust-no, cust-name, cust-add, (prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)

First Normal Form (1NF)
Definition: A relation is in 1NF if, and only if, all its underlying attributes contain atomic values only.
Remove repeating groups into a new relation. A repeating group is shown by a pair of brackets within the relational schema:
ORDER (order-no, order-date, cust-no, cust-name, cust-add, (prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)
Steps from UNF to 1NF:
1. Remove the outermost repeating group (and any nested repeating groups it may contain) and create a new relation to contain it.
2. Add to this relation a copy of the PK of the relation immediately enclosing it.
3. Name the new entity (appending the number 1 to indicate 1NF).
4. Determine the PK of the new entity.
5. Repeat the steps until no repeating groups remain.

Example - UNF to 1NF
ORDER (order-no, order-date, cust-no, cust-name, cust-add, (prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)
1. Remove the outermost repeating group (and any nested repeating groups it may contain) and create a new relation to contain it. Rename the original to indicate 1NF.
   ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
   (prod-no, prod-desc, unit-price, ord-qty, line-total)
2. Add to this relation a copy of the PK of the relation immediately enclosing it.
   ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
   (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
3. Name the new entity (appending the number 1 to indicate 1NF).
   ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
4. Determine the PK of the new entity.
   ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total), with PK (order-no, prod-no)
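
The UNF to 1NF steps can be sketched on nested data. This is an illustration only: the order record is hypothetical, attribute names use underscores instead of hyphens so they are convenient Python keys, the repeating group is assumed to be stored under a `lines` key, and `to_1nf` is a name chosen for this sketch.

```python
def to_1nf(orders):
    """Split nested orders into ORDER-1 rows and ORDER-LINE-1 rows."""
    order_1, order_line_1 = [], []
    for order in orders:
        # Step 1: take the repeating group out of the order.
        order_1.append({k: v for k, v in order.items() if k != "lines"})
        for line in order["lines"]:
            # Step 2: copy the enclosing PK (order_no) into every line.
            order_line_1.append({"order_no": order["order_no"], **line})
    return order_1, order_line_1


# Hypothetical unnormalised order with a repeating group of two product lines.
unf_orders = [
    {
        "order_no": 1,
        "order_date": "2009-01-22",
        "cust_no": 7,
        "cust_name": "Example Customer",
        "cust_add": "Example Address",
        "order_total": 34,
        "lines": [
            {"prod_no": "P10", "prod_desc": "Widget", "unit_price": 5, "ord_qty": 2, "line_total": 10},
            {"prod_no": "P20", "prod_desc": "Gadget", "unit_price": 8, "ord_qty": 3, "line_total": 24},
        ],
    }
]

order_1, order_line_1 = to_1nf(unf_orders)
```

Every value in the resulting rows is atomic, and each ORDER-LINE-1 row is identified by the pair (order_no, prod_no).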

Second Normal Form (2NF)
Definition: A relation is in 2NF if, and only if, it is in 1NF and every non-key attribute is fully dependent on the primary key.
Remove partial functional dependencies into a new relation.
Steps from 1NF to 2NF:
1. Remove the offending attributes that are only partially functionally dependent on the composite key, and place them in a new relation.
2. Add to this relation a copy of the attribute(s) which are the determinants of these offending attributes. These will automatically become the primary key of this new relation.
3. Name the new entity (appending the number 2 to indicate 2NF).
4. Rename the original entity (ending with a 2 to indicate 2NF).

Example - 1NF to 2NF
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
1. Remove the offending attributes that are only partially functionally dependent on the composite key, and place them in a new relation.
   ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
   (prod-desc, unit-price)
2. Add to this relation a copy of the attribute(s) which determine these offending attributes. These will automatically become the primary key of this new relation.
   (prod-no, prod-desc, unit-price)
   ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
3. Name the new entity (appending the number 2 to indicate 2NF).
   PRODUCT-2 (prod-no, prod-desc, unit-price)
4. Rename the original entity (ending with a 2 to indicate 2NF).
   ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
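
The same decomposition can be sketched as a projection over rows: the offending attributes move to a new relation keyed by their determinant, and the original keeps everything else. The rows are hypothetical, attribute names use underscores, and `decompose` is an illustrative helper rather than a standard function. It applies equally to the 2NF to 3NF step, where the determinant is a non-key attribute.

```python
def decompose(rows, determinant, dependents):
    """Move `dependents` into a new relation keyed by `determinant`.

    Returns (remaining_rows, new_relation_rows)."""
    new_relation = {}
    remaining = []
    for row in rows:
        key = tuple(row[a] for a in determinant)
        # One row per determinant value: the repetition disappears here.
        new_relation[key] = {a: row[a] for a in (*determinant, *dependents)}
        remaining.append({a: v for a, v in row.items() if a not in dependents})
    return remaining, list(new_relation.values())


# Hypothetical ORDER-LINE-1 rows; product P10 is described twice.
order_line_1 = [
    {"order_no": 1, "prod_no": "P10", "prod_desc": "Widget", "unit_price": 5, "ord_qty": 2, "line_total": 10},
    {"order_no": 2, "prod_no": "P10", "prod_desc": "Widget", "unit_price": 5, "ord_qty": 1, "line_total": 5},
    {"order_no": 2, "prod_no": "P20", "prod_desc": "Gadget", "unit_price": 8, "ord_qty": 3, "line_total": 24},
]

order_line_2, product_2 = decompose(order_line_1, ["prod_no"], ["prod_desc", "unit_price"])
```

After the split, the description and price of P10 are stored once in PRODUCT-2 instead of once per order line.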

Third Normal Form (3NF)
Definition: A relation is in 3NF if, and only if, it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
Remove transitive dependencies into a new relation.
Steps from 2NF to 3NF:
1. Remove the offending attributes that are transitively dependent on non-key attribute(s), and place them in a new relation.
2. Add to this relation a copy of the attribute(s) which are the determinants of these offending attributes. These will automatically become the primary key of this new relation.
3. Name the new entity (appending the number 3 to indicate 3NF).
4. Rename the original entity (ending with a 3 to indicate 3NF).

Example - 2NF to 3NF
ORDER-2 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
1. Remove the offending attributes that are transitively dependent on non-key attributes, and place them in a new relation.
   (cust-name, cust-add)
   ORDER-2 (order-no, order-date, cust-no, order-total)
2. Add to this relation a copy of the attribute(s) which determine these offending attributes. These will automatically become the primary key of this new relation.
   (cust-no, cust-name, cust-add)
   ORDER-2 (order-no, order-date, cust-no, order-total)
3. Name the new entity (appending the number 3 to indicate 3NF).
   CUSTOMER-3 (cust-no, cust-name, cust-add)
4. Rename the original entity (ending with a 3 to indicate 3NF).
   ORDER-3 (order-no, order-date, cust-no, order-total)

Example - Relations in 3NF
CUSTOMER-3 (cust-no, cust-name, cust-add), PK: cust-no
ORDER-3 (order-no, order-date, cust-no, order-total), PK: order-no
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total), PK: (order-no, prod-no)
PRODUCT-2 (prod-no, prod-desc, unit-price), PK: prod-no
ER diagram: CUSTOMER places ORDER (ORDER is placed by CUSTOMER); ORDER contains ORDER-LINE (ORDER-LINE is part of ORDER); ORDER-LINE shows PRODUCT (PRODUCT belongs to ORDER-LINE).
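
One possible way to realise the four 3NF relations is sketched below with Python's built-in sqlite3 module. Several details are assumptions made for the sketch: table and column names use underscores and plural `orders` (hyphens are awkward in SQL identifiers and ORDER is a keyword), column types are guesses, and the inserted rows are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customer (
    cust_no   INTEGER PRIMARY KEY,
    cust_name TEXT,
    cust_add  TEXT
);
CREATE TABLE product (
    prod_no    INTEGER PRIMARY KEY,
    prod_desc  TEXT,
    unit_price REAL
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    order_date  TEXT,
    cust_no     INTEGER NOT NULL REFERENCES customer(cust_no),
    order_total REAL
);
CREATE TABLE order_line (
    order_no   INTEGER NOT NULL REFERENCES orders(order_no),
    prod_no    INTEGER NOT NULL REFERENCES product(prod_no),
    ord_qty    INTEGER,
    line_total REAL,
    PRIMARY KEY (order_no, prod_no)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(SCHEMA)

# Hypothetical data: each fact is stored exactly once.
conn.execute("INSERT INTO customer VALUES (7, 'Example Customer', 'Example Address')")
conn.execute("INSERT INTO product VALUES (10, 'Widget', 5.0)")
conn.execute("INSERT INTO orders VALUES (1, '2009-01-22', 7, 10.0)")
conn.execute("INSERT INTO order_line VALUES (1, 10, 2, 10.0)")

# Joining the four relations rebuilds the original flat order view.
flat_view = conn.execute(
    """
    SELECT o.order_no, c.cust_name, p.prod_desc, l.ord_qty, l.line_total
    FROM order_line AS l
    JOIN orders   AS o ON o.order_no = l.order_no
    JOIN customer AS c ON c.cust_no  = o.cust_no
    JOIN product  AS p ON p.prod_no  = l.prod_no
    """
).fetchall()
```

The foreign keys follow the relationships in the diagram, so an order line that refers to an unknown product or order is rejected by the database instead of creating an inconsistency.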

Case Study on Normalization
Consider the table EMP_DEPT_PROJ and the dependencies which exist in it.

Steps to Normalize the database
Thus we will have 3 tables.

Steps to Normalize the database
Let us now identify the transitive dependency and remove it.

Steps to Normalize the database
Let us now identify the non-key determinants and remove them. Thus we will have 5 tables.

Thank You