Database Management System: Normalization


Database Management System

Normalization
Normalization is the process of decomposing relations with anomalies to produce smaller, well-structured relations. It is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. The process of normalization was first developed by E. F. Codd. It is essentially a way of organizing data in a database efficiently. Its purpose is to produce an anomaly-free design.

Goal of Normalization
There are two goals of the normalization process:
- Eliminate redundant data (for example, storing the same data in more than one table).
- Ensure data dependencies make sense (only storing related data in a table).
Both are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored.

Normalization Details
Normalization is often performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form. When a requirement is not met, the relation violating the requirement must be decomposed into relations that individually meet the requirements of normalization. The process of normalization is a formal method that identifies relations based on their primary key (or candidate keys in the case of BCNF) and the functional dependencies among their attributes. As normalization proceeds, the relations become progressively more restricted in format, and also less vulnerable to update anomalies.

Normalization Details
- Normalization is a strongly recommended step.
- A normalized design makes the database easier to maintain.
- Normalization is applied to each table of a database design.
- It is performed after the logical database design.
- Informally, it is also performed during conceptual database design.

Normal Forms
There are different forms, or levels, of normalization, called the first, second, third, and so on, normal forms. Each form has certain conditions; if a table fulfils the conditions for a normal form, the table is in that normal form. Three normal forms were initially proposed: first (1NF), second (2NF) and third (3NF). Subsequently, a stronger definition of third normal form was introduced by R. Boyce and E. F. Codd, referred to as Boyce-Codd Normal Form (BCNF). Higher normal forms that go beyond BCNF, such as the fourth (4NF) and fifth (5NF) normal forms, were introduced later. However, these later normal forms deal with situations that are very rare in practice.

Steps in Normalization [diagram]

Anomalies
An anomaly is an inconsistent, incomplete or incorrect state of the database. Four types of anomalies are of concern here: redundancy, insertion, deletion and update.

Un-normalized Form
A table (not a relation) that contains one or more repeating groups.

First Normal Form

Functional Dependency
Normalization is based on functional dependencies (FDs). A functional dependency is a type of relationship between attributes of a relation.
Definition: If A and B are attributes of a relation R, then B is functionally dependent on A if each value of A in R is associated with exactly one value of B; written as A → B.
It does not mean that A derives B, although that may sometimes be the case. It means that if we know the value of A, then we can precisely determine a unique value of B. The attribute or set of attributes on the left side is called the determinant, and those on the right are called the dependents.
For example: R (a, b, c, d, e) with a → b, c, d and d → d, e
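The definition above can be tested against sample data. The following is a minimal sketch, not part of the original slides: the helper name `holds` and the rows of R(a, b, c) are hypothetical. Note that sample rows can only contradict a dependency; they cannot prove that it holds for every possible state of the relation.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is not contradicted by rows."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant;
        # any later row with the same determinant must agree with it.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Hypothetical sample rows for a relation R(a, b, c)
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 1, "b": "x", "c": 20},
    {"a": 2, "b": "y", "c": 10},
]

a_determines_b = holds(rows, ["a"], ["b"])  # each a value has one b value
a_determines_c = holds(rows, ["a"], ["c"])  # a = 1 appears with c = 10 and c = 20
print(a_determines_b, a_determines_c)
```

Here a → b survives the check, while a → c is contradicted by the first two rows.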

FD Diagrammatic Representation [diagram]

FD Example
STD (stId, stName, stAdr, prName, credits)
stId → stName, stAdr, prName, credits
prName → credits
stId  | stName      | stAdr          | prName | prCrdts
S1020 | Sohail Dar  | I-8 Islamabad  | MCS    | 64
S1038 | Shoaib Ali  | G-6 Islamabad  | BCS    | 132
S1015 | Tahira Ejaz | L Rukh Wah     | MCS    | 64
S1015 | Tahira Ejaz | L Rukh Wah     | MCS    | 132
S1018 | Arif Zia    | E-8, Islamabad | BIT    | 134

First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value. A relation is in 1NF if and only if each attribute is single-valued for each tuple. This means that each attribute in each row, or each cell of the table, contains only one value. No repeating fields or groups are allowed.

1NF Contd.
There are two common approaches to achieve 1NF:
- 1st approach, flattening: we remove the repeating groups by entering appropriate data in the empty columns of rows containing the repeating data. This approach introduces redundancy into the resulting relation, which is subsequently removed during the normalization process.
- 2nd approach, decomposition: we remove the repeating group by placing the repeating data, along with a copy of the original key attributes, in a separate relation. A primary key is then identified for the new relation.

Un-normalized Table
stId | stName | stAdr | Major | CourseID | CourseTitle | InstName | InstLoc | Grade
S101 | Sohail | Islamabad | IT | CS101, CS102, CS 421 | IT Basics, Programming, db | Saima, Ahmad, Irum | Swabi, Peshawar, Kohat | B, C, B
S102 | Shoaib | Islamabad | HR | Mgt11, Mgt12 | Mgt Basics, HRM | Nadia, Maira | Swabi, Peshawar | A, B
S103 | Tahir | Peshawar | Finance | Mgt 11, Mgt 13 | Mgt Basics, Accounting | Nadia, Rabbia | Swabi, Lahore | B, B
S104 | Kamran | Kohat | IT | CS101, CS 202 | IT Basics, Op. Systems | Saima, Tamleek | Swabi, Peshawar | A, B
S105 | Arif | Peshawar | IT | CS503 | Networking | Zahid | Charsadda | C

Relations after applying 1NF
Relation1 (stID, stName, stAdr, Major)
Relation2 (stID, CourseID, CourseTitle, InstName, InstLoc, Grade)
Relation 1
stId | stName | stAdr     | Major
S101 | Sohail | Islamabad | IT
S102 | Shoaib | Islamabad | HR
S103 | Tahir  | Peshawar  | Finance
S104 | Kamran | Kohat     | IT
S105 | Arif   | Peshawar  | IT

Relations after applying 1NF
Relation 2
stId | CourseID | CourseTitle | InstName | InstLoc   | Grade
S101 | CS101    | IT Basics   | Saima    | Swabi     | B
S101 | CS102    | Programming | Ahmad    | Peshawar  | C
S101 | CS 421   | db          | Irum     | Kohat     | B
S102 | Mgt11    | Mgt Basics  | Nadia    | Swabi     | A
S102 | Mgt12    | HRM         | Maira    | Peshawar  | B
S103 | Mgt 11   | Mgt Basics  | Nadia    | Swabi     | B
S103 | Mgt 13   | Accounting  | Rabbia   | Lahore    | B
S104 | CS101    | IT Basics   | Saima    | Swabi     | A
S104 | CS 202   | Op. Systems | Tamleek  | Peshawar  | B
S105 | CS 503   | Networking  | Zahid    | Charsadda | C
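The decomposition approach to 1NF can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `split_cells` and `to_1nf` are hypothetical, and it assumes each repeating group is stored as a comma-separated string, as in two rows of the un-normalized table.

```python
STUDENT_ATTRS = ["stId", "stName", "stAdr", "Major"]
REPEATING_ATTRS = ["CourseID", "CourseTitle", "InstName", "InstLoc", "Grade"]


def split_cells(value):
    """Split a multi-valued cell such as 'B, C, B' into single values."""
    return [part.strip() for part in value.split(",")]


def to_1nf(rows):
    """Move the repeating group into its own relation, keyed by stId + CourseID."""
    relation1, relation2 = [], []
    for row in rows:
        relation1.append({attr: row[attr] for attr in STUDENT_ATTRS})
        columns = [split_cells(row[attr]) for attr in REPEATING_ATTRS]
        # zip(*columns) pairs up the n-th value of every repeating attribute
        for values in zip(*columns):
            relation2.append({"stId": row["stId"], **dict(zip(REPEATING_ATTRS, values))})
    return relation1, relation2


unnormalized = [
    {"stId": "S101", "stName": "Sohail", "stAdr": "Islamabad", "Major": "IT",
     "CourseID": "CS101, CS102, CS 421", "CourseTitle": "IT Basics, Programming, db",
     "InstName": "Saima, Ahmad, Irum", "InstLoc": "Swabi, Peshawar, Kohat",
     "Grade": "B, C, B"},
    {"stId": "S105", "stName": "Arif", "stAdr": "Peshawar", "Major": "IT",
     "CourseID": "CS503", "CourseTitle": "Networking", "InstName": "Zahid",
     "InstLoc": "Charsadda", "Grade": "C"},
]

relation1, relation2 = to_1nf(unnormalized)
for row in relation2:
    print(row)
```

Each student appears once in `relation1`, and every (stId, CourseID) pair becomes its own single-valued row in `relation2`.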

Problems in 1NF
- Update problem: if we want to change a course title from 'IT Basics' to 'Intro. To IT', we must make this change in every record in which the course title appears. Updating can therefore be a lengthy process.
- Insertion problem: if we want to add a new course, we cannot do so until a student is enrolled in that course, because stID and CourseID together form the primary key of Relation2.
- Deletion problem: if we delete the student with ID S105, the information about the course with ID CS503 will also be deleted.
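The update and deletion problems can be seen directly on a few rows of Relation2. The snippet below is a small sketch with a reduced column set (stID, CourseID, CourseTitle); the helper name `known_courses` is hypothetical.

```python
# A few rows of Relation2, reduced to (stID, CourseID, CourseTitle)
relation2 = [
    ("S101", "CS101", "IT Basics"),
    ("S104", "CS101", "IT Basics"),
    ("S105", "CS503", "Networking"),
]


def known_courses(rows):
    """Return the set of (CourseID, CourseTitle) facts recorded in rows."""
    return {(course_id, title) for _, course_id, title in rows}


# Update problem: renaming 'IT Basics' touches every enrolment row for that course.
rows_to_update = sum(1 for row in relation2 if row[2] == "IT Basics")

# Deletion problem: removing student S105 also removes the only record of CS503.
after_delete = [row for row in relation2 if row[0] != "S105"]
lost_courses = known_courses(relation2) - known_courses(after_delete)

print(rows_to_update, lost_courses)
```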

Second Normal Form

Full Functional Dependency
2NF is based on the concept of full functional dependency.
Definition: if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A. A functional dependency is a full functional dependency if removing any attribute from A means the dependency no longer holds.
Partial functional dependency: if A and B are attributes of R, B is partially dependent on A if B is functionally dependent on some proper subset of A. If the primary key is not a composite key, there can be no partial functional dependency.

Example: Full Functional Dependency
stID, CourseID → CourseName
stID, CourseID → InstName
stID, stName → Major
stID → Major
stName → Major
stID, CourseID → Grade

Second Normal Form (2NF)
2NF applies to relations with composite keys, that is, relations with a primary key composed of two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF.
A relation is in 2NF if it is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key.
The normalization of 1NF relations to 2NF involves the removal of partial dependencies. If a partial dependency exists, we remove the functionally dependent attributes from the relation by placing them in a new relation along with a copy of their determinant.

Example – 2NF
Relation1 is in 2NF because its primary key is not composite. Relation2 has the following functional dependencies:
- stID, CourseID → Grade (FFD)
- CourseID → CourseTitle (PD)
- CourseID → InstName (PD)
- CourseID → InstLoc (PD)
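The distinction between a full (FFD) and a partial (PD) dependency can be checked mechanically on sample rows. The sketch below uses hypothetical helper names (`holds`, `is_full_dependency`) and four rows taken from Relation2; as with any data-based check, it can only show that a dependency is not contradicted by the rows supplied.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not contradicted by rows."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


def is_full_dependency(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False  # a proper subset suffices, so the dependency is partial
    return True


rows = [
    {"stID": "S101", "CourseID": "CS101", "CourseTitle": "IT Basics", "Grade": "B"},
    {"stID": "S101", "CourseID": "CS102", "CourseTitle": "Programming", "Grade": "C"},
    {"stID": "S104", "CourseID": "CS101", "CourseTitle": "IT Basics", "Grade": "A"},
    {"stID": "S104", "CourseID": "CS 202", "CourseTitle": "Op. Systems", "Grade": "B"},
]

key = ["stID", "CourseID"]
grade_is_full = is_full_dependency(rows, key, ["Grade"])
title_is_full = is_full_dependency(rows, key, ["CourseTitle"])
print(grade_is_full, title_is_full)
```

Grade needs both key attributes, whereas CourseTitle is already determined by CourseID alone, which is why it is marked PD above.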

Result of applying 2NF
Relation1 (stID, stName, stAdr, Major)
Relation2 (stID, CourseID, Grade)
Relation3 (CourseID, CourseTitle, InstName, InstLoc)
Relation 2
stId | CourseID | Grade
S101 | CS101    | B
S101 | CS102    | C
S101 | CS 421   | B
S102 | Mgt11    | A
S102 | Mgt12    | B
S103 | Mgt 11   | B
S103 | Mgt 13   | B
S104 | CS101    | A
S104 | CS 202   | B
S105 | CS 503   | C

Result of applying 2NF
Relation 3
CourseID | CourseTitle | InstName | InstLoc
CS101    | IT Basics   | Saima    | Swabi
CS102    | Programming | Ahmad    | Peshawar
CS 421   | db          | Irum     | Kohat
Mgt11    | Mgt Basics  | Nadia    | Swabi
Mgt12    | HRM         | Maira    | Peshawar
Mgt 13   | Accounting  | Rabbia   | Lahore
CS 202   | Op. Systems | Tamleek  | Peshawar
CS 503   | Networking  | Zahid    | Charsadda

Steps for converting 1NF to 2NF
The process for transforming a 1NF table to 2NF is:
1. Identify any determinants other than the composite key, and the columns they determine.
2. Create and name a new table for each determinant and the unique columns it determines.
3. Move the determined columns from the original table to the new table. The determinant becomes the primary key of the new table.
4. Delete the columns you just moved from the original table, except for the determinant, which will serve as a foreign key.
5. The original table may be renamed to maintain semantic meaning.
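The steps above amount to two projections of the 1NF table. The following sketch, with hypothetical helper names (`project`, `natural_join`, `as_set`), splits a few rows of the 1NF Relation2 on the determinant CourseID and then rejoins the parts to check that no information was lost.

```python
def project(rows, attrs):
    """Project rows onto attrs, keeping each distinct tuple once (first-seen order)."""
    distinct = dict.fromkeys(tuple(row[a] for a in attrs) for row in rows)
    return [dict(zip(attrs, values)) for values in distinct]


def natural_join(left, right, on):
    """Join two lists of dicts on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


def as_set(rows):
    """Order-independent view of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


relation2_1nf = [
    {"stID": "S101", "CourseID": "CS101", "CourseTitle": "IT Basics",
     "InstName": "Saima", "InstLoc": "Swabi", "Grade": "B"},
    {"stID": "S101", "CourseID": "CS102", "CourseTitle": "Programming",
     "InstName": "Ahmad", "InstLoc": "Peshawar", "Grade": "C"},
    {"stID": "S104", "CourseID": "CS101", "CourseTitle": "IT Basics",
     "InstName": "Saima", "InstLoc": "Swabi", "Grade": "A"},
    {"stID": "S105", "CourseID": "CS503", "CourseTitle": "Networking",
     "InstName": "Zahid", "InstLoc": "Charsadda", "Grade": "C"},
]

# Steps 2-3: the determinant CourseID and the columns it determines form a new table.
relation3 = project(relation2_1nf, ["CourseID", "CourseTitle", "InstName", "InstLoc"])
# Step 4: the original table keeps the key and Grade; CourseID stays as a foreign key.
relation2 = project(relation2_1nf, ["stID", "CourseID", "Grade"])

rejoined = natural_join(relation2, relation3, "CourseID")
print(len(relation2), len(relation3), as_set(rejoined) == as_set(relation2_1nf))
```

The course details for CS101 are now stored once instead of once per enrolled student, and the join reproduces the original rows.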

Analysis of 2NF
- Update problem: if we want to change a course title from 'IT Basics' to 'Intro. To IT', we can easily do so, as each course title appears only once in Relation 3.
- Insertion problem: if we want to add a new course, we can easily do so in Relation 3, irrespective of whether a student is enrolled in it or not.
- Deletion problem: if we delete the student with ID S105, the information about the course with ID CS503 will no longer be deleted. But if we delete a row from Relation3, we may lose data about an instructor. So a deletion anomaly remains in 2NF.

Third Normal Form

Transitive Dependency
Third normal form is based on transitive dependency. A transitive dependency is a condition where A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C). It is a type of functional dependency.

Example: Transitive Dependency
CourseID → InstName
InstName → InstLoc
So CourseID → InstLoc
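A transitive dependency of this shape can be found by following one dependency into another. The sketch below is deliberately simplified: it assumes single-attribute determinants and only looks for one-step chains key → B → C. The function name `transitive_via` and the dictionary encoding of the dependencies are hypothetical.

```python
def transitive_via(key, fds):
    """Find (key, B, C) chains where key -> B and B -> C, and B does not determine key.

    fds maps a single determinant attribute to the set of attributes it determines.
    """
    found = []
    direct = fds.get(key, set())
    for b, dependents in fds.items():
        if b == key:
            continue
        # key -> b must hold, and b must not determine the key in return
        if b in direct and key not in dependents:
            for c in sorted(dependents):
                found.append((key, b, c))
    return found


# Dependencies of Relation3 (CourseID, CourseTitle, InstName, InstLoc)
relation3_fds = {
    "CourseID": {"CourseTitle", "InstName"},
    "InstName": {"InstLoc"},
}

chains = transitive_via("CourseID", relation3_fds)
print(chains)
```

The single chain reported is CourseID → InstName → InstLoc, matching the example above.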

Third Normal Form
A relation is in 3NF if it is in 1NF and 2NF and no non-primary-key attribute is transitively dependent on the primary key. If a transitive dependency exists, we remove the transitively dependent attribute(s) from the relation by placing them in a new relation along with a copy of the determinant.

Example – 3NF
Relation1 and Relation2 are in 3NF. Relation3 has the transitive dependency CourseID → InstName → InstLoc shown in the transitive dependency example.

Result of applying 3NF
Relation1 (stID, stName, stAdr, Major)
Relation2 (stID, CourseID, Grade)
Relation3 (CourseID, CourseTitle, InstName)
Relation4 (InstName, InstLoc)

Result of applying 3NF
Relation 3
CourseID | CourseTitle | InstName
CS101    | IT Basics   | Saima
CS102    | Programming | Ahmad
CS 421   | db          | Irum
Mgt11    | Mgt Basics  | Nadia
Mgt12    | HRM         | Maira
Mgt 13   | Accounting  | Rabbia
CS 202   | Op. Systems | Tamleek
CS 503   | Networking  | Zahid
Relation 4
InstName | InstLoc
Saima    | Swabi
Ahmad    | Peshawar
Irum     | Kohat
Nadia    | Swabi
Maira    | Peshawar
Rabbia   | Lahore
Tamleek  | Peshawar
Zahid    | Charsadda

Steps for converting 2NF to 3NF
The process of transforming a table into 3NF is:
1. Identify any determinants, other than the primary key, and the columns they determine.
2. Create and name a new table for each determinant and the unique columns it determines.
3. Move the determined columns from the original table to the new table. The determinant becomes the primary key of the new table.
4. Delete the columns you just moved from the original table, except for the determinant, which will serve as a foreign key.
5. The original table may be renamed to maintain semantic meaning.

Analysis of 3NF
The deletion anomaly no longer exists. If we delete a row from Relation3, it will not affect data about instructors, as there is now a separate table (Relation4) containing data about instructors.

Exercise
Normalize the following relations:
- WORK (projName, projMgr, empId, hours, empName, budget, startDate, salary, empMgr, empDept)
- PROPERTY (PropertyNo, Paddress, InspDate, InspTime, Staff_No, Sname, Comments)
- CUSTOMER ORDER (CustName, OrderNo, ProdNo, ProdDesc, Qty, CustAddress, DateOrdered)
Assumption: a customer can have multiple orders, but an order can be for only one product. CustName and OrderNo are preassigned as keys.

Boyce-Codd Normal Form

Boyce-Codd Normal Form
Boyce-Codd normal form (BCNF) is based on functional dependencies that take into account all candidate keys in a relation.
Definition: a relation is in BCNF if and only if every determinant is a candidate key.

Difference of 3NF and BCNF
The difference between 3NF and BCNF is that for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key. Therefore, BCNF is a stronger form of 3NF, such that every relation in BCNF is also in 3NF. However, a relation in 3NF is not necessarily in BCNF. BCNF differs from 3NF only when there is more than one candidate key and the keys are composite and overlapping.

Violating BCNF
Violation of BCNF is quite rare, since it can only happen under specific conditions. The potential to violate BCNF may occur in a relation that:
- contains two (or more) composite candidate keys; and
- has candidate keys that overlap, that is, have at least one attribute in common.

Example- BCNF
Relation with sample data
SID | Major      | Advisor | Maj_GPA
    | Physics    | Hammad  | 4.0
    | Maths      | Kamran  |
    | Literature | Latif   |
    | Maths      | Asif    |
678 | Physics    | Hammad  |
Functional dependencies
SID, Major → Advisor, Maj_GPA
Advisor → Major

Example- BCNF (Contd.)
The primary key for this relation is the composite key consisting of SID and Major. Thus the two attributes Advisor and Maj_GPA are functionally dependent on this key. This reflects the constraint that although a given student may have more than one major, for each major a student has exactly one advisor and one GPA.
There is a second functional dependency in this relation: Major is functionally dependent on Advisor. That is, each advisor advises in exactly one major. That means that a key attribute (Major) is functionally dependent on a non-key attribute (Advisor).
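Whether a set of dependencies violates BCNF can be checked with the attribute closure. The sketch below uses the commonly stated test that the determinant of every non-trivial dependency must be a superkey; the function names `closure` and `bcnf_violations` are hypothetical, and the dependencies are the two described for the advisor relation.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result.update(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations


attrs = ["SID", "Major", "Advisor", "Maj_GPA"]
fds = [
    (("SID", "Major"), ("Advisor", "Maj_GPA")),
    (("Advisor",), ("Major",)),
]

violations = bcnf_violations(attrs, fds)
print(violations)
```

Only Advisor → Major is reported: Advisor determines Major but not SID or Maj_GPA, so it is a determinant that is not a candidate key.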

Example- BCNF (Anomalies)
The above relation is clearly in 3NF, since there are no partial functional dependencies and no transitive dependencies. Because of the functional dependency between Major and Advisor, there are anomalies in this relation.
- Update problem: suppose that in Physics the advisor Hammad is replaced by Adil. This change must be made in two (or more) rows in the table.
- Insertion problem: suppose we want to insert a row with the information that Ali advises in IT. This cannot be done until at least one student majoring in IT is assigned Ali as an advisor.
- Deletion problem: if student number 789 withdraws from school, we lose the information that Asif advises in Maths.

Converting a relation to BCNF
A relation that is in 3NF can be converted to relations in BCNF using a simple two-step process. In the first step, the relation is modified so that the determinant in the relation that is not a candidate key becomes a component of the primary key of the revised relation. The attribute that is functionally dependent on that determinant becomes a non-key attribute. The second step is to decompose the relation to eliminate the partial functional dependency.
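For comparison, the widely taught decomposition step for a dependency X → Y that violates BCNF reaches the same schemas in one move: one relation holds X and Y, the other keeps everything except the attributes in Y that are not in X. The sketch below illustrates that standard step rather than the exact two-step wording above; the function name `decompose_on` is hypothetical.

```python
def decompose_on(attrs, lhs, rhs):
    """Split a schema on a violating dependency lhs -> rhs.

    Returns (remaining, extracted): extracted holds lhs and rhs together,
    remaining keeps lhs as a foreign key and drops the attributes it determines.
    """
    lhs, rhs = set(lhs), set(rhs)
    extracted = lhs | rhs
    remaining = set(attrs) - (rhs - lhs)
    return remaining, extracted


attrs = ["SID", "Major", "Advisor", "Maj_GPA"]
remaining, extracted = decompose_on(attrs, ["Advisor"], ["Major"])
print(sorted(remaining), sorted(extracted))
```

Applied to Advisor → Major, this gives one relation over SID, Advisor and Maj_GPA and another over Advisor and Major. Because the shared attribute Advisor is the key of the second relation, joining the two reconstructs the original rows.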

Result of applying BCNF
Relation1 (SID, Advisor, Maj_GPA)
Relation2 (Advisor, Major)

Result of applying BCNF (Contd.)
Relation1
SID | Advisor | Maj_GPA
    | Hammad  | 4.0
    | Kamran  |
    | Latif   |
    | Asif    |
678 | Hammad  |
Relation2
Advisor | Major
Hammad  | Physics
Kamran  | Maths
Latif   | Literature
Asif    | Maths
