Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide 1- 1.


Chapter 6: Functional Dependencies & Normalization Dr. Hassan Ismail Abdalla

Objectives
- Normalisation is a technique for analysing and modelling data within an organisation.
- It aims to facilitate the use of shared information by reducing the amount of redundancy in stored data.
- Data normalisation aims to derive record structures which avoid anomalies in:
  - Insertion: occurs when it is impossible to store a fact until another fact is known.
  - Deletion: occurs when the deletion of a fact causes other, still relevant facts to be deleted.
  - Modification: occurs when a change in a fact causes multiple modifications to be necessary.
- Data normalisation ensures that each fact is single-valued.

The Process of Normalisation
- Usually three steps (in industry), giving rise to:
  - First Normal Form (1NF)
  - Second Normal Form (2NF)
  - Third Normal Form (3NF)
- In academia:
  - Boyce-Codd Normal Form (BCNF)
  - Fourth Normal Form (4NF)
- At each step we consider relationships between an entity's attributes.
- These relationships are known as functional dependencies.

Steps in Data Normalisation
UNNORMALISED ENTITY
  step 1: remove repeating groups -> 1st NORMAL FORM
  step 2: remove partial dependencies -> 2nd NORMAL FORM
  step 3: remove indirect dependencies -> 3rd NORMAL FORM
  step 4: every determinant a key -> BOYCE-CODD NF
  step 4: remove multi-dependencies -> 4th NORMAL FORM

Relational Rules
A number of rules are applied to the tables so that they can be manipulated and redundancy removed:
1. The ordering of rows is not significant.
2. The ordering of columns is not significant (each column has a distinct name).
3. The intersection of each row and column can contain only one value; multiple values are not allowed.
4. Each row in a table must be distinct.
The process of normalisation seeks to establish tables which conform to a further, more rigorous set of rules.

Relational Rules
The following table does not conform to the rules above.
Columns: Major Cost centre code, Minor Cost centre code, Description, Amount
Description values by row: (1) Sales, Marketing; (2) Stock Control; (3) Production; (4) Accounts; (5) Sales, Marketing; (6) Sales, Marketing

Relational Rules Cont…
In the above table there are a number of ways in which the rules are broken:
- There are multiple values for Description, e.g. in row 1: Sales, Marketing.
- The ordering of rows is significant: the major cost code 0002 is intended to apply to rows 3 and 4.
- Rows 5 and 6 are the same.

Redundancy vs. Duplication
- It is important to distinguish between redundancy and duplicated data when considering normalisation.
- Duplicated data exists when an attribute has two or more identical values in a table.
- Redundancy exists if data can be deleted without any information being lost.
- Redundancy may be viewed as unnecessary duplication.

Redundancy vs. Duplication
If the part description on line 3 of the table below is deleted, no information is lost, since the description for part number p876 can still be determined from the table.

Supplier No  Part No  Part Description
s123         p876     fan belt
s125         p873     master cylinder
s125         p876     fan belt

The redundancy can be eliminated by splitting the above table into two tables:

SupplierNo  PartNo
s123        p876
s125        p873
s125        p876

PartNo  Part Description
p876    fan belt
p873    master cylinder

It should be noted that no information is lost by representing the original table in the two separate tables.
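The claim that the split loses no information can be checked mechanically. The following is a minimal sketch, assuming each table is held as a Python set of tuples; the variable names are illustrative and not part of the slides. It projects the original table onto the two smaller ones and joins them back on PartNo.

```python
# Rows of the original (SupplierNo, PartNo, PartDescription) table from the slide.
original = {
    ("s123", "p876", "fan belt"),
    ("s125", "p873", "master cylinder"),
    ("s125", "p876", "fan belt"),
}

# Project onto the two smaller tables; a set drops duplicate rows automatically.
supplier_part = {(supplier, part) for supplier, part, _ in original}
part_description = {(part, desc) for _, part, desc in original}

# A natural join on PartNo rebuilds the original rows.
rejoined = {
    (supplier, part, desc)
    for supplier, part in supplier_part
    for part2, desc in part_description
    if part == part2
}

print(len(part_description))   # "fan belt" is now stored once
print(rejoined == original)    # the split is lossless for this data
```

The description of p876 is stored once after the split, yet the join recovers every original row.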

Attributes - Identifiers
- An entity identifier uniquely determines an occurrence of the entity.
- A superkey is a combination of attributes that uniquely identifies a row.
- When more than one identifier exists we have:
  - Candidate identifiers (keys): minimal superkeys
  - Primary key: the candidate key designated as the main identifier
- Example entity: SUPPLIER (Supplier#, Supplier-name, Supp-add)
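A rough sketch of how superkeys and candidate keys relate, using the SUPPLIER attributes from the slide. The sample rows and the helper names `is_superkey` and `candidate_keys` are hypothetical. Note that sample data can only rule a key out; whether an attribute set really is a key is a design decision about all possible rows.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two sample rows agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(projected) == len(set(projected))

def candidate_keys(rows, all_attrs):
    """Minimal superkeys of the sample, searched smallest-first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if not contains_smaller_key and is_superkey(rows, combo):
                keys.append(combo)
    return keys

ATTRS = ("Supplier#", "Supplier-name", "Supp-add")

# Hypothetical sample rows, invented for illustration only.
suppliers = [
    {"Supplier#": "s123", "Supplier-name": "Acme", "Supp-add": "1 High St"},
    {"Supplier#": "s125", "Supplier-name": "Acme", "Supp-add": "2 Low Rd"},
    {"Supplier#": "s130", "Supplier-name": "Acme", "Supp-add": "1 High St"},
]

print(is_superkey(suppliers, ("Supplier#", "Supplier-name")))  # a superkey, but not minimal
print(candidate_keys(suppliers, ATTRS))
```

Here (Supplier#, Supplier-name) is a superkey but not a candidate key, because Supplier# alone already identifies each row.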

Functional Dependency
- A functional dependency is a constraint between two sets of attributes from the database.
- B is functionally dependent on A (written A -> B) if a value of A uniquely determines a value of B.
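The definition can be turned into a simple test over sample rows. This is a sketch only: the helper name `fd_holds` is an assumption, and the rows are the supplier and part data from the earlier slide. As with keys, data can show that a dependency fails, but cannot prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """True if every value of lhs is paired with exactly one value of rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[name] for name in lhs)
        b = tuple(row[name] for name in rhs)
        # setdefault records the first b seen for a; later rows must match it.
        if seen.setdefault(a, b) != b:
            return False
    return True

rows = [
    {"SupplierNo": "s123", "PartNo": "p876", "PartDescription": "fan belt"},
    {"SupplierNo": "s125", "PartNo": "p873", "PartDescription": "master cylinder"},
    {"SupplierNo": "s125", "PartNo": "p876", "PartDescription": "fan belt"},
]

print(fd_holds(rows, ["PartNo"], ["PartDescription"]))  # PartNo -> PartDescription
print(fd_holds(rows, ["SupplierNo"], ["PartNo"]))       # s125 supplies two parts
```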

More Examples of Functional Dependency

Attributes - Repeating Groups
- When a group of attributes has multiple values for a single occurrence of an entity, we say there is a repeating group of attributes in the entity.
- Example: (BRANCH_NAME, BRANCH_ADDRESS) is a repeating group.

Repeating Groups
Consider the situation where a customer makes a number of orders to a company. This may be represented in a table as follows:

Customer No  Customer Name  Order No
C123         Aldridge       O678
C123         Aldridge       O789
C123         Aldridge       O791
C131         Archer         O649
C131         Archer         O682
C151         Grundy         0655

Repeating Groups
- It can be seen that Order No may be repeated for a given customer. In this situation Order No is said to be a repeating group.
- In the above example Order No is the only attribute in the repeating group; this is not usually the case.
- It can be seen that there is redundant duplication present in the attribute Customer Name.
- The repeating group, and hence the redundancy, can be eliminated by splitting the table into two.
- In order to preserve the information after the split, the two tables must share at least one attribute. In the example below Customer No is present in both tables.

Repeating Groups
Splitting results in the following:

CustomerNo  CustomerName
C123        Aldridge
C131        Archer
C151        Grundy

CustomerNo  OrderNo
C123        O678
C123        O789
C123        O791
C131        O649
C131        O682
C

Steps in Normalisation
1. List Data in an Unnormalised Table
In this stage data items are extracted from the source and listed in a simple tabular format. Note that the unnormalised table does not conform to the table rules above. For example, the following represents the training record for a company:

EmpNo  EmpName  DeptNo  DeptName    CourseNo  CourseName  Rating
123    J Smith  21      Systems     S2        SSADM       Poor
                                    S3        dBaseIV     Average
                                    S5        Data Anal   Good
137    D Brown  23      Operations  O1        JCL         Good
                                    O9        Cobol       Good
154    J Patel  21      Systems     S2        SSADM       Average

Steps in Normalisation …Cont
Select a key for the table. Where multiple attributes are necessary to uniquely identify a row, choose the compound key with the minimum number of attributes. EmpNo is selected in the above.
2. Remove Repeating Groups (First Normal Form)
In this stage the attributes which are dependent on, and repeat for, another given attribute are separated into another table. This is done by filling in the blank attribute values and then splitting the table.

EmpNo  EmpName  DeptNo  DeptName    CourseNo  CourseName  Rating
123    J Smith  21      Systems     S2        SSADM       Poor
123    J Smith  21      Systems     S3        dBaseIV     Average
123    J Smith  21      Systems     S5        Data Anal   Good
137    D Brown  23      Operations  O1        JCL         Good
137    D Brown  23      Operations  O9        Cobol       Good
154    J Patel  21      Systems     S2        SSADM       Average

Steps in Normalisation …Cont
On removing the repeating group the above Figure becomes:

(Fig 01)
EmpNo  EmpName  DeptNo  DeptName
123    J Smith  21      Systems
137    D Brown  23      Operations
154    J Patel  21      Systems

(Fig 02)
EmpNo  CourseNo  CourseName  Rating
123    S2        SSADM       Poor
123    S3        dBaseIV     Average
123    S5        Data Anal   Good
137    O1        JCL         Good
137    O9        Cobol       Good
154    S2        SSADM       Average

The key EmpNo has been incorporated into the table containing the repeating group to preserve the overall information. This step will have to be repeated for each repeating group in the table.
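The removal of the repeating group can be sketched in a few lines. This assumes the unnormalised training record is held as a list of dictionaries with the repeating group nested under a hypothetical "Courses" field; the data is taken from the slides.

```python
# Unnormalised training records: the course details repeat inside each employee.
unnormalised = [
    {"EmpNo": 123, "EmpName": "J Smith", "DeptNo": 21, "DeptName": "Systems",
     "Courses": [("S2", "SSADM", "Poor"), ("S3", "dBaseIV", "Average"),
                 ("S5", "Data Anal", "Good")]},
    {"EmpNo": 137, "EmpName": "D Brown", "DeptNo": 23, "DeptName": "Operations",
     "Courses": [("O1", "JCL", "Good"), ("O9", "Cobol", "Good")]},
    {"EmpNo": 154, "EmpName": "J Patel", "DeptNo": 21, "DeptName": "Systems",
     "Courses": [("S2", "SSADM", "Average")]},
]

# Employee table: one row per employee, without the repeating group.
employees = [
    (r["EmpNo"], r["EmpName"], r["DeptNo"], r["DeptName"]) for r in unnormalised
]

# Training table: one row per course taken, carrying EmpNo so the
# link back to the employee is preserved.
training = [
    (r["EmpNo"], course_no, course_name, rating)
    for r in unnormalised
    for course_no, course_name, rating in r["Courses"]
]

print(len(employees), len(training))
for row in training:
    print(row)
```

The `employees` list corresponds to Fig 01 and `training` to Fig 02.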

First Normal Form
- Any un-normalised entity type is transformed to 1NF.
- Remove all repeating attribute groups.
- Repeating attribute groups become new entity types in their own right.
- The identifier of the original entity type must be an attribute (but not necessarily an identifier) of the derived entity type.
- Any 'hidden' entities are identified.

Second Normal Form
- A relation is in 2NF if it is in 1NF and each non-identifying attribute depends upon the whole identifier.
- Remove part-key dependencies: attributes which are dependent on part of a compound key are put into a separate table along with that part of the compound key.
- In Fig 02, EmpNo and CourseNo together may be considered to be a compound key, since both are required to identify a row in the table.
- Separating the attributes which are only concerned with CourseNo gives:

(Fig 03)
EmpNo  CourseNo  Rating
123    S2        Poor
123    S3        Average
123    S5        Good
137    O1        Good
137    O9        Good
154    S2        Average

(Fig 04)
CourseNo  CourseName
S2        SSADM
S3        dBaseIV
S5        Data Anal
O1        JCL
O9        Cobol
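As an illustration, the part-key dependency in Fig 02 can be detected and removed with plain Python. This is a sketch under the assumption that rows are tuples in the column order (EmpNo, CourseNo, CourseName, Rating); the helper `determines` is a hypothetical name.

```python
# Fig 02 from the slides: (EmpNo, CourseNo, CourseName, Rating).
fig02 = [
    (123, "S2", "SSADM", "Poor"),
    (123, "S3", "dBaseIV", "Average"),
    (123, "S5", "Data Anal", "Good"),
    (137, "O1", "JCL", "Good"),
    (137, "O9", "Cobol", "Good"),
    (154, "S2", "SSADM", "Average"),
]

def determines(rows, lhs_index, rhs_index):
    """True if the column at lhs_index fixes the column at rhs_index."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs_index], row[rhs_index]) != row[rhs_index]:
            return False
    return True

# CourseName depends on CourseNo alone (part of the key); Rating does not.
print(determines(fig02, 1, 2))
print(determines(fig02, 1, 3))

# Split: Fig 03 keeps the whole key plus Rating, Fig 04 holds the course names.
fig03 = [(emp, course, rating) for emp, course, _, rating in fig02]
fig04 = sorted({(course, name) for _, course, name, _ in fig02})

# Joining on CourseNo rebuilds Fig 02, so nothing was lost.
names = dict(fig04)
rebuilt = [(emp, course, names[course], rating) for emp, course, rating in fig03]
print(rebuilt == fig02)
```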

Third Normal Form
- A relation is in 3NF if it is in 2NF and all non-identifying attributes are mutually independent.
- A relation in 2NF is transformed into 3NF as follows:
  - Determine functional dependencies between non-identifying attributes.
  - Decompose the relation into new relations.

Third Normal Form
- Remove transitive dependency and inter-key dependency: separate out attributes which are dependent on an attribute other than the primary key within the table.
- Dependency between non-key attributes is known as 'transitive dependency'.
- In Fig 01, it can be seen that DeptName is dependent on DeptNo. Splitting the table in Fig 01 gives:

(Fig 05)
EmpNo  EmpName  DeptNo
123    J Smith  21
137    D Brown  23
154    J Patel  21

(Fig 06)
DeptNo  DeptName
21      Systems
23      Operations

Third Normal Form
- Note that Dept No is retained in the table in Fig 05, to preserve the information content.
- Dept No is an example of a foreign key, since it is an attribute of one table and also the key of another table.
- Figures 03, 04, 05 and 06 together represent third normal form.
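A short sketch of why removing the transitive dependency helps, using the Fig 01 data. The department table is held as a dictionary keyed on DeptNo, and the new department name "Information Systems" is invented purely for illustration.

```python
# Fig 01 from the slides: (EmpNo, EmpName, DeptNo, DeptName).
fig01 = [
    (123, "J Smith", 21, "Systems"),
    (137, "D Brown", 23, "Operations"),
    (154, "J Patel", 21, "Systems"),
]

# Fig 05 keeps DeptNo as a foreign key; Fig 06 maps DeptNo to DeptName once.
fig05 = [(emp_no, emp_name, dept_no) for emp_no, emp_name, dept_no, _ in fig01]
fig06 = {dept_no: dept_name for _, _, dept_no, dept_name in fig01}

def joined(employees, departments):
    """Follow the foreign key DeptNo to recover the department name."""
    return [(e, n, d, departments[d]) for e, n, d in employees]

print(joined(fig05, fig06) == fig01)   # the split preserves the information

# Renaming a department is now a single change (hypothetical new name),
# rather than one modification per employee row.
fig06[21] = "Information Systems"
renamed = joined(fig05, fig06)
for row in renamed:
    print(row)
```

Both employees in department 21 see the new name after one update, which is the modification anomaly that the split avoids.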

Table Type Notation
The example above would give rise to the following Table Type Notation:

Unnormalised Form
  Emp No, Emp Name, Dept No, Dept Name, (Course No, Course Name, Rating)

First Normal Form
  Emp No, Emp Name, Dept No, Dept Name
  Emp No, Course No, Course Name, Rating

Second Normal Form
  Emp No, Emp Name, Dept No, Dept Name (unchanged from 1NF)
  Course No, Course Name
  Emp No, Course No, Rating

Table Type Notation
Third Normal Form
  Emp No, Emp Name, Dept No*
  Dept No, Dept Name
  Course No, Course Name (unchanged from 2NF)
  Emp No, Course No, Rating (unchanged from 2NF)

Note the conventions employed: parentheses for a repeating group, underlining for a key or compound key, and an asterisk for a foreign key.

Boyce-Codd Normal Form (BCNF)
- A relation is in BCNF if every determinant is a candidate key.
- For a relation with only one candidate key, 3NF and BCNF are equivalent.
- Violation of BCNF is rare and may occur in a relation that contains two (or more) candidate keys.
- 3NF is concerned with FDs between the primary key and the non-key attributes, and with transitive dependencies.
- A relation in 3NF may still have redundancy problems, as 3NF ignores relationships between or within candidate keys.
- The rule for producing tables in BCNF is that each determinant must be a candidate identifier.

Boyce-Codd Normal Form (BCNF)
To achieve this, where a table contains a determinant which is not an identifier, the table is split into two. The non-identifying determinant is put into the new table along with those attributes which are dependent on it.

For example, consider a relation Directory:
  (Employee_no, Emp_name, Dept_name, Room_no, Tel_no)
Where:
- No employee works for more than one department.
- Many employees may occupy one room.
- Employee numbers are unique; names may not be.
- No room is shared between departments.

Boyce-Codd Normal Form (BCNF)
Where the following FDs hold:
  Employee_no -> Emp_name, Dept_name, Room_no, Tel_no
  Room_no -> Dept_name
Here all attributes are dependent on Employee_no, the primary key. Room_no is also a determinant but not a candidate key. This violates the definition of BCNF, and therefore the Directory table must be decomposed into two relations:
  EMP (Employee_no, Emp_name, Room_no, Tel_no)
  ALLOC (Room_no, Dept_name)
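The BCNF test for the Directory example can be sketched with the standard attribute-closure computation. The function names `closure` and `bcnf_violations` are illustrative assumptions; the FDs are the two listed on the slide.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Non-trivial FDs whose determinant does not determine the whole relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not set(relation) <= closure(lhs, fds)
    ]

directory = ("Employee_no", "Emp_name", "Dept_name", "Room_no", "Tel_no")
directory_fds = [
    (("Employee_no",), ("Emp_name", "Dept_name", "Room_no", "Tel_no")),
    (("Room_no",), ("Dept_name",)),
]
print(bcnf_violations(directory, directory_fds))   # Room_no -> Dept_name is reported

# After decomposition each relation keeps only the FDs that apply to it.
emp = ("Employee_no", "Emp_name", "Room_no", "Tel_no")
emp_fds = [(("Employee_no",), ("Emp_name", "Room_no", "Tel_no"))]
alloc = ("Room_no", "Dept_name")
alloc_fds = [(("Room_no",), ("Dept_name",))]
print(bcnf_violations(emp, emp_fds), bcnf_violations(alloc, alloc_fds))
```

Room_no determines only itself and Dept_name, so it is a determinant that is not a key of Directory; in EMP and ALLOC every determinant is a key.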

Advantages of Normalization
- Greater overall database organization will be gained.
- The amount of unnecessary redundant data is reduced.
- Data integrity is easily maintained within the database.
- The database and application design processes are much more flexible.
- Security is easier to manage.

Disadvantages of Normalization
- Produces lots of tables with a relatively small number of columns.
- Probably requires joins in order to put the information back together in the way it needs to be used, effectively reversing the normalization.
- Impacts computer performance (CPU, I/O, memory).

Conclusions
- Data normalisation is a bottom-up technique that ensures the basic properties of the relational model:
  - no duplicate tuples
  - no nested relations
- Data normalisation is often used as the only technique for database design (the implementation view).
- A more appropriate approach is to complement conceptual modelling with data normalisation.

END OF CHAPTER SIX