Normalization

Presentation transcript:

Normalization
- Well-structured relations and anomalies
- Normalization
- First normal form (1NF)
- Functional dependence
- Partial functional dependency
- Second normal form (2NF)
- Transitive functional dependency
- Third normal form (3NF)
- Practical considerations

A well-structured relation
- Contains a minimum amount of redundancy.
- Allows users to modify, insert, and delete rows in a table without errors or inconsistencies.
Example: EMPLOYEE1
- EMPLOYEE1 is a well-structured relation.
- Any modification to an employee's data, such as a change in salary, is confined to one row of the table.

Is this a well-structured relation? (EMPLOYEE2)
- This table has a considerable amount of redundancy: EMPID, NAME, DEPT, and SALARY appear in two separate rows for some employees.
- If the salary of one of those employees changes, we must record the change in two or more rows.
- Therefore, EMPLOYEE2 is not a well-structured relation.

Why minimize redundancies?
Redundancies in a table may result in errors and inconsistencies (called anomalies) when a user attempts to update the data in the table.
Three types of anomalies:
- Insertion anomaly
- Deletion anomaly
- Modification anomaly

Insertion anomaly
To add a new employee to EMPLOYEE2, the user must supply values for both EMPID and COURSE, because primary key values cannot be NULL. In reality, the user should be able to enter employee data without supplying course data.

Deletion anomaly
If the data for employee number 234 is deleted, we also lose the information that this employee completed course 111. In fact, we lose information about the course altogether.
Modification anomaly
Suppose that employee number 100 gets a salary increase. We must record this increase in each of the rows for that employee; otherwise the data will be inconsistent.
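The deletion and modification anomalies can be reproduced on a few rows. The sketch below is only an illustration: the row values (names, departments, salaries, course numbers 101 and 102) are hypothetical stand-ins for the slide's table, and the helper names are invented for this example.

```python
# Hypothetical EMPLOYEE2 rows; values are illustrative, not the slide's actual table.
EMPLOYEE2 = [
    {"EMPID": 100, "NAME": "Employee A", "DEPT": "Dept X", "SALARY": 48000, "COURSE": 101},
    {"EMPID": 100, "NAME": "Employee A", "DEPT": "Dept X", "SALARY": 48000, "COURSE": 102},
    {"EMPID": 234, "NAME": "Employee B", "DEPT": "Dept Y", "SALARY": 55000, "COURSE": 111},
]


def update_salary(rows, empid, new_salary, only_first=False):
    """Return a copy of rows with the salary changed for empid.

    only_first=True simulates a careless update that touches a single row.
    """
    updated, done = [], False
    for row in rows:
        row = dict(row)
        if row["EMPID"] == empid and not (only_first and done):
            row["SALARY"] = new_salary
            done = True
        updated.append(row)
    return updated


def inconsistent_salaries(rows):
    """Return the EMPIDs that appear with more than one distinct salary."""
    seen = {}
    for row in rows:
        seen.setdefault(row["EMPID"], set()).add(row["SALARY"])
    return sorted(empid for empid, salaries in seen.items() if len(salaries) > 1)


def delete_employee(rows, empid):
    """Return the rows that remain after removing every row for empid."""
    return [row for row in rows if row["EMPID"] != empid]


def known_courses(rows):
    """Return the set of course numbers still recorded anywhere in the table."""
    return {row["COURSE"] for row in rows}


# Modification anomaly: updating only one of employee 100's rows leaves two salaries.
partial = update_salary(EMPLOYEE2, 100, 52000, only_first=True)
print(inconsistent_salaries(partial))

# Deletion anomaly: removing employee 234 also removes the only record of course 111.
print(111 in known_courses(delete_employee(EMPLOYEE2, 234)))
```

In this sample, course 111 disappears only because no other row mentions it; with different data the deletion might lose less.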

Normalization
Normalization is a process for converting complex data structures into simple, stable data structures (E. Codd, 1970).
The objectives of the normalization process are:
- to eliminate certain kinds of data redundancy,
- to avoid certain anomalies.
Normalization is accomplished in stages. A normal form is a state of a relation that can be determined by applying simple rules regarding dependencies (relationships between attributes).

First Normal Form (1NF)
- Every attribute in each record contains only one value, i.e. the table contains NO REPEATING GROUPS.
- A relation is, by definition, already in (at least) 1NF.
- A table with repeating groups is converted to a relation in first normal form by extending the data in each column to fill the cells left empty by the repeating-group structure.
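As a rough sketch of the conversion just described, the function below expands a nested "repeating group" into flat rows, copying the non-repeating values into every row. The record layout, the field name COURSES, and all sample values are assumptions made for this example.

```python
def to_first_normal_form(records, group_field):
    """Expand each record's repeating group into one flat row per group item.

    Note: a record whose repeating group is empty produces no rows at all here.
    """
    flat = []
    for record in records:
        # Values that appear once per record and must be repeated on every row.
        fixed = {k: v for k, v in record.items() if k != group_field}
        for item in record[group_field]:
            flat.append({**fixed, **item})
    return flat


# Hypothetical unnormalized data: each employee carries a list of completed courses.
unnormalized = [
    {"EMPID": 100, "NAME": "Employee A", "COURSES": [
        {"COURSE": 101, "DATE_COMPLETED": "2005-06-19"},
        {"COURSE": 102, "DATE_COMPLETED": "2005-10-07"},
    ]},
    {"EMPID": 234, "NAME": "Employee B", "COURSES": [
        {"COURSE": 111, "DATE_COMPLETED": "2005-12-08"},
    ]},
]

rows_1nf = to_first_normal_form(unnormalized, "COURSES")
for row in rows_1nf:
    print(row)
```

Every cell in the result holds a single value, at the cost of repeating EMPID and NAME, which is exactly the redundancy the later normal forms address.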

Student (Student_ID, Sname, GPA, CourseID, Cname, InstructorID, Iname)

Functional dependence
- A functional dependency is a particular relationship between two attributes (or sets of attributes).
- For any relation R, attribute B is functionally dependent on attribute A if, for every valid instance of A, that value of A uniquely determines the value of B.
- Represented as A -> B.
- Normalization is based on the analysis of functional dependencies.

Examples of functional dependency
- SSN -> NAME, ADDRESS, BIRTHDATE: a person's name, address, and birthdate are functionally dependent on that person's social security number.
- VIN -> MAKE, MODEL, COLOR: the make, model, and color of a vehicle are functionally dependent on the vehicle identification number.
- ISBN -> TITLE: the title of a book is functionally dependent on the book's international standard book number (ISBN).

Determinant
The attribute (or set of attributes) on the left-hand side of the arrow in a functional dependency is called a determinant, e.g. SSN, VIN, and ISBN are determinants.
Important! Instances (sample data) in a relation do not prove that a functional dependency exists. Only knowledge of the problem domain is a reliable method for identifying a functional dependency.
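The warning about sample data can be made concrete. The sketch below, with an invented helper `violates_fd` and hypothetical book rows, shows that data can refute a functional dependency but that the absence of a violation proves nothing.

```python
def violates_fd(rows, lhs, rhs):
    """Return True if two rows agree on the lhs attributes but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return True
    return False


# Hypothetical sample rows (ISBN values are placeholders, not real ISBNs).
books = [
    {"ISBN": "isbn-1", "TITLE": "Databases"},
    {"ISBN": "isbn-2", "TITLE": "Networks"},
]

# No violation of TITLE -> ISBN in this sample, yet the dependency is not real:
print(violates_fd(books, ["TITLE"], ["ISBN"]))

# Domain knowledge says two different books may share a title.
books.append({"ISBN": "isbn-3", "TITLE": "Databases"})
print(violates_fd(books, ["TITLE"], ["ISBN"]))
print(violates_fd(books, ["ISBN"], ["TITLE"]))
```

A check like this is useful only for rejecting a proposed dependency; accepting one still requires knowledge of the problem domain.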

EMPLOYEE2 (EMPID, NAME, DEPT, SALARY, COURSE, DATE COMPLETED)
Functional dependencies:
- EMPID -> NAME, DEPT, SALARY
- EMPID, COURSE -> DATE COMPLETED
Therefore, the only candidate key (and hence the primary key) is the combination of EMPID and COURSE.
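One way to check the claim about the candidate key is to compute attribute closures. The following is a minimal sketch using the standard closure algorithm; the function names and the spelling DATE_COMPLETED are choices made for this example, and the brute-force key search is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


EMPLOYEE2 = {"EMPID", "NAME", "DEPT", "SALARY", "COURSE", "DATE_COMPLETED"}
FDS = [
    (frozenset({"EMPID"}), frozenset({"NAME", "DEPT", "SALARY"})),
    (frozenset({"EMPID", "COURSE"}), frozenset({"DATE_COMPLETED"})),
]

print(closure({"EMPID"}, FDS))          # EMPID alone does not reach COURSE
print(candidate_keys(EMPLOYEE2, FDS))   # the only key found is {EMPID, COURSE}
```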

EMPLOYEE2 (EMPID, NAME, DEPT, SALARY, COURSE, DATE COMPLETED)
- A composite key is a primary key that contains more than one attribute.
- EMPID is a determinant but not a candidate key.
- A candidate key is always a determinant, but a determinant is not always a candidate key.

Partial functional dependency
A functional dependency A -> B is a partial dependency if B is functionally dependent on A and also on some proper subset of A. Partial dependencies need to be checked only when the key is composite.
EMPLOYEE2 (EMPID, NAME, DEPT, SALARY, COURSE, DATE COMPLETED)
The functional dependencies are:
- EMPID, COURSE -> DATE COMPLETED
- EMPID -> NAME, DEPT, SALARY
NAME, DEPT, and SALARY therefore depend on only part (EMPID) of the composite key (EMPID, COURSE).

Second Normal Form (2NF)
A relation is in second normal form if:
- it is in first normal form, and
- every nonkey attribute is fully functionally dependent on the primary key, i.e. there is no partial functional dependency.
A relation in first normal form is in 2NF if any one of the following conditions holds:
- the primary key consists of only one attribute,
- no nonkey attributes exist, or
- every nonkey attribute is functionally dependent on the full set of primary key attributes.
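The 2NF condition can be tested mechanically once the primary key and the functional dependencies are known. The sketch below is one possible check, not a standard library routine: `partial_dependencies` is a name invented here, and it reports which proper subsets of the key determine nonkey attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(primary_key, attributes, fds):
    """Map each proper subset of the key to the nonkey attributes it determines.

    An empty result means the relation (assumed to be in 1NF) is in 2NF.
    """
    nonkey = attributes - primary_key
    found = {}
    for size in range(1, len(primary_key)):  # proper, non-empty subsets only
        for subset in combinations(sorted(primary_key), size):
            determined = closure(subset, fds) & nonkey
            if determined:
                found[subset] = determined
    return found


FDS = [
    (frozenset({"EMPID"}), frozenset({"NAME", "DEPT", "SALARY"})),
    (frozenset({"EMPID", "COURSE"}), frozenset({"DATE_COMPLETED"})),
]
KEY = {"EMPID", "COURSE"}
EMPLOYEE2 = {"EMPID", "NAME", "DEPT", "SALARY", "COURSE", "DATE_COMPLETED"}

# EMPID alone determines NAME, DEPT and SALARY, so EMPLOYEE2 is not in 2NF.
print(partial_dependencies(KEY, EMPLOYEE2, FDS))
```

A single-attribute key has no proper non-empty subsets, so the function returns an empty mapping, which matches the first of the three conditions above.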

Problems created by partial functional dependencies
- Insertion anomaly: to insert a row, we must provide values for both EMPID and COURSE.
- Deletion anomaly: if we delete a row for an employee, we lose the information that the employee completed a course on a particular date.
- Modification anomaly: if an employee's salary changes, we must record this change in multiple rows (if the employee completed more than one course).

Removing partial dependencies
If a relation is not in 2NF, it can be further normalized into a number of 2NF relations in which nonkey attributes are associated only with the part of the primary key on which they are fully functionally dependent.
EMPLOYEE2 (EMPID, NAME, DEPT, SALARY, COURSE, DATE COMPLETED), with EMPID, COURSE -> DATE COMPLETED and EMPID -> NAME, DEPT, SALARY, is decomposed into:
- EMPLOYEE (EMPID, NAME, DEPT, SALARY)
- EMPCOURSE (EMPID, COURSE, DATE COMPLETED)
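The decomposition into EMPLOYEE and EMPCOURSE amounts to two projections, and joining them on EMPID gives back the original rows. The sketch below illustrates this on hypothetical data; `project` and `natural_join` are small helpers written for this example, not library functions.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def natural_join(left, right):
    """Join two row lists on every attribute name they have in common."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


def as_set(rows):
    """Order-independent view of a list of rows, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical EMPLOYEE2 rows; values are illustrative only.
employee2 = [
    {"EMPID": 100, "NAME": "Employee A", "DEPT": "Dept X", "SALARY": 48000,
     "COURSE": 101, "DATE_COMPLETED": "2005-06-19"},
    {"EMPID": 100, "NAME": "Employee A", "DEPT": "Dept X", "SALARY": 48000,
     "COURSE": 102, "DATE_COMPLETED": "2005-10-07"},
    {"EMPID": 234, "NAME": "Employee B", "DEPT": "Dept Y", "SALARY": 55000,
     "COURSE": 111, "DATE_COMPLETED": "2005-12-08"},
]

employee = project(employee2, ["EMPID", "NAME", "DEPT", "SALARY"])
empcourse = project(employee2, ["EMPID", "COURSE", "DATE_COMPLETED"])

print(len(employee), len(empcourse))
print(as_set(natural_join(employee, empcourse)) == as_set(employee2))
```

Each employee's name, department, and salary now appear in exactly one EMPLOYEE row, so a salary change touches a single row.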

Transitive dependency
- A transitive dependency is a functional dependency between two (or more) nonkey attributes.
- More precisely, X -> Z is a transitive dependency in R if there is a set of attributes Y, not a subset of the primary key of R, such that both X -> Y and Y -> Z hold.
- E.g. STUDENT NUMBER -> MAJOR and MAJOR -> ADVISOR, so STUDENT NUMBER -> ADVISOR.

Transitive dependency (continued)
Pseudotransitivity rule: if X -> Y and YZ -> W, then XZ -> W.
E.g. STUDENT NUMBER -> MAJOR and MAJOR, CLASS -> ADVISOR, so STUDENT NUMBER, CLASS -> ADVISOR.
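The pseudotransitivity rule is not an extra assumption; it follows from two of Armstrong's inference rules (augmentation and transitivity), which the slides do not name explicitly. A short derivation:

```latex
\begin{align*}
X \to Y
  &\;\Rightarrow\; XZ \to YZ
  && \text{(augmentation: add } Z \text{ to both sides)} \\
XZ \to YZ \ \text{and}\ YZ \to W
  &\;\Rightarrow\; XZ \to W
  && \text{(transitivity)}
\end{align*}
```

With X = STUDENT NUMBER, Y = MAJOR, Z = CLASS, and W = ADVISOR, this gives the example on the slide.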

Third Normal Form (3NF)
- Purpose: to eliminate the anomalies caused by transitive dependencies in a relation.
- A relation is in 3NF if it is in second normal form and no transitive dependencies exist.
- 3NF normalization: the nonkey attributes connected by each functional dependency that causes a transitive dependency are moved into a new relation.

SALES (CUST_NO, NAME, SALESPERSON, REGION)
Functional dependencies:
- CUST_NO -> NAME, SALESPERSON, REGION
- SALESPERSON -> REGION (each salesperson is assigned to a unique region)

Anomalies with SALES
- Insertion anomaly: a new salesperson, Robinson, assigned to the North region cannot be entered until a customer has been assigned to that salesperson.
- Deletion anomaly: if customer number 6577 is deleted from the relation, we lose the information that Hernandez is assigned to the East region.
- Modification anomaly: if salesperson Smith is reassigned to the East region, several rows must be changed to reflect that fact.

Removing transitive dependencies
The transitive dependency can be removed by decomposing SALES into two relations:
- SALES1 (CUST_NO, NAME, SALESPERSON)
- SPERSON (SALESPERSON, REGION)
The determinant of the transitive dependency in SALES (i.e. SALESPERSON) becomes the primary key of SPERSON and a foreign key in SALES1.
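To see how the decomposition removes the three SALES anomalies, the two relations can be created in an in-memory SQLite database through Python's standard `sqlite3` module. This is a sketch under assumptions: the column types, the customer name, and Smith's starting region ('South') are invented for the example; only the names Robinson, Hernandez, Smith, the regions North and East, and customer number 6577 come from the slides.

```python
import sqlite3


def build_sales_db():
    """Create SPERSON and SALES1 in an in-memory database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript("""
        CREATE TABLE SPERSON (
            SALESPERSON TEXT PRIMARY KEY,
            REGION      TEXT NOT NULL
        );
        CREATE TABLE SALES1 (
            CUST_NO     INTEGER PRIMARY KEY,
            NAME        TEXT NOT NULL,
            SALESPERSON TEXT NOT NULL REFERENCES SPERSON (SALESPERSON)
        );
    """)
    return conn


conn = build_sales_db()

# Insertion: Robinson can be recorded before any customer is assigned.
conn.execute("INSERT INTO SPERSON VALUES ('Robinson', 'North')")
conn.execute("INSERT INTO SPERSON VALUES ('Hernandez', 'East')")
conn.execute("INSERT INTO SPERSON VALUES ('Smith', 'South')")  # starting region is hypothetical
conn.execute("INSERT INTO SALES1 VALUES (6577, 'Customer 6577', 'Hernandez')")

# Deletion: removing customer 6577 no longer erases Hernandez's region.
conn.execute("DELETE FROM SALES1 WHERE CUST_NO = 6577")
hernandez_region = conn.execute(
    "SELECT REGION FROM SPERSON WHERE SALESPERSON = 'Hernandez'"
).fetchone()[0]

# Modification: reassigning Smith changes exactly one row.
rows_changed = conn.execute(
    "UPDATE SPERSON SET REGION = 'East' WHERE SALESPERSON = 'Smith'"
).rowcount

print(hernandez_region, rows_changed)
```

The foreign key on SALES1.SALESPERSON also means a customer cannot be assigned to a salesperson who is not listed in SPERSON.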

Transitive dependency between sets of attributes
A transitive dependency can also occur between sets of attributes in a relation.
E.g. SHIPMENT (SNUM, ORIGIN, DESTINATION, DISTANCE)
Functional dependencies:
- SNUM -> ORIGIN, DESTINATION, DISTANCE
- ORIGIN, DESTINATION -> DISTANCE

- Identify the insertion anomaly
- Identify the deletion anomaly
- Identify the modification anomaly

Relations in 3NF
- SHIPMENT1 (SNUM, ORIGIN, DESTINATION)
- OD_DISTANCE (ORIGIN, DESTINATION, DISTANCE)
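A simplified 3NF check can confirm that SHIPMENT has a violation and that the two decomposed relations do not. The sketch below inspects only the functional dependencies listed on the slides (not every dependency they imply) and takes the key attributes of each relation as given; `third_nf_violations` is a name made up for this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, fds, key_attrs):
    """List (determinant, nonkey attributes) pairs that break 3NF in one relation.

    Simplification: only the FDs passed in are examined, and key_attrs must
    contain every attribute that belongs to some candidate key.
    """
    violations = []
    for lhs, rhs in fds:
        if not lhs <= attributes:
            continue  # determinant is not part of this relation
        if closure(lhs, fds) >= attributes:
            continue  # determinant is a superkey, which 3NF allows
        offending = (rhs & attributes) - lhs - key_attrs
        if offending:
            violations.append((lhs, offending))
    return violations


FDS = [
    (frozenset({"SNUM"}), frozenset({"ORIGIN", "DESTINATION", "DISTANCE"})),
    (frozenset({"ORIGIN", "DESTINATION"}), frozenset({"DISTANCE"})),
]

shipment = {"SNUM", "ORIGIN", "DESTINATION", "DISTANCE"}
shipment1 = {"SNUM", "ORIGIN", "DESTINATION"}
od_distance = {"ORIGIN", "DESTINATION", "DISTANCE"}

print(third_nf_violations(shipment, FDS, {"SNUM"}))
print(third_nf_violations(shipment1, FDS, {"SNUM"}))
print(third_nf_violations(od_distance, FDS, {"ORIGIN", "DESTINATION"}))
```

Only the first call reports a violation: in SHIPMENT, the pair (ORIGIN, DESTINATION) determines DISTANCE without being a key.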

ER Model and Third Normal Form (3NF)
In general, if we start from a "good" ER model and convert it to relation schemas according to the transformation rules, the resulting relations are in 3NF.

Additional Normal Forms
Relations in third normal form are sufficient for most practical database applications. However, 3NF does not guarantee that all anomalies have been removed. There are additional normal forms to remove them:
- Boyce-Codd Normal Form
- Fourth Normal Form
- Fifth Normal Form
- Domain Key Normal Form

Steps in Normalization
Table with repeating groups
  -> Remove repeating groups
First normal form (1NF)
  -> Remove partial dependencies
Second normal form (2NF)
  -> Remove transitive dependencies
Third normal form (3NF)
  -> Remove remaining anomalies resulting from functional dependencies
Boyce-Codd normal form (BCNF)
  -> Remove multivalued dependencies
Fourth normal form (4NF)
  -> Remove remaining anomalies
Fifth normal form (5NF)