RELATIONAL TABLE NORMALIZATION. Key Concepts Guidelines for Primary Keys Deletion anomaly Update anomaly Insertion anomaly Functional dependency Transitive.


RELATIONAL TABLE NORMALIZATION

Key Concepts
- Guidelines for primary keys
- Deletion anomaly
- Update anomaly
- Insertion anomaly
- Functional dependency
- Transitive dependency

Key Concepts (cont’d)
- Multivalued dependency
- First normal form (1NF)
- Second normal form (2NF)
- Third normal form (3NF)
- Fourth normal form (4NF)
- Domain-key normal form (DKNF)

Guidelines for Primary Keys
- Guideline 1: The domain of the primary key should be large enough to identify unique rows for the next 100 years.
- Guideline 2: Primary keys should be a unique, random collection of alphabetic, numeric, or alphanumeric characters.

Guidelines for Primary Keys (cont’d)
- Guideline 3: Avoid using smart keys. Primary keys should not contain “fact-giving” data. If these facts are necessary, they should be entity attributes.
- Guideline 4: Use the suffix ID in constructing primary key names (CUST_ID, Vendor_ID, etc.).
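
Guidelines 1 through 3 can be illustrated with a minimal Python sketch. The key length (12), the alphabet (uppercase letters and digits), and the helper name new_key are assumptions chosen for illustration; none of them comes from the slides. With 36 symbols and 12 positions the domain holds roughly 4.7 × 10^18 keys, and the key carries no facts about the row it identifies.

```python
import secrets
import string

# Assumed alphabet: 26 uppercase letters + 10 digits = 36 symbols.
ALPHABET = string.ascii_uppercase + string.digits

def new_key(existing, length=12):
    """Return a random alphanumeric key that is not already in `existing`."""
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if key not in existing:      # enforce uniqueness (Guideline 2)
            existing.add(key)
            return key

used = set()                # keys issued so far
cust_id = new_key(used)     # random value with no embedded facts (Guideline 3)
```

In a real database the uniqueness check would normally be left to the primary key constraint rather than an in-memory set.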

Data Anomalies
- Definition of “anomaly”:
  - Deviation or departure from the normal or common order, form, or rule.
  - An item that is peculiar, irregular, abnormal, or difficult to classify.

Deletion Anomaly
- Occurs when the removal of a record results in a loss of important information.
- For example, if all the information about a customer is contained in the ORDER table, deleting an order also deletes customer information.
- See the Recycled Tractor problem for an example.
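
A small Python sketch of the ORDER example may help. The column names and the sample rows are invented for illustration and are not taken from the Recycled Tractor problem.

```python
# Hypothetical flat ORDER table: customer facts exist only on order rows.
orders = [
    {"order_id": 1, "cust_name": "Avery", "cust_phone": "555-0101", "item": "plow"},
    {"order_id": 2, "cust_name": "Blake", "cust_phone": "555-0102", "item": "disc"},
    {"order_id": 3, "cust_name": "Avery", "cust_phone": "555-0101", "item": "seeder"},
]

def known_customers(rows):
    """Customers the database still knows anything about."""
    return {r["cust_name"] for r in rows}

before = known_customers(orders)
orders = [r for r in orders if r["order_id"] != 2]   # delete Blake's only order
after = known_customers(orders)

lost = before - after   # customers whose data vanished along with the order
```

Here deleting one order removes every trace of the customer Blake, which is the deletion anomaly. Keeping customers in their own table would avoid it.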

Update Anomaly
- Occurs when changing a single attribute value requires changes to multiple records, when a change to only one record in the database should be necessary.
- Example: an evaluator at Recycled Tractor changes his/her cell phone number.
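
The evaluator example can be sketched in Python as follows. The LEAD rows, the column names, and the phone numbers are hypothetical.

```python
# Hypothetical LEAD rows that repeat the evaluator's cell number on every lead.
leads = [
    {"lead_id": 1, "evaluator": "Casey", "evaluator_cell": "555-0111"},
    {"lead_id": 2, "evaluator": "Casey", "evaluator_cell": "555-0111"},
    {"lead_id": 3, "evaluator": "Drew", "evaluator_cell": "555-0122"},
]

def rows_to_change(rows, evaluator):
    """How many records must be edited to change one evaluator's number."""
    return sum(1 for r in rows if r["evaluator"] == evaluator)

def inconsistent(rows):
    """Evaluators recorded with more than one cell number."""
    seen = {}
    for r in rows:
        seen.setdefault(r["evaluator"], set()).add(r["evaluator_cell"])
    return {name for name, cells in seen.items() if len(cells) > 1}

needed = rows_to_change(leads, "Casey")     # 2 rows for one real-world change
leads[0]["evaluator_cell"] = "555-0199"     # update only one of them
conflicts = inconsistent(leads)             # Casey now has two numbers
```

Missing even one of the repeated rows leaves the data contradicting itself, which is why the number belongs in a single EVALUATOR record.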

Insertion Anomaly
- Occurs when there does not appear to be any reasonable place to assign attributes and attribute values to records in a database.
- Two types of insertion anomalies:
  - Type 1: Adding new attributes to a record
  - Type 2: Updating only part of a record

Insertion Anomaly (cont’d)
- Type 1 example:
  - Adding a Recycled Tractor evaluator’s home address and phone number to the database

Insertion Anomaly (cont’d)
- Type 2 example:
  - Essence of the insertion anomaly problem: when to enter values into the database
  - Assign the new Recycled Tractor evaluator to a new dummy lead
  - Or add the new evaluator to all records in the LEAD database, which can result in lots of null values
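
The Type 2 case can be sketched with Python's built-in sqlite3 module. The table layout, the evaluator name, and the DUMMY-1 key are assumptions made for this illustration. The new evaluator has no lead yet, so there is no legitimate row to put them in.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical unnormalized LEAD table that also stores the evaluator.
con.execute(
    """CREATE TABLE lead (
           lead_id        TEXT NOT NULL PRIMARY KEY,
           lead_name      TEXT,
           evaluator_name TEXT)"""
)

# A new evaluator with no lead: there is no key value to store them under.
try:
    con.execute(
        "INSERT INTO lead (lead_id, lead_name, evaluator_name) "
        "VALUES (NULL, NULL, 'Emery')"
    )
    inserted = True
except sqlite3.IntegrityError:
    inserted = False        # the NOT NULL key rejects the row

# The workaround from the slide: invent a dummy lead to hold the evaluator.
con.execute("INSERT INTO lead VALUES ('DUMMY-1', NULL, 'Emery')")
dummy_rows = con.execute(
    "SELECT COUNT(*) FROM lead WHERE lead_name IS NULL"
).fetchone()[0]
```

The dummy row works, but it stores a lead that does not exist and leaves null values behind. A separate EVALUATOR table removes the need for it.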

Eliminating Data Anomalies
- Normalization facilitates the removal of data anomalies.
- Basic rule of normalization:
  - The attribute values in a relational table should be functionally dependent on the primary key value.

Eliminating Data Anomalies (cont’d)
- Corollaries to the basic rule:
  - No repeating groups are allowed in relational tables.
  - A relational table cannot have attributes involved in a transitive dependency with the primary key.

Eliminating Data Anomalies (cont’d)
- The different types of dependencies are critical to understanding and executing the normalization process.
- One of the primary responsibilities of the database designer is to formalize data relationships by identifying the dependencies among the attributes.

Functional Dependency
- A functionally dependent relationship exists between two attributes when the value of one attribute implies or determines the value of the other attribute.
- Example: the value of LEAD_NAME determines the value of LEAD_BANK in the Recycled Tractor problem.
- A functional dependency can be reciprocal:
  - Social Security # and name of a person (reciprocal only if names are unique in the data)
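
A functional dependency can be tested against sample rows with a few lines of Python. The function name fd_holds and the sample values are hypothetical; only the attribute names LEAD_NAME and LEAD_BANK come from the slide. Sample data can refute a dependency but cannot prove it, because the dependency is a rule about all possible rows.

```python
def fd_holds(rows, determinant, dependent):
    """True if every determinant value maps to a single dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

leads = [
    {"LEAD_NAME": "Hill Farm", "LEAD_BANK": "First Rural"},
    {"LEAD_NAME": "Oak Ranch", "LEAD_BANK": "First Rural"},
    {"LEAD_NAME": "Hill Farm", "LEAD_BANK": "First Rural"},
]

forward = fd_holds(leads, ["LEAD_NAME"], ["LEAD_BANK"])   # name determines bank
reverse = fd_holds(leads, ["LEAD_BANK"], ["LEAD_NAME"])   # bank does not determine name
```

In this sample the dependency holds in one direction only, so it is not reciprocal.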

Transitive Dependency (TD)
- Occurs when a nonkey attribute value is functionally dependent on another nonkey attribute value that is not a candidate key.

Transitive Dependency (cont’d)
- Example:
  - EMPLOYEE (EMPLOYEE_ID, JOB_CATEGORY, HOURLY_RATE)
  - If JOB_CATEGORY = SUPERVISOR, then HOURLY_RATE is $25.00 per hour
  - If JOB_CATEGORY = WELDER, then HOURLY_RATE is $18.00 per hour
- HOURLY_RATE is dependent on JOB_CATEGORY, which is in turn dependent on EMPLOYEE_ID.
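
One way to remove this transitive dependency is to split the table in two. The sketch below uses the attribute names and rates from the example; the employee IDs and the name of the second table (job_rate) are assumptions.

```python
# Unnormalized EMPLOYEE rows: the rate is repeated for every welder.
employees = [
    {"EMPLOYEE_ID": "E1", "JOB_CATEGORY": "SUPERVISOR", "HOURLY_RATE": 25.00},
    {"EMPLOYEE_ID": "E2", "JOB_CATEGORY": "WELDER", "HOURLY_RATE": 18.00},
    {"EMPLOYEE_ID": "E3", "JOB_CATEGORY": "WELDER", "HOURLY_RATE": 18.00},
]

# EMPLOYEE keeps only attributes that depend directly on EMPLOYEE_ID.
employee = [
    {"EMPLOYEE_ID": r["EMPLOYEE_ID"], "JOB_CATEGORY": r["JOB_CATEGORY"]}
    for r in employees
]

# JOB_RATE stores each rate once, keyed by JOB_CATEGORY.
job_rate = {r["JOB_CATEGORY"]: r["HOURLY_RATE"] for r in employees}

# Joining the two tables reproduces the original rows, so nothing was lost.
rejoined = [dict(e, HOURLY_RATE=job_rate[e["JOB_CATEGORY"]]) for e in employee]
```

After the split, changing the welder rate means editing one entry in job_rate instead of every welder's row.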

Multi-Valued Dependency (MVD)
- Results from having multiple values for a particular attribute.
- Three types of MVDs:
  - Simple
  - Independent
  - Transitive

Simple MVD
- Similar to 1:N cardinality (one to many)
- Most common type of MVD
- Examples:
  - A student can register for many courses
  - One LEAD_ID value is associated with many TRACTOR_ID values (a multivalued determination rather than a functional one)

Independent and Transitive MVDs
- Both types involve three or more attributes.
- Usually eliminated by the first three normal forms.

First Normal Form (1NF)
- A relational table is in first normal form if no attributes form repeating groups.
- Repeating group attributes are removed by creating another table.
- In the Recycled Tractor problem, tractor attributes are removed from LEAD and placed in the TRACTOR table, and evaluator attributes are placed in the EVALUATOR table.
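
Removing a repeating group can be sketched in Python. The attribute names and values below are hypothetical stand-ins for the Recycled Tractor data, which this transcript does not show.

```python
# Hypothetical LEAD rows with a repeating group: several tractors per lead.
flat_leads = [
    {"LEAD_ID": "L1", "LEAD_NAME": "Hill Farm", "TRACTORS": ["T10", "T11"]},
    {"LEAD_ID": "L2", "LEAD_NAME": "Oak Ranch", "TRACTORS": ["T12"]},
]

# LEAD keeps one atomic value per attribute.
lead = [
    {"LEAD_ID": r["LEAD_ID"], "LEAD_NAME": r["LEAD_NAME"]}
    for r in flat_leads
]

# TRACTOR gets one row per tractor, with LEAD_ID as a foreign key back to LEAD.
tractor = [
    {"TRACTOR_ID": t, "LEAD_ID": r["LEAD_ID"]}
    for r in flat_leads
    for t in r["TRACTORS"]
]
```

Every value in both resulting tables is atomic, and a lead can now have any number of tractors without changing the table structure.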

Second Normal Form (2NF)
- A relational table is in second normal form when it is in first normal form and all nonkey attributes are functionally dependent on the entire primary key, not on just part of it.
- Only tables with concatenated (composite) keys can have a problem meeting the 2NF requirement.
- Does our new EVALUATOR table meet 2NF requirements?
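
A check for partial dependencies can be sketched in Python. The ASSIGNMENT table, its attributes, and its dependencies are invented for illustration and are not part of the Recycled Tractor problem. Each dependency is written as a (determinant, dependent) pair of tuples.

```python
def partial_dependencies(primary_key, fds):
    """Return dependencies of nonkey attributes on only part of the key."""
    key = frozenset(primary_key)
    partial = []
    for lhs, rhs in fds:
        nonkey = set(rhs) - key
        # `<` on frozensets means "proper subset": part of the key, not all of it.
        if frozenset(lhs) < key and nonkey:
            partial.append((tuple(lhs), tuple(sorted(nonkey))))
    return partial

# Hypothetical table ASSIGNMENT(LEAD_ID, EVALUATOR_ID, EVALUATOR_NAME, VISIT_DATE)
key = ("LEAD_ID", "EVALUATOR_ID")
fds = [
    (("LEAD_ID", "EVALUATOR_ID"), ("VISIT_DATE",)),   # depends on the whole key
    (("EVALUATOR_ID",), ("EVALUATOR_NAME",)),         # depends on part of the key
]

violations = partial_dependencies(key, fds)
single_key_ok = partial_dependencies(("EVALUATOR_ID",), fds[1:])
```

The second call shows why single-attribute keys are safe: a one-attribute key has no nonempty proper subset for an attribute to depend on.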

Third Normal Form (3NF)
- A relational table is in third normal form when
  - it is in second normal form
  - no attribute has a transitive dependency involving nonkey attributes
- In the Recycled Tractor problem, TRACKER_PHONE# is functionally dependent on TRACKER_NAME, which is functionally dependent on LEAD_ID.
- Boyce-Codd normal form adds the requirement that every determinant is also a candidate key.
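
The Boyce-Codd requirement can be checked mechanically with an attribute closure. The sketch below assumes a three-attribute LEAD table and writes TRACKER_PHONE# as TRACKER_PHONE; the function names are hypothetical.

```python
def closure(attrs, fds):
    """All attributes determined by `attrs` under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Nontrivial dependencies whose determinant is not a superkey."""
    everything = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= everything
    ]

LEAD = ("LEAD_ID", "TRACKER_NAME", "TRACKER_PHONE")
FDS = [
    (("LEAD_ID",), ("TRACKER_NAME",)),
    (("TRACKER_NAME",), ("TRACKER_PHONE",)),
]

key_closure = closure(("LEAD_ID",), FDS)    # LEAD_ID determines every attribute
violations = bcnf_violations(LEAD, FDS)     # TRACKER_NAME is not a key
```

The one dependency reported is the transitive dependency from the slide, so under these assumptions the table fails both 3NF and Boyce-Codd normal form until the tracker attributes move to their own table.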

Fourth Normal Form (4NF)
- A relational table is in fourth normal form when all multivalued dependencies have been removed.
- In most situations, normalizing tables to third normal form removes multivalued dependencies.
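
A multivalued dependency can be recognized in sample data: for each value of the determining attribute, every combination of the two independent sets of values must appear. The sketch below assumes a hypothetical table with STUDENT_ID, COURSE, and HOBBY columns; the function name and data are invented for illustration.

```python
from itertools import product

def mvd_holds(rows, x, y):
    """Check X ->> Y on sample rows: within each X group, the Y values and
    the remaining (Z) values must appear in every combination."""
    z = [a for a in rows[0] if a not in x and a not in y]
    groups = {}
    for r in rows:
        key = tuple(r[a] for a in x)
        pair = (tuple(r[a] for a in y), tuple(r[a] for a in z))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True

# One student, two courses, two hobbies: all four combinations are stored.
rows = [
    {"STUDENT_ID": "S1", "COURSE": c, "HOBBY": h}
    for c in ("DB", "OS")
    for h in ("chess", "golf")
]

full = mvd_holds(rows, ["STUDENT_ID"], ["COURSE"])
broken = mvd_holds(rows[:-1], ["STUDENT_ID"], ["COURSE"])

# The 4NF fix: store each independent fact in its own table.
student_course = {(r["STUDENT_ID"], r["COURSE"]) for r in rows}
student_hobby = {(r["STUDENT_ID"], r["HOBBY"]) for r in rows}
```

The combined table needs four rows to record two courses and two hobbies, while the two separate tables need two rows each and cannot fall out of step when a row is removed.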

Domain-Key Normal Form (DKNF)
- DKNF is a philosophy that focuses on developing themes for tables.
  - A student table contains attributes describing students.
- A relational table is in DKNF if every constraint on the table or file is the result of defining primary keys for a relational table and defining domains for the attributes.
- Examples of data constraints:
  - Edit rules for attributes
  - Relationships of attributes
  - Functional and multi-valued dependencies

Comments on Normalization
- The benefits of additional levels of normalization decrease rapidly after tables have been put in 3NF.
- The instances where higher-level normalization strategies are necessary are considered rare and theoretical.