This presentation was prepared for MIS 421 / MBA 575 at Western Washington University. Material in this presentation is drawn from Richard T. Watson, Data Management: Databases and Organization, 5th Ed., the instructor’s experience, and other sources as noted. Some items © 2006 John Wiley & Sons. All rights reserved.

Normalization MIS 421 Dr. Steven C. Ross Fall 2011

Normalization
- The initial approach to database design …
  - Takes data in a non-relational structure and “normalizes” it, the result being a properly structured database.
- Use normalization …
  - When you inherit someone else’s database
  - When you don’t take the time to model the database
  - To check your design

Functional Dependency
- Relationship between attributes in an entity
- Means that one or more attributes “determine” the value of another
  - If I know the value of A, then I can determine the value of B in the database: A → B
  - A is a determinant of B
- Multivalued dependency
  - If I know the value of A, then I can determine a set of values of B in the database: A →→ B
  - Suggests a 1:M or M:M relationship
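
The definition of A → B can be tested against sample data. The following is a minimal sketch, not part of the original slides; the function name `holds` and the sample rows are hypothetical. Note that sample data can only refute a functional dependency. Whether it truly holds is a business rule, not something the rows can prove.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # A determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows, loosely modeled on an order-line table.
rows = [
    {"PartNum": "AT94", "Description": "Iron", "NumOrdered": 11},
    {"PartNum": "DR93", "Description": "Gas Range", "NumOrdered": 1},
    {"PartNum": "DR93", "Description": "Gas Range", "NumOrdered": 2},
]

print(holds(rows, ["PartNum"], ["Description"]))  # True
print(holds(rows, ["PartNum"], ["NumOrdered"]))   # False
```

Here PartNum determines Description, but it does not determine NumOrdered, because part DR93 appears with two different quantities.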

Normal Forms
- A set of seven degrees of classification … higher is better
  - 1NF: first normal form
  - 2NF: second normal form
  - 3NF: third normal form
  - BCNF: Boyce-Codd normal form
  - 4NF: fourth normal form
  - 5NF: fifth normal form
  - DK/NF: domain key normal form

First Normal Form
- “A relation is in first normal form if and only if all columns are single-valued.” (p. 214)
- Only one value per attribute
  - Beware of attribute entries containing commas
  - Beware of multiple columns with similar names

Violation of 1NF*
ORDER
OrderNum  OrderDate   PartNum      NumOrdered
…         …/20/2003   AT94         11
…         …/20/2003   DR93, DW11   1, 1
…         …/21/2003   KL62         4
…         …/21/2003   KT03         2
…         …/23/2003   BV06, CD52   2, 4
…         …/23/2003   DR93         1
…         …/23/2003   KV29         2
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 145

Method to Achieve 1NF
1. Copy non-repeating data to each row occupied by repeating data.
2. PK of the new table is the old PK plus the identifier of the repeating data.
[Illustration: excerpt of the ORDER table (OrderNum, OrderDate, PartNum, NumOrdered)]
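
The two steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `to_1nf`, the order numbers, and the dates are hypothetical, and the repeating group is modeled as parallel lists.

```python
def to_1nf(rows, repeating):
    """Emit one output row per repeating-group entry, copying the other columns."""
    flat = []
    for row in rows:
        # Step 1: the non-repeating data is copied to every output row.
        fixed = {k: v for k, v in row.items() if k not in repeating}
        # zip pairs up the parallel lists (part numbers with quantities).
        for values in zip(*(row[col] for col in repeating)):
            flat.append({**fixed, **dict(zip(repeating, values))})
    return flat


# Hypothetical unnormalized rows: each order carries lists of parts and quantities.
orders = [
    {"OrderNum": 1, "OrderDate": "2003-10-20",
     "PartNum": ["AT94"], "NumOrdered": [11]},
    {"OrderNum": 2, "OrderDate": "2003-10-20",
     "PartNum": ["DR93", "DW11"], "NumOrdered": [1, 1]},
]

order_lines = to_1nf(orders, ["PartNum", "NumOrdered"])
for line in order_lines:
    print(line)

# Step 2: the new primary key is the old key plus the repeating-group identifier.
keys = [(line["OrderNum"], line["PartNum"]) for line in order_lines]
print(len(keys) == len(set(keys)))  # True
```

The second order expands into two rows, and the pair (OrderNum, PartNum) is unique across the result.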

Table in 1NF
ORDER
OrderNum  OrderDate   PartNum  NumOrdered
…         …/20/2003   AT94     11
…         …/20/2003   DR93     1
…         …/20/2003   DW11     1
…         …/21/2003   KL62     4
…         …/21/2003   KT03     2
…         …/23/2003   BV06     2
…         …/23/2003   CD52     4
…         …/23/2003   DR93     1
…         …/23/2003   KV29     2
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 146

Second Normal Form
- “A relation is in second normal form if and only if it is in first normal form, and all non-key attributes are dependent on the [entire] key.” (p. 215)
  - If the key is a single field, then relation is in 2NF.
  - If the key is a composite of multiple fields, then attributes cannot be dependent on only a single field.
- Violated when a non-key column is dependent on a component of the primary key.
  - If key is combination of A and B, but B → C, then relation violates 2NF.

Violation of 2NF*
ORDER
OrderNum  OrderDate   PartNum  Description     NumOrdered  QuotedPrice
…         …/20/2003   AT94     Iron            11          $…
…         …/20/2003   DR93     Gas Range       1           $…
…         …/20/2003   DW11     Washer          1           $…
…         …/21/2003   KL62     Dryer           4           $…
…         …/21/2003   KT03     Dishwasher      2           $…
…         …/23/2003   BV06     Home Gym        2           $…
…         …/23/2003   CD52     Microwave Oven  4           $…
…         …/23/2003   DR93     Gas Range       1           $…
…         …/23/2003   KV29     Treadmill       2           $1,…
Can you predict problems with this table?
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 147

Dependency Diagram*
[Diagram labels: Normal Dependencies; Partial Dependencies]
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 148

Method to Achieve 2NF
1. Begin a new table with each field and combination of fields in the PK.
   (OrderNum,
   (PartNum,
   (OrderNum, PartNum,
2. Place each of the other columns with its appropriate PK.
   (OrderNum, OrderDate)
   (PartNum, Description)
   (OrderNum, PartNum, NumOrdered, QuotedPrice)
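
As a rough sketch of the result of step 2, each new table can be obtained by projecting the 1NF rows onto its columns and dropping duplicate rows. The helper `project`, the order numbers, the dates, and the prices below are hypothetical and serve only as an illustration.

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicate rows but keeping order."""
    seen = set()
    result = []
    for row in rows:
        item = tuple(row[c] for c in columns)
        if item not in seen:
            seen.add(item)
            result.append(dict(zip(columns, item)))
    return result


# Hypothetical 1NF rows keyed on (OrderNum, PartNum); prices are made up.
order_rows = [
    {"OrderNum": 1, "OrderDate": "2003-10-20", "PartNum": "AT94",
     "Description": "Iron", "NumOrdered": 11, "QuotedPrice": 20.0},
    {"OrderNum": 2, "OrderDate": "2003-10-20", "PartNum": "DR93",
     "Description": "Gas Range", "NumOrdered": 1, "QuotedPrice": 500.0},
    {"OrderNum": 3, "OrderDate": "2003-10-23", "PartNum": "DR93",
     "Description": "Gas Range", "NumOrdered": 1, "QuotedPrice": 500.0},
]

orders = project(order_rows, ["OrderNum", "OrderDate"])
parts = project(order_rows, ["PartNum", "Description"])
order_lines = project(order_rows, ["OrderNum", "PartNum", "NumOrdered", "QuotedPrice"])

print(len(orders), len(parts), len(order_lines))  # 3 2 3
```

The description of part DR93 is now stored once in `parts` instead of once per order line, which is the redundancy that the partial dependency caused.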

Tables in 2NF*
ORDER
OrderNum  OrderDate
…         …/20/2003
…         …/20/2003
…         …/21/2003
…         …/21/2003
…         …/23/2003
…         …/23/2003
…         …/23/2003
PART
PartNum  Description
AT94     Iron
BV06     Home Gym
CD52     Microwave Oven
DL71     Cordless Drill
DR93     Gas Range
DW11     Washer
KL62     Dryer
KT03     Dishwasher
KV29     Treadmill
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 149

Tables in 2NF*
ORDER_LINE
OrderNum  PartNum  NumOrdered  QuotedPrice
21608     AT94     11          $…
…         DR93     1           $…
…         DW11     1           $…
…         KL62     4           $…
…         KT03     2           $…
…         BV06     2           $…
…         CD52     4           $…
…         DR93     1           $…
…         KV29     2           $1,…
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 149

Third Normal Form
- “A relation is in third normal form if and only if it is in second normal form and has no transitive dependencies.” (p. 216)
- Violated when a non-key column is a fact about another non-key column.
  - A → B → C ⇒ A → C
  - If A is the [entire] key, then 3NF is violated because C can be determined by B, a non-key column
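
The rule "a non-key column is a fact about another non-key column" can be expressed as a small check over a declared list of dependencies. This is a simplified sketch that assumes a single candidate key, so that "non-key" simply means "not in the key". The function name is hypothetical, and the dependencies listed are an assumption about the customer and sales-rep columns used in the example slides.

```python
def transitive_dependencies(key, fds):
    """Return FDs whose determinant and dependents all lie outside the key."""
    key = set(key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not (set(lhs) | set(rhs)) & key
    ]


key = ["CustomerNum"]
# Assumed dependencies: the key determines the customer columns,
# and RepNum (a non-key column) determines the rep's name.
fds = [
    (("CustomerNum",), ("CustomerName", "Balance", "CreditLimit", "RepNum")),
    (("RepNum",), ("Lastname", "FirstName")),
]

print(transitive_dependencies(key, fds))
# [(('RepNum',), ('Lastname', 'FirstName'))]
```

The dependency that is reported is the one to move into its own table, with its determinant as the primary key.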

Violation of 3NF*
CUSTOMER
CustomerNum  CustomerName                Balance  CreditLimit  RepNum  Lastname  FirstName
148          Al’s Appliance              $6550    $7500        20      Kaiser    Valerie
282          Brookings Direct            $431     $…           …       Hull      Robert
356          Ferguson’s                  $5785    $7500        65      Perez     Juan
408          The Everything Shop         $5285    $5000        35      Hull      Richard
462          Bargains Galore             $3412    $…           …       Perez     Juan
524          Kline’s                     $12762   $…           …       Kaiser    Valerie
608          Johnson’s Department Store  $2106    $…           …       Perez     Juan
687          Lee’s Sport and Appliance   $2851    $5000        35      Hull      Richard
725          Deerfield’s Four Seasons    $248     $7500        35      Hull      Richard
842          All Season                  $8221    $7500        20      Kaiser    Valerie
Can you predict problems with this table?
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 150

Dependency Diagram*
[Diagram labels: Normal Dependencies; Non Key Dependencies]
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 152

Method to Achieve 3NF

Tables in 3NF*
CUSTOMER
CustomerNum  CustomerName                Balance  CreditLimit  RepNum
148          Al’s Appliance              $6550    $7500        20
282          Brookings Direct            $431     $…           …
356          Ferguson’s                  $5785    $7500        65
408          The Everything Shop         $5285    $5000        35
462          Bargains Galore             $3412    $…           …
524          Kline’s                     $12762   $…           …
608          Johnson’s Department Store  $2106    $…           …
687          Lee’s Sport and Appliance   $2851    $5000        35
725          Deerfield’s Four Seasons    $248     $7500        35
842          All Season                  $8221    $7500        20
REP
RepNum  Lastname  FirstName
20      Kaiser    Valerie
35      Hull      Robert
65      Perez     Juan
* Adapted from P.J. Pratt and J.J. Adamski, Concepts of Database Management, 4th Ed., p. 153

Boyce-Codd Normal Form
- “A relation is in Boyce-Codd Normal Form if and only if every determinant is a candidate key.” (p. 217)
- Only an issue when …
  - Relation has multiple candidate keys
  - Those keys are composite keys
  - The keys overlap … at least one column in common

Violation of BCNF*
STUDENT_ADVISOR
SID  Major       Advisor   Maj_GPA
123  Physics     Hawking   …
…    Music       Mahler    …
…    Literature  Michener  …
…    Music       Bach      …
…    Physics     Hawking   3.5
Candidate Keys
- SID, Major
- SID, Advisor
Dependencies
- (SID, Major) → Advisor
- (SID, Major) → Maj_GPA
- (SID, Advisor) → Maj_GPA
- Advisor → Major
Can you predict problems with this table?
* Adapted from J.A. Hoffer, M.B. Prescott, and F.R. McFadden, Modern Database Management, 6th Ed., p. 589
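
The BCNF test can be run mechanically on the dependencies listed for STUDENT_ADVISOR. The sketch below uses the standard attribute-closure technique and flags any determinant that does not determine every attribute. Strictly speaking this tests for a superkey rather than a minimal candidate key, which is the usual working form of the rule. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole determinant is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if closure(lhs, fds) != set(attributes)]


attributes = ["SID", "Major", "Advisor", "Maj_GPA"]
fds = [
    (("SID", "Major"), ("Advisor",)),
    (("SID", "Major"), ("Maj_GPA",)),
    (("SID", "Advisor"), ("Maj_GPA",)),
    (("Advisor",), ("Major",)),
]

print(bcnf_violations(attributes, fds))
# [(('Advisor',), ('Major',))]
```

Only Advisor → Major is reported: Advisor is a determinant, but on its own it determines neither SID nor Maj_GPA.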

Method to Achieve BCNF
1. The determinant that is not a candidate key becomes a component of the primary key of the revised table.
   Student_Advisor (SID, Advisor, Major_GPA)
2. Create a new table containing all the columns from the old table that depend on this determinant.
   Advisor (Major)
3. Make the determinant the primary key of this new table.
   Advisor (Advisor, Major)

Tables in BCNF*
STUDENT_ADVISOR
SID  Advisor   Maj_GPA
123  Hawking   …
…    Mahler    …
…    Michener  …
…    Bach      …
…    Hawking   3.5
ADVISOR
Advisor   Major
Hawking   Physics
Mahler    Music
Michener  Literature
Bach      Music
* Adapted from J.A. Hoffer, M.B. Prescott, and F.R. McFadden, Modern Database Management, 6th Ed., p. 591

Fourth Normal Form
- “A relation is in fourth normal form if it is in Boyce-Codd normal form and all multi-valued dependencies on the relation are functional dependencies.” (p. 219)
- When …
  - A →→ B
  - A →→ C
  - And there is no dependency between B and C
  - And A, B, and C are in same table, relation is not in 4NF
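
For a three-column table, the condition A →→ B (and therefore A →→ C) can be checked directly: for every value of A, the (B, C) pairs present must be exactly all combinations of that A's B values and C values. This is a minimal sketch with a hypothetical function name and hypothetical sample rows.

```python
from itertools import product


def mvd_holds(rows):
    """rows are (a, b, c) triples; test a ->> b (equivalently a ->> c)."""
    groups = {}
    for a, b, c in rows:
        bs, cs, pairs = groups.setdefault(a, (set(), set(), set()))
        bs.add(b)
        cs.add(c)
        pairs.add((b, c))
    # The pairs seen for each a must be the full cross product of its b and c values.
    return all(pairs == set(product(bs, cs)) for bs, cs, pairs in groups.values())


# Hypothetical rows: (Course, Instructor, Textbook).
independent = [
    ("Finance", "Gray", "Jones"),
    ("Finance", "Gray", "Chang"),
]
# Adding one instructor with only one of the two textbooks breaks the pattern.
dependent = independent + [("Finance", "Smith", "Jones")]

print(mvd_holds(independent))  # True
print(mvd_holds(dependent))    # False
```

When the check returns True and B and C are unrelated facts about A, the table stores every combination redundantly, which is the situation 4NF is meant to remove.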

Violation of 4NF*
Course      Instructor           Textbook
Management  White, Green, Black  Drucker, Peters
Finance     Gray                 Jones, Chang
- A course has multiple instructors
- A course uses multiple textbooks
- All instructors use the same textbooks
OFFERING
Course      Instructor  Textbook
Management  White       Drucker
Management  White       Peters
Management  Green       Drucker
Management  Green       Peters
Management  Black       Drucker
Management  Black       Peters
Finance     Gray        Jones
Finance     Gray        Chang
Can you predict problems with this table?
* Adapted from J.A. Hoffer, M.B. Prescott, and F.R. McFadden, Modern Database Management, 6th Ed., p. 593

Method to Achieve 4NF
1. Divide the relation into two new relations. Each relation contains the two attributes that have a multi-valued relationship in the original relation.
   Teacher (Course, Instructor)
   Text (Course, Textbook)
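
The split above loses no information: joining the two new relations on Course gives back the original rows. A short sketch using the OFFERING rows from the example, with the relations modeled as plain Python tuples (the variable names are hypothetical):

```python
offering = [
    ("Management", "White", "Drucker"),
    ("Management", "White", "Peters"),
    ("Management", "Green", "Drucker"),
    ("Management", "Green", "Peters"),
    ("Management", "Black", "Drucker"),
    ("Management", "Black", "Peters"),
    ("Finance", "Gray", "Jones"),
    ("Finance", "Gray", "Chang"),
]

# Project onto (Course, Instructor) and (Course, Textbook), dropping duplicates.
teacher = sorted({(course, instructor) for course, instructor, _ in offering})
text = sorted({(course, textbook) for course, _, textbook in offering})

# Natural join of the two projections on Course.
rejoined = {
    (course, instructor, textbook)
    for course, instructor in teacher
    for text_course, textbook in text
    if course == text_course
}

print(len(teacher), len(text))    # 4 4
print(rejoined == set(offering))  # True
```

Eight rows of three columns become two tables of four rows each, and adding an instructor to a course now takes one new row instead of one row per textbook.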

Tables in 4NF*
TEACHER
Course      Instructor
Management  White
Management  Green
Management  Black
Finance     Gray
TEXT
Course      Textbook
Management  Drucker
Management  Peters
Finance     Jones
Finance     Chang
* Adapted from J.A. Hoffer, M.B. Prescott, and F.R. McFadden, Modern Database Management, 6th Ed., p. 593

Fifth Normal Form
“Fifth normal form concerns dependencies that are rather obscure. It has to do with relations that can be divided into subrelations... but then cannot be reconstructed. The condition under which this situation arises has no clear intuitive meaning. We do not know what the consequences of such dependencies are or even if they have any practical consequences.”
D. M. Kroenke, Database Processing, 9th Ed, p. 133

My Solution to the 5NF Example

Skill Builder*
You have been given a spreadsheet that contains the details of invoices. The column headers for the spreadsheet are date, invoice number, invoice amount, invoice tax, invoice total, cust number, cust name, cust street, cust city, cust state, cust postal code, cust nation, product code, product price, product quantity, salesrep number, salesrep first name, salesrep last name, salesrep district, district name, and district size (number of salesreps). A single invoice can contain many products. Sales tax varies by salesrep (each rep has a specific tax rate in his or her city). Create a 3NF data model.
* Adapted from Richard T. Watson, Data Management: Databases and Organization, 4th Ed., p. 210

The Answer

Next Lecture The Relational Model and Relational Algebra