Bordoloi Database Design: Normalization Dr. Bijoy Bordoloi.


Data Normalization
Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.
The process of decomposing relations with anomalies to produce smaller, well-structured relations.

Results of Normalization
Removes the following modification anomalies (integrity errors) from the database:
– Insertion
– Deletion
– Update

ANOMALIES
Insertion
– Inserting one fact in the database requires knowledge of other facts unrelated to the fact being inserted
Deletion
– Deleting one fact from the database causes loss of other unrelated data from the database
Update
– Updating the values of one fact requires multiple changes to the database

ANOMALIES EXAMPLES
TABLE: COURSE
COURSE#  SECTION#  C_NAME
CIS Database Design CIS CIS Oracle Forms CIS Database Design

ANOMALIES EXAMPLES
Insertion: Suppose our university has approved a new course called CIS563: SQL & PL/SQL. Can this information about the new course be entered (inserted) into the table COURSE in its present form?
COURSE#  SECTION#  C_NAME
CIS Database Design CIS CIS Oracle Forms CIS Database Design

ANOMALIES EXAMPLES
Deletion: Suppose not enough students enrolled for the course CIS570, which had only one section, 072. So the school decided to drop this section and delete section# 072 for CIS570 from the table COURSE. But then, what other relevant info also got deleted in the process?
COURSE#  SECTION#  C_NAME
CIS Database Design CIS CIS Oracle Forms CIS Database Design

ANOMALIES EXAMPLES
Update: Suppose the course name (C_NAME) for CIS564 is changed to Database Management. How many times do you have to make this change in the COURSE table in its current form?
COURSE#  SECTION#  C_NAME
CIS Database Design CIS CIS Oracle Forms CIS Database Design

ANOMALIES
So, a table (relation) is a stable (‘good’) table only if it is free from any of these anomalies at any point in time.
You have to ensure that each and every table in a database is always free from these modification anomalies. And, how do you ensure that?
‘Normalization’ theory helps.
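The three anomalies can be reproduced on a small copy of the COURSE table. The following is a minimal sketch using Python's built-in sqlite3 module. It assumes lower-case column names (course_no, section_no, c_name) and made-up section numbers for CIS564; only the course numbers, section 072 of CIS570 and the course names come from the slides.

```python
import sqlite3

# In-memory copy of the un-normalized COURSE table.
# Assumption: section numbers 071/072 for CIS564 are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    " course_no  TEXT NOT NULL,"
    " section_no TEXT NOT NULL,"
    " c_name     TEXT NOT NULL,"
    " PRIMARY KEY (course_no, section_no))"
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [
        ("CIS564", "071", "Database Design"),
        ("CIS564", "072", "Database Design"),
        ("CIS570", "072", "Oracle Forms"),
    ],
)

# Insertion anomaly: CIS563 has no section yet, so part of the key is missing.
try:
    conn.execute("INSERT INTO course VALUES ('CIS563', NULL, 'SQL & PL/SQL')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Update anomaly: renaming CIS564 touches one row per section.
rows_changed = conn.execute(
    "UPDATE course SET c_name = 'Database Management' "
    "WHERE course_no = 'CIS564'"
).rowcount

# Deletion anomaly: dropping the only section of CIS570 erases the course.
conn.execute(
    "DELETE FROM course WHERE course_no = 'CIS570' AND section_no = '072'"
)
cis570_rows_left = conn.execute(
    "SELECT COUNT(*) FROM course WHERE course_no = 'CIS570'"
).fetchone()[0]
```

In this sketch the insert is rejected, the rename changes two rows, and no trace of CIS570 remains after its only section is deleted.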

NORMAL FORMS
1NF
2NF
3NF
BCNF (Boyce-Codd Normal Form)
4NF
5NF
DK (Domain-Key) NF

Relationships of Normal Forms
(Diagram relating First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF), Fifth Normal Form (5NF) and Domain/Key Normal Form (DK/NF).)

Functional Dependency
Relationship between columns X and Y such that, given the value of X, one can determine the value of Y. Written as X → Y.
i.e., for a given value of X we can obtain (or look up) a specific value of Y.
X is called the determinant of Y.
Y is said to be functionally dependent on X.

Functional Dependency
Example
– SOC_SEC_NBR → EMP_NME
– One and only one EMP_NME for a specific SOC_SEC_NBR
– SOC_SEC_NBR is the determinant of EMP_NME
– EMP_NME is functionally dependent on SOC_SEC_NBR
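A functional dependency can be tested against sample rows. The sketch below is illustrative only: the helper name `holds` and the employee values are assumptions, not part of the slides. Note that sample data can only refute a dependency; whether X → Y really holds is a business rule, not something a few rows can prove.

```python
def holds(rows, x, y):
    """Return True if the dependency x -> y is not violated in rows.

    rows is a list of dicts; x and y are lists of column names.
    """
    seen = {}
    for row in rows:
        x_value = tuple(row[col] for col in x)
        y_value = tuple(row[col] for col in y)
        # The first y seen for an x is remembered; a later, different y
        # for the same x violates the dependency.
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True


# Hypothetical employees: two different people share the same name.
employees = [
    {"SOC_SEC_NBR": "000-00-0001", "EMP_NME": "Pat Lee"},
    {"SOC_SEC_NBR": "000-00-0002", "EMP_NME": "Pat Lee"},
    {"SOC_SEC_NBR": "000-00-0003", "EMP_NME": "Sam Roy"},
]

ssn_determines_name = holds(employees, ["SOC_SEC_NBR"], ["EMP_NME"])
name_determines_ssn = holds(employees, ["EMP_NME"], ["SOC_SEC_NBR"])
```

Here SOC_SEC_NBR → EMP_NME is consistent with the sample, while EMP_NME → SOC_SEC_NBR is refuted by the two employees who share a name.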

1NF
A table is in 1NF if there are no repeating groups in the table. In other words, a table is in 1NF if all non-key fields are functionally dependent on the primary key (PK). That is, for each given value of PK, we always get only one value of the non-key field(s).
Is the following table COURSE in 1NF?
COURSE#  SECTION#  C_NAME
CIS Database Design CIS CIS Oracle Forms CIS Database Design

1NF
But, didn’t we just conclude that COURSE is a ‘bad’ table (the way it is structured) as it suffers from all the three anomalies we talked about? So, what’s the problem?
COURSE#  SECTION#  C_NAME
CIS Database Design CIS CIS Oracle Forms CIS Database Design

Partial Dependency
Occurs when a column in a table only depends on part of a concatenated key.
Example: COURSE (COURSE# + SECTION#, C_NAME)

2NF
C_Name only depends upon the Course#, not the Section#. It is partially dependent upon the primary key.
A table is in 2NF if it is in 1NF and has no partial dependencies.
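One way to look for partial dependencies in sample data is to test every proper subset of the concatenated key against each non-key column. This is a rough sketch; the function names and the CIS564 section numbers are assumptions made for illustration.

```python
from itertools import combinations


def determines(rows, x_cols, y_col):
    """True if no two rows agree on x_cols but differ on y_col."""
    seen = {}
    for row in rows:
        x_value = tuple(row[col] for col in x_cols)
        if seen.setdefault(x_value, row[y_col]) != row[y_col]:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (key_subset, column) pairs where part of the key suffices."""
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(key, size):
            for col in non_key:
                if determines(rows, subset, col):
                    found.append((subset, col))
    return found


# Assumption: sections 071/072 for CIS564 are hypothetical values.
course_rows = [
    {"COURSE#": "CIS564", "SECTION#": "071", "C_NAME": "Database Design"},
    {"COURSE#": "CIS564", "SECTION#": "072", "C_NAME": "Database Design"},
    {"COURSE#": "CIS570", "SECTION#": "072", "C_NAME": "Oracle Forms"},
]

violations = partial_dependencies(
    course_rows, key=("COURSE#", "SECTION#"), non_key=["C_NAME"]
)
```

With these rows the only violation reported is that COURSE# alone determines C_NAME, which is the partial dependency described on the slide.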

2NF
How do you resolve partial dependency?
Decompose the problematic table into smaller tables.
Must be a ‘loss-less’ decomposition. That is, you must be able to put the decomposed tables back together again to arrive at the original information.
Remember Foreign Keys!

2NF
COURSE
COURSE#  C_NAME
CIS564   Database Design
CIS570   Oracle Forms
OFFERED_COURSE
COURSE#  SECTION#
CIS CIS CIS CIS

2NF
Are the two (decomposed) tables COURSE and OFFERED_COURSE in 2NF?
Do these two tables have any modification anomalies?
– Can you now readily enter the info for the newly approved course CIS563?
– Can you now delete the section# 072 for CIS570 without losing the info that CIS570 exists?
– How many times do you have to change the name of a given course?
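The questions above can be tried out on the decomposed tables. The sketch below uses Python's sqlite3 module with assumed lower-case column names and hypothetical section numbers for CIS564; it first checks that joining the two tables recovers the original rows, then repeats the three modifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE course (
        course_no TEXT PRIMARY KEY,
        c_name    TEXT NOT NULL
    );
    CREATE TABLE offered_course (
        course_no  TEXT NOT NULL REFERENCES course(course_no),
        section_no TEXT NOT NULL,
        PRIMARY KEY (course_no, section_no)
    );
    """
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?)",
    [("CIS564", "Database Design"), ("CIS570", "Oracle Forms")],
)
# Assumption: sections 071/072 for CIS564 are hypothetical values.
conn.executemany(
    "INSERT INTO offered_course VALUES (?, ?)",
    [("CIS564", "071"), ("CIS564", "072"), ("CIS570", "072")],
)

# Loss-less check: the join gives back one row per offered section.
rejoined = conn.execute(
    "SELECT o.course_no, o.section_no, c.c_name "
    "FROM offered_course AS o JOIN course AS c USING (course_no) "
    "ORDER BY o.course_no, o.section_no"
).fetchall()

# Insertion: a course can exist before it has any section.
conn.execute("INSERT INTO course VALUES ('CIS563', 'SQL & PL/SQL')")

# Deletion: dropping a section leaves the course itself in place.
conn.execute(
    "DELETE FROM offered_course "
    "WHERE course_no = 'CIS570' AND section_no = '072'"
)
cis570_still_listed = conn.execute(
    "SELECT COUNT(*) FROM course WHERE course_no = 'CIS570'"
).fetchone()[0]

# Update: a course name is stored once, so one row changes.
rows_changed = conn.execute(
    "UPDATE course SET c_name = 'Database Management' "
    "WHERE course_no = 'CIS564'"
).rowcount
```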

Transitive Dependency
Table: Student-Dorm-Fee
SID  DORM    FEE
101  Oracle  1000
102  Oracle  1000
103  DB2     800
104  DB2     800
105  Sybase  500

Transitive Dependency
Is the table Student-Dorm-Fee in 2NF?
Does this table have any modification anomalies?
– Insertion?
– Deletion?
– Update?

Transitive Dependency
Occurs when a non-key attribute is functionally dependent on one or more non-key attributes.
Example: HOUSING (SID, DORM, FEE)
PRIMARY KEY: SID
FUNCTIONAL DEPENDENCIES:
SID → DORM
SID → FEE
DORM → FEE
A table is in 3NF if it is in 2NF and has no transitive dependencies.

3NF
Besides SID, FEE is also functionally dependent on DORM, which is a non-key attribute.
A table is in 3NF if it is in 2NF and has no transitive dependencies.
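A quick way to spot candidate transitive dependencies in sample data is to test each pair of non-key columns against each other. The sketch below uses the student, dorm and fee values that appear in the slides' tables; the function names are illustrative assumptions.

```python
def determines(rows, a, b):
    """True if no two rows share a value of column a but differ on b."""
    seen = {}
    return all(seen.setdefault(row[a], row[b]) == row[b] for row in rows)


def non_key_dependencies(rows, non_key):
    """List (a, b) pairs of non-key columns where a -> b is not violated."""
    return [
        (a, b)
        for a in non_key
        for b in non_key
        if a != b and determines(rows, a, b)
    ]


housing = [
    {"SID": 101, "DORM": "Oracle", "FEE": 1000},
    {"SID": 102, "DORM": "Oracle", "FEE": 1000},
    {"SID": 103, "DORM": "DB2", "FEE": 800},
    {"SID": 104, "DORM": "DB2", "FEE": 800},
    {"SID": 105, "DORM": "Sybase", "FEE": 500},
]

found = non_key_dependencies(housing, ["DORM", "FEE"])
```

The result includes DORM → FEE, the dependency the slide points out. It also includes FEE → DORM, but only because the three fees happen to be distinct in this sample; a data check of this kind suggests dependencies and cannot confirm them.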

3NF
How do you resolve transitive dependency?
Decompose the problematic table into smaller tables.
Must be a ‘loss-less’ decomposition. That is, you must be able to put the decomposed tables back together again to arrive at the original information.
Remember Foreign Keys!

3NF
DORM_FEE
DORM    FEE
Oracle  1000
DB2     800
Sybase  500
STUDENT_DORM
SID  DORM
101  Oracle
102  Oracle
103  DB2
104  DB2
105  Sybase

3NF
Are the two (decomposed) tables STUDENT_DORM and DORM_FEE in 2NF?
Are they in 3NF?
Do these two tables have any modification anomalies?
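The loss-less property of this decomposition can be checked directly. The sketch below starts from the rows obtained by joining STUDENT_DORM and DORM_FEE as given on the preceding slide, projects them into the two smaller tables, and joins them back. The variable names and the new fee of 1100 are assumptions for illustration.

```python
# Rows of the combined table (SID, DORM, FEE), taken from the slide values.
housing = [
    (101, "Oracle", 1000),
    (102, "Oracle", 1000),
    (103, "DB2", 800),
    (104, "DB2", 800),
    (105, "Sybase", 500),
]

# Projection onto (SID, DORM) and (DORM, FEE).
student_dorm = sorted({(sid, dorm) for sid, dorm, _fee in housing})
dorm_fee = {dorm: fee for _sid, dorm, fee in housing}

# Natural join on DORM puts the original information back together.
rejoined = sorted((sid, dorm, dorm_fee[dorm]) for sid, dorm in student_dorm)
lossless = rejoined == sorted(housing)

# Update: a fee change is now a single change, however many students
# live in the dorm. The value 1100 is a hypothetical new fee.
dorm_fee["Oracle"] = 1100
oracle_students = [sid for sid, dorm in student_dorm if dorm == "Oracle"]
```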

Data Analyst’s Oath
EVERY NON-KEY COLUMN IN A TABLE MUST BE FUNCTIONALLY DEPENDENT UPON THE ENTIRE KEY AND NOTHING BUT THE KEY!

Other Normal Forms
There are additional normal forms that address situations which do not often occur in actual practice. However, these situations can occur, so it is necessary to understand them. These are:
– Boyce-Codd Normal Form
– Fourth Normal Form
– Fifth Normal Form
We will deal with these normal forms if time allows. You must, however, fully understand 1st through 3rd NF.
Domain/Key normal form is a different approach and we will not deal with it in this course.

Relationships of Normal Forms
(Diagram relating First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF), Fifth Normal Form (5NF) and Domain/Key Normal Form (DK/NF).)

Normal Forms
– First Normal Form
No repeating groups in tables
– Second Normal Form
Table is in 1st normal form and has no partial key dependencies
– Third Normal Form
Table is in 2nd normal form and has no transitive dependencies

Normal Forms
– Boyce-Codd Normal Form
Every determinant of a non-key attribute is a candidate key
– Fourth Normal Form
A table has no multi-valued dependencies
– Fifth Normal Form
There are no lossy joins between two or more tables

Sample User View

First Normal Form
Remove the repeating groups and concatenate keys so that the original table can be recovered by joining tables.
ORD (ORD_NBR, ORD_DTE, ZIP_ADR, CUS_NBR, CUS_NME, STR_ADR, CTY_ADR, STT_ADR, SUB_TOT, FRT_AMT, TAX, TOT_AMT)
ORD_ITM (ORD_NBR, ITM_NBR, ITM_DSC, ORD_ITM_PRICE, ORD_QTY, AMOUNT)
...
What problems occur if the database is stored using first normal form?

Second Normal Form
Are these tables in 2nd NF?
In other words, are there any partial dependencies?
ORD (ORD_NBR, ORD_DTE, ZIP_ADR, CUS_NBR, CUS_NME, STR_ADR, CTY_ADR, STT_ADR, SUB_TOT, FRT_AMT, TAX, TOT_AMT)
ORD_ITM (ORD_NBR, ITM_NBR, ITM_DSC, ORD_ITM_PRICE, ORD_QTY, AMOUNT)
...

Second Normal Form
Remove any partial dependencies.
Are there any transitive dependencies?
Diagram tables: ORD, ORD_ITM, ITM
Column labels: ORD_NBR, ORD_DTE, ZIP_ADR, CUS_NBR, CUS_NME, STR_ADR, CTY_ADR, STT_ADR, SUB_TOT, FRT_AMT, TAX, TOT_AMT, AMOUNT, ORD_QTY, ORD_ITM_PRICE, ITM_DSC, ITM_NBR

Third Normal Form
Remove transitive dependencies.
Diagram tables: ORD, ORD_ITM, ITM, CUS
Column labels: ORD_NBR, ORD_DTE, ZIP_ADR, CUS_NBR, CUS_NME, STR_ADR, CTY_ADR, STT_ADR, SUB_TOT, FRT_AMT, TAX, TOT_AMT, AMOUNT, ORD_QTY, ORD_ITM_PRICE, ITM_DSC, ITM_NBR

Third Normal Form
Remove transitive dependencies.
Diagram labels: ORD, ORD_ITM, ITM, CUS, CITYSTATEZIP
Column labels: ORD_NBR, ORD_DTE, ZIP_ADR, CUS_NBR, CUS_NME, STR_ADR, SUB_TOT, FRT_AMT, TAX, TOT_AMT, AMOUNT, ORD_QTY, ORD_ITM_PRICE, ITM_DSC, ITM_NBR

DISCUSSION
Is the table Ord_Itm in 3NF?
How about the table ORD?

DISCUSSION
Is the table Ord_Itm in 3NF? Yes.
– There is mathematical dependence between Ord_Qty and Amount, NOT functional dependence!
How about the table ORD? No.
– In this table, however, there is functional dependence between the non-key attributes Tot_Amt and (Sub_Tot + Frt_Amt + Tax)

DERIVABLE DATA
Rule of thumb: Do NOT include derivable (computable) data in the baseline logical database design schema.
You may selectively include some derivable data in your design, mainly to enhance the performance of your application. That, however, is a physical database design issue (which we will be discussing soon).
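As an illustration of the rule of thumb, a total can be computed from its stored components when it is needed instead of being stored as a column. The sketch below is one possible way to express this in Python; the class name `Order` and the amounts are hypothetical, and only the relationship Tot_Amt = Sub_Tot + Frt_Amt + Tax comes from the slides.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    """Stored (non-derivable) fields of an order; amounts are made up."""

    ord_nbr: int
    sub_tot: Decimal
    frt_amt: Decimal
    tax: Decimal

    @property
    def tot_amt(self) -> Decimal:
        # Derived on demand, so it can never disagree with its parts.
        return self.sub_tot + self.frt_amt + self.tax


order = Order(
    ord_nbr=1,
    sub_tot=Decimal("100.00"),
    frt_amt=Decimal("7.50"),
    tax=Decimal("8.25"),
)
total_before = order.tot_amt

# Changing one component needs no second update to keep the total right.
order.frt_amt = Decimal("10.00")
total_after = order.tot_amt
```

Storing the total as well would trade this consistency for read speed, which is the physical design decision the slide refers to.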

Third Normal Form
Remove transitive dependencies.
Diagram labels: ORD, ORD_ITM, ITM, CUS, CITYSTATEZIP
Column labels: ORD_NBR, ORD_DTE, ZIP_ADR, CUS_NBR, CUS_NME, STR_ADR, SUB_TOT, FRT_AMT, TAX, TOT_AMT, AMOUNT, ORD_QTY, ORD_ITM_PRICE, ITM_DSC, ITM_NBR
Legend: Derivable Fields

QUESTION
Should an ERD be normalized for relational database design purposes?

DISCUSSION
Non-normalized ERD
– User-oriented
– Good for capturing/communicating the semantics of the database application
Normalized ERD
– Implementation-oriented
– Can be used to directly define the database structure