Week 6 Lecture Normalization


CSE2132 Database Systems
Week 6 Lecture: Normalization
Normalization 6.1

Week 5 lecture review: Logical Database Design Steps
1. The conceptual model (ER diagram) is mapped onto a logical model that depends on the DBMS characteristics.
2. De-normalization (optimize for efficiency):
   - Combining tables to avoid doing joins
   - Creating more tables (horizontal and vertical partitioning)
   - Data replication (redundancy)
   - A combination of the above
Normalised relations solve data maintenance problems and minimise redundancy but, implemented as such as physical records, may not yield efficient data processing.
NB: Only use de-normalisation to gain explicit processing speed when other design actions are not sufficient!
Normalization 6.2

Goal of Relational Design
What relations (tables) should exist, and what attributes (columns) should they contain?
- Avoid redundancy if possible (minimize storage space)
- Avoid anomalies (data that does not make business sense)
- Avoid nulls
- Avoid joins that produce spurious (false) tuples (rows)
Normalization 6.3

Dependency Theory
"One truly scientific part of the field [of database design]" (Date, 5th ed., p. 325)
Relational database design: a mechanical approach to producing a database schema with certain desirable properties.
Following: a review of normal forms and the problems they solve.
Normalization 6.4

Data Normalization
- Normalization is a formal process for deciding which attributes should be grouped together.
- It is primarily a tool/technique to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.
- It provides a formal measure of why one grouping of attributes may be better than another.
- Each normal form requires a relation to satisfy that form's criteria, and each eliminates a different kind of redundancy.
- Database operations applied to unnormalized relations may lead to anomalies.
- Normalized relations remain consistent following database operations and store each fact only once.
Normalization 6.5

Assumptions
- A group of attributes has a natural "inherent" structure.
- This structure is independent of the way the data is used.
Normalization
- Introduced by E. F. Codd together with relational database theory.
- Codd originally defined three normal forms. These were later extended with Boyce-Codd normal form and the fourth and fifth normal forms.
Normalization 6.6

Anomalies
Consider the poorly structured relation ASSIGN:

Person_Id  Project_budget  Project_Id  Time_Spent_on_Project
S75        32              P1          7
S75        40              P2          8
S79        32              P1          4
S79        27              P3          1
S80        40              P2          5
-          17              P4          -

Null values are considered to be anomalies.
Normalization 6.7

Anomalies
Insertion anomaly: add tuple (ASSIGN, <S85,35,P1,9>) - two conflicting budgets for P1
Deletion anomaly: delete tuple (ASSIGN, <S79,27,P3,1>) - removes the project budget for P3
Normalization 6.8

Anomalies
Update anomaly: update tuple (ASSIGN, <S75,32,P1,7>, <S75,35,P1,7>)
This example tries to update the budget for P1, but P1 is also listed in the row with S79. The change therefore needs either multiple updates or leaves the potential for inconsistency.
Normalization 6.9
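
The three anomalies above can be reproduced with a small Python sketch. It is only an illustration: the rows come from slide 6.7, while the helper names (`budgets`, `conflicting`) and the tuple layout are assumptions made for this example.

```python
# ASSIGN rows from slide 6.7, laid out (for this sketch) as
# (person_id, project_budget, project_id, time_spent_on_project).
ASSIGN = [
    ("S75", 32, "P1", 7),
    ("S75", 40, "P2", 8),
    ("S79", 32, "P1", 4),
    ("S79", 27, "P3", 1),
    ("S80", 40, "P2", 5),
    (None, 17, "P4", None),  # project with nobody assigned: null values
]


def budgets(rows):
    """Map each project to the set of distinct budgets recorded for it."""
    result = {}
    for _person, budget, project, _time in rows:
        result.setdefault(project, set()).add(budget)
    return result


def conflicting(rows):
    """Return the projects that carry more than one budget value."""
    return {p: sorted(b) for p, b in budgets(rows).items() if len(b) > 1}


# Insertion anomaly: the new assignment brings a second budget for P1.
after_insert = ASSIGN + [("S85", 35, "P1", 9)]

# Deletion anomaly: removing the only P3 row also removes P3's budget.
after_delete = [r for r in ASSIGN if r != ("S79", 27, "P3", 1)]

# Update anomaly: only one of the two P1 rows is changed.
after_update = [
    ("S75", 35, "P1", 7) if r == ("S75", 32, "P1", 7) else r for r in ASSIGN
]

print(conflicting(ASSIGN))            # {}
print(conflicting(after_insert))      # {'P1': [32, 35]}
print("P3" in budgets(after_delete))  # False
print(conflicting(after_update))      # {'P1': [32, 35]}
```

In each case the problem arises because the project budget is stored once per assignment instead of once per project.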

Normalization and Functional Dependencies
Normalization is based on the analysis of functional dependencies.
Functional dependency = a constraint between two attributes or two sets of attributes.
Normalization 6.10

Functional Dependencies
The values of one set of attributes affect the values of another attribute.
- The value of X determines the value of Y.
- The value of Y is functionally dependent on the value of X.
- Y is a fact about X.
The simplest case is one attribute determining another single attribute. Often two or three attributes are needed to determine another single attribute.
X -> Y
Normalization 6.11
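
A functional dependency can be tested against sample data with a few lines of Python. This is a sketch: the function name `fd_holds` and the dict-per-row layout are assumptions for illustration, and the rows are the ASSIGN tuples from slide 6.7.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always imply equal rhs values.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


COLUMNS = ["Person_Id", "Project_budget", "Project_Id", "Time_Spent_on_Project"]
ASSIGN = [
    dict(zip(COLUMNS, values))
    for values in [
        ("S75", 32, "P1", 7),
        ("S75", 40, "P2", 8),
        ("S79", 32, "P1", 4),
        ("S79", 27, "P3", 1),
        ("S80", 40, "P2", 5),
        (None, 17, "P4", None),
    ]
]

print(fd_holds(ASSIGN, ["Project_Id"], ["Project_budget"]))  # True
print(fd_holds(ASSIGN, ["Person_Id"], ["Project_budget"]))   # False
print(fd_holds(ASSIGN, ["Person_Id", "Project_Id"], ["Time_Spent_on_Project"]))  # True
```

A check like this can only refute a dependency. A result of True means the sample rows are consistent with the dependency, not that the business rule is guaranteed to hold.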

Functional Dependencies
Referring to slide 6.7:
Project_Id -> Project_budget
Person_Id, Project_Id -> Time_Spent_on_Project
Alternative representation: functional dependency diagram (Project_Id -> Project_budget)
Normalization 6.12

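Task: Write down all the functional dependencies.
EMPLOYEE1(Emp_id, Name, birthdate, salary)
Answer: Emp_id -> Name, birthdate, salary
EMPLOYEE2(Emp_id, Course_id, Name, salary, date_completed)
Answer: Emp_id -> Name, salary; Emp_id, Course_id -> date_completed
Normalization 6.13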

First Normal Form (1NF)
A table is in 1NF if:
- it contains no repeating groups (i.e. no multi-valued attributes)
- every attribute is atomic (the relational model does not handle repeating groups)
The relationship between key and non-key fields will be one to one (1:1) or one to many (1:N).
Normalization 6.14

First Normal Form (Example)
Remove repeating groups: all occurrences in a relation must have the same number of fields.
Relation:
STUDENT(STUD#,SNAME(SUBCODE,TITLE,RESULT))
1NF relations:
STUDENT(STUD#,SNAME)
STUDENT-RESULT(STUD#,SUBCODE,TITLE,RESULT)
Normalization 6.15
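
Removing the repeating group can be sketched in Python as follows. The sample student rows and the function name `to_1nf` are hypothetical, invented only to illustrate the STUDENT / STUDENT-RESULT split.

```python
# Hypothetical unnormalized records: each student carries a repeating
# group of (SUBCODE, TITLE, RESULT) entries.
students = [
    {"stud_no": 101, "sname": "Lee",
     "results": [("CSE2132", "Database Systems", "D"),
                 ("CSE1301", "Programming", "C")]},
    {"stud_no": 102, "sname": "Ng",
     "results": [("CSE2132", "Database Systems", "HD")]},
]


def to_1nf(nested):
    """Split nested student records into two flat relations.

    Returns (student, student_result), where every row has a fixed
    number of atomic fields.
    """
    student, student_result = [], []
    for rec in nested:
        student.append((rec["stud_no"], rec["sname"]))
        for subcode, title, result in rec["results"]:
            # STUD# is repeated in each result row so the rows can be joined back.
            student_result.append((rec["stud_no"], subcode, title, result))
    return student, student_result


student, student_result = to_1nf(students)
print(student)              # [(101, 'Lee'), (102, 'Ng')]
print(len(student_result))  # 3
```

Every row in both outputs has the same number of fields, which is the property the slide asks for.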

Second Normal Form
A relation is in 2NF if:
- it is in 1NF, and
- every non-key attribute is fully functionally dependent on the whole key.
Problems with relations not in 2NF:
- repeated information
- update anomalies
- potential inconsistency
- delete anomalies
Normalization 6.16

Second Normal Form (Example)
Remove partial dependencies: a non-key attribute cannot be identified by part of a composite key.
ORDER-ITEM(ORDER#,ITEM#,DESC,QTY) becomes:
ORDER-ITEM(ORDER#,ITEM#,QTY)
ITEM(ITEM#,DESC)
Normalization 6.17

Anomalies due to Partial Dependencies
ORDER-ITEM

ORDER#  ITEM#  DESC    QTY
27      873    NUT     2
28      402    BOLT    1
28      873    NUT     10
30      495    WASHER  50

UPDATE - change DESC in many places
DELETE - data for ITEM is lost when ORDER is deleted
INSERT - cannot create a new ITEM until an ORDER requires that ITEM
Normalization 6.18

Solution to 2NF Anomalies
ORDER-ITEM (delete ORDER# 30 and WASHER still remains)

ORDER#  ITEM#  QTY
27      873    2
28      402    1
28      873    10
30      495    50

ITEM (add a new item at any time; update BOLT in one place only)

ITEM#  DESC
873    NUT
402    BOLT
495    WASHER

Normalization 6.19
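
The decomposition can be checked with a short Python sketch: project the original ORDER-ITEM rows onto the two new relations, join them back on ITEM#, and confirm nothing was lost. The variable names are assumptions for this example; the rows are those from slide 6.18.

```python
# Original ORDER-ITEM rows: (ORDER#, ITEM#, DESC, QTY).
ORDER_ITEM = [
    (27, 873, "NUT", 2),
    (28, 402, "BOLT", 1),
    (28, 873, "NUT", 10),
    (30, 495, "WASHER", 50),
]

# Projection onto the two 2NF relations.
order_item = [(o, i, q) for o, i, _desc, q in ORDER_ITEM]     # (ORDER#, ITEM#, QTY)
item = sorted({(i, d) for _o, i, d, _q in ORDER_ITEM})        # (ITEM#, DESC), one row per item

# Natural join on ITEM# rebuilds the original rows.
desc_of = dict(item)
rejoined = [(o, i, desc_of[i], q) for o, i, q in order_item]
print(rejoined == ORDER_ITEM)  # True

# Deleting order 30 no longer loses the WASHER description.
remaining_orders = [r for r in order_item if r[0] != 30]
print((495, "WASHER") in item)  # True

# A description is now updated in exactly one place.
print(sum(1 for i, _d in item if i == 873))  # 1
```

Because ITEM# determines DESC, the join reproduces exactly the original rows and creates no spurious tuples.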

Third Normal Form
A relation is in 3NF if:
- it is in 2NF, and
- it contains no transitive dependencies.
A functional dependency between two (or more) non-key attributes gives rise to a transitive dependency.
3NF is violated when a non-key field is a fact about another non-key field (that is, a functional dependency exists between them).
Problems with relations not in 3NF:
- as for 2NF
Normalization 6.20

Third Normal Form (Example)
The functional dependency between the non-key attributes DEPT# and DNAME gives rise to a transitive dependency (EMP# -> DNAME). Remove this transitive dependency.
Remove transitive dependencies: a non-key attribute cannot be identified by another non-key attribute.
EMPLOYEE(EMP#,ENAME,DEPT#,DNAME) becomes:
EMPLOYEE(EMP#,ENAME,DEPT#)
DEPARTMENT(DEPT#,DNAME)
EMP# -> DEPT# and DEPT# -> DNAME, therefore EMP# -> DNAME (transitively).
Normalization 6.21
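
The transitive step can be made mechanical by computing an attribute closure: start from a set of attributes and keep applying functional dependencies until nothing new is added. The sketch below is illustrative; the function name `closure` is an assumption, and the dependency EMP# -> ENAME is taken as implied by EMP# being the key.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs, each a tuple of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply the FD when its whole left side is known and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


FDS = [
    (("EMP#",), ("ENAME", "DEPT#")),  # EMP# is the key (ENAME assumed to depend on it)
    (("DEPT#",), ("DNAME",)),         # dependency between non-key attributes
]

print(sorted(closure({"EMP#"}, FDS)))   # ['DEPT#', 'DNAME', 'EMP#', 'ENAME']
print(sorted(closure({"DEPT#"}, FDS)))  # ['DEPT#', 'DNAME']
```

DNAME appears in the closure of EMP# only by way of DEPT#, which is the transitive dependency the decomposition removes.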

Anomalies due to Transitive Dependencies
EMPLOYEE

EMP#  ENAME  DEPT#  DNAME
10    SMITH  D5     EDP
20    JONES  D7     FINANCE
25    SMITH  D7     FINANCE
30    BLACK  D8     SALES

UPDATE - change DNAME in many places
DELETE - data for DEPT is lost when last EMP is deleted for DEPT
INSERT - cannot create a new DEPT until an EMP starts for that DEPT
Normalization 6.22

Solution to 3NF Anomalies
EMPLOYEE (delete the last EMP but DEPT still remains)

EMP#  ENAME  DEPT#
10    SMITH  D5
20    JONES  D7
25    SMITH  D7
30    BLACK  D8

DEPARTMENT (add a new DEPT at any time; update DNAME once)

DEPT#  DNAME
D5     EDP
D7     FINANCE
D8     SALES

Normalization 6.23

A Simple Test for 3NF
Each attribute should depend on:
- the key
- the whole key
- and nothing but the key
(so help me Codd)
Normalization 6.24
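
The test can be turned into a rough Python check. This is a simplified sketch, not a full normal-form tester: it assumes a single primary key and a hand-written list of functional dependencies, and the function name `violations` is hypothetical.

```python
def violations(key, fds):
    """List dependencies that break 'the whole key' or 'nothing but the key'.

    key is a set of attribute names; fds is a list of (lhs, rhs) pairs of sets.
    Assumes the relation has exactly one candidate key.
    """
    key = set(key)
    found = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        nonkey_rhs = set(rhs) - key
        if not nonkey_rhs:
            continue  # dependencies onto key attributes are ignored in this sketch
        if lhs < key:
            # Determined by only part of the key: violates "the whole key" (2NF).
            found.append(("partial", sorted(lhs), sorted(nonkey_rhs)))
        elif not key <= lhs:
            # Determined by something other than the key: violates
            # "nothing but the key" (3NF).
            found.append(("transitive", sorted(lhs), sorted(nonkey_rhs)))
    return found


# ORDER-ITEM(ORDER#, ITEM#, DESC, QTY)
order_item = violations(
    {"ORDER#", "ITEM#"},
    [({"ITEM#"}, {"DESC"}), ({"ORDER#", "ITEM#"}, {"QTY"})],
)
print(order_item)  # [('partial', ['ITEM#'], ['DESC'])]

# EMPLOYEE(EMP#, ENAME, DEPT#, DNAME)
employee = violations(
    {"EMP#"},
    [({"EMP#"}, {"ENAME", "DEPT#"}), ({"DEPT#"}, {"DNAME"})],
)
print(employee)  # [('transitive', ['DEPT#'], ['DNAME'])]
```

Each reported dependency names the attributes that should move to their own relation, with the determinant as that relation's key.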

Steps in Normalization

Example Problem
Consider the following poorly formed relation. The HR department wishes to keep track of employees, departments, jobs and employee job assignments. The primary key of the relation is underlined.
ASSIGNMENT(EMP-ID, JOB-CODE, DEPT-NO, EMP_NAME, JOB-DESCR, DATE_JOB_ASSIGNED, DEPT-DESC)
It is known that EMP-ID functionally determines EMP_NAME and DEPT-NO, that DEPT-NO functionally determines DEPT-DESC, and that JOB-CODE functionally determines JOB-DESCR. The system also needs to keep track of the date on which a specific employee was assigned to a specific job. An employee can be assigned to more than one job over time.
Normalization 6.26

The Question
[1] In what normal form (if any) is the relation as it appears above?
[2] Rewrite the above relation as a number of relations, all of which are in third normal form. (It is not required to write down relations in 1st or 2nd normal form.)
Normalization 6.27

One Approach to Solving
- Draw a data structure diagram (DSD) that is a best guess as to the final relations.
- Identify the primary key in each relation.
- Make sure each attribute is functionally dependent on the primary key attribute(s).
- Check that a foreign key is present (at the many end) if the relation is related to some other relation.
- Scan the resulting DSD for any omitted relationships, repeating groups, partial dependencies or transitive dependencies.
- If relationships are missing, include them. If repeating groups, partial dependencies or transitive dependencies are present, break down the offending relation further.
Normalization 6.28

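An Answer
[1] It is in first normal form: there are no repeating groups, but attributes such as EMP_NAME and JOB-DESCR each depend on only part of the key, so it is not in 2NF.
[2] The relations in third normal form:
EMPLOYEE(EMP-ID,EMP_NAME,DEPT-NO)
JOB(JOB-CODE,JOB-DESCR)
ASSIGNMENT(EMP-ID, JOB-CODE, DATE_JOB_ASSIGNED)
DEPARTMENT(DEPT-NO,DEPT-DESC)
Data structure diagram: EMPLOYEE, JOB, DEPT, ASSIGNMENT
Normalization 6.29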
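
The same four relations fall out of grouping the stated functional dependencies by their determinant. The Python sketch below shows that grouping step only. It is a simplified form of 3NF synthesis that assumes the dependency list is already minimal, and it additionally assumes that DATE_JOB_ASSIGNED depends on EMP-ID and JOB-CODE together. The function name `synthesize` is hypothetical.

```python
def synthesize(fds):
    """Group FDs by determinant; each group becomes one relation.

    fds is a list of (lhs, rhs) pairs of attribute-name tuples.
    Returns a list of (key, attributes) pairs, key attributes first.
    """
    groups = {}
    for lhs, rhs in fds:
        groups.setdefault(tuple(lhs), []).extend(rhs)
    return [(key, list(key) + attrs) for key, attrs in groups.items()]


FDS = [
    (("EMP-ID",), ("EMP_NAME", "DEPT-NO")),
    (("DEPT-NO",), ("DEPT-DESC",)),
    (("JOB-CODE",), ("JOB-DESCR",)),
    # Assumption: the assignment date depends on employee and job together.
    (("EMP-ID", "JOB-CODE"), ("DATE_JOB_ASSIGNED",)),
]

relations = synthesize(FDS)
for key, attrs in relations:
    print("key:", key, "attributes:", attrs)
```

The four groups correspond to EMPLOYEE, DEPARTMENT, JOB and ASSIGNMENT. DEPT-NO in the EMPLOYEE group acts as the foreign key to DEPARTMENT, matching the "foreign key at the many end" step in the approach.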