Normalization DBS201.

Presentation transcript:

Normalization DBS201

What is Normalization?

Normalization is a series of steps used to evaluate and modify table structures to ensure that every non-key column in every table is directly dependent on the primary key. The results of normalization are reduced redundancy, fewer anomalies, and improved efficiency.

Two purposes of normalization are:
- Eliminate redundant data (the same data stored in more than one table).
- Ensure the data within a table are related.

Eliminating redundant data is achieved by splitting tables with redundant data into two or more tables without the redundancy. Normalization involves applying rules called normal forms to table structures, producing a design that is free of data redundancy problems.

Several normal forms exist, the most common being 1NF, 2NF, and 3NF. Each normal form addresses the potential for a particular type of redundancy. A table is said to be in one of the normal forms if it satisfies the rules required by that form.

First Normal Form (1NF)

- Two-dimensional table format with no repeating groups: each row/column intersection contains only one value.
- Primary key is identified.

Product no. 145 is stored in two different warehouses.

Inventory
product_id  product_desc  whse_id  bin  qty  whse_address   city        prov  pcode
145         Saw           122      136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
                          322      175  25   4433 Oak Ave   Oakville          L5T6R5
355         Screwdriver            111  55
130         Hammer                 98   35

Having more than one value at the intersection of a row and a column is referred to as having a repeating group. A table that contains a repeating group is called an un-normalized table (UNF):

[product_id, product_desc (whse_id, bin, qty, whse_address, city, prov, pcode)]

What should the primary key be?

Inventory
product_id  product_desc  whse_id  bin  qty  whse_address   city        prov  pcode
145         Saw           122      136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
145         Saw           322      175  25   4433 Oak Ave   Oakville    Ont   L5T6R5
355         Screwdriver            111  55
130         Hammer                 98   35

The repeating groups can be eliminated by filling in the values in vacant cells of the table. Each row/column intersection must contain only a single value to satisfy the first rule for 1NF.

The primary key should be product_id concatenated to whse_id.

Inventory
product_id  whse_id  product_desc  bin  qty  whse_address   city        prov  pcode
145         122      Saw           136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
145         322      Saw           175  25   4433 Oak Ave   Oakville    Ont   L5T6R5
355                  Screwdriver   111  55
130                  Hammer        98   35

[product_id, whse_id, product_desc, bin, qty, whse_address, city, prov, pcode]
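A minimal sketch of how a concatenated (composite) primary key could be declared, using Python's built-in sqlite3 module and an in-memory database. The column types are assumptions, the province for the Oakville row is assumed to be Ont, and only the two rows for product 145 are loaded because the whse_id values for the other rows did not survive in the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        product_id   INTEGER NOT NULL,
        whse_id      INTEGER NOT NULL,
        product_desc TEXT,
        bin          INTEGER,
        qty          INTEGER,
        whse_address TEXT,
        city         TEXT,
        prov         TEXT,
        pcode        TEXT,
        PRIMARY KEY (product_id, whse_id)  -- concatenated key
    )
""")
rows = [
    (145, 122, "Saw", 136, 40, "122 Peter St.", "Newmarket", "Ont", "L4T5Y6"),
    (145, 322, "Saw", 175, 25, "4433 Oak Ave", "Oakville", "Ont", "L5T6R5"),
]
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# A second row for the same (product_id, whse_id) pair violates the key.
try:
    conn.execute("INSERT INTO inventory (product_id, whse_id) VALUES (145, 122)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM inventory").fetchone()[0]
```

Product 145 may appear twice because the warehouse differs; the same product in the same warehouse may not.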

Problems with Un-normalized Data

- Update Problem
- Data Inconsistency Problem
- Data Redundancy Problem
- Insert Problem
- Deletion Problem

The Update Problem

Inventory
product_id  whse_id  product_desc  bin  qty  whse_address   city        prov  pcode
145         122      Saw           136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
145         322      Saw           175  25   4433 Oak Ave   Oakville    Ont   L5T6R5
355                  Screwdriver   111  55
130                  Hammer        98   35

The update problem is the need to perform the same update in several locations of the database because the same data is repeated. Example: the Oakville warehouse is moved to Burlington. We will have to make more than one change to the database.
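The update problem can be sketched with Python's built-in sqlite3 module. The table is cut down to three columns, and the row placing product 130 in warehouse 322 is an assumption made only so that the warehouse appears more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        product_id INTEGER NOT NULL,
        whse_id    INTEGER NOT NULL,
        city       TEXT,
        PRIMARY KEY (product_id, whse_id)
    )
""")
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?)", [
    (145, 122, "Newmarket"),
    (145, 322, "Oakville"),
    (130, 322, "Oakville"),  # assumption: product 130 is also stored in warehouse 322
])

# Moving one warehouse means changing every row that repeats its city.
cursor = conn.execute("UPDATE inventory SET city = 'Burlington' WHERE whse_id = 322")
rows_changed = cursor.rowcount
```

One real-world change touches two rows here; missing either one would leave the two rows disagreeing about where warehouse 322 is.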

The Data Inconsistency Problem

Inventory
product_id  whse_id  product_desc  bin  qty  whse_address   city        prov  pcode
145         122      Saw           136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
145         322      Saw           175  25   4433 Oak Ave   Burlington  Ont   L5T6R5
355                  Screwdriver   111  55
130                  Hammer        98   35                  Burlingtown

When the same data is repeated in several records, the copies can become inconsistent. What is the inconsistency?

The Data Redundancy Problem

Inventory
product_id  whse_id  product_desc  bin  qty  whse_address   city        prov  pcode
145         122      Saw           136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
145         322      Saw           175  25   4433 Oak Ave   Burlington  Ont   L5T6R5
355                  Screwdriver   111  55
130                  Hammer        98   35

Data redundancy is the unnecessary repetition of non-key fields in the database. While it is fine to repeat primary keys and foreign keys, we do not want to repeat data fields.

The Insert Problem

There is only one table for storing information, STUDENT. Let us say we have just hired a new teacher, Mr. Vert. We have no way to put him into the database, as he has no students yet.

STUDENT (Student-Num, Student-Name, Teacher, Student-Age)
1243658712  Tom Blu    Ms.Greene  14
2343216578  Jill Fall  Mr.Brown   14
3214325436  Jack Pail  Ms.Green   14

The Deletion Problem

If there is no teacher table, and if a teacher's students all go to high school, then the teacher will disappear from our database.

STUDENT (Student-Num, Student-Name, Teacher, Student-Age)
1243658712  Tom Blu    Ms.Greene  14
2343216578  Jill Fall  Mr.Brown   14
3214325436  Jack Pail  Ms.Green   14
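Both problems can be reproduced in a short sketch with Python's built-in sqlite3 module. The column types and the NOT NULL constraint on the key are assumptions; the rows are the ones listed for STUDENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_num  TEXT NOT NULL PRIMARY KEY,
        student_name TEXT,
        teacher      TEXT,
        student_age  INTEGER
    )
""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    ("1243658712", "Tom Blu", "Ms.Greene", 14),
    ("2343216578", "Jill Fall", "Mr.Brown", 14),
    ("3214325436", "Jack Pail", "Ms.Green", 14),
])

# Insert problem: Mr. Vert has no student, so his row has no key value.
try:
    conn.execute("INSERT INTO student (teacher) VALUES ('Mr.Vert')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Deletion problem: removing Mr.Brown's only student removes Mr.Brown as well.
conn.execute("DELETE FROM student WHERE student_num = '2343216578'")
teachers_left = {row[0] for row in conn.execute("SELECT DISTINCT teacher FROM student")}
```

A separate teacher table would let a teacher exist independently of any student row.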

Second Normal Form (2NF)

No partial dependencies: all non-key columns are fully dependent on the entire primary key.

Is this table in 2NF? Fix this.

Inventory
product_id  whse_id  product_desc  bin  qty  whse_address   city        prov  pcode
145         122      Saw           136  40   122 Peter St.  Newmarket   Ont   L4T5Y6
145         322      Saw           175  25   4433 Oak Ave   Burlington  Ont   L5T6R5
355                  Screwdriver   111  55
130                  Hammer        98   35

Product
product_id  product_desc
145         Saw
355         Screwdriver
130         Hammer

Warehouse
whse_id  whse_address   city       prov  pcode
122      122 Peter St.  Newmarket  Ont   L4T5Y6
322      4433 Oak Ave   Oakville   Ont   L5T6R5

Inventory
product_id  whse_id  bin  qty
145         122      136  40
145         322      175  25
355                  111  55
130                  98   35

Let’s write this out in a text-based form called a relational schema:
- Capitalize the entity name.
- Put attributes in parentheses.
- Bold and underline the primary key.
- For later: italicize the foreign key or use FK.

PRODUCT (product_id, product_desc)
WAREHOUSE (whse_id, whse_address, city, prov, pcode)
INVENTORY (whse_id (FK), product_id (FK), bin, qty)
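The three-table schema might be declared as follows, as a sketch using Python's built-in sqlite3 module. The column types and the PRAGMA line are SQLite-specific assumptions, the province for warehouse 322 is assumed to be Ont, and only the inventory rows with a known whse_id are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        product_desc TEXT
    );
    CREATE TABLE warehouse (
        whse_id      INTEGER PRIMARY KEY,
        whse_address TEXT,
        city         TEXT,
        prov         TEXT,
        pcode        TEXT
    );
    CREATE TABLE inventory (
        whse_id    INTEGER NOT NULL REFERENCES warehouse (whse_id),
        product_id INTEGER NOT NULL REFERENCES product (product_id),
        bin        INTEGER,
        qty        INTEGER,
        PRIMARY KEY (whse_id, product_id)
    );
    INSERT INTO product VALUES (145, 'Saw'), (355, 'Screwdriver'), (130, 'Hammer');
    INSERT INTO warehouse VALUES
        (122, '122 Peter St.', 'Newmarket', 'Ont', 'L4T5Y6'),
        (322, '4433 Oak Ave', 'Oakville', 'Ont', 'L5T6R5');
    INSERT INTO inventory VALUES (122, 145, 136, 40), (322, 145, 175, 25);
""")

# Moving the warehouse now changes exactly one row.
rows_changed = conn.execute(
    "UPDATE warehouse SET city = 'Burlington' WHERE whse_id = 322"
).rowcount

# Inventory for a warehouse that does not exist is rejected.
try:
    conn.execute("INSERT INTO inventory VALUES (999, 145, 1, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Each warehouse fact is stored once, so the update problem and the inconsistency problem from the single-table design no longer arise for city.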

ERD comes to a similar conclusion

Distribution Company: Many-to-Many ERD (many products stored in many warehouses)

A large distribution company has many warehouses across the country, which provides faster shipping for customers. The company handles many products that may be stored in any of the company's warehouses. Of course, there will be times when a product is out of stock and is not stored in a particular warehouse. The company wants to know the quantity on hand (inventory) for each product at each warehouse. For products, store the product id, product name, and unit cost. For warehouses, store the warehouse id and city.

Third Normal Form (3NF)

1. The table is in 2NF.
2. A non-key column cannot determine the value of another non-key column. Every non-key column must depend directly on the primary key.

Warehouse
whse_id  whse_address   city       prov  pcode
122      122 Peter St.  Newmarket  Ont   L4T5Y6
322      4433 Oak Ave   Oakville   Ont   L5T6R5

To satisfy the second rule of 3NF, the warehouse table can be split into two tables.

Postal code determines city and province.

Warehouse
whse_id  whse_address   pcode
122      122 Peter St.  L4T5Y6
322      4433 Oak Ave   L5T6R5

PostalCode
pcode   city       prov
L4T5Y6  Newmarket  Ont
L5T6R5  Oakville   Ont
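A sketch of the split using Python's built-in sqlite3 module, showing that no information is lost: joining the two tables on pcode rebuilds the original warehouse rows. The table name postal_code, the column types, and the province for L5T6R5 (assumed to be Ont) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postal_code (
        pcode TEXT PRIMARY KEY,
        city  TEXT,
        prov  TEXT
    );
    CREATE TABLE warehouse (
        whse_id      INTEGER PRIMARY KEY,
        whse_address TEXT,
        pcode        TEXT REFERENCES postal_code (pcode)
    );
    INSERT INTO postal_code VALUES ('L4T5Y6', 'Newmarket', 'Ont');
    INSERT INTO postal_code VALUES ('L5T6R5', 'Oakville', 'Ont');
    INSERT INTO warehouse VALUES (122, '122 Peter St.', 'L4T5Y6');
    INSERT INTO warehouse VALUES (322, '4433 Oak Ave', 'L5T6R5');
""")

# Joining on pcode rebuilds the five-column warehouse rows.
full_rows = conn.execute("""
    SELECT w.whse_id, w.whse_address, p.city, p.prov, w.pcode
    FROM warehouse AS w
    JOIN postal_code AS p ON p.pcode = w.pcode
    ORDER BY w.whse_id
""").fetchall()
```

City and province are now stored once per postal code rather than once per warehouse.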

Why is this table not in 3NF?

ORDER
order_id  cust_id  product_id  quantity  unit_price  total_amt
101       141      222         11        25.86       284.46
102       156      333         54        35.77       1931.58
103       322      654         33        22.10       729.30
104       445      321         87        91.55       7968.33

ORDER in 3NF

ORDER
order_id  cust_id  product_id  quantity  unit_price
101       141      222         11        25.86
102       156      333         54        35.77
103       322      654         33        22.10
104       445      321         87        91.55
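The dropped total_amt column is determined by two other non-key columns (quantity and unit_price), so it can be computed on demand instead of stored. The sketch below recomputes it with Python's decimal module and compares the result with the stored values from the original ORDER table; the function name total_amt is an assumption. For order 104 the stored 7968.33 does not equal 87 x 91.55 = 7964.85. The slides do not say whether that mismatch is deliberate, but it is the kind of inconsistency a stored derived column allows.

```python
from decimal import Decimal

# (order_id, quantity, unit_price, stored total_amt) from the original ORDER table
orders = [
    (101, 11, Decimal("25.86"), Decimal("284.46")),
    (102, 54, Decimal("35.77"), Decimal("1931.58")),
    (103, 33, Decimal("22.10"), Decimal("729.30")),
    (104, 87, Decimal("91.55"), Decimal("7968.33")),
]


def total_amt(quantity, unit_price):
    """Derive the order total instead of storing it."""
    return quantity * unit_price


# Orders whose stored total disagrees with quantity * unit_price.
mismatches = [
    (order_id, stored, total_amt(quantity, unit_price))
    for order_id, quantity, unit_price, stored in orders
    if total_amt(quantity, unit_price) != stored
]
```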

Functional Dependency

A functional dependency occurs when one or more attributes in a table uniquely determine another attribute:

product_id → product_desc
whse_id, product_id → bin, qty
whse_id → whse_address, city, prov, pcode
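A functional dependency can be tested against sample rows: it fails as soon as one determinant value appears with two different dependent values. The helper name fd_holds is an assumption, and the rows are the two product 145 rows from the Inventory table. Sample data can only disprove a dependency; passing the check does not prove that the rule holds for all future data.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if each determinant value maps to a single dependent value."""
    seen = {}
    for row in rows:
        lhs = tuple(row[col] for col in determinant)
        rhs = tuple(row[col] for col in dependent)
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


rows = [
    {"product_id": 145, "whse_id": 122, "product_desc": "Saw", "bin": 136, "qty": 40},
    {"product_id": 145, "whse_id": 322, "product_desc": "Saw", "bin": 175, "qty": 25},
]

desc_depends_on_product = fd_holds(rows, ["product_id"], ["product_desc"])
qty_depends_on_product = fd_holds(rows, ["product_id"], ["qty"])
qty_depends_on_full_key = fd_holds(rows, ["product_id", "whse_id"], ["bin", "qty"])
```

Here product_id alone determines product_desc but not qty, which needs the full key.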

Partial Dependency

Partial dependency is where a non-key column is dependent on part of the primary key but is not dependent on the entire primary key.

[product_id, whse_id, product_desc, bin, qty, whse_address, city, prov, pcode]

- The product_desc column is dependent on the product_id key but is not determined by the whse_id key.
- The whse_address column is dependent on the whse_id key but is not related to the product_id key.

We only need to look for this and solve it when we have a concatenated key.
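Given a concatenated key and a list of functional dependencies, the partial dependencies are the ones whose determinant is a proper subset of the key. A minimal sketch, with the dependencies written as pairs of Python sets; the function name partial_dependencies is an assumption.

```python
key = {"product_id", "whse_id"}

# (determinant, dependent columns) for the single-table Inventory design
fds = [
    ({"product_id"}, {"product_desc"}),
    ({"whse_id", "product_id"}, {"bin", "qty"}),
    ({"whse_id"}, {"whse_address", "city", "prov", "pcode"}),
]


def partial_dependencies(key, fds):
    """Dependencies whose determinant is only part of the key."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key]  # '<' is proper subset


partial = partial_dependencies(key, fds)
partial_determinants = [lhs for lhs, _ in partial]
```

The two dependencies found (on product_id alone and on whse_id alone) are the ones that the PRODUCT and WAREHOUSE tables remove.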

Boyce-Codd Normal Form (BCNF)

Every determinant in a table is a candidate key. If there is only one candidate key, 3NF and BCNF are the same.
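One way to test the rule is to compute, for each determinant, the set of columns it determines (its closure) and flag any determinant that does not reach every column of the table. This is a sketch under assumptions: the function names are hypothetical, the dependencies are the warehouse ones used earlier, and the check is the usual superkey test, which agrees with the candidate-key wording when the determinants are minimal.

```python
def closure(attrs, fds):
    """Columns determined by attrs under the given functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Non-trivial dependencies whose determinant does not determine every column."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


warehouse_attrs = {"whse_id", "whse_address", "city", "prov", "pcode"}
warehouse_fds = [
    ({"whse_id"}, {"whse_address", "city", "prov", "pcode"}),
    ({"pcode"}, {"city", "prov"}),
]
before_split = bcnf_violations(warehouse_attrs, warehouse_fds)

split_attrs = {"whse_id", "whse_address", "pcode"}
split_fds = [({"whse_id"}, {"whse_address", "pcode"})]
after_split = bcnf_violations(split_attrs, split_fds)
```

Before the split, pcode is a determinant that is not a key of the warehouse table; after the split, no such determinant remains.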

Fourth Normal Form (4NF)

The table has no multivalued dependencies. A multivalued fact is one in which several values for a column might be determined by one value for another column.

Summary

Rules for reducing data redundancy and related problems are called normal forms.
- UNF?
- 1NF: no repeating groups
- 2NF: no partial dependencies
- 3NF: no transitive dependencies
- Functional dependency?
- Partial dependency?

Assignment 1

- 15% of final mark
- Groups of five students
- Enough work to keep 4 or 5 students busy; not ideal for 3 or fewer students
- Assign tasks

Assignment 1: Assigning Tasks

2 or 3 students work on design:
- Normalization: UNF, 1NF, 2NF, 3NF
- Merging
- Final ERD

2 or 3 students work on SQL:
- Create a collection early and secure it (DB201AG01).
- Create tables, views, and constraints; these can be modified as you finalize the design.
- Set permissions; this can be done early.
- Save the SQL for this in case objects need to be removed and recreated.

Students can optionally work on electronic submission of DSPFD with a CLLE program.