Copyright, Harris Corporation & Ophir Frieder, 1998

Normal Forms
“Why be normal?” - Author unknown



Objectives
To define first, second, third, and Boyce-Codd normal forms.
To discuss the motivation for normal forms and the implications for database design.

Normal Forms
A relational scheme is said to be in first normal form (1NF) if and only if each of its domains contains only scalar values.
– No “repeating groups” or unbounded lists.
– A field cannot itself be a table, as is permitted in Oracle8!
Question: Do CITY and STATE violate 1NF in relation scheme R=(CITY,STATE,POPULATION)?

Motivation For 1NF
Representation - with a repeating group, some method must be devised for specifying the end or length of the list.
Space Allocation - how is space allocated on a per-record basis if repeating groups are allowed?

Motivation For 1NF
Operations - without 1NF, all operations become more complex, and this propagates throughout the database management system.
Theory - 1NF simplifies the theoretical basis of the relational model (e.g., proof of algorithmic correctness).

Converting A Relational Scheme To 1NF
A repeating group is typically eliminated by “flattening” the table.
This makes most things simpler.
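As a rough illustration of flattening, the sketch below turns records that carry a repeating group into one flat row per group element. The record layout, the field names (store_id, city, items) and the sample values are hypothetical and do not come from the slides.

```python
# Hypothetical un-normalized records: each store carries a repeating
# group of (item, price) pairs, which violates 1NF.
nested = [
    {"store_id": 1, "city": "Tampa", "items": [("soap", 1.50), ("towel", 4.00)]},
    {"store_id": 2, "city": "Miami", "items": [("soap", 1.75)]},
]


def flatten(records, group_field, group_columns):
    """Return one flat row per element of the repeating group."""
    flat = []
    for rec in records:
        # Scalar fields are copied into every row produced for this record.
        scalars = {k: v for k, v in rec.items() if k != group_field}
        for element in rec[group_field]:
            row = dict(scalars)
            row.update(zip(group_columns, element))
            flat.append(row)
    return flat


rows = flatten(nested, "items", ("item", "price"))
for row in rows:
    print(row)
```

Every value in the flattened rows is a scalar, at the cost of repeating the store's scalar fields once per item. Note also that a record whose repeating group is empty produces no rows at all in this sketch.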

Single vs. Multiple Keys
For the sake of simplicity, we will initially assume that each relational scheme has exactly one key. Multiple keys do occur, but less frequently.
In what follows, the formal definitions do not change in the case of multiple keys.
Given a relational scheme R, it will be helpful throughout the following to divide the attributes into two sets: those that are part of the key, and those that are not.

Example #1: Key And Non-Key Attributes
Consider the following relational scheme for a department store chain (e.g., Walmart):
– Attributes:
    STORE_ID# - A store identification number.
    CITY - The city in which the store is located.
    STATE - The state in which the store is located.
    ITEM - An item sold by the store.
    PRICE - The price of the item.
– Functional Dependencies:
    STORE_ID# => CITY
    STORE_ID# => STATE
    STORE_ID#, ITEM => PRICE


Example #1, Cont.
The only key is: STORE_ID#,ITEM

Key Attributes    Non-Key Attributes
STORE_ID#         CITY
ITEM              STATE
                  PRICE
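The key stated above can be checked mechanically. The following is a minimal sketch, assuming functional dependencies are written as (left-hand side, right-hand side) pairs of sets; the helper names closure and candidate_keys are illustrative and are not taken from the slides. It computes attribute closures and searches by brute force for the minimal attribute sets that determine the whole scheme.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole scheme (brute force)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if any(k <= candidate for k in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


ATTRIBUTES = {"STORE_ID#", "CITY", "STATE", "ITEM", "PRICE"}
FDS = [
    ({"STORE_ID#"}, {"CITY"}),
    ({"STORE_ID#"}, {"STATE"}),
    ({"STORE_ID#", "ITEM"}, {"PRICE"}),
]

keys = candidate_keys(ATTRIBUTES, FDS)
print(keys)
```

The search is exponential in the number of attributes, which is acceptable for slide-sized examples but not for large schemes.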

Partial Dependency
A functional dependency X=>A is a partial dependency if:
– X is a proper subset of the key attributes, and
– A is a non-key attribute.

Example #1, Cont.

Key Attributes    Non-Key Attributes
STORE_ID#         CITY
ITEM              STATE
                  PRICE

STORE_ID# => CITY is a partial dependency.
Similarly, STORE_ID# => STATE is a partial dependency.
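The partial dependencies in Example #1 can be found by taking the closure of each proper subset of the key and keeping the non-key attributes it reaches. This is only a sketch under the single-key assumption made earlier; the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset X of the key to the non-key attributes X determines."""
    found = {}
    for size in range(1, len(key)):          # proper, non-empty subsets only
        for combo in combinations(sorted(key), size):
            determined = closure(combo, fds) - set(key)
            if determined:
                found[combo] = determined
    return found


KEY = {"STORE_ID#", "ITEM"}
FDS = [
    ({"STORE_ID#"}, {"CITY"}),
    ({"STORE_ID#"}, {"STATE"}),
    ({"STORE_ID#", "ITEM"}, {"PRICE"}),
]

partials = partial_dependencies(KEY, FDS)
print(partials)
```

Here STORE_ID# alone determines CITY and STATE, while ITEM alone determines no non-key attribute, which matches the two partial dependencies listed above.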

Normal Forms - 2NF
A relational scheme is said to be in second normal form (2NF) if and only if it is in 1NF and contains no partial dependencies.
Question: Why eliminate partial dependencies?

Example #1, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Motivation for 2NF Example #2
Consider the following relational scheme for maintaining information associated with students at a university:
– Attributes:
    STUDENT_ID# - The social security number of a student.
    NAME - The student's last name.
    COURSE_ID# - The ID # of a course the student is registered in.
    DEPT_ID# - The ID # of the department that offers the course.
– Functional Dependencies:
    STUDENT_ID# => NAME
    COURSE_ID# => DEPT_ID#


Example #2, Cont.
The only key is: STUDENT_ID#, COURSE_ID#

Key Attributes    Non-Key Attributes
STUDENT_ID#       NAME
COURSE_ID#        DEPT_ID#

STUDENT_ID# => NAME is a partial dependency, so the relational scheme is not in 2NF.
Similarly, COURSE_ID# => DEPT_ID# is a partial dependency.

Example #2, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Anomalies Resulting From Partial Dependencies
Insertion Anomalies - A new student cannot be added unless they are currently registered for at least one course.
Deletion Anomalies - If a student drops their last, or only, course, then there is no record left of the student.
Update Anomalies - Changing a student's name requires all their records to be updated. The same holds for changing a course ID # or for assigning a course to a different department.
Note that the first two assume null values are not desirable, which is a matter of considerable debate in the database community.
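One common way to remove these anomalies is to split the scheme along its partial dependencies. The sketch below uses a small relation whose rows are invented purely for illustration (the IDs and names are not from the slides), projects it onto three smaller schemes, and checks that joining them back reproduces the original rows.

```python
# Hypothetical rows of (STUDENT_ID#, NAME, COURSE_ID#, DEPT_ID#).
rows = {
    ("111", "Smith", "CS101", "CS"),
    ("111", "Smith", "MA201", "MATH"),
    ("222", "Jones", "CS101", "CS"),
}

# Project onto one scheme per partial dependency, plus the key itself.
students = {(s, n) for s, n, c, d in rows}   # STUDENT_ID# => NAME
courses = {(c, d) for s, n, c, d in rows}    # COURSE_ID# => DEPT_ID#
enrolled = {(s, c) for s, n, c, d in rows}   # the key: who takes what

# Rejoin to confirm that no information was lost by the split.
name_of = dict(students)
dept_of = dict(courses)
rejoined = {(s, name_of[s], c, dept_of[c]) for s, c in enrolled}

# Update anomaly: rows that must change if student 111 is renamed.
before = sum(1 for r in rows if r[0] == "111")
after = sum(1 for r in students if r[0] == "111")
print(before, after)
```

In the decomposed form a student can exist in students without any row in enrolled, so the insertion and deletion anomalies described above no longer arise, and a name change touches a single row.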

Transitive Dependency
A functional dependency X=>A is a transitive dependency if:
– X is a proper subset of the non-key attributes, and
– A is a non-key attribute.

Motivation For 3NF Example #1
Consider the following relational scheme for National Football League (NFL) athletes:
– Attributes:
    PLAYER_ID# - The social security number for an NFL athlete.
    TEAM - The name of the team the athlete plays for.
    STATE - The state in which the team is located.
– Functional Dependencies:
    PLAYER_ID# => TEAM
    TEAM => STATE


Example #1, Cont.

Key Attributes    Non-Key Attributes
PLAYER_ID#        TEAM
                  STATE

TEAM => STATE is a transitive dependency.

Normal Forms - 3NF
A relational scheme is said to be in third normal form (3NF) if and only if it is in 2NF and contains no transitive dependencies.
Every non-key attribute depends on the key, the whole key, and nothing but the key.
Question: Why eliminate transitive dependencies?

Motivation For 3NF, Cont.

Key Attributes    Non-Key Attributes
PLAYER_ID#        TEAM
                  STATE

There are no partial dependencies, so it is in 2NF.
However, it is not in 3NF because of the transitive dependency TEAM=>STATE.
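The test for a transitive dependency, as defined in these slides, reduces to checking that both sides of a dependency lie outside the key. The sketch below applies that test to the NFL scheme. It inspects only the dependencies as stated, not every dependency they imply, and the function name is hypothetical.

```python
def transitive_dependencies(key, fds):
    """Stated FDs X => A where X and A both consist of non-key attributes."""
    key = set(key)
    return [(lhs, rhs) for lhs, rhs in fds
            if not (lhs & key) and not (rhs & key)]


KEY = {"PLAYER_ID#"}
FDS = [
    ({"PLAYER_ID#"}, {"TEAM"}),
    ({"TEAM"}, {"STATE"}),
]

violations = transitive_dependencies(KEY, FDS)
in_3nf = not violations   # the scheme is already known to be in 2NF
print(violations, in_3nf)
```

Only TEAM => STATE is reported, so the scheme fails the 3NF test even though it passes the 2NF test.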

Example #1, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Motivation for 3NF Example #2
Consider the following relational scheme for a university student database:
– Attributes:
    STUDENT_ID# - A student's social security number.
    CITY - The city of the student's home address.
    STATE - The state of the student's home address.
    ZIP - The zip code of the student's home address.
– Functional Dependencies:
    STUDENT_ID# => CITY
    STUDENT_ID# => STATE
    STUDENT_ID# => ZIP
    ZIP => STATE
    ZIP => CITY


Example #2, Cont.

Key Attributes    Non-Key Attributes
STUDENT_ID#       CITY
                  STATE
                  ZIP

There are no partial dependencies, so it is in 2NF.
ZIP => STATE and ZIP => CITY are transitive dependencies, so it is not in 3NF.

Example #2, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Motivation for 3NF Example #3
Consider a relational scheme for tracking software licenses:
– Attributes:
    LICENSE_ID# - The license ID number for a piece of software.
    MACHINE_ID# - The ID number of the machine on which the software is installed.
    EMPLOYEE_ID# - The social security number of the employee to whom the machine is assigned.
    LOCATION - The location of the employee’s office.
– Functional Dependencies:
    LICENSE_ID# => MACHINE_ID#
    MACHINE_ID# => EMPLOYEE_ID#
    EMPLOYEE_ID# => LOCATION


Example #3, Cont.

Key Attributes    Non-Key Attributes
LICENSE_ID#       MACHINE_ID#
                  EMPLOYEE_ID#
                  LOCATION

There are no partial dependencies, so the relational scheme is in 2NF.
MACHINE_ID# => EMPLOYEE_ID# and EMPLOYEE_ID# => LOCATION are both transitive dependencies, so it is not in 3NF.
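The slides do not show how to repair this scheme. One standard approach, sketched here as an illustration and not as the authors' method, is to create one smaller scheme per distinct left-hand side of the dependencies, adding the key as its own scheme if none already contains it. The sketch assumes the listed dependencies already form a minimal cover, which appears to hold for this example; the function name is hypothetical.

```python
def synthesize(fds, key):
    """One scheme per distinct left-hand side; add the key if no scheme holds it."""
    grouped = {}
    for lhs, rhs in fds:
        grouped.setdefault(frozenset(lhs), set()).update(rhs)
    schemes = [set(lhs) | rhs for lhs, rhs in grouped.items()]
    if not any(set(key) <= scheme for scheme in schemes):
        schemes.append(set(key))
    return schemes


KEY = {"LICENSE_ID#"}
FDS = [
    ({"LICENSE_ID#"}, {"MACHINE_ID#"}),
    ({"MACHINE_ID#"}, {"EMPLOYEE_ID#"}),
    ({"EMPLOYEE_ID#"}, {"LOCATION"}),
]

schemes = synthesize(FDS, KEY)
for scheme in schemes:
    print(sorted(scheme))
```

Each resulting scheme has a single-attribute key that determines its one remaining attribute, so none of them contains a transitive dependency.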

Example #3, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Anomalies Resulting From Transitive Dependencies
Insertion Anomalies - A license cannot be added until it is installed on a machine, and until that machine is assigned to an employee.
Deletion Anomalies - Deleting all of the records for a particular employee would delete any record of the machines or licenses assigned to that employee.
Update Anomalies - Changing the employee assigned to a particular machine requires multiple record updates. The same holds for changing an employee's location.
As with partial dependencies, the first two assume null values are not desirable.

Normal Forms - BCNF
A relation is said to be in Boyce/Codd normal form (BCNF) if every attribute depends on the key, the whole key, and nothing but the key.

BCNF Example #1
Consider a relational scheme for tracking employee salary adjustments:
– Attributes:
    EMPLOYEE_ID# - An employee identification number.
    DATE - A date on which the employee’s salary was adjusted.
    AMOUNT - The amount of the salary adjustment.
    EXPLANATION - An explanation for the adjustment.
– Functional Dependencies:
    EMPLOYEE_ID#,DATE => AMOUNT
    EMPLOYEE_ID#,DATE => EXPLANATION

Example #1, Cont.

Key Attributes    Non-Key Attributes
EMPLOYEE_ID#      AMOUNT
DATE              EXPLANATION

There are no partial dependencies, so the relational scheme is in 2NF.
There are no transitive dependencies, so the relational scheme is in 3NF.
Each of the non-key attributes depends on both of the key attributes (the key, the whole key, and nothing but the key), so the relational scheme is in BCNF.

Normal Forms - BCNF, Cont.
Note that the definition of BCNF does not refer to the definition of 3NF. This raises a couple of questions:
– If a relational scheme is in BCNF, is it also in 3NF?
– If a relational scheme is in 3NF, is it also in BCNF?
The answer to the first is yes (proof is left as an exercise).
The answer to the second question depends...

Normal Forms - BCNF, Cont.
Relational Scheme Has Only One Key:
– A relational scheme is in 3NF if and only if it is in BCNF.
Relational Scheme Has Multiple Keys:
– If the relational scheme is in BCNF, then it is in 3NF (already stated).
– If the relational scheme is in 3NF, however, it is not necessarily in BCNF.
In the case of multiple keys, the 1NF, 2NF, and 3NF definitions are still the same.

Motivation For BCNF Example #1
Consider the following relational scheme:
– Attributes:
    STUDENT_ID# - A student ID number.
    COURSE_ID# - The ID# of a course being taken by the student.
    FACULTY_ID# - The ID# of the faculty member who teaches the course taken by the student.
– Functional Dependencies:
    STUDENT_ID#,COURSE_ID# => FACULTY_ID#
    FACULTY_ID# => COURSE_ID#


Example #1, Cont.
There are two keys:
    STUDENT_ID#,COURSE_ID#
    STUDENT_ID#,FACULTY_ID#
The relation is in 1NF, 2NF, and 3NF (why?).
The relation is not in BCNF because of the dependency:
    FACULTY_ID# => COURSE_ID#
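The BCNF condition can be tested by checking, for each nontrivial dependency, whether its left-hand side determines every attribute of the scheme. The sketch below applies that check to the dependencies as stated in this example; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Nontrivial stated FDs whose left-hand side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attributes]


ATTRIBUTES = {"STUDENT_ID#", "COURSE_ID#", "FACULTY_ID#"}
FDS = [
    ({"STUDENT_ID#", "COURSE_ID#"}, {"FACULTY_ID#"}),
    ({"FACULTY_ID#"}, {"COURSE_ID#"}),
]

violations = bcnf_violations(ATTRIBUTES, FDS)
print(violations)
```

FACULTY_ID# alone determines only COURSE_ID#, not STUDENT_ID#, so it is not a superkey and its dependency is the one reported.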

Example #1, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Motivation For BCNF Example #2
Consider the following relational scheme:
– Attributes:
    LICENSE_ID# - A Florida state driver’s license number.
    SS# - The social security number of the person holding the license.
    CODE - A traffic violation code.
    QTY - The number of times the person has been issued the violation within the past year.
– Functional Dependencies:
    LICENSE_ID# => SS#
    SS# => LICENSE_ID#
    LICENSE_ID#,CODE => QTY
    SS#,CODE => QTY


Example #2, Cont.
There are two keys:
    LICENSE_ID#,CODE
    SS#,CODE
The relation is in 1NF, 2NF, and 3NF (why?).
The relation is not in BCNF because of the dependencies:
    LICENSE_ID# => SS#
    SS# => LICENSE_ID#
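One way to approach the "why?" above is to compute both keys and then see which attributes belong to some key. The sketch below does this by brute force; the helper names are hypothetical. With two keys, LICENSE_ID# and SS# are both key attributes, so the two offending dependencies have a key attribute on the right-hand side and therefore match neither the partial nor the transitive definition given earlier, both of which require a non-key attribute there.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole scheme (brute force)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # superset of a known key, so not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


ATTRIBUTES = {"LICENSE_ID#", "SS#", "CODE", "QTY"}
FDS = [
    ({"LICENSE_ID#"}, {"SS#"}),
    ({"SS#"}, {"LICENSE_ID#"}),
    ({"LICENSE_ID#", "CODE"}, {"QTY"}),
    ({"SS#", "CODE"}, {"QTY"}),
]

keys = candidate_keys(ATTRIBUTES, FDS)
key_attributes = set().union(*keys)
non_key_attributes = ATTRIBUTES - key_attributes

# Stated FDs whose left-hand side is not a superkey (the BCNF violations).
offending = [(lhs, rhs) for lhs, rhs in FDS if closure(lhs, FDS) != ATTRIBUTES]
rhs_all_key = all(rhs <= key_attributes for _, rhs in offending)
print(keys, non_key_attributes, rhs_all_key)
```

QTY is the only non-key attribute, and it depends on a whole key in both of its dependencies, which is consistent with the scheme being in 3NF while still failing BCNF.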

Example #2, Cont.
Note the redundancy in the following legal relation:
[Table not preserved in this transcript.]

Update Anomalies
Insertion Anomalies - The fact that a license ID# has been assigned to a particular person cannot be recorded unless they have at least one violation.
Deletion Anomalies - Deleting all of the violations for a particular driver would delete any record of the license ID# for that person.
Update Anomalies - Changing a driver’s license ID# requires changing the records for every type of violation the driver has committed.
As with partial dependencies, the first two assume null values are not desirable.

Normal Forms Summary
A relational scheme is said to be in first normal form (1NF) if and only if each of its domains contains only scalar values.
A relational scheme is said to be in second normal form (2NF) if and only if it is in 1NF and contains no partial dependencies.
A relational scheme is said to be in third normal form (3NF) if and only if it is in 2NF and contains no transitive dependencies.
A relational scheme is said to be in Boyce/Codd normal form (BCNF) if and only if the only nontrivial dependencies for the relational scheme are those in which a key functionally determines one or more attributes (“every attribute depends on the key, the whole key, and nothing but the key”).

Normal Forms - 4NF & 5NF
Currently Beyond The Scope Of This Course:
A relational scheme R is said to be in fourth normal form (4NF) if and only if whenever there is a multivalued dependency X=>>Y, where Y is neither empty nor a subset of X, and XY does not include all the attributes of R, then X is a superkey of R.
A relational scheme R is said to be in fifth normal form (5NF), also called projection-join normal form (PJ/NF), if and only if every join dependency in R is implied by the candidate keys of R.