CSIS 254 Oracle Normalization

Relational Databases (Review)
In relational databases, all data is stored in tables, which correspond roughly to entities
Each table is two-dimensional, consisting of rows and columns
Each row in a table, called a tuple, corresponds to an occurrence of the entity
Each column holds the same kind of data in every row of the table

Relational Database Example
The following relational table describes classes that students have taken at a mythical college, which we use in the rest of this lesson

Student Id  Student Name  Course Id  Course Name   Grade  Term    Teacher
            Joe Adams     CSIS-840   VB Concepts   C      Spr-02  Wilkins
            Joe Adams     CSIS-824   Intro to C++  B      Fal-02  Smythe
            Joe Adams     CSIS-740   Oracle Admin  A      Spr-03  Wallace
            Jane Smith    CSIS-941   Systems Des.  B      Fal-02  Evans
            Jane Smith    CSIS-840   VB Concepts   B      Spr-02  Wolkins
            Ida Know      CSIS-184   Networks      A      Sum-03  Farmer
            Eunice Eye    CSIS-824   PowerPoint    W      Spr-02  Simpson

Relational Database Example
Each row (or tuple) in the table describes a Class taken by a Student in a term at our college
The data in each column is consistent throughout the table
However, there are three inconsistencies in the table itself. Can you find them?

Primary Keys (Review)
Each row in a table has a primary key, which is the column or set of columns, declared to our DBMS, that distinguishes it from every other row in the table
No attribute value in a primary key can be NULL
A table can have only one primary key
If a primary key is not specified, Oracle still lets us address each row through its ROWID pseudocolumn, but a ROWID is a physical address, not a declared key
What would be the primary key for our sample database?

Foreign Keys (Review)
An attribute (or group of attributes) in a table can also be a foreign key, meaning that it references the primary key (or at least a unique attribute) of another table
An example would be a Customer Id attribute on an invoice header, which would reference the customer account information for that invoice
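The two review slides on keys can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the table and column names (customer, invoice_header, and so on) are made up for illustration. It only shows the general behavior: a duplicate primary key is rejected, and so is a foreign key that points at a missing row.

```python
import sqlite3

# Hypothetical tables for the invoice example; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE invoice_header (
           invoice_id  INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
       )"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO invoice_header VALUES (100, 1)")  # customer 1 exists


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with the same primary key value is refused.
duplicate_key_ok = try_insert("INSERT INTO customer VALUES (1, 'Other')")
# An invoice for a customer that does not exist violates the foreign key.
orphan_invoice_ok = try_insert("INSERT INTO invoice_header VALUES (101, 99)")
print(duplicate_key_ok, orphan_invoice_ok)
```

Both inserts should be refused, so the DBMS itself keeps invoices from pointing at customers that are not there.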

Normalization
Let’s begin our discussion of normalization with an example: we want to expand the sample relational table for our mythical college by tracking data for:
  – students
  – courses
  – departments
  – teachers
  – classes (courses offered during a term)
  – teachers assigned to each class
  – students enrolled in each class

Database Normalization Example
We might start off with an entity for each Class that looks something like this
CLASS
  Course Id
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Name
  Student #1 Data
  Student #2 Data
  ….
  Student #30 Data

Database Normalization Example
The information stored for each student would be
CLASS (exploded)
  …
  Student #1 Data
    Id
    Full Name
    Addresses
    Grade for Class
    GPA
  Student #2 Data
    Id
    Full Name
    Addresses
    Grade for Class
    GPA
What problems can you see with this scheme?

Problems with Our Example
We can’t have more than 30 students in a class
There’s lots of duplicate information in our tables
  – This design would require many updates whenever a change was made to data about a department, a teacher, a student, etc.
Does it make sense for us to have to know, for example, a course number in order to look up a teacher’s name?

Problems with Our Example (continued)
Removing a class entity occurrence might remove valuable information from our database
We don’t have any data verification checks
  – We might wind up with inconsistent data across two or more records (is this necessarily bad if we are trying to take snapshots?)

Normalization Goal #1
Remove redundant data
Duplicated data wastes disk space
Duplicated data may not necessarily be consistent, that is, stored in exactly the same way
Redundant data creates problems for our coders
  – Every copy of a value has to be stored (and changed) in exactly the same way everywhere it appears, which is time consuming for the system’s programmers and also takes computer resources once the system is implemented
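The consistency problem with duplicated data can be made concrete with a short Python sketch. It takes three rows in the flat layout of the sample table and groups the teacher name by course and term. The rule that one course in one term has exactly one teacher is an assumption made for this illustration, and the function name is hypothetical.

```python
from collections import defaultdict

# Rows in the flat layout of the sample table:
# (student name, course id, course name, term, teacher)
rows = [
    ("Joe Adams", "CSIS-840", "VB Concepts", "Spr-02", "Wilkins"),
    ("Jane Smith", "CSIS-840", "VB Concepts", "Spr-02", "Wolkins"),
    ("Joe Adams", "CSIS-740", "Oracle Admin", "Spr-03", "Wallace"),
]


def conflicting_teachers(rows):
    """Map each (course id, term) to its teacher spellings, keeping only keys with more than one."""
    seen = defaultdict(set)
    for _student, course_id, _course_name, term, teacher in rows:
        seen[(course_id, term)].add(teacher)
    return {key: sorted(names) for key, names in seen.items() if len(names) > 1}


print(conflicting_teachers(rows))
```

Because the teacher is repeated on every enrollment row, nothing stops two rows from disagreeing. Storing the teacher once per class removes the possibility altogether.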

Normalization Goal #2
Remove dependency issues
It is not intuitive for a user of our new system to look in the CLASS entity to find, for example, a student’s address.
It would probably make more sense to move this information into a separate entity (i.e., a database table that defines students).

Normalization: The Bottom Line
“In summary, normal forms insure that we do not compromise the integrity of our data by either creating false data or destroying true data.”
Ensor & Stevenson

Forms of Normalization
To accomplish these goals, we have created a set of rules which define normal forms or levels.
There are five normal forms, each progressively more restrictive, which are called first normal form (1NF), second normal form (2NF), …
Most database designers only consider the first three forms in their work, as we will
As we shall see, there might be good reasons to deviate from these normal forms

First Normal Form (1NF)
A database is in first normal form (1NF) if each attribute of the database is simple, single-valued (atomic), and does not repeat
  – Let’s assume column definitions are consistent across rows
Method:
  – Reduce all attributes into atomic components
  – Eliminate duplicative columns (repeating groups) and multi-valued attributes from the same table
  – Create a separate table for each group of related data
  – Identify each row with a unique column or set of columns (a primary key)

Our Sample Database
Here’s what our database entity for classes at our college currently looks like
CLASS
  Course Id
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Name
  Student #1 Data
  Student #2 Data
  ….
  Student #30 Data

Our Sample Database in 1NF
We should divide the Course Id into a Department Id and Course Number (e.g., Course Id “CSIS-254” would be divided into Department Id “CSIS”, Course Number “254”)
(Won’t this make the Department Name redundant?)
CLASS
  Department Id (added)
  Course Number (added)
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Last Name
  Student #1 Data
  Student #2 Data
  ….
  Student #30 Data
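Splitting the Course Id into atomic parts can be sketched in a few lines of Python. The function name is hypothetical, and it assumes that every Course Id uses a hyphen between the department and the number, as “CSIS-254” does.

```python
def split_course_id(course_id):
    """Split 'CSIS-254' into ('CSIS', '254') at the first hyphen."""
    department_id, separator, course_number = course_id.partition("-")
    if not separator or not department_id or not course_number:
        raise ValueError(f"unexpected course id format: {course_id!r}")
    return department_id, course_number


print(split_course_id("CSIS-254"))
```

Once the two parts are stored in their own columns, a query can select every course in a department without parsing strings.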

Our Sample Database in 1NF
Next, break out Student Ids, Names, Addresses, and Grades into a separate entity, eliminating the repeating Student groups.
CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
  Student Addresses
  Student Grade for Class
  Student GPA

Our Sample Database in 1NF
We need to break down the Student Names into their simpler components
CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student Addresses
  Student Grade for Class
  Student GPA

Our Sample Database in 1NF
Finally, we need to break out Student e-mail Addresses into another entity, where each occurrence represents a single address
CLASS / STUDENT ADDRESS
  Department Id
  Course Number
  Term Offered
  Student Id
  Address Number or Id
  Student Address

Our Sample Database in 1NF
CLASS
  Department Id
  Course Number
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Last Name
CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student Grade for Class
  Student GPA

Our Sample Database in 1NF
CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student Grade for Class
  Student GPA
CLASS / STUDENT ADDRESS
  Department Id
  Course Number
  Term Offered
  Student Id
  Address Number or Id
  Student Address
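As a rough illustration of removing the repeating Student groups, the Python sketch below turns one wide CLASS record into one CLASS / STUDENT row per student. The dictionary layout, the function name, and the student ids S1 and S2 are all invented for the example.

```python
def explode_students(class_record):
    """Turn one wide CLASS record with student slots into one row per enrolled student."""
    key = (
        class_record["department_id"],
        class_record["course_number"],
        class_record["term"],
    )
    # Each output row repeats the class key and adds one student's data.
    return [key + (s["student_id"], s["grade"]) for s in class_record["students"]]


# Hypothetical wide record; the student ids are made up.
wide_class = {
    "department_id": "CSIS",
    "course_number": "840",
    "term": "Spr-02",
    "students": [
        {"student_id": "S1", "grade": "C"},
        {"student_id": "S2", "grade": "B"},
    ],
}

for row in explode_students(wide_class):
    print(row)
```

The row-per-student form has no fixed number of slots, so the 30-student ceiling of the original design disappears.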

1NF Advantages
Removes limits artificially introduced into a database design by using repeating groups
Ensures that attributes are broken into their most basic units and are not multi-valued

Exercise
Put the following table in 1NF, then draw an ERD for your new system
FAVORITE TV SHOWS
  TV Show Name
  Category
  Main Star Name #1
  Main Star Name #2
  Main Star Name #3
  Day and Time Shown
  Network / Channel
  My Rating (1-10)

One Possible Answer
FAVORITE TV SHOWS
  TV Show Name
  Category
  My Rating (1-10)
SHOW / STARS
  TV Show Name
  Star Number
  Star Name
SHOW TIMES
  TV Show Name
  Slot Number
  Date and Time
  Network / Channel

Second Normal Form (2NF)
2NF implies 1NF by definition
All non-key attributes must be fully dependent on the entire primary key
  – In other words, a non-key attribute cannot depend on only part of the primary key
  – This restriction applies only to tables with composite keys
2NF reduces redundant data in a table by extracting it, placing it in new table(s), then creating relationships between those tables.
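A partial dependency can be spotted mechanically in sample data, as the Python sketch below suggests. For brevity it uses a simplified three-column key (course id, term, student id) instead of the four-column key on the slides, and the rows, student ids, and function names are hypothetical. A check on sample rows can only suggest a dependency; the real rule comes from the business meaning of the data.

```python
from itertools import combinations


def determines(rows, columns, attribute):
    """True if equal values in `columns` always come with the same `attribute` value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in columns)
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


def partial_dependencies(rows, key, attribute):
    """Return the smallest proper subsets of `key` that determine `attribute` in `rows`."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if any(set(f) <= set(subset) for f in found):
                continue  # a smaller subset already explains this one
            if determines(rows, subset, attribute):
                found.append(subset)
    return found


KEY = ("course_id", "term", "student_id")
rows = [
    {"course_id": "CSIS-840", "term": "Spr-02", "student_id": "S1", "last_name": "Adams", "grade": "C"},
    {"course_id": "CSIS-840", "term": "Spr-02", "student_id": "S2", "last_name": "Smith", "grade": "B"},
    {"course_id": "CSIS-824", "term": "Fal-02", "student_id": "S1", "last_name": "Adams", "grade": "B"},
    {"course_id": "CSIS-840", "term": "Fal-02", "student_id": "S1", "last_name": "Adams", "grade": "A"},
]

print(partial_dependencies(rows, KEY, "last_name"))
print(partial_dependencies(rows, KEY, "grade"))
```

In this data the last name follows from the student id alone, which is the signal to move it out of the table, while the grade needs the whole key and stays.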

Second Normal Form (2NF)
Method:
  – Remove subsets of data that appear in multiple rows of a table, and place them into separate tables
  – Create relationships between these new tables and their predecessors through the use of foreign keys.

Our Sample Database in 2NF
We can break out the Department Name from the CLASS entity, as it will be the same for each Class having the same Department
DEPARTMENT
  Department Id
  Department Name

Our Sample Database in 2NF
We also can break out the Course Description from this entity, as it also will be the same for each Class referencing the same Course
COURSE
  Department Id
  Course Number
  Course Description
  Credits / Hours
Note that we’ve kept the Department Id in this entity. Why?

Our Sample Database in 2NF
We can also break out the information about each Teacher, since it also will be the same for each Class that a Teacher conducts, irrespective of the Class
TEACHER
  Teacher Id
  Teacher Last Name

Our Sample Database in 2NF
Our new CLASS / STUDENT entity can also have its student-related attributes (names and GPA) broken out, that is, attributes that do not change with the class number
STUDENT
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student GPA

Our Sample Database in 2NF
Student Addresses are not dependent upon Department Id, Course Number, or Term, so remove those attributes from the e-mail entity
STUDENT ADDRESS
  Department Id (deleted)
  Course Number (deleted)
  Term Offered (deleted)
  Student Id
  Address Number or Id
  Student Address

Our Sample Database in 2NF
Our final CLASS / STUDENT entity, minus all of the attributes that have been moved to other entities, looks like
CLASS / STUDENT
  Department Id
  Course Id
  Student Id
  Term
  Student Grade for Class

2NF and Foreign Keys
To ensure data integrity, we would implement four foreign keys in our CLASS and CLASS / STUDENT entities
  – Department Id must reference a row in DEPARTMENT
  – Course Id must reference a row in COURSE
  – Student Id must reference a row in STUDENT
  – Teacher Id must reference a row in TEACHER
Would we implement a similar restriction on our student address entity?
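One possible reading of the 2NF entities and their foreign keys is sketched below with Python's sqlite3 module instead of Oracle. The table and column names, the composite course key, the extra reference from class_student to class, and the department name are assumptions made for the illustration, not part of the slides.

```python
import sqlite3

SCHEMA = """
CREATE TABLE department (department_id TEXT PRIMARY KEY, department_name TEXT);
CREATE TABLE course (
    department_id      TEXT REFERENCES department (department_id),
    course_number      TEXT,
    course_description TEXT,
    PRIMARY KEY (department_id, course_number)
);
CREATE TABLE teacher (teacher_id TEXT PRIMARY KEY, last_name TEXT);
CREATE TABLE student (student_id TEXT PRIMARY KEY, last_name TEXT);
CREATE TABLE class (
    department_id TEXT,
    course_number TEXT,
    term          TEXT,
    teacher_id    TEXT REFERENCES teacher (teacher_id),
    PRIMARY KEY (department_id, course_number, term),
    FOREIGN KEY (department_id, course_number)
        REFERENCES course (department_id, course_number)
);
CREATE TABLE class_student (
    department_id TEXT,
    course_number TEXT,
    term          TEXT,
    student_id    TEXT REFERENCES student (student_id),
    grade         TEXT,
    PRIMARY KEY (department_id, course_number, term, student_id),
    FOREIGN KEY (department_id, course_number, term)
        REFERENCES class (department_id, course_number, term)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(SCHEMA)

# Illustrative rows; the department name and teacher id are made up.
conn.execute("INSERT INTO department VALUES ('CSIS', 'Computer Science')")
conn.execute("INSERT INTO course VALUES ('CSIS', '840', 'VB Concepts')")
conn.execute("INSERT INTO teacher VALUES ('T1', 'Wilkins')")
conn.execute("INSERT INTO class VALUES ('CSIS', '840', 'Spr-02', 'T1')")

try:
    # Teacher T9 does not exist, so this class row should be refused.
    conn.execute("INSERT INTO class VALUES ('CSIS', '840', 'Fal-02', 'T9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

Note that the teacher row was stored before any class referred to it, which is the 2NF benefit of recording an entity without implying the existence of others.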

2NF Advantages
All advantages of 1NF
Common data is forced to be consistent, since it is stored in only one place in the database
We can store data about separate entities without implying the existence of others
  – In our original database design, we can’t store information about Students, Teachers, or Departments if we don’t have any classes in which they are involved.

Exercise
Convert the following table into 2NF, and draw a new ERD
SALES ORDER
  Order Number
  Customer Account Number
  Customer Account Name
  Customer Address
  Date of Entry of Order
  Date of Requested Shipment
  Item Numbers
  Item Descriptions
  Quantities Ordered
  Unit Prices
  Extended Prices
  Total Order Price

Third Normal Form (3NF)
3NF implies 2NF (which implies 1NF)
A database is in third normal form (3NF) if the data in every column of each row (occurrence) in a table (entity) is dependent ONLY upon the columns in the key
  – In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.
  – This means that derived attributes are not allowed in 3NF

Third Normal Form (3NF)
All attributes depend upon the key, the whole key, and nothing but the key
Method:
  – Remove all derived columns
  – Move all remaining columns not dependent on the key into a new table

Our Sample Database in 3NF
Our STUDENT entity cannot contain a GPA, since that is a derived attribute (the average of all of the Grades received)
STUDENT
  Student Id
  Student Names
    First Name
    Middle Name
    Last Name
  Student GPA (deleted)
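Dropping the stored GPA does not lose information, because it can be recomputed from the grades on demand. The Python sketch below shows one way to do that. The grade-point scale, the decision to skip grades such as W, the unweighted average (credits are ignored), and the table layout and student ids are all assumptions for illustration.

```python
import sqlite3

# Assumed grade-point scale; grades not listed here (such as W) are skipped.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE class_student (student_id TEXT, course_id TEXT, term TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO class_student VALUES (?, ?, ?, ?)",
    [
        ("S1", "CSIS-840", "Spr-02", "C"),
        ("S1", "CSIS-824", "Fal-02", "B"),
        ("S1", "CSIS-740", "Spr-03", "A"),
        ("S2", "CSIS-824", "Spr-02", "W"),
    ],
)


def gpa(student_id):
    """Average the grade points of a student's graded classes, or None if there are none."""
    cursor = conn.execute(
        "SELECT grade FROM class_student WHERE student_id = ?", (student_id,)
    )
    points = [GRADE_POINTS[g] for (g,) in cursor if g in GRADE_POINTS]
    return sum(points) / len(points) if points else None


print(gpa("S1"), gpa("S2"))
```

The value is always consistent with the grades because it is never stored separately; the price is the extra work on every lookup.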

Advantages of 3NF
All advantages of 1NF and 2NF
Information is stored in one and only one place in the database
All entities are now 2-dimensional, non-redundant, and can be implemented in relational tables

Disadvantages of Normalization
Proliferation of tables, resulting in increased system complexity
  – Can be overcome with views for end-users
Performance hits through added tables and lack of derived attributes
  – May be partially offset by reduced computing needs of maintaining data only once
We will discuss these in detail next week...
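The point about views can be sketched as follows, again with Python's sqlite3 module standing in for Oracle. The simplified tables, the view name transcript, and the ids S1 and T1 are invented for the example; the view presents the normalized tables to an end user as the single flat table we started with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (student_id TEXT PRIMARY KEY, last_name TEXT);
    CREATE TABLE teacher (teacher_id TEXT PRIMARY KEY, last_name TEXT);
    CREATE TABLE class (
        course_id TEXT, term TEXT, teacher_id TEXT,
        PRIMARY KEY (course_id, term)
    );
    CREATE TABLE class_student (course_id TEXT, term TEXT, student_id TEXT, grade TEXT);

    INSERT INTO student VALUES ('S1', 'Adams');
    INSERT INTO teacher VALUES ('T1', 'Wallace');
    INSERT INTO class VALUES ('CSIS-740', 'Spr-03', 'T1');
    INSERT INTO class_student VALUES ('CSIS-740', 'Spr-03', 'S1', 'A');

    -- The view hides the joins from the end user.
    CREATE VIEW transcript AS
        SELECT s.last_name AS student, cs.course_id, cs.term, cs.grade,
               t.last_name AS teacher
        FROM class_student cs
        JOIN student s ON s.student_id = cs.student_id
        JOIN class c ON c.course_id = cs.course_id AND c.term = cs.term
        JOIN teacher t ON t.teacher_id = c.teacher_id;
    """
)

rows = conn.execute("SELECT * FROM transcript").fetchall()
print(rows)
```

The joins still run each time the view is queried, so this addresses the complexity concern rather than the performance one.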

Last Slide: Next Week’s Assignment
Draw a complete ERD for our normalized 3NF mythical college database. Does it make sense to you?
Normalize the two organizations / systems that you used in last week’s homework by updating their ERDs (Engineering Method only).
Introduce at least two derived attributes that you might include in your design, and explain why.
Prepare for a quiz next week on what we have covered so far in class:
  – Stages of SDLC, Entities, Attributes, Relationships, Diagramming, and Normalization