Normalization is the gradual and sequential process of efficiently organizing data in a database by following the rules listed in the previous slide

Presentation transcript:

Normalization
Normalization is the gradual, sequential process of efficiently organizing data in a database by following the rules listed in the previous slide.
– Normalization commonly involves working through three normal forms, in order: First, Second, and Third Normal Form (1NF, 2NF, 3NF)
– This is commonly done during the early stages, on UML class diagrams
The goal of normalization is to:
– Eliminate the duplication of data, which makes the database large, inefficient, and slow. Removing duplication in turn prevents data manipulation anomalies and loss of data integrity, where changes made in different places may not agree
– This is done by creating tables, assigning a PK to each table, and making sure that each piece of information shows up only once in the database
Normalization eliminates redundant data (storing the same data in more than one table) and ensures that data dependencies are logical (only related data is stored in a table). It reduces the amount of space a database consumes and ensures data is logically stored.
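
The integrity problem described above can be sketched in a few lines of Python. This is only an illustration: the row layout is hypothetical and the replacement address is made up. It shows how an update applied to one copy of duplicated data leaves the copies in disagreement.

```python
# Hypothetical un-normalized rows: the address is stored once per order,
# so the same fact appears in more than one place.
orders = [
    {"investigator": "John Wayne", "analysis": "XRF",
     "address": "3500 Pacific View Dr, Newport Beach, CA"},
    {"investigator": "John Wayne", "analysis": "Isotopic age",
     "address": "3500 Pacific View Dr, Newport Beach, CA"},
]

# An update that touches only one of the duplicated copies
# (the new address is invented for this example).
orders[0]["address"] = "1 New Street, Atlanta, GA"


def addresses_for(name, rows):
    """Return the set of distinct addresses recorded for one investigator."""
    return {row["address"] for row in rows if row["investigator"] == name}


conflicting = addresses_for("John Wayne", orders)
print(len(conflicting))  # more than one address means the copies disagree
```

If the address were stored once, in its own table, the same update could not leave two versions behind.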

First Normal Form (1NF)
1NF deals with duplicative data across multiple columns. It sets the most basic rules, making sure that separate tables are created for each group of related data (e.g., IsotopicAge, Fold, Rock); i.e., each table should represent a distinct entity.
1. Duplicative (repeating) columns containing the same type of data are removed from the table. There should be no repeated column types such as Mineral1, Mineral2, Mineral3 or cellPhone, homePhone, workPhone; these should go to a new table.
2. All columns must contain a single value, i.e., all attributes must be atomic, not multi-valued. Each cell must hold only one value, e.g., XRF, not "XRF, REE, Isotope".
3. There should be a set of one or more columns that uniquely identifies each row, i.e., there should be a primary key.
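
One possible way to apply rule 1 to the cellPhone, homePhone, workPhone case is sketched below with Python's built-in sqlite3 module. The table names Person and Phone, the column names, and the phone numbers are all assumptions made for this example, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per person; no cellPhone/homePhone/workPhone columns here.
CREATE TABLE Person (
    personID INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
-- The repeating phone columns become rows in their own table.
CREATE TABLE Phone (
    personID  INTEGER NOT NULL REFERENCES Person(personID),
    phoneType TEXT NOT NULL,      -- e.g. 'cell', 'home', 'work'
    number    TEXT NOT NULL,
    PRIMARY KEY (personID, phoneType)
);
""")

conn.execute("INSERT INTO Person VALUES (1, 'Hassan Babaie')")
conn.executemany(
    "INSERT INTO Phone VALUES (?, ?, ?)",
    [(1, "cell", "555-0100"), (1, "work", "555-0101")],  # made-up numbers
)

phone_count = conn.execute(
    "SELECT COUNT(*) FROM Phone WHERE personID = 1"
).fetchone()[0]
print(phone_count)
```

With this layout a person can have any number of phones, and each cell still holds a single atomic value.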

Another example: Analysis table
Investigator     | AnalysisType  | Address
Hassan Babaie    | XRF           | 24 Peachtree Center Ave, Atlanta, GA
John Wayne       | XRF, XRD, REE | 3500 Pacific View Dr, Newport Beach, CA
Elizabeth Tucker | Petrography   | 1100 Angela Ra, Charlotte, NC,
John Wayne       | Isotopic age  | 3500 Pacific View Dr, Newport Beach, CA
Investigators submit their samples to an analyzing company. The company stores the above set of data for its customers.
What are the problems?
– This is not in 1NF
– The AnalysisType column does not represent a distinct entity. We cannot find out how many people ordered an XRF analysis; the values are all mixed together
– The Address column is compound and needs to move out into another table. City depends on zip code
– There is no PK
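
As a rough illustration of the AnalysisType problem, the Python sketch below splits the multi-valued cells from the table above into one atomic value per row. The helper name to_atomic is an assumption made for this example. Once the values are atomic, the "who ordered XRF?" question becomes a simple filter.

```python
# Rows copied from the Analysis table above (address omitted for brevity).
raw_rows = [
    ("Hassan Babaie", "XRF"),
    ("John Wayne", "XRF, XRD, REE"),
    ("Elizabeth Tucker", "Petrography"),
    ("John Wayne", "Isotopic age"),
]


def to_atomic(rows):
    """Split comma-separated analysis types into one (investigator, analysis) pair each."""
    atomic = []
    for investigator, analysis_types in rows:
        for analysis in analysis_types.split(","):
            atomic.append((investigator, analysis.strip()))
    return atomic


atomic_rows = to_atomic(raw_rows)

# With atomic values, counting XRF customers is an exact-match filter.
xrf_investigators = sorted({inv for inv, analysis in atomic_rows if analysis == "XRF"})
print(xrf_investigators)
```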

Second Normal Form (2NF)
2NF deals with redundancy across multiple rows. It further addresses the removal of duplicative data:
– Meet all the requirements of the first normal form (1NF)
– Identify columns whose data repeat in different places, and remove them to their own table
– Formally, no non-key attribute may depend on only part of a composite primary key
In the next slide, we see that the data for Joe Strat is repeated. The solution is to remove the Alum column (with its address and school) into their own tables, called Alum and School. See the next slide for more.

An improved Analysis Table
Investigator     | Analysis1   | Analysis2 | Analysis3 | orders | Address
Hassan Babaie    | XRF         |           |           |        | Department of Geosciences, GSU, Atlanta, GA
John Wayne       | XRF         | XRD       | REE       |        | 3500 Pacific View Dr, Newport Beach, CA
Elizabeth Tucker | Petrography |           |           |        | 1100 Angela Ra, Charlotte, NC,
John Wayne       | Isotopic    |           |           |        | 3500 Pacific View Dr, Newport Beach, CA
Now we can query on the type of analysis. There are still problems with the structure:
– There are still redundancies
– The company can only keep track of three types of analyses; four would not work!
– Address is still compound; it needs to be broken up
– It is difficult to determine the analysis order for each person; order in this case depends on non-PK columns

Better solution
We need to break the table into several tables: Investigator, Analysis, Order, OrderItems, and Address.
Investigator Table
investiID | lastName | firstName | affiliation
1         | Wayne    | John      | ExHollywood
2         | Babaie   | Hassan    | GSU
Analysis Table
AnalysisID | AnalysisType
1          | XRF
2          |
Address Table
Number | Street               | City          | State | zipCode | Country
3500   | Pacific View Dr.     | Newport Beach | CA    | 92662   | USA
24     | Peachtree Center Ave | Atlanta       | GA    | 30303   |

… Order and OrderItem Tables, partially shown
OrderItem Table
OrderItemID | OrderID | AnalysisID | Qty
Order Table
OrderID | InvestiID | OrderDate | DeliveryDate
Row values as they survive in the transcript: 113/5/19604/30/ /17/20133/12/2013

Some improvement
Analysis: AnalysisID, AnalysisType
OrderItem: OrderItemID, OrderID, AnalysisID, Qty
Order: OrderID, InvestID, OrderDate, DeliveryDate
Investigator: InvestID, FirstName
Address: AddressID, Number, Stree …
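
The Investigator, Analysis, Order, and OrderItem tables from the slides above could be declared roughly as follows. This is a minimal sketch using Python's sqlite3 module, not the author's own schema. The column types, the LastName column, the XRD row, the order dates, and the quantities are assumptions added for illustration. "Order" is quoted because ORDER is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Investigator (
    InvestID  INTEGER PRIMARY KEY,
    FirstName TEXT,
    LastName  TEXT
);
CREATE TABLE Analysis (
    AnalysisID   INTEGER PRIMARY KEY,
    AnalysisType TEXT NOT NULL UNIQUE
);
CREATE TABLE "Order" (
    OrderID      INTEGER PRIMARY KEY,
    InvestID     INTEGER NOT NULL REFERENCES Investigator(InvestID),
    OrderDate    TEXT,
    DeliveryDate TEXT
);
CREATE TABLE OrderItem (
    OrderItemID INTEGER PRIMARY KEY,
    OrderID     INTEGER NOT NULL REFERENCES "Order"(OrderID),
    AnalysisID  INTEGER NOT NULL REFERENCES Analysis(AnalysisID),
    Qty         INTEGER NOT NULL
);
""")

conn.executemany("INSERT INTO Investigator VALUES (?, ?, ?)",
                 [(1, "John", "Wayne"), (2, "Hassan", "Babaie")])
conn.executemany("INSERT INTO Analysis VALUES (?, ?)",
                 [(1, "XRF"), (2, "XRD")])  # XRD row is assumed
conn.executemany('INSERT INTO "Order" VALUES (?, ?, ?, ?)',
                 [(1, 1, "2013-02-17", "2013-03-12"),   # illustrative dates
                  (2, 2, "2013-02-20", "2013-03-15")])
conn.executemany("INSERT INTO OrderItem VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 2), (2, 1, 2, 1), (3, 2, 1, 1)])

# How many distinct investigators ordered an XRF analysis?
xrf_customers = conn.execute("""
    SELECT COUNT(DISTINCT o.InvestID)
    FROM OrderItem AS oi
    JOIN "Order"   AS o ON o.OrderID = oi.OrderID
    JOIN Analysis  AS a ON a.AnalysisID = oi.AnalysisID
    WHERE a.AnalysisType = 'XRF'
""").fetchone()[0]
print(xrf_customers)
```

Because each order item is its own row, an order can carry any number of analyses, which removes the three-analysis limit of the earlier table.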

Third Normal Form (3NF)
Third normal form goes one large step further:
– Meet all the requirements of 2NF
– No transitive functional dependencies: remove columns that are not dependent upon the primary key, i.e., columns whose values depend on columns other than the PK
– This means: remove subkeys

3NF, cont'd
There should be no transitive functional dependencies, i.e., no non-key attribute should depend on another non-key attribute. If x → y, i.e., x functionally determines y (y is functionally dependent on x), then given x, we can find y.
– Example: in the Address table, given the nine-digit zip code, we can find the city and state, because they are functionally dependent on the zip code. The opposite is not true: given a city, we cannot find the zip code (note: some cities have several zip codes)
By definition, a superkey (and therefore the primary key) functionally determines all other attributes in the table. The zip code is a subkey (not a superkey) because it only determines the city and state part of the Address table, not the other attributes.
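
A functional dependency x → y can be checked mechanically on sample data: it holds as long as no value of x is paired with two different values of y. The Python sketch below is illustrative only. The function name holds and the third row (a second Atlanta zip code) are assumptions added to show the one-directional nature of the dependency.

```python
def holds(rows, x, y):
    """Return True if attribute x functionally determines attribute y in rows."""
    seen = {}
    for row in rows:
        if row[x] in seen and seen[row[x]] != row[y]:
            return False  # same x value maps to two different y values
        seen.setdefault(row[x], row[y])
    return True


addresses = [
    {"zipCode": "92662", "city": "Newport Beach", "state": "CA"},
    {"zipCode": "30303", "city": "Atlanta", "state": "GA"},
    {"zipCode": "30305", "city": "Atlanta", "state": "GA"},  # assumed extra row
]

zip_to_city = holds(addresses, "zipCode", "city")  # zip determines city
city_to_zip = holds(addresses, "city", "zipCode")  # city does not determine zip
print(zip_to_city, city_to_zip)
```

A check like this can only refute a dependency on the data at hand; whether it holds in general is a statement about the domain, not about one sample.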

To remove a dependency on a subkey, take three steps:
– Remove all the attributes that depend on the subkey from the table (e.g., city and state from the Address table)
– Move them into a new table (e.g., call it ZipLocations, with zipCode, city, and state attributes)
– Keep a copy of the subkey attribute (i.e., zipCode) in the original table as a foreign key
The Address table now has firstName, lastName, street (these three make up the PK), and zipCode (as an FK to the other table).
Summary: subkeys always result in redundant data and must be removed! In other words, remove subsets of data that apply to multiple rows of a table and place them in separate tables, i.e., remove duplicative data.
– For example, break the address into its independent constituents that do not depend on each other
– Create relationships between these new tables and their predecessors through the use of foreign keys
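
The three steps can be sketched with Python's sqlite3 module as follows. The table and column names follow the slide (Address, ZipLocations, zipCode, city, state). The column types and the second address row are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Steps 1 and 2: city and state live only in the new table, keyed by zipCode.
CREATE TABLE ZipLocations (
    zipCode TEXT PRIMARY KEY,
    city    TEXT NOT NULL,
    state   TEXT NOT NULL
);
-- Step 3: the original table keeps zipCode as a foreign key.
CREATE TABLE Address (
    firstName TEXT NOT NULL,
    lastName  TEXT NOT NULL,
    street    TEXT NOT NULL,
    zipCode   TEXT NOT NULL REFERENCES ZipLocations(zipCode),
    PRIMARY KEY (firstName, lastName, street)
);
""")

conn.execute("INSERT INTO ZipLocations VALUES ('92662', 'Newport Beach', 'CA')")
conn.executemany(
    "INSERT INTO Address VALUES (?, ?, ?, ?)",
    [
        ("John", "Wayne", "3500 Pacific View Dr.", "92662"),
        ("Jane", "Doe", "1 Example St.", "92662"),  # hypothetical second resident
    ],
)

# City and state are recovered through the foreign key, not stored per address.
rows = conn.execute("""
    SELECT a.firstName, a.lastName, z.city, z.state
    FROM Address AS a
    JOIN ZipLocations AS z ON z.zipCode = a.zipCode
    ORDER BY a.lastName
""").fetchall()
print(rows)
```

Both addresses share one ZipLocations row, so the city and state for 92662 are stored exactly once.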

Fourth Normal Form (4NF)
Normalizing a database to 3NF is usually sufficient. Fourth normal form (4NF) has one additional requirement:
– Meet all the requirements of the third normal form
– A relation is in 4NF if it has no non-trivial multi-valued dependencies other than those determined by a key
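
A multi-valued dependency arises when one table records two independent sets of values for the same key, so every combination has to be stored. The Python sketch below is a hypothetical illustration: the idea that an investigator has several analyses and several affiliations, and the second affiliation value, are assumptions, not data from the slides. It shows that splitting the relation into two loses nothing.

```python
from itertools import product

# Two independent multi-valued facts about one investigator (illustrative data).
analyses = {"John Wayne": {"XRF", "XRD"}}
affiliations = {"John Wayne": {"ExHollywood", "GSU"}}

# One combined relation must hold every analysis/affiliation combination.
combined = {
    (inv, analysis, affiliation)
    for inv in analyses
    for analysis, affiliation in product(analyses[inv], affiliations[inv])
}

# 4NF-style decomposition: one relation per independent fact.
investigator_analysis = {(inv, analysis) for inv, analysis, _ in combined}
investigator_affiliation = {(inv, affiliation) for inv, _, affiliation in combined}

# Joining the two smaller relations on investigator rebuilds the original.
rejoined = {
    (inv, analysis, affiliation)
    for inv, analysis in investigator_analysis
    for inv2, affiliation in investigator_affiliation
    if inv == inv2
}
print(len(combined), len(investigator_analysis), len(investigator_affiliation))
```

The combined relation grows multiplicatively with each added value, while the two decomposed relations grow additively.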