

Normalisation RELATIONAL DATABASES

INTRODUCTION

- Last week we looked at the elements of designing a database and generating an ERD.
- Designing and generating an ERD is an iterative cycle: generate the ERD, apply normalisation, adjust the ERD, check against the business rules, and so on.
- This week we will look at the process of normalising the data:
  - What normalisation is
  - Why it is needed
  - 3NF and beyond

DATABASE NORMALISATION

- The process of restructuring the logical data model of a database
- Removes redundant and repeating data from tables
- Improves storage efficiency, data integrity and scalability
- Supported by the relational model
- The degree of normalisation of a database is expressed as a normal form (NF)
- Achieved by applying a series of rules / methods
- Generally involves splitting existing tables into multiple tables and then re-connecting them through joins when a query needs to pull the data together

THE HISTORY

- Proposed by Edgar F. Codd in the paper "A Relational Model of Data for Large Shared Data Banks" (1970):
  "there is, in fact, a very simple elimination procedure which we shall call normalisation. Through decomposition non-simple domains are replaced by domains whose elements are atomic (non-decomposable) values"
- Normalising data is now standard practice in the relational database world.
- It aims to optimise both data input and retrieval, and it supports the relational model.
- It is NOT normally applied to data warehouses, or to databases that deviate from the traditional relational model in their implementation.
- Codd established three normal forms. Others followed, but 3NF is considered sufficient for most applications.
- BCNF (Boyce-Codd normal form) is a slightly stronger version of 3NF.

WHY NORMALISE?

- Non-normalised databases experience data anomalies.
- They may store the same data in multiple locations. If the data is updated in some but not all locations, an UPDATE ANOMALY occurs.
  - Normalised data is stored in one location and linked via a FOREIGN KEY.
- They may have inappropriate dependencies. Adding data to this type of database requires first adding unrelated data that it depends on: an INSERTION ANOMALY.
  - Normalisation prevents insertion anomalies by ensuring that each relation/record mirrors the functional dependencies.
- It may not be possible to delete data without also deleting data you want to keep, because all the data is clumped together: a DELETION ANOMALY.
- Normalisation uniquely identifies records through keys, with no extraneous information.

NORMAL FORMS

- Un-normalised data is simply a list of the data elements in one clump.
- First normal form (1NF) requires that each record is identified by a primary key and is made up of atomic values / attributes.
- Second and third normal forms (2NF, 3NF) deal with the relationship of non-key attributes to the primary key.
- Third normal form is usually classed as fully normalised and can be 'tweaked' to get to BCNF.
- Fourth and fifth normal forms (4NF, 5NF) deal specifically with the representation of many-to-many and one-to-many relationships (formally, multi-valued and join dependencies).
- Sixth normal form (6NF) is mainly applied to temporal databases.

ILLUSTRATION

| Title | Author 1 | Author 2 | ISBN | Subject | Pages | Publisher |
|---|---|---|---|---|---|---|
| Database Systems: the complete book | Hector Garcia-Molina | Jeffrey D Ullman | X | Databases, Computers | 1152 | Pearson |
| Database Design for mere mortals | Michael J Hernandez | | | Computers, Databases | 672 | Addison Wesley |
| SQL queries for mere mortals | John L Viescas | Michael J Hernandez | | Databases, SQL | 672 | Addison Wesley |

- This table is not very efficient with storage: you need a column/attribute for every author, and some books have 4 or 5.
- The design does not protect data integrity.
- The table will not scale well.

FIRST NORMAL FORM

- All data values should be atomic.
- Every column cell should hold a single value rather than a composite value or a set of objects / values.
- The illustration table breaks both rules: the Subject cells hold lists such as "Databases, Computers", and Author 1 / Author 2 is a repeating group.

FIRST NORMAL FORM (1NF)

- The 2nd author attribute has been removed.
- The row is duplicated with a different author so that no data is lost.
- The row is duplicated for each subject classification.

| Title | Author | ISBN | Subject | Pages | Publisher |
|---|---|---|---|---|---|
| Database Systems: the complete book | Hector Garcia-Molina | X | Databases | 1152 | Pearson |
| Database Systems: the complete book | Jeffrey D Ullman | X | Computers | 1152 | Pearson |
| SQL queries for mere mortals | John L Viescas | | SQL | 672 | Addison Wesley |
| SQL queries for mere mortals | Michael J Hernandez | | Databases | 672 | Addison Wesley |
| Database Design for mere mortals | Michael J Hernandez | | Databases | 672 | Addison Wesley |
| Database Design for mere mortals | Michael J Hernandez | | Computers | 672 | Addison Wesley |

Slide annotations: "2 records to split the Author" and "2 records to split the subject".

Problems:

- INSERT ANOMALIES: we cannot add a new Author without a Book, etc.
- UPDATE ANOMALIES: we cannot change the publisher of 'Database Design for mere mortals' in one place; we have to change 2 rows.
- DELETE ANOMALIES: if we remove 'SQL queries for mere mortals' we remove the SQL subject as well.
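The update anomaly described on this slide can be reproduced with a minimal sketch using Python's built-in `sqlite3` module. The table name `book_1nf` and the replacement publisher name are assumptions made up for the illustration; only the two 'Database Design for mere mortals' rows come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns follow the 1NF slide (ISBN omitted).
conn.execute(
    "CREATE TABLE book_1nf ("
    "title TEXT, author TEXT, subject TEXT, pages INTEGER, publisher TEXT)"
)
title = "Database Design for mere mortals"
conn.executemany(
    "INSERT INTO book_1nf VALUES (?, ?, ?, ?, ?)",
    [
        (title, "Michael J Hernandez", "Databases", 672, "Addison Wesley"),
        (title, "Michael J Hernandez", "Computers", 672, "Addison Wesley"),
    ],
)

# Change the publisher on only ONE of the two rows.
# 'New Publisher Ltd' is an invented value for the demonstration.
conn.execute(
    "UPDATE book_1nf SET publisher = 'New Publisher Ltd' "
    "WHERE title = ? AND subject = 'Databases'",
    (title,),
)

# The same book now reports two different publishers: an update anomaly.
publishers = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT publisher FROM book_1nf WHERE title = ? "
        "ORDER BY publisher",
        (title,),
    )
]
print(publishers)
```

Because the publisher is stored once per row rather than once per book, nothing in the schema stops the two rows from disagreeing.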

SPLITTING THE TABLE - PROBLEMS

- The table above may be in 1NF but it violates 2NF.
- A better solution is to split the data into separate tables:
  - Author
  - Subject
  - Book
- Functional dependencies need to be considered when deciding how to split.

FUNCTIONAL DEPENDENCIES

- Redundancy is caused by undesirable functional dependencies (FDs).
- A functional dependency is a link between 2 sets of attributes, in which one set of attributes determines another.
- E.g. if we have the student ID then we can find out all the student details. The attribute 'student ID' gives us all the values in the 'student' table, whichever table holds the 'student ID' attribute.
- Normalising to 2NF removes undesirable FDs.
- Split the tables and then add the dependencies.
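As a rough illustration of "one set of attributes determines another", the sketch below checks whether a dependency holds in a sample of rows. The helper name `fd_holds` is hypothetical, and Title stands in for the ISBN as the book identifier. A check like this can only show that the sample data does not contradict a dependency; whether the dependency really holds is a business rule.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if, in `rows`, equal determinant values always
    come with equal dependent values (determinant -> dependent)."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = tuple(row[attr] for attr in dependent)
        # setdefault stores the first value seen for this key and
        # returns it, so any later mismatch is detected here.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows taken from the 1NF table (ISBN and Subject omitted).
rows = [
    {"title": "Database Systems: the complete book",
     "author": "Hector Garcia-Molina", "pages": 1152, "publisher": "Pearson"},
    {"title": "Database Systems: the complete book",
     "author": "Jeffrey D Ullman", "pages": 1152, "publisher": "Pearson"},
    {"title": "SQL queries for mere mortals",
     "author": "John L Viescas", "pages": 672, "publisher": "Addison Wesley"},
    {"title": "SQL queries for mere mortals",
     "author": "Michael J Hernandez", "pages": 672, "publisher": "Addison Wesley"},
]

title_determines_publisher = fd_holds(rows, ["title"], ["publisher"])
title_determines_author = fd_holds(rows, ["title"], ["author"])
print(title_determines_publisher, title_determines_author)
```

Here the book determines its publisher and page count but not its author, which is why the author data has to move to a table of its own.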

1 TABLE INTO 3

- The data is split into 3 tables.
- We have added an identifier to the subject and author tables.
- There needs to be a PRIMARY KEY in each table.
  - It uniquely identifies each record in the table.
- We don't need to add a PK to the book table, as it has the ISBN, which is unique.

SUBJECT

| Subject ID | Subject |
|---|---|
| 1 | SQL |
| 2 | Databases |
| 3 | Computers |

AUTHOR

| Author ID | Lastname | Forename |
|---|---|---|
| 1 | Garcia-Molina | Hector |
| 2 | Ullman | Jeffrey |
| 3 | Viescas | John |
| 4 | Hernandez | Michael |

BOOK

| Title | ISBN | Pages | Publisher |
|---|---|---|---|
| Database Systems: the complete book | X | 1152 | Pearson |
| SQL queries for mere mortals | | 672 | Addison Wesley |
| Database Design for mere mortals | | 672 | Addison Wesley |
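The three tables could be declared along the following lines. This is a sketch using Python's `sqlite3`, not the schema from the lecture: the lowercase table and column names are assumptions, and the `ISBN-1` style values are placeholders rather than real ISBNs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subject (
        subject_id INTEGER PRIMARY KEY,  -- added identifier
        subject    TEXT NOT NULL
    );
    CREATE TABLE author (
        author_id INTEGER PRIMARY KEY,   -- added identifier
        lastname  TEXT NOT NULL,
        forename  TEXT NOT NULL
    );
    CREATE TABLE book (
        isbn      TEXT PRIMARY KEY,      -- natural key, already unique
        title     TEXT NOT NULL,
        pages     INTEGER,
        publisher TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO subject VALUES (?, ?)",
    [(1, "SQL"), (2, "Databases"), (3, "Computers")],
)
conn.executemany(
    "INSERT INTO author VALUES (?, ?, ?)",
    [(1, "Garcia-Molina", "Hector"), (2, "Ullman", "Jeffrey"),
     (3, "Viescas", "John"), (4, "Hernandez", "Michael")],
)
conn.execute(
    "INSERT INTO book VALUES (?, ?, ?, ?)",
    ("ISBN-1", "Database Systems: the complete book", 1152, "Pearson"),
)

# The primary key rejects a second record with an existing identifier.
try:
    conn.execute("INSERT INTO subject VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

subject_count = conn.execute("SELECT COUNT(*) FROM subject").fetchone()[0]
print(duplicate_rejected, subject_count)
```

The failed insert shows what "uniquely identifies each record" means in practice: the database itself refuses a second row with the same key.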

DEFINING THE RELATIONSHIPS

- An author may have written many books, and a book may have many authors: this is a many-to-many relationship.
- This is not ideal and needs to be replaced with an interlink table.

[ERD: Book and Author joined by the relationship 'writes', then redrawn with the interlink table BookAuthors between them]

Interlink tables:

- BookAuthors (ISBN, Author ID)
- BookSubject (ISBN, Subject ID)
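A minimal sketch of the interlink table, again with `sqlite3`. The names `book_authors`, `isbn` and `author_id` are assumed spellings of the slide's BookAuthors, ISBN and Author ID, and the ISBN values are placeholders. The composite primary key means each book/author pair can be recorded only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (
        author_id INTEGER PRIMARY KEY,
        lastname  TEXT NOT NULL
    );
    CREATE TABLE book (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- Interlink table: one row per (book, author) pair.
    CREATE TABLE book_authors (
        isbn      TEXT    REFERENCES book(isbn),
        author_id INTEGER REFERENCES author(author_id),
        PRIMARY KEY (isbn, author_id)
    );
    """
)
conn.executemany(
    "INSERT INTO author VALUES (?, ?)",
    [(1, "Garcia-Molina"), (2, "Ullman"), (3, "Viescas"), (4, "Hernandez")],
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?)",
    [("ISBN-1", "Database Systems: the complete book"),
     ("ISBN-2", "SQL queries for mere mortals"),
     ("ISBN-3", "Database Design for mere mortals")],
)
conn.executemany(
    "INSERT INTO book_authors VALUES (?, ?)",
    [("ISBN-1", 1), ("ISBN-1", 2), ("ISBN-2", 3), ("ISBN-2", 4), ("ISBN-3", 4)],
)

# Joins pull the split data back together: all titles by one author.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT b.title FROM book AS b "
        "JOIN book_authors AS ba ON ba.isbn = b.isbn "
        "JOIN author AS a ON a.author_id = ba.author_id "
        "WHERE a.lastname = ? ORDER BY b.title",
        ("Hernandez",),
    )
]
print(titles)
```

The same pattern applies to BookSubject, with a subject identifier in place of the author identifier.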

SECOND NORMAL FORM (2NF)

- First normal form deals with redundant data across the horizontal row.
- Second normal form deals with redundancy of data in vertical columns.
- Normal forms are progressive: to reach second normal form the data must already be in first normal form.

Book

| Title | ISBN | Pages | Publisher |
|---|---|---|---|
| Database Systems: the complete book | X | 1152 | Pearson |
| SQL queries for mere mortals | | 672 | Addison Wesley |
| Database Design for mere mortals | | 672 | Addison Wesley |

The duplicated and split elements of author and subject have been removed, but the publisher is still duplicated. Publisher data should be held separately: remove Publisher and place it in a separate table.

SECOND NORMAL FORM

- Data pertaining to the publisher is extracted and held in a different table.
- This allows the data to be maintained separately.
- If the name changes, the address moves, etc., you update the PUBLISHER table rather than every affected record in the book table.

Publisher

| Publisher ID | Publisher | Location |
|---|---|---|
| 1 | Pearson | London |
| 2 | Addison Wesley | New York |

Book

| Title | ISBN | Pages | Publisher |
|---|---|---|---|
| Database Systems: the complete book | X | 1152 | 1 |
| SQL queries for mere mortals | | 672 | 2 |
| Database Design for mere mortals | | 672 | 2 |

A separate table allows additional data to be held centrally.

SECOND NORMAL FORM

- The relationship between publisher and book is one to many.
  - A book has only one publisher.
  - A publisher may publish many books, but it will publish at least 1.
- There needs to be a link between the book and the publisher: a foreign key.
- In 2NF, a table with a composite key cannot hold any data that does not relate to all portions of the composite key.
- No obscure data: all data must relate to that table or be part of the link key.

[ERD: Publisher 'publishes' Book, one to many]

This notation indicates that a book has one publisher but a publisher has many books (and at least 1). The ERD also indicates that each book must be published by exactly one publisher.
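The foreign key link between book and publisher might look like this in `sqlite3`. It is a sketch under assumptions: the column names, the placeholder ISBNs and the new publisher name are invented for the example. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(
    """
    CREATE TABLE publisher (
        publisher_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        location     TEXT
    );
    CREATE TABLE book (
        isbn         TEXT PRIMARY KEY,
        title        TEXT NOT NULL,
        pages        INTEGER,
        -- NOT NULL: every book must have exactly one publisher.
        publisher_id INTEGER NOT NULL REFERENCES publisher(publisher_id)
    );
    """
)
conn.executemany(
    "INSERT INTO publisher VALUES (?, ?, ?)",
    [(1, "Pearson", "London"), (2, "Addison Wesley", "New York")],
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?, ?)",
    [("ISBN-1", "Database Systems: the complete book", 1152, 1),
     ("ISBN-2", "SQL queries for mere mortals", 672, 2),
     ("ISBN-3", "Database Design for mere mortals", 672, 2)],
)

# One update in PUBLISHER is seen by every book that links to it.
# 'Addison Wesley Group' is an invented name for the demonstration.
conn.execute(
    "UPDATE publisher SET name = 'Addison Wesley Group' WHERE publisher_id = 2"
)
names = [
    row[0]
    for row in conn.execute(
        "SELECT p.name FROM book AS b "
        "JOIN publisher AS p ON p.publisher_id = b.publisher_id "
        "WHERE b.publisher_id = 2"
    )
]

# A book that points at a publisher that does not exist is rejected.
try:
    conn.execute("INSERT INTO book VALUES ('ISBN-4', 'Orphan book', 100, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(names, orphan_rejected)
```

The "at least 1 book per publisher" side of the relationship is not enforced by this schema; a foreign key only guarantees that every book points at an existing publisher.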

THIRD NORMAL FORM

- 3NF requires that there are no functional dependencies between non-key attributes; the only remaining dependencies are on data in other tables, reached via a FK.
- A table is in 3NF if all of its non-primary-key attributes are mutually independent.
- Link via a FK: do not hold data in a table that can be sectioned off elsewhere.
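To see why sectioning data off loses nothing, the sketch below starts from a book list in which the publisher's location depends on the publisher rather than on the book, splits it in two, and joins it back. The helper `project` is hypothetical, the ISBNs are placeholders, and the publisher name is used as the link for brevity where the slides use a Publisher ID.

```python
def project(rows, attrs):
    """Keep only `attrs` from each row and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {attr: row[attr] for attr in attrs}
        if projected not in result:
            result.append(projected)
    return result


# location depends on publisher, a non-key attribute, so this
# single table is not in 3NF.
books = [
    {"isbn": "ISBN-1", "title": "Database Systems: the complete book",
     "publisher": "Pearson", "location": "London"},
    {"isbn": "ISBN-2", "title": "SQL queries for mere mortals",
     "publisher": "Addison Wesley", "location": "New York"},
    {"isbn": "ISBN-3", "title": "Database Design for mere mortals",
     "publisher": "Addison Wesley", "location": "New York"},
]

# Section the dependent data off into its own relation.
book_3nf = project(books, ["isbn", "title", "publisher"])
publisher = project(books, ["publisher", "location"])

# Joining on the linking attribute rebuilds the original rows.
rejoined = [
    {**b, **p}
    for b in book_3nf
    for p in publisher
    if b["publisher"] == p["publisher"]
]
print(len(publisher), rejoined == books)
```

Each location is now stored once per publisher instead of once per book, and the join still recovers every original row.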