Structuring Data: A Study of Primary Keys and Foreign Keys, Normalisation, and Duplicated vs Redundant Data


Rows and Columns
A “Row” in a database represents a thing, so a row in the Student table represents a student.
A “Column” (also called an “Attribute”) holds one specific piece of data for every row. A column in the Student table might contain student names.
The intersection of a row and a column contains a single piece of data. This is important: no lists allowed! We’ll see more details on this later.
Each row in a table must be distinct. This means that a row is always identifiable by some combination of its attributes.

A Relation is a two-dimensional table or array.
A Row is a record.
A Column is a data item, field or attribute.

For a Table to be a Relation:
The cells of the table are single valued, i.e. no repeating groups are allowed.
All entries in a column must be of the same kind. Example: Student Id.
The order of the columns is unimportant.
The order of the rows is unimportant.
No two rows of the table can be identical.
Each column must have a unique name.
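The rules above can be sketched as a small check. This is only an illustration: the function name `relation_problems` and the "Maths" subject are made up for the example, and the rule that all entries in a column are of the same kind is not checked here.

```python
def relation_problems(columns, rows):
    """Return the reasons (if any) why a table fails the rules for a relation."""
    problems = []
    # Each column must have a unique name.
    if len(set(columns)) != len(columns):
        problems.append("duplicate column name")
    for row in rows:
        # Cells must be single valued: no lists or other collections.
        if any(isinstance(cell, (list, tuple, set, dict)) for cell in row):
            problems.append("multi-valued cell")
            break
    # No two rows may be identical; repr() lets us compare rows that hold lists.
    seen = [repr(tuple(row)) for row in rows]
    if len(set(seen)) != len(seen):
        problems.append("identical rows")
    return problems


students = [(1, "John"), (2, "Jane"), (3, "Bob")]
print(relation_problems(["Student No", "Name"], students))  # []

# Hypothetical bad table: repeated column name, a list in a cell, a repeated row.
bad = [(1, ["Econ", "Maths"]), (1, ["Econ", "Maths"])]
print(relation_problems(["Student No", "Student No"], bad))
```

The first call reports no problems; the second reports all three.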

Keys:
A Primary Key is a group of one or more attributes (columns) that uniquely identifies a row.
In theory a key can be made up of all the attributes in a table. In practice we will generally use an ID on each table.
A Foreign Key is a column which exists in table B, while also being the Primary Key of table A.

Student Table (Primary Key: Student No)
Student No  Name  DoB
1           John
2           Jane
3           Bob

Enrolment Table (Foreign Key: Student No)
Enrolment ID  Subject Name  Academic Year  Student No
1             Econ
              Econ
              Econ
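As a rough sketch of how a database enforces these two kinds of key, the example below builds the Student and Enrolment tables in an in-memory SQLite database. The column names, the helper `rejected`, and the choice of which student is enrolled are assumptions made for illustration; the DoB and Academic Year columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.execute("CREATE TABLE student (student_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE enrolment ("
    " enrolment_id INTEGER PRIMARY KEY,"
    " subject_name TEXT,"
    " student_no INTEGER REFERENCES student(student_no))"
)
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "John"), (2, "Jane"), (3, "Bob")])
conn.execute("INSERT INTO enrolment VALUES (1, 'Econ', 1)")  # assumed: John takes Econ


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A second student with Student No 1 breaks the primary key.
duplicate_key = rejected("INSERT INTO student VALUES (1, 'Another John')")
# An enrolment for Student No 99 points at a student who does not exist.
dangling_reference = rejected("INSERT INTO enrolment VALUES (2, 'Econ', 99)")
print(duplicate_key, dangling_reference)
```

Both statements are refused: the primary key keeps rows distinct, and the foreign key keeps every enrolment attached to a real student.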

Anomalies

ID    Name    Dept No  Dept Name
1000  Black   610      Personnel
1001  White   640      Admin
1004  Yellow  710      Manufacturing
1005  Green   710      Manufacturing

Deletion Anomaly:
If we delete row 1 (1000, Black) we remove data about the employee, but we also lose information about the Personnel department.

Insertion Anomaly:
If we want to insert a new department, but we don’t have any employees in that department, we are prevented from doing so.

Modification Anomaly:
If we make a change to a department name (e.g. “Manufacturing” becomes “Engineering”) we must repeat this change in every row with Dept No 710. This is prone to errors.

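Employee table (Primary Key: ID, Foreign Key: Dept No)
ID    Name    Dept No
1000  Black   610
1001  White   640
1004  Yellow  710
1005  Green   710

Department table (Primary Key: Dept No)
Dept No  Dept Name
610      Personnel
640      Admin
710      Manufacturing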
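A short sketch, using an in-memory SQLite database, of how splitting the data into an employee table and a department table avoids the three anomalies. The table and column names are assumptions, and the "Research" department (Dept No 800) is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT,"
    " dept_no INTEGER REFERENCES department(dept_no))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(610, "Personnel"), (640, "Admin"), (710, "Manufacturing")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1000, "Black", 610), (1001, "White", 640),
                  (1004, "Yellow", 710), (1005, "Green", 710)])

# Deletion: removing Black no longer removes the Personnel department.
conn.execute("DELETE FROM employee WHERE id = 1000")
personnel_rows = conn.execute(
    "SELECT COUNT(*) FROM department WHERE dept_no = 610").fetchone()[0]

# Insertion: a department can exist before it has any employees.
conn.execute("INSERT INTO department VALUES (800, 'Research')")  # hypothetical department

# Modification: the rename touches exactly one row.
renamed = conn.execute(
    "UPDATE department SET dept_name = 'Engineering' WHERE dept_no = 710").rowcount

names = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e"
    " JOIN department d ON d.dept_no = e.dept_no ORDER BY e.id").fetchall()
print(personnel_rows, renamed, names)
```

After the single-row update, the join shows both Yellow and Green in "Engineering", so the two rows cannot drift apart.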

Fixing a poorly conceived database design: almost certainly an exam question!

OrderID  ProductID  Product Description  Customer Name   Customer Address
1        1          Black Ink            John Smith      123 Fake Street
1        2          Cyan Ink             John Smith      123 Fake Street
1        3          Magenta Ink          John Smith      123 Fake Street
2        1          Black Ink            Jane Doe        234 Main Street
2        3          Magenta Ink          Jane Doe        234 Main Street
2        4          Yellow Ink           Jane Doe        234 Main Street
2        5          Grey Ink             Jane Doe        234 Main Street
3        5          Grey Ink             John Smith      123 Fake Street
4        2          Cyan Ink             Steven Hancock  10 Schoolhouse Lane
5        3          Magenta Ink          Eva May         15 Main Street
5        1          Black Ink            Eva May         15 Main Street

Product ID  Product Description
1           Black Ink
2           Cyan Ink
3           Magenta Ink
4           Yellow Ink
5           Grey Ink

Order ID  Product ID
1         1
1         2
1         3
2         1
2         3
2         4
2         5
3         5
4         2
5         3
5         1

Customer Name   Customer Address     Customer ID
John Smith      123 Fake Street      1
Jane Doe        234 Main Street      2
Steven Hancock  10 Schoolhouse Lane  3
Eva May         15 Main Street       4

OrderID  Customer ID
1        1
2        2
3        1
4        3
5        4

First Normal Form
Remove repeating attributes, that is, “a data field which may occur for multiple values for a single value of the key” (we’re getting rid of “lists of items”).
Move these to a new table, with a copy of the key.

Order ID  Product ID  Product Description
1         1           Black Ink
1         2           Cyan Ink
1         3           Magenta Ink
2         1           Black Ink
2         3           Magenta Ink
2         4           Yellow Ink
2         5           Grey Ink
3         5           Grey Ink
4         2           Cyan Ink
5         3           Magenta Ink
5         1           Black Ink

Order ID  Customer Name   Customer Address     Customer ID
1         John Smith      123 Fake Street      1
1         John Smith      123 Fake Street      1
1         John Smith      123 Fake Street      1
2         Jane Doe        234 Main Street      2
2         Jane Doe        234 Main Street      2
2         Jane Doe        234 Main Street      2
2         Jane Doe        234 Main Street      2
3         John Smith      123 Fake Street      1
4         Steven Hancock  10 Schoolhouse Lane  3
5         Eva May         15 Main Street       4
5         Eva May         15 Main Street       4
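To illustrate what "getting rid of lists of items" means, here is a minimal sketch. It assumes, purely for illustration, that each order was originally stored with a list of product IDs; the order and product numbers are taken from the order table.

```python
# Unnormalised view: each order carries a list of product IDs (a repeating group).
orders = {
    1: [1, 2, 3],
    2: [1, 3, 4, 5],
    3: [5],
    4: [2],
    5: [3, 1],
}

# First normal form: one row per (order, product), so every cell holds one value.
order_lines = [
    (order_id, product_id)
    for order_id, product_ids in orders.items()
    for product_id in product_ids
]

print(len(order_lines))  # 11
print(order_lines[:3])   # [(1, 1), (1, 2), (1, 3)]
```

The 11 resulting rows match the 11 rows of the Order ID / Product ID table.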

Second Normal Form
Take each non-key attribute and check whether it is dependent on only one part of the key.
If it is, move it to a new table.

Product ID  Product Description
1           Black Ink
2           Cyan Ink
3           Magenta Ink
4           Yellow Ink
5           Grey Ink

Order ID  Product ID
1         1
1         2
1         3
2         1
2         3
2         4
2         5
3         5
4         2
5         3
5         1

Customer Name   Customer Address     Customer ID
John Smith      123 Fake Street      1
Jane Doe        234 Main Street      2
Steven Hancock  10 Schoolhouse Lane  3
Eva May         15 Main Street       4

OrderID  Customer ID
1        1
2        2
3        1
4        3
5        4
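The check "is this attribute dependent on only one part of the key?" can be tried out on the sample rows. The helper `determines` below is a hypothetical name, and the sketch has a limit worth stating: sample data can show that a dependency does not hold, but it can only suggest, not prove, that one does.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal values of `lhs` always go with equal values of `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("order_id", "product_id", "description", "customer")
data = [
    (1, 1, "Black Ink", "John Smith"), (1, 2, "Cyan Ink", "John Smith"),
    (1, 3, "Magenta Ink", "John Smith"), (2, 1, "Black Ink", "Jane Doe"),
    (2, 3, "Magenta Ink", "Jane Doe"), (2, 4, "Yellow Ink", "Jane Doe"),
    (2, 5, "Grey Ink", "Jane Doe"), (3, 5, "Grey Ink", "John Smith"),
    (4, 2, "Cyan Ink", "Steven Hancock"), (5, 3, "Magenta Ink", "Eva May"),
    (5, 1, "Black Ink", "Eva May"),
]
rows = [dict(zip(columns, r)) for r in data]

# The key of the order table is (order_id, product_id).
print(determines(rows, ["product_id"], ["description"]))  # True: depends on one part of the key
print(determines(rows, ["order_id"], ["customer"]))       # True: depends on the other part
print(determines(rows, ["product_id"], ["customer"]))     # False
print(determines(rows, ["order_id"], ["description"]))    # False
```

The description follows the product alone and the customer follows the order alone, which is why each moves to its own table.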

Third Normal Form
If a non-key attribute depends on another non-key attribute rather than directly on the key, move both to a new table.
Leave a copy of the non-key attribute on which it depends in the original table and mark it as a foreign key.
(Not relevant to this example. We’ll see this in the next example.)

Project Code  Project Title     Project Manager  Project Budget  Employee No  Employee Name  Department No  Department Name  Hourly Rate
PC010         Pensions Systems  M Phillips       € 24,500.00     S10001       A Smith        L004           IT               €
PC010         Pensions Systems  M Phillips       € 24,500.00     S10030       L Jones        L023           Pensions         €
PC010         Pensions Systems  M Phillips       € 24,500.00     S21010       P Lewis        L004           IT               €
PC045         Salaries Systems  H Martin         € 17,400.00     S10010       B Jones        L004           IT               €
PC045         Salaries Systems  H Martin         € 17,400.00     S10001       A Smith        L004           IT               €
PC045         Salaries Systems  H Martin         € 17,400.00     S31002       T Gilbert      L028           Database         €
PC045         Salaries Systems  H Martin         € 17,400.00     S13210       W Richards     L008           Salary           €
PC064         HR Systems        K Lewis          € 12,250.00     S31002       T Gilbert      L028           Database         €
PC064         HR Systems        K Lewis          € 12,250.00     S21010       P Lewis        L004           IT               €
PC064         HR Systems        K Lewis          € 12,250.00     S10034       B James        L009           HR               € 16.50

First Normal Form
Remove repeating attributes, that is, “a data field which may occur for multiple values for a single value of the key” (we’re getting rid of “lists of items”).
Move these to a new table, with a copy of the key.

Project Code  Project Title     Project Manager  Project Budget
PC010         Pensions Systems  M Phillips       € 24,500.00
PC045         Salaries Systems  H Martin         € 17,400.00
PC064         HR Systems        K Lewis          € 12,250.00

Project Code  Employee No  Employee Name  Department No  Department Name  Hourly Rate
PC010         S10001       A Smith        L004           IT               €
PC010         S10030       L Jones        L023           Pensions         €
PC010         S21010       P Lewis        L004           IT               €
PC045         S10010       B Jones        L004           IT               €
PC045         S10001       A Smith        L004           IT               €
PC045         S31002       T Gilbert      L028           Database         €
PC045         S13210       W Richards     L008           Salary           €
PC064         S31002       T Gilbert      L028           Database         €
PC064         S21010       P Lewis        L004           IT               €
PC064         S10034       B James        L009           HR               € 16.50

Second Normal Form
Take each non-key attribute and check whether it is dependent on only one part of the key.
If it is, move it to a new table.

Project Code  Project Title     Project Manager  Project Budget
PC010         Pensions Systems  M Phillips       € 24,500.00
PC045         Salaries Systems  H Martin         € 17,400.00
PC064         HR Systems        K Lewis          € 12,250.00

Employee No  Employee Name  Department No  Department Name
S10001       A Smith        L004           IT
S10030       L Jones        L023           Pensions
S21010       P Lewis        L004           IT
S10010       B Jones        L004           IT
S10001       A Smith        L004           IT
S31002       T Gilbert      L028           Database
S13210       W Richards     L008           Salary
S31002       T Gilbert      L028           Database
S21010       P Lewis        L004           IT
S10034       B James        L009           HR

Project Code  Employee No  Hourly Rate
PC010         S10001       €
PC010         S10030       €
PC010         S21010       €
PC045         S10010       €
PC045         S10001       €
PC045         S31002       €
PC045         S13210       €
PC064         S31002       €
PC064         S21010       €
PC064         S10034       € 16.50

Third Normal Form
If a non-key attribute depends on another non-key attribute rather than directly on the key, move both to a new table.
Leave a copy of the non-key attribute on which it depends in the original table and mark it as a foreign key.

Project Code  Project Title     Project Manager  Project Budget
PC010         Pensions Systems  M Phillips       € 24,500.00
PC045         Salaries Systems  H Martin         € 17,400.00
PC064         HR Systems        K Lewis          € 12,250.00

Project Code  Employee No  Hourly Rate
PC010         S10001       €
PC010         S10030       €
PC010         S21010       €
PC045         S10010       €
PC045         S10001       €
PC045         S31002       €
PC045         S13210       €
PC064         S31002       €
PC064         S21010       €
PC064         S10034       €

Employee No  Employee Name  Department No
S10001       A Smith        L004
S10030       L Jones        L023
S21010       P Lewis        L004
S10010       B Jones        L004
S10001       A Smith        L004
S31002       T Gilbert      L028
S13210       W Richards     L008
S31002       T Gilbert      L028
S21010       P Lewis        L004
S10034       B James        L009

Department No  Department Name
L004           IT
L023           Pensions
L028           Database
L008           Salary
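The third normal form step for the employee data can be sketched in a few lines. The variable names are assumptions, and each employee is listed once here rather than once per project as in the tables.

```python
# (employee_no, employee_name, department_no, department_name), as left by 2NF.
employees = [
    ("S10001", "A Smith", "L004", "IT"),
    ("S10030", "L Jones", "L023", "Pensions"),
    ("S21010", "P Lewis", "L004", "IT"),
    ("S10010", "B Jones", "L004", "IT"),
    ("S31002", "T Gilbert", "L028", "Database"),
    ("S13210", "W Richards", "L008", "Salary"),
    ("S10034", "B James", "L009", "HR"),
]

# Department name depends on department number, not on the employee key,
# so it moves to its own table keyed by department number.
department = sorted({(dept_no, dept_name) for _, _, dept_no, dept_name in employees})

# The employee table keeps department_no as a foreign key.
employee = [(emp_no, name, dept_no) for emp_no, name, dept_no, _ in employees]

print(department)
print(employee[0])  # ('S10001', 'A Smith', 'L004')
```

Each department name is now stored exactly once, so renaming a department means changing a single row.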

Summary of Questions for Revision Class

Project Code  Project Title     Project Manager  Project Budget  Employee No  Employee Name  Department No  Department Name  Hourly Rate
PC010         Pensions Systems  M Phillips       € 24,500.00     S10001       A Smith        L004           IT               €
PC010         Pensions Systems  M Phillips       € 24,500.00     S10030       L Jones        L023           Pensions         €
PC010         Pensions Systems  M Phillips       € 24,500.00     S21010       P Lewis        L004           IT               €
PC045         Salaries Systems  H Martin         € 17,400.00     S10010       B Jones        L004           IT               €
PC045         Salaries Systems  H Martin         € 17,400.00     S10001       A Smith        L004           IT               €
PC045         Salaries Systems  H Martin         € 17,400.00     S31002       T Gilbert      L028           Database         €
PC045         Salaries Systems  H Martin         € 17,400.00     S13210       W Richards     L008           Salary           €
PC064         HR Systems        K Lewis          € 12,250.00     S31002       T Gilbert      L028           Database         €
PC064         HR Systems        K Lewis          € 12,250.00     S21010       P Lewis        L004           IT               €
PC064         HR Systems        K Lewis          € 12,250.00     S10034       B James        L009           HR               € 16.50

OrderID  ProductID  Product Description  Customer Name   Customer Address     Customer ID
1        1          Black Ink            John Smith      123 Fake Street      1
1        2          Cyan Ink             John Smith      123 Fake Street      1
1        3          Magenta Ink          John Smith      123 Fake Street      1
2        1          Black Ink            Jane Doe        234 Main Street      2
2        3          Magenta Ink          Jane Doe        234 Main Street      2
2        4          Yellow Ink           Jane Doe        234 Main Street      2
2        5          Grey Ink             Jane Doe        234 Main Street      2
3        5          Grey Ink             John Smith      123 Fake Street      1
4        2          Cyan Ink             Steven Hancock  10 Schoolhouse Lane  3
5        3          Magenta Ink          Eva May         15 Main Street       4
5        1          Black Ink            Eva May         15 Main Street       4

An engineering consultancy firm supplies temporary specialized staff to bigger companies in the country to work on their projects for a certain amount of time. The table below lists the time spent by each of the company’s employees at other companies to carry out projects. The National Insurance Number (NIN) is unique for every member of staff.

NIN  Contract No  Hours  Employee Name  Company ID  Company Location
B    SC1025       72     P. White       SC115       Belfast
A    SC1025       48     R. Press       SC115       Belfast
B    SC1026       24     P. Smith       SC23        Bangor
B    SC1026       24     P. White       SC23        Bangor

Explain which normal form this table is in.
Find the Primary Key for this relation and explain your choice.
Normalise the table to 2NF.
Normalise the tables to 3NF.