VCE IT Theory Slideshows - ITA By Mark Kelly McKinnon Secondary College Vceit.com Database Normalisation Version 1.


Contents What is normalisation? Why normalise? Normal forms 1,2,3

What is normalisation? Organising a relational database so… – Data repetition is minimised – Data access is maximised

Why normalise? Removing data repetition saves lots of storage space and speeds up data access. Changes need only be made in one place rather than in many places. More powerful data access is possible

The normal forms Are called 1NF (first normal form) to 5NF, but only 1-3 matter here. Are guidelines (not laws) for structuring database tables and fields. Note: they are often applied instinctively as part of skilled database design, and are not an extra step to do after databases are created.

1NF First Normal Form - sets the most basic rules for an organised database The 1NF guidelines are common sense. 1. Eliminate duplicate columns from the same table. (But how thick would you have to be to allow duplicate columns in a table?)

1NF – First normal form 2. Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key). (Experts still argue about what exactly 1NF actually defines – e.g. ‘tables within tables’ is OK to some theorists, but not OK to others)

Things 1NF wants* Rows and columns do not have to be sorted in a particular way for the table to work. – E.g. Excel's VLOOKUP and HLOOKUP require a lookup table to be sorted in ascending order for a range lookup to work. This would violate 1NF. * According to Chris Date in “What First Normal Form Really Means”

Things 1NF wants No duplicate rows (records). Each row must be unique in some way. Each field entry can only contain one piece of data. E.g. A name field containing “Fred Smith” has surname and first name, violating 1NF.
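
The "no duplicate rows" rule is usually enforced with a primary key. Here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer, customer_id, first_name, surname) are assumptions made for illustration, not names from the slides.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: the primary key makes every row unique.
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, surname TEXT)"
)
conn.execute("INSERT INTO customer VALUES (111, 'Fred', 'Smith')")

# A second row with the same key is refused by the database.
try:
    conn.execute("INSERT INTO customer VALUES (111, 'Fred', 'Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(duplicate_rejected, row_count)
```

The database itself refuses the second insert, so the rule does not depend on whoever is typing in the data.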

Things 1NF wants Each field entry can only contain one piece of data. E.g. A phone number field with more than one phone number entered for a person.

Things 1NF wants Each field entry can only contain one piece of data. Why?
– You cannot easily access the data embedded in the single field (e.g. grab a postcode)
– You can’t use embedded data for sorting
– You can’t use data like “2kg” as a number for calculations, sorting, summaries etc.

Your turn… repair this!

Customer ID  Name        Phone
111          Fred Smith
222          Mary Jones
333          Tim Blogs

Repaired!

Customer ID  FirstName  Surname
111          Fred       Smith
222          Mary       Jones
333          Tim        Blogs

Now, customers can be sorted and searched by first name and/or surname separately. Also, the names can be used individually, like “Dear Fred” instead of “Dear Fred Smith”.
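
Here is a small Python sketch of what the split fields make possible. The dictionary keys (customer_id, first_name, surname) are illustrative names, not taken from the slides.

```python
# The three repaired customer records, one value per field.
customers = [
    {"customer_id": 111, "first_name": "Fred", "surname": "Smith"},
    {"customer_id": 222, "first_name": "Mary", "surname": "Jones"},
    {"customer_id": 333, "first_name": "Tim", "surname": "Blogs"},
]

# Sorting by surname needs no text surgery on a combined name field.
by_surname = sorted(customers, key=lambda c: c["surname"])
ids_by_surname = [c["customer_id"] for c in by_surname]

# The first name can be used on its own in a greeting.
greetings = [f"Dear {c['first_name']}" for c in customers]

print(ids_by_surname)
print(greetings)
```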

Repair This!

Product ID  Colour  Weight
A345        Red     4kg
A568        Blue    300g
B695        White   1.5kg

Repaired!

Product ID  Colour  Weight (g)
A345        Red     4000
A568        Blue    300
B695        White   1500
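
The repair can be sketched in Python. The helper name weight_to_grams is made up for this example, and it only handles the two units that appear in the table (g and kg).

```python
import re

# Multipliers to grams for the unit suffixes used in the table.
UNIT_TO_GRAMS = {"g": 1, "kg": 1000}

def weight_to_grams(text):
    """Convert a string such as '1.5kg' or '300g' to a whole number of grams."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(kg|g)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised weight: {text!r}")
    value, unit = match.groups()
    return round(float(value) * UNIT_TO_GRAMS[unit])

weights = ["4kg", "300g", "1.5kg"]

# Sorting the raw text compares characters, not quantities.
text_order = sorted(weights)

# Sorting by the numeric value gives the order you actually want.
numeric_order = sorted(weights, key=weight_to_grams)

print(text_order)
print(numeric_order)
```

As text, "1.5kg" sorts before "300g" because the character "1" comes before "3". As numbers, 300 g is the lightest.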

Repair This!

Album              Track  Length
Monster            1      3:23
Monster            2      4:12
Collapse into Now  1      4:01

Repaired

Album              Track  Length (sec)
Monster            1      203
Monster            2      252
Collapse into Now  1      241

– Time notation like “3:23” represents two pieces of data (minutes and seconds) that mean nothing to a database
– It cannot be understood by a database without serious text parsing
– A single “seconds” value can be sorted, searched, compared
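
The conversion from “minutes:seconds” text to a single number takes only a few lines of Python. The function name length_to_seconds is invented for this sketch, and it assumes every length is written as m:ss.

```python
def length_to_seconds(text):
    """Turn 'm:ss' text into a whole number of seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)

tracks = [
    ("Monster", 1, "3:23"),
    ("Monster", 2, "4:12"),
    ("Collapse into Now", 1, "4:01"),
]

# Replace each text length with its value in seconds.
repaired = [
    (album, track, length_to_seconds(length))
    for album, track, length in tracks
]

# With plain numbers, comparisons such as "longest track" are trivial.
longest = max(repaired, key=lambda row: row[2])

print(repaired)
print(longest)
```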

Repair This! An address like “3 Fred St, Sale, 3586” has 3 pieces of data: street address, town, postcode.

Customer ID  Address
             Lake Rd, Mentone,
             /45 Richmond Lane, Richmond,
             Spring St, Melbourne, 3000

Repaired! Now each field can be searched & sorted and used individually (e.g. addressing envelopes).

Customer ID  Street             Suburb     Postcode
             Lake Rd            Mentone
             /45 Richmond Lane  Richmond
             Spring St          Melbourne  3000

CUSTOMER

Customer ID  Name        Phone
111          Fred Smith
222          Mary Jones  (BH) (AH)
333          Tim Blogs

Repair this… it’s tricky!

First attempt…

Customer ID  Name        Phone1  Phone2
111          Fred Smith
222          Mary Jones
333          Tim Blogs

Problems:
– Trouble querying the table: “Which customer has phone # ?” You have to search more than 1 field… messy. Ugly.
– Can’t enforce validation rules to prevent duplicate phone #s
– Can’t enter three or more phone numbers
– Waste of space for all people with only 1 number

Second attempt…

CUSTOMER NAME TABLE
Customer ID  Name
111          Fred Smith
222          Mary Jones
333          Tim Blogs

CUSTOMER PHONE TABLE
Customer ID  Phone

Benefits:
– Unlimited phone numbers for everyone!
– No need to search multiple Phone fields
– No need to tear apart text from one field to extract a particular number

All we need is a 1:many relationship between the customer name table and the customer phone table, using the ID as the key field.
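
Here is a rough sketch of the two-table design using Python's built-in sqlite3 module. The table names, column names and all the phone numbers are made up for illustration; the slides do not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the relationship

# One row per customer, and one row per phone number (the "many" side).
conn.executescript("""
CREATE TABLE customer_name (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_phone (
    customer_id INTEGER NOT NULL REFERENCES customer_name(customer_id),
    phone       TEXT NOT NULL,
    PRIMARY KEY (customer_id, phone)
);
""")

conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?)",
    [(111, "Fred Smith"), (222, "Mary Jones"), (333, "Tim Blogs")],
)
# Hypothetical phone numbers: customer 222 has two of them.
conn.executemany(
    "INSERT INTO customer_phone VALUES (?, ?)",
    [(111, "5550 0001"), (222, "5550 0002"),
     (222, "5550 0003"), (333, "5550 0004")],
)

# "Which customer has this phone number?" now searches a single field.
owner = conn.execute(
    "SELECT n.name FROM customer_name n "
    "JOIN customer_phone p ON p.customer_id = n.customer_id "
    "WHERE p.phone = ?",
    ("5550 0003",),
).fetchone()[0]

mary_phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM customer_phone WHERE customer_id = 222 ORDER BY phone"
    )
]
print(owner, mary_phones)
```

Adding a third or fourth number for a customer is just another row in customer_phone; the table design never has to change.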

Tip: Don’t try using a database TIME data type to store durations of time. The TIME data type stores a time of day (e.g. 9:17 A.M.). Elapsed time is stored as a number of seconds, minutes, hours, days etc.
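
Python draws the same line between a time of day and a duration, which makes for a quick illustration. This sketch uses the standard datetime module; the variable names are arbitrary.

```python
from datetime import time, timedelta

# A time of day: a point on the clock, like 9:17 A.M.
opening_time = time(9, 17)

# A duration: an amount of elapsed time, here 3 minutes 23 seconds.
track_length = timedelta(minutes=3, seconds=23)

# The duration reduces to one plain number that is easy to store.
stored_seconds = int(track_length.total_seconds())

# Durations add up sensibly; adding two times of day would be meaningless.
total = timedelta(seconds=203) + timedelta(seconds=252)

print(opening_time, stored_seconds, total)
```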

2NF

2NF – Second Normal Form. Achieving 2NF means 1NF has already been achieved: each normal form builds on the previous forms. 2NF removes more duplicate data. 2NF deals with design problems that could threaten data integrity.

2NF – Second Normal Form Removes subsets of data that apply to multiple rows of a table and places them in separate tables. Creates relationships between these new tables and their predecessors using foreign keys.

2NF example

CustomerID  Gname  Sname   Phone
111         Fred   Smith
            Mary   Jones
            Mary   Jones
            Ike    Turner

If Mary Jones got married and changed her name, changes would need to be made in more than one record. If one change were missed, the integrity of the data would be damaged. Making multiple changes like this is also time-consuming and repetitious, and the repeated data eats up storage space. Solution: Store names only once in a separate table, as in the phone number example before. Name changes now only need to be made once.

Solution

CUSTOMER NAME TABLE
Customer ID  Name
111          Fred Smith
222          Mary Jones
333          Tim Blogs

CUSTOMER PHONE TABLE
Customer ID  Phone

Ignore the full name field above. I’m lazy!
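
The update problem can be shown side by side with Python's sqlite3 module. Everything specific here is an assumption for the sake of the demo: the table names, the customer ID 222, the phone numbers, and the new surname 'Brown'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Flat design: the surname is repeated on every phone row.
CREATE TABLE flat (customer_id INTEGER, sname TEXT, phone TEXT);
INSERT INTO flat VALUES (222, 'Jones', '5550 0002'), (222, 'Jones', '5550 0003');

-- Split design: the surname is stored exactly once.
CREATE TABLE customer_name (customer_id INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE customer_phone (
    customer_id INTEGER REFERENCES customer_name(customer_id),
    phone TEXT
);
INSERT INTO customer_name VALUES (222, 'Jones');
INSERT INTO customer_phone VALUES (222, '5550 0002'), (222, '5550 0003');
""")

# A careless update that only touches one of the repeated rows.
conn.execute("UPDATE flat SET sname = 'Brown' WHERE phone = '5550 0002'")
flat_names = {
    row[0]
    for row in conn.execute("SELECT sname FROM flat WHERE customer_id = 222")
}

# In the split design there is only one place to change.
conn.execute("UPDATE customer_name SET sname = 'Brown' WHERE customer_id = 222")
split_names = {
    row[0]
    for row in conn.execute(
        "SELECT n.sname FROM customer_phone p "
        "JOIN customer_name n ON n.customer_id = p.customer_id"
    )
}

print(flat_names, split_names)
```

The flat table ends up claiming that customer 222 has two different surnames. The split design cannot get into that state.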

Without 2NF: flat file. With 2NF: relational. Department data is only stored once. So:
– Less storage space required
– Department changes are now only made once, not once for each worker in that dept!

2NF The table above is a problem. Let’s say {Model Full Name} is the primary key. The {Manufacturer Country} field is based on the {Manufacturer} field, and will need to be constantly updated if manufacturers change their location. To be properly 2NF, you’d need to do this…

2NF

Break the data into two tables

2NF Make the same key fields in each table

2NF Set up the relationship between the key fields in each table
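
The three steps (break the data into two tables, give them a matching key field, set up the relationship) might look like this in Python's sqlite3 module. The table names, the manufacturer 'Acme', its models and the countries are all invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationship

conn.executescript("""
-- Table 1: each manufacturer and its country, stored once.
CREATE TABLE manufacturer (
    manufacturer TEXT PRIMARY KEY,
    country      TEXT NOT NULL
);
-- Table 2: each model, linked to its manufacturer by the shared key field.
CREATE TABLE model (
    model_full_name TEXT PRIMARY KEY,
    manufacturer    TEXT NOT NULL REFERENCES manufacturer(manufacturer)
);
INSERT INTO manufacturer VALUES ('Acme', 'Australia');
INSERT INTO model VALUES ('Acme Alpha', 'Acme'), ('Acme Beta', 'Acme');
""")

# The manufacturer relocates: one update, however many models it makes.
conn.execute(
    "UPDATE manufacturer SET country = 'New Zealand' WHERE manufacturer = 'Acme'"
)

rows = conn.execute(
    "SELECT m.model_full_name, f.country "
    "FROM model m JOIN manufacturer f ON f.manufacturer = m.manufacturer "
    "ORDER BY m.model_full_name"
).fetchall()
print(rows)
```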

3NF

Third normal form (3NF) goes one large step further: remove columns that are not dependent upon the primary key.

Remember… Every non-prime attribute of relation R is non-transitively dependent on every candidate key of R. Glad we cleared that up…

To revise: E.F. Codd first described normalisation in the early 1970s. 1NF ensures that every attribute (like a field) gives a fact about the key field. 2NF ensures attributes give a fact about the entire key, not just part of it. E.g. if a table key was surname and postcode, a field might give information about just the postcode. 3NF ensures that attributes give information on nothing but the key field.
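
“Gives a fact about” can be tested mechanically: a field depends on some columns if equal values in those columns always come with the same value in the field. The Python helper below (depends_on, a name made up here) checks that on sample rows. The rows themselves are invented, and sample data can only disprove a dependency, never prove it holds for all future data.

```python
def depends_on(rows, determinant, attribute):
    """True if equal `determinant` values always give the same `attribute`."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in determinant)
        # Remember the first value seen for this key; any later mismatch fails.
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True

# Hypothetical table whose key is surname + postcode.
rows = [
    {"surname": "Smith", "postcode": "3000", "suburb": "Melbourne", "phone": "5550 0001"},
    {"surname": "Jones", "postcode": "3000", "suburb": "Melbourne", "phone": "5550 0002"},
    {"surname": "Smith", "postcode": "3194", "suburb": "Mentone", "phone": "5550 0003"},
]

# Suburb is decided by the postcode alone: a fact about part of the key.
suburb_from_postcode = depends_on(rows, ["postcode"], "suburb")

# Phone needs the whole key: the postcode alone is not enough.
phone_from_postcode = depends_on(rows, ["postcode"], "phone")
phone_from_whole_key = depends_on(rows, ["surname", "postcode"], "phone")

print(suburb_from_postcode, phone_from_postcode, phone_from_whole_key)
```

In this sample, suburb is the field that breaks 2NF, so it would move to its own table keyed on postcode.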

In other words Non-key attributes must give information about the key, the whole key, and nothing but the key, so help me Codd. (Bill Kent)

3NF FAIL Field name underlining indicates key fields. You may have a gut feeling that this table is not good. But why?

3NF FAIL Each attribute (‘field’) should be giving information about the key field (a particular tournament + year).

3NF FAIL But the DOB field is not describing the tournament – it’s describing the tournament’s winner.

3NF FAIL This is bad because the DOB does not describe the key field (tournament). It describes a looked-up value (the tournament’s winner).

3NF FAIL It’s like your mum keeping her knickers in your sock drawer because you’re related to her. They don’t belong there!

3NF FTW! Now the two tables are 3NF, and update anomalies cannot occur (e.g. updating a DOB in one record but missing it in another record).
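
A minimal sketch of the two 3NF tables, using Python's sqlite3 module. The table names, the tournament, the player and the dates are all placeholders invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Facts about a player live with the player.
CREATE TABLE player (
    name          TEXT PRIMARY KEY,
    date_of_birth TEXT NOT NULL
);
-- Facts about a tournament in a given year live here, and only here.
CREATE TABLE tournament_winner (
    tournament TEXT    NOT NULL,
    year       INTEGER NOT NULL,
    winner     TEXT    NOT NULL REFERENCES player(name),
    PRIMARY KEY (tournament, year)
);
INSERT INTO player VALUES ('Pat Example', '1980-01-01');
INSERT INTO tournament_winner VALUES
    ('Example Open', 2001, 'Pat Example'),
    ('Example Open', 2002, 'Pat Example');
""")

# Correcting the DOB is one change, even though the player won twice.
conn.execute(
    "UPDATE player SET date_of_birth = '1980-02-01' WHERE name = 'Pat Example'"
)

dobs = {
    row[0]
    for row in conn.execute(
        "SELECT p.date_of_birth FROM tournament_winner t "
        "JOIN player p ON p.name = t.winner"
    )
}
print(dobs)
```

Both wins report the same corrected date of birth because the date is stored in only one place.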

In other words Let X → A be a nontrivial FD (i.e. one where X does not contain A) and let A be a non-key attribute. Also let Y be a key of R. Then Y → X. Therefore A is not transitively dependent on Y if and only if X → Y, that is, if and only if X is a superkey. ’kay?

By Mark Kelly McKinnon Secondary College vceit.com These slideshows may be freely used, modified or distributed by teachers and students anywhere on the planet (but not elsewhere). They may NOT be sold. They must NOT be redistributed if you modify them. VCE IT THEORY SLIDESHOWS