Published by Paige Towsley. Modified over 10 years ago.
1 Week 4: Normalisation. Redundant data becomes inconsistent data; therefore … “The key, the whole key, and nothing but the key, so help me, Codd.”
2 BoyGirl Database, Version 0. NOTE: Not a good design! Because one girl can have many boys, we are storing redundant mobile data about Bonnie. “Redundant data becomes inconsistent data.”
3 A Better BoyGirl Database. Eliminated redundant data about Bonnie. Two tables with a one-to-many relationship, linked by a foreign key that refers to the other table's primary key. http://www-staff.it.uts.edu.au/~raymond/db/boygirl.sql
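The two-table design can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, boy names, and mobile numbers below are hypothetical placeholders, not the contents of boygirl.sql, and the course itself uses PostgreSQL rather than SQLite.

```python
import sqlite3

# Hypothetical schema and data; the real boygirl.sql is not reproduced here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE girl (
        name   TEXT PRIMARY KEY,
        mobile TEXT
    );
    CREATE TABLE boy (
        name       TEXT PRIMARY KEY,
        girlfriend TEXT REFERENCES girl(name)  -- foreign key to girl's primary key
    );
    INSERT INTO girl VALUES ('Bonnie', '0400-000-001');
    INSERT INTO boy VALUES ('boy_a', 'Bonnie');
    INSERT INTO boy VALUES ('boy_b', 'Bonnie');
""")

# Bonnie's mobile is stored once, so one UPDATE changes it for every boy.
conn.execute("UPDATE girl SET mobile = '0400-000-002' WHERE name = 'Bonnie'")
rows = conn.execute("""
    SELECT boy.name, girl.mobile
    FROM boy JOIN girl ON girl.name = boy.girlfriend
    ORDER BY boy.name
""").fetchall()

# The foreign key rejects a boy whose girlfriend is not in the girl table.
try:
    conn.execute("INSERT INTO boy VALUES ('boy_c', 'Nobody')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Because the mobile number lives in exactly one row, the two boys can never disagree about it, which is the point of "one fact in one place".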
4 The Menace of Redundant Data. Redundant data becomes inconsistent data. Insert, modify, and delete more data than desired. Strive for one fact in one place. More technically, strive for 3NF or BCNF (more later).
5 Big University Table – What’s redundant here?
StdSSN  StdClass  OfferNo  OffYear  Grade  CourseNo  CrsDesc
S1      JUN       O1       2003     3.5    C1        DB
S1      JUN       O2       2003     3.3    C2        VB
S2      JUN       O3       2003     3.1    C3        OO
S2      JUN       O2       2003     3.4    C2        VB
Functional dependencies:
StdSSN → StdCity, StdClass
OfferNo → OffTerm, OffYear, CourseNo
CourseNo → CrsDesc
StdSSN, OfferNo → EnrGrade
Forgotten what these American university terms mean? See next slide. The solution is to split the single table into two or more tables, as we did with BoyGirl.
6 Big University Table – Forgotten these American terms?
Columns: StdSSN, StdClass, OfferNo, OffYear, Grade, CourseNo, CrsDesc
StdSSN – student number (student social security number?)
StdClass – Freshman, Sophomore, Junior, Senior
OfferNo – e.g. 31061, Autumn 2007
Grade – a student’s grade point average at the start of semester
CourseNo – e.g. 31061 or 32606 … Americans say “course” for “subject”
CrsDesc – course description
[Diagram: Offering with primary key OfferNo and attributes OffYear, CourseNo; Course with primary key CourseNo and attribute CrsDesc]
7 Functional Dependencies (FDs). X → Y: “X (functionally) determines Y”. For each X value, there is at most one Y value, as with a mathematical function such as y = f(x) = 2x. “Normalisation” is the process of splitting tables to remove redundancies.
8 FDs in Data. Prove non-existence (but not existence) by looking at data: look for two rows that have the same X value but a different Y value. Understand business rules (or common sense).
StdSSN  StdClass  OfferNo  OffYear  Grade  CourseNo  CrsDesc
S1      JUN       O1       2003     3.5    C1        DB
S1      JUN       O2       2003     3.3    C2        VB
S2      JUN       O3       2003     3.1    C3        OO
S2      JUN       O2       2003     3.4    C2        VB
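The "same X value but a different Y value" test is mechanical, so it can be sketched in a few lines of Python. The function name fd_violations is an illustrative assumption; the rows are the four sample rows from the slide.

```python
from collections import defaultdict

COLUMNS = ("StdSSN", "StdClass", "OfferNo", "OffYear", "Grade", "CourseNo", "CrsDesc")
ROWS = [
    ("S1", "JUN", "O1", 2003, 3.5, "C1", "DB"),
    ("S1", "JUN", "O2", 2003, 3.3, "C2", "VB"),
    ("S2", "JUN", "O3", 2003, 3.1, "C3", "OO"),
    ("S2", "JUN", "O2", 2003, 3.4, "C2", "VB"),
]

def fd_violations(rows, columns, lhs, rhs):
    """Return the X values that appear with more than one Y value.

    A non-empty result disproves X -> Y. An empty result only means
    the sample data happens not to contradict it.
    """
    lhs_idx = [columns.index(c) for c in lhs]
    rhs_idx = [columns.index(c) for c in rhs]
    seen = defaultdict(set)
    for row in rows:
        x = tuple(row[i] for i in lhs_idx)
        y = tuple(row[i] for i in rhs_idx)
        seen[x].add(y)
    return {x: ys for x, ys in seen.items() if len(ys) > 1}

# StdSSN -> Grade is disproved: S1 and S2 each appear with two grades.
print(fd_violations(ROWS, COLUMNS, ["StdSSN"], ["Grade"]))
# StdClass -> OffYear is not contradicted by these rows, yet no business
# rule supports it, which is why data can disprove an FD but never prove one.
print(fd_violations(ROWS, COLUMNS, ["StdClass"], ["OffYear"]))
```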
9 Toward First Normal Form (1NF). Normalisation step by step.
StdSSN  StdClass  OfferNo  OffYear  Grade  CourseNo  CrsDesc
S1      JUN       O1       2003     3.5    C1        DB
S1      JUN       O2       2003     3.3    C2        VB
S2      JUN       O3       2003     3.1    C3        OO
S2      JUN       O2       2003     3.4    C2        VB
Step 1: Identification of superkeys. According to the previous FD diagram, we know that StdSSN, OfferNo and CourseNo form a “superkey”.
10 First Normal Form (1NF)
StdSSN  StdClass  OfferNo  OffYear  Grade  CourseNo  CrsDesc
S1      JUN       O1       2003     3.5    C1        DB
S1      JUN       O2       2003     3.3    C2        VB
S2      JUN       O3       2003     3.1    C3        OO
S2      JUN       O2       2003     3.4    C2        VB
Step 2: Determining the primary key (minimal superkey). If we analyse our superkey, we can conclude that CourseNo does not contribute to its uniqueness, so we can take it out of the key, leaving (StdSSN, OfferNo).
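As a rough sketch of Step 2, the Python below searches the superkey (StdSSN, OfferNo, CourseNo) for its smallest subsets that are still unique in the sample rows. The helper names is_unique and smallest_unique_subsets are illustrative assumptions, and checking sample data is only a sanity check, not a proof.

```python
from itertools import combinations

COLUMNS = ("StdSSN", "StdClass", "OfferNo", "OffYear", "Grade", "CourseNo", "CrsDesc")
ROWS = [
    ("S1", "JUN", "O1", 2003, 3.5, "C1", "DB"),
    ("S1", "JUN", "O2", 2003, 3.3, "C2", "VB"),
    ("S2", "JUN", "O3", 2003, 3.1, "C3", "OO"),
    ("S2", "JUN", "O2", 2003, 3.4, "C2", "VB"),
]

def is_unique(rows, columns, candidate):
    """True if no two rows share the same values in the candidate columns."""
    idx = [columns.index(c) for c in candidate]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

def smallest_unique_subsets(rows, columns, superkey):
    """Return the smallest subsets of `superkey` that are unique in `rows`."""
    for size in range(1, len(superkey) + 1):
        found = [c for c in combinations(superkey, size)
                 if is_unique(rows, columns, c)]
        if found:
            return found
    return []

print(smallest_unique_subsets(ROWS, COLUMNS, ("StdSSN", "OfferNo", "CourseNo")))
# → [('StdSSN', 'OfferNo'), ('StdSSN', 'CourseNo')]
```

In these four rows (StdSSN, CourseNo) also happens to be unique. The slide prefers (StdSSN, OfferNo) because the FD diagram states OfferNo → CourseNo and not the reverse, so the choice rests on the business rules rather than on the sample data.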
11 Second Normal Form (2NF). Every non-key column depends on a whole key, not part of a key. Violations: part of key → non-key. Violations are possible only for combined (“composite”) keys.
12 2NF Example (problem). Violations of 2NF in the 1NF big university table:
StdSSN → StdCity, StdClass
OfferNo → OffTerm, OffYear, CourseNo, CrsDesc
StdSSN  StdClass  OfferNo  OffYear  Grade  CourseNo  CrsDesc
S1      JUN       O1       2003     3.5    C1        DB
S1      JUN       O2       2003     3.3    C2        VB
S2      JUN       O3       2003     3.1    C3        OO
S2      JUN       O2       2003     3.4    C2        VB
13 2NF Example (solution). Splitting the table:
UnivTable0 (StdSSN*, OfferNo*, EnrGrade), primary key (StdSSN, OfferNo)
UnivTable1 (StdSSN, StdCity, StdClass), primary key StdSSN
UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc), primary key OfferNo
Where … primary keys (underlined on the slide) are written out after each table, and an asterisk means foreign key.
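The split can be sketched as three projections of the sample rows, followed by a join to check that nothing was lost. This is a minimal illustration: the sample rows have no StdCity or OffTerm columns, so those are omitted, and the Grade column of the sample table stands in for EnrGrade. The helper name project is an assumption.

```python
COLUMNS = ("StdSSN", "StdClass", "OfferNo", "OffYear", "Grade", "CourseNo", "CrsDesc")
ROWS = [
    ("S1", "JUN", "O1", 2003, 3.5, "C1", "DB"),
    ("S1", "JUN", "O2", 2003, 3.3, "C2", "VB"),
    ("S2", "JUN", "O3", 2003, 3.1, "C3", "OO"),
    ("S2", "JUN", "O2", 2003, 3.4, "C2", "VB"),
]

def project(rows, columns, keep):
    """Keep only the named columns and drop duplicate rows (order preserved)."""
    idx = [columns.index(c) for c in keep]
    return list(dict.fromkeys(tuple(row[i] for i in idx) for row in rows))

univ_table0 = project(ROWS, COLUMNS, ["StdSSN", "OfferNo", "Grade"])
univ_table1 = project(ROWS, COLUMNS, ["StdSSN", "StdClass"])
univ_table2 = project(ROWS, COLUMNS, ["OfferNo", "OffYear", "CourseNo", "CrsDesc"])

# Rejoin on the foreign keys to confirm the split loses no information.
students = dict(univ_table1)                          # StdSSN -> StdClass
offerings = {row[0]: row[1:] for row in univ_table2}  # OfferNo -> (OffYear, CourseNo, CrsDesc)
rejoined = [
    (ssn, students[ssn], offer, offerings[offer][0], grade,
     offerings[offer][1], offerings[offer][2])
    for ssn, offer, grade in univ_table0
]
assert rejoined == ROWS
```

After the split, each student's class and each offering's details are stored once instead of once per enrolment.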
14 To get to Third Normal Form (3NF), we need to ensure that every non-key column depends only on a key, not on non-key columns. Violations: non-key → non-key. If OfferNo → CourseNo and CourseNo → CrsDesc, then OfferNo → CrsDesc (a transitive dependency).
15 3NF Example. One violation in UnivTable2: CourseNo → CrsDesc. Splitting the table:
UnivTable2-1 (OfferNo, OffTerm, OffYear, CourseNo*)
UnivTable2-2 (CourseNo, CrsDesc)
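A possible rendering of the two 3NF tables, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table names offering and course are assumptions chosen because hyphenated names such as UnivTable2-1 are awkward in SQL. OffTerm is left NULL because the sample rows do not give it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE course (                -- UnivTable2-2
        CourseNo TEXT PRIMARY KEY,
        CrsDesc  TEXT
    );
    CREATE TABLE offering (              -- UnivTable2-1
        OfferNo  TEXT PRIMARY KEY,
        OffTerm  TEXT,                   -- not present in the sample rows
        OffYear  INTEGER,
        CourseNo TEXT REFERENCES course(CourseNo)
    );
    INSERT INTO course VALUES ('C1', 'DB'), ('C2', 'VB'), ('C3', 'OO');
    INSERT INTO offering (OfferNo, OffYear, CourseNo)
        VALUES ('O1', 2003, 'C1'), ('O2', 2003, 'C2'), ('O3', 2003, 'C3');
""")

# OfferNo -> CrsDesc is no longer stored; it is derived through CourseNo by a join.
offer_desc = conn.execute("""
    SELECT o.OfferNo, c.CrsDesc
    FROM offering AS o JOIN course AS c ON c.CourseNo = o.CourseNo
    ORDER BY o.OfferNo
""").fetchall()
print(offer_desc)  # → [('O1', 'DB'), ('O2', 'VB'), ('O3', 'OO')]
```

The course description now appears in one row per course, so a course with many offerings cannot end up with conflicting descriptions.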
16 BCNF. A special case not covered by 3NF, where two things can be used as substitute primary keys for each other, e.g. staff number, tax file number, email address.
17 BCNF Example. UnivTable4 (Mannino’s example, pages 236-238):
StdSSN  OfferNo  EnrGrade  Email
S1      O1       3.5       joe@bigu
S1      O2       3.6       joe@bigu
S2      O1       3.8       mary@bigu
S2      O3       3.5       mary@bigu
Convert from 3NF to BCNF by placing the redundant keys in a table by themselves and only using one of them in other tables:
UnivTable4-1 (StdSSN*, OfferNo, EnrGrade)
UnivTable4-2 (StdSSN, Email)
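The situation can be checked on the four sample rows with a short Python sketch. The helper names project, is_unique, and consistent_with_fd are illustrative assumptions. As with any check on sample data, a passing result only shows that the rows do not contradict the claim.

```python
COLUMNS = ("StdSSN", "OfferNo", "EnrGrade", "Email")
ROWS = [
    ("S1", "O1", 3.5, "joe@bigu"),
    ("S1", "O2", 3.6, "joe@bigu"),
    ("S2", "O1", 3.8, "mary@bigu"),
    ("S2", "O3", 3.5, "mary@bigu"),
]

def project(rows, columns, keep):
    """Keep only the named columns and drop duplicate rows (order preserved)."""
    idx = [columns.index(c) for c in keep]
    return list(dict.fromkeys(tuple(row[i] for i in idx) for row in rows))

def is_unique(rows, columns, candidate):
    """True if the candidate columns identify every row."""
    return len(project(rows, columns, candidate)) == len(rows)

def consistent_with_fd(rows, columns, lhs, rhs):
    """True if no X value appears with two different Y values in the rows."""
    return len(project(rows, columns, lhs)) == len(project(rows, columns, lhs + rhs))

# Two interchangeable keys: either StdSSN or Email works alongside OfferNo.
print(is_unique(ROWS, COLUMNS, ["StdSSN", "OfferNo"]))  # → True
print(is_unique(ROWS, COLUMNS, ["Email", "OfferNo"]))   # → True

# Email determines StdSSN, yet Email alone is not a key: the BCNF problem.
print(consistent_with_fd(ROWS, COLUMNS, ["Email"], ["StdSSN"]))  # → True
print(is_unique(ROWS, COLUMNS, ["Email"]))                        # → False

# The split from the slide: the email is stored once per student.
univ_table4_1 = project(ROWS, COLUMNS, ["StdSSN", "OfferNo", "EnrGrade"])
univ_table4_2 = project(ROWS, COLUMNS, ["StdSSN", "Email"])
```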
18 Role of Normalisation. Normalisation and drawing ERDs are complementary ways of designing databases. Strive to reach at least 3NF, hopefully BCNF. There are even higher normal forms (4NF, 5NF, etc.), but we don’t talk about them in this course. They are almost never an issue in real-world databases. May reverse engineer an ERD.
19 Today’s Lab Exercise (this slide not part of today’s lecture handout) Familiarisation with next week’s lab exam database. Download the lab exam database into your PostgreSQL account. See URL on page 2 of handout. Attempt the sample questions on page 5 of handout. Answers on page 6.