Lesson Objectives Aims You should know about: 1.3.2: (a) Relational database, flat file, primary key, foreign key, secondary key, entity relationship modelling, normalisation and indexing.
Some key terms
Indexing: An index is a data structure used to shorten the time it takes to search a database. An index may point to other sub-indexes. Because these structures are smaller than the whole database, searching them is faster than searching the database itself.
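As a rough illustration of why an index speeds up searching, here is a minimal Python sketch. The names `linear_search`, `pupil_index` and `indexed_search` are made up for this example, and a dictionary stands in for the index; a real database would typically use a more elaborate structure.

```python
# Pupil records, using the values from this lesson's example data.
pupils = [
    {"PupilID": "P99010", "PupilName": "Jane Grey", "DOB": "12.03.86"},
    {"PupilID": "P99205", "PupilName": "Tom Jones", "DOB": "05.11.86"},
    {"PupilID": "P99311", "PupilName": "Sam Hill", "DOB": "16.08.86"},
]


def linear_search(records, pupil_id):
    """Without an index: check every record in turn."""
    for record in records:
        if record["PupilID"] == pupil_id:
            return record
    return None


# The index: a small structure mapping each key to the position of its record.
pupil_index = {record["PupilID"]: position for position, record in enumerate(pupils)}


def indexed_search(records, index, pupil_id):
    """With an index: look up the position, then jump straight to the record."""
    position = index.get(pupil_id)
    return records[position] if position is not None else None


found = indexed_search(pupils, pupil_index, "P99205")
print(found["PupilName"])
```

The index holds only keys and positions, so it is much smaller than the records it points to.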
Normalisation Normalisation is the process of converting a flat file database (with a single table) to a relational database (with many tables). There are various levels of normalisation that remove repetition to a greater or lesser extent.
Normalisation
The steps can seem weird, because logically you'd never put data into the first normal form… But they are clear, well-defined steps.
Read and make notes on the “Normalisation to 3NF” in the shared area.
Task
Step 1: Turn the un-normalised data into 1NF

PupilID | PupilName | DOB      | ExamID            | Subject               | Level   | Date                       | RoomID | RoomName
P99010  | Jane Grey | 12.03.86 | CP101 EN004 AR075 | Computing English Art | AS GCSE | 15.05.01 24.05.01 12.06.01 | UH UG  | Hall Gym
P99205  | Tom Jones | 05.11.86 | MA110 PH190       | Maths Physics         |         | 15.06.01 08.06.01          | 58     | Science Lab
P99311  | Sam Hill  | 16.08.86 |                   |                       |         |                            |        |

FIRST STEP: To change into First Normal Form, the repeating groups of fields must go. To do this, the data must be split into separate tables.
1st Normal Form A table is in First Normal Form (1NF) if there are no repeating groups. In other words, each column must contain only a single value and each row must have an item in every column. This can usually be done by putting the data into two tables ... separating the repeated data into a separate group.
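One way to picture the removal of a repeating group is the Python sketch below. It is an illustration only: the un-normalised data is modelled as a list of ExamIDs inside each pupil record, and the variable names are invented for the example. Sam Hill has no exams in the example data, so the sketch keeps a single row for him with an empty ExamID.

```python
# Un-normalised data: the ExamID field holds a repeating group (a list of values).
unnormalised = [
    {"PupilID": "P99010", "PupilName": "Jane Grey", "DOB": "12.03.86",
     "ExamID": ["CP101", "EN004", "AR075"]},
    {"PupilID": "P99205", "PupilName": "Tom Jones", "DOB": "05.11.86",
     "ExamID": ["MA110", "PH190"]},
    {"PupilID": "P99311", "PupilName": "Sam Hill", "DOB": "16.08.86",
     "ExamID": []},
]

# 1NF: one row per pupil/exam pair, so every field holds a single value.
first_normal_form = []
for pupil in unnormalised:
    # Assumption for this sketch: a pupil with no exams keeps one row with no ExamID.
    exam_ids = pupil["ExamID"] or [None]
    for exam_id in exam_ids:
        first_normal_form.append({
            "PupilID": pupil["PupilID"],
            "PupilName": pupil["PupilName"],
            "DOB": pupil["DOB"],
            "ExamID": exam_id,
        })

for row in first_normal_form:
    print(row)
```

Each row is now identified by the pair (PupilID, ExamID), and the pupil's name and date of birth are repeated on every row for that pupil.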
1NF

PupilID | PupilName | DOB      | ExamID
P99010  | Jane Grey | 12.03.86 | CP101
        |           |          | EN004
        |           |          | AR075
P99205  | Tom Jones | 05.11.86 | MA110
        |           |          | PH190
P99311  | Sam Hill  | 16.08.86 |

ExamID | Subject   | Level | Date     | RoomID | RoomName
CP101  | Computing | AS    | 15.05.01 | UH     | Hall
AR075  | Art       |       | 12.06.01 | UG     | Gym
MA110  | Maths     |       | 15.06.01 |        |
PH190  | Physics   |       | 08.06.01 | 58     | Science Lab
EN004  | English   | GCSE  | 24.05.01 |        |
To move to 2NF, any partial dependencies must be removed.
A partial dependency is where a non-key field depends on only part of a composite primary key. Such fields should move to a table whose key is the part they depend on.
This removes:
Many-to-many relationships
Repeated data
Many to many
In the first table, there is a composite key (PupilID and ExamID).
There is also a lot of data repetition (many students taking many exams).

PupilID | PupilName | DOB      | ExamID
P99010  | Jane Grey | 12.03.86 | CP101
        |           |          | EN004
        |           |          | AR075
P99205  | Tom Jones | 05.11.86 | MA110
        |           |          | PH190
P99311  | Sam Hill  | 16.08.86 |
Second Normal Form

PupilID | PupilName | DOB
P99010  | Jane Grey | 12.03.86
P99205  | Tom Jones | 05.11.86
P99311  | Sam Hill  | 16.08.86

PupilID | ExamID
P99010  | CP101
        | EN004
        | AR075
P99205  | MA110
        | PH190
P99311  |

ExamID | Subject   | Level | Date     | RoomID | RoomName
CP101  | Computing | AS    | 15.05.01 | UH     | Hall
AR075  | Art       |       | 12.06.01 | UG     | Gym
MA110  | Maths     |       | 15.06.01 |        |
PH190  | Physics   |       | 08.06.01 | 58     | Science Lab
EN004  | English   | GCSE  | 24.05.01 |        |
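The split into a pupil table and a link table can be sketched in Python as follows. This is an illustration under assumptions: the 1NF rows are written as tuples, the variable names are invented, and a pupil with no exams (Sam Hill) simply gets no row in the link table.

```python
# 1NF rows: (PupilID, PupilName, DOB, ExamID). The key is the pair (PupilID, ExamID),
# but PupilName and DOB depend on PupilID alone: a partial dependency.
rows_1nf = [
    ("P99010", "Jane Grey", "12.03.86", "CP101"),
    ("P99010", "Jane Grey", "12.03.86", "EN004"),
    ("P99010", "Jane Grey", "12.03.86", "AR075"),
    ("P99205", "Tom Jones", "05.11.86", "MA110"),
    ("P99205", "Tom Jones", "05.11.86", "PH190"),
    ("P99311", "Sam Hill", "16.08.86", None),
]

pupils = {}       # PupilID -> (PupilName, DOB): each pupil is stored once
pupil_sits = []   # (PupilID, ExamID) pairs: the link table

for pupil_id, name, dob, exam_id in rows_1nf:
    pupils[pupil_id] = (name, dob)
    if exam_id is not None:  # assumption: no link row for a pupil with no exams
        pupil_sits.append((pupil_id, exam_id))

print(pupils)
print(pupil_sits)
```

Jane Grey's name and date of birth now appear once instead of three times, and the many-to-many relationship between pupils and exams is held only in the link table.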
Third Normal Form
3rd Normal Form removes something called “Transitive Dependency”.
The advantages of removing transitive dependency are:
Amount of data duplication is reduced.
Data integrity is improved.
What on earth is transitive dependency?
It is where a field depends on another non-key field rather than directly on the primary key. Removing it means all data in the table is dependent solely on the primary key. Any other data should be in a new table.
In our 2NF table, ExamID is the PK.
However, RoomName does NOT depend on ExamID; it is dependent on RoomID.
Therefore this data should be in a new table.

ExamID | Subject   | Level | Date     | RoomID | RoomName
CP101  | Computing | AS    | 15.05.01 | UH     | Hall
AR075  | Art       |       | 12.06.01 | UG     | Gym
MA110  | Maths     |       | 15.06.01 |        |
PH190  | Physics   |       | 08.06.01 | 58     | Science Lab
EN004  | English   | GCSE  | 24.05.01 |        |
PupilID | PupilName | DOB
P99010  | Jane Grey | 12.03.86
P99205  | Tom Jones | 05.11.86
P99311  | Sam Hill  | 16.08.86

PupilID | ExamID
P99010  | CP101
        | EN004
        | AR075
P99205  | MA110
        | PH190
P99311  |

ExamID | Subject   | Level | Date     | RoomID
CP101  | Computing | AS    | 15.05.01 | UH
AR075  | Art       |       | 12.06.01 | UG
MA110  | Maths     |       | 15.06.01 |
PH190  | Physics   |       | 08.06.01 | 58
EN004  | English   | GCSE  | 24.05.01 |

RoomID | RoomName
UH     | Hall
UG     | Gym
58     | Science Lab

PUPILS (PupilID, PupilName, DOB)
EXAMS (ExamID, Subject, Level, Date, RoomID)
PUPIL_SITS (PupilID, ExamID)
ROOMS (RoomID, RoomName)
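The four 3NF tables could be built as a small relational database using Python's built-in sqlite3 module, as sketched below. The column types, the in-memory database and the query are assumptions made for illustration. Blank cells in the example data are stored as NULL (None in Python), and no PUPIL_SITS row is created for Sam Hill, who sits no exams.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce foreign keys

conn.executescript("""
CREATE TABLE PUPILS (PupilID TEXT PRIMARY KEY, PupilName TEXT, DOB TEXT);
CREATE TABLE ROOMS  (RoomID TEXT PRIMARY KEY, RoomName TEXT);
CREATE TABLE EXAMS (
    ExamID  TEXT PRIMARY KEY,
    Subject TEXT,
    "Level" TEXT,
    "Date"  TEXT,
    RoomID  TEXT REFERENCES ROOMS(RoomID)      -- foreign key
);
CREATE TABLE PUPIL_SITS (
    PupilID TEXT REFERENCES PUPILS(PupilID),   -- foreign key
    ExamID  TEXT REFERENCES EXAMS(ExamID),     -- foreign key
    PRIMARY KEY (PupilID, ExamID)              -- composite primary key
);
""")

conn.executemany("INSERT INTO PUPILS VALUES (?, ?, ?)", [
    ("P99010", "Jane Grey", "12.03.86"),
    ("P99205", "Tom Jones", "05.11.86"),
    ("P99311", "Sam Hill", "16.08.86"),
])
conn.executemany("INSERT INTO ROOMS VALUES (?, ?)", [
    ("UH", "Hall"), ("UG", "Gym"), ("58", "Science Lab"),
])
conn.executemany("INSERT INTO EXAMS VALUES (?, ?, ?, ?, ?)", [
    ("CP101", "Computing", "AS", "15.05.01", "UH"),
    ("AR075", "Art", None, "12.06.01", "UG"),
    ("MA110", "Maths", None, "15.06.01", None),
    ("PH190", "Physics", None, "08.06.01", "58"),
    ("EN004", "English", "GCSE", "24.05.01", None),
])
conn.executemany("INSERT INTO PUPIL_SITS VALUES (?, ?)", [
    ("P99010", "CP101"), ("P99010", "EN004"), ("P99010", "AR075"),
    ("P99205", "MA110"), ("P99205", "PH190"),
])

# Join the tables back together to list one pupil's exams and rooms.
rows = conn.execute("""
    SELECT p.PupilName, e.Subject, r.RoomName
    FROM PUPIL_SITS AS s
    JOIN PUPILS AS p ON p.PupilID = s.PupilID
    JOIN EXAMS  AS e ON e.ExamID  = s.ExamID
    LEFT JOIN ROOMS AS r ON r.RoomID = e.RoomID
    WHERE s.PupilID = ?
    ORDER BY e.Subject
""", ("P99010",)).fetchall()

for row in rows:
    print(row)
```

Because RoomName is stored only in ROOMS, renaming a room means changing one row, and every exam held in that room picks up the new name through the RoomID foreign key.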
Review/Success Criteria You should know: How to normalise data The definitions of the three normal forms The purpose of normalisation