1 Database Design Issues, Part II Hugh Darwen CS252.HACD: Fundamentals of Relational Databases Section.

Slides:



Advertisements
Similar presentations
Relational Algebra Part II
Advertisements

Dependency preservation, 3NF revisited and BCNF
Henry VIII and his 6 Wives
1 Introduction to Relational Databases Hugh Darwen CS252.HACD: Fundamentals of Relational Databases.
1 Constraints and Updating Hugh Darwen CS252.HACD: Fundamentals of Relational Databases Section 7: Constraints.
1 Database Design Issues, Part I Hugh Darwen CS252.HACD: Fundamentals of Relational Databases Section.
1 Relational Algebra, Part III, and Other Operators Hugh Darwen CS252.HACD: Fundamentals of Relational.
1 Relational Algebra, Principles and Part I Hugh Darwen CS252.HACD: Fundamentals of Relational Databases.
1 Values, Types, Variables, Operators Hugh Darwen CS252.HACD: Fundamentals of Relational Databases Section.
FEN Introduction to the database field: Quality checking table design: Design Guidelines Normalisation Seminar: Introduction to relational.
Schema Refinement: Normal Forms
Normalisation to 3NF Database Systems Lecture 11 Natasha Alechina.
Database Systems Lecture 12 Natasha Alechina
Schema Refinement and Normal Forms Given a design, how do we know it is good or not? What is the best design? Can a bad design be transformed into a good.
Relational Normalization Theory. Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide.
Lossless Decomposition (2) Prof. Sin-Min Lee Department of Computer Science San Jose State University.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Chapter 8 Normal Forms Based on Functional Dependencies Deborah Costa Oct 18, 2007.
Schema Refinement and Normalization Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus.
4 TH NORMAL FORM By: Karen McVay. REVIEW OF NFs 1NF  All values of the columns are atomic. That is, they contain no repeating values. 1NF  All values.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Databases 6: Normalization
Cs3431 Normalization Part II. cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E.
Why Normalization? To Reduce Redundancy to 1.avoid modification, insertion, deletion anomolies 2.save space Goal: One Fact in One Place.
MVDs: 1 Join Dependencies—Example Let r = A B C = A B |  | A C 1 a x 1 a 1 x 1 a y 1 b 1 y 1 b x 2 a 2 y 1 b y 2 b 2 a y 2 b y Observe: r =  AB r | 
Fall 2001Arthur Keller – CS 1804–1 Schedule Today Oct. 4 (TH) Functional Dependencies and Normalization. u Read Sections Project Part 1 due. Oct.
Department of Computer Science and Engineering, HKUST Slide 1 7. Relational Database Design.
Functional Dependencies and Relational Schema Design.
Introduction to Schema Refinement
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
CS 405G: Introduction to Database Systems 16. Functional Dependency.
Database Systems Normal Forms. Decomposition Suppose we have a relation R[U] with a schema U={A 1,…,A n } – A decomposition of U is a set of schemas.
Copyright © Curt Hill Schema Refinement III 4 th NF and 5 th NF.
King Henry viii All about his wives. Catherine of Aragon Was his first wife Was his first wife Born in 1485 Born in 1485 When she was 18 and Henry was.
Copyright, Harris Corporation & Ophir Frieder, Normal Forms “Why be normal?” - Author unknown Normal.
Chapter 13 Further Normalization II: Higher Normal Forms.
Your name here. Improving Schemas and Normalization What are redundancies and anomalies? What are functional dependencies and how are they related to.
Further Normalization II: Higher Normal Forms Prof. Yin-Fu Huang CSIE, NYUST Chapter 13.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Lecture 09: Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
SCUJ. Holliday - coen 1784–1 Schedule Today: u Normal Forms. u Section 3.6. Next u Relational Algebra. Read chapter 5 to page 199 After that u SQL Queries.
FEN Quality checking table design: Design Guidelines Normalisation Table Design Is this OK?
Normal Forms through BCNF CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
Normalization Ioan Despi 2 The basic objective of logical modeling: to develop a “good” description of the data, its relationships and its constraints.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
CSE314 Database Systems Basics of Functional Dependencies and Normalization for Relational Databases Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E.
Henry 8 th and his wives By Anna Henry 8 th Henry had six wives 2 of them were be-headed 1 of them survived 1 of them died And 2 were divorced.
Functional Dependencies. FarkasCSCE 5202 Reading and Exercises Database Systems- The Complete Book: Chapter 3.1, 3.2, 3.3., 3.4 Following lecture slides.
1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization.
CS 338Database Design and Normal Forms9-1 Database Design and Normal Forms Lecture Topics Measuring the quality of a schema Schema design with normalization.
Ch 7: Normalization-Part 1
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
Dr. T. Y. Lin | SJSU | CS 157A | Fall 2015 Chapter 3 Database Normalization 1.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
1 CS 430 Database Theory Winter 2005 Lecture 8: Functional Dependencies Second, Third, and Boyce-Codd Normal Forms.
Advanced Database System
Chapter 7: Relational Database Design. ©Silberschatz, Korth and Sudarshan7.2Database System Concepts.
1 CS490 Database Management Systems. 2 CS490 Database Normalization.
Henry VIII By Taylor and Evan..
Henry VIII and his 6 Wives
Advanced Normalization
CS 480: Database Systems Lecture 22 March 6, 2013.
STRUCTURE OF PRESENTATION :
Advanced Normalization
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Fourth normal form: 4NF.
Functional Dependencies and Relational Schema Design
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Henry VIII and his 6 Wives
Designing Relational Databases
Presentation transcript:

1 Database Design Issues, Part II Hugh Darwen CS252.HACD: Fundamentals of Relational Databases Section 9: Database Design Issues, Part II

2 First Normal Form (1NF) On CS252 we shall assume that every relation, and therefore every relvar, is in 1NF. The term (due to E.F. Codd) is not clearly defined, partly because it depends on an ill-defined concept of “atomicity” (of attribute values). Some authorities take it that a relation is in 1NF iff none of its attributes is relation-valued or tuple-valued. It is certainly recommended to avoid use of such attributes (especially RVAs) in database relvars.

3 2NF and 3NF These normal forms, originally defined by E.F. Codd, were really “mistakes”. You will find definitions in the textbooks but there is no need to learn them. The faults with Codd’s original definition of 3NF were reported to him by Raymond Boyce. Together they worked on an improved, simpler normal form, which became known as Boyce-Codd Normal Form (BCNF).

4 Boyce/Codd Normal Form (BCNF) BCNF is defined thus: Relvar R is in BCNF if and only if for every nontrivial FD A → B satisfied by R, A is a superkey of R. More loosely, “every nontrivial determinant is a [candidate] key”. BCNF addresses redundancy arising from JDs that are consequences of FDs. (Not all JDs are consequences of FDs. We will look at the others later.)

5 Splitting ENROLMENT (bis) StudentIdName S1Anne S2Boris S3Cindy S4Devinder S5Boris StudentIdCourseId S1C1 S1C2 S2C1 S3C3 S4C1 IS_CALLEDIS_ENROLLED_ON The attributes involved in the “offending” FD have been separated into IS_CALLED, and now we can add student S5!

6 Advantages of BCNF With reference to our ENROLMENT example, decomposed into the BCNF relvars IS_CALLED and IS_ENROLLED_ON: Anne’s name is recorded twice in ENROLMENT, but only once in IS_CALLED. In ENROLMENT it might appear under different spellings (Anne, Ann), unless the FD { StudentId } → { Name} is declared as a constraint. Redundancy is the problem here. With ENROLMENT, a student’s name cannot be recorded unless that student is enrolled on some course, and an anonymous student cannot be enrolled on any course. Lack of orthogonality is the problem here.

7 Another Kind of Offending FD StudentIdTutorIdTutorNameCourseId S1T1HughC1 S1T2MaryC2 S2T3LisaC1 S3T4FredC3 S4T1HughC1 TUTORS_ON Assume the FD { TutorId } → { TutorName } holds.

8 Splitting TUTORS_ON TutorIdTutorName T1Hugh T2Mary T3Lisa T4Fred T5Zack TUTOR_NAME StudentIdTutorIdCourseId S1T1C1 S1T2C2 S2T3C1 S3T4C3 S4T1C1 TUTORS_ON_BCNF Now we can put Zack, who isn’t assigned to anybody yet, into the database. Note the FK required for TUTORS_ON_BCNF.

9 Dependency Preservation StudentIdCourseIdOrganiserRoom S1C1Owen13 S1C2Olga24 S2C1Owen13 SCOR Assume FDs: { CourseId } → { Organiser } { Organiser } → { Room } { Room } → { Organiser } Which one do we address first?

10 Try 1: {CourseId}→{Organiser} StudentIdCourseIdRoom S1C113 S1C224 S2C113 SCR “Loses” { Room } → { Organiser } and {Organiser } → { Room } CO CourseIdOrganiser C1Owen C2Olga

11 Try 2: {Room}→{Organiser} StudentIdCourseIdRoom S1C113 S1C224 S2C113 SCR “Loses” { CourseId } → { Organiser } OR OrganiserRoom Owen13 Olga24

12 Try 3: {Organiser}→{Room} StudentIdCourseIdOrganiser S1C1Owen S1C2Olga S2C1Owen SCO Preserves all three FDs! OR OrganiserRoom Owen13 Olga24 (But we must still decompose SCO, of course)

13 An FD That Cannot Be Preserved Now assume the FD { TutorId } → { CourseId } holds. StudentIdTutorIdCourseId S1T1C1 S1T2C2 S2T3C1 S3T4C3 S4T1C1 TUTOR_FOR This is a third kind of offending FD.

14 Splitting TUTOR_FOR StudentIdTutorId S1T1 S1T2 S2T3 S3T4 S4T1 TUTORS TutorIdCourseId T1C1 T2C2 T3C1 T4C3 TEACHES Note the keys. Have we “lost” the FD { StudentId, CourseId } → { TutorId } ? And the FK referencing IS_ENROLLED_ON?

15 Reinstating The Lost FD CONSTRAINT KEY_OF_TUTORS_JOIN_TEACHES WITH TUTORS JOIN TEACHES AS TJT : COUNT ( TJT ) = COUNT ( TJT { StudentId, CourseId } ) ; Need to add the following constraint: CONSTRAINT KEY_OF_TUTORS_JOIN_TEACHES IS_EMPTY ( ( TUTORS JOIN TEACHES ) GROUP { ALL BUT StudentId, CourseId } AS G WHERE COUNT ( G ) > 1 ) ; or equivalently:

16 And The Lost Foreign Key The “lost” foreign key is easier: CONSTRAINT FK_FOR_TUTORS_JOIN_TEACHES IS_EMPTY ( ( TUTORS JOIN TEACHES ) NOT MATCHING IS_ENROLLED_ON ) ;

17 In BCNF But Still Problematical Assume the JD *{ { Teacher, Book }, { Book, CourseId } } holds. TeacherBookCourseId T1Database SystemsC1 T1Database in DepthC1 T1Database SystemsC2 T1Database in DepthC2 T2Database in DepthC2 TBC1

18 Normalising TBC1 TeacherBook T1Database Systems T1Database in Depth T2Database in Depth TB BookCourseId Database SystemsC1 Database in DepthC1 Database SystemsC2 Database in DepthC2 BC We have lost the constraint implied by the JD, but does a teacher really have to teach a course just because he or she uses a book that is used on that course?

19 Fifth Normal Form (5NF) 5NF caters for all harmful JDs. Relvar R is in 5NF iff every nontrivial JD that holds in R is implied by the keys of R. (Fagin’s definition, 1979) To explain “nontrivial”: A JD is trivial if and only if one of its operands is the entire heading of R (because every such JD is clearly satisfied by R). Apart from a few weird exceptions, a JD is “implied by the keys” if every projection is a superkey. (Date’s definition – but see the Notes for this slide)

20 A JD of Degree > 2 Now assume the JD *{ { Teacher, Book }, { Book, CourseId }, { Teacher, CourseId } } holds. TeacherBookCourseId T1Database SystemsC1 T1Database in DepthC1 T1Database SystemsC2 T1Database in DepthC2 T2Database in DepthC2 TBC2

21 Normalising TBC2 TeacherBook T1Database Systems T1Database in Depth T2Database in Depth TB BookCourseId Database SystemsC1 Database in DepthC1 Database SystemsC2 Database in DepthC2 BC TeacherCourseId T1C1 T1C2 T2C2 TC (and we’ve “lost” the constraint again)

22 Sixth Normal Form (6NF) 6NF subsumes 5NF and is the strictest NF: Relvar R is in 6NF if and only if every JD that holds in R is trivial. 6NF provides maximal orthogonality, as already noted, but is not normally advised. It addresses additional anomalies that can arise with temporal data (beyond the scope of this course—and, what’s more, the definition of join dependency has to be revised).

23 Wives of Henry VIII in 6NF Wife#FirstName 1Catherine 2Anne 3Jane 4Anne 5Catherine 6 W_FN Wife#LastName 1of Aragon 2Boleyn 3Seymour 4of Cleves 5Howard 6Parr W_LN Wife#Fate 1divorced 2beheaded 3died 4divorced 5beheaded 6survived W_F Not a good idea!

24 EXERCISE (see Notes)