Normalization Beyond Third Normal Form


Presentation on theme: "Normalization Beyond Third Normal Form"— Presentation transcript:

1 Normalization Beyond Third Normal Form
Hugo Kornelis

2 Database design
Normalization is not boring: it is fun
Normalization is not hard: it is easy
Normalization is not unimportant: it is very important

3 Hugo Kornelis
Independent database consultant
Community addict
Speaker, blogger, author, technical editor, Pluralsight author, etc.
MVP (SQL Server/Data Platform)
Blog:

4 Thank You To Our Sponsors!

5 Overview
The basics …
  Key concepts
  Normalization up to Third Normal Form
… and beyond
  All “higher” normal forms
  EKNF, BCNF, 4NF, 5NF, DK/NF, ONF, 6NF

7 Key Concepts
Universe of Discourse (UoD)
Subset of reality …
… (or of a virtual reality) …
… as it is seen by the business
If it’s not in the UoD, we don’t care.

8 Key Concepts
Purpose of normalization
Prevent data that is incorrect
  impossible
  inconsistent
  in violation of business rules
Normal forms defined at a per-table level

9 Key Concepts
Functional dependency
Column A determines column B (A → B), if …
… for each possible value for A, there is …
… either one value for B or no value for B
… but NEVER more than one value
Examples:
  Chair number → Name
  Chair number → Birthdate
  Birthdate → Name

10 Key Concepts
Functional dependency - terminology
These all mean the same:
  Name depends on Badge number
  Badge number determines Name
  (short form) Badge number → Name
“Functional dependency” – sometimes “FD”

11 Key Concepts
Composite functional dependency
{A, B} → C if for each combination of A and B, there is …
… either one value for C or no value for C
… but NEVER more than one value
Example:
  {Room number, Chair number} → Name
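The definition above can be tested mechanically against sample rows. What follows is a minimal Python sketch, not part of the original deck; the function name `fd_holds` and the sample rows are assumptions made for illustration. Note that sample data can only refute a dependency: whether it really holds is decided by the business rules in the UoD.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if no determinant value maps to more than one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not taken from the slides).
seats = [
    {"Room": "A", "Chair": 25, "Name": "Marge"},
    {"Room": "B", "Chair": 3, "Name": "William"},
    {"Room": "B", "Chair": 25, "Name": "Julie"},
]

print(fd_holds(seats, ["Room", "Chair"], ["Name"]))  # True
print(fd_holds(seats, ["Chair"], ["Name"]))          # False: chair 25 seats two people
```

Passing a list with more than one column as `determinant` covers the composite case; passing several columns as `dependent` checks them all at once.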

12 Key Concepts
“Cheating” composite functional dependency
These are irrelevant
  Badge number → Name
  {Badge number, Chair number} → Name   Cheater!

13 Key Concepts
Composite the other way around??
A → {B, C}
Completely equivalent to A → B and A → C

14 Key Concepts
Candidate Key:
Within a table
Column (or combination of columns)
Determines every other column in the table
BadgeNo  Room  Chair  Name
123      A     25     Marge
124      B     3      William
126            24     Julie
127                   André
128      C     5      Kathryn

15 Key Concepts
Candidate Key
May be more than one
  One “Primary” key
  Rest “Alternate” key
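A candidate key has two properties that can be probed on sample data: its values are unique, and no column can be dropped without losing that uniqueness. The sketch below is an illustration only; the helper names are hypothetical, and the Room and Chair values for badges 126 and 127 are assumptions, because the transcript does not preserve them.

```python
from itertools import combinations


def is_unique(rows, columns):
    """True if no two rows share the same values in the given columns."""
    values = [tuple(row[col] for col in columns) for row in rows]
    return len(set(values)) == len(values)


def is_candidate_key(rows, columns):
    """Unique, and minimal: no proper subset of the columns is unique by itself."""
    if not is_unique(rows, columns):
        return False
    return not any(
        is_unique(rows, subset)
        for size in range(1, len(columns))
        for subset in combinations(columns, size)
    )


badges = [
    {"BadgeNo": 123, "Room": "A", "Chair": 25, "Name": "Marge"},
    {"BadgeNo": 124, "Room": "B", "Chair": 3, "Name": "William"},
    {"BadgeNo": 126, "Room": "B", "Chair": 24, "Name": "Julie"},   # Room assumed
    {"BadgeNo": 127, "Room": "A", "Chair": 24, "Name": "André"},   # Room, Chair assumed
    {"BadgeNo": 128, "Room": "C", "Chair": 5, "Name": "Kathryn"},
]

print(is_candidate_key(badges, ["BadgeNo"]))           # True
print(is_candidate_key(badges, ["Room", "Chair"]))     # True
print(is_candidate_key(badges, ["BadgeNo", "Chair"]))  # False: BadgeNo alone suffices
print(is_candidate_key(badges, ["Room"]))              # False: rooms repeat
```

Name also happens to be unique in this sample, which is a reminder that data alone cannot decide what a key is; that comes from the UoD.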

16 First Normal Form
Table is in First Normal Form (1NF) if
Table has at least one candidate key
All columns are atomic
  No repeating groups
  No composite values
  Depends highly on UoD!

17 First Normal Form
No repeating groups
Speaker         Country         Sessions
Allan Mitchell  United Kingdom  AD-205, AD-207, BI-205
Oliver Engels   Germany         DBA-305, PD-203
Speaker         Country         Session
Allan Mitchell  United Kingdom  AD-205
Allan Mitchell  United Kingdom  AD-207
Allan Mitchell  United Kingdom  BI-205
Oliver Engels   Germany         DBA-305
Oliver Engels   Germany         PD-203

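18 First Normal Form
No composite values
Speaker         Country         Session
Allan Mitchell  United Kingdom  AD-205
Allan Mitchell  United Kingdom  AD-207
Allan Mitchell  United Kingdom  BI-205
Oliver Engels   Germany         DBA-305
Oliver Engels   Germany         PD-203
Speaker         Country         Track  Session No
Allan Mitchell  United Kingdom  AD     205
Allan Mitchell  United Kingdom  AD     207
Allan Mitchell  United Kingdom  BI     205
Oliver Engels   Germany         DBA    305
Oliver Engels   Germany         PD     203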

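19 First Normal Form
No composite values
Speaker         Country         Session
Allan Mitchell  United Kingdom  AD-205
Allan Mitchell  United Kingdom  AD-207
Allan Mitchell  United Kingdom  BI-205
Oliver Engels   Germany         DBA-305
Oliver Engels   Germany         PD-203
First name  Last name  Country         Session
Allan       Mitchell   United Kingdom  AD-205
Allan       Mitchell   United Kingdom  AD-207
Allan       Mitchell   United Kingdom  BI-205
Oliver      Engels     Germany         DBA-305
Oliver      Engels     Germany         PD-203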
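The three 1NF slides can be summarized in a few lines of Python. This is a rough sketch that applies all three transformations at once, using the speaker data from the slides; the function name is hypothetical, and whether the session code or the speaker name actually needs splitting depends on the UoD.

```python
speakers = [
    {"Speaker": "Allan Mitchell", "Country": "United Kingdom",
     "Sessions": "AD-205, AD-207, BI-205"},
    {"Speaker": "Oliver Engels", "Country": "Germany",
     "Sessions": "DBA-305, PD-203"},
]


def to_first_normal_form(rows):
    """Emit one row per session and split the composite values into columns."""
    result = []
    for row in rows:
        # Composite value: split the name at the first space only.
        first_name, last_name = row["Speaker"].split(" ", 1)
        # Repeating group: one output row per listed session.
        for code in row["Sessions"].split(","):
            track, session_no = code.strip().split("-")
            result.append({
                "First name": first_name,
                "Last name": last_name,
                "Country": row["Country"],
                "Track": track,
                "Session No": int(session_no),
            })
    return result


for row in to_first_normal_form(speakers):
    print(row)
```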

20 Second Normal Form
Table is in Second Normal Form (2NF) if
Table is in First Normal Form
Non-key columns depend on the whole keys
  Only applies for columns that are not part of any key
  Can only be violated if at least one key is composite

21 Second Normal Form
Non-key columns depend on the whole keys
Session  Room   Start time  Room capacity
AD-205   Blue   10:00       125
AD-207   Grand  12:45       228
BI-205   605    16:00       60
DBA-305
PD-203   225    14:30
Session → Room capacity
{Room, Start time} → Room capacity   Cheater!
Room → Room capacity

22 Second Normal Form
Non-key columns depend on the whole keys
Session  Room   Start time  Room capacity
AD-205   Blue   10:00       125
AD-207   Grand  12:45       228
BI-205   605    16:00       60
DBA-305
PD-203   225    14:30
Room → Room capacity
Room   Room capacity
Blue   125
Grand  228
605    60
225

23 Second Normal Form
Non-key columns depend on the whole keys
Session  Room   Start time
AD-205   Blue   10:00
AD-207   Grand  12:45
BI-205   605    16:00
DBA-305
PD-203   225    14:30
Room   Room capacity
Blue   125
Grand  228
605    60
225
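The split shown on this slide is a pair of projections, and joining them on Room gives the original table back. The following Python sketch illustrates that under stated assumptions: the `project` helper is hypothetical, and the DBA-305 row is invented because the transcript does not preserve its values.

```python
def project(rows, columns):
    """Distinct rows restricted to the given columns, in first-seen order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[col] for col in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


sessions = [
    {"Session": "AD-205", "Room": "Blue", "Start time": "10:00", "Room capacity": 125},
    {"Session": "AD-207", "Room": "Grand", "Start time": "12:45", "Room capacity": 228},
    {"Session": "BI-205", "Room": "605", "Start time": "16:00", "Room capacity": 60},
    # Assumed row: room and time for DBA-305 are not recoverable from the transcript.
    {"Session": "DBA-305", "Room": "Blue", "Start time": "12:45", "Room capacity": 125},
]

schedule = project(sessions, ["Session", "Room", "Start time"])
rooms = project(sessions, ["Room", "Room capacity"])

# Join the two projections back together on Room.
capacity = {room["Room"]: room["Room capacity"] for room in rooms}
rejoined = [{**row, "Room capacity": capacity[row["Room"]]} for row in schedule]

print(len(rooms))            # 3: the capacity of Blue is now stored once
print(rejoined == sessions)  # True: nothing was lost by splitting
```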

24 Third Normal Form
Table is in Third Normal Form (3NF) if
Table is in Second Normal Form
Non-key columns depend on nothing but the keys
  Only applies for columns that are not part of any key
  Violated if non-key column depends on …
    … one or more other non-key columns
    … two or more key columns that are part of different keys
    … one or more non-key columns combined with one or more key columns
  Not violated if non-key column depends on …
    … one or more columns that are all part of the same key (because that would already violate 2NF)

25 Third Normal Form
Non-key columns depend on nothing but the keys
Badge number  Speaker         Session
123           Oliver Engels   DBA-305
124           Allan Mitchell  BI-205
126
127
128           Lara Rubbelke   AD-101
Badge number → Speaker
Badge number → Session
Speaker → Session
Session → Speaker

26 Third Normal Form
Non-key columns depend on nothing but the keys
Badge number  Speaker         Session
123           Oliver Engels   DBA-305
124           Allan Mitchell  BI-205
126
127
128           Lara Rubbelke   AD-101
Session → Speaker
Session  Speaker
DBA-305  Oliver Engels
BI-205   Allan Mitchell
AD-101   Lara Rubbelke

27 Third Normal Form
Non-key columns depend on nothing but the keys
Badge number  Session
123           DBA-305
124           BI-205
126
127
128           AD-101
Session  Speaker
DBA-305  Oliver Engels
BI-205   Allan Mitchell
AD-101   Lara Rubbelke

28 Summary
Table is in Third Normal Form if every non-key column depends on
  The keys,
  The whole keys,
  And nothing but the keys
  (so help me Codd)
Bernstein’s algorithm for synthesis of a Third Normal Form schema
Dr. E.F. Codd
Illustration: Michael J. Swart

29 Boyce-Codd Normal Form
Remember Third Normal Form?
  Every non-key column depends on the keys, the whole keys, and nothing but the keys
Here’s Boyce-Codd Normal Form (BCNF):
  Every column depends on the keys, the whole keys, and nothing but the keys

30 Boyce-Codd Normal Form
Key columns depend on the whole keys
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203             225    14:30
{Track, Session number} → Room   Cheater!
{Track, Session number} → Start time
{Room, Start time} → Session number
{Room, Start time} → Track   Cheater!
Track → Room
Room → Track
605
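The dependencies listed on this slide are enough to find the BCNF violations mechanically. The sketch below uses the common textbook formulation (the determinant of every non-trivial functional dependency must be a superkey) rather than the slide's wording, and a standard attribute-closure loop; the function names are assumptions for illustration.

```python
def closure(attributes, fds):
    """All columns functionally determined by the given set of columns."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(columns, fds):
    """Non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(columns)
    ]


columns = ["Track", "Session number", "Room", "Start time"]
fds = [
    (("Track", "Session number"), ("Room",)),
    (("Track", "Session number"), ("Start time",)),
    (("Room", "Start time"), ("Session number",)),
    (("Room", "Start time"), ("Track",)),
    (("Track",), ("Room",)),
    (("Room",), ("Track",)),
]

for lhs, rhs in bcnf_violations(columns, fds):
    print(lhs, "->", rhs)
```

Only Track → Room and Room → Track are reported: each determinant is a single key column that does not determine the rest of the table.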

31 Boyce-Codd Normal Form
Key columns depend on the whole keys
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203             225    14:30
Track → Room
Room → Track
Track  Room
AD     Blue
BI     605
DBA    Grand
PD     225

32 Boyce-Codd Normal Form
Key columns depend on the whole keys
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203             225    14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD     225

33 Boyce-Codd Normal Form
Key columns depend on the whole keys
?
Track  Session number  Start time
AD     205             10:00
       207             12:45
BI                     16:00
DBA    305
PD     203             14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD     225

34 Boyce-Codd Normal Form
Key columns depend on the whole keys
Track  Session number  Start time
AD     205             10:00
       207             12:45
BI                     16:00
DBA    305
PD     203             14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD     225

36 Boyce-Codd Normal Form
Key columns depend on the whole keys
Not always possible to achieve
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203                    14:30
{Track, Session number} → Room
{Track, Session number} → Start time
{Room, Start time} → Session number
{Room, Start time} → Track
Track → Room
Room → Track
Cheater!

37 Boyce-Codd Normal Form
Key columns depend on the whole keys
Not always possible to achieve
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203                    14:30
Track → Room
Room → Track
Track  Room
AD     Blue
BI     605
DBA    Grand
PD

38 Boyce-Codd Normal Form
Key columns depend on the whole keys
Not always possible to achieve
?
Track  Session number  Start time
AD     205             10:00
       207             12:45
BI                     16:00
DBA    305
PD     203             14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD

39 Boyce-Codd Normal Form
Key columns depend on the whole keys
Not always possible to achieve
Track  Session number  Start time
AD     205             10:00
       207             12:45
BI                     16:00
DBA    305
PD     203             14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD
14:30

40 Boyce-Codd Normal Form
Key columns depend on the whole keys
Not always possible to achieve
Alternative form is not safe either
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203                    14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD     605

41 Boyce-Codd Normal Form
Key columns depend on the whole keys
Not always possible to achieve
Alternative form is not safe either, unless you add a “weird” foreign key
Track  Session number  Room   Start time
AD     205             Blue   10:00
       207                    12:45
BI                     605    16:00
DBA    305             Grand
PD     203                    14:30
Track  Room
AD     Blue
BI     605
DBA    Grand
PD

42 Elementary Key Normal Form
Remember Third Normal Form?
  Every non-key column depends on the keys, the whole keys, and nothing but the keys
Remember Boyce-Codd Normal Form?
  Every column depends on the keys, the whole keys, and nothing but the keys
Here’s Elementary Key Normal Form (EKNF):
  Every non-elementary key column depends on the keys, the whole keys, and nothing but the keys

43 Elementary Key Normal Form
What is an elementary key?
  Based on elementary dependencies
  {A, B} → C is not elementary if C → A or C → B
  Elementary key is any key that implements at least one elementary dependency
EKNF is the same as 3NF, except for columns in non-elementary keys
Highest normal form that is guaranteed achievable
  Bernstein’s algorithm for synthesis of a Third Normal Form schema
Does not solve BCNF violations

44 Fourth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
?
On Monday, you can ask Erland questions
Erland knows about Design
Fourth Normal Form not violated

45 Fourth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
?
On Monday, you can ask Erland questions
Erland knows about Design
!
Fourth Normal Form IS violated!
Facts are represented multiple times

46 Fourth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
On Monday, you can ask Erland questions
Erland knows about Design
!
Day Expert
Monday Erland Oliver Tuesday Wednesday Hugo
Expert Subject
Erland Design Tuning Oliver BI Hugo

47 Fourth Normal Form
Table is in Fourth Normal Form (4NF) if
Table is in Boyce-Codd Normal Form
No multivalued dependencies between a subset of the columns
  There will always (by definition) be multivalued dependencies between all columns in the table

48 Fourth Normal Form
Multivalued dependency
Column A “multidetermines” column B (A ↠ B), if …
… for each possible value for A, there are …
… zero, one or more values for B, regardless of values in other columns
Examples:
  Session ↠ Attendee
  Presenter ↠ Attendee

49 Fourth Normal Form
Composite multivalued dependency
{A, B} ↠ C if for each combination of A and B, there are …
… zero, one or more values for C, regardless of values in other columns
Example:
  {Conference, Session} ↠ Attendee

50 Fourth Normal Form
Composite the other way around??
A ↠ {B, C}
Is NOT equivalent to A ↠ B and A ↠ C
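The phrase “regardless of values in other columns” has a precise consequence that can be tested: for every value of A, each B value must appear combined with each combination of the remaining columns. The Python sketch below checks that condition on sample rows. It follows the usual formal definition of a multivalued dependency rather than anything shown in the deck, and the function name and the sample rows are assumptions made for illustration.

```python
def mvd_holds(rows, determinant, dependent):
    """True if, per determinant value, dependent values pair with all remaining values."""
    if not rows:
        return True
    rest = [col for col in rows[0] if col not in determinant and col not in dependent]
    groups = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        pair = (tuple(row[col] for col in dependent), tuple(row[col] for col in rest))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        dependent_values = {pair[0] for pair in pairs}
        rest_values = {pair[1] for pair in pairs}
        # The pairs always form a subset of the cross product, so equal size means equal sets.
        if len(pairs) != len(dependent_values) * len(rest_values):
            return False
    return True


# Hypothetical rows: who can be asked about what, on which day.
asks = [
    {"Day": "Monday", "Expert": "Erland", "Subject": "Design"},
    {"Day": "Monday", "Expert": "Erland", "Subject": "Tuning"},
    {"Day": "Tuesday", "Expert": "Erland", "Subject": "Design"},
    {"Day": "Tuesday", "Expert": "Erland", "Subject": "Tuning"},
    {"Day": "Monday", "Expert": "Oliver", "Subject": "BI"},
    {"Day": "Wednesday", "Expert": "Hugo", "Subject": "Design"},
]

print(mvd_holds(asks, ["Expert"], ["Subject"]))       # True
print(mvd_holds(asks[:3] + asks[4:], ["Expert"], ["Subject"]))  # False: one combination is missing
```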

51 Fourth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
On Monday, you can ask Erland questions
Erland knows about Design
Expert ↠ {Day, Subject}
Expert ↠ Day
Expert ↠ Subject
Fourth Normal Form not violated

52 Fourth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
On Monday, you can ask Erland questions
Erland knows about Design
Expert ↠ {Day, Subject}
Expert ↠ Day
Expert ↠ Subject
Day Expert
Monday Erland Oliver Tuesday Wednesday Hugo
Expert Subject
Erland Design Tuning Oliver BI Hugo

53 Fifth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
On Monday, you can ask Erland questions
Erland knows about Design
On Monday, you can ask about design
Expert ↠ {Day, Subject}
Expert ↠ Day
Expert ↠ Subject
Fourth Normal Form not violated …
… but Fifth Normal Form IS violated!

54 Fifth Normal Form
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
On Monday, you can ask Erland about Design
On Monday, you can ask Erland questions
Erland knows about Design
On Monday, you can ask about design
Expert ↠ {Day, Subject}
Expert ↠ Day
Expert ↠ Subject
Day Subject
Monday Design Tuning BI Tuesday Wednesday
Day Expert
Monday Erland Oliver Tuesday Wednesday Hugo
Expert Subject
Erland Design Tuning Oliver BI Hugo

55 Fifth Normal Form
Table is in Fifth Normal Form (5NF) if
Table is in Fourth Normal Form
No join dependencies, unless implied by a key

56 Fifth Normal Form
Join dependency
Day Expert Subject
Monday Erland Design Tuning Oliver BI Tuesday Wednesday Hugo
JOIN
Day Subject
Monday Design Tuning BI Tuesday Wednesday
Day Expert
Monday Erland Oliver Tuesday Wednesday Hugo
Expert Subject
Erland Design Tuning Oliver BI Hugo
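A join dependency over the three column pairs holds when joining the three projections gives back exactly the original rows, with no extra ones. The sketch below tests that on sample data. The function name and both sets of rows are assumptions made for illustration; they are not the rows from the slide, which the transcript does not fully preserve.

```python
def join_dependency_holds(rows):
    """True if (Day, Expert, Subject) rows equal the join of their three pairwise projections."""
    rows = set(rows)
    day_expert = {(day, expert) for day, expert, subject in rows}
    expert_subject = {(expert, subject) for day, expert, subject in rows}
    day_subject = {(day, subject) for day, expert, subject in rows}
    joined = {
        (day, expert, subject)
        for day, expert in day_expert
        for other_expert, subject in expert_subject
        if expert == other_expert and (day, subject) in day_subject
    }
    return joined == rows


# Hypothetical rows for which the three-way split loses nothing.
splittable = {
    ("Monday", "Erland", "Design"),
    ("Monday", "Oliver", "BI"),
    ("Tuesday", "Erland", "Tuning"),
    ("Wednesday", "Hugo", "Design"),
}

# Hypothetical rows where the join invents ("Tuesday", "Erland", "Design").
not_splittable = {
    ("Monday", "Erland", "Design"),
    ("Tuesday", "Erland", "Tuning"),
    ("Tuesday", "Oliver", "Design"),
}

print(join_dependency_holds(splittable))      # True
print(join_dependency_holds(not_splittable))  # False
```

As with functional dependencies, a check on data can only refute the dependency; whether it is a rule is a question for the UoD.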

57 Sixth Normal Form
Remember Fifth Normal Form?
  No join dependencies, unless implied by a key
Here’s Sixth Normal Form (6NF):

58 Sixth Normal Form
Session  Topic        First name  Last name
AD-205   Indexes      Tim         Chapman
AD-207   Performance  Margarita   Naumova
BI-205   Dashboards   Oliver      Engels
DBA-305  Backups      Grant       Fritchey
PD-203   Negotiating  Steve       Jones
JOIN
Session  Topic
AD-205   Indexes
AD-207   Performance
BI-205   Dashboards
DBA-305  Backups
PD-203   Negotiating
Session  First name
AD-205   Tim
AD-207   Margarita
BI-205   Oliver
DBA-305  Grant
PD-203   Steve
Session  Last name
AD-205   Chapman
AD-207   Naumova
BI-205   Engels
DBA-305  Fritchey
PD-203   Jones

59 Sixth Normal Form
Session  Topic       First name  Last name
AD-205   Indexes     Tim         Chapman
AD-207   NULL        Margarita   Naumova
BI-205   Dashboards  Oliver      Engels
DBA-305  Backups     NULL        NULL
PD-203   NULL        Steve       Jones
JOIN
Session  Topic
AD-205   Indexes
AD-207   Performance
BI-205   Dashboards
DBA-305  Backups
PD-203   Negotiating
Session  First name
AD-205   Tim
AD-207   Margarita
BI-205   Oliver
DBA-305  Grant
PD-203   Steve
Session  Last name
AD-205   Chapman
AD-207   Naumova
BI-205   Engels
DBA-305  Fritchey
PD-203   Jones

60 Sixth Normal Form
Session  Topic       First name  Last name
AD-205   Indexes     Tim         Chapman
AD-207   NULL        Margarita   Naumova
BI-205   Dashboards  Oliver      Engels
DBA-305  Backups     NULL        NULL
PD-203   NULL        Steve       Jones
JOIN
Session  Topic
AD-205   Indexes
BI-205   Dashboards
DBA-305  Backups
Session  First name
AD-205   Tim
AD-207   Margarita
BI-205   Oliver
PD-203   Steve
Session  Last name
AD-205   Chapman
AD-207   Naumova
BI-205   Engels
PD-203   Jones
Chapman Chapman
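The effect shown on this slide, one narrow table per non-key column with no row where the value would be NULL, can be sketched in Python as follows. The helper names are hypothetical; the data is the table from the slide, with `None` standing in for NULL. The list of all sessions is passed in separately because the narrow tables alone no longer record sessions for which nothing is known.

```python
sessions = [
    {"Session": "AD-205", "Topic": "Indexes", "First name": "Tim", "Last name": "Chapman"},
    {"Session": "AD-207", "Topic": None, "First name": "Margarita", "Last name": "Naumova"},
    {"Session": "BI-205", "Topic": "Dashboards", "First name": "Oliver", "Last name": "Engels"},
    {"Session": "DBA-305", "Topic": "Backups", "First name": None, "Last name": None},
    {"Session": "PD-203", "Topic": None, "First name": "Steve", "Last name": "Jones"},
]


def decompose(rows, key, attributes):
    """One key-to-value mapping per attribute, skipping missing values."""
    return {
        attr: {row[key]: row[attr] for row in rows if row[attr] is not None}
        for attr in attributes
    }


def recompose(tables, key, key_values):
    """Outer-join the narrow tables back together; missing values become None."""
    return [
        {key: value, **{attr: table.get(value) for attr, table in tables.items()}}
        for value in key_values
    ]


tables = decompose(sessions, "Session", ["Topic", "First name", "Last name"])
print(tables["Topic"])

all_sessions = [row["Session"] for row in sessions]
print(recompose(tables, "Session", all_sessions) == sessions)  # True
```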

61 Optimal Normal Form
Session  Topic       First name  Last name
AD-205   Indexes     Tim         Chapman
AD-207   NULL        Margarita   Naumova
BI-205   Dashboards  Oliver      Engels
DBA-305  Backups     NULL        NULL
PD-203   NULL        Steve       Jones
JOIN
Session  Topic
AD-205   Indexes
BI-205   Dashboards
DBA-305  Backups
Session  First name  Last name
AD-205   Tim         Chapman
AD-207   Margarita   Naumova
BI-205   Oliver      Engels
PD-203   Steve       Jones

62 Optimal Normal Form
Optimal Normal Form (ONF)
Based on fact-based modeling methods (e.g. ORM, NIAM)
Every elementary fact type becomes a table
End result is mostly 6NF, …
… except in some situations
  Composite foreign keys
  Composite alternate keys
No academic foundation (as far as I know)

63 Domain-Key Normal Form
Requirements for Domain-Key Normal Form (DK/NF):
Is NOT based on dependencies
Based on:
  Domains (Which values are allowed in a column?)
  Keys (All candidate keys)
  Constraints (Rules for valid data)
Every constraint must be implied by the keys and domains
Implies Fifth Normal Form (and lower)
Probably implies Optimal Normal Form
Does not imply Sixth Normal Form, nor is it implied by Sixth Normal Form

64 Domain-Key Normal Form
Relevance of Domain-Key Normal Form
Domains → declared, enforced (no code needed)
Keys → declared, enforced (no code needed)
Other constraints → code needed to enforce
Code = cost factor:
  Time to write
  Time to test and debug
  Future maintenance

65 Domain-Key Normal Form
Achievability of Domain-Key Normal Form
Sometimes impossible
  “Every presenter delivers at least three sessions”
Otherwise often requires extra tables (subtypes)
  Introduces need for more (and more complex) code for queries
Code = cost factor:
  Time to write
  Time to test and debug
  Future maintenance
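The quoted rule is an example of a constraint that neither a domain nor a key can express, so it has to be checked by code. A minimal Python sketch of such a check follows; the function name and the Presenter column are assumptions, and the rows reuse the speaker and session pairs from the First Normal Form slides.

```python
from collections import Counter


def presenters_below_minimum(sessions, minimum=3):
    """Presenters who deliver fewer sessions than the rule requires."""
    counts = Counter(row["Presenter"] for row in sessions)
    return sorted(name for name, count in counts.items() if count < minimum)


schedule = [
    {"Session": "AD-205", "Presenter": "Allan Mitchell"},
    {"Session": "AD-207", "Presenter": "Allan Mitchell"},
    {"Session": "BI-205", "Presenter": "Allan Mitchell"},
    {"Session": "DBA-305", "Presenter": "Oliver Engels"},
    {"Session": "PD-203", "Presenter": "Oliver Engels"},
]

print(presenters_below_minimum(schedule))  # ['Oliver Engels']
```

Even this small check has to be written, tested, and rerun whenever the schedule changes, which is the cost factor the slide refers to.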

66 Overview 1NF 2NF 3NF BCNF 4NF 5NF 6NF

67 Overview 1NF 2NF 3NF BCNF 4NF 5NF 6NF EKNF

68 Overview 1NF 2NF 3NF BCNF 4NF 5NF 6NF EKNF ONF DK/NF

69 THE END
Download deck: Click “Sessions”–“Schedule” … “Download”

