Published by Beverly Willis. Modified over 9 years ago.
1
IS6145 Database Analysis and Design
Lecture 10: Normalization of Data Tables
Rob Gleasure
R.Gleasure@ucc.ie
www.robgleasure.com
2
IS6145 Today’s session
  What did we learn from the reports?
  Normalisation
3
Themes from the reports
  Three Vs
  Datafication
    Recording things is the first step
  Information asymmetry
    It’s bad for a market when one side knows more than the other about the quality of specific instances
  Privacy
    Data has value to a consumer
4
Inferring Functional Dependencies (The Armstrong Axioms)
  1. Reflexivity: If Y is a subset of X, then X → Y
  2. Augmentation: If X → Y, then XZ → YZ
  3. Transitivity: If X → Y and Y → Z, then X → Z
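The axioms above are usually applied through an attribute closure: start from a set of attributes and keep adding whatever the known dependencies let you derive. The sketch below is a minimal illustration of that idea, not something taken from the slides. The function name `closure` and the sample dependencies (Order_ID → Cust_ID, Date; Cust_ID → Address; Address → Zone) are assumptions loosely based on the orders example used later in the lecture.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`.

    `fds` is a list of (left_side, right_side) pairs of attribute sets.
    Starting from `attrs` itself is reflexivity; chaining rules until
    nothing new appears is repeated use of transitivity.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical dependencies, assumed for illustration only.
fds = [
    ({"Order_ID"}, {"Cust_ID", "Date"}),
    ({"Cust_ID"}, {"Address"}),
    ({"Address"}, {"Zone"}),
]

print(sorted(closure({"Order_ID"}, fds)))
# ['Address', 'Cust_ID', 'Date', 'Order_ID', 'Zone']
```

Under these assumed dependencies, Order_ID determines Zone only indirectly (via Cust_ID and Address), which is the kind of chain that third normal form is concerned with.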
5
Normalisation: Orders Table

Full_Name   | Address       | Zone       | Order_ID | Date       | Product_1 | Cost_P1 | Units_P1 | Product_2 | Cost_P2 | Units_P2 | Product_3 | Cost_P3 | Units_P3
John Murphy | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Football  | $20.00  | 2        | Gloves    | $53.50  | 1        | Whistle   | $5.00   | 1
Mary Byrne  | Kildamanfadar | Rural      | R367     | 9/9/2014   | Helmet    | $30.50  | 1        |           |         |          |           |         |
Anne Dunne  | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Pants     | $13.75  | 2        | Hat       | $11.00  | 2        |           |         |
Jim Feltz   | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Hat       | $28.75  | 2        | Boots     | $75.95  | 1        |           |         |
John Murphy | 123 Fake St   | Inner-city | S354     | 1/01/2015  | Socks     | $3.50   | 5        |           |         |          |           |         |
6
Normalisation: First Normal Form

Full_Name   | Address       | Zone       | Order_ID | Date       | Product  | Cost   | Units
John Murphy | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Football | $20.00 | 2
John Murphy | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Gloves   | $53.50 | 1
John Murphy | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Whistle  | $5.00  | 1
John Murphy | 123 Fake St   | Inner-city | S354     | 31/12/2014 | Socks    | $3.50  | 5
Mary Byrne  | Kildamanfadar | Rural      | R367     | 9/9/2014   | Helmet   | $30.50 | 1
Anne Dunne  | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Pants    | $13.75 | 2
Anne Dunne  | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Hat      | $11.00 | 2
Jim Feltz   | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Hat      | $28.75 | 2
Jim Feltz   | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Boots    | $75.95 | 1
7
First Normal Form (continued)

First_Name | Last_Name | Address       | Zone       | Order_ID | Date       | Product  | Cost   | Units
John       | Murphy    | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Football | $20.00 | 2
John       | Murphy    | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Gloves   | $53.50 | 1
John       | Murphy    | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Whistle  | $5.00  | 1
John       | Murphy    | 123 Fake St   | Inner-city | S354     | 31/12/2014 | Socks    | $3.50  | 5
Mary       | Byrne     | Kildamanfadar | Rural      | R367     | 9/9/2014   | Helmet   | $30.50 | 1
Anne       | Dunne     | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Pants    | $13.75 | 2
Anne       | Dunne     | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Hat      | $11.00 | 2
Jim        | Feltz     | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Hat      | $28.75 | 2
Jim        | Feltz     | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Boots    | $75.95 | 1
8
Summary of First Normal Form (1NF)
A database is in the first normal form when
  Attributes store only atomic values
  Repeating groups of columns (e.g. Product_1, Product_2, Product_3) are removed
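The move to 1NF can be sketched in a few lines: each repeating Product/Cost/Units group becomes its own row, and the full name is split into atomic parts. This is an illustrative sketch only. The function name `to_first_normal_form`, the dictionary layout, and the reduced set of columns are assumptions, and splitting a name on the first space is a simplification that will not suit every real name.

```python
def to_first_normal_form(wide_rows, max_products=3):
    """Turn rows with Product_1..Product_n column groups into one row per product."""
    narrow = []
    for row in wide_rows:
        # Split the composite name into atomic first and last names.
        first, last = row["Full_Name"].split(" ", 1)
        for i in range(1, max_products + 1):
            product = row.get(f"Product_{i}")
            if not product:
                continue  # this order has no i-th product
            narrow.append({
                "First_Name": first,
                "Last_Name": last,
                "Order_ID": row["Order_ID"],
                "Product": product,
                "Cost": row[f"Cost_P{i}"],
                "Units": row[f"Units_P{i}"],
            })
    return narrow


# Two orders from the example table, with only some columns kept for brevity.
wide = [
    {"Full_Name": "John Murphy", "Order_ID": "S345",
     "Product_1": "Football", "Cost_P1": 20.00, "Units_P1": 2,
     "Product_2": "Gloves", "Cost_P2": 53.50, "Units_P2": 1,
     "Product_3": "Whistle", "Cost_P3": 5.00, "Units_P3": 1},
    {"Full_Name": "Mary Byrne", "Order_ID": "R367",
     "Product_1": "Helmet", "Cost_P1": 30.50, "Units_P1": 1},
]

rows = to_first_normal_form(wide)
for r in rows:
    print(r["Order_ID"], r["First_Name"], r["Last_Name"], r["Product"], r["Units"])
```

The three-product order becomes three rows and the one-product order becomes one, so an order with a fourth product would no longer need a new column.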
9
Moving to Second Normal Form

First_Name | Last_Name | Address       | Zone       | Order_ID | Date       | Product  | Cost   | Units
John       | Murphy    | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Football | $20.00 | 2
John       | Murphy    | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Gloves   | $53.50 | 1
John       | Murphy    | 123 Fake St   | Inner-city | S345     | 31/12/2014 | Whistle  | $5.00  | 1
John       | Murphy    | 123 Fake St   | Inner-city | S354     | 31/12/2014 | Socks    | $3.50  | 5
Mary       | Byrne     | Kildamanfadar | Rural      | R367     | 9/9/2014   | Helmet   | $30.50 | 1
Anne       | Dunne     | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Pants    | $13.75 | 2
Anne       | Dunne     | 123 Fake St   | Inner-city | N654     | 10/6/2014  | Hat      | $11.00 | 2
Jim        | Feltz     | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Hat      | $28.75 | 2
Jim        | Feltz     | 20c Fake St   | Inner-city | D896     | 13/06/2014 | Boots    | $75.95 | 1
10
Second Normal Form

Cust_ID | Order_ID | Date       | Product  | Cost   | Units
1       | S345     | 31/12/2014 | Football | $20.00 | 2
1       | S345     | 31/12/2014 | Gloves   | $53.50 | 1
1       | S345     | 31/12/2014 | Whistle  | $5.00  | 1
1       | S354     | 31/12/2014 | Socks    | $3.50  | 5
2       | R367     | 9/9/2014   | Helmet   | $30.50 | 1
3       | N654     | 10/6/2014  | Pants    | $13.75 | 2
3       | N654     | 10/6/2014  | Hat      | $11.00 | 2
4       | D896     | 13/06/2014 | Hat      | $28.75 | 2
4       | D896     | 13/06/2014 | Boots    | $75.95 | 1

Cust_ID | First_Name | Last_Name | Address       | Zone
1       | John       | Murphy    | 123 Fake St   | Inner-city
2       | Mary       | Byrne     | Kildamanfadar | Rural
3       | Anne       | Dunne     | 123 Fake St   | Inner-city
4       | Jim        | Feltz     | 20c Fake St   | Inner-city
11
Second Normal Form (Continued)

Cust_ID | Order_ID | Date       | Product | Units
1       | S345     | 31/12/2014 | 1       | 2
1       | S345     | 31/12/2014 | 2       | 1
1       | S345     | 31/12/2014 | 3       | 1
1       | S354     | 31/12/2014 | 4       | 5
2       | R367     | 9/9/2014   | 5       | 1
3       | N654     | 10/6/2014  | 6       | 2
3       | N654     | 10/6/2014  | 7       | 2
4       | D896     | 13/06/2014 | 7       | 2
4       | D896     | 13/06/2014 | 8       | 1

Cust_ID | First_Name | Last_Name | Address       | Zone
1       | John       | Murphy    | 123 Fake St   | Inner-city
2       | Mary       | Byrne     | Kildamanfadar | Rural
3       | Anne       | Dunne     | 123 Fake St   | Inner-city
4       | Jim        | Feltz     | 20c Fake St   | Inner-city

Product_ID | Product_1 | Cost_P1
1          | Football  | $20.00
2          | Gloves    | $53.50
3          | Whistle   | $5.00
4          | Socks     | $3.50
5          | Helmet    | $30.50
6          | Pants     | $13.75
7          | Hat       | $11.00
8          | Boots     | $75.95
12
Second Normal Form (Continued)

Cust_ID | Order_ID | Product | Units
1       | S345     | 1       | 2
1       | S345     | 2       | 1
1       | S345     | 3       | 1
1       | S354     | 4       | 5
2       | R367     | 5       | 1
3       | N654     | 6       | 2
3       | N654     | 7       | 2
4       | D896     | 7       | 2
4       | D896     | 8       | 1

Cust_ID | First_Name | Last_Name | Address       | Zone
1       | John       | Murphy    | 123 Fake St   | Inner-city
2       | Mary       | Byrne     | Kildamanfadar | Rural
3       | Anne       | Dunne     | 123 Fake St   | Inner-city
4       | Jim        | Feltz     | 20c Fake St   | Inner-city

Product_ID | Product_1 | Cost_P1
1          | Football  | $20.00
2          | Gloves    | $53.50
3          | Whistle   | $5.00
4          | Socks     | $3.50
5          | Helmet    | $30.50
6          | Pants     | $13.75
7          | Hat       | $11.00
8          | Boots     | $75.95

Order_ID | Date
S345     | 31/12/2014
S354     | 31/12/2014
R367     | 09/09/2014
N654     | 10/6/2014
D896     | 13/06/2014
13
Second Normal Form (Continued)

Order_ID | Product | Units
S345     | 1       | 2
S345     | 2       | 1
S345     | 3       | 1
S354     | 4       | 5
R367     | 5       | 1
N654     | 6       | 2
N654     | 7       | 2
D896     | 7       | 2
D896     | 8       | 1

Cust_ID | First_Name | Last_Name | Address       | Zone
1       | John       | Murphy    | 123 Fake St   | Inner-city
2       | Mary       | Byrne     | Kildamanfadar | Rural
3       | Anne       | Dunne     | 123 Fake St   | Inner-city
4       | Jim        | Feltz     | 20c Fake St   | Inner-city

Product_ID | Product_1 | Cost_P1
1          | Football  | $20.00
2          | Gloves    | $53.50
3          | Whistle   | $5.00
4          | Socks     | $3.50
5          | Helmet    | $30.50
6          | Pants     | $13.75
7          | Hat       | $11.00
8          | Boots     | $75.95

Order_ID | Cust_ID
S345     | 1
R367     | 2
N654     | 3
D896     | 4

Order_ID | Date
S345     | 31/12/2014
S354     | 31/12/2014
R367     | 09/09/2014
N654     | 10/6/2014
D896     | 13/06/2014
14
Summary of Second Normal Form (2NF)
A database is in the second normal form when
  It satisfies the criteria for the first normal form
  Each non-key attribute is dependent on the whole of every candidate key (i.e. subsets of data that repeat across multiple rows are moved to their own tables)
  Put differently, we have no partial dependencies on part of a concatenated key
  Takes advantage of reflexivity and augmentation
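One way to make the "no partial dependencies" test concrete is to take every proper subset of a concatenated key and check whether it already determines some non-key attribute. The sketch below does this with a small attribute-closure helper. The names `closure` and `partial_dependencies`, the key (Order_ID, Product_ID), and the listed dependencies are all assumptions made for illustration, modelled on the order-line table in this lecture.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, non_key, fds):
    """List (key_subset, attribute) pairs that violate 2NF."""
    found = []
    key = sorted(key)
    # Only proper, non-empty subsets of the key can cause a partial dependency.
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            determined = closure(subset, fds)
            for attr in sorted(non_key):
                if attr in determined:
                    found.append((subset, attr))
    return found


# Assumed design: one table keyed on (Order_ID, Product_ID).
key = {"Order_ID", "Product_ID"}
non_key = {"Date", "Cust_ID", "Cost", "Units"}
fds = [
    ({"Order_ID"}, {"Date", "Cust_ID"}),
    ({"Product_ID"}, {"Cost"}),
    ({"Order_ID", "Product_ID"}, {"Units"}),
]

for subset, attr in partial_dependencies(key, non_key, fds):
    print(subset, "->", attr)
```

Under these assumptions, Date and Cust_ID depend on Order_ID alone and Cost depends on Product_ID alone, so they belong in separate order and product tables. Units needs the whole key and stays in the order-line table.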
15
Moving to Third Normal Form

Order_ID | Product | Units
S345     | 1       | 2
S345     | 2       | 1
S345     | 3       | 1
S354     | 4       | 5
R367     | 5       | 1
N654     | 6       | 2
N654     | 7       | 2
D896     | 7       | 2
D896     | 8       | 1

Cust_ID | First_Name | Last_Name | Address       | Zone
1       | John       | Murphy    | 123 Fake St   | Inner-city
2       | Mary       | Byrne     | Kildamanfadar | Rural
3       | Anne       | Dunne     | 123 Fake St   | Inner-city
4       | Jim        | Feltz     | 20c Fake St   | Inner-city

Product_ID | Product_1 | Cost_P1
1          | Football  | $20.00
2          | Gloves    | $53.50
3          | Whistle   | $5.00
4          | Socks     | $3.50
5          | Helmet    | $30.50
6          | Pants     | $13.75
7          | Hat       | $11.00
8          | Boots     | $75.95

Order_ID | Cust_ID
S345     | 1
R367     | 2
N654     | 3
D896     | 4

Order_ID | Date
S345     | 31/12/2014
S354     | 31/12/2014
R367     | 09/09/2014
N654     | 10/6/2014
D896     | 13/06/2014
16
Moving to Third Normal Form

Order_ID | Product | Units
S345     | 1       | 2
S345     | 2       | 1
S345     | 3       | 1
S354     | 4       | 5
R367     | 5       | 1
N654     | 6       | 2
N654     | 7       | 2
D896     | 7       | 2
D896     | 8       | 1

Cust_ID | First_Name | Last_Name | Address
1       | John       | Murphy    | 123 Fake St
2       | Mary       | Byrne     | Kildamanfadar
3       | Anne       | Dunne     | 123 Fake St
4       | Jim        | Feltz     | 20c Fake St

Product_ID | Product_1 | Cost_P1
1          | Football  | $20.00
2          | Gloves    | $53.50
3          | Whistle   | $5.00
4          | Socks     | $3.50
5          | Helmet    | $30.50
6          | Pants     | $13.75
7          | Hat       | $11.00
8          | Boots     | $75.95

Order_ID | Cust_ID
S345     | 1
R367     | 2
N654     | 3
D896     | 4

Address       | Zone
123 Fake St   | Inner-city
20c Fake St   | Inner-city
Kildamanfadar | Rural

Order_ID | Date
S345     | 31/12/2014
S354     | 31/12/2014
R367     | 09/09/2014
N654     | 10/6/2014
D896     | 13/06/2014
17
Summary of Third Normal Form (3NF)
A database is in the third normal form when
  It satisfies the criteria for the second normal form
  Each non-key attribute that depends on anything other than the entire primary key is moved to its own table (this avoids the insertion anomalies those dependencies cause)
  Put differently, we have no transitive dependencies via non-key attributes
  Takes advantage of transitivity
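The Address → Zone split from the example can be sketched with Python's built-in sqlite3 module. This is one possible rendering under stated assumptions, not the lecture's own schema: the table names `address_zone` and `customer`, the lower-case column names, and the column types are all invented for illustration. The final join shows that the customer rows as they stood before the split can be rebuilt, so the decomposition loses no information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE address_zone (
    address TEXT PRIMARY KEY,
    zone    TEXT NOT NULL
);
CREATE TABLE customer (
    cust_id    INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    address    TEXT NOT NULL REFERENCES address_zone(address)
);
""")

# Zone is stored once per address instead of once per customer.
conn.executemany("INSERT INTO address_zone VALUES (?, ?)", [
    ("123 Fake St", "Inner-city"),
    ("20c Fake St", "Inner-city"),
    ("Kildamanfadar", "Rural"),
])
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", [
    (1, "John", "Murphy", "123 Fake St"),
    (2, "Mary", "Byrne", "Kildamanfadar"),
    (3, "Anne", "Dunne", "123 Fake St"),
    (4, "Jim", "Feltz", "20c Fake St"),
])

# Joining the two tables rebuilds the customer rows with their zones.
rebuilt = conn.execute("""
    SELECT c.cust_id, c.first_name, c.last_name, c.address, z.zone
    FROM customer AS c
    JOIN address_zone AS z ON z.address = c.address
    ORDER BY c.cust_id
""").fetchall()

for row in rebuilt:
    print(row)
```

With this layout, changing the zone for 123 Fake St is a single-row update in `address_zone` that applies to both customers living there.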
18
Readings
Some more descriptions of normal forms
  http://databases.about.com/od/specificproducts/a/normalization.htm
  http://phlonx.com/resources/nf3/
  http://www.bkent.net/Doc/simple5.htm