IS6125 Database Analysis and Design Lecture 11: Normalization of Data Tables Rob Gleasure

1 IS6125 Database Analysis and Design
Lecture 11: Normalization of Data Tables
Rob Gleasure
R.Gleasure@ucc.ie
www.robgleasure.com

2 IS6125 Today’s session
- Normalisation
- Revision: subjects covered and the types of questions to expect
  - Essay style questions
  - Modelling questions

3 Normalisation
Not actually as terrifying as it sounds…
Just about making a database as efficient as possible by breaking big tables with redundant data into smaller tables with less redundant data.
We do this by taking advantage of functional dependencies.

4 Inferring Functional Dependencies (The Armstrong Axioms)
1. Reflexivity: If Y is a subset of X, then X → Y
2. Augmentation: If X → Y, then XZ → YZ
3. Transitivity: If X → Y and Y → Z, then X → Z
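
One practical use of the three axioms is computing the closure of a set of attributes, that is, everything those attributes determine. The sketch below is not from the lecture: the function name `closure` and the sample dependencies are assumptions, loosely modelled on the order example used in the following slides.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attributes)  # reflexivity: X determines every subset of X
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already hold all of lhs, augmentation and transitivity
            # let us add everything that lhs determines.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical dependencies (assumed for illustration only).
fds = [
    ({"Cust_ID"}, {"Address"}),
    ({"Address"}, {"Zone"}),
    ({"Order_ID"}, {"Cust_ID", "Date"}),
]

order_closure = closure({"Order_ID"}, fds)
address_closure = closure({"Address"}, fds)
```

Under these assumed dependencies, Order_ID reaches Zone only through Cust_ID and Address, which is the kind of chain that transitivity describes.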

5 Normalisation: Orders Table
Full_Name | Address | Zone | Order_ID | Date | Product_1 | Cost_P1 | Units_P1 | Product_2 | Cost_P2 | Units_P2 | Product_3 | Cost_P3 | Units_P3
John Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Football | $20.00 | 2 | Gloves | $53.50 | 1 | Whistle | $5.00 | 1
Mary Byrne | Kildamanfadar | Rural | R367 | 9/9/2014 | Helmet | $30.50 | 1
Anne Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Pants | $13.75 | 2 | Hat | $11.00 | 2
Jim Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Hat | $28.75 | 2 | Boots | $75.95 | 1
John Murphy | 123 Fake St | Inner-city | S354 | 1/01/2015 | Socks | $3.50 | 5

6 Normalisation: First Normal Form
Full_Name | Address | Zone | Order_ID | Date | Product | Cost | Units
John Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Football | $20.00 | 2
John Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Gloves | $53.50 | 1
John Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Whistle | $5.00 | 1
John Murphy | 123 Fake St | Inner-city | S354 | 31/12/2014 | Socks | $3.50 | 5
Mary Byrne | Kildamanfadar | Rural | R367 | 9/9/2014 | Helmet | $30.50 | 1
Anne Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Pants | $13.75 | 2
Anne Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Hat | $11.00 | 2
Jim Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Hat | $28.75 | 2
Jim Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Boots | $75.95 | 1

7 First Normal Form (continued)
First_Name | Last_Name | Address | Zone | Order_ID | Date | Product | Cost | Units
John | Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Football | $20.00 | 2
John | Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Gloves | $53.50 | 1
John | Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Whistle | $5.00 | 1
John | Murphy | 123 Fake St | Inner-city | S354 | 31/12/2014 | Socks | $3.50 | 5
Mary | Byrne | Kildamanfadar | Rural | R367 | 9/9/2014 | Helmet | $30.50 | 1
Anne | Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Pants | $13.75 | 2
Anne | Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Hat | $11.00 | 2
Jim | Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Hat | $28.75 | 2
Jim | Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Boots | $75.95 | 1

8 Summary of First Normal Form (1NF)
A database is in the first normal form when
- Attributes store only atomic values
- Duplicate columns are removed
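
As a rough illustration of both conditions, the sketch below turns one wide order row into one row per product and splits the full name into two atomic values. The function name `to_first_normal_form` and the dictionary layout are assumptions made for this example; the column names and the sample values follow Anne Dunne's order in the Orders table.

```python
def to_first_normal_form(wide_row, max_products=3):
    """Split one wide order row into one row per product purchased."""
    # Atomic values: split the full name into first and last name.
    first_name, last_name = wide_row["Full_Name"].split(" ", 1)
    shared = {
        "First_Name": first_name,
        "Last_Name": last_name,
        "Address": wide_row["Address"],
        "Zone": wide_row["Zone"],
        "Order_ID": wide_row["Order_ID"],
        "Date": wide_row["Date"],
    }
    rows = []
    # Duplicate columns: Product_1..3, Cost_P1..3 and Units_P1..3 become rows.
    for i in range(1, max_products + 1):
        product = wide_row.get(f"Product_{i}")
        if product is None:
            continue  # this order did not use product slot i
        rows.append({
            **shared,
            "Product": product,
            "Cost": wide_row[f"Cost_P{i}"],
            "Units": wide_row[f"Units_P{i}"],
        })
    return rows


anne = {
    "Full_Name": "Anne Dunne", "Address": "123 Fake St", "Zone": "Inner-city",
    "Order_ID": "N654", "Date": "10/6/2014",
    "Product_1": "Pants", "Cost_P1": 13.75, "Units_P1": 2,
    "Product_2": "Hat", "Cost_P2": 11.00, "Units_P2": 2,
}
anne_rows = to_first_normal_form(anne)
```

Splitting on the first space is a simplification that would not suit every name; it is only meant to mirror the First_Name and Last_Name columns in the slides.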

9 Moving to Second Normal Form
First_Name | Last_Name | Address | Zone | Order_ID | Date | Product | Cost | Units
John | Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Football | $20.00 | 2
John | Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Gloves | $53.50 | 1
John | Murphy | 123 Fake St | Inner-city | S345 | 31/12/2014 | Whistle | $5.00 | 1
John | Murphy | 123 Fake St | Inner-city | S354 | 31/12/2014 | Socks | $3.50 | 5
Mary | Byrne | Kildamanfadar | Rural | R367 | 9/9/2014 | Helmet | $30.50 | 1
Anne | Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Pants | $13.75 | 2
Anne | Dunne | 123 Fake St | Inner-city | N654 | 10/6/2014 | Hat | $11.00 | 2
Jim | Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Hat | $28.75 | 2
Jim | Feltz | 20c Fake St | Inner-city | D896 | 13/06/2014 | Boots | $75.95 | 1

10 Second Normal Form
Cust_ID | Order_ID | Date | Product | Cost | Units
1 | S345 | 31/12/2014 | Football | $20.00 | 2
1 | S345 | 31/12/2014 | Gloves | $53.50 | 1
1 | S345 | 31/12/2014 | Whistle | $5.00 | 1
1 | S354 | 31/12/2014 | Socks | $3.50 | 5
2 | R367 | 9/9/2014 | Helmet | $30.50 | 1
3 | N654 | 10/6/2014 | Pants | $13.75 | 2
3 | N654 | 10/6/2014 | Hat | $11.00 | 2
4 | D896 | 13/06/2014 | Hat | $28.75 | 2
4 | D896 | 13/06/2014 | Boots | $75.95 | 1

Cust_ID | First_Name | Last_Name | Address | Zone
1 | John | Murphy | 123 Fake St | Inner-city
2 | Mary | Byrne | Kildamanfadar | Rural
3 | Anne | Dunne | 123 Fake St | Inner-city
4 | Jim | Feltz | 20c Fake St | Inner-city

11 Second Normal Form (Continued)
Cust_ID | Order_ID | Date | Product | Units
1 | S345 | 31/12/2014 | 1 | 2
1 | S345 | 31/12/2014 | 2 | 1
1 | S345 | 31/12/2014 | 3 | 1
1 | S354 | 31/12/2014 | 4 | 5
2 | R367 | 9/9/2014 | 5 | 1
3 | N654 | 10/6/2014 | 6 | 2
3 | N654 | 10/6/2014 | 7 | 2
4 | D896 | 13/06/2014 | 7 | 2
4 | D896 | 13/06/2014 | 8 | 1

Cust_ID | First_Name | Last_Name | Address | Zone
1 | John | Murphy | 123 Fake St | Inner-city
2 | Mary | Byrne | Kildamanfadar | Rural
3 | Anne | Dunne | 123 Fake St | Inner-city
4 | Jim | Feltz | 20c Fake St | Inner-city

Product_ID | Product_1 | Cost_P1
1 | Football | $20.00
2 | Gloves | $53.50
3 | Whistle | $5.00
4 | Socks | $3.50
5 | Helmet | $30.50
6 | Pants | $13.75
7 | Hat | $11.00
8 | Boots | $75.95

12 Second Normal Form (Continued)
Cust_ID | Order_ID | Product | Units
1 | S345 | 1 | 2
1 | S345 | 2 | 1
1 | S345 | 3 | 1
1 | S354 | 4 | 5
2 | R367 | 5 | 1
3 | N654 | 6 | 2
3 | N654 | 7 | 2
4 | D896 | 7 | 2
4 | D896 | 8 | 1

Cust_ID | First_Name | Last_Name | Address | Zone
1 | John | Murphy | 123 Fake St | Inner-city
2 | Mary | Byrne | Kildamanfadar | Rural
3 | Anne | Dunne | 123 Fake St | Inner-city
4 | Jim | Feltz | 20c Fake St | Inner-city

Product_ID | Product_1 | Cost_P1
1 | Football | $20.00
2 | Gloves | $53.50
3 | Whistle | $5.00
4 | Socks | $3.50
5 | Helmet | $30.50
6 | Pants | $13.75
7 | Hat | $11.00
8 | Boots | $75.95

Order_ID | Date
S345 | 31/12/2014
S354 | 31/12/2014
R367 | 09/09/2014
N654 | 10/6/2014
D896 | 13/06/2014

13 Second Normal Form (Continued)
Order_ID | Product | Units
S345 | 1 | 2
S345 | 2 | 1
S345 | 3 | 1
S354 | 4 | 5
R367 | 5 | 1
N654 | 6 | 2
N654 | 7 | 2
D896 | 7 | 2
D896 | 8 | 1

Cust_ID | First_Name | Last_Name | Address | Zone
1 | John | Murphy | 123 Fake St | Inner-city
2 | Mary | Byrne | Kildamanfadar | Rural
3 | Anne | Dunne | 123 Fake St | Inner-city
4 | Jim | Feltz | 20c Fake St | Inner-city

Product_ID | Product_1 | Cost_P1
1 | Football | $20.00
2 | Gloves | $53.50
3 | Whistle | $5.00
4 | Socks | $3.50
5 | Helmet | $30.50
6 | Pants | $13.75
7 | Hat | $11.00
8 | Boots | $75.95

Order_ID | Cust_ID
S345 | 1
R367 | 2
N654 | 3
D896 | 4

Order_ID | Date
S345 | 31/12/2014
S354 | 31/12/2014
R367 | 09/09/2014
N654 | 10/6/2014
D896 | 13/06/2014

14 Summary of Second Normal Form (2NF)
A database is in the second normal form when
- It satisfies the criteria for the first normal form
- Each non-key attribute is dependent on the whole candidate key (i.e. subsets of data across multiple rows are removed)
- Put differently, we have no partial dependencies via a concatenated key
Takes advantage of reflexivity and augmentation
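
A partial dependency can be spotted by testing whether an attribute is already determined by only part of the concatenated key. The sketch below assumes the key of the order rows is (Order_ID, Product); the helper name `holds` and the trimmed-down rows are illustrative choices, with values taken from John Murphy's orders in the 1NF table. Note that sample rows can only refute a dependency, they cannot prove it holds in general.

```python
def holds(rows, lhs, rhs):
    """Check whether the dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it;
        # a different value for the same key breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"Order_ID": "S345", "Date": "31/12/2014", "Product": "Football", "Units": 2},
    {"Order_ID": "S345", "Date": "31/12/2014", "Product": "Gloves", "Units": 1},
    {"Order_ID": "S345", "Date": "31/12/2014", "Product": "Whistle", "Units": 1},
    {"Order_ID": "S354", "Date": "31/12/2014", "Product": "Socks", "Units": 5},
]

# Date depends on Order_ID alone: a partial dependency, so it moves to its own table.
date_depends_on_order_only = holds(rows, ["Order_ID"], ["Date"])
# Units needs the whole key (Order_ID, Product), so it stays with the order lines.
units_depend_on_order_only = holds(rows, ["Order_ID"], ["Units"])
units_depend_on_full_key = holds(rows, ["Order_ID", "Product"], ["Units"])
```

This matches the split in the slides, where Date ends up in a table keyed by Order_ID while Units remains alongside Order_ID and Product.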

15 Moving to Third Normal Form
Order_ID | Product | Units
S345 | 1 | 2
S345 | 2 | 1
S345 | 3 | 1
S354 | 4 | 5
R367 | 5 | 1
N654 | 6 | 2
N654 | 7 | 2
D896 | 7 | 2
D896 | 8 | 1

Cust_ID | First_Name | Last_Name | Address | Zone
1 | John | Murphy | 123 Fake St | Inner-city
2 | Mary | Byrne | Kildamanfadar | Rural
3 | Anne | Dunne | 123 Fake St | Inner-city
4 | Jim | Feltz | 20c Fake St | Inner-city

Product_ID | Product_1 | Cost_P1
1 | Football | $20.00
2 | Gloves | $53.50
3 | Whistle | $5.00
4 | Socks | $3.50
5 | Helmet | $30.50
6 | Pants | $13.75
7 | Hat | $11.00
8 | Boots | $75.95

Order_ID | Cust_ID
S345 | 1
R367 | 2
N654 | 3
D896 | 4

Order_ID | Date
S345 | 31/12/2014
S354 | 31/12/2014
R367 | 09/09/2014
N654 | 10/6/2014
D896 | 13/06/2014

16 Moving to Third Normal Form
Order_ID | Product | Units
S345 | 1 | 2
S345 | 2 | 1
S345 | 3 | 1
S354 | 4 | 5
R367 | 5 | 1
N654 | 6 | 2
N654 | 7 | 2
D896 | 7 | 2
D896 | 8 | 1

Cust_ID | First_Name | Last_Name | Address
1 | John | Murphy | 123 Fake St
2 | Mary | Byrne | Kildamanfadar
3 | Anne | Dunne | 123 Fake St
4 | Jim | Feltz | 20c Fake St

Product_ID | Product_1 | Cost_P1
1 | Football | $20.00
2 | Gloves | $53.50
3 | Whistle | $5.00
4 | Socks | $3.50
5 | Helmet | $30.50
6 | Pants | $13.75
7 | Hat | $11.00
8 | Boots | $75.95

Order_ID | Cust_ID
S345 | 1
R367 | 2
N654 | 3
D896 | 4

Address | Zone
123 Fake St | Inner-city
20c Fake St | Inner-city
Kildamanfadar | Rural

Order_ID | Date
S345 | 31/12/2014
S354 | 31/12/2014
R367 | 09/09/2014
N654 | 10/6/2014
D896 | 13/06/2014

17 Summary of Third Normal Form (3NF)
A database is in the third normal form when
- It satisfies the criteria for the second normal form
- Each non-key attribute that depends on anything other than the entire primary key is removed (insertion anomalies are impossible)
- Put differently, we have no transitive dependencies via non-key attributes
Takes advantage of transitivity
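
In the customer table, Zone depends on Address rather than directly on Cust_ID, so it can be moved into its own table. The sketch below shows that decomposition as two projections. The helper name `project` and the list-of-dictionaries layout are assumptions for illustration, and the dependency Address → Zone is taken from the lecture's example rather than being a general rule.

```python
def project(rows, columns):
    """Keep only `columns` and drop duplicate rows, preserving first-seen order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


customers = [
    {"Cust_ID": 1, "First_Name": "John", "Last_Name": "Murphy",
     "Address": "123 Fake St", "Zone": "Inner-city"},
    {"Cust_ID": 2, "First_Name": "Mary", "Last_Name": "Byrne",
     "Address": "Kildamanfadar", "Zone": "Rural"},
    {"Cust_ID": 3, "First_Name": "Anne", "Last_Name": "Dunne",
     "Address": "123 Fake St", "Zone": "Inner-city"},
    {"Cust_ID": 4, "First_Name": "Jim", "Last_Name": "Feltz",
     "Address": "20c Fake St", "Zone": "Inner-city"},
]

# Remove the transitive dependency Cust_ID -> Address -> Zone.
customer_table = project(customers, ["Cust_ID", "First_Name", "Last_Name", "Address"])
address_zone = project(customers, ["Address", "Zone"])
```

Because two customers share 123 Fake St, the address table ends up with three rows rather than four, so the zone for that address is stored once.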

18 Exam Revision
Image from http://www.studentmoneysaver.co.uk/article/6-revision-tips-which-actually-work/

19 Essay-Style Questions
Topics covered
- The cloud and datafication
  - What is data and how does something become ‘datafied’?
  - How and why did cloud technologies evolve?
  - What does it mean in terms of technological and business capabilities?
  - What is the Internet of Things?
  - What does the future hold?
  - Can you use contrasting examples of different businesses to discuss each of these headings?

20 Essay-Style Questions
Topics covered
- Big data
  - When is data ‘big data’?
  - How and why did we get from ‘small data’ to ‘big data’?
  - What does big data let businesses do that they couldn’t do previously?
  - What businesses are a good example of this?
  - What are the issues and challenges arising from big data?
  - Can you use contrasting examples of different businesses to discuss each of these headings?

21 Essay-Style Questions
Topics covered
- Business intelligence
  - What kind of intelligence is enabled as we increase our measurement and analysis capabilities?
  - How do we get from an individual case to a large-scale pattern, and back again?
  - What are the challenges of translating intelligence from an individual case to large-scale patterns, and back again?
  - What businesses exemplify the ability to generate intelligence from the increased capacity for data handling, and why?
  - Can you use contrasting examples of different businesses to discuss each of these headings?

22 Modelling Questions
Topics covered
- A modelling question will expect
  1. A model
  2. Constraints
  3. Assumptions
- You may also be asked to discuss issues, such as
  - Differences between stages of ER modelling
  - The reasons for a staggered approach to data modelling
  - Commonly encountered issues

23 Answering Questions
Exam technique
- Manage your time
- Plan your answers
- Sketch out your diagrams very quickly as roughwork if you’re not sure how to make them fit together
- Answer your best questions first
- Use examples
  - Have these lined up as part of your revision

24 Readings
Some more descriptions of normal forms
- http://databases.about.com/od/specificproducts/a/normalization.htm
- http://phlonx.com/resources/nf3/
- http://www.bkent.net/Doc/simple5.htm

