Normalization Also called “loss-less decomposition”

Presentation transcript:

Normalization
Also called “loss-less decomposition”.
Process of optimizing table structures to eliminate redundancy and avoid anomalies and problems with extensibility.
Supports the golden rule: each fact should be stored in the database only once.
Does not provide the solution to all design problems but provides a solid foundation.

Normal Forms
  1st Normal Form
  2nd Normal Form
  3rd Normal Form
  BCNF
  4th Normal Form
  5th Normal Form
  Domain-Key Normal Form

1st Normal Form
First Normal Form is violated if:
  The relation has no identifiable primary key.
  Any attempt has been made to store a multi-valued fact in a tuple.

1st NF - Example
Evaluate the design solutions on the next four slides for:
  Query-ability
  Join-ability
  Constrain-ability
  Extensibility (of Language Domain)
  Extensibility (of Schema)

1NF Example – Schema 1 (correct)

Employees Table
EMPID  LNAME     FNAME   SEX  DEPT  PHONE     SALARY
23     Jones     Mark    M    ITR   555-1087  45000
25     Smith     Sara    F    FINC  555-2222  55000
26     Billings  David   M    ACTG  555-4356  42000
31     Dance     Ivanna  F    ACTG  444-4887  60000
32     Jones     Mary    F    ITR   555-8745  70000
35     Barker    Bob     M    ACTG  555-6565  44000
36     Woods     Robin   M    ITR   555-9812  90000
37     Jones     Mary    F    FINC  555-1234  56000

Programs Table
EMPID  LANGUAGE
23     COBOL
23     JAVA
23     SQL
31     SQL
32     JAVA
32     SQL
32     VB
32     COBOL
36     VB
36     SQL
36     JAVA
37     COBOL
37     SQL

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
JAVA   JAVA
SQL    Structured Query Language
VB     Visual Basic
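
As a rough sketch of how Schema 1 fares on the evaluation criteria, the three tables can be loaded into an in-memory SQLite database from Python. The SEX, PHONE, and SALARY columns are left out for brevity, and the language PERL is a hypothetical value used only to show a rejected insert; neither choice comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Languages (NAME TEXT PRIMARY KEY, FULLNAME TEXT);
CREATE TABLE Employees (EMPID INTEGER PRIMARY KEY, LNAME TEXT, FNAME TEXT, DEPT TEXT);
CREATE TABLE Programs (
    EMPID    INTEGER REFERENCES Employees(EMPID),
    LANGUAGE TEXT    REFERENCES Languages(NAME),
    PRIMARY KEY (EMPID, LANGUAGE)   -- one row per "employee knows language" fact
);
""")
conn.executemany("INSERT INTO Languages VALUES (?, ?)", [
    ("COBOL", "COmmon Business Oriented Language"), ("JAVA", "JAVA"),
    ("SQL", "Structured Query Language"), ("VB", "Visual Basic")])
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)", [
    (23, "Jones", "Mark", "ITR"), (25, "Smith", "Sara", "FINC"),
    (26, "Billings", "David", "ACTG"), (31, "Dance", "Ivanna", "ACTG"),
    (32, "Jones", "Mary", "ITR"), (35, "Barker", "Bob", "ACTG"),
    (36, "Woods", "Robin", "ITR"), (37, "Jones", "Mary", "FINC")])
conn.executemany("INSERT INTO Programs VALUES (?, ?)", [
    (23, "COBOL"), (23, "JAVA"), (23, "SQL"), (31, "SQL"),
    (32, "JAVA"), (32, "SQL"), (32, "VB"), (32, "COBOL"),
    (36, "VB"), (36, "SQL"), (36, "JAVA"), (37, "COBOL"), (37, "SQL")])

# Query-ability / join-ability: one join answers "who knows SQL?"
sql_speakers = [row[0] for row in conn.execute(
    "SELECT e.EMPID FROM Employees e JOIN Programs p ON p.EMPID = e.EMPID "
    "WHERE p.LANGUAGE = 'SQL' ORDER BY e.EMPID")]

# Constrain-ability: a language missing from Languages is rejected.
try:
    conn.execute("INSERT INTO Programs VALUES (25, 'PERL')")  # PERL is hypothetical
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(sql_speakers, rejected)
```

Under this layout, extending the language domain is an ordinary insert into Languages rather than a change to any table definition.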

1NF Example – Schema 2 (incorrect)

Employees Table
EMPID  LNAME     FNAME   SEX  DEPT  PHONE     SALARY  LANGUAGES
23     Jones     Mark    M    ITR   555-1087  45000   COBOL, JAVA, SQL
25     Smith     Sara    F    FINC  555-2222  55000
26     Billings  David   M    ACTG  555-4356  42000
31     Dance     Ivanna  F    ACTG  444-4887  60000   SQL
32     Jones     Mary    F    ITR   555-8745  70000   JAVA, SQL, VB, COBOL
35     Barker    Bob     M    ACTG  555-6565  44000
36     Woods     Robin   M    ITR   555-9812  90000   VB, SQL, JAVA
37     Jones     Mary    F    FINC  555-1234  56000   COBOL, SQL

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
JAVA   JAVA
SQL    Structured Query Language
VB     Visual Basic

1NF Example – Schema 3 (incorrect)

Employees Table
EMPID  LNAME     FNAME   SEX  DEPT  PHONE     SALARY  LANG1  LANG2  LANG3  LANG4
23     Jones     Mark    M    ITR   555-1087  45000   COBOL  JAVA   SQL
25     Smith     Sara    F    FINC  555-2222  55000
26     Billings  David   M    ACTG  555-4356  42000
31     Dance     Ivanna  F    ACTG  444-4887  60000   SQL
32     Jones     Mary    F    ITR   555-8745  70000   JAVA   SQL    VB     COBOL
35     Barker    Bob     M    ACTG  555-6565  44000
36     Woods     Robin   M    ITR   555-9812  90000   VB     SQL    JAVA
37     Jones     Mary    F    FINC  555-1234  56000   COBOL  SQL

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
JAVA   JAVA
SQL    Structured Query Language
VB     Visual Basic

1NF Example – Schema 4 (incorrect)

Employees Table
EMPID  LNAME     FNAME   SEX  DEPT  PHONE     SALARY  COBOL  JAVA  SQL  VB
23     Jones     Mark    M    ITR   555-1087  45000   T      T     T    F
25     Smith     Sara    F    FINC  555-2222  55000   F      F     F    F
26     Billings  David   M    ACTG  555-4356  42000   F      F     F    F
31     Dance     Ivanna  F    ACTG  444-4887  60000   F      F     T    F
32     Jones     Mary    F    ITR   555-8745  70000   T      T     T    T
35     Barker    Bob     M    ACTG  555-6565  44000   F      F     F    F
36     Woods     Robin   M    ITR   555-9812  90000   F      T     T    T
37     Jones     Mary    F    FINC  555-1234  56000   T      F     T    F

Languages Table
NAME   FULLNAME
COBOL  COmmon Business Oriented Language
JAVA   JAVA
SQL    Structured Query Language
VB     Visual Basic

2nd Normal Form
Second Normal Form is violated if:
  First Normal Form is violated.
  There exists a non-key field (or fields) that is functionally dependent on a partial key.
  partial key → non-key

2NF Example – Raw Data

JE #   DATE         ACCOUNT                 DEBIT   CREDIT
JE #1  02-JAN-2003  100 Cash                20,000
                      310 Smith-Capital             20,000
                    (owner investment)
JE #2  03-JAN-2003  100 Cash                30,000
                      220 Notes Payable             30,000
                    (borrowed money)
JE #3  03-JAN-2003  120 Supplies             5,000
                      100 Cash                       1,000
                      220 Notes Payable              4,000
                    (purchased supplies)

2NF Example – Violation

Transactions Table
JENO  LINENO  DATE         DESCRIPTION         ACCTNO  ACCTNAME       AMOUNT
1     1       02-JAN-2003  Owner investment    100     Cash           20,000
1     2       02-JAN-2003  Owner investment    310     Smith-Capital  (20,000)
2     1       03-JAN-2003  Borrowed money      100     Cash           30,000
2     2       03-JAN-2003  Borrowed money      220     Notes Payable  (30,000)
3     1       03-JAN-2003  Purchased Supplies  120     Supplies       5,000
3     2       03-JAN-2003  Purchased Supplies  100     Cash           (1,000)
3     3       03-JAN-2003  Purchased Supplies  220     Notes Payable  (4,000)

Is there a non-key field which is functionally dependent on a partial key?

2NF Example – Violation
FDs that indicate violation of 2NF in the Transactions Table (key: JENO + LINENO):
  JENO → DATE
  JENO → DESCRIPTION
DATE and DESCRIPTION depend on JENO alone, which is only part of the key, so they are repeated on every line of the same journal entry.
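
The question of whether a dependency holds can be tested against the sample rows. The following is a minimal sketch, assuming the rows are held as Python dictionaries and that the parenthesized amounts are credits stored as negative integers; the helper name `fd_holds` is made up for this illustration.

```python
COLUMNS = ("JENO", "LINENO", "DATE", "DESCRIPTION", "ACCTNO", "ACCTNAME", "AMOUNT")
DATA = [
    (1, 1, "02-JAN-2003", "Owner investment", 100, "Cash", 20000),
    (1, 2, "02-JAN-2003", "Owner investment", 310, "Smith-Capital", -20000),
    (2, 1, "03-JAN-2003", "Borrowed money", 100, "Cash", 30000),
    (2, 2, "03-JAN-2003", "Borrowed money", 220, "Notes Payable", -30000),
    (3, 1, "03-JAN-2003", "Purchased Supplies", 120, "Supplies", 5000),
    (3, 2, "03-JAN-2003", "Purchased Supplies", 100, "Cash", -1000),
    (3, 3, "03-JAN-2003", "Purchased Supplies", 220, "Notes Payable", -4000),
]
rows = [dict(zip(COLUMNS, r)) for r in DATA]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        # setdefault stores the first rhs seen for this lhs and returns it
        if seen.setdefault(left, right) != right:
            return False
    return True


# The whole key determines a line's amount ...
whole_key = fd_holds(rows, ["JENO", "LINENO"], ["AMOUNT"])
# ... but DATE and DESCRIPTION are already fixed by JENO alone (the 2NF violation).
partial_key = fd_holds(rows, ["JENO"], ["DATE", "DESCRIPTION"])
# JENO alone does not fix AMOUNT, so AMOUNT needs the full key.
amount_by_jeno = fd_holds(rows, ["JENO"], ["AMOUNT"])

print(whole_key, partial_key, amount_by_jeno)
```

A check like this can only refute a dependency. A dependency that happens to hold in seven sample rows is a real rule only if the meaning of the data says so.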

2NF Example – Corrected

Journal_Entry Table
JENO  DATE         DESCRIPTION
1     02-JAN-2003  Owner investment
2     03-JAN-2003  Borrowed money
3     03-JAN-2003  Purchased Supplies

Transactions Table
JENO  LINENO  ACCTNO  ACCTNAME       AMOUNT
1     1       100     Cash           20,000
1     2       310     Smith-Capital  (20,000)
2     1       100     Cash           30,000
2     2       220     Notes Payable  (30,000)
3     1       120     Supplies       5,000
3     2       100     Cash           (1,000)
3     3       220     Notes Payable  (4,000)
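
To see why this split counts as a loss-less decomposition, the two corrected tables can be derived by projection and then joined back on JENO. This is only a sketch over the sample rows, with credits again assumed to be negative integers.

```python
# Columns: JENO, LINENO, DATE, DESCRIPTION, ACCTNO, ACCTNAME, AMOUNT
original = [
    (1, 1, "02-JAN-2003", "Owner investment", 100, "Cash", 20000),
    (1, 2, "02-JAN-2003", "Owner investment", 310, "Smith-Capital", -20000),
    (2, 1, "03-JAN-2003", "Borrowed money", 100, "Cash", 30000),
    (2, 2, "03-JAN-2003", "Borrowed money", 220, "Notes Payable", -30000),
    (3, 1, "03-JAN-2003", "Purchased Supplies", 120, "Supplies", 5000),
    (3, 2, "03-JAN-2003", "Purchased Supplies", 100, "Cash", -1000),
    (3, 3, "03-JAN-2003", "Purchased Supplies", 220, "Notes Payable", -4000),
]

# Projection: keep some columns and drop duplicate rows (hence the sets).
journal_entry = sorted({(j, d, desc) for (j, _, d, desc, _, _, _) in original})
transactions = sorted({(j, l, a, n, amt) for (j, l, _, _, a, n, amt) in original})

# Natural join on JENO: look up each line's date and description.
header = {j: (d, desc) for (j, d, desc) in journal_entry}
rejoined = sorted((j, l, *header[j], a, n, amt) for (j, l, a, n, amt) in transactions)

print(len(journal_entry), len(transactions), rejoined == sorted(original))
```

Each date and description is now stored once per journal entry, and the join still reproduces every original row.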

3rd Normal Form
Third Normal Form is violated if:
  Second Normal Form is violated.
  There exists a non-key field (or fields) that is functionally dependent on another non-key field (or fields).
  non-key → non-key
Note: A candidate key is not a non-key field.

3NF Example – Violation

Journal_Entry Table
JENO  DATE         DESCRIPTION
1     02-JAN-2003  Owner investment
2     03-JAN-2003  Borrowed money
3     03-JAN-2003  Purchased Supplies

Transactions Table
JENO  LINENO  ACCTNO  ACCTNAME       AMOUNT
1     1       100     Cash           20,000
1     2       310     Smith-Capital  (20,000)
2     1       100     Cash           30,000
2     2       220     Notes Payable  (30,000)
3     1       120     Supplies       5,000
3     2       100     Cash           (1,000)
3     3       220     Notes Payable  (4,000)

Are there any non-key fields which functionally determine another non-key field?
Are there any redundant facts?

3NF Example – Violation
FD that indicates violation of 3NF in the Transactions Table:
  ACCTNO → ACCTNAME
Anomalies if not corrected:
  update (if the name of account 100 changes, it must be changed in multiple places, risking inconsistency)
  deletion (can't delete JE #3 and its transactions without losing information about account 120)
  insertion (can't set up a new account, Jones-Capital, for a new partner unless we first have a transaction involving that account)

3NF Example – Corrected

Journal_Entry Table
JENO  DATE         DESCRIPTION
1     02-JAN-2003  Owner investment
2     03-JAN-2003  Borrowed money
3     03-JAN-2003  Purchased Supplies

Accounts Table
ACCTNO  ACCTNAME
100     Cash
120     Supplies
220     Notes Payable
310     Smith-Capital

Transactions Table
JENO  LINENO  ACCTNO  AMOUNT
1     1       100     20,000
1     2       310     (20,000)
2     1       100     30,000
2     2       220     (30,000)
3     1       120     5,000
3     2       100     (1,000)
3     3       220     (4,000)
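
The effect of the corrected design on the update, deletion, and insertion anomalies can be tried out in an in-memory SQLite database. This sketch omits the Journal_Entry table for brevity; the account number 320 and the new name "Cash on Hand" are hypothetical values chosen for the illustration, and credits are stored as negative integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Accounts (ACCTNO INTEGER PRIMARY KEY, ACCTNAME TEXT NOT NULL);
CREATE TABLE Transactions (
    JENO   INTEGER,
    LINENO INTEGER,
    ACCTNO INTEGER REFERENCES Accounts(ACCTNO),
    AMOUNT INTEGER,
    PRIMARY KEY (JENO, LINENO)
);
""")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [
    (100, "Cash"), (120, "Supplies"), (220, "Notes Payable"), (310, "Smith-Capital")])
conn.executemany("INSERT INTO Transactions VALUES (?, ?, ?, ?)", [
    (1, 1, 100, 20000), (1, 2, 310, -20000),
    (2, 1, 100, 30000), (2, 2, 220, -30000),
    (3, 1, 120, 5000), (3, 2, 100, -1000), (3, 3, 220, -4000)])

# Update: the account name lives in exactly one row.
renamed = conn.execute(
    "UPDATE Accounts SET ACCTNAME = 'Cash on Hand' WHERE ACCTNO = 100").rowcount

# Deletion: removing JE #3's lines no longer erases account 120.
conn.execute("DELETE FROM Transactions WHERE JENO = 3")
supplies = conn.execute(
    "SELECT ACCTNAME FROM Accounts WHERE ACCTNO = 120").fetchone()

# Insertion: a new account can exist before any transaction uses it.
conn.execute("INSERT INTO Accounts VALUES (320, 'Jones-Capital')")  # 320 is hypothetical
account_count = conn.execute("SELECT COUNT(*) FROM Accounts").fetchone()[0]

print(renamed, supplies, account_count)
```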

3NF Example – Corrected
Final Dependencies
  Journal_Entry Table: JENO → DATE, DESCRIPTION
  Accounts Table: ACCTNO → ACCTNAME
  Transactions Table: JENO, LINENO → ACCTNO, AMOUNT
All non-key fields are functionally dependent on the primary key and only the primary key.

Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form is violated if:
  Third Normal Form is violated.
  There exists a partial key that is functionally dependent on a non-key field (or fields).
  non-key → partial-key

BCNF Example
Semantics:
  A student can have more than one major.
  A student has a different advisor for each major.
  Each advisor advises for only one major.

BCNF Example – Violation

Student_Majors Table
SID  MAJOR             ADVISOR
1    PHYSICS           EINSTEIN
1    BIOLOGY           LIVINGSTON
2    PHYSICS           BOHR
2    COMPUTER SCIENCE  CODD
3    PHYSICS           EINSTEIN
4    BIOLOGY           LIVINGSTON
4    ACCOUNTING        PACIOLI
5    PHYSICS           EINSTEIN
6    PHYSICS           BOHR
6    BIOLOGY           DARWIN
7    COMPUTER SCIENCE  CODD
7    BIOLOGY           DARWIN

Does this relation violate third normal form?
Are there any redundant facts?

BCNF Example – Violation
FD that violates BCNF in the Student_Majors Table (key: SID + MAJOR):
  ADVISOR → MAJOR
It is important that you convince yourself that MAJOR does not functionally determine ADVISOR: PHYSICS, for example, appears with both EINSTEIN and BOHR.

BCNF Example – Corrected

Advisors Table
ADVISOR     MAJOR
BOHR        PHYSICS
CODD        COMPUTER SCIENCE
DARWIN      BIOLOGY
EINSTEIN    PHYSICS
LIVINGSTON  BIOLOGY
PACIOLI     ACCOUNTING

Student_Advisors Table
SID  ADVISOR
1    EINSTEIN
1    LIVINGSTON
2    BOHR
2    CODD
3    EINSTEIN
4    LIVINGSTON
4    PACIOLI
5    EINSTEIN
6    BOHR
6    DARWIN
7    CODD
7    DARWIN

Note that if the key of the original Student_Majors schema had, counter-intuitively, been defined as SID & ADVISOR, this would have been a 2NF violation.
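
The BCNF example can be checked the same way as the earlier ones. The sketch below tests the two candidate dependencies on the sample rows, then derives the corrected tables by projection and joins them back. The helper `determines` is a made-up name, and it compares counts of distinct values, which is one of several equivalent ways to test a dependency.

```python
from operator import itemgetter

# Columns: SID, MAJOR, ADVISOR
student_majors = [
    (1, "PHYSICS", "EINSTEIN"), (1, "BIOLOGY", "LIVINGSTON"),
    (2, "PHYSICS", "BOHR"), (2, "COMPUTER SCIENCE", "CODD"),
    (3, "PHYSICS", "EINSTEIN"), (4, "BIOLOGY", "LIVINGSTON"),
    (4, "ACCOUNTING", "PACIOLI"), (5, "PHYSICS", "EINSTEIN"),
    (6, "PHYSICS", "BOHR"), (6, "BIOLOGY", "DARWIN"),
    (7, "COMPUTER SCIENCE", "CODD"), (7, "BIOLOGY", "DARWIN"),
]
MAJOR, ADVISOR = itemgetter(1), itemgetter(2)


def determines(rows, lhs, rhs):
    """lhs -> rhs holds when every lhs value is paired with exactly one rhs value."""
    return len({lhs(r) for r in rows}) == len({(lhs(r), rhs(r)) for r in rows})


advisor_fixes_major = determines(student_majors, ADVISOR, MAJOR)
major_fixes_advisor = determines(student_majors, MAJOR, ADVISOR)

# Decompose by projection, then rejoin on ADVISOR.
advisors = sorted({(a, m) for (_, m, a) in student_majors})
student_advisors = sorted({(s, a) for (s, _, a) in student_majors})
major_of = dict(advisors)
rejoined = sorted((s, major_of[a], a) for (s, a) in student_advisors)

print(advisor_fixes_major, major_fixes_advisor, rejoined == sorted(student_majors))
```

Each advisor's major is stored once in Advisors, and the join recovers all twelve original rows.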

4th Normal Form
4th Normal Form is violated if:
  Boyce-Codd Normal Form is violated.
  There exists a partial key which has multiple independent multi-valued dependencies to other partial keys.
  partial-key1 →→ partial-key2
  partial-key1 →→ partial-key3

4NF Example – Violation

Instruments_Languages
Name  Instrument  Language
Fred  Piano       French
Fred  Flute       Italian
Fred  Flute       Spanish
Jane  Piano       French
Jane  Oboe        French
Sam   Piano       French
Sam   Oboe        Spanish
Sam   Flute       Spanish

4NF Example – Violation
Does the Instruments_Languages relation violate 1st, 2nd, 3rd, or BCNF?
Are there any redundant facts?

4NF Example – Correction

LanguagesSpoken
Name  Language
Fred  French
Fred  Italian
Fred  Spanish
Jane  French
Sam   French
Sam   Spanish

InstrumentsPlayed
Name  Instrument
Fred  Piano
Fred  Flute
Jane  Piano
Jane  Oboe
Sam   Piano
Sam   Oboe
Sam   Flute
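
The two corrected tables are projections of Instruments_Languages, which a short sketch can confirm. It also rejoins them on Name to show what treating the two facts as independent implies for this sample.

```python
# Columns: Name, Instrument, Language
instruments_languages = [
    ("Fred", "Piano", "French"), ("Fred", "Flute", "Italian"),
    ("Fred", "Flute", "Spanish"), ("Jane", "Piano", "French"),
    ("Jane", "Oboe", "French"), ("Sam", "Piano", "French"),
    ("Sam", "Oboe", "Spanish"), ("Sam", "Flute", "Spanish"),
]

# Projection onto (Name, Language) and (Name, Instrument), duplicates removed.
languages_spoken = sorted({(n, lang) for (n, _, lang) in instruments_languages})
instruments_played = sorted({(n, inst) for (n, inst, _) in instruments_languages})

# Join on Name: every instrument a person plays paired with every language they speak.
rejoined = {(n, inst, lang)
            for (n, inst) in instruments_played
            for (m, lang) in languages_spoken
            if n == m}

print(len(languages_spoken), len(instruments_played), len(rejoined))
print(set(instruments_languages) <= rejoined)
```

For this sample the join yields 14 rows, while the original table lists only 8 of those pairings. The decomposition is therefore appropriate only if, as the slides assume, which instruments a person plays has nothing to do with which languages they speak, so that the particular pairings in the original table carry no meaning of their own.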