1 First Normal Form (1NF) Unnormalized table : Contains a repeating group –Eg: from multi-valued attributes –Eg: from many-many relationship Table in 1NF:

Slides:



Advertisements
Similar presentations
Chapter 5 Normalization of Database Tables
Advertisements

Chapter 5 Normalization of Database Tables
Logical DB Design: ER to Relational Entity sets to tables. Employees ssn name lot CREATE TABLE Employees (ssn CHAR (11), name CHAR (20), lot INTEGER, PRIMARY.
Functional Dependencies and Normalization for Relational Databases
Relational Database. Relational database: a set of relations Relation: made up of 2 parts: − Schema : specifies the name of relations, plus name and type.
PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
NORMALIZATION FIRST NORMAL FORM (1NF): A relation R is in 1NF if all attributes have atomic value = one value for an attribute = no repeating groups =
Normalisation The theory of Relational Database Design.
Relational Normalization Theory. Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide.
Murali Mani Normalization. Murali Mani What and Why Normalization? To remove potential redundancy in design Redundancy causes several anomalies: insert,
Design Guidelines Normalisation Table Design. Informal Design Guidelines Table Semantics A table should hold information about one and only one entity/concept.
Database Design Conceptual –identify important entities and relationships –determine attribute domains and candidate keys –draw the E-R diagram Logical.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
1 5 Concepts of Database Management, 4 th Edition, Pratt & Adamski Chapter 5 Database Design: Normalization.
1 Functional Dependency and Normalization Informal design guidelines for relation schemas. Functional dependencies. Normal forms. Normalization.
Databases 6: Normalization
Why Normalization? To Reduce Redundancy to 1.avoid modification, insertion, deletion anomolies 2.save space Goal: One Fact in One Place.
1 5 Concepts of Database Management, 4 th Edition, Pratt & Adamski Chapter 5 Database Design 1: Normalization.
Introduction to Schema Refinement
Ch 7: Normalization-Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa.
CS 405G: Introduction to Database Systems 16. Functional Dependency.
Week 6 Lecture Normalization
DBSQL 4-1 Copyright © Genetic Computer School 2009 Chapter 4 Database Design.
A Guide to SQL, Eighth Edition Chapter Two Database Design Fundamentals.
NormalizationNormalization Chapter 4. Purpose of Normalization Normalization  A technique for producing a set of relations with desirable properties,
Concepts of Database Management Sixth Edition Chapter 5 Database Design 1: Normalization.
Concepts of Database Management, Fifth Edition
1 A Guide to MySQL 2 Database Design Fundamentals.
FEN Quality checking table design: Design Guidelines Normalisation Table Design Is this OK?
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
Normal Forms through BCNF CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
Functional Dependencies and Normalization for Relational Databases.
Normalization Well structured relations and anomalies Normalization First normal form (1NF) Functional dependence Partial functional dependency Second.
1 A Guide to MySQL 2 Database Design Fundamentals.
How to build ER diagrams
1 5 Normalization. 2 5 Database Design Give some body of data to be represented in a database, how do we decide on a suitable logical structure for that.
By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies.
11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.
COMP1212 COMP1212 Anomalies and Dependencies Dr. Mabruk Ali.
Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems.
1 Functional Dependencies and Normalization Chapter 15.
IST 210 Normalization 2 Todd Bacastow IST 210. Normalization Methods Inspection Closure Functional dependencies are key.
Normalization. 2 u Main objective in developing a logical data model for relational database systems is to create an accurate representation of the data,
Normalization.
Concepts of Database Management Seventh Edition Chapter 5 Database Design 1: Normalization.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
CS 338Database Design and Normal Forms9-1 Database Design and Normal Forms Lecture Topics Measuring the quality of a schema Schema design with normalization.
Ch 7: Normalization-Part 1
Chapter 8: Object-Relational Modeling Object-Oriented Systems Analysis and Design Joey F. George, Dinesh Batra, Joseph S. Valacich, Jeffrey A. Hoffer.
1 CS 430 Database Theory Winter 2005 Lecture 7: Designing a Database Logical Level.
Al-Imam University Girls Education Center Collage of Computer Science 1 st Semester, 1432/1433H Chapter 10_part 1 Functional Dependencies and Normalization.
1 Dept. of CIS, Temple Univ. CIS661 – Principles of Data Management V. Megalooikonomou Database design and normalization (based on slides by C. Faloutsos.
Southern Methodist University CSE CSE 2337 Introduction to Data Management Chapter 5 Part II.
ITD1312 Database Principles Chapter 4C: Normalization.
Copyright © Curt Hill Schema Refinement II 2 nd NF to 3 rd NF to BCNF.
Constraints and Views Chap. 3-5 continued (7 th ed. 5-7)
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
1 Normalization David J. Stucki. Outline Informal Design Guidelines Normal Forms  1NF  2NF  3NF  BCNF  4NF 2.
A Guide to SQL, Eighth Edition
Functional Dependency and Normalization
Normalization Karolina muszyńska
Relational Database Design by Dr. S. Sridhar, Ph. D
Concepts of Database Management Eighth Edition
Module 5: Overview of Normalization
Normalization Murali Mani.
Normalization – Part II
Normalization.
NORMALIZATION FIRST NORMAL FORM (1NF):
Chapter 14 Normalization Pearson Education © 2009.
Chapter 7a: Overview of Database Design -- Normalization
Presentation transcript:

1 First Normal Form (1NF) Unnormalized table : Contains a repeating group –Eg: from multi-valued attributes –Eg: from many-many relationship Table in 1NF: Contains no repeating groups All tables are in 1NF –Then why have this as a NF ? Not true when Codd first defined it.

2 Pratt and Adamski Figure 5.5: 1NF Example Unnormalized Table (not in 1NF)

3 Pratt and Adamski Figure 5.6: 1NF Example (con’t.) Conversion to 1NF

4 Functional Dependencies Column B is functionally dependent on Column A (A  B) if A’s value determines a single value for B. –By looking at A, we will know what is the value of B –For a single value of A, can’t have multiple values of B –Eg: In Emp, ssn  address, ssn  salary and others Eg: In MyWorksOn (ssn, lname, P#, PName). f.d.s? ssn  lname, P#  Pname, Pname  P#

5 Functional Dependencies Can f.d.s be determined by looking at relational instance alone ? No ! Have to look at the semantics of the data By looking at relational instance, might be able to say f.d. is false –But can never say that a f.d. is true –might happen to be true for that instance, but not always true. –See examples on next slide

6 Pratt and Adamski Figures : Functional Dependence Example Rep Table Where LastName can determine record Rep Table Where LastName cannot determine record

7 Pratt and Adamski Figure 5.7: Are following f.d.s. consistent with above table ? –OrderNum  PartNum, PartNum  Description –Price  PartNum, NumOrdered  OrderNum

8 Keys in terms of f.d.s Column(s) C is key for table T if: –All columns in T are functionally dependent on C –As long as no duplicate rows CDCD x1x1 x1x1

9 Functional Dependencies A  B does not imply B  A Left part can be composite. – Eg: In WorksOn {ESSN, PROJ#}  HOURS A  A f.d is a type of integrity constraint Trivial dependencies: A  A, key  A Modification anomalies: caused by tables which have some undesirable type of f.d. Normalization: split tables to get rid of this f.d –Without information loss

10 Pratt and Adamski Figure 5.7:

11 Dependency diagram, Pratt and Adamski Eg Eg: Consider the table Orders (OrderNum, OrderDate, PartNum, Description, NumOrdered, QuotedPrice) Apart from trivial f.d.s, other f.d.s ? OrderNum  OrderDate, PartNum  Description Dependency diagram: pictorial way of representing f.d.s.

12 Problems with Orders table Redundancy ? OrderDate 10/20/2007 repeated for OrderNum Description Gas Range repeated for PartNum DR93 Inconsistency ? If we change Description Gas Range to Oven in one row for PartNum DR93, but leave as Gas Range in the other row.

13 Problems with Orders table Update Anomaly? If we want to change Description Gas Range to Oven for PartNum DR93, to maintain consistency, have to make this change in two rows. Deletion Anomaly? If we no longer have OrderNum 21610, lose info that PartNum DW11 is a washer. Insertion Anomaly? Can’t insert that PartNum AB47 is a range oven till you have an order including it.

14 How to fix Orders table ? What do we want ? –Algorithm to identify these problems –Way of safely breaking up table Need to be careful on how to break it up So don’t lose info Eg of lossy decomposition on Orders table ? Non-loss decomposition: a way of breaking table up without loosing information. –Eg on Orders table ? 2NF guarantees non-loss decomposition

15 Partial functional Dependencies Partial f.d: attribute depends only on a portion of the composite primary key (and not the whole primary key) i.e. we can take away part of the primary key, and still have an f.d. Eg: OrderNum  OrderDate is partial f.d. PartNum  Description is partial f.d. We don’t like partial f.d.s: Why ? Consider the Orders table again. What is the problem: redundancies, modification anomalies. –Caused by the partial f.d.s

16 Second Normal Form Defn Non-key attribute (column): not part of the primary key 2 nd Normal Form Table: In 1NF and no non- key column is partially dependent on primary key –i.e. no non-key column is dependent on only a portion of the primary key. –Want to ensure we don’t have following: (part of PK)  (non-key attribute) If a table does not have a partial f.d., is in 2NF

17 Second Normal Form Eg Is this true of Orders table? No ! What are the f.d.s which are the problem ? OrderNum  OrderDate is partial f.d. –OrderDate is non-key, so this is partial f.d. PartNum  Description is partial f.d. – Description is non-key

18 Second Normal Form Eg We can break the table up to get rid of these partial f.d.s by putting the columns for the partial f.d.s into separate tables. –Guaranteed that this will not cause loss of info. –How to break up Orders table? Orders (OrderNum, OrderDate) Part (PartNum, Description) OrderLine (OrderNum, PartNum, NumOrdered,QuotedPrice)  Intuition: OrderNum  OrderDate is f.d., so can safely move to another table without losing info.  What will be the PK of new table ?  Same with PartNum  Description  Is non-loss decomposition?

19 Pratt and Adamsi Figure 5.9: 2NF Example

20 Gotten rid of problems ?  Redundancy:  OrderDate 10/20/2007 repeated for OrderNum ?  Description Gas Range repeated for PartNum DR93 ?  Inconsistency: If we change Description Gas Range to Oven for PartNum DR93 ?  Update Anomaly: If we want to change Description Gas Range to Oven for PartNum DR93 and maintain consistency, do we have to make this change in two rows ?

21 Gotten rid of problems ?  Deletion Anomaly:  If we no longer have OrderNum 21608, do we lose info that PartNum DW11 is a washer ?  Insertion Anomaly:  Can we insert info that PartNum AB47 is a range oven without having an order including it ?  Have we lost any information  Spurious tuples when we join new tables?  No !

22 Another 2NF example, is 2NF enough? PartSupp( P#, S#, Sname, SAdd, Price) –Price depends on both P# and S# Non-trivial f.d.s? S#  Sname, S#  SAdd Problems? 2NF? Modified Tables ?Problems fixed ? Table in 2NF may still have problem Consider the Customer table from next two figures with primary key CustomerNum –Is this table OK?

23 Pratt and Adamski Figure 5.10: Combined C and R

24 Problems with Combined C and R Redundancy ? Rep’s info repeated for each Customer Update Anomaly ? If we want to change Rep 35’s last name to Green, to maintain consistency, have to change in multiple rows. Deletion Anomaly? If we no longer have Customer 356, 452, lose name info for Rep 35 Insertion Anomaly? Can’t insert that Rep 45’s name is Bob Smith without a Customer having 45 as their Rep.

25 Boyce Codd Normal Form (BCNF) Suppose we put Balance, RLName, RFName in second table, and also leave Balance in first table. Is this a good decomposition? No. Lossy. Can get spurious tuples. –When do a join over Balance Consider a different decomposition in the next slide:

26 Pratt and Adamski Figure 5.13: Decomposition Example

27 BCNF example Figure 5.13: is this a good way to break up the Customer table ? Is this lossy i.e. when we join the two new tables, do we get any spurious tuples? No spurious tuples. Why ? Any other problems? Update anomalies. Eg? If we want to change Rep’s last name Also insertion and deletion anomalies

28 Pratt and Adamski Figure 5.11: Customers Dependency Diagram

29 BCNF example Is the Customer table in 2NF ? –Is there a partial dependency ? No partial dependency. Why ? Not a composite key, so can’t have partial dependency –No partial dependency : is in 2NF What is the problem – which are undesirable f.d.s? RepNum  Lname, RepNum  Fname Need to break this table up. How ? Want guidelines for doing this.

30 BCNF Definition BCNF: A table is in BCNF if –it is in 2NF and –for any f.d. X  A, if X ≠ A, then X is a key either primary or secondary key  Customer Table: is it in BCNF ?  No: which are the problem f.d.s?  RepNum  Lname, RepNum  Fname. Why ?  RepNum is not a key  Can break the table up according to these f.d.s

31 Pratt and Adamski Figure 5.12: BCNF Example

32 BCNF In this new table, have problems been fixed? –Redundancy ? –Inconsistency/Multiple rows to update ? –Deletions ? –Insertions ? –Spurious tuples when we join the two new tables?

33 BCNF – Another Example Eg [Date]: Supp(S#, Sname, City, Status) City  Status Problems? 2NF? BCNF? Modified Tables ? Problems fixed ? Third Normal Form: a slightly weaker version of BCNF. We won’t cover. –Some books refer to BCNF as 3NF.

34 Constraints and Assertions Use queries to capture constraints –Useful when more general Integrity Constraints than keys are involved. Constraints –Attribute-based CHECK –Tuple-based CHECK Assertions –Extended Assertions Commercial DBMSs generally support tuple- based check, not assertions, though assertions part of SQL standard

35 Attribute-Based Checks Easy way to introduce more powerful tuple- based checks. Want to specify that every employee must have a salary of at least 20,000 Attrribute based check: When we specify attribute in CREATE TABLE –Follow this by a condition that must hold. CREATE TABLE EMPLOYEE … SALARY INTEGER CHECK (SALARY >= 20000) Checked only when associated attribute changes –An insert occurs : new employee inserted –An update occurs: when SALARY is changed

36 Attribute-Based Checks Condition can also refer to (in subquery) –other columns –other tables However, condition is only checked when the “main” attribute is changed –not when other attributes changed.

37 Attribute-Based Checks Eg: in SP table, want to make S# foreign key to S. CREATE TABLE SP … S# CHAR(5) CHECK (S# IN (SELECT S# FROM S)) If try to insert row with S8 into SP ? rejected If try to change S1 in SP to S8 ? rejected If delete S2 from S ? Not rejected. Why ? Won’t check when changes to S

38 Tuple-Based Checks Similar to attribute-based check – but done on whole table In S table, we want to enforce : –all London suppliers have STATUS = 20 Conceptually we would think of this as –If a supplier is in London, then the STATUS = 20 – CITY = ‘London’ => STATUS = 20 Problem : no “=>” in SQL –How to do ?

39 Tuple-Based Checks We assert: CITY ‘London’ OR STATUS = 20 – why is this the same as CITY = ‘London’ => STATUS = 20 If CITY = ‘London’ then the first condition (CITY ‘London’ ) is false, so ? second condition (STATUS = 20) has to be true CREATE TABLE S( S# … CITY … STATUS … CHECK (CITY ‘London’ OR STATUS = 20)

40 Tuple-Based Checks Stronger than attribute based check since checked when tuple inserted (but not deleted) or any attribute updated –With attribute based check, only when associated attribute is update. Suppose done as attribute based check STATUS INTEGER CHECK (CITY ‘London’ OR STATUS = 20) Suppose CITY changes ? Problem ? Won’t check How about with tuple-based ? Problem ?

41 Tuple-Based Checks Condition being checked can be anything that could follow a WHERE Can refer to other tables –No checking done when other tables changed CREATE TABLE SP( S# … CHECK (S# IN (SELECT S# FROM S) Will this work ? No : not when S# changes/gets deleted in S.

42 SQL Assertions Database-schema constraint –Not associated with any table Used when we want to do a check over multiple tables Checked when any involved tables changes. Part of SQL standard, not currently supported by most commercial DBMS (eg: Oracle, SQL Server) because of performance penalty Syntax: CREATE ASSERTION name CHECK (condition);

43 Assertion Example Want to assert that there are at least 4 employees. Is it OK if we do as tuple-based check ? No: why not ? Won’t check in case of DELETION Do as assertion CREATE ASSERTION atleastfouremp CHECK( (SELECT COUNT (ssn) FROM EMPLOYEE) >= 4);

44 Assertion Example Condition: If A then B specified as –not exists (A and not B) Any supplier whose STATUS >= 20 should be supplying at least 3 different parts CREATE ASSERTION highstat CHECK( NOT EXISTS ( SELECT * FROM S WHERE STATUS > = 20 AND (SELECT COUNT(DISTINCT P#) FROM SP WHERE SP.S# = S.S#) < 3 ));

45 SQL3 Extended Assertions [UW] SQL2: system decides when a constraint might be violated, decides when to perform check. SQL3: programmer specifies when to check. The salary of an employee who works in accounting must be at least In SQL2 with assertions CREATE ASSERTION acctsalSQL2assert CHECK (NOT EXISTS ( SELECT * FROM EMPLOYEE, DEPT WHERE EMPLOYEE.DNO = DEPT.DNUMBER AND DNAME = ‘ACCOUNTING’ AND EMPLOYEE.SALARY < ));

46 SQL3 Extended Assertions [UW] SQL3: with extended assertions –when should check be performed ? CREATE ASSERTION acctsalSQL3extassert INSERT ON EMPLOYEE UPDATE ON DEPARTMENT UPDATE OF SALARY, DNO ON EMPLOYEE CHECK (NOT EXISTS ( EMPLOYEE.DNO = DEPT.DNUMBER AND DNAME = ‘ACCOUNTING’ AND EMPLOYEE.SALARY < ));

47 SQL3 Extended Assertions [UW] Advantages of SQL3 extended assertions ? Programmer has more control. –Eg: for efficiency reason, may choose not to perform check when DNO is changed Makes it easier for system implementers –They don’t have to figure out when violation can occur Disadvantages of SQL3 extended assertions ? Programmer must discover when violation might possibly occur –o/w database could go into inconsistent state