Copyright © 2004 Pearson Education, Inc. Functional Dependencies & DB Normalization The main reference of this presentation is the textbook and PPT from.

Slides:



Advertisements
Similar presentations
Functional Dependencies and Normalization for Relational Databases
Advertisements

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
NORMALIZATION. Normalization Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Ch 10, Functional Dependencies and Normal forms
Functional Dependencies and Normalization for Relational Databases.
Ms. Hatoon Al-Sagri CCIS – IS Department Normalization.
Functional Dependencies and Normalization for Relational Databases
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
METU Department of Computer Eng Ceng 302 Introduction to DBMS Functional Dependencies and Normalization for Relational Databases by Pinar Senkul resources:
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
1 Functional Dependency and Normalization Informal design guidelines for relation schemas. Functional dependencies. Normal forms. Normalization.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Chapter 8 Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Announcements Homework 1 due Friday. Slip it under my office door (1155) or put in my mailbox on 5 th floor. Program 2 has been graded ;-( Program 3 out.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 10 Functional Dependencies and Normalization for Relational Databases.
FUNCTIONAL DEPENDENCIES. Chapter Outline 1 Informal Design Guidelines for Relational Databases 1.1Semantics of the Relation Attributes 1.2 Redundant Information.
AL-MAAREFA COLLEGE FOR SCIENCE AND TECHNOLOGY INFO 232: DATABASE SYSTEMS CHAPTER 6 NORMALIZATION FOR RELATIONAL DATABASES Instructor Ms. Arwa Binsaleh.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
King Saud University College of Computer & Information Sciences Computer Science Department CS 380 Introduction to Database Systems Functional Dependencies.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Normalization for Relational Databases.
DatabaseIM ISU1 Chapter 10 Functional Dependencies and Normalization for RDBs Fundamentals of Database Systems.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Topic 10 Functional Dependencies and Normalization for Relational Databases Faculty of Information Science and Technology Mahanakorn University of Technology.
Instructor: Churee Techawut Functional Dependencies and Normalization for Relational Databases Chapter 4 CS (204)321 Database System I.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Chapter 4 Normalization.
Top-Down Database Design Mini-world Requirements Conceptual schema E1 E2 R Relation schemas ?
Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Chapter Functional Dependencies and Normalization for Relational Databases.
CSE314 Database Systems Basics of Functional Dependencies and Normalization for Relational Databases Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E.
Ihr Logo Fundamentals of Database Systems Fourth Edition El Masri & Navathe Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Ihr Logo Fundamentals of Database Systems Fourth Edition El Masri & Navathe Chapter 10 Functional Dependencies and Normalization for Relational Databases.
1 Functional Dependencies and Normalization Chapter 15.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Lecture 8: Database Concepts May 4, Outline From last lecture: creating views Normalization.
1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization.
Dr. Mohamed Osman Hegaz1 Logical data base design (2) Normalization.
Database Design FUNCTIONAL DEPENDENCES NORMAL FORMS D. Christozov / G.Tuparov INF 280 Database Systems: DB design: Normal Forms 1.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
Riyadh Philanthropic Society For Science Prince Sultan College For Woman Dept. of Computer & Information Sciences CS 340 Introduction to Database Systems.
Deanship of Distance Learning Avicenna Center for E-Learning 1 Session - 7 Sequence - 1 Normalization DB Design Guidelines Presented by: Dr. Samir Tartir.
Al-Imam University Girls Education Center Collage of Computer Science 1 st Semester, 1432/1433H Chapter 10_part 1 Functional Dependencies and Normalization.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Functional Dependencies and Normalization for Relational Databases تنبيه : شرائح العرض (Slides) هي وسيلة لتوضيح الدرس واداة من الادوات في ذلك. حيث المرجع.
10/3/2017.
10/3/2017.
Functional Dependency and Normalization
CHAPTER 14 Basics of Functional Dependencies and Normalization for Relational Databases.
Functional Dependencies and Normalization for Relational Databases
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Functional Dependencies and Normalization for RDBs
Database Management systems Subject Code: 10CS54 Prepared By:
Normalization DB Design Guidelines Presented by: Dr. Samir Tartir
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Chapter Outline 1 Informal Design Guidelines for Relational Databases
Presentation transcript:

Copyright © 2004 Pearson Education, Inc. Functional Dependencies & DB Normalization The main reference of this presentation is the textbook and PPT from : Elmasri & Navathe, Fundamental of Database Systems, 4 th edition, 2004, Chapter 10 Additional resources: presentation prepared by Prof Steven A. Demurjian, Sr (

Slide 10-2 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Chapter Outline 1 Informal Design Guidelines for Relational Databases 1.1Semantics of the Relation Attributes 1.2 Redundant Information in Tuples and Update Anomalies 1.3 Null Values in Tuples 1.4 Spurious Tuples 2 Functional Dependencies (FDs) 2.1 Definition of FD 2.2 Inference Rules for FDs 2.3 Equivalence of Sets of FDs

Slide 10-3 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Chapter Outline(contd.) 3 Normal Forms Based on Primary Keys 3.1 Normalization of Relations 3.2 Practical Use of Normal Forms 3.3 Definitions of Keys and Attributes Participating in Keys 3.4First Normal Form 3.5Second Normal Form 3.6Third Normal Form

Slide 10-4 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Design Process Conceptual Design Conceptual Schema (ER Model) Logical Design Logical Schema (Relational Model) Analysis of Schema Step 2: Normalization Analyzing the Schema from Performance/Efficiency Perspectives to arrive at “Optimal” Schema Normalized Schema

Slide 10-5 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Informal Design Guidelines for Relational Databases (1) What is relational database design? The grouping of attributes to form "good" relation schemas Two levels of relation schemas  The logical "user view" level  The storage "base relation" level Design is concerned mainly with base relations What are the criteria for "good" base relations?

Slide 10-6 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Informal Design Guidelines for Relational Databases (2) We first discuss informal guidelines for good relational design Then we discuss formal concepts of functional dependencies and normal forms - 1NF (First Normal Form) - 2NF (Second Normal Form) - 3NF (Third Normal Form) Additional types of dependencies, further normal forms, relational design algorithms by synthesis are discussed in Chapter 11

Slide 10-7 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Semantics of the Relation Attributes GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes).  Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation  Only foreign keys should be used to refer to other entities  Entity and relationship attributes should be kept apart as much as possible. Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret.

Slide 10-8 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Figure 10.1 A simplified COMPANY relational database schema Note: The above figure is now called Figure 10.1 in Edition 4

Slide 10-9 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Redundant Information in Tuples and Update Anomalies Mixing attributes of multiple entities may cause problems Information is stored redundantly wasting storage Problems with update anomalies  Insertion anomalies  Deletion anomalies  Modification anomalies

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 EXAMPLE OF AN UPDATE ANOMALY (1) Consider the relation: EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours) Update Anomaly: Changing the name of project number P1 from “Billing” to “Customer-Accounting” may cause this update to be made for all 100 employees working on project P1.

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 EXAMPLE OF AN UPDATE ANOMALY (2) Insert Anomaly: Cannot insert a project unless an employee is assigned to. Inversely - Cannot insert an employee unless an he/she is assigned to a project. Delete Anomaly: When a project is deleted, it will result in deleting all the employees who work on that project. Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Figure 10.3 Two relation schemas suffering from update anomalies Note: The above figure is now called Figure 10.3 in Edition 4

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Figure 10.4 Example States for EMP_DEPT and EMP_PROJ Note: The above figure is now called Figure 10.4 in Edition 4

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Guideline to Redundant Information in Tuples and Update Anomalies GUIDELINE 2: Design a schema that does not suffer from the insertion, deletion and update anomalies. If there are any present, then note them so that applications can be made to take them into account

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Null Values in Tuples GUIDELINE 3: Relations should be designed such that their tuples will have as few NULL values as possible Attributes that are NULL frequently could be placed in separate relations (with the primary key) Reasons for nulls:  attribute not applicable or invalid  attribute value unknown (may exist)  value known to exist, but unavailable

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Information Loss & Spurious Tuples Bad designs for a relational database may result in erroneous results for certain JOIN operations The "lossless join" property is used to guarantee meaningful results for join operations

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Information Loss & Spurious Tuples We’ve had Guidelines for: One Concept/Relation, Avoiding Update Anomalies, Null Values Two Other “Related” Concerns Can Arise  First, in Decomposing (Splitting) a Relation Apart, we May “Lose” Information  Second, in Attempting to Reassemble Two or More Relations into One (via a Join), Spurious Tuples may Result A Spurious Tuple “Wasn’t” Present Originally and Makes No Sense - Didn’t Exist and its Existence is Inconsistency

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Suppose Split EMP_PROJ

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 What are Semantics of Split? EMP_LOCS Means the Employee ENAME Works on Some Project at PLOCATION EMP_PROJ1 Means the Employee Identified by SSN Works HOURS per Week on Project Identified by PNAME, PNUMBER, PLOCATION

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Recall EMP_PROJ

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 What are Tuples After Split?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 What is the Issue? Suppose EMP_PROJ1 and EMP_LOCS used in Place of EMP_PROJ The Split is Legitimate if we Can Recover the Information Originally in EMP_PROJ How could you Recover the Information?  Natural Join on EMP_PROJ1 and EMP_LOCS  What would be the Result? Note: “*’ed” Entries are Spurious Tuples We do not Obtain the “Correct” Information We have Conducted a “Lossy” Decomposition

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 What Happens When we Join? What do “*”ed Tuples Represent?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Guideline 4 GUIDELINE 4:  The relations should be designed to satisfy the lossless join condition.  No spurious tuples should be generated by doing a natural-join of any relations. There are 2 important properties of decompositions (a)non-additive or losslessness of the corresponding join (b)preservation of the functional dependencies.  Property (a) is extremely important and cannot be sacrificed.  Property (b) is less stringent and may be sacrificed. (See Chapter 11).

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 R = (A, B, C) S = (D, C) b2 b4 c1 c2 A B C c1 c2 c3 d1 d2 d4 d5 D C a1 a2 a3 a1 a2 a3 b2 b4 c1 c2 d1 d2 d4 A B C D RS(A, B, C, D) lost info of (d5, c3) after join R & S Guideline 4: Lost Information A First Example of Lost Information  What is Lost in the Join of R and S?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Guideline 4: Spurious Tuples A Second Example of Spurious Tuples  What are Spurious in the Join of R1and R2? a1 a2 a3 a4 b1 b2 b1 b2 c1 c2 c1 c2 d1 d2 d3 A B C D R(A, B, C, D) R1(B, D) B C b1 b2 b1 b2 d1 d2 d3 D d1 d2 d3 A a1 a2 a3 a4 R2(A, D) a1 a2 a1 a2 a3 a4 b1 b2 b1 b2 c1 c2 c1 c2 d1 d2 d3 A B C D R1 and R2 Join

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Functional Dependencies (1) Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs FDs and keys are used to define normal forms for relations FDs are constraints that are derived from the meaning and interrelationships of the data attributes A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Functional Dependencies (2) X -> Y holds if whenever two tuples have the same value for X, they must have the same value for Y For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then t1[Y]=t2[Y] X -> Y in R specifies a constraint on all relation instances r(R) Denoted by X  Y  X is Called the Left Hand Side of FD  Y is Called the Right Hand Side of FD Read as X Functionally Determines Y in R

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Examples of FD constraints (1) social security number determines employee name SSN -> ENAME project number determines project name and location PNUMBER -> {PNAME, PLOCATION} employee ssn & project number determines the hours per week that the employee works on the project {SSN, PNUMBER} -> HOURS

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Examples of FD constraints (2) An FD is a property of the attributes in the schema R The constraint must hold on every relation instance r(R) If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1[K]=t2[K])

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Example of FDs {S#, CN}  Grade, S#  DNAME, DNAME  DHead. STUDENT_DEPT (S#, DName, DHead, CN, Grade) FDs over STUDENT_DEPT: S# DHeadCN Grade DNAME fd 1 fd 2 fd 3

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Example of FDs SSN  ENAME PNUMBER  {PNAME, PLOCATION} {SSN, PNUMBER}  HOURS SSN  {ENAME, BDATE, ADDRESS, DNUMBER} DNUMBER  {DNAME, DMGRSSN}

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Determining FDs Must Understand the Semantics of Data Based on Schema or Current/Future Instances What are FDs Below? TEXT  COURSE? COURSE  TEXT? What if I add Row “James, Web Databases, Al-Nour”?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Inference Rules for FDs Given a set of FDs F, we can Infer Additional FDs that Hold whenever the FDs in F Hold For Example, Consider: F = {SSN  {EName, BDate, Address, DNumber}, DNumber  {DName, DMGRSSN} } What are Additional FDs? SSN  EName SSN  BDate SSN  SSN SSN  Address SSN  DNumber DNumber  Dname DNumber  DMGRSSN

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Inference Rules Armstrong’s Inference Rules Derived Inference Rules 1. Reflective (IR1): If X  Y, then X  Y. 2. Augmentation (IR2): If { X  Y} then XZ  YZ. 3. Transitive (IR3): If { X  Y, Y  Z } then X  Z. 4. Decomposition: If { X  YZ } then X  Y. 5. Additive (Union): If {X  Y, X  Z } then X  YZ. 6. Pseudotransitive: If {X  Y, WY  Z } then W X  Z.

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Inference Rules for FDs (3) Closure of a set F of FDs is the set F + of all FDs that can be inferred from F Closure of a set of attributes X with respect to F is the set X + of all attributes that are functionally determined by X X + can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in F

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Algorithm 10.1 for Closure Given a Set of FDs, Each X on LHS of a FD, Closure X+ are the Attributes Functionally Determined by X Algorithm 10.1 Determining X+, Closure of X under F X+ := X; repeat 1. oldX+ := X+ ; 2. for each FD Y  Z in F do 3. if X+  Y then X+ := X+  Z; until (X+ = old X+ );

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Applying Algorithm 10.1 Let F = {SSN  ENAME, PNUMBER  {PNAME, PLOCATION}, {SSN, PNUMBER}  HOURS } {SSN} + = {SSN, ENAME} {PNUMBER} + = {PNUMBER, PNAME, PLOCATION } {SSN, PNUMBER} + = {SSN, PNUMBER, ENAME, PNAME, PLOCATION, HOURS} For Example: Computer {SSN, PNUMBER}+ Is {SSN, PNUMBER}  SSN? Yes - Add in ENAME Is {SSN, PNUMBER, ENAME}  PNUMBER? Yes - Add in PNAME, PLOCATION Is {SSN, PNUMBER, ENAME, PNAME, PLOCATION}  {SSN, PNUMBER}? Yes - Add in HOURS

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Interpreting Closure Sets What Does {SSN} + = {SSN, ENAME} Equate to? SSN  ENAME, SSN  SSN, SSN  SSN, ENAME What about {PNUMBER} + ={PNUMBER,PNAME,PLOCATION} PNUMBER  PNAME PNUMBER  PLOCATION PNUMBER  PNUMBER PNUMBER  {PNAME, PLOCATION} PNUMBER  {PNUMBER, PLOCATION} PNUMBER  {PNAME, PNUMBER} PNUMBER  {PNUMBER, PNAME, PLOCATION} Can you do: {SSN, PNUMBER} + = {SSN, PNUMBER, ENAME, PNAME, PLOCATION, HOURS}

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Compute {DNUMBER}+ Is {DNUMBER}  SSN ? No - Continue to Next FD Is {DNUMBER}  DNUMBER ? Yes - Add in DNAME, DMGRSSN {DNUMBER}+ = {DNUMBER, DNAME, DMGRSSN} FYI: {SSN} + ={SSN, ENAME, BDATE, ADDRESS, DNUMBER, DNAME, DMGRSSN} Let G = {SSN  {ENAME, BDATE, ADDRESS, DNUMBER}, DNUMBER  {DNAME, DMGRSSN} }

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Inference Rule Properties Theorem 1: X  A 1 A 2... A n Holds in a Relation Scheme R if and only if X  A i Holds for all i = 1,..., n Theorem 2: Armstrong's Inference Rules are Sound and Complete Given a set F of Functional Dependencies:  By Sound we mean every FD that we can infer from F by using Armstrong’s Inference Rules is in F +  By Complete we mean every FD in F + (that F implies) can be Inferred from F by using Armstrong’s Inference Rules

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Equivalence of Sets of FDs E is Covered by F (F is said to cover E), if Every FD in E is also in F+ Two sets of FDs E and F are equivalent if:  Every FD in E can be Inferred from F, and  Every FD in F can be Inferred from E Hence, F and G are Equivalent if F + =G + Definition: F covers F if every FD in F can be Inferred from F (i.e., if F +  E +) E and F are Equivalent if E Covers F and F Covers E

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Normal Forms Based on Primary Keys 3.1Normalization of Relations 3.2Practical Use of Normal Forms 3.3Definitions of Keys and Attributes Participating in Keys 3.4First Normal Form 3.5Second Normal Form 3.6Third Normal Form

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Normalization of Relations (1) Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations Normal form: Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Normalization of Relations (2) 1NF based on definition of relation 2NF, 3NF, BCNF based on keys and FDs of a relation schema 4NF based on keys, multi-valued dependencies : MVDs; 5NF based on keys, join dependencies : JDs (Chapter 11) Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation; Chapter 11)

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 A Normal Form is a Condition using Keys and FDs to Certify Whether a Relation Schema meets Criteria  Primary keys (1NF, 2NF, 3NF)  All Candidate Keys ( 2NF, 3NF, BCNF)  Multivalued Dependencies (4NF)  Join Dependencies (5NF) 5 NF 4NF 3NF 2NF 1NF What are Normal Form?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Practical Use of Normal Forms Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect The database designers need not normalize to the highest possible normal form. (usually up to 3NF, BCNF or 4NF) Denormalization: the process of storing the join of higher normal form relations as a base relation— which is in a lower normal form

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, Definitions of Keys & Attributes Participating in Keys (1) A superkey of a relation schema R = {A 1, A 2,...., A n } is a set of attributes S subset-of R with the property that no two tuples t 1 and t 2 in any legal relation state r of R will have t 1 [S] = t 2 [S] A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more.

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Definitions of Keys and Attributes Participating in Keys (2) If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. A Prime attribute must be a member of some candidate key A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key.

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 First Normal Form (1NF) All Attributes Must Be Atomic Values:  Only Simple and Indivisible Values in the Domain of Attributes.  Each Attribute in a 1NF Relation is a Single Value  Disallows Composite Attributes, Multivalued Attributes, and Nested Relations (Non-Atomic) 1NF Relation cannot have an Attribute Value :  A Set of Values (Set-Value)  A Tuple of Values (Nested Relation) 1NF is a Standard Assumption of Relation DBs

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 One Example of 1NF Consider Following Department Relation What is the Inherent Problem?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 What are Possible Solutions? Decompose: Move the Attribute DLOCATIONS that Violates 1NF into a Separate Relation DEPT_LOCATIONS(DNUMBER, DLOCATION) Expand the key to have a Separate Tuple in the DEPARTMENT relation for each location (below) Introduce DLOC1, DLOC2, DLOC3, if there are Three Maximum Locations Problems with Each? Best Solution?

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Another 1NF Example - Nested Relations EMP_PROJ - Table and Tuples Transition to:

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Second Normal Form (2NF) Second Normal Form Focuses on the Concepts of Primary Keys and Full Functional Dependencies Intuitively:  A Relation Schema R is in Second Normal Form (2NF) if Every Non-Prime Attribute A in R is Fully Functionally Dependent on the Primary Key  R can be Decomposed into 2NF Relations via the Process of 2NF Normalization  Successful Process Typically Involves Decomposing R into Two or More Relations  Iteratively Applying to Each Relation in Schema

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Full Functional Dependency Full FD - Formally: Given R(U) and X, Y  U. If X  Y holds, and there exists no such X’ that X’  X, and X’  Y holds over R, then Y is fully dependent on X, denoted as X  Y Full FD- Intuitively: A FD X  Y where Removal of any Attribute from X means the FD no Longer Holds  {SSN, PNUMBER}  HOURS is full since Neither SSN  HOURS nor PNUMBER  HOURS holds  What about in the Following: f {S#, CN}  Grade

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Partial Functional Dependency Partial FD - Formally: Given R(U) and X, Y  U. If X  Y holds but Y is not fully dependent on X ( X  Y), then Y is partially functional dependent on X, denoted by X  Y Partial FD - Intuitively: Removal of a Attribute from the R.H.S. still Results in a Valid FD  {SSN, PNUMBER}  ENAME is Partial since Removing PNUMBER still Results in the Valid FD SSN  ENAME  Are Following Full or Partial? p {S#, CN}  CN,{S#, CN}  S# {S#, CN, DNAME}  Grade f

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Second Normal Form (2NF) Formal 2NF Definition R  2NF iff  (i) R  1NF;  (ii) all Non-Key Attributes in R are Fully Functional Dependent on Every Key. Alternative Definition: R  2NF iff the Attributes are Either  a Candidate Key, or  Fully Dependent on Every Key. Reason: Partial Functional Dependencies may cause Update Problems

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 One Example of 2NF Consider the Example Below STUDENT_DEPT(S#, DName, DHead, CN, Grade) STUDENT_DEPT  1NF “{S#, CN}  DName, DHead” is a Partial FD which causes Update Anomalies But STUDENT_DEPT  2NF S# DHeadCN Grade DName fd 1 fd 2 fd 3

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 One Example of 2NF (Continued) Insertion Anomalies:  No Department Can Be Recorded if it has No Student Who Enrolls Courses Deletion Anomalies:  Delete the Last Student in a Department will also Delete the Department Update Anomalies:  Change a Head of a Department must Modify All Students in that Department Due to Redundancies STUDENT_DEPT(S#, DName, DHead, CN, Grade)

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 One Example of 2NF (Continued) Decomposition into 2NF by Separating Course Information from Department Information (Link S#) S_D(S#, DName, DHead) DHead DName fd 2 fd 3 S# S_C(S#, CN, Grade) fd 1 S#CNGrade

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Another Example of 2NF EMP_PROJ is 1NF with Key SSN, PNUMBER but…  SSN  ENAME - Means ENAME, a Non-Prime Attribute, Depends Partially on SSN, PNUMBER, i.e., Depend on Only SSN and not Both  PNUMBER  {PNAME, PLOCATION} - Means PNAME, PLOCATION, two Non-Prime Attributes, Depends Partially on SSN, PNUMBER, i.e., Depend on Only PNUMEBER and not Both

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Another Example of 2NF What Does Decomposition Below Accomplish?  ENAME Fully Dependent on SSN  PNAME, PLOC Fully Dependent on PNUMBER Result: 2NF for EP1, EP2, and EP3

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Yet Another Example of 2NF Consider 1NF Lots to Track Building Lots for Towns What is the 2NF Problem?  FD3: COUNTY_NAME  TAX_RATE Means TAX_RATE Depends Partially on Candidate Key {PROPERTY_ID#,COUNTY_NAME}  All Other Non-Prime Attributes are Fine

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Yet Another Example of 2NF What Does Decomposition Below Accomplish?  TAX_RATE Fully Dependent on COUNTY_NAME Result: 2NF for LOTS1 and LOTS2

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Third Normal Form (3NF) Third Normal Form Focuses on the Concepts of Primary Keys and Transitive Functional Dependencies Intuitively:  A Relation Schema R is in Third Normal Form (3NF) if it is in 2NF and no Non-Prime Attribute A in R is Transitively Dependent on Primary Key  R can be Decomposed into 3NF Relations via the Process of 3NF Normalization In X  Y and Y  Z, with X as the Primary Key, there is only a a problem only if Y is not a candidate key. EMP(SSN, Emp#, Salary), SSN  Emp#  Salary isn’t Problem Since Emp# is a Candidate Key

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Transitive FDs Transitive FD - Formally: Given R(U) and X, Y  U. If X  Y, Y  X and Y  X, Y  Z, then Z is called transitively functional dependent on X. Transitive FD - Intuitively: a FD X  Z that can be derived from two FDs X  Y and Y  Z  SSN  ENAME is non-transitive Since there is no set of Attributes X where SSN  X and X  ENAME S#  DHead S# DHeadCN Grade DNAME fd 1 fd 2 fd 3

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Third Normal Form (3NF) Formal 3NF Definition R  3NF iff (i) R  2NF; (ii) No Non-Key Attribute of R is Transitively Dependent on Every Candidate Key. Alternative Definition: R  3NF iff for every FD X  Y, either  X is a superkey, or  Y is a key attribute. Reason: Transitive Functional Dependencies may cause Update Problems

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 One Example of 3NF STUDENT_DEPT(S#, DName, DHead, CN, Grade)  2NF S_C(S#, CN, Grade)  2NF S_D(S#, DName, DHead)  2NF S_D  3NF S_C  3NF But “S#  DHead” is a Transitive FD in S_D and “DHead” is non-key attribute.

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 One Example of 3NF S_C(S#, CN, Grade)  2NF S_D(S#, DName, DHead)  2NF S_D (S#, DName) DEPT(DName, DHead)  3NF fd 2 S#  DName fd 3 DName  DHead DHead DName S# fd S#  DHead Decompose to Eliminate the Transitivity Within S_D

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Another Example of 3NF EMP_DEPT is 2NF with Key SSN, but there are Two Transitive Dependencies (Undesirable)  SSN  DNUMBER and DNUMBER  DNAME Means DNAME, Neither Key Nor Subset of Key, is Transitively Dependent on SSN  SSN is the Only Candidate Key of EMP_DEPT!  Note: Also Similar Problem with SSN and DMGRSSN via DNUMBER

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Another Example of 3NF To Attain 3NF, Decompose into ED1 and ED2 Intuitively - we are Separating Out Employees and Departments from One Another

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Yet Another Example of 3NF Recall 2NF Solution for Building Lots Problem What is the 3NF Problem? Violate Alternative Defn.  In LOTS1, FD4 AREA  PRICE AREA is not a Superkey PRICE not a Prime Attribute of LOTS1

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Yet Another Example of 3NF Decompose to Introduce a Separate Key AREA Result: 3NF for LOTS1A and LOTS1B

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Summary of Progression STUDENT_DEPT 1NF1NF S# DHeadCN Grade DName fd 1 fd 2 fd 3 S_C S_D 2NF2NF eliminate partial FDs fd 1 S#CNGrade DHead DName fd 2 fd 3 S# DHead S# S_D DName DEPT S_C 3NF3NF eliminate transitive FDs fd 1 S#CNGrade DName fd 3 fd 2

Slide Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Summary of 1NF, 2NF, 3NF Concepts Test Remedy (Normalization) 1NF Relation should have Form new relations for each nonatomic no nonatomic attributes attribute or nested relation. or nested relations. 2NFFor relations where primary Decompose and set up a new relation key contains multiple for each partial key with its dependent attributes, no nonkey attribute(s). Make sure to keep a attribute should be relation with the original primary key functionally dependent on and any attributes that are fully a part of the primary key. functionally dependent on it. 3NF Relation should not have a Decompose and set up a relation that nonkey attribute functionally includes the nonkey attribute(s) that determined by another nonkey functionally determine(s) other attribute (or by a set of nonkey nonkey attribute(s). attributes.) That is, there should be no transitive dependency of a nonkey attribute on the primary key.