Further Normalization II: Higher Normal Forms. Prof. Yin-Fu Huang, CSIE, NYUST. Chapter 13.

Advanced Database System, Yin-Fu Huang
13.1 Introduction
Multi-valued dependency (MVD) → 4NF
Join dependency (JD) → 5NF

Multi-valued Dependencies and Fourth Normal Form (1/5)
Example (see Fig. 13.1)
Assumptions:
1. For a given course c, there can be any number m of corresponding teachers and any number n of corresponding texts (m > 0, n > 0).
2. Teachers and texts are quite independent of one another.
3. A given teacher or a given text can be associated with any number of courses.

Multi-valued Dependencies and Fourth Normal Form (2/5)
Eliminating the relation-valued attributes (see Fig. 13.2)
The resulting relvar CTX satisfies the following constraint: if tuples (c, t1, x1) and (c, t2, x2) both appear, then tuples (c, t1, x2) and (c, t2, x1) both appear also.
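
The elimination of relation-valued attributes can be sketched in a few lines of Python. This is only an illustration: the course data and the names `hctx`, `unnest`, and `satisfies_swap_constraint` are hypothetical and are not taken from Figs. 13.1 or 13.2. Because every teacher of a course is paired with every text of that course, the flattened relation satisfies the constraint stated above by construction.

```python
from itertools import product

# Hypothetical sample data (not taken from the figures):
# course -> (set of teachers, set of texts).
hctx = {
    "Physics": ({"Prof. Green", "Prof. Brown"},
                {"Basic Mechanics", "Principles of Optics"}),
    "Math": ({"Prof. Green"},
             {"Basic Mechanics", "Vector Analysis", "Trigonometry"}),
}


def unnest(nested):
    """Flatten course -> (teachers, texts) into (course, teacher, text) tuples."""
    return {
        (course, teacher, text)
        for course, (teachers, texts) in nested.items()
        for teacher, text in product(teachers, texts)
    }


def satisfies_swap_constraint(rel):
    """True if (c, t1, x1) and (c, t2, x2) in rel always imply (c, t1, x2) in rel.

    The symmetric tuple (c, t2, x1) is covered when the same pair of tuples
    is visited in the opposite order.
    """
    return all(
        (c1, t1, x2) in rel
        for (c1, t1, _x1) in rel
        for (c2, _t2, x2) in rel
        if c1 == c2
    )


ctx = unnest(hctx)
print(len(ctx), satisfies_swap_constraint(ctx))
```

Removing any single Physics tuple from `ctx` makes the check fail, which is the kind of update anomaly the following slides discuss.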

Multi-valued Dependencies and Fourth Normal Form (3/5)
CTX involves a good deal of redundancy → update anomalies.
Nevertheless, CTX is in BCNF, since it is “all key.”
Multi-valued dependencies (MVDs) are a generalization of functional dependencies.
Two MVDs hold in CTX:
Course →> Teacher
Course →> Text
Multi-valued dependence: Let R be a relvar, and let A, B, and C be subsets of the attributes of R. Then B is multi-dependent on A, written A →> B, if and only if, in every legal value of R, the set of B values matching a given (A value, C value) pair depends only on the A value and is independent of the C value.
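
The definition can be tested mechanically on a single relation value. The sketch below is an illustration under stated assumptions: the function name `satisfies_mvd`, the tuple-based representation, and the sample CTX rows are all hypothetical. It also checks only one value of the relvar, whereas the definition quantifies over every legal value. C is taken to be all attributes outside A and B.

```python
from collections import defaultdict

# Hypothetical sample value of CTX (Course, Teacher, Text).
ATTRS = ("Course", "Teacher", "Text")
ctx = {
    ("Physics", "Prof. Green", "Basic Mechanics"),
    ("Physics", "Prof. Green", "Principles of Optics"),
    ("Physics", "Prof. Brown", "Basic Mechanics"),
    ("Physics", "Prof. Brown", "Principles of Optics"),
    ("Math", "Prof. Green", "Basic Mechanics"),
    ("Math", "Prof. Green", "Vector Analysis"),
    ("Math", "Prof. Green", "Trigonometry"),
}


def satisfies_mvd(rows, attrs, a, b):
    """Check A ->> B on one relation value.

    For every (A value, C value) pair that occurs, the set of matching
    B values must equal the set of all B values seen with that A value.
    """
    c = [name for name in attrs if name not in a and name not in b]
    index = {name: i for i, name in enumerate(attrs)}

    def pick(row, names):
        return tuple(row[index[name]] for name in names)

    b_by_a = defaultdict(set)
    b_by_ac = defaultdict(set)
    for row in rows:
        b_by_a[pick(row, a)].add(pick(row, b))
        b_by_ac[(pick(row, a), pick(row, c))].add(pick(row, b))
    return all(b_set == b_by_a[a_val] for (a_val, _), b_set in b_by_ac.items())


print(satisfies_mvd(ctx, ATTRS, ["Course"], ["Teacher"]))
print(satisfies_mvd(ctx, ATTRS, ["Course"], ["Text"]))
print(satisfies_mvd(ctx, ATTRS, ["Teacher"], ["Course"]))
```

With this sample value the two MVDs on Course hold, while Teacher →> Course does not, because the courses matching Prof. Green depend on which text is considered.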

Multi-valued Dependencies and Fourth Normal Form (4/5)
MVDs always go together in pairs: given R{A, B, C}, the MVD A →> B holds if and only if A →> C also holds. The pair is written as a single statement, e.g., Course →> Teacher | Text.
The two projections CT and CX do not involve any such MVDs (see Fig. 13.3).

Multi-valued Dependencies and Fourth Normal Form (5/5)
Theorem (Fagin): Let R{A, B, C} be a relvar, where A, B, and C are sets of attributes. Then R is equal to the join of its projections on {A, B} and {A, C} if and only if R satisfies the MVDs A →> B | C.
Fourth normal form: Relvar R is in 4NF if and only if, whenever there exist subsets A and B of the attributes of R such that the nontrivial MVD A →> B is satisfied, then all attributes of R are also functionally dependent on A.
In other words, the only nontrivial dependencies (FDs or MVDs) in R are of the form K → X, i.e., a functional dependency from a superkey K to some other attribute X.
If we start with a relvar involving two or more independent RVAs, it is better to separate the RVAs first.
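
The theorem can be illustrated on a small relation value. In this sketch the sample CTX rows and the helper names `project` and `join_on_course` are hypothetical. The value that satisfies Course →> Teacher | Text is recovered exactly from its two projections, while a value that violates the MVDs is not: the join brings the missing tuple back.

```python
# Hypothetical sample value of CTX (Course, Teacher, Text).
ctx = {
    ("Physics", "Prof. Green", "Basic Mechanics"),
    ("Physics", "Prof. Green", "Principles of Optics"),
    ("Physics", "Prof. Brown", "Basic Mechanics"),
    ("Physics", "Prof. Brown", "Principles of Optics"),
    ("Math", "Prof. Green", "Basic Mechanics"),
    ("Math", "Prof. Green", "Vector Analysis"),
    ("Math", "Prof. Green", "Trigonometry"),
}


def project(rows, positions):
    """Projection onto the given attribute positions (duplicates removed)."""
    return {tuple(row[i] for i in positions) for row in rows}


def join_on_course(ct, cx):
    """Natural join of CT (Course, Teacher) and CX (Course, Text) on Course."""
    return {
        (course, teacher, text)
        for (course, teacher) in ct
        for (course2, text) in cx
        if course == course2
    }


def rejoin(rows):
    """Project onto {Course, Teacher} and {Course, Text}, then join back."""
    return join_on_course(project(rows, (0, 1)), project(rows, (0, 2)))


# A value that violates the MVDs: one Physics tuple is missing.
broken = ctx - {("Physics", "Prof. Green", "Basic Mechanics")}

print(rejoin(ctx) == ctx)        # nonloss decomposition
print(rejoin(broken) == broken)  # the join is a proper superset here
```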

Join Dependencies and Fifth Normal Form (1/5)
A relvar is “n-decomposable” if it can be nonloss-decomposed into n projections but not into m projections for any m with 1 < m < n (see Fig. 13.4).

Join Dependencies and Fifth Normal Form (2/5)
The constraint (Constraint 3D), where SP, PJ, and JS are the projections of SPJ on {S#, P#}, {P#, J#}, and {J#, S#}:
If the pair (s1, p1) appears in SP, the pair (p1, j1) appears in PJ, and the pair (j1, s1) appears in JS, then the triple (s1, p1, j1) appears in SPJ.
Equivalently, in terms of SPJ alone: if (s1, p1, j2), (s2, p1, j1), and (s1, p2, j1) appear in SPJ, then (s1, p1, j1) appears in SPJ also.
A relvar will be n-decomposable for some n > 2 if and only if it satisfies some such (n-way) cyclic constraint.
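
The tuple form of Constraint 3D can be checked by brute force over all triples of tuples. The sketch below is illustrative only: the function name `satisfies_constraint_3d` and the sample SPJ rows are hypothetical, and the check applies to one relation value rather than to every legal value of the relvar.

```python
from itertools import product

# Hypothetical sample value of SPJ (S#, P#, J#).
spj = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}


def satisfies_constraint_3d(rows):
    """Check the cyclic constraint on one relation value.

    Whenever (s1, p1, j2), (s2, p1, j1), and (s1, p2, j1) all appear,
    (s1, p1, j1) must appear as well.
    """
    rows = set(rows)
    for first, second, third in product(rows, repeat=3):
        s1, p1, _j2 = first
        _s2, p1_again, j1 = second
        s1_again, _p2, j1_again = third
        shares = p1 == p1_again and s1 == s1_again and j1 == j1_again
        if shares and (s1, p1, j1) not in rows:
            return False
    return True


print(satisfies_constraint_3d(spj))
print(satisfies_constraint_3d(spj - {("S1", "P1", "J1")}))
```

The second call fails because the three remaining tuples jointly require (S1, P1, J1) to be present.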

Join Dependencies and Fifth Normal Form (3/5)
Join dependency: Let R be a relvar, and let A, B, ..., Z be subsets of the attributes of R. Then R satisfies the JD * {A, B, ..., Z} if and only if every legal value of R is equal to the join of its projections on A, B, ..., Z.
Relvar SPJ suffers from a number of update anomalies, which disappear when it is 3-decomposed (see Fig. 13.5).
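
A small experiment makes the join dependency * {SP, PJ, JS} concrete. The sample SPJ rows and the helper names below are hypothetical. Under that assumption, joining all three binary projections reproduces SPJ, whereas joining any two of them yields a spurious tuple, which is what 3-decomposability means for this value.

```python
# Hypothetical sample value of SPJ (S#, P#, J#).
spj = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three binary projections.
sp = {(s, p) for (s, p, j) in spj}
pj = {(p, j) for (s, p, j) in spj}
js = {(j, s) for (s, p, j) in spj}

# Two-way joins, each expressed as a set of (S#, P#, J#) triples.
sp_pj = {(s, p, j) for (s, p) in sp for (p2, j) in pj if p == p2}
pj_js = {(s, p, j) for (p, j) in pj for (j2, s) in js if j == j2}
js_sp = {(s, p, j) for (j, s) in js for (s2, p) in sp if s == s2}

# Three-way join: filter the first two-way join through JS.
three_way = {(s, p, j) for (s, p, j) in sp_pj if (j, s) in js}

print(three_way == spj)
print(sorted(sp_pj - spj), sorted(pj_js - spj), sorted(js_sp - spj))
```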

Join Dependencies and Fifth Normal Form (4/5)
An MVD is just a special case of a JD; equivalently, JDs are a generalization of MVDs.
JDs are the most general form of dependency possible, in the sense that no higher form of dependency exists as long as decomposition is done by projection and recomposition by join.
Fifth normal form: A relvar R is in 5NF, also called projection-join normal form (PJ/NF), if and only if every nontrivial join dependency that is satisfied by R is implied by the candidate keys of R.
Relvar SPJ is in 4NF but not in 5NF: it is 3-decomposable, and that 3-decomposability is not implied by the fact that the combination {S#, P#, J#} is a candidate key.

Join Dependencies and Fifth Normal Form (5/5)
Examples of JDs implied by candidate keys (e.g., in the supplier relvar S):
Example: * { {S#, Sname, Status}, {S#, City} }
This JD is implied by the fact that {S#} is a candidate key.
Example: * { {S#, Sname}, {S#, Status}, {Sname, City} }
This JD is implied by the fact that {S#} and {Sname} are both candidate keys.
Given a relvar R, we can tell whether R is in 5NF as long as we know all candidate keys and all JDs in R. However, discovering all of those JDs might itself be a nontrivial exercise.
5NF is the ultimate normal form with respect to projection and join.

The Normalization Procedure Summarized (1/3)
Given some 1NF relvar R and some set of FDs, MVDs, and JDs that apply to R, we systematically reduce R to a collection of “smaller” relvars that are equivalent to R.
The overall process:
1. 1NF → 2NF: to eliminate FDs that are not irreducible.
2. 2NF → 3NF: to eliminate transitive FDs.
3. 3NF → BCNF: to eliminate remaining FDs in which the determinant is not a candidate key.
4. BCNF → 4NF: to eliminate MVDs that are not also FDs.
5. 4NF → 5NF: to eliminate JDs that are not implied by the candidate keys.

The Normalization Procedure Summarized (2/3)
Several points:
1. Done in a nonloss way, and preferably in a dependency-preserving way as well.
2. There is a very attractive parallelism among the definitions of BCNF, 4NF, and 5NF.
3. The overall objectives:
a. To eliminate certain kinds of redundancy
b. To avoid certain update anomalies
c. To produce a design that is a “good” representation of the real world
d. To simplify the enforcement of certain integrity constraints

The Normalization Procedure Summarized (3/3)
4. The normalization guidelines are only guidelines, and occasionally there might be good reasons for not normalizing “all the way.”
5. The notions of dependency and further normalization are semantic in nature.
6. The ideas of normalization are not a panacea.
a. JDs, MVDs, and FDs are not the only kinds of constraints that can arise in practice.
b. The decomposition might not be unique.
c. The BCNF and dependency preservation objectives can be in conflict.
d. Not all redundancies can be eliminated in the normalization procedure.

A Note on Denormalization
It is often claimed that “denormalization” is necessary to achieve good performance. The argument runs:
Full normalization → lots of logically separate relvars → lots of physically separate stored files → lots of I/O
More specifically, the objective of denormalization is to reduce the number of joins that need to be done at run time by doing some of those joins ahead of time, as part of the database design.
Example (see Fig. 13.6)

A Note on Denormalization (Cont.)
Some problems:
a. Once we start denormalizing, it is not clear where we should stop.
b. There can be retrieval problems too. For example, the average weight per color is simple to state against the parts relvar P but needs an extra projection against the denormalized relvar PSQ:
Summarize P Per P{Color} Add Avg (Weight) As Avwt
Summarize PSQ {P#, Color, Weight} Per PSQ{Color} Add Avg (Weight) As Avwt
c. When we say that denormalization is good for performance, what we really mean is that it is good for the performance of specific applications.
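
The retrieval problem in point b can be reproduced with a few rows. This is a sketch under assumptions: the contents of Fig. 13.6 are not given here, so PSQ is assumed to repeat each part's attributes once per shipment, and the sample parts, shipments, and the helper `avg_weight_by_color` are hypothetical. Averaging over PSQ rows directly weights each part by its number of shipments; projecting onto {P#, Color, Weight} first, as the second Summarize does, restores the correct result.

```python
from collections import defaultdict

# Hypothetical data. parts: (P#, Color, Weight); shipments: (S#, P#, Qty).
parts = [("P1", "Red", 12.0), ("P2", "Green", 17.0), ("P4", "Red", 14.0)]
shipments = [("S1", "P1", 300), ("S2", "P1", 300), ("S1", "P2", 200), ("S1", "P4", 200)]

# Assumed shape of the denormalized relvar PSQ: one row per shipment,
# with the part attributes repeated on every row.
psq = [
    (p, color, weight, s, qty)
    for (p, color, weight) in parts
    for (s, p2, qty) in shipments
    if p == p2
]


def avg_weight_by_color(rows):
    """rows: iterable of (color, weight) pairs; returns {color: average weight}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for color, weight in rows:
        sums[color] += weight
        counts[color] += 1
    return {color: sums[color] / counts[color] for color in sums}


# Against the normalized parts relvar: one row per part.
from_p = avg_weight_by_color((color, w) for (_p, color, w) in parts)

# Naive query against PSQ: P1 is counted once per shipment.
naive = avg_weight_by_color((color, w) for (_p, color, w, _s, _q) in psq)

# Project PSQ onto {P#, Color, Weight} first, then summarize.
distinct_parts = {(p, color, w) for (p, color, w, _s, _q) in psq}
projected = avg_weight_by_color((color, w) for (_p, color, w) in distinct_parts)

print(from_p, naive, projected)
```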

Orthogonal Design (A Digression)
(See Fig. 13.7)
The Principle of Orthogonal Design (initial version): Within a given database, no two distinct base relvars should have overlapping meanings.

Orthogonal Design (A Digression) (Cont.)
(See Fig. 13.8)
The Principle of Orthogonal Design (final version): Let A and B be distinct base relvars. Then there must not exist nonloss decompositions of A and B into A1, A2, …, Am and B1, B2, …, Bn such that some projection Ai in the set A1, A2, …, Am and some projection Bj in the set B1, B2, …, Bn have overlapping meanings.

Other Normal Forms
Dependency theory
Domain-key normal form
“Restriction-union” normal form
(3,3)NF  4NF
Sixth normal form

The End.