Normalization Database Management Systems, 3rd ed., Ramakrishnan and Gehrke, Chapter 19.

Slides:



Advertisements
Similar presentations
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
Advertisements

ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala N ATIONAL I NSTITUTE OF T ECHNOLOGY A GARTALA Aug-Dec,2010 Normalization 2 CSE-503 :: D ATABASE.
1/22/20091 Study the methods of first, second, third, Boyce-Codd, fourth and fifth normal form for relational database design, in order to eliminate data.
NORMALIZATION. Normalization Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Functional Dependencies and Normalization for Relational Databases.
Murali Mani Normalization. Murali Mani What and Why Normalization? To remove potential redundancy in design Redundancy causes several anomalies: insert,
1 Functional Dependency and Normalization Informal design guidelines for relation schemas. Functional dependencies. Normal forms. Normalization.
Chapter 14 Advanced Normalization Transparencies © Pearson Education Limited 1995, 2005.
©Silberschatz, Korth and Sudarshan7.1Database System Concepts Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Normal Forms1. 2 The Problems of Redundancy Redundancy is at the root of several problems associated with relational schemas: Wastes storage Causes problems.
King Saud University College of Computer & Information Sciences Computer Science Department CS 380 Introduction to Database Systems Functional Dependencies.
Lecture 6 Normalization: Advanced forms. Objectives How inference rules can identify a set of all functional dependencies for a relation. How Inference.
DatabaseIM ISU1 Chapter 10 Functional Dependencies and Normalization for RDBs Fundamentals of Database Systems.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Computing & Information Sciences Kansas State University Tuesday, 27 Feb 2007CIS 560: Database System Concepts Lecture 18 of 42 Tuesday, 27 February 2007.
Logical Database Design (1 of 3) John Ortiz Lecture 6Logical Database Design (1)2 Introduction  The logical design is a process of refining DB schema.
11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.
Lecture 8: Database Concepts May 4, Outline From last lecture: creating views Normalization.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
Riyadh Philanthropic Society For Science Prince Sultan College For Woman Dept. of Computer & Information Sciences CS 340 Introduction to Database Systems.
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke1 Schema Refinement and Normal Forms Chapter 19.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe.
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
Functional Dependencies and Normalization for Relational Databases تنبيه : شرائح العرض (Slides) هي وسيلة لتوضيح الدرس واداة من الادوات في ذلك. حيث المرجع.
10/3/2017.
10/3/2017.
COP 6726: New Directions in Database Systems
Functional Dependency and Normalization
Advanced Normalization
Chapter 15 Relational Design Algorithms and Further Dependencies
CHAPTER 14 Basics of Functional Dependencies and Normalization for Relational Databases.
Gergely Lukács Pázmány Péter Catholic University
Functional Dependencies and Normalization for Relational Databases
Schema Refinement and Normal Forms
Database Management Systems (CS 564)
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Normalization First Normal Form (1NF) Boyce-Codd Normal Form (BCNF)
Functional Dependencies and Normalization for RDBs
Relational Database Design by Dr. S. Sridhar, Ph. D
3.1 Functional Dependencies
Handout 4 Functional Dependencies
Advanced Normalization
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Schema Refinement and Normalization
Database Management systems Subject Code: 10CS54 Prepared By:
Database Normalization
Schema Refinement What and why
Normalization Murali Mani.
Outline: Normalization
Normalization.
Normalization Part II cs3431.
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Normalization cs3431.
CS 405G: Introduction to Database Systems
Normalization February 28, 2019 DB:Normalization.
Schema Refinement and Normal Forms
NORMALIZATION FIRST NORMAL FORM (1NF):
Relational Database Design
Instructor: Mohamed Eltabakh
Chapter Outline 1 Informal Design Guidelines for Relational Databases
Chapter 7a: Overview of Database Design -- Normalization
Functional Dependencies and Normalization
Presentation transcript:

Normalization Database Management Systems, 3rd ed., Ramakrishnan and Gehrke, Chapter 19

Motivation Update anomalies – unexpected side effects as result of insertion, deletion, or modification of data Insertion anomaly – not possible to store certain information unless some other, unrelated, information is stored Deletion anomaly - not possible to delete certain information w/o losing some other, unrelated, information Modification anomaly – inconsistency due to not all copies of data item being updated

Example Advising (FacSSN, FacName, Office, FacDept, StuSSN, StuName, Address, GPA) What insertion anomalies may occur? What deletion anomalies may occur? What modification anomalies may occur?

Normalization Process that decomposes (poorly designed) relations into simpler (well designed) relations in a way that: minimizes redundancy ensures that a relation’s attributes are properly dependent on (or determined by) the PK

Functional Dependence For given relation, attribute X functionally determines attribute Y (or attribute Y is functionally dependent on attribute X) if each value of X has associated with it exactly one value of Y at any time. X → Y I.e., for every pair of tuples t1 and t2 in the relation, if t1.X = t2.X then t1.Y = t2.Y.

Candidate Key Let K represent an attribute or set of attributes of relation R; let T represent a tuple in R. K is a candidate key of R if R.K → R.T, and no attribute in K can be removed without violating the first condition. A superkey is an attribute or set of attributes in R that satisfies only the first condition.

Closure of Set of FDs Armstrong’s Axioms: Reflexivity: If X ⊇ Y, then X → Y Augmentation: If X → Y, then XZ → YZ for any Z Transitivity: If X → Y and Y → Z, then X → Z These axioms are both sound and complete.

Closure of Set of FDs (continued) Additional rules that are convenient: Union: If X → Y and X → Z, then X → YZ Decomposition: If X → YZ, then X → Y and X → Z

Example Given: Contracts (C, S, J, D, P, Q, V) F = {C → (C, S, J, D, P, Q, V), (J, P) → C, (S, D) → P} Show: (S, D, J) → (C, S, J, D, P, Q, V) What does this mean must be true of (S, D, J)?

Example (continued) C → (C, S, J, D, P, Q,V) given (J, P) → C given (S, D) → P given (J, P) → (C, S, J, D, P, Q, V) transitivity: 1 & 2 (S, D, J) → (J, P) augmentation on 3 6. (S, D, J) → (C, S, J, D, P, Q, V) transitivity: 5 & 4

Trivial Functional Dependencies A trivial FD is an FD in which the right side contains only attributes that also appear on the left side. X → Y where Y ⊆ X Such FDs always hold due to reflexivity. Examples from the Contracts relation: S → S (S, J, D) → (S, D) (P, Q, V) → P (P, Q, V) → (Q, V)

Attribute Closure To determine if a given FD such as X → Y is in the closure set F+, determine attribute closure X+ with respect to (wrt) F. X+ is set of attributes A such that X → A can be inferred using Armstrong’s Axioms.

Attribute Closure Algorithm To compute X+ wrt F for some attribute set X: closure = X; repeat until there is no change: { if there is an FD U → V in F such that U ⊆ closure, then set closure = closure ∪ V } Figure 19.4

First Normal Form A relation is in 1NF if all of its attribute values are atomic. 1NF disallows nested relations. Example of “relation” not in 1NF: EmpProj {SSN, EName, {PNumber, Hours}}

Second Normal Form Let X and Y be sets of attributes of relation R where X consists of more than one attribute. Y is fully functionally dependent on X if (a) Y is functionally dependent on X, and (b) Y is not functionally dependent on some proper subset of X. A relation is in 2NF if it is in 1NF and all of its nonkey attributes are fully functionally dependent on every candidate key.

Third Normal Form Let X, Y, and Z be sets of attributes of relation R. Z is transitively dependent on X if X → Y, Y → Z, and Y -/-> X A relation is in 3NF if it is in 2NF and there are no transitive dependencies.

Motivation for BCNF Relation representing land for sale in various counties: Lots (PropertyID, CountyName, LotNum, Area, Price, TaxRate) CKs: PropertyID and (CountyName, LotNum) Suppose: tax rate is fixed for given county; price of lot (for tax purposes) is determined by area regardless of county. What are the FDs? Fundamentals of Database Systems by Elmasri and Navathe

Motivation (continued) Lots is not in 2NF because TaxRate is partially dependent on (CountyName, LotNum). Decomposition of Lots: Lots1 (PropertyID, CountyName, LotNum, Area, Price) Lots2 (CountyName, TaxRate) Lots1 and Lots2 are in 2NF. But Lots1 is not in 3NF. (Why not?)

Motivation (continued) Decomposition of Lots1: Lots1A (PropertyID, CountyName, LotNum, Area) Lots1B (Area, Price) Lots1A, Lots1B, and Lots2 are in 3NF.

Motivation (continued) Now suppose we will store information about only 2 counties, and that the lot sizes will only be as follows: Marion: 0.5, 0.6, …, 1.0 acres Liberty: 1.1, 1.2, …, 2.0 acres What additional FD is now added to F? Although Lots1A is in 3NF, there is a lot of unncessary redundancy. A stronger normal form is needed to suggest to us the need to decompose Lots1A.

Boyce-Codd Normal Form Let R be a relation schema, F be the set of FDs given to hold over R, X be a subset of the attributes of R, and A be an attribute of R. R is in BCNF if, for every FD X → A in F, one of the following statements is true: A  X, or X is a superkey.

Decomposition into BCNF Suppose R is not in BCNF. Let X  R, A be a single attribute in R, and X → A be an FD that causes a violation of BCNF. Decompose R into R – A and XA. If either R – A or XA is not in BCNF, decompose them further by a recursive application of this algorithm. Database Management Systems, 3rd ed., Ramakrishnan and Gehrke, p. 623

Example of Multivalued Dependencies Relation in BCNF: CourseTeacherBook (Course, Teacher, Book) Sample tuples: <Physics101, Green, Mechanics> <Math301, Green, Mechanics> <Physics101, Green, Optics> <Math301, Green, Vectors> <Physics101, Brown, Mechanics> <Math301, Green, Geometry> <Physics101, Brown, Optics> Decomposition needed to eliminate redundancy: CourseTeacher (Course, Teacher) CourseBook (Course, Book) Adapted from Figure 19.13

Multivalued Dependencies Let R be a relation schema and let X, Y, and Z be subsets of attributes of R. X multidetermines Y (X →→ Y) iff the set of Y values that correspond to a given (X, Z)-pair value depends only on the value of X. Examples from CourseTeacherBook: Course →→ Teacher Course →→ Book

Fourth Normal Form Let R be a relation schema, X and Y be nonempty subsets of the attributes of R, and F be a set of dependencies that includes both FDs and MVDs. R is in 4NF if, for every MVD X →→ Y that holds over R, one of the following statements is true: Y ⊆ X or XY = R, or X is a superkey. I.e., for each MVD X →→ Y, all attributes of R are functionally dependent on X.

Quick Introduction to Relational Algebra Operations A projection extracts only those columns of interest. A natural join of two relations R1 and R2 constructs a new relation such that each tuple in the new relation is the concatenation of two tuples, one from R1 and one from R2, where the two tuples have the same value for two attributes defined over a common domain.

Lossless Decomposition The decomposition process used in normalization is a lossless (or nonloss) decomposition. I.e., the join of the projects is equal to the original relation. O/w spurious data is introduced.

Fifth Normal Form Let X, Y, …, Z represent subsets of the set of attributes for some relation R. R satisfies the join dependency iff it equals the join of its projections on X, Y, …, Z. A relation is in 5NF iff each of its join dependencies is a consequence of its candidate keys.

Example Relation in 4NF: Employee (ID, Skill, Job) Sample tuples: <100, Sk-1, Job-4> <100, Sk-1, Job-2> <100, Sk-14, Job-4> <200, Sk-14, Job-4> <100, Sk-7, Job-2> <200, Sk-14, Job-5>

Example (continued) Projections of Employee: R1 (ID, Skill) R2 (Skill, Job) R3 (ID, Job) Consider the tuples belonging to each projection. Is the join of the projects equal to Employee?

Example (continued) Join R1 and R2 over Skill. Spurious data introduced: <100, Sk-14, Job-5> Join resulting relation and R3 over (ID, Job). Result? original Employee tuples

Example (continued) Odd restriction on updating Employee. E.g., suppose Employee contains: <100, Sk-14, Job-4> <100, Sk-7, Job-2> If we wish to add: <200, Sk-14, Job-2> we must also add: <100, Sk-14, Job-2>