1 Normalization Theory. 2 Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide a way.

Slides:



Advertisements
Similar presentations
primary key constraint foreign key constraint
Advertisements

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala N ATIONAL I NSTITUTE OF T ECHNOLOGY A GARTALA Aug-Dec,2010 Normalization 2 CSE-503 :: D ATABASE.
1 Chapter 6 Relational Normalization Theory. 2 Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does.
Database Management COP4540, SCS, FIU Functional Dependencies (Chapter 14)
1 Relational Normalization Theory Chapter 8. 2 Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does.
Relational Normalization Theory. Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide.
CSCI 4333 Database Design and Implementation Review for Midterm Exam II Xiang Lian The University of Texas – Pan American Edinburg, TX 78539
1 Relational Normalization Theory. 2 Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide.
Relational Normalization Theory
Multivalued Dependency Prepared by Tomasz Kaciak CS157A.
1 CMSC424, Spring 2005 CMSC424: Database Design Lecture 9.
Databases 6: Normalization
Functional Dependencies CS 186, Spring 2006, Lecture 21 R&G Chapter 19 Science is the knowledge of consequences, and dependence of one fact upon another.
Chapter 8: Relational Database Design First Normal Form First Normal Form Functional Dependencies Functional Dependencies Decomposition Decomposition Boyce-Codd.
©Silberschatz, Korth and Sudarshan7.1Database System Concepts Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Introduction to Normalization CPSC 356 Database Ellen Walker Hiram College.
Relational Database Design by Relational Database Design by Dr.S.Sridhar, Ph.D.(JNUD), RACI(Paris, NICE), RMR(USA), RZFM(Germany) DIRECTOR ARUNAI ENGINEERING.
Normal Forms1. 2 The Problems of Redundancy Redundancy is at the root of several problems associated with relational schemas: Wastes storage Causes problems.
CSCD34 - Data Management Systems - A. Vaisman1 Schema Refinement and Normal Forms.
Schema Refinement and Normal Forms Chapter 19 1 Database Management Systems 3ed, R.Ramakrishnan & J.Gehrke.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Ihr Logo Fundamentals of Database Systems Fourth Edition El Masri & Navathe Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Ihr Logo Fundamentals of Database Systems Fourth Edition El Masri & Navathe Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Logical Database Design (1 of 3) John Ortiz Lecture 6Logical Database Design (1)2 Introduction  The logical design is a process of refining DB schema.
Chapter 6 Database Design with the Relational Normalization Theory.
11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.
Functional Dependencies. FarkasCSCE 5202 Reading and Exercises Database Systems- The Complete Book: Chapter 3.1, 3.2, 3.3., 3.4 Following lecture slides.
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
1 Functional Dependencies. 2 Motivation v E/R  Relational translation problems : –Often discover more “detailed” constraints after translation (upcoming.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke1 Schema Refinement and Normal Forms Chapter 19.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
Functional Dependencies CIS 4301 Lecture Notes Lecture 8 - 2/7/2006.
CS542 1 Schema Refinement Chapter 19 (part 1) Functional Dependencies.
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke1 Schema Refinement and Normal Forms Chapter 19.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
CSCI 6315 Applied Database Systems Review for Midterm Exam II Xiang Lian The University of Texas Rio Grande Valley Edinburg, TX 78539
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
11/16/2016CENG352 Database Management Systems1 Relational Normalization Theory.
Functional Dependency and Normalization
Functional Dependency
Database Management Systems (CS 564)
Normalization Functional Dependencies Presented by: Dr. Samir Tartir
Relational Database Design by Dr. S. Sridhar, Ph. D
Handout 4 Functional Dependencies
Schema Refinement & Normalization Theory
Functional Dependencies
Database Management systems Subject Code: 10CS54 Prepared By:
Normalization Murali Mani.
Functional Dependencies and Normalization
Relational Normalization Theory
Functional Dependencies
Normalization.
Schema Refinement and Normalization
Functional Dependencies
Lecture 07: E/R Diagrams and Functional Dependencies
Normalization cs3431.
CS 405G: Introduction to Database Systems
Instructor: Mohamed Eltabakh
Schema Refinement and Normal Forms
Chapter 19 (part 1) Functional Dependencies
Relational Database Design
CSC 453 Database Systems Lecture
Lecture 6: Functional Dependencies
Presentation transcript:

1 Normalization Theory

2 Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide a way of evaluating alternative schemas Normalization theory provides a mechanism for analyzing and refining the schema produced by an E-R design

3 Redundancy Dependencies between attributes cause redundancy –Ex. All addresses in the same town have the same zip code SSN Name Town Zip 1234 Joe Stony Brook Mary Stony Brook Tom Stony Brook …………………. Redundancy

4 Redundancy and Other Problems Set valued attributes in the E-R diagram result in multiple rows in corresponding table PersonExample: Person (SSN, Name, Address, Hobbies) Person –A person entity with multiple hobbies yields multiple rows in table Person Hence, the association between Name and Address for the same person is stored redundantly –SSN is key of entity set, but (SSN, Hobby) is key of corresponding relation PersonThe relation Person can’t describe people without hobbies

5 Example SSN Name Address Hobby 1111 Joe 123 Main biking 1111 Joe 123 Main hiking ……………. SSN Name Address Hobby 1111 Joe 123 Main {biking, hiking} ER Model Relational Model Redundancy

6 Anomalies Redundancy leads to anomalies: –Update anomaly: A change in Address must be made in several places –Deletion anomaly: Suppose a person gives up all hobbies. Do we have to: Set Hobby attribute to null? No, since Hobby is part of key Delete the entire row? No, since we lose other information in the row –Insertion anomaly: Hobby value must be supplied for any inserted row since Hobby is part of key

7 Decomposition PersonSolution: use two relations to store Person information –Person1 –Person1 (SSN, Name, Address) –Hobbies –Hobbies (SSN, Hobby) The decomposition is more general: people with hobbies can now be described No update anomalies: –Name and address stored once –A hobby can be separately supplied or deleted

8 Normalization Theory Result of E-R analysis need further refinement Appropriate decomposition can solve problems normalization theoryfunctional dependencies multi-valued dependenciesThe underlying theory is referred to as normalization theory and is based on functional dependencies (and other kinds, like multi-valued dependencies) Functional DependencyFunctional Dependency –Main tool for formally measuring the appropriateness of attribute groupings into relation schema

9 Functional Dependencies functional dependencyDefinition: A functional dependency (FD), denoted by X  Y, between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation instance r of R. The constraint states that, for any two tuples t1 and t2 in r such that t1[X] = t2[X], we must also have t1[Y] = t2[Y] This means that the values of the Y component of a tuple in r depend on the values of the X component. –Key constraint is a special kind of functional dependency: all attributes of relation occur on the right-hand side of the FD: SSN  SSN, Name, Address

10 Functional Dependencies - Example Address  ZipCode –Stony Brook’s ZIP is –“given a value of Address, we know the value of ZipCode” ArtistName  BirthYear –Picasso was born in 1881 Autobrand  Manufacturer, Engine type –Pontiac is built by General Motors with gasoline engine Author, Title  PublDate –Shakespeare’s Hamlet published in 1600

11 Entailment, Closure, Equivalence entailsDefinition: If F is a set of FDs on schema R and f is another FD on R, then F entails f if every instance r of R that satisfies every FD in F also satisfies f –Ex: F is {A  B, B  C} and f is A  C If Streetaddr  Town and Town  Zip then Streetaddr  Zip closureDefinition: The closure of F, denoted F +, is the set of all FDs entailed by F equivalentDefinition: F and G are equivalent if F entails G and G entails F

12 Entailment (cont’d) Satisfaction, entailment, and equivalence are semantic concepts – defined in terms of the actual relations in the “real world”. –They define what these notions are, not how to compute them How to check if F entails f or if F and G are equivalent? –Apply the respective definitions for all possible relations? Bad idea: might be infinite number for infinite domains Even for finite domains, we have to look at relations –Solution: find algorithmic, syntactic ways to compute these notions Important: The syntactic solution must be “correct” with respect to the semantic definitions soundnesscompletenessCorrectness has two aspects: soundness and completeness – see later

13 Armstrong’s Axioms for FDs This is the syntactic way of computing/testing the various properties of FDs Reflexivity: If Y  X then X  Y (trivial FD) –A set of attributes always determines itself –Name, Address  Name Augmentation: If X  Y then X Z  YZ –If Town  Zip then Town, Name  Zip, Name –Also, Augmentation: If X  Y then X Z  Y (reflexivity) Transitivity: If X  Y and Y  Z then X  Z

14 Additional Inference Rules for FDs Decomposition: If X  YZ then X  Z –If SSN  Name, Age then SSN  Age Additive : If X  Y and X  Z then X  YZ Pseudo-transitive : If X  Y and WY  Z then WX  Z –( because of Additive WX  WY )

15 Soundness soundAxioms 1 to 3 are sound: If an FD f: X  Y can be derived from a set of FDs F using the axioms, then f holds in every relation that satisfies every FD in F. Example: Given X  Y and X  Z then –Thus, X  Y Z is satisfied in every relation where both X  Y and X  Z are satisfied Additive ruleTherefore, we have derived the Additive rule for FDs: we can take the union of the RHSs of FDs that have the same LHS X  XY Augmentation by X YX  YZ Augmentation by Y X  YZ Transitivity (XY == YX)

16 Completeness completeAxioms 1 to 3 are complete: If F entails f, then f can be derived from F using the axioms A consequence of completeness is the following (naïve) algorithm to determining if F entails f: –Algorithm –Algorithm: Use the axioms in all possible ways to generate F + (the set of possible FD’s is finite so this can be done) and see if f is in F +

17 Correctness The notions of soundness and completeness link the syntax (Armstrong’s axioms) with semantics (the definitions in terms of relational instances) This is a precise way of saying that the algorithm for entailment based on the axioms is “correct” with respect to the definitions

18 Generating F + F AB  C AB  BCD A  D AB  BD AB  BCDE AB  CDE D  E BCD  BCDE Thus, AB  BD, AB  BCD, AB  BCDE, and AB  CDE are all elements of F + union aug trans aug decomp

19 Attribute Closure Calculating attribute closure leads to a more efficient way of checking entailment attribute closureThe attribute closure of a set of attributes, X, with respect to a set of functional dependencies, F, (denoted X + F ) is the set of all attributes, A, such that X  A –X + F1 is not necessarily the same as X + F2 if F1  F2 Attribute closure and entailment: –Algorithm –Algorithm: Given a set of FDs, F, then X  Y if and only if X + F  Y

20 Example - Computing Attribute Closure F: AB  C A  D D  E AC  B X X F + A {A, D, E} AB {A, B, C, D, E} (Hence AB is a key) B {B} D {D, E} Is AB  E entailed by F? Yes Is D  C entailed by F? No Result: X F + allows us to determine FDs of the form X  Y entailed by F

21 Computation of Attribute Closure X + F closure := X; // since X  X + F repeat old := closure; if there is an FD Y  Z in F such that Y  closure and Z  closure then closure := closure  Z until old = closure – If T  closure then X  T is entailed by F

22 Example: Computation of Attribute Closure AB  C (a) A  D (b) D  E (c) AC  B (d) Problem: Compute the attribute closure of AB with respect to the set of FDs : Initially closure = {AB} Using (a) closure = {ABC} Using (b) closure = {ABCD} Using (c) closure = {ABCDE} Solution:

23 Denormalization Tradeoff: Judiciously introduce redundancy to improve performance of certain queries TranscriptExample: Add attribute Name to Transcript –Join is avoided Transcript –If queries are asked more frequently than Transcript is modified, added redundancy might improve average performance Transcript –But, Transcript is no longer in BCNF since key is (StudId, CrsCode, Semester) and StudId  Name SELECT T.Name, T.Grade Transcript FROM Transcript T WHERE T.CrsCode = ‘CS305’ AND T.Semester = ‘S2002’