Your name here. Improving Schemas and Normalization What are redundancies and anomalies? What are functional dependencies and how are they related to.

Presentation transcript:

Improving Schemas and Normalization
What are redundancies and anomalies?
What are functional dependencies and how are they related to schema quality?
What is a superkey?
What is an inference rule and how can we infer functional dependencies?
How are keys determined by functional dependencies?
How can we modify a schema to improve it?
What are normal forms and why are they important?

Redundancy and Anomalies in Relation Schemas
Anomalies occur when data is inconsistent
Redundancy of values is the source of anomalies
An update anomaly occurs when values become inconsistent
  – for example, if title, genre, length or rating is changed in only one or two of the green rows

Redundancy and Anomalies in Relation Schemas
Anomalies occur when data is inconsistent
Redundancy of values is the source of anomalies
A deletion anomaly is caused by deleting the row with videoId 1243 (pink)
  – Information about the movie is deleted along with the video
An insertion anomaly is caused by the last row (blue)
  – Length and rating are inconsistent with the other rows

Functional Dependencies Between Attributes
A functional dependency is a strong connection between two or more attributes in a table.
  – One attribute is functionally dependent on another attribute when any two rows of the table that have the same value of the second attribute must have the same value for the first
Example: movieId determines title, genre, length, rating
  – Each row with movieId 123 has the same values for the other attributes
  – FD2: movieId → {title, genre, length, rating}
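
The definition above can be tested against sample rows. The following is a minimal sketch, not part of the original slides: the function name fd_holds and the sample values are made up for illustration. Note that a table instance can only refute a dependency; whether an FD really holds is decided by the designer from the meaning of the data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, different rhs values
    return True


# Hypothetical sample rows (values invented for this sketch).
videos = [
    {"videoId": 101, "movieId": 123, "title": "Annie Hall", "genre": "comedy"},
    {"videoId": 102, "movieId": 123, "title": "Annie Hall", "genre": "comedy"},
    {"videoId": 103, "movieId": 450, "title": "Elizabeth", "genre": "drama"},
]
consistent = fd_holds(videos, ["movieId"], ["title", "genre"])

# A row that gives movieId 123 a different genre contradicts the dependency.
bad_row = {"videoId": 104, "movieId": 123, "title": "Annie Hall", "genre": "drama"}
inconsistent = fd_holds(videos + [bad_row], ["movieId"], ["title", "genre"])
```

The second call models the kind of inconsistent row described on the anomaly slides: once it is present, movieId no longer determines genre in the stored data.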

City, State, Zipcode Dependencies
FD4: zipcode → {city, state}
FD5: {street, city, state} → zipcode

Superkeys and Keys
A key constraint is a functional dependency
Example: accountId is the key of Customer
  – FD6: accountId → {lastName, firstName, street, city, state, zipcode}
A superkey is a set of attributes that determines the rest of the attributes of a schema
  – FD7: {accountId, lastName} → {firstName, street, city, state, zipcode}

Using Functional Dependencies
Functional dependencies are used for
  – Determining keys
  – Finding sources of redundancy and hence trouble
Functional dependencies are declared
  – The designer defines FDs based on the semantics of the schemas
  – Additional dependencies can be found from those that are declared
Keys and redundancies are based on the full set of FDs
  – All declared FDs
  – FDs inferred by applying inference rules

Inferring Additional Functional Dependencies
Main inference rules
  – Rule 1: Reflexivity, a set of attributes X determines any subset Y of itself: if X ⊇ Y, then X → Y.
  – Rule 2: Augmentation, a set of attributes Z can be added to both sides of X → Y: if X → Y, then XZ → YZ.
  – Rule 3: Transitivity, we can follow chains of dependencies from X to Y to Z: if X → Y and Y → Z, then X → Z.
Additional rules for convenience
  – Rule 4: Decomposition, we can remove a set of attributes Z from the right side of X → YZ: if X → YZ, then X → Y.
  – Rule 5: Union, we can put two dependencies X → Y and X → Z together if they have the same left side X: if X → Y and X → Z, then X → YZ.
  – Rule 6: Pseudo-transitivity, a combination of augmentation by adding W to both sides of X → Y and transitivity in going from WX to WY to Z: if X → Y and WY → Z, then WX → Z.
Apply the rules to FDs to find new FDs
  – The closure is the set of all FDs that can be inferred

Example of Inference
Consider how to infer FD7 from FD6
  – FD6: accountId → {lastName, firstName, street, city, state, zipcode}
  – FD7: {accountId, lastName} → {firstName, street, city, state, zipcode}
Infer FD8 with augmentation: add lastName to both sides (it is already on the right side)
  – FD8: {accountId, lastName} → {firstName, street, city, state, zipcode, lastName}
Use decomposition: remove lastName from the right side
  – FD7: {accountId, lastName} → {firstName, street, city, state, zipcode}
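
Applying the inference rules by hand gets tedious. A common shortcut, sketched below, is to compute the closure of a set of attributes (every attribute it determines) rather than the closure of the FD set itself; X → Y can be inferred exactly when Y is contained in the attribute closure of X. This is a swapped-in technique, not something the slides present, and the names attribute_closure and implies are chosen for this sketch.

```python
def attribute_closure(attrs, fds):
    """Return every attribute determined by attrs under fds.

    fds is a list of (left side, right side) pairs of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """Return True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= attribute_closure(lhs, fds)


# FD6 from the slides.
fd6 = (["accountId"],
       ["lastName", "firstName", "street", "city", "state", "zipcode"])

# FD7 follows from FD6; lastName alone determines nothing else.
fd7_inferred = implies([fd6], ["accountId", "lastName"],
                       ["firstName", "street", "city", "state", "zipcode"])
reverse_inferred = implies([fd6], ["lastName"], ["accountId"])
```

The loop stops as soon as a full pass adds no attribute, so it always terminates.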

Determining Keys from Functional Dependencies
Start with the closure of the functional dependencies
Any functional dependency that includes all attributes has a superkey as its left side
If no proper subset of the left side is a superkey
  – The left side is a key
A set of attributes is a key if and only if the above holds
Some terminology
  – A key is a minimal set of attributes that determines all other attributes
  – A key attribute is an attribute that is part of a key
  – A non-key attribute is an attribute that is not part of any key
  – The primary key is the one key that has been selected to identify the objects of the schema
  – A secondary key is a key that is not the primary key
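
As an illustration of the procedure above, the sketch below searches for keys by brute force: it tries attribute sets from smallest to largest and keeps each one that determines every attribute and contains no smaller key. The function names are assumptions made for this example, and the search is exponential in the number of attributes, so it only suits small schemas.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets that determine every attribute."""
    attributes = sorted(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key: a superkey, but not minimal
            if attribute_closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


# The address schema with FD4 and FD5 from the slides.
address_fds = [
    (["zipcode"], ["city", "state"]),            # FD4
    (["street", "city", "state"], ["zipcode"]),  # FD5
]
keys = candidate_keys(["street", "city", "state", "zipcode"], address_fds)
# keys holds {street, zipcode} and {street, city, state}
```

Every attribute of this schema belongs to one of the two keys, so the schema has no non-key attributes.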

Normalization
Normalization is the process of transforming some objects into a structural form that satisfies some collection of rules
Any schema that is in normal form is guaranteed to have certain quality characteristics
Each normal form has a rule that describes what kinds of functional dependencies the normal form allows.
  – Normalization is the process of transforming schemas in order to remove violations of the normal form rules.
  – Normalization is applied independently to each relation schema in a database schema.
  – A database schema is said to be in normal form if each of its relation schemas is in the normal form.

Third Normal Form
A relation schema is in third normal form (3NF) if for every functional dependency
  – The left side (determinant) is a superkey or
  – The right side attributes are all key attributes
A functional dependency is a 3NF violation if
  – The left side is not a superkey and
  – The right side includes at least one non-key attribute
Consider the schema and FDs
  – VideoMovie: (videoId, dateAcquired, movieId, title, genre, length, rating)
  – FD1: movieId → title
  – FD2: movieId → {title, genre, length, rating}
  – FD9: videoId → {dateAcquired, movieId}
  – FD10: videoId → movieId
  – FD11: videoId → {title, genre, length, rating}
  – FD12: videoId → {dateAcquired, movieId, title, genre, length, rating}
FD1 and FD2 are 3NF violations
FD9, FD10, FD11 and FD12 are not 3NF violations because videoId (the left side) is a key
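
The 3NF test can be sketched in code. The example below treats an FD as a violation when its left side is not a superkey and its right side contains at least one non-key attribute that is not already on the left side. The keys are passed in as given rather than computed, and the function names are assumptions made for this sketch.

```python
def attribute_closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_3nf_violation(fd, attributes, fds, keys):
    """Return True if fd violates 3NF for the given schema, FDs and keys."""
    lhs, rhs = set(fd[0]), set(fd[1])
    key_attributes = set().union(*keys)  # attributes that belong to some key
    is_superkey = attribute_closure(lhs, fds) == set(attributes)
    dependent = rhs - lhs                # ignore attributes repeated from the left side
    return not is_superkey and bool(dependent - key_attributes)


video_movie = ["videoId", "dateAcquired", "movieId",
               "title", "genre", "length", "rating"]
fd1 = (["movieId"], ["title"])
fd2 = (["movieId"], ["title", "genre", "length", "rating"])
fd9 = (["videoId"], ["dateAcquired", "movieId"])
fd11 = (["videoId"], ["title", "genre", "length", "rating"])
declared = [fd2, fd9]        # FD1 and FD11 can be inferred from these two
keys = [{"videoId"}]         # the only key of VideoMovie

violations = [fd for fd in (fd1, fd2, fd9, fd11)
              if is_3nf_violation(fd, video_movie, declared, keys)]
# violations contains FD1 and FD2 only
```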

Decomposition
Remove violations by decomposition
  – Create a new schema from the FD
  – Remove the right-hand attributes of the FD from the original schema
  – The left side of the FD becomes a foreign key in the original schema
Consider the schema and 3NF violations
  – VideoMovie: (videoId, dateAcquired, movieId, title, genre, length, rating)
  – FD1: movieId → title
  – FD2: movieId → {title, genre, length, rating}
Can decompose by either FD1 or FD2
  – Better to use the larger FD
New schemas
  – Video: (videoId, dateAcquired, movieId references Movie)
  – Movie: (movieId, title, genre, length, rating)
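
The three decomposition steps can be written as a small function. This is a sketch that treats a schema as a plain list of attribute names; the function name decompose is an assumption, and the foreign key is only implied by the left-side attributes staying in the original schema.

```python
def decompose(schema, fd):
    """Split schema by fd and return (remaining schema, new schema)."""
    lhs, rhs = fd
    moved = [a for a in rhs if a not in lhs]
    # The new schema holds the whole FD; its left side is the key.
    new_schema = list(lhs) + moved
    # The right-side attributes leave the original schema; the left side stays
    # behind and acts as a foreign key to the new schema.
    remaining = [a for a in schema if a not in moved]
    return remaining, new_schema


video_movie = ["videoId", "dateAcquired", "movieId",
               "title", "genre", "length", "rating"]
fd2 = (["movieId"], ["title", "genre", "length", "rating"])

video, movie = decompose(video_movie, fd2)
# video: videoId, dateAcquired, movieId
# movie: movieId, title, genre, length, rating
```

Decomposing by FD1 instead would move only title, leaving genre, length and rating behind as a remaining violation, which is why the larger FD is the better choice.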

First and Second Normal Form
The traditional presentation of 3NF includes two other normal forms: 1NF and 2NF
  – E.F. Codd (1970) defined several normal forms
  – Subsequent analysis simplified the definition of 3NF
1NF specifies that every attribute must be single valued
  – 1NF has been incorporated into the definition of the relational model
2NF makes a technical distinction about why an FD is a violation
  – The goal of normalization is to achieve 3NF
  – 2NF is an intermediate step and never a goal of normalization

Boyce-Codd Normal Form
A schema is in BCNF if every functional dependency has a superkey as its determinant
  – No exclusion for key attributes on the right side
Important in the context of multi-attribute keys
Consider the example of schema and FD
  – R6: (street, city, state, zipcode, secondary key {street, zipcode})
  – FD4: zipcode → {city, state}
FD4 is a BCNF violation even though city and state are key attributes
  – FD4 is not a 3NF violation
Decomposition of R6 by FD4 into R7 and R8 in BCNF
  – R7: (street, zipcode references R8)
  – R8: (zipcode, city, state)
Note that the schemas have one key each.
  – Multiple keys have been removed by decomposition
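
The BCNF rule is simpler to check than 3NF because it needs no knowledge of key attributes: every FD whose right side is not already contained in its left side must have a superkey on the left. A minimal sketch, with function names assumed for this example:

```python
def attribute_closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def bcnf_violations(attributes, fds):
    """Return the FDs in fds whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                          # skip trivial FDs
        and attribute_closure(lhs, fds) != set(attributes)   # left side is not a superkey
    ]


fd4 = (["zipcode"], ["city", "state"])
fd5 = (["street", "city", "state"], ["zipcode"])

# R6 before decomposition: FD4 is the only violation.
r6_violations = bcnf_violations(["street", "city", "state", "zipcode"], [fd4, fd5])

# R8 after decomposition: zipcode is now the key, so FD4 is allowed.
r8_violations = bcnf_violations(["zipcode", "city", "state"], [fd4])
```

R7 contains only street and zipcode, and neither FD4 nor FD5 fits inside it, so it has nothing to check.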

Case in Point: Normalizing a Car Registration Schema
Example from text
  – Illustrates the way that normalization can be a source of schema definition
Process of design
  – Define relevant FDs
  – Apply inference rules
  – Normalize
  – Rename resulting schemas