The principal problem that we encounter is redundancy, where a fact is repeated in more than one tuple. Most common cause: attempts to group into one relation.

Spring 2011 Instructor: Hassan Khosravi

Design of Relational Database Schemas
The principal problem that we encounter is redundancy, where a fact is repeated in more than one tuple. The most common cause is an attempt to group into one relation both single-valued and multi-valued properties of an object. Now we will tackle the problem of designing good relational schemas.

Anomalies
Redundancy.
– Information may be repeated unnecessarily in several tuples.
– E.g. length and filmType.
Update anomalies.
– We may change information in one tuple but leave it unchanged in other tuples.
– E.g. we could change the length of Star Wars to 125 in the first tuple and forget to do the same in the second and third tuples.
Deletion anomalies.
– If a set of values becomes empty, we may lose other information as a side effect.
– E.g. if we delete Emilio Estevez we will lose all the information about Mighty Ducks.
Relation Movie:
title         | year | length | filmType | studioName | starName
Star Wars     |      |        | color    | Fox        | Carrie Fisher
Star Wars     |      |        | color    | Fox        | Mark Hamill
Star Wars     |      |        | color    | Fox        | Harrison Ford
Mighty Ducks  |      |        | color    | Disney     | Emilio Estevez
Wayne’s World | 1992 | 95     | color    | Paramount  | Dana Carvey
Wayne’s World | 1992 | 95     | color    | Paramount  | Mike Meyers

Decomposing Relations - Example
Movie 1 relation:
title         | year | length | filmType | studioName
Star Wars     |      |        | color    | Fox
Mighty Ducks  |      |        | color    | Disney
Wayne’s World | 1992 | 95     | color    | Paramount
Movie 2 relation:
title         | year | starName
Star Wars     |      | Carrie Fisher
Star Wars     |      | Mark Hamill
Star Wars     |      | Harrison Ford
Mighty Ducks  |      | Emilio Estevez
Wayne’s World | 1992 | Dana Carvey
Wayne’s World | 1992 | Mike Meyers
No true redundancy!
The update anomaly disappeared. If we change the length of a movie, it is done only once.
The deletion anomaly disappeared. If we delete all the stars from Movie 2, we will still have the other info for a movie.

Boyce-Codd Normal Form
The goal of decomposition is to replace a relation by several that do not exhibit anomalies. There is a simple condition under which the anomalies can be guaranteed not to exist. This condition is called Boyce-Codd Normal Form, or BCNF.
A relation R is in BCNF if:
– Whenever there is a nontrivial dependency A1 A2 … An → B1 B2 … Bm for R, it must be the case that {A1, A2, …, An} is a superkey for R.

Boyce-Codd Normal Form - Example
Relation Movie in the previous figure is not in BCNF.
– Consider the FD: title year → length filmType studioName (violating BCNF)
– Unfortunately, the left side of the above dependency is not a superkey. In particular, we know that title and year do not functionally determine starName.
On the other hand, Movie 1 is in BCNF.
– The only key is {title, year}, and
– title year → length filmType studioName holds in the relation.

Decomposition into BCNF
The decomposition strategy is:
– Find a nontrivial FD A1 A2 … An → B1 B2 … Bm that violates BCNF, i.e. A1 A2 … An is not a superkey.
– Decompose the relation schema into two overlapping relation schemas: one is all the attributes involved in the violating dependency, and the other is the left side plus all the other attributes not involved in the dependency.
By repeatedly choosing suitable decompositions, we can break any relation schema into a collection of smaller schemas in BCNF.
The data in the original relation is represented faithfully by the data in the relations that result from the decomposition.
– I.e. we can reconstruct the original relation exactly from the decomposed relations.
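A single decomposition step can be sketched in a few lines of Python. This is only an illustration: the function name `split_on_violation` and the set-based representation of schemas are assumptions, not part of the slides. It is applied to the Movie relation and the violating FD title year → length filmType studioName from the earlier slides.

```python
def split_on_violation(schema, lhs, rhs):
    """Split `schema` on a BCNF-violating FD lhs -> rhs (all arguments are sets).

    Hypothetical helper for illustration only.
    """
    first = lhs | rhs                # all attributes involved in the FD
    second = lhs | (schema - first)  # left side plus the attributes not involved
    return first, second


movie = {"title", "year", "length", "filmType", "studioName", "starName"}
movie1, movie2 = split_on_violation(
    movie,
    lhs={"title", "year"},
    rhs={"length", "filmType", "studioName"},
)
print(sorted(movie1))
print(sorted(movie2))
```

The two schemas overlap exactly on the left side of the FD, which is what allows the original relation to be rebuilt by joining them.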

Boyce-Codd Normal Form - Example
Consider relation schema:
Movies(title, year, studioName, president, presAddr)
and functional dependencies:
title year → studioName
studioName → president
president → presAddr
The last two violate BCNF. Why?
Compute {title, year}+, {studioName}+, and {president}+ and see if you get all the attributes of the relation. If not, you have a BCNF violation and need to break up the relation.
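The closure test described above can be sketched as follows. The representation of an FD as a pair of Python sets and the names `closure` and `violations` are assumptions made for this sketch; the schema and FDs are the ones from the slide.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already covered, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


movies = {"title", "year", "studioName", "president", "presAddr"}
fds = [
    ({"title", "year"}, {"studioName"}),
    ({"studioName"}, {"president"}),
    ({"president"}, {"presAddr"}),
]

# An FD violates BCNF when the closure of its left side is not the whole schema.
violations = [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != movies]
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs), "violates BCNF")
```

Here {title, year}+ contains every attribute, so the first FD is fine, while the closures of {studioName} and {president} both miss title and year.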

Boyce-Codd Normal Form – Example
Let’s decompose starting with: studioName → president
Let’s add to the right-hand side any other attributes in the closure of studioName (optional “rule of thumb”).
1. X = {studioName}; apply studioName → president
2. X = {studioName, president}; apply president → presAddr
3. X = {studioName}+ = {studioName, president, presAddr}

Boyce-Codd Normal Form – Example
From the closure we get: studioName → president presAddr
We decompose the relation schema into the following two schemas:
Movies1(studioName, president, presAddr)
Movies2(title, year, studioName)
The second schema is in BCNF. What about the first schema? The following dependency violates BCNF: president → presAddr
Why is it bad to leave the Movies1 table as is? If many studios share the same president, then we would have redundancy when repeating the presAddr in all those studios.

Boyce-Codd Normal Form – Example
We must decompose Movies1, using the FD: president → presAddr
The resulting relation schemas, both in BCNF, are:
Movies11(president, presAddr)
Movies12(studioName, president)
In general, we must keep applying the decomposition rule as many times as needed, until all our relations are in BCNF.
So, finally we get Movies11, Movies12, and Movies2.
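The repeated application of the decomposition rule can be sketched as a small recursive function. This is a minimal sketch, not the slides' own code: the names `closure` and `bcnf_decompose` are hypothetical, violations are found by brute force over all attribute subsets (exponential, so only suitable for small schemas), and the order in which violations are picked may differ from the walkthrough above.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_decompose(schema, fds):
    """Return a list of BCNF schemas (frozensets) that decompose `schema`."""
    schema = frozenset(schema)
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            lhs = frozenset(combo)
            # Attributes of this schema determined by lhs under the original FDs.
            determined = closure(lhs, fds) & schema
            # lhs determines something new but not everything: a BCNF violation.
            if determined != lhs and determined != schema:
                first = determined
                second = lhs | (schema - determined)
                return bcnf_decompose(first, fds) + bcnf_decompose(second, fds)
    return [schema]


fds = [
    ({"title", "year"}, {"studioName"}),
    ({"studioName"}, {"president"}),
    ({"president"}, {"presAddr"}),
]
movies = {"title", "year", "studioName", "president", "presAddr"}
result = bcnf_decompose(movies, fds)
for schema in result:
    print(sorted(schema))
```

For the Movies schema this produces three schemas: (president, presAddr), (studioName, president), and (title, year, studioName). Splitting on the full closure of the left side corresponds to the optional rule of thumb of extending the right-hand side.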

Finding FDs for the decomposed relations
When we decompose a relation, we need to check that the resulting schemas are in BCNF.
We can’t tell whether a relation is in BCNF unless we can determine the FDs that hold for that relation.

Finding FDs for the decomposed relations
Suppose S is one of the resulting relations in a decomposition of R. To find the FDs that hold in S:
– Consider each subset X of attributes of S.
– Compute X+ using the FDs on R.
– At the end, throw out the attributes of R that aren’t in S.
Then, for each attribute B such that:
– B is an attribute of S, and
– B is in X+,
we have that the functional dependency X → B holds in S (it is nontrivial when B is not in X).

Example: Consider R(A, B, C, D, E) decomposed into S(A, B, C) and another relation. Let the FDs of R be: A → D, B → E, DE → C
First, consider {A}+ = {A, D}. Since D is not in the schema of S, we get no dependency here.
Similarly, {B}+ = {B, E} and {C}+ = {C}, yielding no FDs for S.
Now consider pairs. {A, B}+ = {A, B, C, D, E}. Thus, we deduce AB → C for S.
Neither of the other pairs gives us any FD for S.
Of course, the set of all three attributes of S, {A, B, C}, cannot yield any nontrivial dependencies for S.
Thus, the only dependency we need assert for S is AB → C.
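The projection procedure and the example above can be sketched in Python. The function names `closure` and `project_fds` and the representation of FDs as pairs of sets are assumptions for illustration. The sketch reports every nontrivial X → B with a single attribute on the right, without trying to remove redundant FDs.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project_fds(fds, sub_schema):
    """Nontrivial FDs X -> B that hold in `sub_schema`, as (frozenset X, B) pairs."""
    sub_schema = set(sub_schema)
    projected = []
    # Sizes 1 .. n-1: the empty set and the full set yield nothing useful.
    for size in range(1, len(sub_schema)):
        for combo in combinations(sorted(sub_schema), size):
            lhs = set(combo)
            # Keep only closure attributes that are in S and not already in X.
            for attr in sorted((closure(lhs, fds) & sub_schema) - lhs):
                projected.append((frozenset(lhs), attr))
    return projected


fds_r = [({"A"}, {"D"}), ({"B"}, {"E"}), ({"D", "E"}, {"C"})]
fds_s = project_fds(fds_r, {"A", "B", "C"})
for lhs, attr in fds_s:
    print("".join(sorted(lhs)), "->", attr)
```

Run on the example, the only FD reported for S(A, B, C) is AB → C.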

A Few Tricks
We never need to compute the closure of the empty set or of the set of all attributes.
If we find X+ = all attributes, we don’t need to compute the closure of any superset of X.

Another Example
R(A, B, C) with FD’s A → B and B → C. Project onto S(A, C).
– A+ = ABC; yields A → B, A → C. We do not need to compute AB+ or AC+.
– B+ = BC; yields B → C.
– C+ = C; yields nothing.
– BC+ = BC; yields nothing.
Resulting FD’s: A → B, A → C, and B → C.
Projection onto AC: A → C.
– This is the only FD that involves a subset of {A, C}.
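The tricks and this example can be combined in one sketch that skips any left side containing a set already known to determine every attribute of R. The names `closure` and `project_fds_pruned` are hypothetical, and FDs are again represented as pairs of sets; this is one possible way to apply the shortcut, not a definitive implementation.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def project_fds_pruned(all_attrs, fds, sub_schema):
    """Project `fds` onto `sub_schema`, skipping supersets of known superkeys."""
    all_attrs = set(all_attrs)
    sub = sorted(sub_schema)
    superkeys = []  # left sides whose closure is every attribute of R
    projected = []
    for size in range(1, len(sub)):  # never the empty set or the full set
        for combo in combinations(sub, size):
            lhs = frozenset(combo)
            if any(key <= lhs for key in superkeys):
                continue  # its FDs already follow from the smaller superkey
            closed = closure(lhs, fds)
            if closed == all_attrs:
                superkeys.append(lhs)
            for attr in sorted((closed & set(sub)) - lhs):
                projected.append((lhs, attr))
    return projected


fds_r = [({"A"}, {"B"}), ({"B"}, {"C"})]
onto_ac = project_fds_pruned({"A", "B", "C"}, fds_r, {"A", "C"})
onto_abc = project_fds_pruned({"A", "B", "C"}, fds_r, {"A", "B", "C"})
print(onto_ac)
print(onto_abc)
```

Projecting onto {A, C} gives only A → C. Projecting onto all of {A, B, C} gives A → B, A → C, and B → C, and the closures of AB and AC are never computed because A alone already determines everything.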