CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.


ER model vs. Relational model. [Figure: an ER diagram (Peter Chen) with entity sets E1 and E2, relationship R, and attributes A1 to A6, next to the corresponding relations (Ted Codd): E1(A1, A2, A3), R(A1, A4, A5), E2(A5, A6).]

ER model vs. Relational model. [Figure: the same ER diagram, now with the relations E1(A1, A2, A3, A4, A5) and E2(A5, A6); the figure is labeled "determine".]

Reminder: redundancy causes trouble.
Update anomaly = update one copy of Fred's SSN but not the other.
Deletion anomaly = delete all of Fred's phones, lose his SSN as a side effect.
Potential inconsistency in the table (columns SSN, Name, Phone Number): Fred (201), Fred (206), Joe (908), Joe (212), Jocelyn (212).

Non-solution: multiple values in one field.
First Normal Form: only one value in each field.
Table (columns SSN, Name, Phone Number): Fred (201) (206); Joe (908) (212).

Your common sense will tell you how to fix this schema.
Table 1 (columns SSN, Name): Fred; Joe.
Table 2 (columns SSN, Phone Number): (201), (206), (908), (212).
No more update or delete anomalies.

What if you don't have common sense? There is a theory to tell you what to do! It normalizes the original schema R into a transformed schema R*, and it formalizes the concept of redundancy.
R* must:
Preserve the information of R.
Have minimal redundancy.
"Preserve dependencies": make it easy to check R's constraints.
Give good query performance (but the theory won't help you there).

Functional Dependencies

Functional dependencies generalize the idea of a key.
If two tuples agree on the attributes A1, …, An, then they must also agree on the attributes B1, …, Bm.
Equivalently: A1, …, An functionally determine B1, …, Bm, written A1, …, An → B1, …, Bm.

Example instance:
EmpID   Name    Phone   Office
E0045   Alice   9876    SC 2119
E1847   Bob     9876    SC 2119A
E1111   Carla   9876    SC 2119A
E9999   David   1234    DCL 1320
EmpID → Name, Phone, Office
Office → Phone
Phone → Office does not hold in this instance (phone 9876 appears with both SC 2119 and SC 2119A).
Name → EmpID isn't likely to hold in all instances of this schema, though it holds in this instance.
More generally, an instance can tell you many FDs that don't hold, but not all those that do.
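
The check described above (do any two tuples agree on the left-hand side but disagree on the right-hand side?) can be sketched in a few lines of Python. This is only an illustration: the function name `fd_holds` and the list-of-dicts representation are assumptions, and the rows are the employee instance from this slide.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this particular instance."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight, otherwise returns the stored value
        if seen.setdefault(key, val) != val:
            return False  # two tuples agree on lhs but differ on rhs
    return True

employees = [
    {"EmpID": "E0045", "Name": "Alice", "Phone": "9876", "Office": "SC 2119"},
    {"EmpID": "E1847", "Name": "Bob",   "Phone": "9876", "Office": "SC 2119A"},
    {"EmpID": "E1111", "Name": "Carla", "Phone": "9876", "Office": "SC 2119A"},
    {"EmpID": "E9999", "Name": "David", "Phone": "1234", "Office": "DCL 1320"},
]

print(fd_holds(employees, ["Office"], ["Phone"]))  # True
print(fd_holds(employees, ["Phone"], ["Office"]))  # False
print(fd_holds(employees, ["Name"], ["EmpID"]))    # True, but only for this instance
```

A `True` result says only that the instance does not violate the FD; as the slide notes, it cannot establish that the FD holds for the schema in general.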

Use your common sense to find the FDs in the world around you.
Product: name → price, manufacturer
Person: ssn → name, age
Company: name → stock price, president
School: student, course, semester → grade

We can define keys in terms of FDs.
A key of a relation R is a set of attributes that
1. functionally determines all attributes of R, and
2. has no proper subset with this property.
A superkey is a set of attributes that contains a key.

Reasoning with FDs 1) Closure of a set of FDs 2) Closure of a set of attributes

The closure S+ of a set S of FDs is the set of all FDs logically implied by S.
Example: R = {A, B, C, G, H, I}, S = {A → B, A → C, CG → H, CG → I, B → H}.
Does A → H hold? You can prove whether it does!

Compute the closure S+ of S using Armstrong's Axioms.
1. Reflexivity: A1 ... An → every subset of A1 ... An.
2. Augmentation: if A1 ... An → B1 ... Bm, then A1 ... An C1 ... Ck → B1 ... Bm C1 ... Ck.
3. Transitivity: if A1 ... An → B1 ... Bm and B1 ... Bm → C1 ... Ck, then A1 ... An → C1 ... Ck.
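
As a worked illustration of the axioms, here is a short derivation using the example set S = {A → B, A → C, CG → H, CG → I, B → H} from the closure slide. It answers the question whether A → H holds, and also derives AG → I, an FD chosen here only as a second example.

```latex
\begin{align*}
A \to B,\; B \to H &\;\Rightarrow\; A \to H   && \text{(transitivity)} \\
A \to C            &\;\Rightarrow\; AG \to CG && \text{(augmentation with } G\text{)} \\
AG \to CG,\; CG \to I &\;\Rightarrow\; AG \to I && \text{(transitivity)}
\end{align*}
```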

How to compute S+ using Armstrong's Axioms:
S+ = S;
loop {
  For each FD f in S+, apply the reflexivity and augmentation rules and add the new FDs to S+.
  For each pair of FDs in S+, apply the transitivity rule and add the new FDs to S+.
} until S+ does not change any more.

You can infer additional rules from Armstrong's Axioms (X, Y, Z, U are sets of attributes).
Union: if X → Y and X → Z, then X → YZ.
Splitting: if X → YZ, then X → Y and X → Z.
Combining: if X → Y and X → Z, then X → YZ.
Pseudo-transitivity: if X → Y and YZ → U, then XZ → U.

The closure of a set of attributes contains everything they functionally determine.
Given a set S of dependencies, the closure of a set of attributes {A1 ... An}, written {A1 ... An}+, is { B such that any relation that satisfies S also satisfies A1 ... An → B }.

It is easy to compute the closure of a set of attributes.
Start with X = {A1 ... An}.
Repeat until X doesn't change:
  if B1 ... Bm → C is in S, and B1 ... Bm are all in X, and C is not in X, then add C to X.

Example. Given the FDs
AB → C
AD → E
B → D
AF → B
the closures are:
{A, B}+ = {A, B, C, D, E}
{A, F}+ = {A, F, B, D, C, E}
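
The attribute-closure procedure can be sketched in Python as follows. This is a minimal sketch: the function name `closure` is illustrative, and attribute names are assumed to be single characters so that a string such as "AB" can stand for the set {A, B}. The FDs are the ones from the example above.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, rhs in fds:
            # fire the FD if its whole left side is already in the closure
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("AB", "C"), ("AD", "E"), ("B", "D"), ("AF", "B")]

print(sorted(closure("AB", fds)))  # ['A', 'B', 'C', 'D', 'E']
print(sorted(closure("AF", fds)))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Each pass either adds at least one attribute or terminates the loop, so the number of passes is bounded by the number of attributes.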

What is the attribute closure good for?
1. Test if X is a superkey: compute X+, and check if X+ contains all attributes of R.
2. Check if X → Y holds: check if Y is contained in X+.
3. Another (not so clever) way to compute the closure S+ of FDs: for each subset of attributes X in relation R, compute X+ with respect to S; then, for each subset of attributes Y in X+, output the FD X → Y.
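
The first two uses can be sketched on top of the closure computation. The helper names (`is_superkey`, `implies`, `candidate_keys`) are hypothetical, attributes are again assumed to be single characters, and the brute-force key search is exponential in the number of attributes, so it is meant for small classroom examples only. The relation and FDs are the R and S from the closure-of-FDs slide.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure, as in the procedure described in these slides."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """X is a superkey if X+ contains every attribute of R."""
    return closure(attrs, fds) >= set(all_attrs)

def implies(fds, lhs, rhs):
    """X -> Y follows from the FDs exactly when Y is contained in X+."""
    return set(rhs) <= closure(lhs, fds)

def candidate_keys(all_attrs, fds):
    """Enumerate minimal superkeys by trying subsets in order of size."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(candidate, all_attrs, fds):
                keys.append(candidate)
    return keys

R = "ABCGHI"
S = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(implies(S, "A", "H"))                      # True
print(is_superkey("AG", R, S))                   # True
print([sorted(k) for k in candidate_keys(R, S)]) # [['A', 'G']]
```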

Reminder: intended goals of schema refinement.
Minimize redundancy.
Avoid information loss.
Make dependencies easy to check.
Ensure good query performance.

Normal Forms.
First Normal Form = all attributes are atomic.
Second Normal Form (2NF) = obsolete.
Boyce-Codd Normal Form (BCNF).
Third Normal Form (3NF).
Fourth Normal Form (4NF).
Others...

Boyce-Codd Normal Form.
A relation R is in BCNF if whenever there is a nontrivial FD A1 ... An → B for R, {A1 ... An} is a superkey for R.
An FD is trivial if all the attributes on its right-hand side are also on its left-hand side.
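
The BCNF condition can be tested mechanically with attribute closures. The sketch below uses a hypothetical helper `bcnf_violations` and tries it on the SSN / Name / Phone relation used in these slides. It checks only the FDs it is given, which suffices for the relation those FDs are stated over; for a relation produced by a decomposition, the FDs would first have to be projected onto its attributes.

```python
def closure(attrs, fds):
    """Attribute closure under a list of (lhs, rhs) pairs of attribute tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the nontrivial FDs whose left-hand side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        lhs_is_superkey = closure(lhs, fds) >= set(all_attrs)
        if nontrivial and not lhs_is_superkey:
            violations.append((lhs, rhs))
    return violations

R = ("SSN", "Name", "Phone")
ssn_name = (("SSN",), ("Name",))
phone_ssn = (("Phone",), ("SSN",))

print(bcnf_violations(R, [ssn_name]))             # [(('SSN',), ('Name',))]
print(bcnf_violations(R, [phone_ssn, ssn_name]))  # [(('SSN',), ('Name',))]
print(bcnf_violations(("SSN", "Name"), [ssn_name]))  # []
```

An empty result means no listed FD violates BCNF; a non-empty result names the FDs to decompose on.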

Example table (columns SSN, Name, Phone Number): Fred (201), Fred (206), Joe (908), Joe (212), Jocelyn (212).
What are the nontrivial functional dependencies? SSN → Name (plus the FDs that can be derived from that).
What are the keys? The only key is {SSN, Phone Number}. How do I know? Augmentation + minimality.
Is it in BCNF? No: SSN → Name is nontrivial, and SSN is not a superkey.

What if we are in a situation where Phone Number → SSN?
Example table (columns SSN, Name, Phone Number): Fred (201), Fred (206), Joe (908), Joe (212), Jocelyn (212).
What are the nontrivial FDs? Phone Number → SSN and SSN → Name (plus FDs derived from these).
What are the keys? Only {Phone Number}. How do I know? Augmentation, transitivity, minimality.
Is it in BCNF? No: SSN → Name is nontrivial, and SSN is still not a superkey.

What about the alternative schema we recommended earlier: are its relations in BCNF?
Table 1 (columns SSN, Name): Fred; Joe.
Table 2 (columns SSN, Phone Number): (201), (206), (908), (212).
For each relation: What are its important FDs? What are its keys? Is it in BCNF?

What about the alternative schema we recommended earlier: are its relations in BCNF?
Table (columns SSN, Phone Number): (201), (206), (908), (212).
If Phone Number → SSN holds: Important FDs: Phone Number → SSN. Keys: {Phone Number}. Is it in BCNF? Yes.
If Phone Number → SSN doesn't hold: Important FDs: none. Keys: {SSN, Phone Number}. Is it in BCNF? Yes.
Table (columns SSN, Name): Fred; Joe.
Important FDs: SSN → Name. Keys: {SSN}. Is it in BCNF? Yes.

What about the alternative schema we recommended earlier: are its relations in BCNF?
True or False: Any 2-attribute relation is in BCNF.
Table 1 (columns SSN, Name): Fred; Joe.
Table 2 (columns SSN, Phone Number): (201), (206), (908), (212).