Lecture 07: E/R Diagrams and Functional Dependencies

Slides:



Advertisements
Similar presentations
Schema Refinement: Normal Forms
Advertisements

Lecture 6: Design Constraints and Functional Dependencies January 21st, 2004.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
Functional Dependencies Definition: If two tuples agree on the attributes A, A, … A 12n then they must also agree on the attributes B, B, … B 12m Formally:
1 Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design April 16 & 18, 2008.
Lecture #3 Functional Dependencies Normalization Relational Algebra Thursday, October 12, 2000.
Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science San Jose State University.
1 Lecture 06 The Relational Data Model. 2 Outline Relational Data Model Functional Dependencies FDs in ER Logical Schema Design Reading Chapter 8.
M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #7 Matthew P. Johnson Stern School of Business, NYU Spring,
Boyce-Codd NF & Lossless Decomposition Professor Sin-Min Lee.
Functional Dependencies CS 186, Spring 2006, Lecture 21 R&G Chapter 19 Science is the knowledge of consequences, and dependence of one fact upon another.
Functional Dependencies and Relational Schema Design.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
Lecture 09: Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
Functional Dependencies. FarkasCSCE 5202 Reading and Exercises Database Systems- The Complete Book: Chapter 3.1, 3.2, 3.3., 3.4 Following lecture slides.
Functional Dependencies. Outline Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6)
1 Lecture 7: Normal Forms, Relational Algebra Monday, 10/15/2001.
Functional Dependencies R&G Chapter 19 Science is the knowledge of consequences, and dependence of one fact upon another. Thomas Hobbes ( )
1 Lecture 10: Database Design Wednesday, January 26, 2005.
Functional Dependencies and Relational Schema Design.
1 Lecture 08: E/R Diagrams and Functional Dependencies Friday, January 21, 2005.
CS542 1 Schema Refinement Chapter 19 (part 1) Functional Dependencies.
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
COMP 430 Intro. to Database Systems Normal Forms Slides use ideas from Chris Ré.
© D. Wong Functional Dependencies (FD)  Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X  Y means X functionally.
1 Lecture 9: Database Design Wednesday, January 25, 2006.
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
Formal definition of a key A key is a set of attributes A 1,..., A n such that for any other attribute B: A 1,..., A n  B A minimal key is a set of attributes.
1 Normalization Theory. 2 Limitations of E-R Designs Provides a set of guidelines, does not result in a unique database schema Does not provide a way.
Lecture 11: Functional Dependencies
CS422 Principles of Database Systems Normalization
Schedule Today: Next After that Normal Forms. Section 3.6.
COP4710 Database Systems Relational Design.
Normalization First Normal Form (1NF) Boyce-Codd Normal Form (BCNF)
CS422 Principles of Database Systems Normalization
3.1 Functional Dependencies
Handout 4 Functional Dependencies
Schema Refinement & Normalization Theory
Functional Dependencies
Problems in Designing Schema
Introduction to Database Systems CSE 544
Cse 344 May 18th – Loss and views.
Functional Dependencies and Normalization for Relational Databases
Lecture 6: Design Theory
Normalization Murali Mani.
Functional Dependencies and Normalization
Lecture 2: Database Modeling (end) The Relational Data Model
Cse 344 May 16th – Normalization.
Functional Dependencies and Relational Schema Design
Functional Dependencies
Introduction to Database Systems CSE 444 Lectures 8 & 9 Database Design October 12 & 15, 2007.
Lecture 09: Functional Dependencies, Database Design
Lecture 8: Database Design
Lectures 12: Design Theory I
Functional Dependencies
Normalization cs3431.
CSE544 Data Modeling, Conceptual Design
Lectures 13: Design Theory II
CS4433 Database Systems Relational Design.
Functional Dependencies
Chapter 19 (part 1) Functional Dependencies
Terminology Product Attribute names Name Price Category Manufacturer
Lecture 08: E/R Diagrams and Functional Dependencies
Lecture 6: Functional Dependencies
Lecture 11: Functional Dependencies
Chapter 3: Design theory for relational Databases
Chapter 7a: Overview of Database Design -- Normalization
Lecture 09: Functional Dependencies
CS4222 Principles of Database System
Presentation transcript:

Lecture 07: E/R Diagrams and Functional Dependencies Wednesday, October 12, 2005

Outline Finish E/R diagrams (Chapter 2) The relational data model: 3.1 And E/R diagrams to relations (3.2, 3.3) The relational data model: 3.1 Functional dependencies: 3.4

Relational Schema Design Person buys Product name price ssn Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies

Data Anomalies When a database is poorly designed we get anomalies: Redundancy: data is repeated Updated anomalies: need to change in several places Delete anomalies: may lose data when we don’t want

Relational Schema Design Recall set attributes (persons with several phones): Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234 Seattle 206-555-6543 Joe 987-65-4321 908-555-2121 Westfield One person may have multiple phones, but lives in only one city Anomalies: Redundancy = repeat data Update anomalies = Fred moves to “Bellevue” Deletion anomalies = Joe deletes his phone number: what is his city ?

Relation Decomposition Break the relation into two: Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234 Seattle 206-555-6543 Joe 987-65-4321 908-555-2121 Westfield Name SSN City Fred 123-45-6789 Seattle Joe 987-65-4321 Westfield SSN PhoneNumber 123-45-6789 206-555-1234 206-555-6543 987-65-4321 908-555-2121 Anomalies have gone: No more repeated data Easy to move Fred to “Bellevue” (how ?) Easy to delete all Joe’s phone number (how ?)

Relational Schema Design (or Logical Design) Main idea: Start with some relational schema Find out its functional dependencies Use them to design a better relational schema

Functional Dependencies A form of constraint hence, part of the schema Finding them is part of the database design Also used in normalizing the relations

Functional Dependencies Definition: If two tuples agree on the attributes A1, A2, …, An then they must also agree on the attributes B1, B2, …, Bm Formally: A1, A2, …, An  B1, B2, …, Bm

When Does an FD Hold Definition: A1, ..., Am  B1, ..., Bn holds in R if: t, t’  R, (t.A1=t’.A1  ...  t.Am=t’.Am  t.B1=t’.B1  ...  t.Bn=t’.Bn ) R A1 ... Am B1 Bm t if t, t’ agree here then t, t’ agree here t’

Examples EmpID Name Phone Position E0045 Smith 1234 Clerk E3542 Mike An FD holds, or does not hold on an instance: EmpID Name Phone Position E0045 Smith 1234 Clerk E3542 Mike 9876 Salesrep E1111 E9999 Mary Lawyer EmpID  Name, Phone, Position Position  Phone but not Phone  Position

Example EmpID Name Phone Position E0045 Smith 1234 Clerk E3542 Mike 9876  Salesrep E1111 E9999 Mary Lawyer Position  Phone

Example EmpID Name Phone Position E0045 Smith 1234  Clerk E3542 Mike 1234  Clerk E3542 Mike 9876 Salesrep E1111 E9999 Mary Lawyer but not Phone  Position

Typical Examples of FDs Product: name  price, manufacturer Person: ssn  name, age zip  city city, state  zip

Example FD’s are constraints: On some instances they hold name  color On others they don’t name  color category  department color, category  price name category color department price Gizmo Gadget Green Toys 49 Tweaker 99 No: color,category  price doesn’t hold Does this instance satisfy all the FDs ?

Example name  color category  department color, category  price Gizmo Gadget Green Toys 49 Tweaker Black 99 Stationary Office-supp. 59 yes What about this one ?

An Interesting Observation name  color category  department color, category  price If all these FDs are true: Then this FD also holds: name, category  price Why ??

Goal: Find ALL Functional Dependencies Anomalies occur when certain “bad” FDs hold We know some of the FDs Need to find all FDs, then look for the bad ones

Inference Rules for FD’s A1, A2, …, An  B1, B2, …, Bm Splitting rule and Combing rule Is equivalent to A1 ... Am B1 Bm A1, A2, …, An  B1 A1, A2, …, An  B2 . . . . . A1, A2, …, An  Bm

Inference Rules for FD’s (continued) Trivial Rule A1, A2, …, An  Ai where i = 1, 2, ..., n A1 … Am Why ?

Inference Rules for FD’s (continued) Transitive Closure Rule If A1, A2, …, An  B1, B2, …, Bm and B1, B2, …, Bm  C1, C2, …, Cp then A1, A2, …, An  C1, C2, …, Cp Why ?

A1 … Am B1 Bm C1 ... Cp

Example (continued) Which Rule did we apply ? Inferred FD 1. name  color 2. category  department 3. color, category  price Start from the following FDs: Infer the following FDs: Inferred FD Which Rule did we apply ? 4. name, category  name 5. name, category  color 6. name, category  category 7. name, category  color, category 8. name, category  price

Example (continued) 1. name  color 2. category  department 3. color, category  price Answers: Inferred FD Which Rule did we apply ? 4. name, category  name Trivial rule 5. name, category  color Transitivity on 4, 1 6. name, category  category 7. name, category  color, category Split/combine on 5, 6 8. name, category  price Transitivity on 3, 7 THIS IS TOO HARD ! Let’s see an easier way.

Inference Rules for FDs The three simple rules are all we need to derive all possible FDs Called “Armstrong Rules” However, they are clumsy to use in practice Better: use “closure” of a set of attributes (next)

Closure of a set of Attributes Given a set of attributes A1, …, An The closure, {A1, …, An}+ , is the set of attributes B s.t. A1, …, An  B Example: name  color category  department color, category  price Closures: name+ = {name, color} {name, category}+ = {name, category, color, department, price} color+ = {color}

Closure Algorithm Start with X={A1, …, An}. Repeat until X doesn’t change do: if B1, …, Bn  C is a FD and B1, …, Bn are all in X then add C to X. Example: name  color category  department color, category  price {name, category}+ = { } name, category, color, department, price Hence: name, category  color, department, price

Example In class: A, B  C R(A,B,C,D,E,F) A, D  E B  D A, F  B Compute {A,B}+ X = {A, B, } Compute {A, F}+ X = {A, F, }

Why Do We Need Closure With closure we can find all FD’s easily To check if X ® A Compute X+ Check if A Î X+

Using Closure to Infer ALL FDs Example: A, B  C A, D  B B  D Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ = ABCD, AC+ = AC, AD+ = ABCD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?) BCD+ = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X  Y, s.t. Y  X+ and XY = : AB  CD, ADBC, ABC  D, ABD  C, ACD  B

Another Example Enrollment(student, major, course, room, time) course  time What else can we infer ? [in class, or at home]

Back to Conceptual Design Now we know how to find more FDs, it’s easy Search for “bad” FDs If there are such, then decompose the table into two tables, repeat for the subtables. When done, the database schema is normalized Unfortunately, there are several normal forms…

Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Third Normal Form (3NF) = will discuss Boyce Codd Normal Form (BCNF) = will discuss Others...