Lecture 06: The Relational Data Model

Presentation transcript:

1 Lecture 06 The Relational Data Model

2 Outline Relational Data Model Functional Dependencies FDs in ER Logical Schema Design Reading Chapter 8

3 The Relational Data Model
Three levels: Data Modeling, Relational Schema, Physical storage
Data modeling: E/R diagrams
Relational schema: tables (column names = attributes, rows = tuples)
Physical storage: complex file organization and index structures
Slide annotations: “Have seen this in SQL”, “Have seen this too”, “Discuss next”

4 Terminology
Table name or relation name: Products
Attribute names: Name, Price, Category, Manufacturer
Tuples (or rows, or records):
Name         Price   Category     Manufacturer
gizmo        $19.99  gadgets      GizmoWorks
Power gizmo  $29.99  gadgets      GizmoWorks
SingleTouch  $       photography  Canon
MultiTouch   $       household    Hitachi

5 Schemas
Relational Schema:
– Relation name plus attribute names
– E.g. Product(Name, Price, Category, Manufacturer)
– In practice we add the domain for each attribute
Database Schema:
– Set of relational schemas
– E.g. Product(Name, Price, Category, Manufacturer), Company(Name, Address, Phone), ...
This is all mathematics, not to be confused with SQL tables!

6 Instances
Relational schema = $R(A_1, \dots, A_k)$:
Instance = relation with $k$ attributes (of “type” $R$)
– values of corresponding domains
Database schema = $R_1(\dots), R_2(\dots), \dots, R_n(\dots)$
Instance = $n$ relations, of types $R_1, R_2, \dots, R_n$

7 Example
Relational schema: Product(Name, Price, Category, Manufacturer)
Instance:
Name         Price   Category     Manufacturer
gizmo        $19.99  gadgets      GizmoWorks
Power gizmo  $29.99  gadgets      GizmoWorks
SingleTouch  $       photography  Canon
MultiTouch   $       household    Hitachi

8 First Normal Form (1NF)
A database schema is in First Normal Form if all tables are flat
Not flat: Student
Name   GPA  Courses
Alice  3.8  Math, DB, OS
Bob    3.7  DB, OS
Carol  3.9  Math, OS
Flat: Student
Name   GPA
Alice  3.8
Bob    3.7
Carol  3.9
Flat: Course
Course
Math
DB
OS
Flat: Takes
Student  Course
Alice    Math
Carol    Math
Alice    DB
Bob      DB
Alice    OS
Carol    OS
May need to add keys

9 Functional Dependencies A form of constraint –hence, part of the schema Finding them is part of the database design Also used in normalizing the relations Warning: this is the most abstract, and “hardest” part of the course.

10 Functional Dependencies
Definition: If two tuples agree on the attributes $A_1, A_2, \dots, A_n$ then they must also agree on the attributes $B_1, B_2, \dots, B_m$
Formally: $A_1, A_2, \dots, A_n \to B_1, B_2, \dots, B_m$

11 Examples
EmpID → Name, Phone, Position
Position → Phone
but not Phone → Position
EmpID  Name   Phone  Position
E0045  Smith  1234   Clerk
E1847  John   9876   Salesrep
E1111  Smith  9876   Salesrep
E9999  Mary   1234   Lawyer

12 In General
To check A → B, erase all other columns, then check whether the remaining relation is many-one (called functional in mathematics)
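The check described above can be sketched in a few lines of Python. This is a minimal illustration, assuming rows are stored as dictionaries; the function name `fd_holds` is hypothetical, and the data is the employee table from the Examples slide.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this instance (a list of dicts)."""
    seen = {}
    for row in rows:
        # "Erase all other columns": keep only the lhs and rhs values.
        key = tuple(row[a] for a in lhs)
        val = tuple(row[b] for b in rhs)
        # Many-one: each lhs value may be paired with only one rhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True


employees = [
    {"EmpID": "E0045", "Name": "Smith", "Phone": 1234, "Position": "Clerk"},
    {"EmpID": "E1847", "Name": "John", "Phone": 9876, "Position": "Salesrep"},
    {"EmpID": "E1111", "Name": "Smith", "Phone": 9876, "Position": "Salesrep"},
    {"EmpID": "E9999", "Name": "Mary", "Phone": 1234, "Position": "Lawyer"},
]

print(fd_holds(employees, ["Position"], ["Phone"]))  # True
print(fd_holds(employees, ["Phone"], ["Position"]))  # False
```

Note that a check like this only says whether the FD is satisfied by one instance; an FD in the schema is a constraint on every instance.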

13 Example
EmpID  Name   Phone  Position
E0045  Smith  1234   Clerk
E1847  John   9876   Salesrep
E1111  Smith  9876   Salesrep
E9999  Mary   1234   Lawyer

14 Typical Examples of FDs
Product: name → price, manufacturer
Person: ssn → name, age
Company: name → stockprice, president

15 Example
Product(name, category, color, department, price)
Consider these FDs:
name → color
category → department
color, category → price
What do they say?

16 Example
FD’s are constraints: on some instances they hold, on others they don’t
name → color
category → department
color, category → price
name     category  color  department  price
Gizmo    Gadget    Green  Toys        49
Tweaker  Gadget    Green  Toys        99
Does this instance satisfy all the FDs?

17 Example
name → color
category → department
color, category → price
name     category    color  department    price
Gizmo    Gadget      Green  Toys          49
Tweaker  Gadget      Black  Toys          99
Gizmo    Stationery  Green  Office-supp.  59
What about this one?

18 Example
If some FDs are satisfied, then others are satisfied too
If all these FDs are true:
name → color
category → department
color, category → price
Then this FD also holds:
name, category → price
Why?

19 Inference Rules for FD’s
Splitting rule and Combining rule:
$A_1, A_2, \dots, A_n \to B_1, B_2, \dots, B_m$
is equivalent to
$A_1, A_2, \dots, A_n \to B_1$
$A_1, A_2, \dots, A_n \to B_2$
$\dots$
$A_1, A_2, \dots, A_n \to B_m$

20 Inference Rules for FD’s (continued)
Trivial Rule:
$A_1, A_2, \dots, A_n \to A_i$ where $i = 1, 2, \dots, n$
Why?

21 Inference Rules for FD’s (continued)
Transitive Closure Rule:
If $A_1, A_2, \dots, A_n \to B_1, B_2, \dots, B_m$
and $B_1, B_2, \dots, B_m \to C_1, C_2, \dots, C_p$
then $A_1, A_2, \dots, A_n \to C_1, C_2, \dots, C_p$
Why?

22 (Figure: table with column groups $A_1 \dots A_n$, $B_1 \dots B_m$, $C_1 \dots C_p$, illustrating the transitive closure rule.)

23 Example (continued)
Start from the following FDs:
1. name → color
2. category → department
3. color, category → price
Infer the following FDs. Which rule did we apply?
4. name, category → name
5. name, category → color
6. name, category → category
7. name, category → color, category
8. name, category → price

24 Example (continued)
Given FDs:
1. name → color
2. category → department
3. color, category → price
Answers (inferred FD: rule applied):
4. name, category → name: Trivial rule
5. name, category → color: Transitivity on 4, 1
6. name, category → category: Trivial rule
7. name, category → color, category: Split/combine on 5, 6
8. name, category → price: Transitivity on 3, 7

25 Another Example
Enrollment(student, major, course, room, time)
student → major
major, course → room
course → time
What else can we infer?

26 Another Rule
Augmentation:
If $A_1, A_2, \dots, A_n \to B$
then $A_1, A_2, \dots, A_n, C_1, C_2, \dots, C_p \to B$
Augmentation follows from trivial rules and transitivity. How?
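One possible way to work through this, using only the trivial, combining, and transitivity rules stated earlier (the step numbering is added here for illustration and is not from the slides):

```latex
\begin{align*}
&(1)\quad A_1, \dots, A_n, C_1, \dots, C_p \to A_i \quad (i = 1, \dots, n) && \text{trivial rule} \\
&(2)\quad A_1, \dots, A_n, C_1, \dots, C_p \to A_1, \dots, A_n && \text{combining rule on (1)} \\
&(3)\quad A_1, \dots, A_n \to B && \text{given} \\
&(4)\quad A_1, \dots, A_n, C_1, \dots, C_p \to B && \text{transitivity on (2), (3)}
\end{align*}
```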

27 Problem: infer ALL FDs
Given a set of FDs, infer all possible FDs
How to proceed?
Try all possible FDs, apply all 3 rules
– E.g. R(A, B, C, D): how many FDs are possible?
Drop trivial FDs, drop augmented FDs
– Still way too many
Better: use the Closure Algorithm (next)

28 Closure of a set of Attributes
Given a set of attributes $A_1, \dots, A_n$
The closure, $\{A_1, \dots, A_n\}^+$, is the set of attributes $B$ s.t. $A_1, \dots, A_n \to B$
Example:
name → color
category → department
color, category → price
Closures:
name+ = {name, color}
{name, category}+ = {name, category, color, department, price}
color+ = {color}

29 Closure Algorithm
Start with $X = \{A_1, \dots, A_n\}$.
Repeat until $X$ doesn’t change:
if $B_1, \dots, B_n \to C$ is a FD and $B_1, \dots, B_n$ are all in $X$, then add $C$ to $X$.
Example:
name → color
category → department
color, category → price
{name, category}+ = {name, category, color, department, price}
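The closure algorithm translates almost line for line into code. The sketch below assumes FDs are represented as (left-hand side, right-hand side) pairs of Python sets; the names `closure` and `PRODUCT_FDS` are illustrative choices, not from the slides.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until X doesn't change
        changed = False
        for lhs, rhs in fds:
            # If every attribute on the left is already in X, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


PRODUCT_FDS = [
    ({"name"}, {"color"}),
    ({"category"}, {"department"}),
    ({"color", "category"}, {"price"}),
]

print(sorted(closure({"name"}, PRODUCT_FDS)))
# ['color', 'name']
print(sorted(closure({"name", "category"}, PRODUCT_FDS)))
# ['category', 'color', 'department', 'name', 'price']
```

Allowing several attributes on the right-hand side is a convenience; by the splitting rule it is equivalent to listing one FD per right-hand attribute.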

30 Example
R(A,B,C,D,E,F)
A, B → C
A, D → E
B → D
A, F → B
Compute {A, B}+: X = {A, B, }
Compute {A, F}+: X = {A, F, }

31 Using Closure to Infer ALL FDs
Example:
A, B → C
A, D → B
B → D
Step 1: Compute X+, for every X:
A+ = A, B+ = BD, C+ = C, D+ = D
AB+ = ABCD, AC+ = AC, AD+ = ABCD, BC+ = BCD, BD+ = BD, CD+ = CD
ABC+ = ABD+ = ACD+ = ABCD (no need to compute: why?)
BCD+ = BCD, ABCD+ = ABCD
Step 2: Enumerate all FD’s X → Y, s.t. Y ⊆ X+ and X ∩ Y = ∅ (listing the largest Y for each X):
B → D, AB → CD, AD → BC, BC → D, ABC → D, ABD → C, ACD → B
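The two steps can be automated for small schemas. The sketch below is one possible implementation, assuming single-letter attribute names; `all_fds` is a hypothetical helper. It reports, for every non-empty X whose closure is strictly larger than X, the FD X → (X+ minus X), so single-attribute cases such as B → D are included.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def all_fds(attributes, fds):
    """List (X, X+ - X) as strings for every non-empty X that determines something new."""
    attrs = sorted(attributes)
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            x = set(combo)
            extra = closure(x, fds) - x  # Step 2: Y is a subset of X+, disjoint from X
            if extra:
                found.append(("".join(combo), "".join(sorted(extra))))
    return found


FDS = [({"A", "B"}, {"C"}), ({"A", "D"}, {"B"}), ({"B"}, {"D"})]

for lhs, rhs in all_fds("ABCD", FDS):
    print(lhs, "->", rhs)
```

The loop visits every subset of the attributes, so the running time is exponential in the number of attributes; that is acceptable for classroom-sized examples only.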

32 Problem: find FD from the data Given a database instance Find all FD’s satisfied by that instance Useful if we don’t get enough information from our users: need to reverse engineer a data instance

33 In Class: Find All FDs
Student  Dept  Course    Room
Alice    CSE   C++       020
Bob      CSE   C++       020
Alice    EE    HW        040
Carol    CSE   DB        045
Dan      CSE   Java      050
Elsa     CSE   DB        045
Frank    EE    Circuits  020
Do all FDs make sense in practice?

34 Answer
Course → Dept, Room
Dept, Room → Course
Student, Dept → Course, Room
Student, Course → Dept, Room
Student, Room → Dept, Course
Do all FDs make sense in practice?

35 Keys
A key is a set of attributes $A_1, \dots, A_n$ s.t. for any other attribute $B$, we have $A_1, \dots, A_n \to B$
A minimal key is a set of attributes which is a key and for which no subset is a key
Note: book calls them superkey and key

36 Computing Keys
Compute X+ for all sets X
If X+ = all attributes, then X is a key
List only the minimal keys
Note: there can be many minimal keys!
Example: R(A,B,C), AB → C, BC → A
Minimal keys: AB and BC
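A brute-force version of this procedure, sketched under the same assumptions as the closure algorithm (FDs as pairs of sets); `minimal_keys` is a hypothetical name. Subsets are tried in order of increasing size, so any superset of an already-found key can be skipped as non-minimal.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_keys(attributes, fds):
    """Return all minimal keys of a relation as a list of sets."""
    attrs = sorted(attributes)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            x = set(combo)
            # A superset of a known key is a key, but not a minimal one.
            if any(k <= x for k in keys):
                continue
            if closure(x, fds) == set(attrs):
                keys.append(x)
    return keys


fds = [({"A", "B"}, {"C"}), ({"B", "C"}, {"A"})]
print([sorted(k) for k in minimal_keys("ABC", fds)])
# [['A', 'B'], ['B', 'C']]
```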

37 Examples of Keys
Product(name, price, category, color)
name, category → price
category → color
Keys are: {name, category} and all supersets
Enrollment(student, address, course, room, time)
student → address
room, time → course
student, course → room, time
Keys are:

38 FD’s for E/R Diagrams
Given a relation constructed from an E/R diagram, what is its key?
Rule 1: If the relation comes from an entity set, the key of the relation is the set of attributes which is the key of the entity set.
Example: entity set Person with attributes address, name, ssn gives Person(address, name, ssn)

39 FD’s for E/R Diagrams
Rule 2: If the relation comes from a many-many relationship, the key of the relation is the set of all attribute keys in the relations corresponding to the entity sets
Example: Person buys Product, where Product has name, price, Person has name, ssn, and buys has date, gives buys(name, ssn, date)

40 FD’s for E/R Diagrams
Except: if there is an arrow from the relationship to E, then we don’t need the key of E as part of the relation key.
Example: relationship Purchase among Product (name), Store (sname), Person (ssn), CreditCard (card-no) gives Purchase(name, sname, ssn, card-no)

41 FD’s for E/R Diagrams More rules: Many-one, one-many, one-one relationships Multi-way relationships Weak entity sets (Try to find them yourself, or check book)

42 FD’s for E/R Diagrams
Say: “the CreditCard determines the Person”
Purchase(name, sname, ssn, card-no), relating Product, Person, Store, CreditCard
card-no → ssn
Incomplete (what does it say?)

43 Relational Schema Design (or Logical Design) Main idea: Start with some relational schema Find out its FD’s Use them to design a better relational schema

44 Data Anomalies When a database is poorly designed we get anomalies: Redundancy: data is repeated Update anomalies: need to change in several places Delete anomalies: may lose data when we don’t want

45 Relational Schema Design
Example: Persons with several phones
Name  SSN  PhoneNumber  City
Fred  …    …            Seattle
Fred  …    …            Seattle
Joe   …    …            Westfield
SSN → Name, City but not SSN → PhoneNumber
Anomalies:
Redundancy = repeat data
Update anomalies = Fred moves to “Bellevue”
Deletion anomalies = Joe deletes his phone number: what is his city?

46 Relation Decomposition
Break the relation into two:
Name  SSN  PhoneNumber  City
Fred  …    …            Seattle
Fred  …    …            Seattle
Joe   …    …            Westfield
becomes
Name  SSN  City
Fred  …    Seattle
Joe   …    Westfield
and
SSN  PhoneNumber
…    …
Anomalies have gone:
No more repeated data
Easy to move Fred to “Bellevue” (how?)
Easy to delete all of Joe’s phone numbers (how?)

47 Relational Schema Design
Conceptual Model: E/R diagram (Person buys Product; Product has name, price; Person has name, ssn)
Relational Model: plus FD’s
Normalization: Eliminates anomalies

48 Decompositions in General
$R(A_1, \dots, A_n, B_1, \dots, B_m, C_1, \dots, C_p)$
is decomposed into
$R_1(A_1, \dots, A_n, B_1, \dots, B_m)$
$R_2(A_1, \dots, A_n, C_1, \dots, C_p)$
$R_1$ = projection of $R$ on $A_1, \dots, A_n, B_1, \dots, B_m$
$R_2$ = projection of $R$ on $A_1, \dots, A_n, C_1, \dots, C_p$

49 Decomposition
Sometimes it is correct:
Name      Price  Category
Gizmo     19.99  Gadget
OneClick  24.99  Camera
Gizmo     19.99  Camera
decomposes into
Name      Price
Gizmo     19.99
OneClick  24.99
Gizmo     19.99
and
Name      Category
Gizmo     Gadget
OneClick  Camera
Gizmo     Camera
Lossless decomposition

50 Incorrect Decomposition
Sometimes it is not:
Name      Price  Category
Gizmo     19.99  Gadget
OneClick  24.99  Camera
Gizmo     19.99  Camera
decomposes into
Name      Category
Gizmo     Gadget
OneClick  Camera
Gizmo     Camera
and
Price  Category
19.99  Gadget
24.99  Camera
19.99  Camera
What’s incorrect?
Lossy decomposition

51 Decompositions in General
$R(A_1, \dots, A_n, B_1, \dots, B_m, C_1, \dots, C_p)$
decomposed into $R_1(A_1, \dots, A_n, B_1, \dots, B_m)$ and $R_2(A_1, \dots, A_n, C_1, \dots, C_p)$
If $A_1, \dots, A_n \to B_1, \dots, B_m$
Then the decomposition is lossless
Note: we don’t necessarily need $A_1, \dots, A_n \to C_1, \dots, C_p$
Example: name → price, hence the first decomposition is lossless
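To see the difference between the two Name/Price/Category decompositions concretely, one can project an instance and join the pieces back. The sketch below is illustrative only: the helpers `project`, `natural_join`, and `is_lossless_on` are hypothetical names, and the test covers a single instance, whereas the FD condition above guarantees losslessness for every instance.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each result row is a tuple of (attr, value)."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join of two projected relations on their shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                merged = {**d1, **d2}
                out.add(tuple(sorted(merged.items())))
    return out


def is_lossless_on(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {tuple(sorted(row.items())) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


products = [
    {"Name": "Gizmo", "Price": 19.99, "Category": "Gadget"},
    {"Name": "OneClick", "Price": 24.99, "Category": "Camera"},
    {"Name": "Gizmo", "Price": 19.99, "Category": "Camera"},
]

print(is_lossless_on(products, ["Name", "Price"], ["Name", "Category"]))      # True
print(is_lossless_on(products, ["Name", "Category"], ["Price", "Category"]))  # False
```

In the second case the join on Category produces extra tuples (for example OneClick with price 19.99), which is what makes the decomposition lossy.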

52 Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Third Normal Form (3NF) = this lecture Boyce Codd Normal Form (BCNF) = this lecture Others...

53 Boyce-Codd Normal Form
A simple condition for removing anomalies from relations:
A relation R is in BCNF if: whenever $A_1, \dots, A_n \to B$ is a non-trivial dependency in R, then $\{A_1, \dots, A_n\}$ is a key for R
In English (though a bit vague): whenever a set of attributes of R determines another attribute, it should determine all the attributes of R.

54 BCNF Decomposition Algorithm
Repeat
choose $A_1, \dots, A_m \to B_1, \dots, B_n$ that violates the BCNF condition
split R into $R_1(A_1, \dots, A_m, B_1, \dots, B_n)$ and $R_2(A_1, \dots, A_m, [\text{others}])$
continue with both $R_1$ and $R_2$
Until no more violations
(Figure: R drawn as three attribute groups, B’s, A’s, and Others; R1 covers the B’s and A’s, R2 covers the A’s and Others.)
Is there a 2-attribute relation that is not in BCNF?
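A compact recursive sketch of this algorithm follows. It is one possible reading, not the lecture's reference implementation: violations are found by brute force over attribute subsets, the right-hand side is taken "as large as possible" (the whole closure of the left-hand side within the relation), and `bcnf_decompose` is a hypothetical name. The example is the Name/SSN/PhoneNumber/City relation with SSN → Name, City.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_decompose(relation, fds):
    """Split relation (a set of attributes) until no BCNF violation remains."""
    relation = set(relation)
    for size in range(1, len(relation)):
        for combo in combinations(sorted(relation), size):
            x = set(combo)
            x_plus = closure(x, fds) & relation  # what X determines inside this relation
            # Violation: X determines something new, but not every attribute.
            if x_plus != x and x_plus != relation:
                r1 = x_plus                   # the A's together with the B's
                r2 = x | (relation - x_plus)  # the A's together with the others
                return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)
    return [relation]


fds = [({"SSN"}, {"Name", "City"})]
for r in bcnf_decompose({"Name", "SSN", "PhoneNumber", "City"}, fds):
    print(sorted(r))
# ['City', 'Name', 'SSN']
# ['PhoneNumber', 'SSN']
```

Different choices of the violating dependency can lead to different, equally valid BCNF decompositions; this sketch simply takes the first violation it encounters.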

55 Example
Name  SSN  PhoneNumber  City
Fred  …    …            Seattle
Fred  …    …            Seattle
Joe   …    …            Westfield
Joe   …    …            Westfield
What are the dependencies? SSN → Name, City
What are the keys? {SSN, PhoneNumber}
Is it in BCNF?

56 Decompose it into BCNF
SSN → Name, City
Name  SSN  City
Fred  …    Seattle
Joe   …    Westfield
and
SSN  PhoneNumber
…    …
Let’s check anomalies: Redundancy? Update? Delete?

57 Summary of BCNF Decomposition
Find a dependency that violates the BCNF condition:
$A_1, A_2, \dots, A_n \to B_1, B_2, \dots, B_m$
Heuristics: choose $B_1, B_2, \dots, B_m$ “as large as possible”
Decompose: R1 = A’s and B’s, R2 = A’s and Others
Continue until there are no BCNF violations left.
Is there a 2-attribute relation that is not in BCNF?

58 Example Decomposition
Person(name, SSN, age, hairColor, phoneNumber)
SSN → name, age
age → hairColor
Decompose in BCNF (in class):
Step 1: find all keys (How? Compute S+, for various sets S)
Step 2: now decompose

59 Other Example
R(A,B,C,D)
A → B, B → C
Key: AD
Violations of BCNF: A → B, A → C, A → BC
Pick A → BC: split into R1(A,B,C), R2(A,D)
What happens if we pick A → B first?

60 Lossless Decompositions
A decomposition is lossless if we can recover:
R(A,B,C)
Decompose: R1(A,B), R2(A,C)
Recover: R’(A,B,C) should be the same as R(A,B,C)
R’ is in general larger than R. Must ensure R’ = R

61 Lossless Decompositions
Given R(A,B,C) s.t. A → B, the decomposition into R1(A,B), R2(A,C) is lossless

62 3NF: A Problem with BCNF
Relation: (Unit, Company, Product)
FD’s: Unit → Company; Company, Product → Unit
So, there is a BCNF violation, and we decompose:
(Unit, Company): Unit → Company
(Unit, Product): No FDs
Notice: we lose the FD: Company, Product → Unit

63 So What’s the Problem?
Unit      Company
Galaga99  UW
Bingo     UW
and
Unit      Product
Galaga99  databases
Bingo     databases
No problem so far. All local FD’s are satisfied.
Let’s put all the data back into a single table again:
Unit      Company  Product
Galaga99  UW       databases
Bingo     UW       databases
Violates the dependency: Company, Product → Unit!

64 Solution: 3rd Normal Form (3NF)
A simple condition for removing anomalies from relations:
A relation R is in 3rd normal form if: whenever there is a nontrivial dependency $A_1, A_2, \dots, A_n \to B$ for R, then $\{A_1, A_2, \dots, A_n\}$ is a super-key for R, or B is part of a key.
Tradeoff:
BCNF = no anomalies, but may lose some FDs
3NF = keeps all FDs, but may have some anomalies
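The two conditions can be compared side by side on the Unit/Company/Product relation. The sketch below is an illustration under stated assumptions: FDs are pairs of sets, keys are found by brute force, and only the given FDs are checked (which is the usual shortcut for testing the whole relation). The names `is_bcnf` and `is_3nf` are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_keys(attributes, fds):
    """All minimal keys, found by trying subsets in order of increasing size."""
    attrs = sorted(attributes)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            x = set(combo)
            if not any(k <= x for k in keys) and closure(x, fds) == set(attrs):
                keys.append(x)
    return keys


def is_bcnf(attributes, fds):
    """Every non-trivial given FD must have a super-key on its left side."""
    full = set(attributes)
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) == full for lhs, rhs in fds)


def is_3nf(attributes, fds):
    """Like BCNF, but an FD is also allowed if its new attributes are part of some key."""
    full = set(attributes)
    prime = set().union(*minimal_keys(attributes, fds))
    return all(
        closure(lhs, fds) == full or (set(rhs) - set(lhs)) <= prime
        for lhs, rhs in fds
    )


attrs = {"Unit", "Company", "Product"}
fds = [({"Unit"}, {"Company"}), ({"Company", "Product"}, {"Unit"})]
print(is_bcnf(attrs, fds))  # False
print(is_3nf(attrs, fds))   # True
```

Here Unit → Company violates BCNF because Unit is not a key, but Company belongs to the key {Company, Product}, so the relation is still in 3NF and no decomposition (and no loss of the FD Company, Product → Unit) is required.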