1 The Relational Data Model
- Tables
- Schemas
- Conversion from E/R to Relations
- Functional Dependencies
2 A Relation is a Table
Relation Beers:
name        manf
Winterbrew  Pete’s
Bud Lite    Anheuser-Busch
Attributes are the column headers; tuples are the rows.
3 Schemas
- Relation schema = relation name and attribute list.
  - Optionally: types of attributes.
  - Example: Beers(name, manf) or Beers(name: string, manf: string)
- Database = collection of relations.
- Database schema = set of all relation schemas in the database.
4 Why Relations?
- Very simple model.
- Often matches how we think about data.
- Abstract model that underlies SQL, the most important database language today.
5 From E/R Diagrams to Relations
- Simplest approach (not always best): convert each entity set (E.S.) to a relation and each relationship to a relation.
- Entity Set -> Relation: E.S. attributes become relational attributes.
- Example: the entity set Beers with attributes name and manf becomes Beers(name, manf).
6 Keys in Relations
- An attribute or set of attributes K is a key for a relation R if we expect that in no instance of R will two different tuples agree on all the attributes of K.
- Indicate a key by underlining the key attributes.
- Example: if name is a key for Beers, write Beers(name, manf) with name underlined.
7 E/R Relationships -> Relations
- The relation has an attribute for each key attribute of each E.S. that participates in the relationship.
- Add any attributes that belong to the relationship itself.
- Renaming attributes is OK.
  - Essential if an E.S. plays multiple roles.
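8 Relationship -> Relation
Entity sets: Drinkers (name, addr) and Beers (name, manf).
- Likes, between Drinkers and Beers, becomes Likes(drinker, beer).
- Favorite, between Drinkers and Beers, becomes Favorite(drinker, beer).
- Married, with Drinkers in the roles husband and wife, becomes Married(husband, wife).
- Buddies, with Drinkers in the roles 1 and 2, becomes Buddies(name1, name2).
For the one-one relationship Married, we can choose either husband or wife as the key.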
9 Combining Relations
- It is OK to combine into one relation:
  1. The relation for an entity set E.
  2. The relations for many-one relationships of which E is the “many.”
- Example: Drinkers(name, addr) and Favorite(drinker, beer) combine to make Drinker1(name, addr, favBeer).
10 Risk with Many-Many Relationships
- Combining Drinkers with Likes would be a mistake. It leads to redundancy:
name   addr       beer
Sally  123 Maple  Bud
Sally  123 Maple  Miller
- Redundancy: the address 123 Maple is repeated for every beer Sally likes.
11 Handling Weak Entity Sets
- The relation for a weak E.S. must include its full key (i.e., the key attributes of the related entity sets) as well as its own attributes.
- A supporting (double-diamond) relationship yields a relation that is actually redundant and should be deleted from the database schema.
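12 Example
E/R diagram: weak entity set Logins (name, billTo) is connected by the supporting relationship At to Hosts (name, location).
Hosts(hostName, location)
Logins(loginName, hostName, billTo)
At(loginName, hostName, hostName2)
In At, hostName and hostName2 must be the same, so At becomes part of Logins.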
13 Subclasses: Three Approaches
1. Object-oriented: one relation per subset of subclasses, with all relevant attributes.
2. Use nulls: one relation; entities have NULL in attributes that don’t belong to them.
3. E/R style: one relation for each subclass, containing:
  - Key attribute(s).
  - Attributes of that subclass.
14 Example
E/R diagram: Ales isa Beers. Beers has attributes name and manf; Ales adds the attribute color.
15 Object-Oriented
Beers:
name  manf
Bud   Anheuser-Busch
Ales:
name        manf    color
Summerbrew  Pete’s  dark
Good for queries like “find the color of ales made by Pete’s.”
16 E/R Style
Beers:
name        manf
Bud         Anheuser-Busch
Summerbrew  Pete’s
Ales:
name        color
Summerbrew  dark
Good for queries like “find all beers (including ales) made by Pete’s.”
17 Using Nulls
Beers:
name        manf            color
Bud         Anheuser-Busch  NULL
Summerbrew  Pete’s          dark
Saves space unless there are lots of attributes that are usually NULL.
18 Functional Dependencies
- Meaning of FD’s
- Keys and Superkeys
- Inferring FD’s
19 Functional Dependencies
- X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they must also agree on the attribute A.
  - Say “X -> A holds in R.”
  - Convention: …, X, Y, Z represent sets of attributes; A, B, C, … represent single attributes.
  - Convention: no set formers in sets of attributes; write just ABC rather than {A,B,C}.
20 Example
Drinkers(name, addr, beersLiked, manf, favBeer)
- Reasonable FD’s to assert:
  1. name -> addr
  2. name -> favBeer
  3. beersLiked -> manf
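21 Example Data
name     addr        beersLiked  manf    favBeer
Janeway  Voyager     Bud         A.B.    WickedAle
Janeway  Voyager     WickedAle   Pete’s  WickedAle
Spock    Enterprise  Bud         A.B.    Bud
- The two Janeway rows agree on addr because name -> addr.
- The two Janeway rows agree on favBeer because name -> favBeer.
- The two Bud rows agree on manf because beersLiked -> manf.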
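The definition of an FD can be checked mechanically against a single instance. The sketch below is illustrative only: the function name satisfies_fd and the dictionary encoding of tuples are assumptions, not part of the slides. It tests whether any two rows agree on the left side yet differ on the right.

```python
from itertools import combinations

def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on every attribute in lhs yet differ on rhs."""
    for r1, r2 in combinations(rows, 2):
        agree_on_lhs = all(r1[a] == r2[a] for a in lhs)
        if agree_on_lhs and r1[rhs] != r2[rhs]:
            return False
    return True

# The example instance of Drinkers from the slide.
drinkers = [
    {"name": "Janeway", "addr": "Voyager", "beersLiked": "Bud",
     "manf": "A.B.", "favBeer": "WickedAle"},
    {"name": "Janeway", "addr": "Voyager", "beersLiked": "WickedAle",
     "manf": "Pete's", "favBeer": "WickedAle"},
    {"name": "Spock", "addr": "Enterprise", "beersLiked": "Bud",
     "manf": "A.B.", "favBeer": "Bud"},
]

holds_name_addr = satisfies_fd(drinkers, ["name"], "addr")
holds_name_beer = satisfies_fd(drinkers, ["name"], "beersLiked")
```

Note that passing this check only shows that one instance is consistent with the FD. An FD is an assertion about every instance of the relation, so it cannot be proved from sample data, only refuted.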
22 FD’s With Multiple Attributes
- No need for FD’s with more than one attribute on the right.
  - But it is sometimes convenient to combine FD’s as a shorthand.
  - Example: name -> addr and name -> favBeer become name -> addr favBeer.
- More than one attribute on the left may be essential.
  - Example: bar beer -> price
23 Keys of Relations
- K is a superkey for relation R if K functionally determines all of R.
- K is a key for R if K is a superkey, but no proper subset of K is a superkey (minimality).
24 Example
Drinkers(name, addr, beersLiked, manf, favBeer)
- {name, beersLiked} is a superkey because together these attributes determine all the other attributes.
  - name -> addr favBeer
  - beersLiked -> manf
25 Example, Cont.
- {name, beersLiked} is a key because neither {name} nor {beersLiked} is a superkey.
  - name does not determine manf; beersLiked does not determine addr.
- There are no other keys, but lots of superkeys.
  - Any superset of {name, beersLiked} is a superkey.
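The superkey and key definitions can be sketched as a small search. This is a hedged illustration, not the course's code: the helper names closure, is_superkey, and keys, and the encoding of an FD as a (left-side tuple, right attribute) pair, are assumptions. A set is a superkey when its closure is every attribute; keys are the superkeys that contain no smaller key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) == set(all_attrs)

def keys(all_attrs, fds):
    """All minimal superkeys, trying candidate sets from smallest to largest."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for cand in combinations(sorted(all_attrs), size):
            cand_set = set(cand)
            if any(k <= cand_set for k in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(cand_set, all_attrs, fds):
                found.append(cand_set)
    return found

DRINKERS = {"name", "addr", "beersLiked", "manf", "favBeer"}
FDS = [(("name",), "addr"), (("name",), "favBeer"), (("beersLiked",), "manf")]

drinker_keys = keys(DRINKERS, FDS)
```

For the Drinkers FD’s, the search should find {name, beersLiked} as the only key, matching the slide. The search is exponential in the number of attributes, which is acceptable for examples of this size.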
26 E/R and Relational Keys
- Keys in E/R concern entities.
- Keys in relations concern tuples.
- Usually, one tuple corresponds to one entity, so the ideas are the same.
- But in poor relational designs, one entity can become several tuples, so E/R keys and relational keys are different.
27 Example Data
name     addr        beersLiked  manf    favBeer
Janeway  Voyager     Bud         A.B.    WickedAle
Janeway  Voyager     WickedAle   Pete’s  WickedAle
Spock    Enterprise  Bud         A.B.    Bud
- Relational key = {name, beersLiked}.
- But in E/R, name is a key for Drinkers, and beersLiked is a key for Beers.
- Note: there are 2 tuples for the Janeway entity and 2 tuples for the Bud entity.
28 Where Do Keys Come From?
1. Just assert a key K.
  - The only FD’s are K -> A for all attributes A.
2. Assert FD’s and deduce the keys by systematic exploration.
  - The E/R model gives us FD’s from entity-set keys and from many-one relationships.
29 More FD’s From “Physics” or Organization Policy
- Example: “no two courses can meet in the same room at the same time” tells us: hour room -> course.
30 Inferring FD’s
- We are given FD’s X1 -> A1, X2 -> A2, …, Xn -> An, and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD’s.
  - Example: if A -> B and B -> C hold, surely A -> C holds, even if we don’t say so.
- Important for the design of good relation schemas.
31 Inference Test (this is important because …)
- When we talk about improving relational designs, we often need to ask “does this FD hold in this relation?”
- Given FD’s X1 -> A1, X2 -> A2, …, Xn -> An, does the FD Y -> B necessarily hold in the same relation?
  - Start by assuming two tuples agree in Y.
  - Use the given FD’s to infer other attributes on which they must agree.
  - If B is among them, then yes; otherwise no.
32 Closure Test
- An easier way to test is to compute the closure of Y, denoted Y+.
- Basis: Y+ = Y.
- Induction: look for an FD whose left side X is a subset of the current Y+. If the FD is X -> A, add A to Y+.
33 Figure: X lies inside the current Y+; the FD X -> A adds A, giving the new Y+.
34 Closure Test: Example
Given FD’s A -> B and BC -> D:
- A+ = AB.
- C+ = C.
- (AC)+ = ABCD.
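The closure test can be sketched in a few lines of Python. The names closure and follows, and the encoding of an FD as a pair of strings (left-side attributes, single right-side attribute), are assumptions made for illustration. The loop repeats the induction step until no FD adds a new attribute; Y -> B then follows from the given FD’s exactly when B is in Y+.

```python
def closure(attrs, fds):
    """Compute the closure of attrs: start from attrs, then keep adding A
    whenever some FD X -> A has its whole left side X already in the result."""
    result = set(attrs)          # basis: Y+ = Y
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:     # induction: apply every applicable FD
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def follows(fds, lhs, rhs):
    """Does lhs -> rhs hold in every relation satisfying fds?"""
    return rhs in closure(lhs, fds)

FDS = [("A", "B"), ("BC", "D")]   # A -> B, BC -> D

a_plus = closure("A", FDS)
c_plus = closure("C", FDS)
ac_plus = closure("AC", FDS)
```

With these FD’s the three closures should reproduce the slide: AB, C, and ABCD. So AC -> D follows from the given FD’s, while A -> D does not.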
35 Given Versus Implied FD’s
- Typically, we state a few FD’s that are known to hold for a relation R.
- Other FD’s may follow logically from the given FD’s; these are implied FD’s.
- We are free to choose any basis for the FD’s of R, that is, a set of FD’s that imply all the FD’s that hold for R.
36 Finding All Implied FD’s
- Motivation: “normalization,” the process where we break a relation schema into two or more schemas.
- Suppose we have a relation ABCD with some FD’s F. If we decide to decompose ABCD into ABC and AD, what are the FD’s for ABC and AD?
- Example: F = AB -> C, C -> D, D -> A.
  - It looks like just AB -> C holds in ABC, but in fact C -> A follows from F and applies to relation ABC.
37 Why?
- Take two tuples of the projection ABC that agree on C: (a1, b1, c) and (a2, b2, c).
- They come from tuples (a1, b1, c, d1) and (a2, b2, c, d2) of ABCD.
- d1 = d2 because C -> D.
- a1 = a2 because D -> A.
- Thus, tuples in the projection with equal C’s have equal A’s; C -> A.
38 Basic Idea
1. Start with the given FD’s and find all nontrivial FD’s that follow from them.
  - Trivial = the right side is a member of the left side, e.g., A -> A or AB -> A.
2. Restrict to those FD’s that involve only attributes of the projected schema.
39 Simple, Exponential Algorithm
1. For each set of attributes X, compute X+.
2. Add X -> A for all A in X+ - X.
3. However, drop XY -> A if X -> A holds.
  - Because XY -> A follows from X -> A.
4. Finally, use only FD’s involving projected attributes.
40 A Few Tricks
- No need to compute the closure of the empty set or of the set of all attributes.
- If we find X+ = all attributes, then the closure of any superset of X is also all attributes.
41 Example
- ABC with FD’s A -> B and B -> C. Project onto AC.
  - A+ = ABC; yields A -> B, A -> C. We do not need to compute AB+ or AC+.
  - B+ = BC; yields B -> C.
  - C+ = C; yields nothing.
  - BC+ = BC; yields nothing.
42 Example, Continued
- Resulting FD’s: A -> B, A -> C, and B -> C.
- Projection onto AC: A -> C.
  - It is the only FD that involves only attributes in {A,C}.
43 Example
Relation ABCD with FD’s F = AB -> C, C -> D, D -> A. What FD’s follow?
- A+ = A; B+ = B (nothing).
- C+ = ACD (add C -> A).
- D+ = AD (nothing new).
- (AB)+ = ABCD (add AB -> D; skip all supersets of AB).
- (BC)+ = ABCD (nothing new; skip all supersets of BC).
- (BD)+ = ABCD (add BD -> C; skip all supersets of BD).
- (AC)+ = ACD; (AD)+ = AD; (CD)+ = ACD (nothing new).
- (ACD)+ = ACD (nothing new).
- All other sets contain AB, BC, or BD, so skip.
- Thus, the only interesting FD’s that follow from F are: C -> A, AB -> D, BD -> C.
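The simple exponential algorithm can be sketched as follows. This is a minimal illustration under stated assumptions: the names closure, implied_fds, and project are hypothetical, FD’s are encoded as (left-side string, right attribute) pairs, and the "skip supersets" trick is omitted for brevity. Left sides are visited from smallest to largest, so XY -> A is dropped whenever a recorded X -> A already covers it.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def implied_fds(all_attrs, fds):
    """Nontrivial FD's X -> A with no redundant attributes on the left."""
    found = []
    attrs = sorted(all_attrs)
    # Skip the empty set (size 0) and the set of all attributes (full size).
    for size in range(1, len(attrs)):
        for lhs in combinations(attrs, size):
            for a in sorted(closure(lhs, fds) - set(lhs)):
                # Drop XY -> A if some X -> A was already recorded.
                covered = any(b == a and set(x) <= set(lhs) for x, b in found)
                if not covered:
                    found.append(("".join(lhs), a))
    return found

def project(fd_list, onto):
    """Keep only the FD's whose attributes all lie in the projected schema."""
    return [(x, a) for x, a in fd_list if set(x) | {a} <= set(onto)]

F = [("AB", "C"), ("C", "D"), ("D", "A")]
ALL = implied_fds("ABCD", F)
NEW = set(ALL) - set(F)          # FD's that follow from F but were not given
ON_ABC = project(ALL, "ABC")     # FD's that hold in the projection onto ABC
```

For F = AB -> C, C -> D, D -> A, the FD’s in NEW should be C -> A, AB -> D, and BD -> C, and the projection onto ABC should keep C -> A and AB -> C, in line with the slides.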
44 Example 2
- Set of FD’s in ABCGHI:
  - A -> B
  - A -> C
  - CG -> H
  - CG -> I
  - B -> H
- Compute (CG)+, (BG)+, (AG)+.
45 Example 3
In ABC with FD’s A -> B, B -> C, project onto AC.
1. A+ = ABC; yields A -> B, A -> C.
2. B+ = BC; yields B -> C.
3. AB+ = ABC; yields AB -> C; drop in favor of A -> C.
4. AC+ = ABC; yields AC -> B; drop in favor of A -> B.
5. C+ = C and BC+ = BC; adds nothing.
- Resulting FD’s: A -> B, A -> C, B -> C.
- Projection onto AC: A -> C.
46 A Geometric View of FD’s
- Imagine the set of all instances of a particular relation.
- That is, all finite sets of tuples that have the proper number of components.
- Each instance is a point in this space.
47 Example: R(A,B)
Figure: points in the space of instances, including {(1,2), (3,4)}, {}, {(1,2), (3,4), (1,3)}, and {(5,1)}.
48 An FD is a Subset of Instances
- For each FD X -> A there is a subset of all instances that satisfy the FD.
- We can represent an FD by a region in the space.
- Trivial FD = an FD that is represented by the entire space.
  - Example: A -> A.
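49 Example: A -> B for R(A,B)
Figure: the region for A -> B contains {(1,2), (3,4)}, {}, and {(5,1)}. It excludes {(1,2), (3,4), (1,3)}, where two tuples agree on A = 1 but differ on B.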
50 Representing Sets of FD’s
- If each FD is a set of relation instances, then a collection of FD’s corresponds to the intersection of those sets.
  - Intersection = all instances that satisfy all of the FD’s.
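51 Example
Figure: three overlapping regions, one each for A -> B, B -> C, and CD -> A. Their common intersection holds the instances satisfying A -> B, B -> C, and CD -> A.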
52 Implication of FD’s
- If an FD Y -> B follows from FD’s X1 -> A1, …, Xn -> An, then the region in the space of instances for Y -> B must include the intersection of the regions for the FD’s Xi -> Ai.
  - That is, every instance satisfying all the FD’s Xi -> Ai surely satisfies Y -> B.
  - But an instance could satisfy Y -> B, yet not be in this intersection.
53 Example
Figure: the intersection of the regions for A -> B and B -> C lies inside the region for A -> C.
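The geometric view can be explored by brute force on a tiny universe. The sketch below is an illustration under a loud assumption: it enumerates only the instances of R(A,B,C) whose values come from {0, 1}, not the full space of finite instances. The names satisfies, region_ab, region_bc, and region_ac are hypothetical. Each region is the set of instances satisfying one FD, and implication shows up as set inclusion.

```python
from itertools import combinations, product

def satisfies(instance, lhs, rhs):
    """True if no two tuples agree on the lhs positions but differ at position rhs."""
    return all(
        t1[rhs] == t2[rhs]
        for t1, t2 in combinations(instance, 2)
        if all(t1[i] == t2[i] for i in lhs)
    )

A, B, C = 0, 1, 2                             # column positions in R(A,B,C)
tuples = list(product([0, 1], repeat=3))      # the 8 possible tuples (assumed domain)
instances = [                                 # every subset of those tuples
    frozenset(subset)
    for size in range(len(tuples) + 1)
    for subset in combinations(tuples, size)
]

region_ab = {r for r in instances if satisfies(r, [A], B)}
region_bc = {r for r in instances if satisfies(r, [B], C)}
region_ac = {r for r in instances if satisfies(r, [A], C)}

intersection = region_ab & region_bc
```

In this small universe the intersection of the regions for A -> B and B -> C is contained in the region for A -> C, and the containment is proper: the instance {(0,0,0), (0,1,0)} satisfies A -> C but violates A -> B.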