Relational Data Model.

1 Relational Data Model

2 Relational Database
A relational database is a persistent collection of relations.
Each relation holds information about various kinds of objects (persons, places, things, events, etc.).
Each relation, or table, is characterized by a set of attributes (properties), each drawn from a domain, e.g., person(SSN, Name, Address, Phone).
Each individual object, or record, is a tuple of values, e.g., ( , Pat Carter, 12 Main, ).
The set of records makes up the relation, i.e., a subset of the cross product of the attributes' domains.
Assume the DB exists; in reality, we create tables, insert tuples, etc.
The DBMS manages storage, integrity/security, crash recovery, etc.

3 Relational Database Example
snap (StudentID, Name, Address, Phone): 12345 C. Brown 12 Apple St.; 67890 L. Van Pelt 34 Pear Ave.; 22222 P. Patty 56 Grape Blvd.; 33333 Snoopy
cr (Course, Room): CS101 Turing Aud.; EE200 25 Ohm Hall; PH100 Newton Lab.
cp (Course, Prerequisite): CS101 CS100 EE200 EE005 CS120 CS121 CS205 CS206
cdh (Course, Day, Hour): CS101 M 9AM W F EE200 Tu 10AM 1PM Th PH100 11AM
csg (Course, StudentID, Grade): CS101 12345 A 67890 B EE200 C 22222 B+ 33333 A- PH100 C+

4 Relational Schemas
Each table has a schema:
Name
Set of attributes
Domain for each attribute
Example:
Names: snap, cp, cdh, cr, csg
Attributes: table headers
Domains: StudentID: integer; all the rest are strings, but we could be more specific (e.g. time, day, grade)

5 Relational Tables
Tables consist of n-tuples, where n is the arity or degree of the relation (i.e., the number of attributes).
Each n-tuple t ∈ D1 × D2 × … × Dn, where the Di's are the domains.
e.g., a 3-tuple t of cdh is an element of string × string × string, or string × day × time, or course × day × time, depending on how specific we make our domains.
A table is a set of tuples, all with the same schema, e.g., cdh ⊆ Dcourse × Dday × Dhour.

6 Tables & Keys
Because a table is a set of tuples, there are no duplicates.
There is always a set of attributes whose values uniquely identify a tuple (even if it is all of them).
A set of attributes whose values always uniquely identify a tuple constitutes a key.
Typically, one or two attributes make up a key.
Keys must be declared: we cannot assume uniqueness, e.g., Name is not a key; there could be another C. Brown.
Some systems add a tuple identifier as the key.
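
The point that keys must be declared rather than assumed can be illustrated with a small check. This is only a sketch: the function name `is_key_for_instance` and the three sample rows are assumptions made for illustration, and the check can only refute a candidate key on the current data, never prove it for all future data.

```python
def is_key_for_instance(rows, attrs):
    """Return True if no two rows in this instance agree on all of attrs."""
    seen = set()
    for r in rows:
        k = tuple(r[a] for a in attrs)
        if k in seen:
            return False  # two rows share the same values: not a key
        seen.add(k)
    return True

# A few cdh rows (assumed sample data).
cdh = [
    {"Course": "CS101", "Day": "M", "Hour": "9AM"},
    {"Course": "CS101", "Day": "W", "Hour": "9AM"},
    {"Course": "EE200", "Day": "Tu", "Hour": "10AM"},
]

course_alone = is_key_for_instance(cdh, ["Course"])
course_day = is_key_for_instance(cdh, ["Course", "Day"])
print(course_alone, course_day)
```

Passing the check on one instance does not make (Course, Day) a key; that still depends on whether a course may meet twice on the same day.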

7 Keys: Examples
Table: Key
snap: StudentID; Name, Address, Phone (possible key?)
cp: Course, Prerequisite
cdh: Course, Day, Hour. Can a course meet twice on the same day? If not: Course, Day
cr: Course, Room. Does a course always meet in the same room? If so: Course
csg: Course, StudentID

8 Predicates and Tuples
A table name for tuples of arity n is an n-place predicate.
cdh('CS101','M','9AM') asserts that CS101 meets on Monday at 9:00 am.
Predicates give each tuple a meaning in the ordinary sense of predicates.
The elements of D1 × D2 × … × Dn present in the database are those assigned T; all others are assigned F (closed-world assumption, CWA).
Interpretation: a domain and, for each predicate and every substitution, T or F.
Every relation is a set, and every set is a predicate. Hence every relation is a predicate (and vice versa).

9 Database Tuples
Database tuples (strictly speaking) are not true elements of D1 × D2 × … × Dn, because we can alter the column order if we do so "correctly".
More properly defined, a tuple in a relation is a set of attribute-value pairs, e.g.
{(Course, 'CS101'), (Day, 'M'), (Hour, '9AM')} = {(Day, 'M'), (Course, 'CS101'), (Hour, '9AM')}
Normally, we factor out the attribute and fix the order.
Implication: we can interchange columns.
cr =
  Course Room
  CS101 Turing Aud.
  EE200 25 Ohm Hall
  PH100 Newton Lab.
=
  Room Course
  Turing Aud. CS101
  25 Ohm Hall EE200
  Newton Lab. PH100
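
The attribute-value view of a tuple can be tried out directly. The sketch below models a tuple as a Python `frozenset` of (attribute, value) pairs; that representation is an assumption chosen for illustration, not something the slides prescribe.

```python
# A tuple modeled as a set of (attribute, value) pairs: order is irrelevant.
t1 = frozenset({("Course", "CS101"), ("Day", "M"), ("Hour", "9AM")})
t2 = frozenset({("Day", "M"), ("Course", "CS101"), ("Hour", "9AM")})
pairs_equal = (t1 == t2)

# Positional tuples, by contrast, depend on column order.
positional_equal = (("CS101", "M", "9AM") == ("M", "CS101", "9AM"))

# Interchanging the columns of cr leaves the relation unchanged.
cr_course_room = {
    frozenset({("Course", c), ("Room", r)})
    for c, r in [("CS101", "Turing Aud."), ("EE200", "25 Ohm Hall")]
}
cr_room_course = {
    frozenset({("Room", r), ("Course", c)})
    for r, c in [("Turing Aud.", "CS101"), ("25 Ohm Hall", "EE200")]
}
columns_interchangeable = (cr_course_room == cr_room_course)
print(pairs_equal, positional_equal, columns_interchangeable)
```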

10 Relational Algebra

11 Relational Algebra
What is an algebra?
a pair: (set of values, set of operations); compare ADT, type, class, object
e.g., stack: (set of all stacks, {pop, push, top, …})
integer: (set of all integers, {+, −, *, ÷})
What is relational algebra?
(set of relations, set of relational operators)
Operators: ∪, ∩, −, ρ, σ, π, ×, |×|

12 Relational Algebra is Closed
Closed: all operations produce values in the value set.
(reals, {+, *, −}): closed
(reals, {+, *, −, ÷}): not closed (divide by 0)
(reals, {+, *, >}): not closed (T/F not in value set)
(computer reals, {+, *, −}): not closed (overflow, roundoff)
(relations, relational operators): closed
Implication: we can always nest (or compose) relational operators; we can't for algebras that are not closed.
When the syntax is incorrect, we can't ask whether the operation produces a value in the value set. The preconditions for relational algebra can be checked syntactically (assuming we don't use variables for attributes and values, which we don't). Thus, saying it is "closed" should be proper. It is common to use variables for values in arithmetic, however, so the same cannot be said for divide by zero.

13 Set Operations: ∪, ∩, and −
Relations are sets; thus set operations work. Examples:
R =
  A B
  1 2
  2 2
  2 3
S =
  A B
  2 2
  2 3
  4 2
  5 5
R ∪ S =
  A B
  1 2
  2 2
  2 3
  4 2
  5 5
R ∩ S =
  A B
  2 2
  2 3
R − S =
  A B
  1 2
S − R =
  A B
  4 2
  5 5

14 Set Operations (cont'd)
Definition: schema(R) = {A, B} = AB, i.e. the set of attributes. We sometimes write R(AB) to mean the relation R with schema AB.
Definition: union compatible: schema(R) = schema(S); required precondition for ∪, ∩, −.
Definitions:
R ∪ S = { t | t ∈ R ∨ t ∈ S }
R ∩ S = { t | t ∈ R ∧ t ∈ S }
R − S = { t | t ∈ R ∧ t ∉ S }
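
The set operations and the union-compatibility precondition can be sketched with Python's built-in sets. The helpers `row`, `schema`, and `check_union_compatible` are hypothetical names introduced here; tuples are modeled as frozensets of (attribute, value) pairs, which is an assumption of this sketch.

```python
def row(**attrs):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

def schema(rel):
    """Attribute set of a non-empty relation (all rows assumed alike)."""
    return {a for a, _ in next(iter(rel))}

def check_union_compatible(r, s):
    """Precondition for union, intersection, and difference."""
    if schema(r) != schema(s):
        raise ValueError("not union compatible")

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}
S = {row(A=2, B=2), row(A=2, B=3), row(A=4, B=2), row(A=5, B=5)}

check_union_compatible(R, S)
union = R | S        # R ∪ S
inter = R & S        # R ∩ S
r_minus_s = R - S    # R − S
s_minus_r = S - R    # S − R
print(len(union), len(inter), len(r_minus_s), len(s_minus_r))
```

One limitation of this representation: an empty relation carries no schema, so `schema` only works on non-empty inputs.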

15 Tuple Restriction: [X]
Restriction is a tuple operator (not a relational operator).
t[X] restricts tuple t to the attributes in X.
Example: for a tuple t over attributes A B C, t[A] = (1) and t[AC] = (1,3).

16 Renaming: ρ
ρ_{A←B} R renames attribute A to be B in R.
A must be in schema(R); B must not be in schema(R).
Example: let
R =
  A B
  1 2
  2 2
  2 3
Q =
  A C
  2 2
  3 2
R ∪ Q = ? Not union compatible.
But with ρ:
ρ_{C←B} Q =
  A B
  2 2
  3 2
R ∪ ρ_{C←B} Q =
  A B
  1 2
  2 2
  2 3
  3 2

17 Renaming (cont'd)
Q = ρ_{A←B} R renames attribute A to B; the result is Q.
Precondition: A ∈ schema(R), B ∉ schema(R)
Postcondition: schema(Q) = (schema(R) − {A}) ∪ {B}
Q = { t' | ∃t (t ∈ R ∧ t' = (t − {(A, t[A])}) ∪ {(B, t[A])}) }
Example:
R = {{(A,1), (C,2)}, {(A,2), (C,2)}}
Q = ρ_{A←B} R = {{(B,1), (C,2)}, {(B,2), (C,2)}}
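
The renaming definition translates almost line for line into code. The following is a minimal sketch; the function name `rename` and the frozenset-of-pairs representation of tuples are assumptions made here for illustration.

```python
def rename(rel, old, new):
    """Rename attribute `old` to `new` in every tuple of rel."""
    out = set()
    for t in rel:
        d = dict(t)
        if old not in d:
            raise ValueError(f"{old!r} must be in the schema")
        if new in d:
            raise ValueError(f"{new!r} must not be in the schema")
        d[new] = d.pop(old)  # (t - {(A, t[A])}) ∪ {(B, t[A])}
        out.add(frozenset(d.items()))
    return out

R = {frozenset({("A", 1), ("C", 2)}), frozenset({("A", 2), ("C", 2)})}
Q = rename(R, "A", "B")
print(sorted(sorted(t) for t in Q))
```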

18 Selection: σ
The selection operation selects the tuples that satisfy a condition.
R =
  A B
  1 2
  2 2
  2 3
σ_{A=1} R =
  A B
  1 2
σ_{B=2} R =
  A B
  1 2
  2 2
σ_{A=2 ∧ B≥2} R =
  A B
  2 2
  2 3
σ_{A=3} R =
  A B
Note: empty, but still retains the schema.
σ_P R = { t | t ∈ R ∧ P(t) }
Meaning: apply predicate P to tuple t by substituting the appropriate values of t into P.
Precondition: each attribute mentioned in P must be in schema(R).
Postcondition: σ_P R = { t | t ∈ R ∧ P(t) }, schema(σ_P R) = schema(R)
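
Selection can be sketched as a filter over a set of tuples. The names `row` and `select` are hypothetical helpers, and predicates are passed as Python functions over a dict view of each tuple, which is an assumption of this sketch.

```python
def row(**attrs):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

def select(rel, pred):
    """Keep the tuples t of rel for which pred(t) holds; t is passed as a dict."""
    return {t for t in rel if pred(dict(t))}

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}

a_is_1 = select(R, lambda t: t["A"] == 1)
b_is_2 = select(R, lambda t: t["B"] == 2)
a_is_3 = select(R, lambda t: t["A"] == 3)
print(len(a_is_1), len(b_is_2), len(a_is_3))
```

Unlike the relational definition, the empty result here is a bare empty set and does not remember its schema; a fuller model would store the schema alongside the rows.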

19 Projection: π
The projection operation restricts the tuples in a relation to the attributes designated in the operation.
R =
  A B
  1 2
  2 2
  2 3
π_A R =
  A
  1
  2
π_B R =
  B
  2
  3
π_{AB} R = R = π_{A,B} R = π_{{A,B}} R
Q = A B C
π_{BC} Q =
  B C
  1 1
  4 5
Precondition: X ⊆ schema(R)
Postcondition: π_X R = { t' | ∃t (t ∈ R ∧ t' = t[X]) }, schema(π_X R) = X
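
Projection restricts every tuple to the chosen attributes, and because the result is a set, duplicates collapse. The sketch below assumes tuples are frozensets of (attribute, value) pairs; `row` and `project` are illustrative names.

```python
def row(**attrs):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

def project(rel, attrs):
    """Restrict each tuple of rel to the attributes in attrs."""
    attrs = set(attrs)
    out = set()
    for t in rel:
        if not attrs <= {a for a, _ in t}:
            raise ValueError("attrs must be a subset of the schema")
        out.add(frozenset((a, v) for a, v in t if a in attrs))
    return out

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}

proj_a = project(R, {"A"})
proj_b = project(R, {"B"})
proj_ab = project(R, {"A", "B"})
print(len(proj_a), len(proj_b), proj_ab == R)
```

Three input tuples yield only two output tuples in each single-attribute projection, since the restricted tuples (2) for A and (2) for B each occur twice.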

20 Practice Exercises PE1 Using the following database, compute: Trail Name Characteristics Activity2 Description=Waterfall Recreational features Feature=Vista Characteristics Trails Name=North Fork and Forest-ID=2 Trails

21 Cross Product: ×
Standard Cartesian product adapted for relational algebra.
R =
  A B
  1 2
  2 2
S =
  C D
  1 1
  2 2
  3 3
R × S =
  A B C D
  1 2 1 1
  1 2 2 2
  1 2 3 3
  2 2 1 1
  2 2 2 2
  2 2 3 3

22 Cross Product (cont'd)
Precondition: schema(R) ∩ schema(S) = ∅
Postcondition: R × S = { t | ∃t' ∃t'' (t' ∈ R ∧ t'' ∈ S ∧ t = t' ∪ t'') }, schema(R × S) = schema(R) ∪ schema(S)
Example:
R =
  A B
  1 2   = t'
  2 2
t' = { (A,1), (B,2) }
S =
  C D
  1 1
  2 2
  3 3   = t''
t'' = { (C,3), (D,3) }
t' ∪ t'' = { (A,1), (B,2), (C,3), (D,3) }
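
With tuples represented as sets of attribute-value pairs, the cross product is the union of every pair of tuples. This is a sketch under that assumed representation; `row` and `cross` are hypothetical helper names.

```python
def row(**attrs):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

def cross(r, s):
    """Pair every tuple of r with every tuple of s (schemas must be disjoint)."""
    out = set()
    for t1 in r:
        for t2 in s:
            if {a for a, _ in t1} & {a for a, _ in t2}:
                raise ValueError("schemas must be disjoint")
            out.add(t1 | t2)  # t = t' ∪ t''
    return out

R = {row(A=1, B=2), row(A=2, B=2)}
S = {row(C=1, D=1), row(C=2, D=2), row(C=3, D=3)}

RxS = cross(R, S)
print(len(RxS))
```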

23 Cross Product (cont'd)
What if R and S have the same attribute, e.g. A?
R =
  A B
  1 2
  2 2
S =
  C A
  1 1
  2 2
  3 3
Can't do cross product.
Solution: rename, ρ_{A←A'} S
R × ρ_{A←A'} S = A B C A'

24 Natural Join: |×|
R |×| S = π_{ABC} σ_{B=B'} (R × ρ_{B←B'} S)
R =
  A B
  1 2
  2 2
S =
  B C
  1 2
  2 1
  3 2
ρ_{B←B'} S =
  B' C
  1 2
  2 1
  3 2
R |×| S =
  A B C
  1 2 1
  2 2 1
Reading the expression from the inside out: renaming (ρ_{B←B'} S), cross product (R × …), selection (σ_{B=B'}), projection (π_{ABC}).

25 Natural Join (cont'd)
In general, we can equate 0, 1, 2, or more attributes using |×|.
A join is defined as:
schema(R |×| S) = schema(R) ∪ schema(S)
R |×| S = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S }
There are no preconditions: join always works.
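
The set-builder definition of the join can be sketched as a nested loop that merges every pair of tuples agreeing on their shared attributes. The helper names `row` and `natural_join` are assumptions, as is the frozenset-of-pairs tuple representation; a real DBMS would use a far more efficient join algorithm.

```python
def row(**attrs):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

def natural_join(r, s):
    """Merge each pair of tuples that agree on all shared attributes."""
    out = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            shared = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in shared):
                out.add(frozenset({**d1, **d2}.items()))
    return out

R = {row(A=1, B=2), row(A=2, B=2)}
S = {row(B=1, C=2), row(B=2, C=1), row(B=3, C=2)}
joined = natural_join(R, S)

# With no shared attributes the join degenerates to the full cross product.
T = {row(C=1, D=1), row(C=2, D=2), row(C=3, D=3)}
no_common = natural_join(R, T)
print(len(joined), len(no_common))
```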

26 Natural Join (cont'd)
0 attributes in common (full cross product): R = A B with rows (1 1), (2 3), (4 1); S = C D with rows (1 1), (1 5); R |×| S = A B C D
1 attribute in common: R = A B with rows (1 2), (2 2), (2 3); S = B C with rows (1 1), (2 2), (3 3); R |×| S = A B C
2 attributes in common: R = A B C; S = A B D; R |×| S = A B C D

27 Practice Exercises
PE1 Using the following database, compute:
π_{Department} COURSE |×| π_{Semester} SECTION
σ_{Department=Math} COURSE |×| σ_{Instructor=Anderson} SECTION
σ_{Credit_hours=4} COURSE |×| σ_{Year=05} SECTION

28 Relational Algebra Expressions
Relational operators are closed. Thus we can nest expressions:
R =
  A B
  1 2
  3 4
S = B C D
π_D σ_{C=5} (R |×| S) = π_D (a relation over A B C D) = D 1 4
Unary operators have precedence over binary operators; binary operators are left associative.
We can now do something very useful: ask and answer with relational algebra (almost) any query we can dream up.

29 Relational Algebra Queries
List the prerequisites for EE200.
π_{Prerequisite} σ_{Course='EE200'} cp =
  Prerequisite
  EE005
  CS100
When does CS101 meet?
π_{Day,Hour} σ_{Course='CS101'} cdh =
  Day Hour
  M 9AM
  W 9AM
  F 9AM
When and where does EE200 meet?
π_{Day,Hour,Room} σ_{Course='EE200'} (cdh |×| cr) =
  Day Hour Room
  Tu 10AM 25 Ohm Hall
  W 1PM 25 Ohm Hall
  Th 10AM 25 Ohm Hall
Our answers are in (cdh |×| cr). We select Course to be EE200. Then, project on Day, Hour, Room.
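
The three queries above can be run as nested function calls, which mirrors how closure lets operators compose. This is a sketch only: the helpers `rel`, `select`, `project`, and `natural_join` are hypothetical, and the data is limited to the rows that these queries' answers confirm.

```python
def rel(attrs, rows):
    """Build a relation: each row becomes a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, r)) for r in rows}

def select(r, pred):
    return {t for t in r if pred(dict(t))}

def project(r, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def natural_join(r, s):
    out = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

cp = rel(("Course", "Prerequisite"),
         [("CS101", "CS100"), ("EE200", "EE005"), ("EE200", "CS100")])
cdh = rel(("Course", "Day", "Hour"),
          [("CS101", "M", "9AM"), ("CS101", "W", "9AM"), ("CS101", "F", "9AM"),
           ("EE200", "Tu", "10AM"), ("EE200", "W", "1PM"), ("EE200", "Th", "10AM")])
cr = rel(("Course", "Room"),
         [("CS101", "Turing Aud."), ("EE200", "25 Ohm Hall"), ("PH100", "Newton Lab.")])

# Prerequisites for EE200.
prereqs = project(select(cp, lambda t: t["Course"] == "EE200"), {"Prerequisite"})
# When does CS101 meet?
cs101_times = project(select(cdh, lambda t: t["Course"] == "CS101"), {"Day", "Hour"})
# When and where does EE200 meet?
ee200 = project(select(natural_join(cdh, cr), lambda t: t["Course"] == "EE200"),
                {"Day", "Hour", "Room"})
print(len(prereqs), len(cs101_times), len(ee200))
```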

30 Practice Exercises PE1 Using the following database, write queries in relational algebra for: 1. Find the titles of all of the books published by New Moon Books 2. Find all of the publishers of Johnson White

31 Query Optimization (1)
Where can I find Snoopy at 9 am on Monday?
snap(StudentID, Name='Snoopy', Address, Phone)
csg(Course, StudentID, Grade)
cr(Course, Room*)
cdh(Course, Day='M', Hour='9AM')
π_{Room} σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap |×| csg |×| cr |×| cdh) =
  Room
  Turing Aud.
Can we rewrite the query more optimally?

32 Query Optimization (2)
"Intuitively" we can write
π_{Room} σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap |×| csg |×| cr |×| cdh)
as
π_{Room} (σ_{Name='Snoopy'} snap |×| csg |×| cr |×| σ_{Day='M' ∧ Hour='9AM'} cdh)
Why does this execute faster?
What laws hold that will let us do this?
R |×| S = S |×| R
σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E
σ_P (R |×| S) = R |×| σ_P S (if all the attributes of P are in S)
How do we know they hold?

33 Proofs for Laws (1)
(Derive the right-hand side from the left-hand side.)
We can prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E as follows:
σ_{P1 ∧ P2} E
= { t | t ∈ E ∧ (P1 ∧ P2)(t) }   [def.: σ_P R = { t | t ∈ R ∧ P(t) }]
= { t | t ∈ E ∧ P1(t) ∧ P2(t) }   [identical substitutions & operations]
= { t | t ∈ E ∧ P2(t) ∧ P1(t) }   [commutative]
= { t | t ∈ σ_{P2} E ∧ P1(t) }   [def. of σ]
= { t | t ∈ σ_{P1} σ_{P2} E }   [def. of σ]
= σ_{P1} σ_{P2} E   [def. of a relation]

34 Proofs for Laws (2)
To prove σ_P (R |×| S) = R |×| σ_P S, where all attributes of P are in S, we again need to prove that two sets are equal. As before, we can convert the lhs to the rhs.
σ_P (R |×| S)
= { t | t ∈ σ_P (R |×| S) }   [def. of a relation]
= { t | t ∈ R |×| S ∧ P(t) }   [def.: σ_P R = { t | t ∈ R ∧ P(t) }]
= { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t) }   [def.: R |×| S = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S }]
= { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t[schema(S)]) }   [all attributes of P are in S]
= { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ σ_P S }   [def. of σ]
= { t | t ∈ R |×| σ_P S }   [def. of |×|]
= R |×| σ_P S   [def. of a relation]

35 Deductive Databases

36 Deductive Databases Deductive database management systems answer queries based on methods related to proofs from rules and facts. Relational algebra operations can be used to more efficiently process deductive database queries.

37 Correspondence Between Relational DBs & Deductive DBs (I)
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(y,x). b(x,y) :- f(x,z),f(z,y).
Queries: b(1,x)?
Relational equivalents:
Schemes: f(A, B), b(C, D)
Rule 1: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} ρ_{B←x} f)
Rule 2: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←x} ρ_{B←z} f |×| ρ_{A←z} ρ_{B←y} f)
Query: π_x ρ_{D←x} σ_{C=1} b

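40 Correspondence Between Relational DBs & Deductive DBs (IV)
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(y,x). b(x,y) :- f(x,z),f(z,y).
Queries: b(1,x)?
Relational equivalents:
Schemes: f(A, B), b(C, D)
Rule 1: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} ρ_{B←x} f)
Rule 2: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←x} ρ_{B←z} f |×| ρ_{A←z} ρ_{B←y} f)
Query: π_x ρ_{D←x} σ_{C=1} b
Result: the relation x with rows 1, 2, 3, or equivalently x = 1, x = 2, x = 3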
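
The two rules and the query b(1,x) can be evaluated with set comprehensions, where a repeated variable plays the role of the join condition. This is a simplified sketch: facts are stored as positional pairs rather than attribute-named tuples, so the renaming steps of the relational expressions become plain variable binding.

```python
# Facts f(A, B) stored as positional pairs (a, b).
f = {(1, 1), (1, 2), (2, 3)}

# Rule 1: b(x,y) :- f(y,x).   Swap the two columns of f.
rule1 = {(x, y) for (y, x) in f}

# Rule 2: b(x,y) :- f(x,z), f(z,y).   Join f with itself on the shared variable z.
rule2 = {(x, y) for (x, z) in f for (z2, y) in f if z == z2}

b = rule1 | rule2

# Query b(1,x)?   Select on the first column, keep the second.
answers = sorted(y for (x, y) in b if x == 1)
print(answers)
```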

41 Correspondence Between Relational DBs & Deductive DBs (V)
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(y,x). b(x,y) :- f(y,y),f(2,x).
Queries: b(1,x)?
Relational equivalents:
Schemes: f(A, B), b(C, D)
Rule 1: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} ρ_{B←x} f)
Rule 2: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} π_A σ_{A=B} f |×| ρ_{B←x} π_B σ_{A=2} f)
Query: π_x ρ_{D←x} σ_{C=1} b

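43 Correspondence Between Relational DBs & Deductive DBs (VII)
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(y,x). b(x,y) :- f(y,y),f(2,x).
Queries: b(1,x)?
Relational equivalents:
Schemes: f(A, B), b(C, D)
Rule 1: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} ρ_{B←x} f)
Rule 2: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} π_A σ_{A=B} f |×| ρ_{B←x} π_B σ_{A=2} f)
Query: π_x ρ_{D←x} σ_{C=1} b
Result: the relation x with the single row 1, or equivalently x = 1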

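44 Correspondence Between Relational DBs & Deductive DBs (VIII)
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(y,x). b(x,y) :- f(y,y),f(2,x).
Queries: b(x,x)?
Relational equivalents:
Schemes: f(A, B), b(C, D)
Rule 1: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} ρ_{B←x} f)
Rule 2: b ← b ∪ ρ_{x←C} ρ_{y←D} π_{xy} (ρ_{A←y} π_A σ_{A=B} f |×| ρ_{B←x} π_B σ_{A=2} f)
Query: π_x ρ_{C←x} σ_{C=D} b
Result: the relation x with the single row 1, or equivalently x = 1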
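
The second rule set, with a repeated variable f(y,y) and a constant f(2,x), can be evaluated the same way: the repeated variable becomes a selection A = B and the constant becomes a selection A = 2. A minimal sketch, again assuming facts are stored as positional pairs:

```python
f = {(1, 1), (1, 2), (2, 3)}

# Rule 1: b(x,y) :- f(y,x).
rule1 = {(x, y) for (y, x) in f}

# Rule 2: b(x,y) :- f(y,y), f(2,x).
ys = {a for (a, b_) in f if a == b_}   # select A = B, project A
xs = {b_ for (a, b_) in f if a == 2}   # select A = 2, project B
rule2 = {(x, y) for x in xs for y in ys}  # no shared variable: cross product

b = rule1 | rule2

b_1_x = sorted(y for (x, y) in b if x == 1)   # query b(1,x)?
b_x_x = sorted(x for (x, y) in b if x == y)   # query b(x,x)?
print(b_1_x, b_x_x)
```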

45 Rule Order & Recursive Rules
Rules may appear in any order. Rules may be recursive. To handle this, we repeatedly execute rules in any order and quit when we deduce no new facts (least fixed point algorithm).

46 Least Fixed Point Algorithm
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(x,y). b(x,y) :- f(x,z),b(z,y).
Queries: b(1,3)?
Least fixed point algorithm: start with an empty set S; add to S repeatedly until no more changes.
Essentially:
b ← { }
repeat:
  b ← b ∪ f
  b ← b ∪ (f |×| b)
until no change
result ← "yes" if not empty: σ_{C=1 ∧ D=3} b

47 Set Up
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(x,y). b(x,y) :- f(x,z),b(z,y).
Queries: b(1,3)?
Before unioning, sets must be union compatible.
Before joining, attributes must observe variable sameness and differences.
b ← { }
repeat:
  b ← b ∪ ρ_{A←C} ρ_{B←D} f
  b ← b ∪ π_{CD} ρ_{A←C} (f |×| ρ_{C←B} b)
until no change
result ← "yes" if not empty: σ_{C=1 ∧ D=3} b

48 1st Iteration
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(x,y). b(x,y) :- f(x,z),b(z,y).
Queries: b(1,3)?
b ← { }
repeat:
  b ← b ∪ ρ_{A←C} ρ_{B←D} f
  b ← b ∪ π_{CD} ρ_{A←C} (f |×| ρ_{C←B} b)
until no change
result ← "yes" if not empty: σ_{C=1 ∧ D=3} b
Step 1: b(C,D) ← b(C,D) ∪ ρ_{A←C} ρ_{B←D} f(A,B)
ρ_{A←C} ρ_{B←D} f = f(C,D) =
  C D
  1 1
  1 2
  2 3
b(C,D) =
  C D
  1 1
  1 2
  2 3
Step 2: b(C,D) ← b(C,D) ∪ π_{CD} ρ_{A←C} (f(A,B) |×| ρ_{C←B} b(C,D))
f(A,B) |×| b(B,D) = t(A,B,D) =
  A B D
  1 1 1
  1 1 2
  1 2 3
ρ_{A←C} t = t(C,B,D); π_{CD} t =
  C D
  1 1
  1 2
  1 3
b(C,D) =
  C D
  1 1
  1 2
  2 3
  1 3
b changed! So, repeat.

49 2nd Iteration
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(x,y). b(x,y) :- f(x,z),b(z,y).
Queries: b(1,3)?
b ← { }
repeat:
  b ← b ∪ ρ_{A←C} ρ_{B←D} f
  b ← b ∪ π_{CD} ρ_{A←C} (f |×| ρ_{C←B} b)
until no change
result ← "yes" if not empty: σ_{C=1 ∧ D=3} b
Step 1: b(C,D) ← b(C,D) ∪ ρ_{A←C} ρ_{B←D} f(A,B)
f hasn't changed, so no change here.
Step 2: b(C,D) ← b(C,D) ∪ π_{CD} ρ_{A←C} (f(A,B) |×| ρ_{C←B} b(C,D))
f(A,B) |×| b(B,D) = t(A,B,D), renamed to t(C,B,D); π_{CD} t =
  C D
  1 1
  1 2
  1 3
Every projected tuple is a duplicate of one already in b.
b(C,D) =
  C D
  1 1
  1 2
  2 3
  1 3
b did NOT change, so done.

50 Query Finalization
Schemes: f(A,B) b(C,D)
Facts: f(1,1). f(1,2). f(2,3).
Rules: b(x,y) :- f(x,y). b(x,y) :- f(x,z),b(z,y).
Queries: b(1,3)?
b ← { }
repeat:
  b ← b ∪ ρ_{A←C} ρ_{B←D} f
  b ← b ∪ π_{CD} ρ_{A←C} (f |×| ρ_{C←B} b)
until no change
result ← "yes" if not empty: σ_{C=1 ∧ D=3} b
So, finally…
b(1,3)? σ_{C=1 ∧ D=3} b(C,D) =
  C D
  1 3
= not empty = Yes
b(2,2)? No
b(1,X)? Yes: X = 1, X = 2, X = 3
b(2,X)? Yes: X = 3
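
The repeat-until-no-change loop traced over the two iterations can be sketched in a few lines. This is a naive illustration with assumed names (`least_fixed_point`, positional pairs for facts); it recomputes every rule on each pass and is not meant as an efficient evaluation strategy.

```python
def least_fixed_point(f):
    """Apply both rules repeatedly until b stops changing."""
    b = set()
    while True:
        new = set(b)
        # Rule 1: b(x,y) :- f(x,y).
        new |= f
        # Rule 2: b(x,y) :- f(x,z), b(z,y).   Join f with b on z.
        new |= {(x, y) for (x, z) in f for (z2, y) in new if z == z2}
        if new == b:
            return b   # no new facts deduced: fixed point reached
        b = new

f = {(1, 1), (1, 2), (2, 3)}
b = least_fixed_point(f)

b_1_3 = (1, 3) in b                            # b(1,3)?
b_2_2 = (2, 2) in b                            # b(2,2)?
b_1_X = sorted(y for (x, y) in b if x == 1)    # b(1,X)?
b_2_X = sorted(y for (x, y) in b if x == 2)    # b(2,X)?
print(b_1_3, b_2_2, b_1_X, b_2_X)
```

The loop terminates because b can only grow and is bounded by the finite set of pairs over the values appearing in f.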

