Published by Cameron Greer. Modified over 9 years ago.
1
Extended Relational Algebra
Lu Chaojun, SJTU
2
Bag Semantics
A relation (in SQL, at least) is really a bag (or multiset).
– It may contain the same tuple more than once.
– There is no specified order (unlike a list).
Select, project, and join work for bags as well as sets.
– They work on a tuple-by-tuple basis and don't eliminate duplicates.
3
Why Bags?
Efficient implementation
– e.g. projection and union can skip duplicate elimination
– Q: How to eliminate duplicates?
Some queries need bags
– e.g. aggregation: find the average grade
4
Bag Union
R ∪ S: Sum the number of times an element appears in the two bags, i.e. if t appears n times in R and m times in S, then t appears n+m times in R ∪ S.
Example: {1,2,1} ∪ {1,2,3} = {1,1,1,2,2,3}.
5
Bag Intersection
R ∩ S: Take the minimum of the number of occurrences in each bag, i.e. if t appears n times in R and m times in S, then t appears min(n,m) times in R ∩ S.
Example: {1,2,1} ∩ {1,2,3,3} = {1,2}.
6
Bag Difference
R − S: Proper-subtract the number of occurrences in the two bags, i.e. if t appears n times in R and m times in S, then t appears max(0, n−m) times in R − S.
Example: {1,2,1} − {1,2,3,3} = {1}.
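The three bag operations above can be tried out with a minimal Python sketch. It assumes a bag is modeled as a `collections.Counter` mapping each element to its number of occurrences; the function names are illustrative only and are not part of the slides.

```python
from collections import Counter

def bag_union(r, s):
    # t appears n + m times: Counter addition sums the counts.
    return r + s

def bag_intersection(r, s):
    # t appears min(n, m) times: Counter "&" keeps the smaller count.
    return r & s

def bag_difference(r, s):
    # t appears max(0, n - m) times: Counter "-" drops non-positive counts.
    return r - s

R = Counter([1, 2, 1])
S1 = Counter([1, 2, 3])
S2 = Counter([1, 2, 3, 3])

union = sorted(bag_union(R, S1).elements())
intersection = sorted(bag_intersection(R, S2).elements())
difference = sorted(bag_difference(R, S2).elements())
```

With the bags from the slide examples, `union`, `intersection`, and `difference` reproduce the results stated on slides 4 to 6.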
7
Other Operators on Bags
Projection, selection, product, join
– No duplicate elimination
8
Extensions to Relational Model
These are not part of the formal relational model, but they appear in real query languages like SQL.
– Modification: insert, delete, update
– Aggregation: count, sum, average
– Views
– Null values
9
Extended RA
Duplicate-elimination operator
Sorting operator
Extended projection
Grouping-and-aggregation operator
Outerjoin operator
10
Duplicate Elimination
δ(R) = relation with one copy of each tuple that appears one or more times in R.
11
Aggregation Operators
These are not relational operators; rather, they summarize a column in some way.
Five standard operators: Sum, Average, Count, Min, and Max.
12
Grouping Operator
γ_L(R), where L is a list of elements that are either
– individual (grouping) attributes, or
– of the form θ(A), where θ is an aggregation operator and A is the attribute to which it is applied.
Example: γ_{sno, AVG(grade)}(SC)
13
Grouping Operator (cont.)
γ_L(R) is computed by:
1. Group R according to all the grouping attributes on list L.
2. Within each group, compute θ(A) for each aggregation element θ(A) on list L.
3. The result is the relation that consists of one tuple for each group. The components of that tuple are the values associated with each element of L for that group.
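The three steps above can be sketched in Python for the sno, AVG(grade) example over SC. This is only an illustration: the rows of `SC` are made-up sample data, relations are assumed to be lists of dicts, and `group_avg` is a hypothetical helper that handles a single grouping attribute and the AVG operator only.

```python
from collections import defaultdict

def group_avg(rows, group_attr, agg_attr):
    # Step 1: group the rows by the grouping attribute.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_attr]].append(row[agg_attr])
    # Steps 2 and 3: compute AVG within each group and emit one tuple per group.
    return [(key, sum(vals) / len(vals)) for key, vals in groups.items()]

# Hypothetical sample data for SC(sno, cno, grade).
SC = [
    {"sno": "s1", "cno": "c1", "grade": 90},
    {"sno": "s1", "cno": "c2", "grade": 80},
    {"sno": "s2", "cno": "c1", "grade": 70},
]

result = group_avg(SC, "sno", "grade")
```

Each student number appears exactly once in `result`, paired with that student's average grade.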
14
Extended Projection
Allow the columns in the projection to be functions of one or more columns in the argument relation.
Example: π_{name, 2011−age}(Student)
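As a rough illustration of the example above (name together with 2011 minus age), extended projection can be sketched in Python as applying one function per output column to every tuple. The `Student` rows are invented sample data and `extended_project` is a hypothetical helper; following bag semantics, it does not eliminate duplicates.

```python
def extended_project(rows, *column_fns):
    # Build one output tuple per input tuple; each output column is
    # computed by a function of the input tuple.
    return [tuple(fn(row) for fn in column_fns) for row in rows]

# Hypothetical sample data for Student(name, age).
Student = [
    {"name": "Ann", "age": 20},
    {"name": "Bob", "age": 22},
]

result = extended_project(
    Student,
    lambda r: r["name"],        # plain attribute
    lambda r: 2011 - r["age"],  # computed column
)
```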
15
Sorting
τ_L(R) = list of tuples of R, ordered according to the attributes on list L.
Note that the result type is outside the normal types (set or bag) for relational algebra.
– Consequence: it cannot be followed by other relational operators.
16
Outerjoin
The normal join can lose information, because a tuple that doesn't join with any tuple from the other relation is dangling and does not appear in the result.
The null value can be used to pad dangling tuples so they appear in the join.
Outerjoin operator: ⋈ᵒ
Variations: theta-outerjoin, left- and right-outerjoin (pad only dangling tuples from the left (resp. right) relation).
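The padding of dangling tuples can be sketched in Python as follows. This is a minimal natural full outerjoin under stated assumptions: relations are lists of dicts, `None` stands in for the null value, and `full_outerjoin` with its sample relations `R(A,B)` and `S(B,C)` is hypothetical, not taken from the slides.

```python
def full_outerjoin(r, s, r_attrs, s_attrs):
    # Join on the attributes the two schemas have in common.
    common = [a for a in r_attrs if a in s_attrs]
    result = []
    matched_s = set()
    for t in r:
        found = False
        for j, u in enumerate(s):
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})
                matched_s.add(j)
                found = True
        if not found:
            # Dangling tuple from the left: pad the right-only attributes with None.
            result.append({**{a: None for a in s_attrs}, **t})
    for j, u in enumerate(s):
        if j not in matched_s:
            # Dangling tuple from the right: pad the left-only attributes with None.
            result.append({**{a: None for a in r_attrs}, **u})
    return result

R = [{"A": 1, "B": 2}, {"A": 4, "B": 5}]
S = [{"B": 2, "C": 3}, {"B": 6, "C": 7}]
joined = full_outerjoin(R, S, ["A", "B"], ["B", "C"])
```

Dropping the final loop would give a left outerjoin, and dropping the `if not found` branch would give a right outerjoin.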
17
A Logic for Relations: Datalog
18
Introduction
A query language for the relational model may be based on
– Algebra: relational algebra
– Logic: relational calculus, e.g. Datalog
  More natural for recursive queries
19
Predicates and Atoms
RDB vs. Datalog:
RDB                    Datalog
relation R( )          predicate R( )
attributes (tuples)    arguments x
schema R(X)            (relational) atom R(x)
tuple t ∈ R            R(t) is TRUE
– R(x) is a boolean-valued function if x contains variables; otherwise it is a proposition.
20
Arithmetic Atoms
Comparison between two arithmetic expressions: exp1 θ exp2, where θ is a comparison operator such as < or >=
– Can be viewed as a predicate θ(exp1, exp2)
– Its relation is infinite and unchanging
21
Datalog Rules
Example:
Happy(sno) ← S(sno,n,a,d) AND SC(sno,cno,g) AND g>=95 AND C(cno,cn) AND cn='Database'
Rules: Head ← Body
– Head: relational atom
– Body: AND of subgoals
  Subgoal: atom or NOT atom
  Atom: P(arg), where P is a relation name or an arithmetic predicate; arg may be a variable or a constant
– ← is read "if"; it may also be written :-
22
Datalog Rules (cont.)
Query: a collection of one or more rules
Result: a relation appearing in rule heads
– Designate the intended answer when more than one relation appears in rule heads
23
Meaning of Datalog Rules
Meaning I:
– Assign possible values to the variables in the rule.
– If the assignment makes all the subgoals TRUE, then the values assigned to the head's variables form a tuple of the result relation.
Meaning II:
– Consider consistent assignments of tuples to the nonnegated relational subgoals (see safety).
– Then consider the negated relational subgoals and the arithmetic subgoals, to see whether the assignment of values to variables makes them all TRUE. If so, a tuple is added to the result relation.
24
Example: Meaning I
S(x,y) ← R(x,z) AND R(z,y) AND NOT R(x,y)
R:  A  B
    1  2
    2  3
Consider all possible assignments:
1. x=1, z=2 makes R(x,z) TRUE; y=3 makes R(z,y) TRUE; NOT R(x,y) is TRUE; thus add (1,3) to S.
2. x=2, z=3 makes R(x,z) TRUE; no y makes R(z,y) TRUE.
S:  C  D
    1  3
25
Example: Meaning II
S(x,y) ← R(x,z) AND R(z,y) AND NOT R(x,y)
R:      A  B
    t1  1  2
    t2  2  3
Consider consistent assignments of tuples:
1. t1 for R(x,z), t1 for R(z,y)
2. t1 for R(x,z), t2 for R(z,y)
3. t2 for R(x,z), t1 for R(z,y)
4. t2 for R(x,z), t2 for R(z,y)
Only case 2 is a consistent assignment (both subgoals give z=2). It yields x=1, y=3, and NOT R(1,3) is TRUE, so (1,3) is added to S.
S:  C  D
    1  3
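The Meaning II evaluation of this rule can be sketched in Python. The sketch assumes set semantics, with R stored as a set of pairs; `eval_rule` is a hypothetical helper written for this one rule only, not a general Datalog evaluator.

```python
def eval_rule(R):
    # Evaluates S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y).
    S = set()
    for (x, z1) in R:        # tuple assigned to subgoal R(x,z)
        for (z2, y) in R:    # tuple assigned to subgoal R(z,y)
            if z1 != z2:
                continue     # inconsistent: z would get two different values
            if (x, y) not in R:  # negated subgoal NOT R(x,y)
                S.add((x, y))
    return S

R = {(1, 2), (2, 3)}
S = eval_rule(R)
```

Of the four tuple pairs tried by the nested loops, only the pair (1,2), (2,3) passes the consistency check, matching case 2 in the example.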
26
Safety
Every variable in the rule must appear in some nonnegated relational subgoal.
– This makes the result a finite relation.
Example: safety violations
1. S(x) ← R(y)              x does not appear in any subgoal
2. S(x) ← NOT R(x)          x does not appear in a nonnegated subgoal
3. S(x) ← R(y) AND x < y    x does not appear in a relational subgoal
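The safety condition can be checked mechanically. The sketch below assumes a made-up rule encoding: the head is a list of variable names, and each subgoal is a triple (kind, negated, variables) with kind either "rel" or "arith". Both the encoding and the name `is_safe` are illustrative assumptions.

```python
def is_safe(head_vars, subgoals):
    # Collect every variable in the rule, and separately those that occur
    # in some nonnegated relational subgoal.
    all_vars = set(head_vars)
    positive = set()
    for kind, negated, variables in subgoals:
        all_vars.update(variables)
        if kind == "rel" and not negated:
            positive.update(variables)
    # Safe iff every variable is covered by a nonnegated relational subgoal.
    return all_vars <= positive

# The three violations listed above.
rule1 = is_safe(["x"], [("rel", False, ["y"])])
rule2 = is_safe(["x"], [("rel", True, ["x"])])
rule3 = is_safe(["x"], [("rel", False, ["y"]), ("arith", False, ["x", "y"])])

# S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y) is safe.
rule4 = is_safe(
    ["x", "y"],
    [("rel", False, ["x", "z"]), ("rel", False, ["z", "y"]), ("rel", True, ["x", "y"])],
)
```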
27
Datalog Program -- Query
A collection of rules
Predicates/relations are divided into two classes:
– Extensional predicates/relations (EDB): stored in the database
– Intensional predicates/relations (IDB): defined by rules
EDB predicates can't appear in the head, only in the body; IDB predicates can appear in the head, the body, or both.
28
Datalog Rules Applied to Bags
When there are no negated relational subgoals:
– Meaning I for evaluating Datalog rules applies to bags as well as sets.
– But for bags, Meaning II is simpler to evaluate.
When there are negated relational subgoals:
– There is no clearly defined meaning under the bag model.
29
From RA to Datalog
R ∩ S:    I(x) ← R(x) AND S(x)
R ∪ S:    I(x) ← R(x)
          I(x) ← S(x)
R − S:    I(x) ← R(x) AND NOT S(x)
π_A(R):   I(a) ← R(a,b)
30
From RA to Datalog (cont.)
σ_F(R):            I(x) ← R(x) AND F
σ_{C1 AND C2}(R):  I(x) ← R(x) AND C1 AND C2
σ_{C1 OR C2}(R):   I(x) ← R(x) AND C1
                   I(x) ← R(x) AND C2
R × S:             I(x,y) ← R(x) AND S(y)
R ⋈ S:             I(x,y,z) ← R(x,y) AND S(y,z)
31
Multiple Operations in Datalog
Create IDB predicates for intermediate relations.
Example:
A(x,y,z) ← R(x,y,z) AND x > 10
B(x,y,z) ← R(x,y,z) AND y = 'ok'
C(x,y,z) ← A(x,y,z) AND B(x,y,z)
D(x,z) ← C(x,y,z)
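To see what the four rules compute, each IDB predicate can be written as a Python set comprehension over the relation it depends on. This is a sketch under set semantics; the contents of `R` are invented sample data.

```python
# Hypothetical EDB relation R(x, y, z).
R = {(5, "ok", "a"), (11, "ok", "b"), (12, "no", "c"), (20, "ok", "b")}

# A(x,y,z) <- R(x,y,z) AND x > 10        (selection)
A = {(x, y, z) for (x, y, z) in R if x > 10}
# B(x,y,z) <- R(x,y,z) AND y = 'ok'      (selection)
B = {(x, y, z) for (x, y, z) in R if y == "ok"}
# C(x,y,z) <- A(x,y,z) AND B(x,y,z)      (intersection)
C = A & B
# D(x,z) <- C(x,y,z)                     (projection onto x and z)
D = {(x, z) for (x, y, z) in C}
```

In relational-algebra terms, D is the projection onto x and z of the intersection of the two selections.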
32
Expressive Power of Datalog
Non-recursive Datalog = RA
Datalog simulates SQL SELECT-FROM-WHERE without aggregation and grouping
Recursive Datalog is more powerful than RA and than SQL without recursion
None of them is Turing-complete, i.e. none has full expressive power
33
End