Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Relational Algebra Basic operators Relational algebra expression tree.

Similar presentations


Presentation on theme: "1 Relational Algebra Basic operators Relational algebra expression tree."— Presentation transcript:

1 1 Relational Algebra Basic operators Relational algebra expression tree

2 2 Queries Find all courses that “Mary” takes What happens behind the scene ? –Query processor figures out how to answer the query efficiently. SELECT C.name FROM Students S, Takes T, Courses C WHERE S.name=“Mary” and S.ssn = T.ssn and T.cid = C.cid

3 3 Queries, behind the scene Imperative query execution plan: SELECT C.name FROM Students S, Takes T, Courses C WHERE S.name=“Mary” and S.ssn = T.ssn and T.cid = C.cid SELECT C.name FROM Students S, Takes T, Courses C WHERE S.name=“Mary” and S.ssn = T.ssn and T.cid = C.cid Declarative SQL query Students Takes sid=sid sname name=“Mary” cid=cid Courses The optimizer chooses the best execution plan for a query

4 4 What is an “Algebra” A mathematical model consists of: –Operands --- variables or values from which new values can be constructed. –Operators --- symbols denoting procedures that construct new values from given values. i.e. the ability to produce new values.

5 5 What is Relational Algebra? Simply speaking, the relational algebra is a collection of operators that –take relations as their operands –and return a relation as their result. (closure properties allows nested expression!)

6 6 closure In mathematics, a set is said to be closed under some operation if the operation on members of the set produces a member of the set. For example, the real numbers are closed under subtraction. Similarly, a set is said to be closed under a collection of operations if it is closed under each of the operations individually. In relational algebra, we say the result of any operator is still a relation, thus in a closure, which is the foundation for sub-queries.

7 7 What is Relational Algebra? Its operands are relations or variables that represent relations. Comprises a basic set of relational model operations (represented by operators) which are designed to do the most common things that we need to do with relations in a database. –Sequences of relational algebra operations form relational algebra expressions whose result is also a relation. The result is an algebra that can be used as but not limited to a query language (at abstract level) for relations.

8 8 Basic idea There is a core relational algebra that has traditionally been thought of as the relational algebra. But there are several other operators we shall add to the core in order to support or model better the language SQL --- the principal language used in relational database systems.

9 9 Core Relational Algebra Two groups of operations –Set theoretic operations: Union, intersection, and difference. –Usual set operations, but require both operands have the same relation schema. –Relational model specific operations: Selection: picking certain rows. σ Projection: picking certain columns. π Products: cartesian product. * (also a kind of set operation, but more related to joins.) joins: compositions of relations. Division: / Pronounced Bow tie

10 10 Union, Intersection, Set difference operators These operations are binary: each is applied to two relations. In order to apply these operators to relations, the relations have to be union compatible. Two relations R(A1, …, An) and S(B1,…,Bn) are union compatible if they have same degree n and domain(Ai)=domain(Bi), 1<=1<=n.

11 11 Union, Intersection, Set difference operators In this part, we assume R and S are union compatible. Notation: –Union: R  S –Intersection R  S –Set difference: R  S

12 12 Union: R  S The result of R  S is a relation that includes all the tuples that are either in R or S or in both R and S. Duplicates are eliminated.

13 13 Union: R  S FNLN SusanShasha BillBush FnameLname SussanShasha MikeFord student instructor FNLN SusanShasha BillBush MikeFord Student  instructor

14 14 Intersection R  S The result of R  S is a relation that includes all the tuples that are in both R and S.

15 15 Intersection R  S FNLN SusanShasha BillBush FnameLname SussanShasha MikeFord student instructor FNLN SusanShasha Student  instructor

16 16 Set difference: R  S The result of R-S is a relation that includes all the tuples that are in R but not in S.

17 17 Set difference: R  S FNLN BillBush FnameLname MikeFord Instructor - student Student - instructor

18 18 Remarks Intersection and Union are commutative operations. –R  S = S  R –R  S = R  S Intersection and Union are associative operations. –R  (S  T) =( R  S)  T –R  (S  T) =( R  S)  T –Set difference is not a commutative operation. –In general, R-S !=S-R

19 19 Select operation (selection) The SELECT operation is used to select a subset of tuples from a relation that satisfy a selection condition. Form of the operation: σ C (R) –Where R is a relational algebra expression. The simplest expression is the name of a database relation. –The condition C is an arbitrary Boolean expression on the attributes of R.

20 20 The boolean expression is formed using clauses (atomic comparisons) of the form A op B or A op c, where A, B are attribute names, c is a constant, and op is one of the comparison operators {=, =,>,>=,!=} The resulting relation has the same attributes as R. The resulting relation includes only tuples in R whose attribute values satisfy the condition C.

21 21 select R1 <-σ C (R2) –C is a condition (as in “if” statements) that refers to attributes of R2. –R1 is all those tuples of R2 that satisfy C.

22 22 Example Relation tinyemployee: fnamelnamesalary JoeReale250 JoeBush275 SueGates250 BobBush300 AllBush<- σ lname=“Bush” (tinyemployee): fnamelnamesalary JoeBush275 Bob Bush300

23 23 Example σ ( DNO=4 and SALARY>2500) or (DNO=5 and salary>30000) (employee):

24 24 Remarks The degree (i.e. the number of attributes) of the resulting relation is the same as that of the input relation. The resulting relation is empty if no tuple satisfies the selection condition. The number of tuples of the resulting relation is less than or equal to that of the input relation. The select operation is commutative: σ C 2 (σ C1 (R)) =σ C 1 (σ C2 (R)) A sequence of select operations can be combined into a single select operation σ C 1 (σ C2 (R)) =σ C 1 and C2 (R)

25 25 Project operation (projection) The project operation selects some of the columns of a table while discarding the other columns. Form of operation: π L (R) –R is a relational algebra expression, –L is a list of attributes from the attributes of relation R. –The resulting relation has only those attributes of R specified in L and in the same order as they appear in the list. Eliminate duplicate tuples, if any so that is remains a mathematical set (set does not allow duplicated elements! However, we will have more discussion later about this point when we come to SQL’s query mechanism!).

26 26 Example Relation tinyemployee: fnamelnamesalary JoeReale250 JoeBush275 SueGates250 BobBush300 Lastnameandsalary<- π lname, salary (tinyemployee) lnamesalary Reale250 Bush275 Gates250 Bush300

27 27 Be aware that if several employees sharing the same last name and salary, only one tuple will be kept in the resulting relation for duplicated tuples.

28 28 Remarks The degree of the resulting relation is the number of attributes in the attribute list. The number of tuples of the resulting relation is less than or equal to that of the input relation: If the attribute list L is a superkey of R, the number of tuples in the resulting relation equals to the number of tuples in R. Commutativity does not hold on projection.

29 29 The Cartesian product operation The Cartesian product (Cross product, Cross join) is a binary operation and the input queries do not have to be union compatible. Notation: R x S Given R(A1,…, An) and S(B1,… Bm), the schema of the relation Q resulting by the operation RXS is: Q(A1,…An, B1, B….m). It has n+m attributes. The instance of Q has one tuple for each combination of tuples – one from R and one from S.

30 30 Cartesian Product R3 <- R1 x R2 –Pair each tuple t1 of R1 with each tuple t2 of R2. –Concatenation t1t2 is a tuple of R3. –Schema of R3 is the attributes of R1 and R2, in order. –But beware attribute A of the same name in R1 and R2: use R1.A and R2.A. Since relational model assume no two attributes with the same name can appear in a relation schema.

31 31 Example: R3 <- R1 X R2 R1(A,B ) 12 34 R2(B,C ) 56 78 910 R3(A,R1.B,R2.B,C ) 1256 1278 12910 3456 3478 34910

32 32 From Cartesian to join The Cartesian product is a meaningless operation on its own. It can combine related tuples from two relations if followed by the appropriate select operation (in which case it is called join operation). The join operation is used to select and combine related tuples from two relations into single tuples.

33 33 Theta-Join The most general form of the join operation is the Theta Join, which is similar to a Cartesian product followed by a selection. R3 <- R1 C R2 –Take the product R1 X R2. –Then apply SELECT C to the result. As for SELECT, C can be any boolean-valued condition. –Historic versions of this operator allowed only A theta B, where theta was =,, etc.; hence the name “theta-join.” –Theta join A join that links tables based on a relationship other than equality between two columns –Theta means unknown (not necessarily equal) as apposed to natural join here.

34 34 Example Sells(bar,beer,price )Bars(name,addr ) Joe’sBud2.50Joe’sMaple St. Joe’sMiller2.75Sue’sRiver Rd. Sue’sBud2.50 Sue’sCoors3.00 BarInfo <- Sells JOIN Sells.bar = Bars.name Bars BarInfo(bar,beer,price,name,addr ) Joe’sBud2.50Joe’sMaple St. Joe’sMiller2.75Joe’sMaple St. Sue’sBud2.50Sue’sRiver Rd. Sue’sCoors3.00Sue’sRiver Rd.

35 35 Natural Join A frequent type of join connects two relations by: –Equating attributes of the same name, and –Projecting out one copy of each pair of equated attributes. Called natural join. Denoted R3 <- R1 R2.

36 36 Example Sells(bar,beer,price )Bars(bar,addr ) Joe’sBud2.50Joe’sMaple St. Joe’sMiller2.75Sue’sRiver Rd. Sue’sBud2.50 Sue’sCoors3.00 BarInfo <- Sells Bars Note Bars.name has become Bars.bar to make the natural join “work.” BarInfo(bar,beer,price,addr ) Joe’sBud2.50Maple St. Joe’sMilller2.75Maple St. Sue’sBud2.50River Rd. Sue’sCoors3.00River Rd.

37 37 A join can be specified between a relation and itself (self-join). One can think of this as joining two distinct copies of the relation, although only one relation actually exists.

38 38 Inner joins or outer joins The join operations we described so far are called inner joins, where only matching tuples (through join condition or selection) are kept in the result. Sometimes, we need query results in which tuples from two tables are to be combined by matching corresponding rows, but without losing any tuples for lack of matching values. To meet this need, a set of operations, called outer joins are introduced here.

39 39 Outerjoin Suppose we join R JOIN C S. A tuple of R that has no tuple of S with which it joins is said to be dangling. –Similarly for a tuple of S. Outerjoin preserves dangling tuples by padding them with a special NULL symbol in the result.

40 40 Example: Outerjoin R = ABS =BC 1223 4567 (1,2) joins with (2,3), but the other two tuples are dangling. R OUTERJOIN S =ABC 123 45NULL NULL67

41 41 Three different kinds of outerjoin Full outerjoin left outerjoin Right outerjoin ABC 12 3 45 NULL ABC 12 3 6 7

42 42 Expressions in a Single Assignment There are multiple ways to express the same meaning in relational algebra. Example: the theta-join R3 <- R1 JOIN C R2 use a single theta join operator, it is equivalent to : R3 <- SELECT C (R1 X R2), which involves two operators, i.e. product and select.

43 43 Precedence of relational operators Precedence of relational operators: 1.Unary operators --- select, project, rename --- have highest precedence, bind first. 2.Then come products and joins. 3.Then intersection. 4.Finally, union and set difference bind last. wYou can always insert parentheses to force the order you desire.

44 44 Building Complex Expressions Algebras allow us to express sequences of operations in a natural way. –Example: in arithmetic --- (x + 4)*(y - 3). Relational algebra allows the same. Three notations, just as in arithmetic: 1.Sequences of assignment statements. 2.Expressions with several operators. 3.Expression trees.

45 45 Sequences of operations We can break complex relational algebra expressions into smaller ones by applying one or more operators at a time and naming the intermediate results to improve readability.

46 46 Expression Trees Leaves are operands --- either variables standing for relations or particular, constant relations. Interior nodes are operators, applied to their child or children.

47 47 Example Using the relations Bars(name, addr) and Sells(bar, beer, price), find the names of all the bars that are either on Maple St. or sell Bud for less than $3.

48 48 As a Tree: BarsSells SELECT addr = “Maple St.” SELECT price<3 AND beer=“Bud” PROJECT name RENAME R(name) PROJECT bar UNION

49 49 Example Using Sells(bar, beer, price), find the bars that sell two different beers at the same price. Strategy: by renaming, define a copy of Sells, called S(bar, beer1, price). The natural join of Sells and S consists of quadruples (bar, beer, beer1, price) such that the bar sells both beers at this price.

50 50 The Tree Sells RENAME S(bar, beer1, price) JOIN PROJECT bar SELECT beer != beer1

51 51 Relation Schemas ( i.e. relation structures) for Interior Nodes in expression tree An expression tree defines a schema for the relation associated with each interior node. Similarly, a sequence of assignments defines a schema for each relation on the left of the <- sign.

52 52 Schema-Defining Rules 1 For union, intersection, and difference, the schemas of the two operands must be the same, so use that schema for the result. Selection: schema of the result is the same as the schema of the operand. Projection: list of attributes tells us the schema.

53 53 Schema-Defining Rules 2 Product: the schema is the attributes of both relations. –Use R.A, etc., to distinguish two attributes named A. Theta-join: same as product. Natural join: use attributes of both relations. –Shared attribute names are merged.


Download ppt "1 Relational Algebra Basic operators Relational algebra expression tree."

Similar presentations


Ads by Google