The Relational Algebra

Presentation on theme: "The Relational Algebra"— Presentation transcript:

1 The Relational Algebra
Textbook Ch © 1999 UW CSE 2/23/2019

2 Overview
A non-procedural language
Operations on whole relations
  inputs are sets, output is a set
Can nest arbitrarily complex expressions
Basis for SQL

3 The RA Operators
SELECT σ, PROJECT π
  on a single relation
Set union ∪, intersection ∩, difference −
  on a pair of compatible relations
Cartesian product × and various flavors of JOIN
  on any pair of relations
Division ÷
  sort of the inverse of product
Aggregate functions (unofficial)
  group tuples of one relation, then operate

4 Set Operations
Set union ∪, intersection ∩, difference −
  on compatible relations: corresponding columns have the same types
Note there is no "set complement"!
  What would the complement of a table be??
Databases tend to operate on a "closed world" assumption:
  "If it's not in the DB, it doesn't exist."
  In Prolog: "If you can't prove it, it isn't true."
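A minimal Python sketch of the three set operations, assuming each relation is modeled as a set of tuples. The two tables and their contents are invented for illustration; they are compatible because both hold (name, city) pairs.

```python
# Hypothetical relations with the same column types: (name, city).
customers = {("Ann", "Seattle"), ("Bob", "Tacoma"), ("Cy", "Spokane")}
suppliers = {("Bob", "Tacoma"), ("Dee", "Seattle")}

union = customers | suppliers          # rows in either relation
intersection = customers & suppliers   # rows in both relations
difference = customers - suppliers     # rows in customers but not in suppliers

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Difference is the closest available substitute for a complement: it is always taken relative to another stored relation, never relative to "everything".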

5 Select
Unary operation
Select a subset of tuples from a relation, based upon a condition
  use AND, OR, NOT for compound conditions
Result: a table with the same attributes as the original
  its rows are a subset of the original's rows (possibly all of them, possibly none)
  may be given a name (temporary)
Notation: σ condition (relationname), where the condition is written as a subscript of σ
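A small Python sketch of selection, assuming a relation is a list of dictionaries keyed by attribute name. The function name `select` and the employee data are illustrative assumptions, not part of the slides.

```python
def select(relation, condition):
    """Keep the rows for which condition(row) is true; the columns are unchanged."""
    return [row for row in relation if condition(row)]


# Hypothetical EMPLOYEE relation.
employees = [
    {"name": "Ann", "dept": "Sales", "salary": 50000},
    {"name": "Bob", "dept": "Sales", "salary": 38000},
    {"name": "Cy", "dept": "IT", "salary": 61000},
]

# Compound condition built from "and" and "not".
well_paid_sales = select(
    employees, lambda r: r["dept"] == "Sales" and not r["salary"] < 40000
)
print(well_paid_sales)
```

Every output row is an unmodified input row, which is why the result is a subset of the original relation.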

6 Project
Unary operation
Select a subset of columns
Result: a table with at most as many rows as the original (rows that become identical are merged, because a relation is a set)
  not actually a subset of the original (unlike σ)
  may be given a name (temporary)
Notation: π col-list (relationname), where the column list is written as a subscript of π
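A Python sketch of projection, assuming strict set semantics, so rows that become identical after dropping columns are kept only once (SQL, by contrast, keeps duplicates unless DISTINCT is requested). The function name `project` and the data are illustrative assumptions.

```python
def project(relation, columns):
    """Keep only the named columns and drop duplicate rows (set semantics)."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Hypothetical EMPLOYEE relation.
employees = [
    {"name": "Ann", "dept": "Sales", "salary": 50000},
    {"name": "Bob", "dept": "Sales", "salary": 38000},
    {"name": "Cy", "dept": "IT", "salary": 61000},
]

print(project(employees, ["dept"]))
print(project(employees, ["name", "dept"]))
```

The output rows have a different shape from the input rows, so the result cannot be a subset of the original relation.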

7 Join
A binary operation on relations
Notation (these slides): R1 JN join-condition R2
Result is a whole relation
  all the columns of R1 followed by all the columns of R2
General description: a × followed by a σ
  the σ condition equates or otherwise relates common attributes between the two relations
Often a superfluous common attribute is removed
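The "product followed by a selection" description can be sketched in Python as follows. The function name `theta_join` and both tables are illustrative assumptions; the sketch also assumes the two relations use distinct column names.

```python
from itertools import product


def theta_join(r1, r2, condition):
    """Cartesian product of r1 and r2, then a selection on each combined row."""
    combined = ({**a, **b} for a, b in product(r1, r2))
    return [row for row in combined if condition(row)]


# Hypothetical relations; column names are assumed not to clash.
employees = [
    {"name": "Ann", "dept_id": 1},
    {"name": "Bob", "dept_id": 2},
    {"name": "Cy", "dept_id": 1},
]
departments = [{"id": 1, "dept_name": "Sales"}, {"id": 2, "dept_name": "IT"}]

joined = theta_join(employees, departments, lambda r: r["dept_id"] == r["id"])
for row in joined:
    print(row)
```

Each output row carries both `dept_id` and `id` with equal values; that redundant column is the "superfluous common attribute" that is often removed.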

8 "Equi" and "Natural" join
Equi-join: common attributes are compared for equality
  no need to specify a join condition
  could join on more than one attribute
  need to list attributes if names are not the same
  superfluous column of 2nd relation removed
"Natural join": equi-join on all common fields
Notation (some texts): R1 * attr-lists R2
When people say "join", natural join is usually what they mean.
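A Python sketch of a natural join, assuming relations are lists of dictionaries so that "common fields" simply means shared keys. The function name `natural_join` and the data are illustrative assumptions.

```python
def natural_join(r1, r2):
    """Equi-join on every column name the two relations share.

    Shared columns appear once in each output row.
    """
    result = []
    for a in r1:
        for b in r2:
            common = a.keys() & b.keys()
            if all(a[c] == b[c] for c in common):
                result.append({**a, **b})
    return result


# Hypothetical relations sharing the column "dept".
employees = [
    {"name": "Ann", "dept": 1},
    {"name": "Bob", "dept": 2},
    {"name": "Cy", "dept": 3},
]
departments = [{"dept": 1, "dept_name": "Sales"}, {"dept": 2, "dept_name": "IT"}]

joined = natural_join(employees, departments)
for row in joined:
    print(row)
```

Cy has no matching department and therefore drops out of the result. If the two relations shared no column names at all, this function would degenerate into a Cartesian product.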

9 Outer Joins (not in book)
Left outer join of R1 and R2: each row of R1 is included in the output (with nulls in the R2 columns if necessary), even if there are no matches in R2 for that row.
Right outer join: each row of R2 is included, etc.
Not to be confused with a semi-join, which keeps only the R1 rows that have a match in R2 and outputs the R1 columns only.
Quiz: suppose R1 and R2 have no matches in their common columns. What is the cardinality of: Cartesian product; natural join; left, right outer joins?
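The quiz can be checked with a short Python sketch. It models the join described on this slide, the one that keeps every row of one relation and pads the other relation's columns with nulls (commonly called an outer join), using `None` for null. The function names and the two tiny relations are illustrative assumptions.

```python
def rows_match(a, b):
    """True when rows a and b agree on every column name they share."""
    return all(a[c] == b[c] for c in a.keys() & b.keys())


def natural_join(r1, r2):
    return [{**a, **b} for a in r1 for b in r2 if rows_match(a, b)]


def left_outer_join(r1, r2):
    """Natural join that also keeps unmatched r1 rows, padding r2's columns with None."""
    # Columns that only r2 has; this sketch assumes both inputs are non-empty.
    extra = [c for c in r2[0] if c not in r1[0]]
    result = []
    for a in r1:
        joined = [{**a, **b} for b in r2 if rows_match(a, b)]
        result.extend(joined or [{**a, **dict.fromkeys(extra)}])
    return result


# Hypothetical relations whose common column "k" has no values in common.
r1 = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}, {"k": 3, "a": "z"}]
r2 = [{"k": 8, "b": "p"}, {"k": 9, "b": "q"}]

sizes = {
    "product": len(r1) * len(r2),
    "natural join": len(natural_join(r1, r2)),
    "left": len(left_outer_join(r1, r2)),
    "right": len(left_outer_join(r2, r1)),  # right join = left join with roles swapped
}
print(sizes)
```

With no matches, the product still has |R1| x |R2| rows, the natural join is empty, and the two padded joins have |R1| and |R2| rows respectively.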

10 Division: R1 ÷ R2
Sort of the reverse of Cartesian product
Like integer division in that any "remainder" is discarded
Main idea: find all the tuples in R1 which are joined to all the values in R2
  the R2 attributes are discarded from the output
Same thing can be accomplished with a combination of π, ×, −
  Can you figure out how?

11 Division Details of R = R1 ÷ R2
R1 (dividend): attribute set X ∪ Y, |R1| rows
R2 (divisor): attribute set Y, |R2| rows
R (quotient): attribute set X, i.e., the attributes of R1 not in R2
  at most |R1|/|R2| rows
A row is in the answer (R) if that row (X attributes) occurs in R1 in combination with every row (Y attributes) of R2.
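A Python sketch of division, assuming the simplest case where X and Y are single attributes, so R1 is a set of (x, y) pairs and R2 is a set of y values. The first function follows the definition directly; the second is one standard way to build the same result from projection, product, and difference. Function names and the rental data are illustrative assumptions.

```python
def divide(r1, r2):
    """R1 / R2 by definition: x values that occur in R1 with every y in R2."""
    xs = {x for x, _ in r1}
    return {x for x in xs if all((x, y) in r1 for y in r2)}


def divide_via_basic_ops(r1, r2):
    """The same quotient from projection, Cartesian product, and difference."""
    xs = {x for x, _ in r1}                        # project R1 onto X
    all_pairs = {(x, y) for x in xs for y in r2}   # (projection) x R2
    missing = all_pairs - r1                       # required pairs R1 lacks
    disqualified = {x for x, _ in missing}         # project the missing pairs onto X
    return xs - disqualified


# Hypothetical rentals: (customer, movie) pairs, and a divisor of two movies.
rentals = {
    ("Ann", "Alien"), ("Ann", "Brazil"), ("Ann", "Casablanca"),
    ("Bob", "Alien"),
    ("Cy", "Alien"), ("Cy", "Brazil"),
}
movies = {"Alien", "Brazil"}

print(sorted(divide(rentals, movies)))
print(sorted(divide_via_basic_ops(rentals, movies)))
```

Bob is the "remainder" that gets discarded: he appears in R1 but not with every movie in the divisor. The quotient has 2 rows, within the bound |R1|/|R2| = 6/2 = 3.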

12 Division Examples
Who would have lots to talk about with Bessie?
  "Find (all) customers who have rented (all) the same movies as Bessie has."
What airlines compete with Horizon Air?
  "Find the airlines which serve a city also served by Horizon" (not a division query).
Which airline is best positioned to put Horizon Air out of business?
  "Find the airlines which fly to (all) the cities served by Horizon" (a division query).
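The difference between the two airline queries can be sketched in Python. The relation below is hypothetical: only the name Horizon comes from the slide, and the other airlines and all city assignments are invented for illustration.

```python
# Hypothetical SERVES relation of (airline, city) pairs.
serves = {
    ("Horizon", "Seattle"), ("Horizon", "Portland"), ("Horizon", "Boise"),
    ("AirA", "Seattle"), ("AirA", "Denver"),
    ("AirB", "Seattle"), ("AirB", "Portland"), ("AirB", "Boise"), ("AirB", "Reno"),
    ("AirC", "Denver"),
}

horizon_cities = {city for airline, city in serves if airline == "Horizon"}
others = {airline for airline, _ in serves} - {"Horizon"}

# Not a division query: one shared city is enough, so a join suffices.
competitors = {a for a, c in serves if c in horizon_cities and a != "Horizon"}

# Division query: the airline must appear with every one of Horizon's cities.
covers_all = {a for a in others if all((a, c) in serves for c in horizon_cities)}

print(sorted(competitors))
print(sorted(covers_all))
```

The word "all" in the English question is the usual signal that division is needed; "a city" or "some city" points to a join instead.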

13 Aggregate Functions (not in book)
Technically, not part of R.A.
  actual query languages will implement many of these
(Usually) unary operators: take a whole relation and compute a value
  COUNT, AVERAGE, MAX, MIN
Result is returned as a relation with one row and one column
  i.e., not as a scalar number
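A Python sketch of the "result is a relation, not a scalar" point. The helper name `aggregate` and the employee data are illustrative assumptions; Python's built-in `len`, `max`, `min` and `statistics.mean` stand in for COUNT, MAX, MIN and AVERAGE.

```python
from statistics import mean


def aggregate(relation, func, column, name):
    """Apply func to the values of one column.

    The result is a relation with one row and one column, not a bare number.
    """
    return [{name: func([row[column] for row in relation])}]


# Hypothetical EMPLOYEE relation.
employees = [
    {"name": "Ann", "salary": 50000},
    {"name": "Bob", "salary": 40000},
    {"name": "Cy", "salary": 60000},
]

print(aggregate(employees, len, "name", "count"))
print(aggregate(employees, mean, "salary", "average"))
print(aggregate(employees, max, "salary", "max"))
print(aggregate(employees, min, "salary", "min"))
```

Because the output is itself a relation, it can be fed into further operators, which keeps expressions nestable.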

14 Grouping and Aggregates
Rows may be grouped based on attribute values
  think of it as a sort on those attributes
Aggregate functions can be applied to the grouped relation
  computes a value for each group
Result returned as a relation with:
  one row for each group (unique value)
  one column for each aggregate function

15 Grouping Notation and Example
<grouping attributes> ℱ <agg. function list> (relation)
"List number of employees and average salary for each department"
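The example query can be sketched in Python as follows. The helper name `group_aggregate`, its parameter layout, and the employee data are all illustrative assumptions; the sketch also includes the grouping attribute in each output row so the groups can be told apart.

```python
from collections import defaultdict
from statistics import mean


def group_aggregate(relation, group_cols, aggregates):
    """Group rows by group_cols, then compute each aggregate once per group.

    aggregates maps an output column name to a (function, input column) pair.
    """
    groups = defaultdict(list)
    for row in relation:
        groups[tuple(row[c] for c in group_cols)].append(row)

    result = []
    for key, rows in groups.items():
        out = dict(zip(group_cols, key))
        for name, (func, col) in aggregates.items():
            out[name] = func([r[col] for r in rows])
        result.append(out)
    return result


# Hypothetical EMPLOYEE relation.
employees = [
    {"name": "Ann", "dept": "Sales", "salary": 50000},
    {"name": "Bob", "dept": "Sales", "salary": 40000},
    {"name": "Cy", "dept": "IT", "salary": 60000},
]

per_dept = group_aggregate(
    employees,
    ["dept"],
    {"num_employees": (len, "name"), "avg_salary": (mean, "salary")},
)
for row in per_dept:
    print(row)
```

The dictionary of lists plays the role of the "sort on those attributes": it brings rows with equal grouping values together before any aggregate runs.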

16 Relational Completeness
A query language is "relationally complete" if any R.A. query can be expressed in it.
  does not mean all R.A. operations are available
The concept is analogous, but not identical, to Turing-completeness
Commercial query languages are mostly much more powerful than the R.A.
  aggregates, grouping, etc.
We'll look later at relational calculus and QBE, which are also relationally complete

17 Notice Something Missing?
Yes: iteration/recursion
This is typical of relational languages
Some fairly natural queries are impossible
  "Find all the children of X" -- no problem
  "Find all the grandchildren of X" -- still OK
  "Find all the descendants of X" -- no way!
This is an example of a "transitive closure"
Solution: embed the query in a procedural language (COBOL, C, etc.).
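A Python sketch of the contrast, with Python standing in for the procedural host language. Children and grandchildren each need a fixed number of joins, while descendants need a loop that runs until no new rows appear. The function names and the family data are illustrative assumptions.

```python
def children(parent_of, person):
    """One selection on the (parent, child) relation."""
    return {c for p, c in parent_of if p == person}


def grandchildren(parent_of, person):
    """One self-join: still a fixed-size algebra expression."""
    return {
        c2
        for p1, c1 in parent_of
        for p2, c2 in parent_of
        if p1 == person and c1 == p2
    }


def descendants(parent_of, person):
    """Transitive closure: repeat the join until it yields nothing new."""
    found = set()
    frontier = {person}
    while frontier:
        next_generation = {c for p, c in parent_of if p in frontier}
        frontier = next_generation - found   # only people not seen before
        found |= frontier
    return found


# Hypothetical PARENT relation of (parent, child) pairs.
parent_of = {("X", "A"), ("X", "B"), ("A", "C"), ("C", "D")}

print(sorted(children(parent_of, "X")))
print(sorted(grandchildren(parent_of, "X")))
print(sorted(descendants(parent_of, "X")))
```

The number of loop iterations depends on the depth of the data, not on the query text, which is why no single fixed algebra expression covers every case.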

18 Looking Ahead: Query Processing
Order of operations affects efficiency
  Example: σ(R1) * σ(R2) is probably much faster than σ(R1 * R2)
  result set is the same
Large joins can be particularly taxing
Ideally, we do not let this affect how we write queries!
Smart DBMSs do "query optimization"
  automatically reorder operations for efficiency
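A Python sketch of the effect of operation order, assuming a naive nested-loop join and invented data (100 customers, 1000 orders). It is not how any particular DBMS optimizes; it only shows that filtering before joining gives the same rows from far smaller inputs.

```python
def select(relation, condition):
    return [row for row in relation if condition(row)]


def natural_join(r1, r2):
    """Nested-loop equi-join on all shared column names."""
    return [
        {**a, **b}
        for a in r1
        for b in r2
        if all(a[c] == b[c] for c in a.keys() & b.keys())
    ]


def in_seattle(row):
    return row["city"] == "Seattle"


def is_large(row):
    return row["amount"] >= 900


def as_set(relation):
    """Order-independent view of a relation, for comparing results."""
    return {tuple(sorted(row.items())) for row in relation}


# Hypothetical data: every tenth customer is in Seattle.
customers = [{"cust": i, "city": "Seattle" if i % 10 == 0 else "Other"} for i in range(100)]
orders = [{"cust": i % 100, "amount": i} for i in range(1000)]

# Select late: join everything (100 * 1000 row pairs examined), then filter.
big_join = natural_join(customers, orders)
late = select(big_join, lambda r: in_seattle(r) and is_large(r))

# Select early: filter each input, then join (10 * 100 row pairs examined).
early = natural_join(select(customers, in_seattle), select(orders, is_large))

print(len(big_join), len(late), len(early), as_set(late) == as_set(early))
```

A query optimizer performs this kind of rewrite automatically, so the person writing the query can state it in whichever order is clearest.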

