1 Database Design: DBS CB, 2nd Edition. Relational Algebra: Basic Operations & Algebra of Bags, Ch. 5.



2 What is an “Algebra”?
A mathematical system consisting of:
  Operands: variables or values from which new values can be constructed
  Operators: symbols denoting procedures that construct new values from given values

3 What is Relational Algebra?
An algebra whose operands are relations or variables that represent relations.
Operators are designed to do the most common things that we need to do with relations in a database.
  The result is an algebra that can be used as a query language for relations.

4 Relational Operations on Bags vs. on Sets
A bag is a relation with relaxed conditions: it allows duplicate tuples, while a set does not.
Some relational operations are much more efficient on bags than on sets:
  Projection on a relation as a set: we need to compare each projected tuple with all other projected tuples to eliminate duplicates.
A set is a special case of a bag. Bags are covered in detail later in this session.

5 Core Relational Algebra
Union, intersection, and difference
  Usual set operations, but both operands must have the same relation schema
Selection: picking certain rows
Projection: picking certain columns
Products and joins: compositions of relations
Renaming of relations and attributes

6 Union, Intersection, and Difference
Assume R and S are bags with the same schema (attributes), and that tuple t appears n times in R and m times in S:
  Bag union (R ∪ S): t appears n+m times. Set union: t appears once if it is in R or in S.
  Bag intersection (R ∩ S): t appears min(n, m) times. Set intersection: t appears 0 or 1 times.
  Bag difference (R - S): t appears max(0, n-m) times. Set difference: t appears 0 or 1 times.
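The multiplicity rules above can be tried out with Python's `collections.Counter`, which behaves like a bag. This is only a sketch: the tuples and their counts (n = 2, m = 3) are made up for illustration.

```python
from collections import Counter

# Hypothetical bags: each tuple maps to the number of times it appears.
R = Counter({("Joe's", "Bud"): 2, ("Sue's", "Miller"): 1})  # n = 2 for (Joe's, Bud)
S = Counter({("Joe's", "Bud"): 3})                           # m = 3 for (Joe's, Bud)

bag_union = R + S          # multiplicities add: n + m
bag_intersection = R & S   # min(n, m)
bag_difference = R - S     # max(0, n - m); tuples whose count reaches 0 are dropped

# Set versions: reduce each bag to its distinct tuples first.
set_union = set(R) | set(S)
set_intersection = set(R) & set(S)
set_difference = set(R) - set(S)
```

Here `bag_union` holds (Joe's, Bud) five times, while `set_union` holds it once.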

7 Relational Algebra: Basic Operations On Sets

8 Selection on Sets
R1 := σ_C(R2)
  C is a condition (as in “if” statements) that refers to attributes of R2 (e.g., col1 > 10).
  R1 is all those tuples of R2 that satisfy C.

9 Example: Selection
Relation Sells:
  bar     beer     price
  Joe’s   Bud      2.50
  Joe’s   Miller   2.75
  Sue’s   Bud      2.50
  Sue’s   Miller   3.00
JoeMenu := σ_{bar=“Joe’s”}(Sells):
  bar     beer     price
  Joe’s   Bud      2.50
  Joe’s   Miller   2.75
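As a rough illustration, the selection example can be mimicked in Python by modelling a relation as a set of tuples. The helper name `select` and the positional encoding (bar, beer, price) are assumptions of this sketch, not part of the slides.

```python
# Sells(bar, beer, price), with the rows from the selection example.
sells = {
    ("Joe's", "Bud", 2.50),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Miller", 3.00),
}

def select(relation, condition):
    """sigma_C(R): keep exactly the tuples for which condition(t) is true."""
    return {t for t in relation if condition(t)}

# JoeMenu := sigma_{bar = "Joe's"}(Sells)
joe_menu = select(sells, lambda t: t[0] == "Joe's")
```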

10 Projection on Sets
R1 := π_L(R2)
  L is a list of attributes from the schema of R2.
  R1 is constructed by looking at each tuple of R2, extracting the attributes on list L, in the order specified, and creating from those components a tuple for R1.
  Duplicate tuples, if any, are eliminated.

11 Example: Projection
Relation Sells:
  bar     beer     price
  Joe’s   Bud      2.50
  Joe’s   Miller   2.75
  Sue’s   Bud      2.50
  Sue’s   Miller   3.00
Prices := π_{beer,price}(Sells):
  beer     price
  Bud      2.50
  Miller   2.75
  Miller   3.00
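A minimal sketch of set projection on the same data, assuming relations are Python sets of tuples and attributes are addressed by position (the helper `project` is hypothetical). Building a set removes the duplicate (Bud, 2.50) automatically.

```python
# Sells(bar, beer, price), with the rows from the projection example.
sells = {
    ("Joe's", "Bud", 2.50),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Miller", 3.00),
}

def project(relation, positions):
    """pi_L(R) on sets: extract the listed components; the set drops duplicates."""
    return {tuple(t[i] for i in positions) for t in relation}

# Prices := pi_{beer, price}(Sells); beer is position 1, price is position 2.
prices = project(sells, (1, 2))
```

Four input tuples yield three output tuples, because both bars sell Bud at 2.50.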

12 Extended Projection
Using the same π_L operator, we allow the list L to contain arbitrary expressions involving attributes:
  1. Arithmetic on attributes, e.g., A+B→C (compute A+B and name the result C)
  2. Duplicate occurrences of the same attribute

13 Example: Extended Projection
R = (A, B)
π_{A+B→C, A, A}(R) has result schema (C, A1, A2)
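To make the extended projection concrete, here is a small sketch. The two rows of R are assumed for illustration, since the transcript does not preserve the slide's data.

```python
# Hypothetical rows for R(A, B); the original slide's data is not available here.
R = {(1, 2), (3, 4)}

# pi_{A+B -> C, A, A}(R): one computed column C, then attribute A twice.
result = {(a + b, a, a) for (a, b) in R}
```

Each output tuple has three components: the sum A+B under the name C, followed by two copies of A.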

14 Product (Cartesian Product) on Sets
R3 := R1 × R2
  Pair each tuple t1 of R1 with each tuple t2 of R2.
  The concatenation t1t2 is a tuple of R3.
  The schema of R3 is the attributes of R1 and then R2, in order.
  If R1 has n tuples and R2 has m tuples, then R3 has n × m tuples.
  If R1 and R2 both have an attribute named A, refer to the two copies as R1.A and R2.A.

15 Example: R3 := R1 × R2
R1(A, B)
R2(B, C)
R3(A, R1.B, R2.B, C)
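The product can be sketched as a nested loop over both relations. The rows below are assumptions chosen only to show the n × m size of the result.

```python
# Hypothetical rows; the schemas follow the example: R1(A, B) and R2(B, C).
R1 = {(1, 2), (3, 4)}                # n = 2 tuples
R2 = {(5, 6), (7, 8), (9, 10)}       # m = 3 tuples

# R3(A, R1.B, R2.B, C): concatenate every tuple of R1 with every tuple of R2.
R3 = {t1 + t2 for t1 in R1 for t2 in R2}
```

With 2 and 3 input tuples, R3 contains 6 tuples, each with four components.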

16 Join (Logical Join)
Inner join:
  Cross join: the Cartesian product
  Equi-join: a product followed by a selection that uses equality predicates only
  Natural join: an equi-join on all attributes that share a name, keeping one copy of each; the result schema is the union of the attributes of the two relations
  Theta join: a product followed by a selection on an arbitrary boolean-valued condition
Outer join:
  Left outer join (left join): every tuple of the left relation is joined with each matching tuple of the right relation; if no right tuple satisfies the condition, the result contains the left tuple padded with NULLs for the right relation's attributes
  Right outer join (right join): the mirror image of the left join
  Full outer join (full join): the union of the left join and the right join
Self join: joining a table to itself
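The left outer join is the least obvious of these variants, so a small sketch may help. The function name, the `right_width` parameter, and the bar “Moe's” are all hypothetical; `None` stands in for NULL.

```python
def left_outer_join(left, right, condition, right_width):
    """Join every left tuple with its matching right tuples.

    A left tuple without any match is kept and padded with None (NULL)
    for each of the right relation's attributes.
    """
    result = []
    for l in left:
        matches = [l + r for r in right if condition(l, r)]
        result.extend(matches or [l + (None,) * right_width])
    return result

# Hypothetical data: Moe's has no row in Bars.
sells = [("Joe's", "Bud", 2.50), ("Moe's", "Coors", 3.00)]   # Sells(bar, beer, price)
bars = [("Joe's", "Maple St.")]                              # Bars(name, addr)

joined = left_outer_join(sells, bars, lambda s, b: s[0] == b[0], right_width=2)
```

The row for Moe's survives with `None` in the name and addr positions; an inner join would have dropped it.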

17 Natural Join
A useful join variant (natural join) connects two relations by:
  Equating attributes of the same name, and
  Projecting out one copy of each pair of equated attributes
Denoted R3 := R1 ⋈ R2

18 Example: Natural Join
Sells(bar, beer, price):
  Joe’s   Bud      2.50
  Joe’s   Miller   2.75
  Sue’s   Bud      2.50
  Sue’s   Coors    3.00
Bars(bar, addr):
  Joe’s   Maple St.
  Sue’s   River Rd.
BarInfo := Sells ⋈ Bars
Note: Sells.bar and Bars.bar are equated, and only one copy, bar, appears in the result.
BarInfo(bar, beer, price, addr):
  Joe’s   Bud      2.50   Maple St.
  Joe’s   Miller   2.75   Maple St.
  Sue’s   Bud      2.50   River Rd.
  Sue’s   Coors    3.00   River Rd.
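One way to sketch the natural join is to model tuples as dictionaries keyed by attribute name, so that the shared attributes can be found automatically. The helper `natural_join` is an illustrative assumption, not a standard library function.

```python
def natural_join(r1, r2):
    """Pair tuples that agree on every shared attribute; keep one copy of each."""
    result = []
    for t1 in r1:
        for t2 in r2:
            shared = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in shared):
                result.append({**t1, **t2})  # merging keeps a single 'bar' entry
    return result

sells = [
    {"bar": "Joe's", "beer": "Bud", "price": 2.50},
    {"bar": "Joe's", "beer": "Miller", "price": 2.75},
    {"bar": "Sue's", "beer": "Bud", "price": 2.50},
    {"bar": "Sue's", "beer": "Coors", "price": 3.00},
]
bars = [
    {"bar": "Joe's", "addr": "Maple St."},
    {"bar": "Sue's", "addr": "River Rd."},
]

bar_info = natural_join(sells, bars)
```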

19 Theta-Join
R3 := R1 ⋈_C R2
  Take the product R1 × R2.
  Then apply σ_C to the result.
As for σ, C can be any boolean-valued condition.
Historic versions of this operator allowed only A θ B, where θ is =, <, etc.; hence the name “theta-join”.

20 Example: Theta Join - 1
Sells(bar, beer, price):
  Joe’s   Bud      2.50
  Joe’s   Miller   2.75
  Sue’s   Bud      2.50
  Sue’s   Coors    3.00
Bars(name, addr):
  Joe’s   Maple St.
  Sue’s   River Rd.
BarInfo := Sells ⋈_{Sells.bar = Bars.name} Bars
Note: the join condition equates Sells.bar with Bars.name; unlike the natural join, both attributes are kept in the result.
BarInfo(bar, beer, price, name, addr):
  Joe’s   Bud      2.50   Joe’s   Maple St.
  Joe’s   Miller   2.75   Joe’s   Maple St.
  Sue’s   Bud      2.50   Sue’s   River Rd.
  Sue’s   Coors    3.00   Sue’s   River Rd.

21 Example: Theta Join - 2
Sells(bar, beer, price):
  Joe’s   Bud      2.50
  Joe’s   Miller   2.75
  Sue’s   Bud      2.50
  Sue’s   Coors    3.00
Bars(name, addr):
  Joe’s   Maple St.
  Sue’s   River Rd.
BarInfo := Sells ⋈_{Sells.bar = Bars.name AND Sells.price < 2.75} Bars
Note: the join condition equates Sells.bar with Bars.name and also filters on price; both bar and name are kept in the result.
BarInfo(bar, beer, price, name, addr):
  Joe’s   Bud   2.50   Joe’s   Maple St.
  Sue’s   Bud   2.50   Sue’s   River Rd.
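The second theta-join example can be reproduced with a short sketch that follows the definition directly: form the product, then keep the pairs that satisfy the condition. The helper `theta_join` and the positional tuple encoding are assumptions of this sketch.

```python
def theta_join(r1, r2, condition):
    """R1 join_C R2: product of R1 and R2, filtered by condition(t1, t2)."""
    return [t1 + t2 for t1 in r1 for t2 in r2 if condition(t1, t2)]

sells = [  # Sells(bar, beer, price)
    ("Joe's", "Bud", 2.50),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Coors", 3.00),
]
bars = [("Joe's", "Maple St."), ("Sue's", "River Rd.")]  # Bars(name, addr)

# Sells.bar = Bars.name AND Sells.price < 2.75
bar_info = theta_join(sells, bars, lambda s, b: s[0] == b[0] and s[2] < 2.75)
```

Miller at 2.75 is excluded because the comparison is strict.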

22 Renaming
The ρ operator gives a new schema to a relation.
R1 := ρ_{R1(A1,…,An)}(R2) makes R1 be a relation with attributes A1,…,An and the same tuples as R2.
Simplified notation: R1(A1,…,An) := R2
Or even simpler notation, when the attribute names are kept: R1 := ρ_{R1}(R2)

23 Example: Renaming
Bars(name, addr):
  Joe’s   Maple St.
  Sue’s   River Rd.
R(bar, addr) := Bars
R(bar, addr):
  Joe’s   Maple St.
  Sue’s   River Rd.

24 Aggregation Operators
These operators apply to both sets and bags of numbers or strings. Examples include:
  SUM: sum of a column with numeric values
  AVG: average of a column with numeric values
  MIN and MAX: smallest and largest value of a column (for strings, the lexicographically first and last value)
  COUNT: number of values in a column

25 Grouping Operator - γ_L(R)
Often we do not want to aggregate over all the values of a column, but rather over subsets (groups) of tuples determined by the values of one or more columns.
γ_L(R) is a grouping operator: it partitions the tuples of R into groups that agree on the grouping attributes in list L and returns one tuple per group, holding the grouping attributes and the aggregates listed in L.
Assume relation Orders(OrderId, OrderDate, OrderPrice, Customer).
Assume you want to group by customer and return the order total and the date of the first order for every customer:
  γ_{Customer, SUM(OrderPrice), MIN(OrderDate)}(Orders)
  SELECT Customer, SUM(OrderPrice), MIN(OrderDate) FROM Orders GROUP BY Customer;
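The grouping query can be sketched in Python as a two-step process: partition by the grouping attribute, then aggregate each group. The order rows and customer names are made up, and dates are assumed to be ISO-formatted strings so that `min` picks the earliest one.

```python
from collections import defaultdict

# Hypothetical Orders(OrderId, OrderDate, OrderPrice, Customer) rows.
orders = [
    (1, "2024-01-05", 100.0, "Ann"),
    (2, "2024-01-03", 50.0, "Bob"),
    (3, "2024-01-01", 25.0, "Ann"),
]

# Step 1: partition the tuples by the grouping attribute, Customer.
groups = defaultdict(list)
for order_id, order_date, order_price, customer in orders:
    groups[customer].append((order_date, order_price))

# Step 2: one result per group, holding SUM(OrderPrice) and MIN(OrderDate).
result = {
    customer: (sum(price for _, price in rows), min(date for date, _ in rows))
    for customer, rows in groups.items()
}
```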

26 Sort Operator - τ_L(R)
Often we want to sort the tuples of a relation R, or of a result relation S, by one or more of its attributes.
τ_L(R), where R is a relation and L is a list of some of R’s attributes, is the relation R with its tuples sorted in the order indicated by L.
Assume relation Orders(OrderId, OrderDate, OrderPrice, Customer).
Assume you want to list all orders sorted by order date:
  τ_{OrderDate}(Orders)
  SELECT * FROM Orders ORDER BY OrderDate ASC; (use DESC instead of ASC for descending order)
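A brief sketch of the sort operator using Python's built-in `sorted`. The rows are hypothetical, and dates are assumed to be ISO-formatted strings, which sort chronologically.

```python
# Hypothetical Orders(OrderId, OrderDate, OrderPrice, Customer) rows.
orders = [
    (1, "2024-01-05", 100.0, "Ann"),
    (2, "2024-01-03", 50.0, "Bob"),
    (3, "2024-01-01", 25.0, "Ann"),
]

# tau_{OrderDate}(Orders): the same tuples, ordered by the OrderDate component.
ascending = sorted(orders, key=lambda t: t[1])                  # ORDER BY OrderDate ASC
descending = sorted(orders, key=lambda t: t[1], reverse=True)   # ORDER BY OrderDate DESC
```

The result is a list rather than a set, since the order of the tuples is now part of the answer.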

27 Building Complex Expressions
Combine operators with parentheses and precedence rules.
Three notations, just as in arithmetic:
  1. Sequences of assignment statements
  2. Expressions with several operators
  3. Expression trees

28 Sequences of Assignments
Create temporary relation names.
Renaming can be implied by giving relations a list of attributes.
Example: R3 := R1 ⋈_C R2 can be written:
  R4 := R1 × R2
  R3 := σ_C(R4)

29 Expressions in a Single Assignment
Example: the theta-join R3 := R1 ⋈_C R2 can be written: R3 := σ_C(R1 × R2)
Precedence of relational operators:
  1. [ σ, π, ρ ] (highest)
  2. [ ×, ⋈ ]
  3. ∩
  4. [ ∪, − ]

30 Expression Trees
Leaves are operands: either variables standing for relations or particular, constant relations.
Interior nodes are operators, applied to their child or children (subexpressions).
You might want to use parentheses to indicate grouping of operands.

31 Example: Expression Tree for a Query
Using the relations Bars(name, addr) and Sells(bar, beer, price), find the names of all the bars that are either on Maple St. or sell Bud for less than $3.

32 As an Expression Tree (root at the top, children indented):
∪
  π_name
    σ_{addr = “Maple St.”}
      Bars
  ρ_{R(name)}
    π_bar
      σ_{price < 3 AND beer = “Bud”}
        Sells
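Evaluating the tree bottom-up can be sketched with set comprehensions, one per branch. The rows are assumptions; “Moe's” on Elm St. is a made-up bar added so that at least one bar fails both conditions.

```python
# Hypothetical rows for Bars(name, addr) and Sells(bar, beer, price).
bars = {("Joe's", "Maple St."), ("Sue's", "River Rd."), ("Moe's", "Elm St.")}
sells = {
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Moe's", "Bud", 3.50),
}

# Left branch: pi_name(sigma_{addr = "Maple St."}(Bars))
on_maple = {(name,) for (name, addr) in bars if addr == "Maple St."}

# Right branch: rho_{R(name)}(pi_bar(sigma_{price < 3 AND beer = "Bud"}(Sells)))
# With positional tuples, the renaming only changes the column's name, not the data.
cheap_bud = {(bar,) for (bar, beer, price) in sells if price < 3 and beer == "Bud"}

# Root: union of the two single-column relations.
answer = on_maple | cheap_bud
```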

33 Example: Self-Join
Using Sells(bar, beer, price), find the bars that sell two different beers at the same price.
Strategy: by renaming, define a copy of Sells, called S(bar, beer1, price). The natural join of Sells and S consists of quadruples (bar, beer, beer1, price) such that the bar sells both beers at this price.

34 The Expression Tree (root at the top, children indented):
π_bar
  σ_{beer != beer1}
    ⋈
      Sells
      ρ_{S(bar, beer1, price)}
        Sells
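The self-join strategy can be walked through step by step in a short sketch. The rows are hypothetical; the tuple (Sue's, Miller, 2.50) was chosen so that one bar actually sells two beers at the same price.

```python
# Hypothetical Sells(bar, beer, price) rows.
sells = {
    ("Joe's", "Bud", 2.50),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Miller", 2.50),
}

# rho_{S(bar, beer1, price)}(Sells): same tuples, second attribute now called beer1.
s = sells

# Natural join on the shared attributes bar and price: (bar, beer, beer1, price).
joined = {
    (bar, beer, beer1, price)
    for (bar, beer, price) in sells
    for (bar_s, beer1, price_s) in s
    if bar == bar_s and price == price_s
}

# sigma_{beer != beer1}, then pi_bar.
selected = {t for t in joined if t[1] != t[2]}
answer = {(t[0],) for t in selected}
```

Without the selection, every tuple would trivially join with itself, so every bar would appear in the answer.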

35 Schemas for Results
Union, intersection, and difference: the schemas of the two operands must be the same, so use that schema for the result.
Selection: the schema of the result is the same as the schema of the operand.
Projection: the list of attributes tells us the schema.

36 Schemas for Results (2)
Product: the schema is the attributes of both relations.
  Use R.A, etc., to distinguish two attributes named A.
Natural join: the union of the attributes of the two relations.
Theta-join: the same schema as the product; the condition only filters tuples.
Renaming: the operator tells the schema.

37 Relational Algebra: Basic Operations On Bags

38 Relational Algebra on Bags
A bag (or multiset) is like a set, but an element may appear more than once.
Example: {1,2,1,3} is a bag.
Example: {1,2,3} is also a bag that happens to be a set.

39 Why Bags?
SQL, the most important query language for relational databases, is actually a bag language.
Some operations, like projection, are more efficient on bags than on sets.

40 Operations on Bags
Selection applies to each tuple, so its effect on bags is like its effect on sets.
Projection also applies to each tuple, but as a bag operator, we do not eliminate duplicates.
Products and joins are done on each pair of tuples, so duplicates in bags have no effect on how we operate.

41 Example: Bag Selection
R(A, B)
σ_{A+B < 5}(R) =
  A   B
  1   2

42 Example: Bag Projection
R(A, B)
π_A(R) =
  A
  1
  5
  1
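Bag selection and bag projection can be sketched with Python lists, which keep duplicates. The rows of R are an assumption, chosen so that the A column reads 1, 5, 1 as in the projection result above.

```python
# Assumed rows for the bag R(A, B); a list keeps the duplicate tuple (1, 2).
R = [(1, 2), (5, 6), (1, 2)]

# Bag selection sigma_{A+B < 5}(R): test each tuple independently, keep duplicates.
bag_selection = [t for t in R if t[0] + t[1] < 5]

# Bag projection pi_A(R): extract A from each tuple, no duplicate elimination.
bag_projection = [(a,) for (a, b) in R]

# Set projection, for contrast: duplicates collapse.
set_projection = set(bag_projection)
```

The bag projection needs a single pass over R, while the set version has to detect and remove the repeated value.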

43 Example: Bag Product
R(A, B)
S(B, C)
R × S has schema (A, R.B, S.B, C)

44 Example: Bag Theta-Join
R(A, B)
S(B, C)
R ⋈_{R.B < S.B} S has schema (A, R.B, S.B, C)

45 Bag Union
The number of times an element appears in the union of two bags is the sum of the number of times it appears in each bag.
Example: {1,2,1} ∪ {1,1,2,3,1} = {1,1,1,1,1,2,2,3}

46 Bag Intersection
The number of times an element appears in the intersection of two bags is the minimum of the number of times it appears in either.
Example: {1,2,1,1} ∩ {1,2,1,3} = {1,1,2}

47 Bag Difference
An element appears in the difference A - B of bags as many times as it appears in A, minus the number of times it appears in B.
  But never less than 0 times.
Example: {1,2,1,1} - {1,2,3} = {1,1}

48 Beware: Bag Laws != Set Laws
Some, but not all, algebraic laws that hold for sets also hold for bags.
Example: the commutative law for union (R ∪ S = S ∪ R) does hold for bags.
  Since addition is commutative, adding the number of times x appears in R and S doesn’t depend on the order of R and S.

49 Example: A Law That Fails
Set union is idempotent, meaning that S ∪ S = S.
However, for bags, if x appears n times in S, then it appears 2n times in S ∪ S.
Thus S ∪ S != S in general.
  e.g., {1} ∪ {1} = {1,1} != {1}
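Both laws can be checked quickly in Python, using `collections.Counter` as a stand-in for bags and `set` for sets. This is an illustrative sketch with made-up values, not a proof.

```python
from collections import Counter

# Bags, modelled as Counters (element -> multiplicity).
R = Counter([1, 2, 1])
S = Counter([1])

commutative_for_bags = (R + S) == (S + R)   # holds: counts are simply added
idempotent_for_bags = (S + S) == S          # fails: {1} U {1} = {1, 1}

# Sets, for contrast.
S_set = {1}
idempotent_for_sets = (S_set | S_set) == S_set   # holds
```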

50 Datalog: Logic Instead of Algebra

51 Datalog: Logic instead of Algebra
Each relational-algebra operator can be mimicked by one or several rules of Database Logic (Datalog), a language that consists of if-then rules.
Datalog is inherently a logic of sets.
Datalog queries are more powerful than relational algebra: several rules together can express recursions that are not expressible in the algebra.
Relations are represented in Datalog as predicates; a predicate (such as R) followed by its arguments is called an atom.
A predicate returns a boolean value.

52 Datalog: Logic instead of Algebra
Rule: head ← body
  Head: a relational atom
  ←: read “if”
  Body: one or more atoms, called subgoals, which may be relational or arithmetic
Example: LongMovie(t,y) ← Movies(t,y,l,g,s,p) AND l ≥ 100
It says: LongMovie(t,y) is true whenever we can find a tuple in Movies with (t,y) as its first two components, a third component l that is at least 100, and any values in components 4 through 6.
Equivalent to this assignment statement in relational algebra: LongMovie := π_{title,year}(σ_{length ≥ 100}(Movies))
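The LongMovie rule reads almost directly as a Python set comprehension. The movie rows are made up, and the components after the length are simply ignored, as in the rule.

```python
# Hypothetical Movies tuples: title, year, length, then further components.
movies = {
    ("Epic", 1995, 150, "drama", "Fox", 101),
    ("Short", 2001, 85, "comedy", "Acme", 102),
}

# LongMovie(t, y) <- Movies(t, y, l, ...) AND l >= 100
# The head (t, y) is produced for every Movies tuple that satisfies the body.
long_movie = {(t, y) for (t, y, l, *_rest) in movies if l >= 100}
```

Matching a tuple against the pattern plays the role of the relational subgoal, and the `if` clause plays the role of the arithmetic subgoal.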

53 Datalog: Logic instead of Algebra
Extensional and intensional predicates:
  Extensional predicates (EDB) are predicates whose relations are stored in a database.
  Intensional predicates (IDB) are predicates whose relations are computed by applying one or more Datalog rules.
As long as there are no negated relational subgoals, the method for evaluating rules on sets applies to bags as well.
Relational algebra and Datalog: assume R(A,B,C) and S(A,B,C).
Boolean operations. Union: R ∪ S is equivalent to these two rules:
  U(A,B,C) ← R(A,B,C)
  U(A,B,C) ← S(A,B,C)

54 Datalog: Logic instead of Algebra
Intersection: R ∩ S is equivalent to the following rule:
  I(A,B,C) ← R(A,B,C) AND S(A,B,C)
Set difference: R - S is equivalent to the following rule:
  D(A,B,C) ← R(A,B,C) AND NOT S(A,B,C)
Projection: P(A,B) ← R(A,B,C)
Selection: S(A,B,C) ← R(A,B,C) AND C ≥ 100
Product: P(A,B,C,D,E,F) ← R(A,B,C) AND S(D,E,F)
Join: J(A,B,C,D) ← R(A,B) AND S(B,C,D)

55 Datalog: Logic instead of Algebra
Simulating multiple operations with Datalog.
Example: the algebraic expression
  π_{title,year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName=‘Fox’}(Movies))
translates into this set of rules:
  W(t,y,l,g,s,p) ← Movies(t,y,l,g,s,p) AND l ≥ 100
  X(t,y,l,g,s,p) ← Movies(t,y,l,g,s,p) AND s = ‘Fox’
  Y(t,y,l,g,s,p) ← W(t,y,l,g,s,p) AND X(t,y,l,g,s,p)
  Answer(t,y) ← Y(t,y,l,g,s,p)
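The four rules can be mirrored one for one in Python, with each intensional predicate computed as a set. The movie rows are hypothetical; the component order (t, y, l, g, s, p) follows the rules above.

```python
# Hypothetical Movies(t, y, l, g, s, p) tuples.
movies = {
    ("Long Fox Film", 1990, 120, "drama", "Fox", 1),
    ("Short Fox Film", 1991, 80, "comedy", "Fox", 2),
    ("Long Other Film", 1992, 130, "drama", "Acme", 3),
}

W = {m for m in movies if m[2] >= 100}            # W <- Movies AND l >= 100
X = {m for m in movies if m[4] == "Fox"}          # X <- Movies AND s = 'Fox'
Y = W & X                                         # Y <- W AND X (intersection)
answer = {(t, y) for (t, y, l, g, s, p) in Y}     # Answer(t, y) <- Y
```

Only the film that is both long and from Fox reaches `answer`.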

56 END