Lecture 15: Relational Algebra CSE 480: Database Systems Lecture 15: Relational Algebra Reference: Read Chapter 6.1-6.3.1 of the textbook
What is Relational Algebra? A formal way to express a query in relational model A query consists of relational expressions describing the sequence of operators applied to obtain the query result Example : LName, Sex ( DNo=4 OR Salary>100000 (EMPLOYEE) ) and are the operators
Why Study Relational Algebra? Query processing in DBMS SELECT A.Name, B.Grade FROM A, B WHERE A.Id = B.Id Name, Grade (Id,Name(A B))
Relational Algebra Operator Output of a relational algebra expression is a relation Types of operators in a relational algebra expression Unary: op(Relation) Binary: Relation op Relation Relational algebra operators select, project, union, set difference, Cartesian product
Select Operator () Id Name Address Status condition (R) (NOT THE SAME AS SELECT in SQL) Resulting relation contains only tuples in R that satisfy the given condition has the same attributes (columns) as the original relation Example: Student Status=‘Junior’(Student) Id Name Address Status Id Name Address Status 1123 John 123 Main Junior 1234 Lee 123 Main Senior 5556 Mary 7 Lake Dr Freshman 9876 Bart 5 Pine St Junior 1123 John 123 Main Junior 9876 Bart 5 Pine St Junior
Selection Condition select_condition (relation) Selection condition acts as a filter to the rows in relation Selection condition is a Boolean expression Simple selection condition: <attribute> operator <constant> <attribute> operator <attribute> where operator: <, , , >, =, <condition> AND <condition> <condition> OR <condition> NOT <condition>
Selection Condition - Examples Student 1123 John 123 Main Junior 1234 Lee 123 Main Senior 5556 Mary 7 Lake Dr Freshman 9876 Bart 5 Pine St Junior Id Name Address Status Selectivity Fraction of tuples selected by a selection condition Id>3000 OR Status=‘Freshman’ (Student) selectivity = 2/4 Id>3000 AND Id <3999 (Student) selectivity = 0/4 NOT(Status=‘Senior’) (Student) selectivity = 3/4 Status‘Senior’ (Student) selectivity = 3/4
Selection Condition – Incorrect Queries Find professors who taught both CS315 and CS305 CrsCode=‘CS315’ AND CrsCode=‘CS305’ (TEACHING) X Find courses taught by CS professors PROFESSOR.Id=Teaching.ProfId AND PROFESSOR.DeptId=‘CS’ (TEACHING) X
Properties of SELECT operator Commutative: <condition1>( < condition2> (R)) = <condition2> ( < condition1> (R)) Cascade of SELECT operations may be applied in any order: <cond1>(<cond2> (<cond3> (R)) = <cond2> (<cond3> (<cond1> ( R))) Cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions: <cond1>(< cond2> (<cond3>(R)) = <cond1> AND < cond2> AND < cond3>(R)))
Project Operator Produces a relation containing a subset of columns of its input relation attribute list(R) Example: Student Name,Status(Student) Id Name Address Status Name Status 1123 John 123 Main Junior 1234 Lee 123 Main Freshman 5556 Mary 7 Lake Dr Senior 9876 Bart 5 Pine St Junior John Junior Lee Freshman Mary Senior Bart Junior
Project Operator Resulting relation has no duplicates; therefore it can have fewer tuples than the input relation Example: Student Address(Student) Id Name Address Status Address 123 Main 7 Lake Dr 5 Pine St 1123 John 123 Main Junior 1234 Lee 123 Main Freshman 5556 Mary 7 Lake Dr Senior 9876 Bart 5 Pine St Junior
Properties of PROJECT Operator Number of tuples in the result of projection <list>(R) is always less or equal to the number of tuples in R If list of projected attributes includes a superkey, the resulting relation has the same number of tuples as the input relation PROJECT operator is not commutative <list1> ( <list2> (R) ) <list2> ( <list1> (R) )
Relational Algebra Expressions Find the Ids and names of junior undergraduate students Student 1123 John 123 Main Junior 1234 Lee 123 Main Freshman 5556 Mary 7 Lake Dr Senior 9876 Bart 5 Pine St Junior Id Name Address Status 1123 John 9876 Bart Result Id Name Id, Name ( Status=’Junior’ (Student) )
Relational Algebra Expressions We may want to apply several relational algebra operations one after the other We can write the operations as a single relational algebra expression by nesting the operations, or We can apply one operation at a time and create intermediate result relations. In the latter case, we must give names to the relations that hold the intermediate results.
Single versus Sequence of operations Query: Find the first name, last name, and salary of all employees who work in department number 5 We can write a single relational algebra expression: FNAME, LNAME, SALARY( DNO=5(EMPLOYEE)) OR We can explicitly show the sequence of operations, giving a name to each intermediate relation: DEP5_EMPS DNO=5(EMPLOYEE) RESULT FNAME, LNAME, SALARY (DEP5_EMPS)
RENAME operator The RENAME operator is denoted by We may rename the attributes of a relation or the relation name or both S (B1, B2, …, Bn )(R) Change the relation name from R to S, and Change column (attribute) names to B1, B2, …..Bn S(R) : Change the relation name only to S (B1, B2, …, Bn )(R) : Change the column (attribute) names only to B1, B2, …..Bn
Example Change relation name to Emp and attributes to First_name, Last_name, and Salary Emp(First_name, Last_name, Salary ) ( FNAME,LNAME,SALARY (EMPLOYEE)) or Emp(First_name, Last_name, Salary) FNAME,LNAME,SALARY (EMPLOYEE)
Set Operators A relation is a set of tuples; therefore set operations are applicable: Intersection: Union: Set difference (Minus): Set operators are binary operators Operator: Relation Relation Relation Result of set operation is a relation that has the same schema as the combining relations
Examples: Set Operations Y A B x1 x2 x3 x4 A B x1 x2 x5 x6 X Y X Y X – Y A B x1 x2 A B x1 x2 x3 x4 x5 x6 A B x3 x4
Example STUDENT INSTRUCTOR STUDENT – INSTRUCTOR STUDENT INSTRUCTOR
Union Compatible Relations Set operations are limited to union compatible relations Two relations are union compatible if: they have the same number of attributes, and the domains of corresponding attributes are type compatible (i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n). The resulting relation has the same attribute names as the first operand relation
Example Tables: Student (SSN, Name, Address, Status) Professor (Id, Name, Office, Phone) are not union compatible. But Name (Student) and Name (Professor) are union compatible
Exercise Relations: Find the SSN of student employees Employee (SSN, Name, Address) Professor (SSN, Office, Phone) Student (SSN, Status) Find the SSN of student employees SSN (Employee) SSN (Student) Find the SSN of employees who are not professors SSN (Employee) – SSN (Professor) Find the SSN of employees who are neither professors nor students SSN (Employee) – ( SSN (Student) SSN (Professor))
Cartesian Product A B C D A B C D x1 x2 y1 y2 x1 x2 y1 y2 R S is expensive to compute: Number of columns = degree(R) + degree(S) Number of rows = number of rows (R) × number of rows (S) A B C D A B C D x1 x2 y1 y2 x1 x2 y1 y2 x3 x4 y3 y4 x1 x2 y3 y4 x3 x4 y1 y2 R S x3 x4 y3 y4 R S
Example List all the possible Broker-Client pairs Broker (BrokerId, BrokerName) Client (ClientId, ClientName) List all the possible Broker-Client pairs Broker Client BrokerID BrokerName 1 Merrill Lynch 2 Morgan Stanley 3 Salomon Smith Barney ClientId ClientName A11 Bill Gates B11 Steve Jobs BrokerId BrokerName ClientId ClientName 1 Merrill Lynch A11 Bill Gates B11 Steve Jobs 2 Morgan Stanley 3 Salomon Smith Barney
More Complex Examples Find the names of employees who are not supervisors
Example Find the names of employees who earn more than their supervisors
Summary (Relational Algebra Operators) Unary operators SELECT condition(R) PROJECT Attribute-List(R) RENAME S(A1,A2,…Ak)(R) Set operators R S R S R – S or S – R Cartesian product R S