1 CSE 480: Database Systems Lecture 16: Relational Algebra.


2 Review (Relational Algebra Operators)
• Unary operators
  – SELECT: σ_{condition}(R)
  – PROJECT: π_{Attribute-List}(R)
  – RENAME: ρ_{S(A1,A2,…,Ak)}(R)
• Set operators
  – R ∪ S
  – R ∩ S
  – R – S or S – R
• Cartesian product (cross product or cross join)
  – R × S

3 Example (Select Operator)
σ_{Salary>50000}(Employee)

4 Example (Project Operator)
π_{Ssn,Super_ssn}(Employee)

5 Example (Rename Operator)
ρ_{SUP}(π_{SSN,Super_ssn}(Employee))
The resulting relation is named SUP.

6 Example (Intersection)
π_{SSN}(Employee) ∩ π_{Super_SSN}(Employee)

7 Example (Union)
π_{SSN}(Employee) ∪ π_{Super_SSN}(Employee)

8 Example (Set Difference)
π_{SSN}(Employee) – π_{Super_SSN}(Employee)

9 Example (Cartesian/Cross Product)
π_{SSN}(Employee) × π_{Super_SSN}(Employee) …
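The operators on slides 3 to 9 can be tried out with a minimal Python sketch. It assumes relations are modelled as lists of dicts and single-column relations as Python sets; the sample rows and the helper names select and project are illustrative and do not come from the lecture.

```python
# Illustrative data only: three employees, one of whom supervises the others.
employee = [
    {"Ssn": "111", "Super_ssn": "333", "Salary": 60000},
    {"Ssn": "222", "Super_ssn": "333", "Salary": 40000},
    {"Ssn": "333", "Super_ssn": None, "Salary": 90000},
]

def select(relation, predicate):
    """SELECT (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """PROJECT (pi): keep only the listed attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

high_paid = select(employee, lambda r: r["Salary"] > 50000)

# Single-column projections, held as sets so the set operators apply directly.
ssns = {r["Ssn"] for r in project(employee, ["Ssn"])}
supervisors = {r["Super_ssn"] for r in project(employee, ["Super_ssn"])}

both = ssns & supervisors               # intersection
either = ssns | supervisors             # union
non_supervisors = ssns - supervisors    # set difference
pairs = {(a, b) for a in ssns for b in supervisors}  # Cartesian product
```

Note that project removes duplicates, so the two employees who share supervisor 333 contribute a single Super_ssn value.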

10 Join Operation
Example: Suppose we want to retrieve information about the manager of each department (name, salary, address, etc.). To get the manager's name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple of its manager. If we use the Cartesian product, we get every possible pair of tuples from DEPARTMENT and EMPLOYEE.

11 Join Operation
• Apply a select operation after the Cartesian product: σ_{MgrSSN=SSN}(Department × Employee)
• Equivalent to the join operator: Department ⋈_{MgrSSN=SSN} Employee

12 Join Operation
• A join operation between relations R and S is R ⋈_{join-condition} S
  – where join-condition is a conjunction of terms
• The join operation is equivalent to: σ_{join-condition}(R × S)
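The equivalence between a join and a select over the Cartesian product can be sketched in Python. The sample rows and the helper names cartesian and theta_join are assumptions made for illustration; only the attribute names MgrSSN and SSN come from the slides.

```python
from itertools import product

# Illustrative data only.
department = [
    {"Dname": "Research", "MgrSSN": "333"},
    {"Dname": "Admin", "MgrSSN": "222"},
]
employee = [
    {"SSN": "111", "Name": "John"},
    {"SSN": "222", "Name": "Mary"},
    {"SSN": "333", "Name": "Bob"},
]

def cartesian(r, s):
    """R x S: every pairing of a tuple from r with a tuple from s.
    Assumes the two relations have disjoint attribute names."""
    return [{**a, **b} for a, b in product(r, s)]

def theta_join(r, s, condition):
    """Join implemented literally as a select over the Cartesian product."""
    return [row for row in cartesian(r, s) if condition(row)]

all_pairs = cartesian(department, employee)
managers = theta_join(department, employee, lambda t: t["MgrSSN"] == t["SSN"])
```

Here all_pairs holds six combined tuples, of which only the two that satisfy MgrSSN = SSN survive the join.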

13 Theta Join
• The general case of the JOIN operation is called a theta join: R ⋈_{theta} S
• Theta can be any general Boolean expression on the attributes of R and S; for example:
  – R.A1=S.B2 AND (R.A4=S.B3 OR R.A3<S.B4)
  – R.A1=S.B1 AND R.A4<500
  – R.A1=S.B1 AND R.A2=S.B2 AND R.A3=S.B3
  – R.A1>S.B1 AND R.A4≠'Freshman'

14 Example: COMPANY Database Schema
• Find the names of employees who earn more than their managers
Manager ← (Department ⋈_{Mgr_ssn=SSN} Employee)
π_{Employee.FName,Employee.LName}(Employee ⋈_{Dno=Dnumber AND Employee.Salary>Manager.Salary} Manager)

15 Theta Join Properties
• Associative property: (A ⋈_{cond1} B) ⋈_{cond2} C ≡ A ⋈_{cond1} (B ⋈_{cond2} C)
  – where cond1 involves attributes of relations A and B
  – and cond2 involves attributes of relations B and C

16 Example
• Find the names of professors who have taught a course offered by the CS department (DeptId = 'CS')
• Result schema? NameOfProfessor
• How?
  1. Get the list of courses offered by CS (from the Course table)
  2. Get the IDs of the professors who taught them (from the Teaching table)
  3. Get the names of those professors (from the Professor table)

17 Example
• Find the names of professors who have taught a course offered by the CS department (DeptId = 'CS')
CSCourse ← π_{CrsCode}(σ_{DeptId='CS'}(COURSE))
ProfsWhoTeachCS ← π_{ProfId}(CSCourse ⋈_{CSCourse.CrsCode=Teaching.CrsCode} TEACHING)
Result ← π_{Name}(ProfsWhoTeachCS ⋈_{ProfsWhoTeachCS.ProfId=PROFESSOR.Id} PROFESSOR)
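The three-step plan from slides 16 and 17 can be followed in a short Python sketch. The rows below are invented for illustration, and set comprehensions stand in for the project and join operators; this is one possible reading of the query, not the lecture's implementation.

```python
# Illustrative data only.
course = [
    {"DeptId": "CS", "CrsCode": "CS101"},
    {"DeptId": "MATH", "CrsCode": "MA201"},
]
teaching = [
    {"ProfId": 1, "CrsCode": "CS101"},
    {"ProfId": 2, "CrsCode": "MA201"},
]
professor = [
    {"Id": 1, "Name": "Ada"},
    {"Id": 2, "Name": "Emmy"},
]

# Step 1: CSCourse <- project CrsCode (select DeptId = 'CS' (COURSE))
cs_course = {c["CrsCode"] for c in course if c["DeptId"] == "CS"}

# Step 2: ProfsWhoTeachCS <- project ProfId (CSCourse join TEACHING on CrsCode)
profs_who_teach_cs = {t["ProfId"] for t in teaching if t["CrsCode"] in cs_course}

# Step 3: Result <- project Name (ProfsWhoTeachCS join PROFESSOR on ProfId = Id)
result = {p["Name"] for p in professor if p["Id"] in profs_who_teach_cs}
```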

18 Example
• Find the names of students who took a course taken by John Doe (in the same semester) but who received a better grade
• Result schema? NameOfStudent
• How?
  1. Find the courses taken by John Doe and their corresponding grades
  2. Find the students who took the same courses but had better grades
  3. Find the names of these students

19 Example
• Find the names of students who took the same course as John Doe (in the same semester) but received a better grade
JD ← σ_{Name='John Doe'}(STUDENT) ⋈_{Id=StudId} TRANSCRIPT
BStudents ← π_{Transcript.StudId}(JD ⋈_{JD.CrsCode=Transcript.CrsCode AND JD.Semester=Transcript.Semester AND JD.Grade<Transcript.Grade} TRANSCRIPT)
BStudentNames ← π_{Name}(BStudents ⋈_{BStudents.StudId=STUDENT.Id} STUDENT)

20 Equijoin
• The most common use of join involves join conditions with equality comparisons only: R ⋈_{join-condition} S
• The join condition is a conjunction of equalities: R.A1=S.B1 AND … AND R.Ak=S.Bk
• Used very frequently since it combines related data from different relations

21 Equijoin – Example
Student
Id   Name  Addr  Status
111  John  …     …
222  Mary  …     …
333  Bill  …     …
444  Joe   …     …
Transcript
StudId  CrsCode  Sem  Grade
111     CSE305   S00  B
111     CSE306   S99  A
222     CSE304   F99  A
Student ⋈_{Id=StudId} Transcript
Id   Name  Addr  Status  StudId  CrsCode  Sem  Grade
111  John  …     …       111     CSE305   S00  B
111  John  …     …       111     CSE306   S99  A
222  Mary  …     …       222     CSE304   F99  A
• The join condition involves equality comparisons between corresponding attributes in the two relations
• The result contains columns (attributes) with identical values (Id and StudId), which is redundant

22 Natural Join
• A special case of equijoin: R ⋈ S (written R * S on some later slides)
  – the join condition equates all attributes with the same name (the condition does not have to be stated explicitly)
  – duplicate columns are eliminated from the result

23 Natural Join
Student
Id   Name  Addr  Status
111  John  …     …
222  Mary  …     …
333  Bill  …     …
444  Joe   …     …
Transcript2
Id   CrsCode  Sem  Grade
111  CSE305   S00  B
111  CSE306   S99  A
222  CSE304   F99  A
Student ⋈ Transcript2
Id   Name  Addr  Status  CrsCode  Sem  Grade
111  John  …     …       CSE305   S00  B
111  John  …     …       CSE306   S99  A
222  Mary  …     …       CSE304   F99  A
• The join attribute must have the same name (Id) in both relations
• The duplicate attribute (Id) is removed from the result

24 Equijoin vs Natural Join
R
Id  Name  Sex
1   John  M
2   Mary  F
3   Bob   M
S
Id  Test     Status
1   Eye      Fail
2   Hearing  Pass
4   Eye      Pass
Equijoin: R ⋈_{R.Id=S.Id} S
R.Id  Name  Sex  S.Id  Test     Status
1     John  M    1     Eye      Fail
2     Mary  F    2     Hearing  Pass
Natural join: R ⋈ S
Id  Name  Sex  Test     Status
1   John  M    Eye      Fail
2   Mary  F    Hearing  Pass
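The difference between the two result schemas on slide 24 can be reproduced with a small Python sketch using the same R and S. The functions equijoin and natural_join are illustrative helpers; in this sketch every equijoin column is prefixed with its relation name, which is slightly more verbose than the slide's headers.

```python
R = [
    {"Id": 1, "Name": "John", "Sex": "M"},
    {"Id": 2, "Name": "Mary", "Sex": "F"},
    {"Id": 3, "Name": "Bob", "Sex": "M"},
]
S = [
    {"Id": 1, "Test": "Eye", "Status": "Fail"},
    {"Id": 2, "Test": "Hearing", "Status": "Pass"},
    {"Id": 4, "Test": "Eye", "Status": "Pass"},
]

def equijoin(r, s, r_name, s_name, attr):
    """Equijoin on attr; attributes are qualified so both copies of attr survive."""
    out = []
    for a in r:
        for b in s:
            if a[attr] == b[attr]:
                row = {f"{r_name}.{k}": v for k, v in a.items()}
                row.update({f"{s_name}.{k}": v for k, v in b.items()})
                out.append(row)
    return out

def natural_join(r, s):
    """Natural join: equate all same-named attributes and keep one copy of each."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out

eq = equijoin(R, S, "R", "S", "Id")   # six columns, including R.Id and S.Id
nat = natural_join(R, S)              # five columns, a single Id
```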

25 Example
• List the Ids of students who took at least two different courses, given Transcript(StudId, CrsCode, Sem, Grade):
π_{StudId}(σ_{CrsCode≠CrsCode}(Transcript ⋈ Transcript))
• What's wrong with the above query? The natural join also matches on CrsCode, Sem, and Grade. We don't want to join on those attributes, hence we must rename them:
π_{StudId}(σ_{CrsCode≠CrsCode2}(Transcript ⋈ ρ_{(StudId,CrsCode2,Sem2,Grade2)}(Transcript)))
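A Python sketch makes the problem with the unrenamed self-join visible. The Transcript rows are borrowed from the earlier equijoin example, and rename and natural_join are illustrative helpers rather than anything defined in the lecture.

```python
transcript = [
    {"StudId": 111, "CrsCode": "CSE305", "Sem": "S00", "Grade": "B"},
    {"StudId": 111, "CrsCode": "CSE306", "Sem": "S99", "Grade": "A"},
    {"StudId": 222, "CrsCode": "CSE304", "Sem": "F99", "Grade": "A"},
]

def rename(relation, mapping):
    """RENAME (rho): relabel attributes according to mapping, leaving others as-is."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in relation]

def natural_join(r, s):
    """Natural join: equate all same-named attributes and keep one copy of each."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out

# Naive version: every attribute is shared, so each tuple only matches itself
# and the CrsCode comparison can never succeed.
naive = natural_join(transcript, transcript)
naive_result = {row["StudId"] for row in naive if row["CrsCode"] != row["CrsCode"]}

# Corrected version: after renaming, only StudId is shared between the two copies.
copy = rename(transcript, {"CrsCode": "CrsCode2", "Sem": "Sem2", "Grade": "Grade2"})
joined = natural_join(transcript, copy)
result = {row["StudId"] for row in joined if row["CrsCode"] != row["CrsCode2"]}
```

With this data the naive query returns the empty set, while the renamed query returns student 111, who took both CSE305 and CSE306.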

26 OUTER JOIN Operations
• In THETA JOIN, NATURAL JOIN and EQUIJOIN, tuples without a matching (or related) tuple are eliminated
  – This amounts to a loss of information
• OUTER joins can be used if we want to keep all the tuples, regardless of whether they have matching tuples in the other relation
R
Id  Name  Sex
1   John  M
2   Mary  F
3   Bob   M
S
Id  Test     Status
1   Eye      Fail
2   Hearing  Pass
4   Eye      Pass
R * S
Id  Name  Sex  Test     Status
1   John  M    Eye      Fail
2   Mary  F    Hearing  Pass

27 OUTER JOIN Operations
• Left outer join: R ⟕ S
  – keeps every tuple in the left relation R
  – if no matching tuple is found in S, the attributes of S in the join result are filled ("padded") with null values
• Right outer join: R ⟖ S
  – keeps every tuple in the right relation S
• Full outer join: R ⟗ S
  – keeps all tuples in both the left and the right relations; when no matching tuples are found, pads them with null values as needed

28 Left Outer Join – Example
R
Id  Name  Sex
1   John  M
2   Mary  F
3   Bob   M
S
Id  Test     Status
1   Eye      Fail
2   Hearing  Pass
4   Eye      Pass
R ⟕_{Id=Id} S
R.Id  Name  Sex  S.Id  Test     Status
1     John  M    1     Eye      Fail
2     Mary  F    2     Hearing  Pass
3     Bob   M    NULL  NULL     NULL

29 Right Outer Join – Example
(R and S as on the previous slide)
R ⟖_{Id=Id} S
R.Id  Name  Sex   S.Id  Test     Status
1     John  M     1     Eye      Fail
2     Mary  F     2     Hearing  Pass
NULL  NULL  NULL  4     Eye      Pass

30 Full Outer Join – Example
(R and S as on the previous slides)
R ⟗_{Id=Id} S
R.Id  Name  Sex   S.Id  Test     Status
1     John  M     1     Eye      Fail
2     Mary  F     2     Hearing  Pass
3     Bob   M     NULL  NULL     NULL
NULL  NULL  NULL  4     Eye      Pass
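The three outer joins can be sketched in Python on the same R and S. This is a minimal illustration that assumes tuples are plain Python tuples joined on their first field (Id), with None standing in for NULL; the function names are hypothetical.

```python
R = [(1, "John", "M"), (2, "Mary", "F"), (3, "Bob", "M")]             # (Id, Name, Sex)
S = [(1, "Eye", "Fail"), (2, "Hearing", "Pass"), (4, "Eye", "Pass")]  # (Id, Test, Status)

NULLS = (None, None, None)  # padding; both relations happen to have three attributes

def left_outer(r, s):
    """Keep every tuple of r; pad with NULLs when s has no match on Id."""
    out = []
    for a in r:
        matches = [a + b for b in s if a[0] == b[0]]
        out.extend(matches or [a + NULLS])
    return out

def right_outer(r, s):
    """Keep every tuple of s; pad with NULLs when r has no match on Id."""
    out = []
    for b in s:
        matches = [a + b for a in r if a[0] == b[0]]
        out.extend(matches or [NULLS + b])
    return out

def full_outer(r, s):
    """Left outer join plus the tuples of s that matched nothing in r."""
    out = left_outer(r, s)
    r_ids = {a[0] for a in r}
    out.extend(NULLS + b for b in s if b[0] not in r_ids)
    return out

left = left_outer(R, S)
right = right_outer(R, S)
full = full_outer(R, S)
```

Each output tuple has the layout (R.Id, Name, Sex, S.Id, Test, Status), matching the column order on slides 28 to 30.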

31 Division
• Example: Find students who have taken ALL the required courses
Enrollment
Name  CrsCode
John  CSE231
John  CSE331
Mary  CSE331
RequiredCourses
CrsCode
CSE231
CSE331
• Relational algebra: Enrollment ÷ RequiredCourses
Result
Name
John

32 Division
• Given two relations:
  – R(A1, …, An, B1, …, Bm)
  – S(B1, …, Bm)
• T ← R ÷ S
  – Schema of the result relation: T(A1, …, An)
  – T contains the set of all tuples <a> such that for every tuple <b> in S, <a, b> is in R
R
A   B
a1  b1
a1  b2
a1  b3
a2  b1
S
B
b1
b2
T = R ÷ S
A
a1
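The definition of division translates almost word for word into Python. The sketch below assumes a binary relation R held as a set of (a, b) pairs and S as a set of b values, which covers both examples on slides 31 and 32; divide is an illustrative name.

```python
def divide(r, s):
    """R / S: the a-values that are paired in r with every b-value in s."""
    candidates = {a for a, _ in r}
    return {a for a in candidates if all((a, b) in r for b in s)}

# Slide 32 example.
R = {("a1", "b1"), ("a1", "b2"), ("a1", "b3"), ("a2", "b1")}
S = {"b1", "b2"}
T = divide(R, S)

# Slide 31 example: students who have taken all required courses.
enrollment = {("John", "CSE231"), ("John", "CSE331"), ("Mary", "CSE331")}
required_courses = {"CSE231", "CSE331"}
finished = divide(enrollment, required_courses)
```

In the first example a2 is excluded because (a2, b2) is not in R; in the second, Mary is excluded because she has not taken CSE231.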

33 Example
Schema:
  Student(Id, Name, Addr, Status)
  Professor(Id, Name, DeptId)
  Course(DeptId, CrsCode, CrsName, Descr)
  Transcript(StudId, CrsCode, Semester, Grade)
  Teaching(ProfId, CrsCode, Semester)
  Department(DeptId, Name)
• Find the names of students who took a course from every professor who had ever taught a course
• Find the names of professors who taught all the CS courses that have been taken by all students

34 Example
• Find the names of students who took a course from every professor who had ever taught a course
• Denom: every professor who ever taught a course
  Denom ← π_{ProfId}(Teaching)
• Num: set of (student, professor) pairs, where the student has taken a course from the professor
  Num ← π_{StudId,ProfId}(Transcript * Teaching)
• Result ← π_{Name}(Student ⋈_{Id=StudId} (Num ÷ Denom))

35 Example
• Find the names of professors who taught all the CS courses that have been taken by all students
• Denom: CS courses taken by all students
  – Denom2: all students
    Denom2(StudId) ← π_{Id}(Student)
  – Num2: (CS course, student) pairs, where the student took the CS course
    Num2 ← π_{CrsCode,StudId}(Transcript * σ_{DeptId='CS'}(Course))
  – Denom ← Num2 ÷ Denom2
• Num: set of (prof, course) pairs, where the prof taught the course
  Num ← π_{ProfId,CrsCode}(Teaching)
• Result ← π_{Name}(Professor ⋈_{Id=ProfId} (Num ÷ Denom))

36 Aggregate Function
• Examples:
  – Find the average salary of all employees, or count the total number of employees
  – These functions are used in simple statistical queries that summarize information from the database tuples
• Examples of aggregate functions:
  – SUM, AVERAGE, MAXIMUM, MINIMUM, and COUNT

37 Examples of Aggregate Functions
• ℱ_{MAX Salary}(EMPLOYEE) retrieves the maximum Salary value from the EMPLOYEE relation
• ℱ_{MIN Salary}(EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation
• ℱ_{SUM Salary}(EMPLOYEE) retrieves the sum of the Salary values from the EMPLOYEE relation
• ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE) computes the total number of employees and their average salary

38 Using Grouping with Aggregation
• The previous examples aggregate one or more attributes over the entire relation
  – Ex: Find the maximum/minimum salary or count the number of employees
• Grouping allows us to aggregate tuples for each group
  – Ex: For each department, retrieve the DNO, COUNT SSN, and AVERAGE Salary:
    _{DNO}ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE)

39 Aggregate functions and grouping
• With grouping: _{DNO}ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE)
• Without grouping: ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE)
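The contrast between the two expressions can be sketched in Python: one computes COUNT and AVERAGE over the whole relation, the other first partitions the tuples by DNO. The employee rows and result key names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Illustrative data only.
employee = [
    {"SSN": "111", "DNO": 5, "Salary": 30000},
    {"SSN": "222", "DNO": 5, "Salary": 40000},
    {"SSN": "333", "DNO": 4, "Salary": 20000},
]

def count_and_average(rows):
    """COUNT SSN and AVERAGE Salary over the given tuples."""
    return {
        "COUNT_SSN": len(rows),
        "AVERAGE_Salary": mean(r["Salary"] for r in rows),
    }

# Without grouping: one result tuple for the entire relation.
overall = count_and_average(employee)

# With grouping on DNO: one result tuple per department.
groups = defaultdict(list)
for e in employee:
    groups[e["DNO"]].append(e)
per_dept = {dno: count_and_average(rows) for dno, rows in groups.items()}
```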

40 Summary (Relational Algebra Operators)
• Unary operators: SELECT, PROJECT, RENAME
• Set operators
  – UNION: R ∪ S
  – INTERSECTION: R ∩ S
  – SET DIFFERENCE (MINUS): R – S or S – R
• Cartesian product (cross product, cross join)
  – R × S
• Join: Theta Join, Equijoin, Natural Join, Outer Join
• Aggregate functions and grouping

41 Exercise
• Find the names of CS professors who have never taught John Doe
• Find the names of CS professors who taught a class every semester
• Find the most popular CS class (i.e., the class with the highest number of enrolled students)

