Discussion #23 1/32 Discussion #23 Relational Algebra.

Presentation transcript:

Discussion #23 1/32 Discussion #23 Relational Algebra

Discussion #23 2/32 Topics
Algebras
Relational Algebra
– use of standard notation
– set operators ∪, ∩, −
– renaming ρ
– selection σ
– projection π
– cross product ×
– join ⋈
Queries (from English)
Query optimization
SQL

Discussion #23 3/32 Relational Algebra
What is an algebra?
– a pair: (set of values, set of operations)
– ≈ ADT ≈ type ≈ Class ≈ Object
– e.g. stack: (set of all stacks, {pop, push, top, …})
– e.g. integer: (set of all integers, {+, -, *, ÷})
What is relational algebra?
– (set of relations, set of relational operators)
– {∪, ∩, −, ρ, σ, π, ×, ⋈}

Discussion #23 4/32 Relational Algebra is Closed
Closed: all operations produce values in the value set
– (reals, {+, *, −}): closed
– (reals, {+, *, −, ÷}): not closed (divide by 0)
– (reals, {+, *, >}): not closed (T/F not in value set)
– (computer reals, {+, *, −}): not closed (overflow, roundoff)
– (relations, relational operators): closed
Implication: we can always nest relational operators; can't for algebras that are not closed.
– e.g. after overflow, can do nothing
– e.g. can't always nest: (2 < 3) + 5 = ?

Discussion #23 5/32 Set Operations: ∪, ∩, and −
Relations are sets; thus set operations should work.
[Example tables over schema A B for R, S, R ∪ S, R ∩ S, R − S, and S − R; tuple values not preserved apart from one surviving row (1, 2).]

Discussion #23 6/32 Set Operations (continued …)
Definition: schema(R) = {A, B} = AB, i.e. the set of attributes
We sometimes write R(AB) to mean the relation R with schema AB.
Definition: union compatible
– schema(R) = schema(S)
– required precondition for ∪, ∩, −
Definitions:
– R ∪ S = { t | t ∈ R ∨ t ∈ S }
– R ∩ S = { t | t ∈ R ∧ t ∈ S }
– R − S = { t | t ∈ R ∧ t ∉ S }
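The three set operators and the union-compatibility precondition can be sketched in Python. This is a minimal illustration, not part of the slides: the `relation` helper, the function names, and the sample tuples are all assumptions, and a relation is modelled as a pair (schema, set of tuples) with each tuple a frozenset of (attribute, value) pairs.

```python
def relation(attrs, rows):
    """Build (schema, tuples); each tuple is a frozenset of (attribute, value) pairs."""
    return frozenset(attrs), {frozenset(zip(attrs, row)) for row in rows}

def check_union_compatible(r, s):
    """Precondition for union, intersection, difference: schema(R) = schema(S)."""
    if r[0] != s[0]:
        raise ValueError("not union compatible: schemas differ")

def union(r, s):
    check_union_compatible(r, s)
    return r[0], r[1] | s[1]      # { t | t in R or t in S }

def intersection(r, s):
    check_union_compatible(r, s)
    return r[0], r[1] & s[1]      # { t | t in R and t in S }

def difference(r, s):
    check_union_compatible(r, s)
    return r[0], r[1] - s[1]      # { t | t in R and t not in S }

# Illustrative data (hypothetical, not the values from the slides).
R = relation("AB", [(1, 2), (2, 2)])
S = relation("AB", [(2, 2), (3, 3)])
Q = relation("AC", [(1, 2)])      # different schema, so not union compatible with R

print(union(R, S))
print(intersection(R, S))
print(difference(R, S))
print(difference(S, R))
```

Calling `union(R, Q)` raises `ValueError`, which mirrors the rule that the operators are only defined for union-compatible relations.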

Discussion #23 7/32 Tuple Restriction: [X]
Restriction is a tuple operator (not a relational operator).
t[X] restricts tuple t to the attributes in X.
Example: t = (1, 2, 3) over schema A B C
– t[A] = (1)
– t[AC] = (1, 3)
Worked out: t[A] = (1,2,3)[A] = {(A,1), (B,2), (C,3)}[A] = {(A,1)} = (1)
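Tuple restriction is easy to mirror in Python if a tuple is stored the way the slide writes it out, as a set of (attribute, value) pairs. The function name `restrict` is an assumption made for this sketch.

```python
def restrict(t, attrs):
    """t[X]: keep only the (attribute, value) pairs whose attribute is in X."""
    return frozenset((a, v) for a, v in t if a in attrs)

# The slide's tuple t = (1, 2, 3) over schema A B C.
t = frozenset({("A", 1), ("B", 2), ("C", 3)})

t_a = restrict(t, {"A"})          # t[A]
t_ac = restrict(t, {"A", "C"})    # t[AC]
print(sorted(t_a), sorted(t_ac))
```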

Discussion #23 8/32 Renaming: ρ
ρ_{A→B} R renames attribute A to be B.
– A must be in schema(R)
– B must not be in schema(R)
Example: with R = A B and Q = A C, R ∪ Q = ? Not union compatible.
But with ρ: let ρ_{C→B} Q = A B; then R ∪ ρ_{C→B} Q = A B.
[Tuple values of the example tables not preserved.]

Discussion #23 9/32 Renaming (continued…)
Q = ρ_{A→B} R renames attribute A to B; the result is Q.
Precondition:
– A ∈ schema(R)
– B ∉ schema(R)
Postcondition:
– schema(Q) = (schema(R) − {A}) ∪ {B}
– Q = { t' | ∃t (t ∈ R ∧ t' = (t − {(A, t[A])}) ∪ {(B, t[A])}) }
Example:
– R = { {(A,1), (C,2)}, {(A,2), (C,2)} }
– Q = ρ_{A→B} R = { {(B,1), (C,2)}, {(B,2), (C,2)} }
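The renaming postcondition translates almost literally into Python. In this sketch the function name `rename` is an assumption; the relation R and the expected result Q are the ones written out on the slide.

```python
def rename(rel, old, new):
    """Rename attribute `old` to `new` in every tuple of rel."""
    renamed = set()
    for t in rel:
        attrs = {a for a, _ in t}
        # Precondition: old must be in schema(R), new must not be.
        if old not in attrs or new in attrs:
            raise ValueError("precondition violated")
        # Replace the pair (old, value) by (new, value); keep all other pairs.
        renamed.add(frozenset((new if a == old else a, v) for a, v in t))
    return renamed

R = {frozenset({("A", 1), ("C", 2)}), frozenset({("A", 2), ("C", 2)})}
Q = rename(R, "A", "B")
print(sorted(sorted(t) for t in Q))
```

Attempting `rename(R, "A", "C")` raises `ValueError`, because C is already in schema(R).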

Discussion #23 10/32 Selection: σ
The selection operation selects the tuples that satisfy a condition.
Examples on R = A B [tuple values mostly not preserved]:
– σ_{A=1} R = A B, containing the tuple (1, 2)
– σ_{B=2} R = A B
– σ_{A=2 ∧ B ? 2} R = A B (the comparison operator between B and 2 is not preserved)
– σ_{A=3} R = A B. Note: empty, but still retain the schema.
Precondition: each attribute mentioned in P must be in schema(R).
Postcondition:
– σ_P R = { t | t ∈ R ∧ P(t) }
– schema(σ_P R) = schema(R)
Meaning: apply predicate P to tuple t by substituting into P appropriate t values.
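A possible Python rendering of selection is shown here. The helper names and the contents of R are assumptions chosen for illustration; the point to observe is that an empty result still carries the schema, as the slide notes.

```python
def relation(attrs, rows):
    """Build (schema, tuples); each tuple is a frozenset of (attribute, value) pairs."""
    return frozenset(attrs), {frozenset(zip(attrs, row)) for row in rows}

def select(rel, pred):
    """Selection: keep the tuples satisfying pred; the schema is unchanged."""
    schema, tuples = rel
    # dict(t) lets the predicate look attributes up by name, e.g. t["A"].
    return schema, {t for t in tuples if pred(dict(t))}

# Illustrative relation (hypothetical values).
R = relation("AB", [(1, 2), (2, 2), (2, 3)])

a_is_1 = select(R, lambda t: t["A"] == 1)
b_is_2 = select(R, lambda t: t["B"] == 2)
a_is_3 = select(R, lambda t: t["A"] == 3)   # empty, but the schema is retained

print(a_is_1)
print(b_is_2)
print(a_is_3)
```

A predicate that mentions an attribute outside schema(R) fails with `KeyError` as soon as it is applied to a tuple, which is a crude stand-in for the precondition.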

Discussion #23 11/32 Projection: π
The projection operation restricts the tuples in a relation to the attributes designated in the operation.
Examples on R = A B and Q = A B C [tuple values mostly not preserved]:
– π_A R = A, with values 1, 2
– π_B R = B, with values 2, 3
– π_BC Q = B C
– π_AB R = R = π_{A,B} R = π_{{A,B}} R
Precondition: X ⊆ schema(R)
Postcondition:
– π_X R = { t' | ∃t (t ∈ R ∧ t' = t[X]) }
– schema(π_X R) = X
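Projection can be sketched the same way. The relation R below is hypothetical; it was picked so that projecting on A gives the values 1, 2 and projecting on B gives 2, 3. Because the result is a set, duplicate restricted tuples collapse automatically.

```python
def relation(attrs, rows):
    """Build (schema, tuples); each tuple is a frozenset of (attribute, value) pairs."""
    return frozenset(attrs), {frozenset(zip(attrs, row)) for row in rows}

def project(rel, attrs):
    """Projection on X: restrict every tuple to the attributes in X."""
    schema, tuples = rel
    attrs = frozenset(attrs)
    if not attrs <= schema:                      # precondition: X is a subset of schema(R)
        raise ValueError("projection attributes must be in the schema")
    return attrs, {frozenset(p for p in t if p[0] in attrs) for t in tuples}

# Illustrative relation (hypothetical values).
R = relation("AB", [(1, 2), (2, 2), (2, 3)])

proj_a = project(R, "A")     # three tuples collapse to two
proj_b = project(R, "B")
proj_ab = project(R, "AB")   # projecting on the whole schema returns R

print(proj_a)
print(proj_b)
```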

Discussion #23 12/32 Cross Product: ×
Standard Cartesian product adapted for relational algebra
R = A B, S = C D, R × S = A B C D [tuple values not preserved]

Discussion #23 13/32 Cross Product (continued…)
R = A B with tuples (1, 2) = t' and (2, 2)
S = C D with a tuple t'' [remaining values not preserved]
Precondition: schema(R) ∩ schema(S) = ∅
Postcondition:
– R × S = { t | ∃t' ∃t'' (t' ∈ R ∧ t'' ∈ S ∧ t = t' ∘ t'') }
– schema(R × S) = schema(R) ∪ schema(S)
Example:
– t' = { (A,1), (B,2) }
– t'' = { (C,3), (D,3) }
– t' ∘ t'' = { (A,1), (B,2), (C,3), (D,3) }

Discussion #23 14/32 Cross Product (continued…)
R = A B with tuples (1, 2) = t' = { (A,1), (B,2) } and (2, 2)
S = C A with tuples (1, 1) = t'' = { (C,1), (A,1) } and t''' = { (C,3), (A,3) }
t' ∘ t'' = { (A,1), (B,2), (C,1), (A,1) }
What if R and S have the same attribute, e.g. A? Can't do cross product.
Solution: rename. R × ρ_{A→A'} S = A B C A'
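The cross product and the rename-first workaround can be tried out with a short Python sketch. Everything named here (`relation`, `rename`, `cross`, and the attribute name `A'`) is an assumption made for illustration. Tuples are combined with set union, which is safe only because the disjoint-schema precondition is checked first.

```python
def relation(attrs, rows):
    """Build (schema, tuples); each tuple is a frozenset of (attribute, value) pairs."""
    return frozenset(attrs), {frozenset(zip(attrs, row)) for row in rows}

def rename(rel, old, new):
    """Rename attribute old to new; old must be in the schema and new must not."""
    schema, tuples = rel
    if old not in schema or new in schema:
        raise ValueError("rename precondition violated")
    new_tuples = {frozenset((new if a == old else a, v) for a, v in t) for t in tuples}
    return (schema - {old}) | {new}, new_tuples

def cross(r, s):
    """Cross product; requires disjoint schemas."""
    (r_schema, r_tuples), (s_schema, s_tuples) = r, s
    if r_schema & s_schema:
        raise ValueError("schemas overlap; rename first")
    return r_schema | s_schema, {a | b for a in r_tuples for b in s_tuples}

R = relation("AB", [(1, 2), (2, 2)])
S = relation("CA", [(1, 1), (3, 3)])   # shares attribute A with R

product = cross(R, rename(S, "A", "A'"))
print(sorted(product[0]), len(product[1]))
```

Calling `cross(R, S)` directly raises `ValueError`, which corresponds to the slide's "can't do cross product" case.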

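Discussion #23 15/32 Natural Join: ⋈
R = A B, S = B C, R ⋈ S = A B C [tuple values not preserved]
R ⋈ S = π_ABC σ_{B=B'} (R × ρ_{B→B'} S)
– Renaming: ρ_{B→B'} S has schema B' C
– Cross Product: R × ρ_{B→B'} S has schema A B B' C
– Selection: σ_{B=B'} keeps the tuples that agree on B and B'
– Projection: π_ABC drops B'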

Discussion #23 16/32 Join (continued …)
In general, we can equate 0, 1, 2, or more attributes using ⋈.
A join is defined as:
– schema(R ⋈ S) = schema(R) ∪ schema(S)
– R ⋈ S = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S }
There are no preconditions, so join always works.
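The join definition can be exercised with a small Python sketch. The helper names and all sample tuples are assumptions. Two tuples are combined exactly when they agree on the common attributes, which is equivalent to requiring t[schema(R)] ∈ R and t[schema(S)] ∈ S for the combined tuple t.

```python
def relation(attrs, rows):
    """Build (schema, tuples); each tuple is a frozenset of (attribute, value) pairs."""
    return frozenset(attrs), {frozenset(zip(attrs, row)) for row in rows}

def restrict(t, attrs):
    """t[X]: keep only the pairs whose attribute is in X."""
    return frozenset(p for p in t if p[0] in attrs)

def natural_join(r, s):
    """Natural join: combine tuples that agree on all common attributes."""
    (r_schema, r_tuples), (s_schema, s_tuples) = r, s
    common = r_schema & s_schema
    joined = {a | b
              for a in r_tuples for b in s_tuples
              if restrict(a, common) == restrict(b, common)}
    return r_schema | s_schema, joined

# Illustrative relations (hypothetical values).
R = relation("AB", [(1, 2), (2, 2)])
S = relation("BC", [(2, 5), (3, 6)])      # one attribute in common with R
T = relation("CD", [(7, 8)])              # no attributes in common with R
U = relation("BA", [(2, 1), (9, 9)])      # same schema as R

one_common = natural_join(R, S)
no_common = natural_join(R, T)            # degenerates to the full cross product
same_schema = natural_join(R, U)          # degenerates to intersection

print(one_common)
print(no_common)
print(same_schema)
```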

Discussion #23 17/32 Join (continued…)
[Tuple values of the example tables not preserved.]
– 0 attributes in common (full cross product): R = A B, S = C D, R ⋈ S = A B C D
– 1 attribute in common: R = A B, S = B C, R ⋈ S = A B C
– 2 attributes in common: R = A B C, S = A B D, R ⋈ S = A B C D

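Discussion #23 18/32 Join (continued…)
We can use renaming to control the ⋈.
R = A B, S = B C [most tuple values not preserved]
Let S' = ρ_{C→A} S = B A.
R ⋈ ρ_{C→A} S = R ⋈ S' = A B, containing the tuple (1, 2)
BTW, observe equivalence with intersection: when both operands have the same schema, ⋈ behaves like ∩.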

Discussion #23 19/32 Relational Algebra Expressions
Relational operators are closed. Thus we can nest expressions:
R = A B, S = B C D [tuple values not preserved]
π_D σ_{C=5} (R ⋈ S), where R ⋈ S = A B C D; the result is D with values 1, 4
Unary operators have precedence over binary operators; binary operators are left associative.
We can now do something very useful: ask and answer with relational algebra (almost) any query we can dream up.

Discussion #23 20/32 Relational Algebra Queries
List the prerequisites for EE200.
  π_Prerequisite σ_{Course='EE200'} cp
  Result:
    Prerequisite
    EE005
    CS100
When does CS101 meet?
  π_{Day,Hour} σ_{Course='CS101'} cdh
  Result:
    Day  Hour
    M    9AM
    W    9AM
    F    9AM
When and where does EE200 meet?
  π_{Day,Hour,Room} σ_{Course='EE200'} (cdh ⋈ cr)
  Result:
    Day  Hour  Room
    Tu   10AM  25 Ohm Hall
    W    1PM   25 Ohm Hall
    Th   10AM  25 Ohm Hall
  – Our answers are in (cdh ⋈ cr).
  – We select Course to be EE200.
  – Then, project on Day, Hour, Room.

Discussion #23 21/32 Queries (continued…)
Where can I find Snoopy at 9 am on Monday?
  π_Room σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap ⋈ csg ⋈ cr ⋈ cdh)
  Result:
    Room
    Turing Aud.
Schemas involved (query constants in parentheses, requested attribute marked *):
– snap: StudentID, Name ('Snoopy'), Address, Phone
– csg: Course, StudentID, Grade
– cr: Course, Room*
– cdh: Course, Day ('M'), Hour ('9AM')
Can we rewrite the query more optimally?
What rules should we use?
– Associativity and commutativity of join
– Distributive laws for select and project
What strategy should we use?
– Eliminate unnecessary operations
– Make joins as small as possible before execution

Discussion #23 22/32 Query Optimization
"Intuitively" we can write
  π_Room σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap ⋈ csg ⋈ cr ⋈ cdh)
as
  π_Room (σ_{Name='Snoopy'} snap ⋈ csg ⋈ cr ⋈ σ_{Day='M' ∧ Hour='9AM'} cdh)
Why does this execute faster?
What laws hold that will let us do this?
– R ⋈ S = S ⋈ R
– σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E
– σ_P (R ⋈ S) = R ⋈ σ_P S (if all the attributes of P are in S)
How do we know they hold?

Discussion #23 23/32 Proofs for Laws
To prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E, we need to prove that two sets are equal.
We prove A = B by showing A ⊆ B ∧ B ⊆ A.
We show that A ⊆ B by showing that x ∈ A ⇒ x ∈ B.
Thus, we can do two proofs to prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E as follows:
First direction:
  1. t ∈ σ_{P1 ∧ P2} E            premise
  2. t ∈ E ∧ (P1 ∧ P2)(t)         def.: σ_P R = { t | t ∈ R ∧ P(t) }
  3. t ∈ E ∧ P1(t) ∧ P2(t)        identical substitutions & operations
  4. t ∈ E ∧ P2(t) ∧ P1(t)        commutative
  5. t ∈ σ_{P2} E ∧ P1(t)         def. of σ
  6. t ∈ σ_{P1} σ_{P2} E          def. of σ
Second direction:
  1. t ∈ σ_{P1} σ_{P2} E          premise
  2. … just go backwards from 6 to 1 in the proof above

Discussion #23 24/32 Alternate Proof
Thus, we can prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E as follows (derive the right-hand side from the left-hand side):
  σ_{P1 ∧ P2} E
  = { t | t ∈ E ∧ (P1 ∧ P2)(t) }       def.: σ_P R = { t | t ∈ R ∧ P(t) }
  = { t | t ∈ E ∧ P1(t) ∧ P2(t) }      identical substitutions & operations
  = { t | t ∈ E ∧ P2(t) ∧ P1(t) }      commutative
  = { t | t ∈ σ_{P2} E ∧ P1(t) }       def. of σ
  = { t | t ∈ σ_{P1} σ_{P2} E }        def. of σ
  = σ_{P1} σ_{P2} E                    def. of a relation

Discussion #23 25/32 Proofs for Laws (continued …)
To prove σ_P (R ⋈ S) = R ⋈ σ_P S, where all attributes of P are in S, we again need to prove that two sets are equal. As before, we can convert the lhs to the rhs.
  σ_P (R ⋈ S)
  = { t | t ∈ σ_P (R ⋈ S) }                                          def. of a relation
  = { t | t ∈ R ⋈ S ∧ P(t) }                                         def.: σ_P R = { t | t ∈ R ∧ P(t) }
  = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t) }               def.: R ⋈ S = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S }
  = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t[schema(S)]) }    all attributes of P are in S
  = { t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ σ_P S }                  def. of σ
  = { t | t ∈ R ⋈ σ_P S }                                            def. of ⋈
  = R ⋈ σ_P S                                                        def. of a relation
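The two selection laws can also be spot-checked on concrete data. The Python sketch below is only a sanity check on one hypothetical instance, not a substitute for the proofs; all helper names and tuple values are assumptions. It also prints the sizes of the join inputs, which hints at why pushing the selection down is cheaper.

```python
def relation(attrs, rows):
    """Build (schema, tuples); each tuple is a frozenset of (attribute, value) pairs."""
    return frozenset(attrs), {frozenset(zip(attrs, row)) for row in rows}

def select(rel, pred):
    """Selection: keep the tuples satisfying pred."""
    schema, tuples = rel
    return schema, {t for t in tuples if pred(dict(t))}

def natural_join(r, s):
    """Natural join: combine tuples that agree on all common attributes."""
    (r_schema, r_tuples), (s_schema, s_tuples) = r, s
    common = r_schema & s_schema
    def on_common(t):
        return frozenset(p for p in t if p[0] in common)
    joined = {a | b for a in r_tuples for b in s_tuples
              if on_common(a) == on_common(b)}
    return r_schema | s_schema, joined

# Illustrative relations (hypothetical values).
R = relation("AB", [(1, 2), (2, 2), (3, 4)])
S = relation("BC", [(2, 5), (2, 6), (4, 5)])

def p(t):                      # P mentions only attribute C, which is in schema(S)
    return t["C"] == 5

def p1(t):
    return t["A"] >= 2

def p2(t):
    return t["B"] == 2

# Law: selection over a join equals the join with the selection pushed into S.
select_after_join = select(natural_join(R, S), p)
select_before_join = natural_join(R, select(S, p))

# Law: a conjunctive selection equals two cascaded selections.
combined = select(R, lambda t: p1(t) and p2(t))
cascaded = select(select(R, p2), p1)

print(select_after_join == select_before_join, combined == cascaded)
print(len(natural_join(R, S)[1]), "joined tuples without pushdown;",
      len(select(S, p)[1]), "tuples of S enter the join with pushdown")
```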

Discussion #23 26/32 SQL Correspondence with Relational Algebra
Assume we have relations R(AB) and S(BC).
SQL:      select A from R where B = 1
Algebra:  π_A σ_{B=1} R
SQL:      select B from R except select B from S
Algebra:  π_B R − π_B S
SQL:      select A, R.B, C from R, S where R.B = S.B
Algebra:  π_{A, R.B, C} σ_{R.B = S.B} (R × S) = R ⋈ S

Discussion #23 27/32 SQL Correspondence with Relational Algebra
Assume we have relations R(AB) and S(BC).
SQL:      select A from R where B = 1
Algebra:  π_A σ_{B=1} R
SQL:      select R.B from R where R.B not in (select S.B from S)
Algebra:  π_B R − π_B S
SQL:      select * from R natural join S
Algebra:  R ⋈ S
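The SQL forms above can be run as written against an in-memory SQLite database through Python's standard `sqlite3` module. The table contents are hypothetical and the `run` helper is an assumption of this sketch; results are sorted only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table R (A integer, B integer);
    create table S (B integer, C integer);
    insert into R values (1, 1), (2, 1), (3, 2);
    insert into S values (1, 10), (3, 30);
""")

def run(sql):
    """Execute a query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())

select_project = run("select A from R where B = 1")
except_form = run("select B from R except select B from S")
not_in_form = run("select R.B from R where R.B not in (select S.B from S)")
product_form = run("select A, R.B, C from R, S where R.B = S.B")
natural_form = run("select * from R natural join S")

print(select_project)
print(except_form, not_in_form)
print(product_form, natural_form)
```

One difference from the algebra is worth keeping in mind: SQL tables are bags rather than sets, so a plain `select` can return duplicate rows, whereas `except` removes them. On this sample data the `except` and `not in` forms happen to return the same rows.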

Discussion #23 28/32 SQL Queries
List the prerequisites for EE200.
  select Prerequisite
  from cp
  where Course='EE200'
  Result:
    Prerequisite
    EE005
    CS100
When does CS101 meet?
  select Day, Hour
  from cdh
  where Course='CS101'
  Result:
    Day  Hour
    M    9AM
    W    9AM
    F    9AM
When and where does EE200 meet?
  select cdh.Course, Day, Hour, Room
  from cdh, cr
  where cdh.Course='EE200'
    and cdh.Course=cr.Course
  Result:
    Course  Day  Hour  Room
    EE200   Tu   10AM  25 Ohm Hall
    EE200   W    1PM   25 Ohm Hall
    EE200   Th   10AM  25 Ohm Hall

Discussion #23 29/32 SQL Queries
List the prerequisites for EE200.
  select Prerequisite
  from cp
  where Course='EE200'
  Result:
    Prerequisite
    EE005
    CS100
When does CS101 meet?
  select Day, Hour
  from cdh
  where Course='CS101'
  Result:
    Day  Hour
    M    9AM
    W    9AM
    F    9AM
When and where does EE200 meet?
  select Course, Day, Hour, Room
  from cdh natural join cr
  where cdh.Course='EE200'
  Result:
    Course  Day  Hour  Room
    EE200   Tu   10AM  25 Ohm Hall
    EE200   W    1PM   25 Ohm Hall
    EE200   Th   10AM  25 Ohm Hall

Discussion #23 30/32 SQL Queries
List all prerequisite courses.
  select Prerequisite
  from cp
  Result (Prerequisite): CS100, EE005, CS100, CS101, CS120, CS101, CS121, CS205
  select distinct Prerequisite
  from cp
  Result (Prerequisite): CS100, CS101, CS120, CS121, CS205, EE005

Discussion #23 31/32 SQL Queries
Where can I find Snoopy at 9 am on Monday?
  select Room
  from snap, csg, cr, cdh
  where Name='Snoopy' and Day='M' and Hour='9AM'
    and snap.StudentID=csg.StudentID
    and csg.Course=cr.Course
    and cr.Course=cdh.Course
  Result:
    Room
    Turing Aud.
List all prereqs of CS750 (including prereqs of prereqs.)
– Not possible with standard SQL without recursive queries (unless nesting depth is known)
– Is possible with Datalog
  Rules: prereqOf(x, y) :- cp(y, x).
         prereqOf(x, y) :- prereqOf(x, z), cp(y, z).
  Query: prereqOf(x, 'CS750')?
To gain more power and flexibility, we typically embed SQL in a high-level language.

Discussion #23 32/32 SQL Queries
List all prereqs of CS750 (including prereqs of prereqs.)
  select cp.Prerequisite
  from cp
  where cp.Course = 'CS750'
  union
  select cp1.Prerequisite
  from cp cp1, cp cp2
  where cp1.Course = cp2.Prerequisite
    and cp2.Course = 'CS750'
  union
  …
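The open-ended chain of unions above can be replaced by a loop that keeps applying the recursive Datalog rule until nothing new is derived. The Python sketch below is one naive way to evaluate the two prereqOf rules; the function name and the contents of `cp` (including the intermediate course names) are hypothetical.

```python
def prereq_of(cp):
    """cp holds (course, prerequisite) pairs.

    Returns all pairs (x, y) such that x is a direct or indirect prerequisite of y.
    """
    # prereqOf(x, y) :- cp(y, x).
    result = {(prereq, course) for course, prereq in cp}
    while True:
        # prereqOf(x, y) :- prereqOf(x, z), cp(y, z).
        derived = {(x, y)
                   for x, z in result
                   for y, needed in cp
                   if needed == z}
        if derived <= result:          # fixpoint reached: nothing new was derived
            return result
        result |= derived

# Hypothetical prerequisite chain, three levels deep, plus an unrelated course.
cp = {
    ("CS750", "CS650"),
    ("CS650", "CS500"),
    ("CS500", "CS101"),
    ("EE200", "EE005"),
}

# Query: prereqOf(x, 'CS750')?
answers = sorted(x for x, y in prereq_of(cp) if y == "CS750")
print(answers)
```

The loop runs as many rounds as the longest prerequisite chain requires, so the nesting depth does not have to be known in advance.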