CMU SCS Carnegie Mellon Univ. School of Computer Science /615 - DB Applications C. Faloutsos & A. Pavlo Lecture #4: Relational Algebra
CMU SCS Faloutsos - PavloCMU SCS /615#2 Overview history concepts Formal query languages –relational algebra –rel. tuple calculus –rel. domain calculus
CMU SCS Faloutsos - PavloCMU SCS /615#3 History before: records, pointers, sets etc introduced by E.F. Codd in 1970 revolutionary! first systems: (System R; Ingres) Turing award in 1981
CMU SCS Faloutsos - PavloCMU SCS /615#4 Concepts - reminder Database: a set of relations (= tables) rows: tuples columns: attributes (or keys) superkey, candidate key, primary key
CMU SCS Faloutsos - PavloCMU SCS /615#5 Example Database:
CMU SCS Faloutsos - PavloCMU SCS /615#6 Example: cont’d Database: rel. schema (attr+domains) tuple k-th attribute (Dk domain)
CMU SCS Faloutsos - PavloCMU SCS /615#7 Example: cont’d rel. schema (attr+domains) instance
CMU SCS Faloutsos - PavloCMU SCS /615#8 Example: cont’d rel. schema (attr+domains) instance Di: the domain of the i-th attribute (eg., char(10)
CMU SCS Faloutsos - PavloCMU SCS /615#9 Overview history concepts Formal query languages –relational algebra –rel. tuple calculus –rel. domain calculus
CMU SCS Faloutsos - PavloCMU SCS /615#10 Formal query languages How do we collect information? Eg., find ssn’s of people in 415 (recall: everything is a set!) One solution: Rel. algebra, ie., set operators Q1: Which ones?? Q2: what is a minimal set of operators?
CMU SCS Faloutsos - PavloCMU SCS /615#11. set union U set difference ‘-’ Relational operators
CMU SCS Faloutsos - PavloCMU SCS /615#12 Example: Q: find all students (part or full time) A: PT-STUDENT union FT-STUDENT
CMU SCS Faloutsos - PavloCMU SCS /615#13 Observations: two tables are ‘union compatible’ if they have the same attributes (‘domains’) Q: how about intersection U
CMU SCS Faloutsos - PavloCMU SCS /615#14 Observations: A: redundant: STUDENT intersection STAFF = STUDENT STAFF
CMU SCS Faloutsos - PavloCMU SCS /615#15 Observations: A: redundant: STUDENT intersection STAFF = STUDENT STAFF
CMU SCS Faloutsos - PavloCMU SCS /615#16 Observations: A: redundant: STUDENT intersection STAFF = STUDENT - (STUDENT - STAFF) STUDENT STAFF
CMU SCS Faloutsos - PavloCMU SCS /615#17 Observations: A: redundant: STUDENT intersection STAFF = STUDENT - (STUDENT - STAFF) Double negation: We’ll see it again, later…
CMU SCS Faloutsos - PavloCMU SCS /615#18. set union set difference ‘-’ Relational operators U
CMU SCS Faloutsos - PavloCMU SCS /615#19 Other operators? eg, find all students on ‘Main street’ A: ‘selection’
CMU SCS Faloutsos - PavloCMU SCS /615#20 Other operators? Notice: selection (and rest of operators) expect tables, and produce tables (-> can be cascaded!!) For selection, in general:
CMU SCS Faloutsos - PavloCMU SCS /615#21 Selection - examples Find all ‘Smiths’ on ‘Forbes Ave’ ‘condition’ can be any boolean combination of ‘=‘, ‘>’, ‘>=‘,...
CMU SCS Faloutsos - PavloCMU SCS /615#22 selection. set union set difference R - S Relational operators R U S
CMU SCS Faloutsos - PavloCMU SCS /615#23 selection picks rows - how about columns? A: ‘projection’ - eg.: finds all the ‘ssn’ - removing duplicates Relational operators
CMU SCS Faloutsos - PavloCMU SCS /615#24 Cascading: ‘find ssn of students on ‘forbes ave’ Relational operators
CMU SCS Faloutsos - PavloCMU SCS /615#25 selection projection. set union set difference R - S Relational operators R U S
CMU SCS Faloutsos - PavloCMU SCS /615#26 Are we done yet? Q: Give a query we can not answer yet! Relational operators
CMU SCS Faloutsos - PavloCMU SCS /615#27 A: any query across two or more tables, eg., ‘find names of students in ’ Q: what extra operator do we need?? Relational operators
CMU SCS Faloutsos - PavloCMU SCS /615#28 A: any query across two or more tables, eg., ‘find names of students in ’ Q: what extra operator do we need?? A: surprisingly, cartesian product is enough! Relational operators
CMU SCS Faloutsos - PavloCMU SCS /615#29 Cartesian product eg., dog-breeding: MALE x FEMALE gives all possible couples x =
CMU SCS Faloutsos - PavloCMU SCS /615#30 so what? Eg., how do we find names of students taking 415?
CMU SCS Faloutsos - PavloCMU SCS /615#31 Cartesian product A:
CMU SCS Faloutsos - PavloCMU SCS /615#32 Cartesian product
CMU SCS Faloutsos - PavloCMU SCS /615#33
CMU SCS Faloutsos - PavloCMU SCS /615#34 selection projection cartesian product MALE x FEMALE set union set difference R - S FUNDAMENTAL Relational operators R U S
CMU SCS Faloutsos - PavloCMU SCS /615#35 Relational ops Surprisingly, they are enough, to help us answer almost any query we want!! derived/convenience operators: –set intersection – join (theta join, equi-join, natural join) –‘rename’ operator –division
CMU SCS Faloutsos - PavloCMU SCS /615#36 Joins Equijoin:
CMU SCS Faloutsos - PavloCMU SCS /615#37 Cartesian product A:
CMU SCS Faloutsos - PavloCMU SCS /615#38 Joins Equijoin: theta-joins: generalization of equi-join - any condition
CMU SCS Faloutsos - PavloCMU SCS /615#39 Joins very popular: natural join: R S like equi-join, but it drops duplicate columns: STUDENT (ssn, name, address) TAKES (ssn, cid, grade)
CMU SCS Faloutsos - PavloCMU SCS /615#40 Joins nat. join has 5 attributes equi-join: 6
CMU SCS Faloutsos - PavloCMU SCS /615#41 Natural Joins - nit-picking if no attributes in common between R, S: nat. join -> cartesian product
CMU SCS Faloutsos - PavloCMU SCS /615#42 Overview - rel. algebra fundamental operators derived operators –joins etc –rename –division examples
CMU SCS Faloutsos - PavloCMU SCS /615#43 Rename op. Q: why? A: shorthand; self-joins; … for example, find the grand-parents of ‘Tom’, given PC (parent-id, child-id)
CMU SCS Faloutsos - PavloCMU SCS /615#44 Rename op. PC (parent-id, child-id)
CMU SCS Faloutsos - PavloCMU SCS /615#45 Rename op. first, WRONG attempt: (why? how many columns?) Second WRONG attempt:
CMU SCS Faloutsos - PavloCMU SCS /615#46 Rename op. we clearly need two different names for the same table - hence, the ‘rename’ op.
CMU SCS Faloutsos - PavloCMU SCS /615#47 Overview - rel. algebra fundamental operators derived operators –joins etc –rename –division examples
CMU SCS Faloutsos - PavloCMU SCS /615#48 Division Rarely used, but powerful. Example: find suspicious suppliers, ie., suppliers that supplied all the parts in A_BOMB
CMU SCS Faloutsos - PavloCMU SCS /615#49 Division
CMU SCS Faloutsos - PavloCMU SCS /615#50 Division Observations: ~reverse of cartesian product It can be derived from the 5 fundamental operators (!!) How?
CMU SCS Faloutsos - PavloCMU SCS /615#51 Division Answer: Observation: find ‘good’ suppliers, and subtract! (double negation)
CMU SCS Faloutsos - PavloCMU SCS /615#52 Division Answer: Observation: find ‘good’ suppliers, and subtract! (double negation)
CMU SCS Faloutsos - PavloCMU SCS /615#53 Division Answer: All suppliers All bad parts
CMU SCS Faloutsos - PavloCMU SCS /615#54 Division Answer: all possible suspicious shipments
CMU SCS Faloutsos - PavloCMU SCS /615#55 Division Answer: all possible suspicious shipments that didn’t happen
CMU SCS Faloutsos - PavloCMU SCS /615#56 Division Answer: all suppliers who missed at least one suspicious shipment, i.e.: ‘good’ suppliers
CMU SCS Faloutsos - PavloCMU SCS /615#57 Overview - rel. algebra fundamental operators derived operators –joins etc –rename –division examples
CMU SCS Faloutsos - PavloCMU SCS /615#58 Sample schema find names of students that take
CMU SCS Faloutsos - PavloCMU SCS /615#59 Examples find names of students that take
CMU SCS Faloutsos - PavloCMU SCS /615#60 Examples find names of students that take
CMU SCS Faloutsos - PavloCMU SCS /615#61 Sample schema find course names of ‘smith’
CMU SCS Faloutsos - PavloCMU SCS /615#62 Examples find course names of ‘smith’
CMU SCS Faloutsos - PavloCMU SCS /615#63 Examples find ssn of ‘overworked’ students, ie., that take 412, 413, 415
CMU SCS Faloutsos - PavloCMU SCS /615#64 Examples find ssn of ‘overworked’ students, ie., that take 412, 413, 415: almost correct answer:
CMU SCS Faloutsos - PavloCMU SCS /615#65 Examples find ssn of ‘overworked’ students, ie., that take 412, 413, Correct answer: c-name=413 c-name=415
CMU SCS Faloutsos - PavloCMU SCS /615#66 Examples find ssn of students that work at least as hard as ssn=123, ie., they take all the courses of ssn=123, and maybe more
CMU SCS Faloutsos - PavloCMU SCS /615#67 Sample schema
CMU SCS Faloutsos - PavloCMU SCS /615#68 Examples find ssn of students that work at least as hard as ssn=123 (ie., they take all the courses of ssn=123, and maybe more
CMU SCS Faloutsos - PavloCMU SCS /615#69 Conclusions Relational model: only tables (‘relations’) relational algebra: powerful, minimal: 5 operators can handle almost any query!