Carnegie Mellon Univ., Dept. of Computer Science. Database Applications. C. Faloutsos. Relational model
Overview: history; concepts; formal query languages –relational algebra –rel. tuple calculus –rel. domain calculus
History: before: records, pointers, sets etc.; relational model introduced by E.F. Codd in 1970 (revolutionary!); first systems: System R, Ingres; Turing award in 1981
Concepts: Database: a set of relations (= tables); rows: tuples; columns: attributes; keys: superkey, candidate key, primary key
Example: Database: (figure)
Example, cont’d: Database: rel. schema (attr+domains); tuple; k-th attribute (domain Dk)
Example, cont’d: rel. schema (attr+domains); instance
Example, cont’d: rel. schema (attr+domains); instance. Di: the domain of the i-th attribute (eg., char(10)). Formally: an instance is a subset of (D1 x D2 x … x Dn)
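The formal definition above can be sketched in a few lines of Python. The two toy domains and the sample tuples below are made up for illustration and are not taken from the slides.

```python
from itertools import product

# Hypothetical toy domains (illustration only).
D1 = {"123", "234"}        # domain of the first attribute, e.g. ssn
D2 = {"smith", "jones"}    # domain of the second attribute, e.g. name

# Every tuple the schema allows: the cartesian product D1 x D2.
all_possible = set(product(D1, D2))

# An instance is any subset of that product.
instance = {("123", "smith"), ("234", "jones")}

is_valid_instance = instance <= all_possible
```

Here `is_valid_instance` is true because every tuple of `instance` draws its k-th value from the k-th domain.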
Example, cont’d: superkey (eg., ‘ssn, name’): a set of attributes that determines the record; cand. key (eg., ‘ssn’, or ‘st#’): minimal superkey; primary key: one of the cand. keys
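As a rough illustration of the key definitions, the sketch below tests them against one instance. The helper names and the sample rows are assumptions made for this example. Keys are really constraints on the schema, so a check on a single instance can only refute a key, never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """A superkey none of whose proper subsets is also a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

# Hypothetical sample rows: two students share a name.
STUDENT = [
    {"ssn": "123", "name": "smith"},
    {"ssn": "234", "name": "smith"},
    {"ssn": "345", "name": "jones"},
]
```

On this data, `("ssn", "name")` is a superkey but not minimal, while `("ssn",)` is a candidate key.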
Overview: history; concepts; formal query languages –relational algebra –rel. tuple calculus –rel. domain calculus
Formal query languages: How do we collect information? Eg., find ssn’s of people in 415 (recall: everything is a set!). One solution: rel. algebra, ie., set operators. Q1: Which ones? Q2: What is a minimal set of operators?
Relational operators: set union (∪); set difference (-)
Example: Q: find all students (part or full time). A: PT-STUDENT ∪ FT-STUDENT
Observations: two tables are ‘union compatible’ if they have the same attributes (‘domains’). Q: how about intersection (∩)?
Observations: A: redundant: STUDENT ∩ STAFF = STUDENT - (STUDENT - STAFF)
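The identity above is easy to check with Python's built-in sets. The ssn values are invented for the example.

```python
# Hypothetical ssn values (illustration only).
STUDENT = {"123", "234", "345"}
STAFF = {"234", "456"}

# Intersection written using set difference only.
via_difference = STUDENT - (STUDENT - STAFF)

# Python's built-in intersection, for comparison.
builtin = STUDENT & STAFF
```

Both expressions give the people who are students and staff at once, which is why intersection does not need to be a fundamental operator.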
Relational operators: set union (∪); set difference (-)
Other operators? eg., find all students on ‘Main street’. A: ‘selection’
Other operators? Notice: selection (and the rest of the operators) expect tables and produce tables (-> can be cascaded!!). For selection, in general: σ condition (RELATION)
Selection - examples: Find all ‘Smiths’ on ‘Forbes Ave’. ‘condition’ can be any boolean combination of ‘=’, ‘>’, ‘>=’, ...
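A minimal sketch of selection, assuming a relation is stored as a list of dicts and the condition is passed as a Python predicate. The `select` helper and the sample rows are hypothetical, not part of the lecture.

```python
def select(rows, condition):
    """Selection: keep the rows for which the boolean condition holds."""
    return [row for row in rows if condition(row)]

# Hypothetical sample data.
STUDENT = [
    {"ssn": "123", "name": "smith", "address": "forbes ave"},
    {"ssn": "234", "name": "smith", "address": "main street"},
    {"ssn": "345", "name": "jones", "address": "forbes ave"},
]

# Find all 'Smiths' on 'Forbes Ave': a conjunction of two equality tests.
smiths_on_forbes = select(
    STUDENT,
    lambda r: r["name"] == "smith" and r["address"] == "forbes ave",
)
```

The result is again a list of rows, so it can be fed to another operator.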
Relational operators: selection; set union R ∪ S; set difference R - S
Relational operators: selection picks rows - how about columns? A: ‘projection’ - eg., find all the ‘ssn’ values, removing duplicates
Relational operators: cascading: ‘find ssn of students on ‘forbes ave’’
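Projection and cascading can be sketched the same way. The helpers and data below are assumptions for illustration. Note that projection has to remove duplicates, because the result is a set.

```python
def select(rows, condition):
    """Selection: keep the rows for which the condition holds."""
    return [row for row in rows if condition(row)]

def project(rows, attrs):
    """Projection: keep only attrs, dropping duplicate result rows."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# Hypothetical sample data.
STUDENT = [
    {"ssn": "123", "name": "smith", "address": "forbes ave"},
    {"ssn": "234", "name": "smith", "address": "main street"},
    {"ssn": "345", "name": "jones", "address": "forbes ave"},
]

# Cascade: the output of select is the input of project.
ssn_on_forbes = project(
    select(STUDENT, lambda r: r["address"] == "forbes ave"),
    ["ssn"],
)

# Duplicates disappear: three students, but only two distinct addresses.
addresses = project(STUDENT, ["address"])
```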
Relational operators: selection; projection; set union R ∪ S; set difference R - S
Relational operators: Are we done yet? Q: Give a query we cannot answer yet!
Relational operators: A: any query across two or more tables, eg., ‘find names of students in ’. Q: what extra operator do we need? A: surprisingly, cartesian product is enough!
Cartesian product: eg., dog-breeding: MALE x FEMALE gives all possible couples
So what? Eg., how do we find names of students taking 415?
Cartesian product: A: (figure)
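One way to answer the 415 question with the operators seen so far is to pair every STUDENT row with every TAKES row, keep the pairs that match, and project the name. The sketch below assumes the STUDENT(ssn, name, address) and TAKES(ssn, cid, grade) schemas with made-up rows. The `cartesian` helper and its name prefixes are hypothetical.

```python
from itertools import product

def cartesian(r, r_name, s, s_name):
    """R x S, prefixing attribute names so that both ssn columns survive."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

# Hypothetical sample data.
STUDENT = [
    {"ssn": "123", "name": "smith", "address": "forbes ave"},
    {"ssn": "234", "name": "jones", "address": "main street"},
]
TAKES = [
    {"ssn": "123", "cid": "415", "grade": "A"},
    {"ssn": "234", "cid": "413", "grade": "B"},
]

# Step 1: all possible (student, enrollment) pairs.
pairs = cartesian(STUDENT, "STUDENT", TAKES, "TAKES")

# Step 2: selection keeps pairs about the same person and course 415.
matching = [
    p for p in pairs
    if p["STUDENT.ssn"] == p["TAKES.ssn"] and p["TAKES.cid"] == "415"
]

# Step 3: projection on name, without duplicates.
names = sorted({p["STUDENT.name"] for p in matching})
```

Most pairs produced in step 1 are meaningless combinations. The selection in step 2 is what turns the product into a useful answer.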
FUNDAMENTAL relational operators: selection; projection; cartesian product MALE x FEMALE; set union R ∪ S; set difference R - S
Relational ops: Surprisingly, they are enough to help us answer almost any query we want!! Derived operators, for convenience: –set intersection –join (theta join, equi-join, natural join) –‘rename’ operator –division
Joins: Equijoin:
Joins: Equijoin; theta-joins: generalization of equi-join to any condition
Joins: very popular: natural join: R ⋈ S; like equi-join, but it drops duplicate columns: STUDENT(ssn, name, address), TAKES(ssn, cid, grade)
Joins: nat. join has 5 attributes; equi-join: 6
Natural joins - nit-picking: if there are no attributes in common between R, S: nat. join -> cartesian product
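The attribute counts and the no-common-attributes case can be checked with a small sketch. The two join helpers, the sample rows, and the extra `ROOM` relation are assumptions made for this illustration.

```python
def natural_join(r, s):
    """Join rows that agree on every shared attribute; shared columns appear once."""
    result = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                result.append({**a, **b})
    return result

def equi_join(r, r_name, s, s_name, attr):
    """Join on equality of attr, keeping both copies under prefixed names."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a in r for b in s
        if a[attr] == b[attr]
    ]

# Hypothetical sample data.
STUDENT = [
    {"ssn": "123", "name": "smith", "address": "forbes ave"},
    {"ssn": "234", "name": "jones", "address": "main street"},
]
TAKES = [
    {"ssn": "123", "cid": "415", "grade": "A"},
    {"ssn": "234", "cid": "413", "grade": "B"},
]
ROOM = [{"room": "a"}, {"room": "b"}]  # shares no attribute with STUDENT

nat = natural_join(STUDENT, TAKES)                           # 5 attributes per row
equi = equi_join(STUDENT, "STUDENT", TAKES, "TAKES", "ssn")  # 6 attributes per row
no_common = natural_join(STUDENT, ROOM)                      # every pair qualifies
```

With no shared attribute the match condition is vacuously true, so `no_common` contains every pairing, ie., the cartesian product.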
Overview - rel. algebra: fundamental operators; derived operators –joins etc –rename –division; examples
rename op.: Q: why? A: shorthand; self-joins; … For example, find the grand-parents of ‘Tom’, given PC(parent-id, child-id)
rename op.: PC(parent-id, child-id)
rename op.: first, WRONG attempt: (why? how many columns?); second WRONG attempt:
rename op.: we clearly need two different names for the same table - hence, the ‘rename’ op.
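A sketch of the grand-parent query with an explicit rename step follows. The `rename` helper, the new attribute names, and the family data are hypothetical. The idea is to give each copy of PC its own column names, so that the child of one copy can be matched with the parent of the other. A plain natural join of PC with itself would match on both columns and simply return PC again, with only 2 columns.

```python
def rename(rows, mapping):
    """Return a copy of the relation with attributes renamed per mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]

# Hypothetical sample data for PC(parent-id, child-id).
PC = [
    {"parent-id": "ann", "child-id": "bob"},
    {"parent-id": "bob", "child-id": "tom"},
    {"parent-id": "cat", "child-id": "tom"},
    {"parent-id": "dan", "child-id": "cat"},
]

# Two differently named views of the same table.
upper = rename(PC, {"parent-id": "grandparent-id", "child-id": "middle-id"})
lower = rename(PC, {"parent-id": "middle-id"})

# Join the two copies on middle-id, select child 'tom', project the grandparent.
grandparents = sorted({
    u["grandparent-id"]
    for u in upper
    for l in lower
    if u["middle-id"] == l["middle-id"] and l["child-id"] == "tom"
})
```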
Overview - rel. algebra: fundamental operators; derived operators –joins etc –rename –division; examples
Division: Rarely used, but powerful. Example: find suspicious suppliers, ie., suppliers that supplied all the parts in A_BOMB
Division: Observations: ~reverse of cartesian product. It can be derived from the 5 fundamental operators (!!). How?
Division: Answer:
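The formula on the original slide did not survive extraction. The standard textbook derivation is given here in generic notation, assuming a relation R with attribute sets X and Y and a relation S with attributes Y only. The symbols R, S, X, Y are chosen for this sketch. In the supplier example, X would be the supplier and Y the part.

```latex
% Division from projection, cartesian product and set difference.
% Assumed notation: R(X, Y), S(Y).
\[
  R \div S \;=\; \pi_X(R) \;-\; \pi_X\bigl( (\pi_X(R) \times S) - R \bigr)
\]
% \pi_X(R) \times S : every (x, y) pair that would have to be present
% (\ldots) - R      : the required pairs that are missing from R
% \pi_X(\ldots)     : the x values that miss at least one y (disqualified)
% \pi_X(R) - \ldots : the x values paired with every y in S
```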
Overview - rel. algebra: fundamental operators; derived operators –joins etc –rename –division; examples
Sample schema: find names of students that take
Examples: find names of students that take
Sample schema: find course names of ‘smith’
Examples: find course names of ‘smith’
Examples: find ssn of ‘overworked’ students, ie., that take 412, 413, 415
Examples: find ssn of ‘overworked’ students, ie., that take 412, 413, 415. Almost correct answer: the intersection (∩) of three selections, one per course (c-name=412, c-name=413, c-name=415)
Examples: find ssn of ‘overworked’ students, ie., that take 412, 413, 415. Correct answer: the intersection (∩) of the ssn projections of the three selections (c-name=412, c-name=413, c-name=415)
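The difference between the two answers shows up on a small example. The sketch below assumes a TAKES relation stored as (ssn, c-name, grade) tuples with invented rows. The helper names are hypothetical.

```python
# Hypothetical TAKES tuples: (ssn, c-name, grade).
TAKES = {
    ("123", "412", "A"), ("123", "413", "B"), ("123", "415", "A"),
    ("234", "412", "C"), ("234", "415", "B"),
}

def takers(course):
    """Selection: the full TAKES tuples for one course."""
    return {t for t in TAKES if t[1] == course}

def ssn_of(rows):
    """Projection on ssn."""
    return {t[0] for t in rows}

# Almost correct: intersecting full tuples. A tuple cannot carry
# three different course names at once, so this is always empty.
almost = takers("412") & takers("413") & takers("415")

# Correct: project on ssn first, then intersect.
correct = ssn_of(takers("412")) & ssn_of(takers("413")) & ssn_of(takers("415"))
```

Only ssn 123 takes all three courses. Student 234 is missing 413.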
Examples: find ssn of students that work at least as hard as ssn=123 (ie., they take all the courses of ssn=123, and maybe more)
Sample schema: (figure)
Examples: find ssn of students that work at least as hard as ssn=123 (ie., they take all the courses of ssn=123, and maybe more)
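This is a division query: divide TAKES, projected on (ssn, course), by the set of courses of ssn=123. A minimal sketch using only projection, cartesian product and set difference follows. It assumes made-up (ssn, course) pairs and is not necessarily the expression shown on the original slide.

```python
# Hypothetical TAKES pairs, already projected on (ssn, course).
TAKES = {
    ("123", "412"), ("123", "413"),
    ("234", "412"), ("234", "413"), ("234", "415"),
    ("345", "412"),
}

# Divisor: the courses of ssn=123 (selection, then projection).
courses_123 = {c for (s, c) in TAKES if s == "123"}

# Projection on ssn: every candidate student.
all_ssn = {s for (s, _) in TAKES}

# Cartesian product: every (ssn, course) pair that would have to exist.
required = {(s, c) for s in all_ssn for c in courses_123}

# Required pairs missing from TAKES, and the students they disqualify.
missing = required - TAKES
disqualified = {s for (s, _) in missing}

# Students who take all the courses of 123, and maybe more.
answer = all_ssn - disqualified
```

Student 234 qualifies despite taking an extra course. Student 345 is disqualified for missing 413. Student 123 trivially qualifies.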
Conclusions: Relational model: only tables (‘relations’); relational algebra: powerful, minimal: 5 operators can handle almost any query! Most non-trivial op.: join