M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #10 M.P. Johnson Stern School of Business, NYU Spring, 2008
Agenda
  Subqueries, etc.
  Sets, etc.
  Nulls
  Outer joins
Operators on subqueries
Several new operators applied to (unary) selections:
  1. IN R
  2. EXISTS R
  3. UNIQUE R
  4. s > ALL R
  5. s > ANY R
  6. x IN R
(> is just an example op.)
Each expression can be negated with NOT.
Next: ALL op
Employees(name, job, divid, salary)
Find which employees are paid more than all the programmers:
  SELECT name
  FROM Employees
  WHERE salary > ALL (SELECT salary FROM Employees WHERE job='programmer')
ANY/SOME op
Employees(name, job, divid, salary)
Find which employees are paid more than at least one vice president:
  SELECT name
  FROM Employees
  WHERE salary > ANY (SELECT salary FROM Employees WHERE job='VP')
SOME is a synonym for ANY:
  SELECT name
  FROM Employees
  WHERE salary > SOME (SELECT salary FROM Employees WHERE job='VP')
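The ALL and ANY conditions can be tried out in a small sketch. It uses Python's sqlite3 module, and because SQLite does not implement ALL/ANY, the two conditions are rewritten with NOT EXISTS and EXISTS (equivalent here only because no salary is NULL). The sample rows are invented for illustration; only the Employees schema comes from the slides.

```python
import sqlite3

# Hypothetical sample data; only the schema Employees(name, job, divid, salary)
# is taken from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees(name TEXT, job TEXT, divid INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        ("Ann", "programmer", 1, 90),
        ("Bob", "programmer", 2, 110),
        ("Cat", "VP", 1, 100),
        ("Dan", "VP", 2, 150),
        ("Eve", "analyst", 1, 120),
    ],
)

# salary > ALL (programmer salaries): no programmer earns as much or more.
more_than_all = [row[0] for row in conn.execute(
    """SELECT name FROM Employees e
       WHERE NOT EXISTS (SELECT 1 FROM Employees p
                         WHERE p.job = 'programmer' AND p.salary >= e.salary)
       ORDER BY name"""
)]

# salary > ANY (VP salaries): at least one VP earns strictly less.
more_than_any = [row[0] for row in conn.execute(
    """SELECT name FROM Employees e
       WHERE EXISTS (SELECT 1 FROM Employees v
                     WHERE v.job = 'VP' AND v.salary < e.salary)
       ORDER BY name"""
)]

print(more_than_all)
print(more_than_any)
```

On a system that supports the operators directly (Oracle, for example), the original > ALL and > ANY queries should select the same employees from this data.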
Existential/Universal conditions
Employees(name, job, divid, salary)
Division(name, id, head)
Find all divisions with an employee whose salary is > 
Existential: easy!
  SELECT DISTINCT Division.name
  FROM Employees JOIN Division ON divid=id
  WHERE salary > 
Universal conditions
Employees(name, job, divid, salary)
Division(name, id, head)
Find all divisions in which everyone makes > 
Universal: less easy
Universal conditions
Employees(name, job, divid, salary)
Division(name, id, head)
Find all divisions in which everyone makes > 
Idea: find divisions with some poor employee (one who makes <= the amount), and throw them out:
  (SELECT DISTINCT name FROM Division)
  MINUS
  (SELECT DISTINCT Division.name
   FROM Employees JOIN Division ON divid=id
   WHERE salary <= )
Or, universal with IN
1. Find the other divisions, in which someone makes <= :
  SELECT name
  FROM Division
  WHERE id IN (SELECT divid FROM Employees WHERE salary <= )
2. Select the divisions we didn’t find:
  SELECT name
  FROM Division
  WHERE id NOT IN (SELECT divid FROM Employees WHERE salary <= )
Or, universal with ALL
Employees(name, job, divid, salary)
Division(name, id, head)
Find all divisions in which everyone makes > 
Using <= ALL
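A minimal sketch of the "throw out the divisions with a poor employee" idea, run through Python's sqlite3 module. The salary threshold did not survive in the slides, so the value 100 below is purely a placeholder, as are the sample rows. SQLite spells the difference operator EXCEPT (Oracle's MINUS) and does not allow parentheses around the operands.

```python
import sqlite3

THRESHOLD = 100  # hypothetical value; the amount on the slides is missing

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Division(name TEXT, id INTEGER, head TEXT)")
conn.execute("CREATE TABLE Employees(name TEXT, job TEXT, divid INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO Division VALUES (?, ?, ?)",
                 [("Sales", 1, "Cat"), ("R&D", 2, "Dan"), ("Legal", 3, "Flo")])
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)",
                 [("Ann", "clerk", 1, 90), ("Cat", "VP", 1, 150),
                  ("Bob", "programmer", 2, 110), ("Dan", "VP", 2, 150)])

# All divisions, minus those with at least one employee at or below the threshold.
via_except = sorted(row[0] for row in conn.execute(
    """SELECT name FROM Division
       EXCEPT
       SELECT Division.name
       FROM Employees JOIN Division ON divid = id
       WHERE salary <= ?""", (THRESHOLD,)))

# Same result with NOT IN.
via_not_in = sorted(row[0] for row in conn.execute(
    """SELECT name FROM Division
       WHERE id NOT IN (SELECT divid FROM Employees WHERE salary <= ?)""",
    (THRESHOLD,)))

print(via_except, via_not_in)
```

Note that Legal, which has no employees in this sample, qualifies in both versions: a universal condition over an empty set is vacuously true.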
Next: correlated subqueries
Acc(name,bal,type,…)
Q: Who has the largest balance?
Can we do this with subqueries?
Correlated Queries
Acc(name,bal,type,…)
Q: Find holder of largest account
  SELECT name
  FROM Acc
  WHERE bal >= ALL (SELECT bal FROM Acc)
(Later, could use MAX, but still need a subquery here…)
Correlated Queries
So far, subquery executed once; result used for higher query
More complicated: correlated queries
“[T]he subquery… [is] evaluated many times, once for each assignment of a value to some term in the subquery that comes from a tuple variable outside the subquery” (Ullman, p286).
Q: What does this mean?
A: That subqueries refer to vars from outer queries
Correlated Queries
Acc(name,bal,type,…)
Q2: Find holder of largest account of each type
  SELECT name, type
  FROM Acc
  WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=type)
The type=type condition is the intended correlation. As written, however, both occurrences of type resolve to the inner Acc, so the outer reference needs a tuple variable.
Correlated Queries
Acc(name,bal,type,…)
Q2: Find holder of largest account of each type
  SELECT name, type
  FROM Acc a1
  WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=a1.type)
The correlation is type=a1.type.
Note:
  1. scope of variables
  2. this can still be expressed as single SFW
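The scoping point can be checked with a short sqlite3 sketch. SQLite has no ALL operator, so this version swaps in the MAX formulation that the slides mention as a later alternative. The account rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Acc(name TEXT, bal INTEGER, type TEXT)")
conn.executemany("INSERT INTO Acc VALUES (?, ?, ?)",
                 [("Ann", 500, "checking"), ("Bob", 900, "checking"),
                  ("Cat", 300, "savings"), ("Dan", 700, "savings")])

# Correlated: a1.type comes from the outer query, so the subquery is
# (conceptually) re-evaluated for every outer row.
per_type = conn.execute(
    """SELECT name, type FROM Acc a1
       WHERE bal = (SELECT MAX(bal) FROM Acc WHERE type = a1.type)
       ORDER BY type""").fetchall()

# Not correlated: both occurrences of "type" bind to the inner Acc,
# so the condition is always true and we get only the overall maximum.
unqualified = conn.execute(
    """SELECT name, type FROM Acc
       WHERE bal = (SELECT MAX(bal) FROM Acc WHERE type = type)
       ORDER BY type""").fetchall()

print(per_type)
print(unqualified)
```

The second query silently answers a different question, which is why the tuple variable a1 matters.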
New topic: R.A./SQL Set Operators
Relations are sets, so they have set-theoretic ops (think Venn diagrams)
Union: R1 ∪ R2
  Example: ActiveEmployees ∪ RetiredEmployees
Difference: R1 – R2
  Example: AllEmployees – RetiredEmployees = ActiveEmployees
Intersection: R1 ∩ R2
  Example: RetiredEmployees ∩ UnionizedEmployees
Set operations - example
R:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Hamill  456 Oak    M       8/8/88
S:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Ford    345 Palm   M       7/7/77
R ∪ S:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
  Hamill  456 Oak    M       8/8/88
  Ford    345 Palm   M       7/7/77
R ∩ S:
  Name    Address    Gender  Birthdate
  Fisher  123 Maple  F       9/9/99
R - S:
  Name    Address    Gender  Birthdate
  Hamill  456 Oak    M       8/8/88
Set ops in SQL
Orthodox SQL has set operators: UNION, INTERSECT, EXCEPT
Oracle SQL uses MINUS rather than EXCEPT
See the Ullman page for more differences
These ops are applied to queries:
  (SELECT name FROM Person WHERE City = 'New York')
  INTERSECT
  (SELECT custname FROM Purchase WHERE store='Kim''s')
Boat examples
Reserve(ssn,bmodel,color)
Q: Find ssns of sailors who reserved red boats or green boats
  SELECT DISTINCT ssn
  FROM reserve
  WHERE color = 'red' OR color = 'green'
Q: Find ssns of sailors who reserved red boats and green boats
First attempt (returns no rows, since no single row is both red and green):
  SELECT DISTINCT ssn
  FROM reserve
  WHERE color = 'red' AND color = 'green'
With a self-join:
  SELECT DISTINCT r1.ssn
  FROM reserve r1, reserve r2
  WHERE r1.ssn = r2.ssn AND r1.color = 'red' AND r2.color = 'green'
With INTERSECT:
  (SELECT DISTINCT ssn FROM reserve WHERE color = 'red')
  INTERSECT
  (SELECT DISTINCT ssn FROM reserve WHERE color = 'green')
Q: Find ssns of sailors who reserved red boats or green boats (with UNION)
  (SELECT DISTINCT ssn FROM reserve WHERE color = 'red')
  UNION
  (SELECT DISTINCT ssn FROM reserve WHERE color = 'green')
Q: Find ssns of sailors who reserved red boats but not green boats
  (SELECT DISTINCT ssn FROM reserve WHERE color = 'red')
  EXCEPT
  (SELECT DISTINCT ssn FROM reserve WHERE color = 'green')
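The boat queries can be compared side by side in a small sqlite3 sketch. The reservation rows and the helper name ssns are hypothetical. SQLite does not accept parentheses around the operands of a set operator, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reserve(ssn INTEGER, bmodel TEXT, color TEXT)")
conn.executemany("INSERT INTO reserve VALUES (?, ?, ?)",
                 [(1, "A", "red"), (1, "B", "green"), (2, "A", "red"),
                  (3, "B", "green"), (4, "C", "blue")])

def ssns(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

red = "SELECT ssn FROM reserve WHERE color = 'red'"
green = "SELECT ssn FROM reserve WHERE color = 'green'"

naive_and = ssns("SELECT DISTINCT ssn FROM reserve "
                 "WHERE color = 'red' AND color = 'green'")
self_join = ssns("SELECT DISTINCT r1.ssn FROM reserve r1, reserve r2 "
                 "WHERE r1.ssn = r2.ssn AND r1.color = 'red' AND r2.color = 'green'")
red_and_green = ssns(red + " INTERSECT " + green)
red_or_green = ssns(red + " UNION " + green)
red_not_green = ssns(red + " EXCEPT " + green)

print(naive_and, self_join, red_and_green, red_or_green, red_not_green)
```

The set operators already remove duplicates, so DISTINCT is not needed inside their operands.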
Union-Compatibility
Situation: Cust1(name,address,…), Cust2(name,…)
Want: report of all customer names and addresses (if known)
Can’t do:
  (SELECT name, address FROM Cust1)
  UNION
  (SELECT name FROM Cust2)
Both tables must have same sequence of types
Applies to all set ops
But can do:
  (SELECT name, address FROM Cust1)
  UNION
  (SELECT name, '(N/A)' FROM Cust2)
Resulting field names taken from first table: Result(name, address)
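A quick sqlite3 sketch of both cases, with invented customer rows. One caveat: SQLite only checks that the two sides have the same number of columns, not the same types, so it is more permissive than the rule stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cust1(name TEXT, address TEXT)")
conn.execute("CREATE TABLE Cust2(name TEXT)")
conn.execute("INSERT INTO Cust1 VALUES ('Ann', '1 Main St')")
conn.execute("INSERT INTO Cust2 VALUES ('Bob')")

# Two columns on the left, one on the right: rejected.
try:
    conn.execute("SELECT name, address FROM Cust1 UNION SELECT name FROM Cust2")
    mismatch_rejected = False
except sqlite3.OperationalError:
    mismatch_rejected = True

# Padding the shorter side with a constant makes the union legal.
cursor = conn.execute(
    "SELECT name, address FROM Cust1 UNION SELECT name, '(N/A)' FROM Cust2")
column_names = [d[0] for d in cursor.description]  # taken from the first SELECT
report = sorted(cursor.fetchall())

print(mismatch_rejected, column_names, report)
```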
First Unintuitive SQLism
Looking for R ∩ (S ∪ T):
  SELECT R.A
  FROM R, S, T
  WHERE R.A=S.A OR R.A=T.A
But what happens if T is empty?
See the transcript of this in Oracle on sales
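The empty-T case is easy to reproduce with sqlite3 (not the Oracle transcript the slide refers to). The rows are hypothetical, and the IN-based rewrite is just one possible way to get the intended set, not something taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("R", "S", "T"):
    conn.execute(f"CREATE TABLE {table}(A INTEGER)")
conn.executemany("INSERT INTO R VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO S VALUES (?)", [(2,), (3,)])
# T is deliberately left empty.

# The cross product R x S x T is empty when T is empty, so no row
# ever reaches the WHERE clause.
join_version = [row[0] for row in conn.execute(
    "SELECT R.A FROM R, S, T WHERE R.A = S.A OR R.A = T.A")]

# One way to compute R intersect (S union T) that survives an empty T.
set_version = [row[0] for row in conn.execute(
    "SELECT A FROM R WHERE A IN (SELECT A FROM S UNION SELECT A FROM T)")]

print(join_version, set_version)
```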
New topic: Nulls in SQL
If we don’t have a value, can put a NULL
Null can mean several things:
  Value does not exist
  Value exists but is unknown
  Value not applicable
But null is not the same as 0
See Douglas Foster Wallace…
Null Values
  x = NULL  =>  4*(3-x)/7 = NULL
  x = NULL  =>  x + 3 - x = NULL
  x = NULL  =>  3 + (x-x) = NULL
  x = NULL  =>  x = 'Joe' is UNKNOWN
In general: no row whose null fields appear in the selection test will pass the test
  With one exception
Pace Boole, SQL has three boolean values:
  FALSE = 0
  TRUE = 1
  UNKNOWN = 0.5
Null values in boolean expressions
  C1 AND C2 = min(C1, C2)
  C1 OR C2 = max(C1, C2)
  NOT C1 = 1 - C1
  SELECT *
  FROM Person
  WHERE (age < 25) AND (height > 6 OR weight > 190)
E.g. age=20, height=NULL, weight=180:
  height > 6 = UNKNOWN
  UNKNOWN OR weight > 190 = UNKNOWN
  (age < 25) AND UNKNOWN = UNKNOWN
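The min/max/1-minus encoding can be written out directly. This is a plain Python sketch of the numeric model, not of how any DBMS implements it; the function names and3, or3, not3 and compare are made up for illustration, and None stands in for NULL.

```python
FALSE, UNKNOWN, TRUE = 0.0, 0.5, 1.0

def and3(c1, c2):
    return min(c1, c2)

def or3(c1, c2):
    return max(c1, c2)

def not3(c1):
    return 1 - c1

def compare(value, op, bound):
    """Three-valued comparison: any comparison involving NULL is UNKNOWN."""
    if value is None:
        return UNKNOWN
    if op == ">":
        result = value > bound
    elif op == "<":
        result = value < bound
    else:
        raise ValueError(f"unsupported operator: {op}")
    return TRUE if result else FALSE

# The example row: age=20, height=NULL, weight=180.
person = {"age": 20, "height": None, "weight": 180}

height_test = compare(person["height"], ">", 6)    # UNKNOWN
weight_test = compare(person["weight"], ">", 190)  # FALSE
age_test = compare(person["age"], "<", 25)         # TRUE

where = and3(age_test, or3(height_test, weight_test))
selected = (where == TRUE)  # a row is returned only if the test is TRUE

print(where, selected)
```

The row is not selected: the WHERE clause evaluates to UNKNOWN, and only TRUE lets a row through.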
Comparing null and non-nulls
The schema specifies whether null is allowed for each attribute
  NOT NULL to forbid
  Nulls are allowed by default
Unexpected behavior:
  SELECT *
  FROM Person
  WHERE age < 25 OR age >= 25
Some Persons are not included!
The “trichotomy law” does not hold!
Testing for null values
Can test for NULL explicitly:
  x IS NULL
  x IS NOT NULL
But: x = NULL is never true
  SELECT *
  FROM Person
  WHERE age < 25 OR age >= 25 OR age IS NULL
Now it includes all Persons
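A small sqlite3 sketch of how rows with a NULL age drop out of a comparison test, using age < 25 OR age >= 25 as a condition that looks exhaustive but is not. The Person rows are invented, and the schema is reduced to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person(name TEXT, age INTEGER)")
conn.executemany("INSERT INTO Person VALUES (?, ?)",
                 [("Ann", 20), ("Bob", 30), ("Cat", None)])  # None is stored as NULL

def names(where):
    """Return the sorted names of Persons satisfying the given WHERE clause."""
    return sorted(row[0] for row in
                  conn.execute("SELECT name FROM Person WHERE " + where))

looks_exhaustive = names("age < 25 OR age >= 25")
with_null_test = names("age < 25 OR age >= 25 OR age IS NULL")
equals_null = names("age = NULL")
is_null = names("age IS NULL")

print(looks_exhaustive, with_null_test, equals_null, is_null)
```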
Null/logic review
  TRUE AND UNKNOWN = ?
  TRUE OR UNKNOWN = ?
  UNKNOWN OR UNKNOWN = ?
  X = NULL = ?
Next: Outer join
Like inner join except that dangling tuples are included, padded with nulls
Left outerjoin: dangling tuples from left are included
  Nulls appear “on the right”
Right outerjoin: dangling tuples from right are included
  Nulls appear “on the left”
Cross join - example
MovieStar:
  Name    Address       Gender  Birthdate
  Hanks   123 Palm Rd   M       01/01/60
  Taylor  456 Maple Av  F       02/02/40
  Lucas   789 Oak St    M       03/03/55
MovieExec:
  Name       Address       Networth
  Spielberg  246 Palm Rd   10M
  Taylor     456 Maple Av  20M
  Lucas      789 Oak St    30M
Rows matched on name (dangling rows not yet padded):
  Name    Address       G.  Birthdate  Name       Address       Net
  Hanks   123 Palm Rd   M   01/01/60
  Taylor  456 Maple Av  F   02/02/40   Taylor     456 Maple Av  20M
  Lucas   789 Oak St    M   03/03/55   Lucas      789 Oak St    30M
                                       Spielberg  246 Palm Rd   10M
Outer Join - Example
  SELECT *
  FROM MovieStar LEFT OUTER JOIN MovieExec
  ON MovieStar.name=MovieExec.name

  SELECT *
  FROM MovieStar RIGHT OUTER JOIN MovieExec
  ON MovieStar.name=MovieExec.name

  Name    Address       G.    Birthdate  Name       Address       Net
  Hanks   123 Palm Rd   M     01/01/60   Null       Null          Null
  Taylor  456 Maple Av  F     02/02/40   Taylor     456 Maple Av  20M
  Lucas   789 Oak St    M     03/03/55   Lucas      789 Oak St    30M
  Null    Null          Null  Null       Spielberg  246 Palm Rd   10M
The left outer join returns the first three rows (Hanks padded with nulls); the right outer join returns the last three (Spielberg padded with nulls).
Outer Join - Example
MovieStar and MovieExec as before.
  SELECT *
  FROM MovieStar FULL OUTER JOIN MovieExec
  ON MovieStar.name=MovieExec.name

  Name    Address       G.    Birthdate  Name       Address       Net
  Hanks   123 Palm Rd   M     01/01/60   Null       Null          Null
  Taylor  456 Maple Av  F     02/02/40   Taylor     456 Maple Av  20M
  Lucas   789 Oak St    M     03/03/55   Lucas      789 Oak St    30M
  Null    Null          Null  Null       Spielberg  246 Palm Rd   10M
New-style outer joins
Outer joins may be left, right, or full:
  FROM A LEFT [OUTER] JOIN B;
  FROM A RIGHT [OUTER] JOIN B;
  FROM A FULL [OUTER] JOIN B;
OUTER is optional
If OUTER is included, then FULL is the default
Q: How to remember left v. right?
A: It indicates the side whose rows are always included
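The three outer joins on the MovieStar/MovieExec data can be sketched with sqlite3. To stay runnable on older SQLite versions, which lack RIGHT and FULL OUTER JOIN, this sketch uses only LEFT OUTER JOIN: the right join is emulated by swapping the operands, and the full join by appending the dangling MovieExec rows. Those emulations are a workaround chosen here, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MovieStar(name TEXT, address TEXT, gender TEXT, birthdate TEXT)")
conn.execute("CREATE TABLE MovieExec(name TEXT, address TEXT, networth TEXT)")
conn.executemany("INSERT INTO MovieStar VALUES (?, ?, ?, ?)",
                 [("Hanks", "123 Palm Rd", "M", "01/01/60"),
                  ("Taylor", "456 Maple Av", "F", "02/02/40"),
                  ("Lucas", "789 Oak St", "M", "03/03/55")])
conn.executemany("INSERT INTO MovieExec VALUES (?, ?, ?)",
                 [("Spielberg", "246 Palm Rd", "10M"),
                  ("Taylor", "456 Maple Av", "20M"),
                  ("Lucas", "789 Oak St", "30M")])

# Left outer join: every star appears; unmatched stars get NULL (None) on the right.
left = conn.execute(
    """SELECT s.name, e.name
       FROM MovieStar s LEFT OUTER JOIN MovieExec e ON s.name = e.name
       ORDER BY s.name""").fetchall()

# Right outer join, emulated by swapping the operands of a left join.
right = conn.execute(
    """SELECT s.name, e.name
       FROM MovieExec e LEFT OUTER JOIN MovieStar s ON s.name = e.name
       ORDER BY e.name""").fetchall()

# Full outer join, emulated as the left join plus the dangling execs.
full = conn.execute(
    """SELECT s.name, e.name
       FROM MovieStar s LEFT OUTER JOIN MovieExec e ON s.name = e.name
       UNION ALL
       SELECT NULL, e.name
       FROM MovieExec e
       WHERE NOT EXISTS (SELECT 1 FROM MovieStar s WHERE s.name = e.name)""").fetchall()

print(left)
print(right)
print(full)
```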
Review
Examples from sqlzoo.net
  SELECT L
  FROM R1, …, Rn
  WHERE C
corresponds to
  π_L(σ_C(R1 × … × Rn))
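Read literally, the SELECT-FROM-WHERE form is a projection of a selection of a cross product. The following Python sketch spells that out. The function name select_from_where and the sample relations are hypothetical, and a real DBMS would of course not materialize the full cross product.

```python
from itertools import product

def select_from_where(project, relations, condition):
    """Evaluate SELECT L FROM R1, ..., Rn WHERE C as pi_L(sigma_C(R1 x ... x Rn))."""
    result = []
    for combo in product(*relations):        # cross product R1 x ... x Rn
        if condition(*combo):                # selection sigma_C
            result.append(project(*combo))   # projection pi_L
    return result

# Relations as lists of rows, each row a dict from attribute name to value.
R = [{"A": 1}, {"A": 2}]
S = [{"A": 2, "B": "x"}, {"A": 3, "B": "y"}]

# SELECT R.A, S.B FROM R, S WHERE R.A = S.A
rows = select_from_where(
    project=lambda r, s: (r["A"], s["B"]),
    relations=[R, S],
    condition=lambda r, s: r["A"] == s["A"],
)

# With an empty relation in the FROM list, the cross product is empty.
empty = select_from_where(lambda r, s, t: r["A"], [R, S, []], lambda r, s, t: True)

print(rows, empty)
```

The result is a list rather than a set, which mirrors SQL's behavior of keeping duplicates unless DISTINCT is given.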