January 19th – Subqueries 2 and relational algebra

CSE 344: January 19th – Subqueries 2 and relational algebra

Assorted Minutiae
Winter storm Inga.
Online quiz out after class. Still due Wednesday; it will be shorter but covers material through today’s lecture.
For SQLite submissions, please use AS when aliasing.

Today’s Lecture
Review of ordering.
Subqueries in FROM and WHERE.
Intro to relational algebra (maybe).

Semantics of SQL With Group-By
    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY a1,…,ak
    HAVING C2
Order of evaluation (FWGHOS): From, Where, Group by, Having, Order by, Select.
Evaluation steps:
1. Evaluate FROM-WHERE using nested loop semantics.
2. Group by the attributes a1,…,ak.
3. Apply condition C2 to each group (C2 may contain aggregates).
4. Compute the aggregates in S and return the result.
CSE 344 - 2017au
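The evaluation order above can be traced on a small example. This is only a sketch: it uses Python's standard sqlite3 module, and the Product rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'c001'),
        ('Gadget', 999.99, 'c001'),
        ('Widget', 5.00, 'c002'),
        ('Doohickey', 250.00, 'c003'),
        ('Thingamajig', 300.00, 'c003');
""")

# FROM/WHERE: keep rows with price > 10 (Widget is dropped here).
# GROUP BY:   one group per cid.
# HAVING:     keep only groups with at least two rows.
# SELECT:     compute count(*) and max(price) for each surviving group.
rows = conn.execute("""
    SELECT cid, count(*) AS n, max(price) AS top
    FROM Product
    WHERE price > 10
    GROUP BY cid
    HAVING count(*) >= 2
    ORDER BY cid
""").fetchall()
print(rows)
```

Note that Widget is removed by WHERE before any grouping happens, so c002 never forms a group for HAVING to inspect.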

Subqueries
A subquery is a SQL query nested inside a larger query. Such inner-outer queries are called nested queries.
A subquery may occur in: a SELECT clause, a FROM clause, or a WHERE clause.
Rule of thumb: avoid nested queries when possible. But sometimes it is impossible, as we will see.
Notes: Subqueries can return a single constant, and this constant can be compared with another value in a WHERE clause. Subqueries can return relations that can be used in various ways in WHERE clauses. Subqueries can appear in FROM clauses, followed by a tuple variable that represents the tuples in the result of the subquery. You will understand why the last is true in a moment.

Subqueries…
Can return a single value to be included in a SELECT clause.
Can return a relation to be included in the FROM clause, aliased using a tuple variable.
Can return a single value to be compared with another value in a WHERE clause.
Can return a relation to be used in the WHERE or HAVING clause under an existential quantifier.

1. Subqueries in SELECT
Product (pname, price, cid)
Company (cid, cname, city)
For each product, return the city where it is manufactured, using a “correlated subquery”:
    SELECT X.pname, (SELECT Y.city FROM Company Y WHERE Y.cid=X.cid) AS City
    FROM Product X;
What happens if the subquery returns more than one city? We get a runtime error (SQLite, however, simply ignores the extra values).

1. Subqueries in SELECT
Whenever possible, don’t use nested queries:
    SELECT X.pname, (SELECT Y.city FROM Company Y WHERE Y.cid=X.cid) AS City
    FROM Product X
=
    SELECT X.pname, Y.city
    FROM Product X, Company Y
    WHERE X.cid=Y.cid
We have “unnested” the query.
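A minimal sketch comparing the correlated subquery with its unnested form, using Python's sqlite3 module. The second company and the extra products are invented for illustration, and the two results agree here because every product's cid matches exactly one company.

```python
import sqlite3

# Hypothetical sample rows; only Gizmo/Sunworks/Bonn appear in the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (cid TEXT PRIMARY KEY, cname TEXT, city TEXT);
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    INSERT INTO Company VALUES ('c001', 'Sunworks', 'Bonn'),
                               ('c002', 'DB Inc.', 'Lyon');
    INSERT INTO Product VALUES ('Gizmo', 19.99, 'c001'),
                               ('Gadget', 999.99, 'c001'),
                               ('Widget', 5.00, 'c002');
""")

# Correlated subquery: the inner query is re-evaluated for each product X.
nested = conn.execute("""
    SELECT X.pname,
           (SELECT Y.city FROM Company AS Y WHERE Y.cid = X.cid) AS City
    FROM Product AS X
    ORDER BY X.pname
""").fetchall()

# Unnested form: a plain join on cid.
unnested = conn.execute("""
    SELECT X.pname, Y.city
    FROM Product AS X, Company AS Y
    WHERE X.cid = Y.cid
    ORDER BY X.pname
""").fetchall()

print(nested)
print(nested == unnested)
```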

1. Subqueries in SELECT
Compute the number of products made by each company:
    SELECT DISTINCT C.cname, (SELECT count(*) FROM Product P WHERE P.cid=C.cid)
    FROM Company C
Better: we can unnest using a GROUP BY:
    SELECT C.cname, count(*)
    FROM Company C, Product P
    WHERE C.cid=P.cid
    GROUP BY C.cname

1. Subqueries in SELECT
But are these really equivalent? No! They give different results if a company has no products: the nested query reports that company with a count of 0, while the join drops it from the answer. A left outer join keeps such companies:
    SELECT C.cname, count(pname)
    FROM Company C LEFT OUTER JOIN Product P ON C.cid=P.cid
    GROUP BY C.cname
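The difference between the three formulations shows up as soon as one company has no products. The sketch below uses Python's sqlite3 module; the company 'Emptyco' and the Gadget row are hypothetical, added only to trigger that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (cid TEXT PRIMARY KEY, cname TEXT, city TEXT);
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    -- 'Emptyco' is a made-up company with no products.
    INSERT INTO Company VALUES ('c001', 'Sunworks', 'Bonn'),
                               ('c002', 'Emptyco', 'Lyon');
    INSERT INTO Product VALUES ('Gizmo', 19.99, 'c001'),
                               ('Gadget', 999.99, 'c001');
""")

nested = conn.execute("""
    SELECT DISTINCT C.cname,
           (SELECT count(*) FROM Product AS P WHERE P.cid = C.cid)
    FROM Company AS C
    ORDER BY C.cname
""").fetchall()

inner_join = conn.execute("""
    SELECT C.cname, count(*)
    FROM Company AS C, Product AS P
    WHERE C.cid = P.cid
    GROUP BY C.cname
    ORDER BY C.cname
""").fetchall()

left_join = conn.execute("""
    SELECT C.cname, count(P.pname)
    FROM Company AS C LEFT OUTER JOIN Product AS P ON C.cid = P.cid
    GROUP BY C.cname
    ORDER BY C.cname
""").fetchall()

print(nested)      # includes Emptyco with count 0
print(inner_join)  # Emptyco is missing
print(left_join)   # matches the nested query
```

count(P.pname) rather than count(*) matters in the outer join: the padded row for Emptyco has a NULL pname, which count(pname) skips.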

2. Subqueries in FROM
Find all products whose price is > 20 and < 500:
    SELECT X.pname
    FROM (SELECT * FROM Product AS Y WHERE price > 20) AS X
    WHERE X.price < 500
Side note: this is not a correlated subquery. (Why?)
Try to unnest this query!

2. Subqueries in FROM
Sometimes we need to compute an intermediate table only to use it later in a SELECT-FROM-WHERE.
Option 1: use a subquery in the FROM clause.
Option 2: use the WITH clause.

2. Subqueries in FROM
    SELECT X.pname
    FROM (SELECT * FROM Product AS Y WHERE price > 20) AS X
    WHERE X.price < 500
=
    WITH myTable AS (SELECT * FROM Product AS Y WHERE price > 20)
    SELECT X.pname
    FROM myTable AS X
    WHERE X.price < 500
Here myTable is the name we gave to the result of the subquery.
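As a rough check, the FROM-subquery form, the WITH form, and one possible unnested form can be run side by side with Python's sqlite3 module. The product rows are invented for illustration, and the flat query is just one way to answer the "try to unnest" exercise.

```python
import sqlite3

# Hypothetical sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    INSERT INTO Product VALUES ('Gizmo', 19.99, 'c001'),
                               ('Gadget', 999.99, 'c001'),
                               ('Sprocket', 100.00, 'c002'),
                               ('Doohickey', 250.00, 'c002');
""")

from_subquery = conn.execute("""
    SELECT X.pname
    FROM (SELECT * FROM Product AS Y WHERE price > 20) AS X
    WHERE X.price < 500
    ORDER BY X.pname
""").fetchall()

with_clause = conn.execute("""
    WITH myTable AS (SELECT * FROM Product AS Y WHERE price > 20)
    SELECT X.pname
    FROM myTable AS X
    WHERE X.price < 500
    ORDER BY X.pname
""").fetchall()

# Unnested: both conditions go into a single WHERE clause.
flat = conn.execute("""
    SELECT pname FROM Product
    WHERE price > 20 AND price < 500
    ORDER BY pname
""").fetchall()

print(from_subquery, with_clause, flat)
```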

3. Subqueries in WHERE
Find all companies that make some products with price < 200.
Existential quantifiers, using EXISTS:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE EXISTS (SELECT * FROM Product P WHERE C.cid = P.cid AND P.price < 200)
Conditions involving relations: EXISTS R is true if and only if R is not empty.

3. Subqueries in WHERE
Find all companies that make some products with price < 200.
Existential quantifiers, using IN:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE C.cid IN (SELECT P.cid FROM Product P WHERE P.price < 200)

3. Subqueries in WHERE
Find all companies that make some products with price < 200.
Existential quantifiers, using ANY:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE 200 > ANY (SELECT price FROM Product P WHERE P.cid = C.cid)
s > ANY R is true iff s is greater than at least one value in R.
SQLite does not support ANY and ALL!

3. Subqueries in WHERE
Find all companies that make some products with price < 200.
Existential quantifiers. Now let’s unnest it:
    SELECT DISTINCT C.cname
    FROM Company C, Product P
    WHERE C.cid = P.cid AND P.price < 200
You can easily unnest existential quantifiers. Existential quantifiers are easy!
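The existential formulations above can be compared in SQLite with a short sketch using Python's sqlite3 module. The data is invented for illustration. Since SQLite lacks ANY, the last query uses a min(...) rewrite of `200 > ANY (...)`; that rewrite is an assumption of this sketch and does not come from the slides.

```python
import sqlite3

# Hypothetical sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (cid TEXT PRIMARY KEY, cname TEXT, city TEXT);
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    INSERT INTO Company VALUES ('c001', 'Sunworks', 'Bonn'),
                               ('c002', 'DB Inc.', 'Lyon');
    INSERT INTO Product VALUES ('Gizmo', 19.99, 'c001'),
                               ('Gadget', 999.99, 'c001'),
                               ('Doohickey', 250.00, 'c002');
""")

queries = {
    "exists": """
        SELECT DISTINCT C.cname FROM Company AS C
        WHERE EXISTS (SELECT * FROM Product AS P
                      WHERE C.cid = P.cid AND P.price < 200)
        ORDER BY C.cname""",
    "in": """
        SELECT DISTINCT C.cname FROM Company AS C
        WHERE C.cid IN (SELECT P.cid FROM Product AS P WHERE P.price < 200)
        ORDER BY C.cname""",
    "unnested": """
        SELECT DISTINCT C.cname FROM Company AS C, Product AS P
        WHERE C.cid = P.cid AND P.price < 200
        ORDER BY C.cname""",
    # Stand-in for "200 > ANY (...)": some price is below 200 exactly when
    # the smallest price is below 200 (an assumption of this sketch).
    "any_via_min": """
        SELECT DISTINCT C.cname FROM Company AS C
        WHERE 200 > (SELECT min(P.price) FROM Product AS P WHERE P.cid = C.cid)
        ORDER BY C.cname""",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```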

3. Subqueries in WHERE
Find all companies s.t. all their products have price < 200.
Same as: find all companies that make only products with price < 200.
Universal quantifiers are hard!

3. Subqueries in WHERE
Find all companies s.t. all their products have price < 200.
1. Find the other companies, those that make some product with price ≥ 200:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE C.cid IN (SELECT P.cid FROM Product P WHERE P.price >= 200)
2. Find all companies s.t. all their products have price < 200:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE C.cid NOT IN (SELECT P.cid FROM Product P WHERE P.price >= 200)

3. Subqueries in WHERE
Find all companies s.t. all their products have price < 200.
Universal quantifiers, using EXISTS:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE NOT EXISTS (SELECT * FROM Product P WHERE P.cid = C.cid AND P.price >= 200)

3. Subqueries in WHERE
Find all companies s.t. all their products have price < 200.
Universal quantifiers, using ALL:
    SELECT DISTINCT C.cname
    FROM Company C
    WHERE 200 > ALL (SELECT price FROM Product P WHERE P.cid = C.cid)
Not supported in SQLite.
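Since ALL is unavailable in SQLite, the NOT IN and NOT EXISTS formulations are the ones that can actually be run there. A minimal sketch with Python's sqlite3 module follows; the rows are invented, and 'Emptyco' (no products) is included to show that a company with no products satisfies the universal condition vacuously.

```python
import sqlite3

# Hypothetical sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (cid TEXT PRIMARY KEY, cname TEXT, city TEXT);
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    INSERT INTO Company VALUES ('c001', 'Sunworks', 'Bonn'),
                               ('c002', 'DB Inc.', 'Lyon'),
                               ('c003', 'Emptyco', 'Oslo');
    INSERT INTO Product VALUES ('Gizmo', 19.99, 'c001'),
                               ('Gadget', 999.99, 'c001'),
                               ('Widget', 5.00, 'c002');
""")

# "All products cost < 200" == "there is no product costing >= 200".
not_in = conn.execute("""
    SELECT DISTINCT C.cname FROM Company AS C
    WHERE C.cid NOT IN (SELECT P.cid FROM Product AS P WHERE P.price >= 200)
    ORDER BY C.cname
""").fetchall()

not_exists = conn.execute("""
    SELECT DISTINCT C.cname FROM Company AS C
    WHERE NOT EXISTS (SELECT * FROM Product AS P
                      WHERE P.cid = C.cid AND P.price >= 200)
    ORDER BY C.cname
""").fetchall()

print(not_in)
print(not_exists)
```

Sunworks is excluded because of Gadget; DB Inc. and Emptyco remain.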

Question for Database Theory Fans and their Friends
Can we unnest the universal quantifier query? We first need to discuss the concept of monotonicity.

Monotone Queries
Definition: a query Q is monotone if, whenever we add tuples to one or more input tables, the answer to the query does not lose any of its tuples.

Monotone Queries
Theorem: If Q is a SELECT-FROM-WHERE query that has no subqueries and no aggregates, then it is monotone.
Proof. We use the nested loop semantics: if we insert a tuple in a relation Ri, this will not remove any tuples from the answer.
    SELECT a1, a2, …, ak
    FROM R1 AS x1, R2 AS x2, …, Rn AS xn
    WHERE Conditions
is evaluated as
    for x1 in R1 do
      for x2 in R2 do
        …
          for xn in Rn do
            if Conditions then output (a1,…,ak)
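The nested-loop argument can be mimicked directly in Python. This is only an illustrative sketch: the helper `select_from_where`, the sample tuples, and the use of set semantics (duplicates ignored) are assumptions of the example, not part of the slides.

```python
from itertools import product as cartesian_product


def select_from_where(relations, condition, project):
    """Evaluate a subquery-free SELECT-FROM-WHERE by nested loops."""
    answer = set()
    # cartesian_product plays the role of "for x1 in R1, ..., for xn in Rn".
    for combo in cartesian_product(*relations):
        if condition(*combo):
            answer.add(project(*combo))
    return answer


# Hypothetical tuples: Product(pname, price, cid), Company(cid, cname, city).
companies = [("c001", "Sunworks", "Bonn"), ("c002", "DB Inc.", "Lyon")]
products_before = [("Gizmo", 19.99, "c001")]
products_after = products_before + [("Widget", 5.00, "c002")]


def makes_cheap_product(p, c):
    # WHERE P.cid = C.cid AND P.price < 200
    return p[2] == c[0] and p[1] < 200


def cname(p, c):
    # SELECT C.cname
    return (c[1],)


before = select_from_where([products_before, companies], makes_cheap_product, cname)
after = select_from_where([products_after, companies], makes_cheap_product, cname)

print(before, after)
print(before <= after)  # every old answer tuple survives the insertion
```

Adding a tuple only adds iterations to the loops; iterations that already produced output are unaffected, which is why the old answer is contained in the new one.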

Monotone Queries
Product (pname, price, cid)
Company (cid, cname, city)
The query "Find all companies s.t. all their products have price < 200" is not monotone.

Before:
    Product:  pname   price   cid
              Gizmo   19.99   c001
    Company:  cid     cname     city
              c001    Sunworks  Bonn
    Answer:   cname
              Sunworks

After adding a tuple to Product:
    Product:  pname   price   cid
              Gizmo   19.99   c001
              Gadget  999.99
    Company:  cid     cname     city
              c001    Sunworks  Bonn
    Answer:   cname
              (empty)

Once the Gadget tuple is added, Sunworks drops out of the answer.
Consequence: if a query is not monotone, then we cannot write it as a SELECT-FROM-WHERE query without nested subqueries.
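The loss of an answer tuple can be reproduced with Python's sqlite3 module. In this sketch the inserted Gadget row is assumed to belong to company c001; that cid is an assumption made for the example, chosen so that the answer becomes empty as on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (cid TEXT PRIMARY KEY, cname TEXT, city TEXT);
    CREATE TABLE Product (pname TEXT, price REAL, cid TEXT);
    INSERT INTO Company VALUES ('c001', 'Sunworks', 'Bonn');
    INSERT INTO Product VALUES ('Gizmo', 19.99, 'c001');
""")

UNIVERSAL = """
    SELECT DISTINCT C.cname FROM Company AS C
    WHERE NOT EXISTS (SELECT * FROM Product AS P
                      WHERE P.cid = C.cid AND P.price >= 200)
"""

before = conn.execute(UNIVERSAL).fetchall()

# Assumption: Gadget is made by c001 (the cid is not given explicitly).
conn.execute("INSERT INTO Product VALUES ('Gadget', 999.99, 'c001')")
after = conn.execute(UNIVERSAL).fetchall()

print(before)  # Sunworks qualifies
print(after)   # adding a tuple removed Sunworks, so the query is not monotone
```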

Queries that must be nested
Queries with universal quantifiers or with negation.
Queries that use aggregates in certain ways: sum(..) and count(*) are NOT monotone, because they do not satisfy set containment. For example, select count(*) from R is not monotone!
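A tiny sketch of why count(*) fails set containment, again with Python's sqlite3 module; the table R and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a INTEGER)")
conn.execute("INSERT INTO R VALUES (10)")

before = conn.execute("SELECT count(*) FROM R").fetchall()

conn.execute("INSERT INTO R VALUES (20)")
after = conn.execute("SELECT count(*) FROM R").fetchall()

# The old answer tuple (1,) is replaced by (2,), so the old answer
# is not contained in the new one.
print(before, after, set(before) <= set(after))
```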