Using Relational Databases and SQL Department of Computer Science California State University, Los Angeles Lecture 8: Subqueries.



Subqueries are queries within queries, also called inner queries. A query that contains a subquery is called an outer query. A subquery must be surrounded by parentheses.

Subquery Example
Example: -- List the name of each salesperson who does not represent any members.
No subquery:
    SELECT S.FirstName, S.LastName, S.salesID
    FROM salespeople S LEFT JOIN members M USING (salesID)
    WHERE M.memberID IS NULL;
With subquery:
    SELECT FirstName, LastName, salesID
    FROM salespeople
    WHERE salesID NOT IN (SELECT DISTINCT salesID FROM members);
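The two versions of this query can be compared on a tiny data set. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are hypothetical. It also shows one caveat of NOT IN: if the list returned by the subquery contains a NULL, NOT IN matches no rows at all, while the LEFT JOIN version is unaffected.

```python
import sqlite3

# In-memory SQLite database standing in for the course's MySQL database.
# The rows below are hypothetical sample data, not the real course data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salespeople (salesID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE members (memberID INTEGER PRIMARY KEY, salesID INTEGER);
    INSERT INTO salespeople VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
    INSERT INTO members VALUES (10, 1), (11, 1), (12, 3);
""")

JOIN_SQL = """
    SELECT S.FirstName, S.LastName, S.salesID
    FROM salespeople S LEFT JOIN members M USING (salesID)
    WHERE M.memberID IS NULL
"""
SUBQUERY_SQL = """
    SELECT FirstName, LastName, salesID
    FROM salespeople
    WHERE salesID NOT IN (SELECT DISTINCT salesID FROM members)
"""

# Both formulations find the salesperson with no members.
join_rows = conn.execute(JOIN_SQL).fetchall()
subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
print(join_rows, subquery_rows)

# A member with no salesperson puts a NULL into the subquery's list.
conn.execute("INSERT INTO members VALUES (13, NULL)")
join_rows_with_null = conn.execute(JOIN_SQL).fetchall()
subquery_rows_with_null = conn.execute(SUBQUERY_SQL).fetchall()
print(join_rows_with_null, subquery_rows_with_null)
```

If members.salesID can be NULL, adding WHERE salesID IS NOT NULL inside the subquery keeps the NOT IN version safe.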

When to Use Subqueries
Use a subquery when:
- it is impossible or extremely difficult to solve the problem using a single query
- a subquery solution runs faster than an equivalent non-subquery solution (rare with the current version of MySQL)
- a subquery is easier to understand than any alternative solution
- you want to compare against the result of an aggregate function in a WHERE clause; the subquery computes the aggregate separately

Types of Subqueries
- Single-value subqueries: the subquery returns a single value (one column, one row)
- List subqueries: the subquery returns a list (one column, multiple rows)
- Table subqueries: the subquery returns a table (multiple columns and rows)

WHERE Clause Subqueries
Use a subquery in the WHERE clause when you want to filter records from the outer query using a single value or a list of values returned from one or more subqueries. Single-value subqueries and list subqueries are both allowed here.

WHERE Clause Subquery Example
Example #2: -- List all tracks with a runtime greater than the average runtime of all tracks.
This attempt does not work, because the WHERE clause filters rows before any aggregation takes place, so an aggregate function cannot appear in it:
    SELECT TrackTitle, lengthSeconds
    FROM Tracks T
    WHERE lengthSeconds > AVG(T.lengthSeconds);

WHERE Clause Subquery Example
Solution: put the aggregate into a subquery.
    SELECT TrackTitle, lengthSeconds
    FROM Tracks
    WHERE lengthSeconds > (SELECT AVG(lengthSeconds) FROM Tracks);
The aggregate does not filter any tracks out of the outer query results; the comparison operator in the outer query does that. The subquery runs first, computes the aggregate, and supplies a single return value; then the outer query runs.
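A minimal sketch of both the failing and the working form, assuming Python's built-in sqlite3 module as a stand-in for MySQL and three hypothetical tracks. The exact error message differs between database systems, so the code only records that the first query was rejected.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tracks (TrackTitle TEXT, lengthSeconds INTEGER);
    INSERT INTO Tracks VALUES ('Short', 100), ('Medium', 200), ('Long', 300);
""")

# An aggregate placed directly in WHERE is rejected by the database.
try:
    conn.execute(
        "SELECT TrackTitle, lengthSeconds FROM Tracks T "
        "WHERE lengthSeconds > AVG(T.lengthSeconds)"
    )
    aggregate_in_where_failed = False
except sqlite3.Error:
    aggregate_in_where_failed = True

# Moving the aggregate into a single-value subquery works.
longer_than_average = conn.execute("""
    SELECT TrackTitle, lengthSeconds
    FROM Tracks
    WHERE lengthSeconds > (SELECT AVG(lengthSeconds) FROM Tracks)
""").fetchall()
print(aggregate_in_where_failed, longer_than_average)
```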

WHERE Clause Subquery Example Example #1: -- List all titles recorded at MakeTrax or Lone Star Recording. Do not use a join and do not hard-code company IDs.

WHERE Clause Subquery Example
Outer and Inner Queries:
The outer query...
    SELECT Title FROM Titles WHERE StudioID = (X) OR StudioID = (Y);
Inner query X...
    SELECT studioID FROM studios WHERE studioName = 'MakeTrax';
Inner query Y...
    SELECT studioID FROM studios WHERE studioName = 'Lone Star Recording';

WHERE Clause Subquery Example
Solution:
    SELECT Title
    FROM titles
    WHERE studioID = (SELECT studioID FROM studios WHERE studioName = 'MakeTrax')
       OR studioID = (SELECT studioID FROM studios WHERE studioName = 'Lone Star Recording');
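The substitution can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module in place of MySQL and hypothetical studio and title rows; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE studios (studioID INTEGER PRIMARY KEY, studioName TEXT);
    CREATE TABLE titles (Title TEXT, studioID INTEGER);
    INSERT INTO studios VALUES (1, 'MakeTrax'), (2, 'Lone Star Recording'), (3, 'Other Studio');
    INSERT INTO titles VALUES ('First', 1), ('Second', 2), ('Third', 3);
""")

# Each inner query returns exactly one studioID, so "=" is a valid comparison.
titles_at_studios = conn.execute("""
    SELECT Title
    FROM titles
    WHERE studioID = (SELECT studioID FROM studios WHERE studioName = 'MakeTrax')
       OR studioID = (SELECT studioID FROM studios WHERE studioName = 'Lone Star Recording')
    ORDER BY Title
""").fetchall()
print(titles_at_studios)
```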

IN and NOT IN
Use the IN keyword to test whether an expression matches any item in a list (typically returned by a subquery).
Syntax:
    expression IN (list subquery)
    expression NOT IN (list subquery)

IN Example Example: -- List the memberID of each member of the Bullets without using a join

IN Example
Solution: The outer query...
    SELECT R.memberID FROM xrefArtistsMembers R WHERE R.artistID = (X);

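IN Example
Solution: The inner query...
    SELECT artistID FROM Artists WHERE artistName = 'the Bullets';
Substitute to get the solution...
    SELECT MemberID
    FROM XrefArtistsMembers
    WHERE ArtistID = (SELECT ArtistID FROM Artists WHERE ArtistName = 'the Bullets');
Because the inner query returns a single artistID, = works here. IN also works, and it is required whenever the inner query could return more than one row.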
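A short sketch of the same query written with IN, assuming Python's built-in sqlite3 module as a stand-in for MySQL and hypothetical rows. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT);
    CREATE TABLE XrefArtistsMembers (ArtistID INTEGER, MemberID INTEGER);
    INSERT INTO Artists VALUES (1, 'the Bullets'), (2, 'Another Band');
    INSERT INTO XrefArtistsMembers VALUES (1, 10), (1, 11), (2, 12);
""")

# IN accepts a list subquery, so it keeps working if several artists match.
bullets_members = conn.execute("""
    SELECT MemberID
    FROM XrefArtistsMembers
    WHERE ArtistID IN (SELECT ArtistID FROM Artists WHERE ArtistName = 'the Bullets')
    ORDER BY MemberID
""").fetchall()
print(bullets_members)
```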

ALL and ANY
ALL: the condition must hold true for all elements in the list.
    Syntax: expression operator ALL (list subquery)
ANY: the condition must hold true for at least one element in the list.
    Syntax: expression operator ANY (list subquery)

ALL and ANY Examples
Example:
    SELECT lastname, birthday, region
    FROM members m
    WHERE (region = "GA")
       OR (birthday > ALL (SELECT birthday FROM members WHERE region = "GA"))
    ORDER BY birthday;
versus
    SELECT lastname, birthday, region
    FROM members m
    WHERE (region = "GA")
       OR (birthday > ANY (SELECT birthday FROM members WHERE region = "GA"))
    ORDER BY birthday;

ALL and ANY Examples Example: -- List the names of all members whose birthdays are later than those of all members from CA or OH

ALL and ANY Examples
Outer query...
    SELECT LastName, FirstName
    FROM Members
    WHERE Birthday > ALL (X) AND Birthday > ALL (Y);

ALL and ANY Examples
Inner queries:
Inner query X...
    SELECT birthday FROM Members WHERE Region = 'CA';
Inner query Y...
    SELECT birthday FROM Members WHERE Region = 'OH';

ALL and ANY Examples
Substitute to get the solution:
    SELECT LastName, FirstName
    FROM Members
    WHERE birthday > ALL (SELECT birthday FROM Members WHERE Region = 'CA')
      AND birthday > ALL (SELECT birthday FROM Members WHERE Region = 'OH');
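The comparisons above can be checked on sample data, with one substitution that should be stated plainly: this sketch uses Python's built-in sqlite3 module, and SQLite does not implement ALL or ANY. The code therefore swaps in MAX and MIN, which give the same answer as > ALL and > ANY only when the subquery returns at least one row and no NULLs. The member rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; birthdays are ISO-8601 text, which sorts by date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (LastName TEXT, FirstName TEXT, Birthday TEXT, Region TEXT);
    INSERT INTO Members VALUES
        ('Ames', 'Al', '1970-01-01', 'CA'),
        ('Bell', 'Bo', '1980-05-05', 'OH'),
        ('Cole', 'Cy', '1990-09-09', 'GA'),
        ('Dunn', 'Di', '1975-03-03', 'GA');
""")

# "Birthday > ALL (list)" rewritten as "Birthday > (SELECT MAX ...)".
later_than_all_ca_and_oh = conn.execute("""
    SELECT LastName, FirstName
    FROM Members
    WHERE Birthday > (SELECT MAX(Birthday) FROM Members WHERE Region = 'CA')
      AND Birthday > (SELECT MAX(Birthday) FROM Members WHERE Region = 'OH')
    ORDER BY LastName
""").fetchall()

# "Birthday > ANY (list)" rewritten as "Birthday > (SELECT MIN ...)".
later_than_any_ga = conn.execute("""
    SELECT LastName
    FROM Members
    WHERE Birthday > (SELECT MIN(Birthday) FROM Members WHERE Region = 'GA')
    ORDER BY LastName
""").fetchall()
print(later_than_all_ca_and_oh, later_than_any_ga)
```

In MySQL the ALL and ANY forms from the slides can be used directly; the rewrite is only needed for this SQLite-based illustration.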

HAVING Clause Subqueries
As with a WHERE clause, you can use subqueries in a HAVING clause. Think substitution here as well.
Example: -- List the number of members in each region that has more members than California.

HAVING Clause Subqueries
Outer and Inner Queries:
Outer query:
    SELECT Region, COUNT(*) FROM Members GROUP BY Region HAVING COUNT(*) > (X);
Inner query:
    SELECT COUNT(*) FROM Members WHERE Region = 'CA';

HAVING Clause Subqueries
Substitute to get the solution:
    SELECT Region, COUNT(*)
    FROM Members
    GROUP BY Region
    HAVING COUNT(*) > (SELECT COUNT(*) FROM Members WHERE Region = "CA");
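A minimal sketch of the HAVING query, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical set of members. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: CA has 2 members, GA 3, OH 1, TX 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Region TEXT);
    INSERT INTO Members (Region) VALUES
        ('CA'), ('CA'), ('GA'), ('GA'), ('GA'), ('OH'), ('TX'), ('TX'), ('TX');
""")

# The subquery supplies California's member count; HAVING filters the groups.
bigger_than_ca = conn.execute("""
    SELECT Region, COUNT(*)
    FROM Members
    GROUP BY Region
    HAVING COUNT(*) > (SELECT COUNT(*) FROM Members WHERE Region = 'CA')
    ORDER BY Region
""").fetchall()
print(bigger_than_ca)
```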

SELECT Clause Subqueries
A SELECT clause subquery must return a single value (not a list or table).
Examples:
    SELECT (SELECT 1) + (SELECT 2);           -- 3
    SELECT (SELECT COUNT(*) FROM tracks);
    SELECT (SELECT * FROM tracks);            -- ERROR!!!

SELECT Clause Subqueries
SELECT clause subqueries are good for single-value calculations, such as percentages.
Example: -- What percentage of members are male?

SELECT Clause Subqueries Example
-- OUTER QUERY
    SELECT 100*(X)/(Y);
-- INNER QUERY X = number of male accounts
    SELECT COUNT(*) FROM Members WHERE Gender = 'M';
-- INNER QUERY Y = number of total accounts
    SELECT COUNT(*) FROM Members;
-- SOLUTION
    SELECT 100*(SELECT COUNT(*) FROM Members WHERE Gender = 'M')
              /(SELECT COUNT(*) FROM Members) AS "Percent Male";
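The percentage calculation can be sketched as follows, assuming Python's built-in sqlite3 module as a stand-in for MySQL and three hypothetical members. One difference matters here: MySQL's / operator returns a decimal result, while SQLite truncates when both operands are integers, so this sketch multiplies by 100.0 instead of 100.

```python
import sqlite3

# Hypothetical sample data: one male member out of three.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Gender TEXT);
    INSERT INTO Members (Gender) VALUES ('M'), ('F'), ('F');
""")

# Two single-value subqueries combined arithmetically in the SELECT clause.
# 100.0 (not 100) forces floating-point division in SQLite.
percent_male = conn.execute("""
    SELECT 100.0 * (SELECT COUNT(*) FROM Members WHERE Gender = 'M')
                 / (SELECT COUNT(*) FROM Members) AS "Percent Male"
""").fetchone()[0]
print(round(percent_male, 2))
```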

Nested Subqueries
Nested subqueries are subqueries within subqueries. Use the same techniques as before; just go a little further.
Example: -- List the birthdays of all members who belong to artists which have recorded titles that include the word "the". Do not use any joins.

Nested Queries
-- Outer Query
    SELECT birthday FROM members WHERE memberID IN (X);
-- Inner Query (X)
    SELECT memberID FROM xrefArtistsMembers WHERE artistID IN (Y);
-- Inner Query (Y)
    SELECT artistID FROM titles
    WHERE Title LIKE '% the %' OR Title LIKE 'the %' OR Title LIKE '% the';

Nested Queries
Solution:
    SELECT lastname, birthday
    FROM members
    WHERE memberID IN
        (SELECT memberID FROM xrefArtistsMembers
         WHERE artistID IN
            (SELECT artistID FROM titles
             WHERE Title LIKE "% the %" OR Title LIKE "the %" OR Title LIKE "% the"));
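A small sketch of the nested solution, assuming Python's built-in sqlite3 module as a stand-in for MySQL and hypothetical rows. String literals use single quotes here because that is what SQLite expects, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: artists 1 and 3 have titles containing "the".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (memberID INTEGER PRIMARY KEY, lastname TEXT, birthday TEXT);
    CREATE TABLE xrefArtistsMembers (artistID INTEGER, memberID INTEGER);
    CREATE TABLE titles (titleID INTEGER PRIMARY KEY, artistID INTEGER, Title TEXT);
    INSERT INTO members VALUES
        (10, 'Ames', '1970-01-01'), (11, 'Bell', '1980-05-05'), (12, 'Cole', '1990-09-09');
    INSERT INTO xrefArtistsMembers VALUES (1, 10), (2, 11), (3, 12);
    INSERT INTO titles VALUES (1, 1, 'Meet the Band'), (2, 2, 'Solo'), (3, 3, 'the Start');
""")

# Innermost query runs first: titles -> artistIDs -> memberIDs -> members.
members_with_the = conn.execute("""
    SELECT lastname, birthday
    FROM members
    WHERE memberID IN
        (SELECT memberID FROM xrefArtistsMembers
         WHERE artistID IN
            (SELECT artistID FROM titles
             WHERE Title LIKE '% the %' OR Title LIKE 'the %' OR Title LIKE '% the'))
    ORDER BY lastname
""").fetchall()
print(members_with_the)
```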

Nested Subqueries
What does this one do?
    SELECT A.artistName
    FROM artists A
    WHERE A.artistID IN
        (SELECT artistID FROM titles
         WHERE titles.studioID IN
            (SELECT studioID FROM studios P
             WHERE P.salesID IN
                (SELECT salesID FROM salespeople WHERE base > 100)));

Nested Subqueries Answer: Find all artists who have recorded titles at studios which are represented by salespeople whose base salaries are greater than $100

Correlated Subqueries
The previous subqueries have been non-correlated. Non-correlated means 'no dependencies': you can run the inner query separately.

Correlated Subqueries
Correlated subqueries are inner queries that depend on data from the outer query.
- Correlated means 'with dependencies': the result of the inner query depends on values given to it by the outer query.
- Conceptually, a correlated subquery is executed once for each row processed by the outer query.
- Because the inner query refers to columns of the outer query, it cannot be run or debugged on its own as an independent query.

Correlated Subqueries
List the first track of each title with its length in seconds and the total length in seconds of all tracks for that title:
    SELECT TrackTitle, LengthSeconds AS Sec,
        (SELECT SUM(LengthSeconds) FROM Tracks SC WHERE SC.TitleID = T.TitleID) AS TotSec
    FROM Tracks T
    WHERE TrackNum = 1;
The subquery is evaluated once per row of the outer query and returns a single value each time. The WHERE clause in the subquery ties each evaluation to the current row of the outer query.

Correlated Subqueries
You can give the table in the outer query an alias and refer to it inside the subquery, which makes the full query easier to understand.
Find the tracks that are shorter than the mean length of the tracks on the same title:
    SELECT tr.tracktitle, tr.lengthseconds
    FROM tracks tr
    WHERE tr.lengthseconds <
        (SELECT AVG(lengthseconds) FROM tracks WHERE titleID = tr.titleID);
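Both correlated queries can be run on a small data set. This is a sketch only, assuming Python's built-in sqlite3 module as a stand-in for MySQL and five hypothetical tracks on two titles; the ORDER BY clauses are added so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: title 1 totals 600 s (mean 200), title 2 totals 200 s (mean 100).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tracks (TitleID INTEGER, TrackNum INTEGER, TrackTitle TEXT, LengthSeconds INTEGER);
    INSERT INTO Tracks VALUES
        (1, 1, 'Intro', 100), (1, 2, 'Middle', 200), (1, 3, 'Ending', 300),
        (2, 1, 'Open', 50), (2, 2, 'Close', 150);
""")

# Correlated subquery in the SELECT clause: one SUM per outer row.
first_tracks = conn.execute("""
    SELECT TrackTitle, LengthSeconds AS Sec,
        (SELECT SUM(LengthSeconds) FROM Tracks SC WHERE SC.TitleID = T.TitleID) AS TotSec
    FROM Tracks T
    WHERE TrackNum = 1
    ORDER BY T.TitleID
""").fetchall()

# Correlated subquery in the WHERE clause: each track is compared with its own title's mean.
shorter_than_title_mean = conn.execute("""
    SELECT tr.TrackTitle, tr.LengthSeconds
    FROM Tracks tr
    WHERE tr.LengthSeconds <
        (SELECT AVG(LengthSeconds) FROM Tracks WHERE TitleID = tr.TitleID)
    ORDER BY tr.TitleID, tr.TrackNum
""").fetchall()
print(first_tracks, shorter_than_title_mean)
```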

EXISTS with Sub-Queries
EXISTS checks for the existence of data in the sub-query. Data is either there (True) or it isn't (False).

EXISTS with Sub-Queries
List the names of all artists who have recorded at least one title:
    SELECT artistname
    FROM artists A
    WHERE EXISTS (SELECT ArtistID FROM titles T WHERE T.artistID = A.ArtistID);
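A minimal sketch of the EXISTS query, assuming Python's built-in sqlite3 module as a stand-in for MySQL and hypothetical rows in which one artist has no titles. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: artist 2 has not recorded any title.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT);
    CREATE TABLE titles (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER);
    INSERT INTO artists VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO titles VALUES (100, 1), (101, 1), (102, 3);
""")

# EXISTS is true as soon as the correlated sub-query finds one matching title.
artists_with_titles = conn.execute("""
    SELECT ArtistName
    FROM artists A
    WHERE EXISTS (SELECT ArtistID FROM titles T WHERE T.ArtistID = A.ArtistID)
    ORDER BY ArtistName
""").fetchall()
print(artists_with_titles)
```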

More Subqueries
Find all artists which have members from GA.
No subquery:
    SELECT DISTINCT Artistname
    FROM Artists A
        INNER JOIN XRefArtistsMembers X ON A.ArtistID = X.ArtistID
        INNER JOIN Members M ON X.MemberID = M.MemberID
    WHERE M.Region = "GA";

More Subqueries
One subquery, two joins:
    SELECT DISTINCT Artistname
    FROM Artists A
        INNER JOIN XRefArtistsMembers X ON A.ArtistID = X.ArtistID
        INNER JOIN (SELECT MemberID FROM Members WHERE Region = "GA") M
            ON X.MemberID = M.MemberID;
Can you do this with no joins at all?

More Subqueries
Two subqueries:
    SELECT DISTINCT Artistname
    FROM Artists A
    WHERE A.artistID IN
        (SELECT artistID FROM xrefartistsmembers x
         WHERE x.memberID IN
            (SELECT MemberID FROM Members WHERE Region = 'GA'));
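The three formulations (joins only, a join against a table subquery, and nested list subqueries) should return the same artists. The sketch below checks that on hypothetical data, assuming Python's built-in sqlite3 module as a stand-in for MySQL; the ORDER BY is added so the results can be compared directly.

```python
import sqlite3

# Hypothetical sample data: Alpha and Gamma each have at least one member from GA.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, Artistname TEXT);
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Region TEXT);
    CREATE TABLE XRefArtistsMembers (ArtistID INTEGER, MemberID INTEGER);
    INSERT INTO Artists VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO Members VALUES (10, 'GA'), (11, 'GA'), (12, 'CA'), (13, 'OH');
    INSERT INTO XRefArtistsMembers VALUES (1, 10), (1, 11), (2, 12), (3, 13), (3, 10);
""")

joins_only = conn.execute("""
    SELECT DISTINCT Artistname
    FROM Artists A
        INNER JOIN XRefArtistsMembers X ON A.ArtistID = X.ArtistID
        INNER JOIN Members M ON X.MemberID = M.MemberID
    WHERE M.Region = 'GA'
    ORDER BY Artistname
""").fetchall()

join_with_table_subquery = conn.execute("""
    SELECT DISTINCT Artistname
    FROM Artists A
        INNER JOIN XRefArtistsMembers X ON A.ArtistID = X.ArtistID
        INNER JOIN (SELECT MemberID FROM Members WHERE Region = 'GA') M
            ON X.MemberID = M.MemberID
    ORDER BY Artistname
""").fetchall()

subqueries_only = conn.execute("""
    SELECT DISTINCT Artistname
    FROM Artists A
    WHERE A.ArtistID IN
        (SELECT ArtistID FROM XRefArtistsMembers x
         WHERE x.MemberID IN
            (SELECT MemberID FROM Members WHERE Region = 'GA'))
    ORDER BY Artistname
""").fetchall()
print(joins_only, join_with_table_subquery, subqueries_only)
```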

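Subqueries vs Joins
Conceptually, a join constructs a Cartesian product and then filters it, while a subquery selects only the matching records. The subquery version is often faster, although the actual difference depends on how the database optimizes each query.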

EXISTS vs Joins
Why can EXISTS with a sub-query be faster than a join? With an EXISTS sub-query, SQL does not have to perform a full row-by-row join, building the Cartesian product and then tossing out unmatched rows. It simply runs the sub-query for each row of the outer query. It may not even have to run the entire sub-query, since as soon as it finds one matching record it knows that at least some data exists.

How to Solve Subquery Problems
To solve subquery problems, always think substitution:
1. Analyze the question, looking for subqueries within the question.
2. Replace subqueries in the original question with substitution variables such as X, Y, and Z.
3. Write queries for your substitution variables.
4. Write a query that solves the original question using your substitution variables.
5. Replace the substitution variables with your subqueries.

How to Solve Subquery Problems In other words: Try to solve the problem using a single query, and when you get stuck, write a subquery for the part you get stuck on!

Class Exercise List the artistIDs of all artists which have members born before January 2, List each artist that meets the criteria only once. a) use one or more joins, no subqueries b) use a subquery, no joins