1
Using Relational Databases and SQL Steven Emory Department of Computer Science California State University, Los Angeles Lecture 6: Set Functions
2
Miscellany Midterm Questions? Too easy? Too hard?
3
Topics for Today Aggregate Functions (Pages 49 – 54) GROUP BY Clause (Pages 54 – 55) HAVING Clause (Pages 55 – 58)
4
Set Functions The SQL standard divides set functions into two categories: Aggregate functions (in book, covered) Supported by MySQL Important for data analysis Window functions (not in book, not covered) Not supported by MySQL before version 8.0 Similar to aggregate functions
5
Aggregate Functions Aggregate/Non-aggregate similarities Both take some kind of input Both perform operations on the input Both have a single output. Aggregate/Non-aggregate differences Input to an aggregate function is a set of data Input to a non-aggregate function is a single item
6
Examples Function Example: SELECT UPPER(Title) AS Title FROM Movies; Aggregate Example: SELECT COUNT(Title) AS TitleCount FROM Movies; Notice that in queries involving aggregates... the table in the FROM clause gets collapsed down to a single row
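The difference in row counts can be checked with a minimal sketch. It assumes a hypothetical three-row Movies table and uses Python's built-in sqlite3 module as a stand-in for MySQL; the titles are invented for illustration.

```python
import sqlite3

# Hypothetical three-row Movies table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Title TEXT)")
conn.executemany(
    "INSERT INTO Movies (Title) VALUES (?)",
    [("Alien",), ("Brazil",), ("Casablanca",)],
)

# Non-aggregate function: one output row per input row.
scalar_rows = conn.execute("SELECT UPPER(Title) AS Title FROM Movies").fetchall()

# Aggregate function: the whole table collapses down to a single row.
aggregate_rows = conn.execute(
    "SELECT COUNT(Title) AS TitleCount FROM Movies"
).fetchall()

print(len(scalar_rows), "rows from UPPER:", scalar_rows)
print(len(aggregate_rows), "row from COUNT:", aggregate_rows)
```

UPPER returns as many rows as the table holds, while COUNT returns exactly one.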
7
Aggregate Functions Five Main Aggregate Functions COUNT(*), COUNT(expression) AVG(expression) MIN(expression), MAX(expression) SUM(expression)
8
Aggregate Function Usage You may use aggregate functions directly in the following clauses: SELECT clause HAVING clause ORDER BY clause
9
Aggregate Arguments All aggregate functions take only a single argument, an expression that can be any expression (except another aggregate function), including a CASE expression Example: SELECT SUM(Runtime BETWEEN 60 AND 120) FROM Movies; Which is the same as… SELECT COUNT(MovieID) FROM Movies WHERE Runtime BETWEEN 60 AND 120;
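A quick way to convince yourself that the two queries agree is to run both against the same data. The sketch below assumes a hypothetical Movies table with five invented runtimes and uses sqlite3 in place of MySQL; both engines evaluate BETWEEN to 1 or 0, which is what makes the SUM trick work.

```python
import sqlite3

# Hypothetical runtimes in minutes; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Runtime INTEGER)")
conn.executemany(
    "INSERT INTO Movies (Runtime) VALUES (?)",
    [(45,), (90,), (120,), (150,), (100,)],
)

# The comparison yields 1 (true) or 0 (false) per row, and SUM adds them up.
sum_count = conn.execute(
    "SELECT SUM(Runtime BETWEEN 60 AND 120) FROM Movies"
).fetchone()[0]

# The equivalent form: filter first, then count the surviving rows.
where_count = conn.execute(
    "SELECT COUNT(MovieID) FROM Movies WHERE Runtime BETWEEN 60 AND 120"
).fetchone()[0]

print(sum_count, where_count)  # 3 3
```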
10
COUNT COUNT(*) Counts the number of rows in a table Includes rows that contain NULLs -- This query returns 6. SELECT COUNT(*) AS 'Number of Movies' FROM Movies; COUNT(expression) Counts the rows where expression is not NULL (excludes NULLs) -- This query also returns 6, because MovieID is never NULL. SELECT COUNT(MovieID) AS 'Number of Movies' FROM Movies;
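How the two forms of COUNT treat NULLs is easy to test directly. This is a minimal sketch assuming a hypothetical Movies table in which one Runtime is missing, run through sqlite3 rather than MySQL: COUNT(*) counts every row, while COUNT(expression) skips rows where the expression is NULL.

```python
import sqlite3

# Hypothetical data: three movies, one with an unknown (NULL) runtime.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Runtime INTEGER)")
conn.executemany(
    "INSERT INTO Movies (Runtime) VALUES (?)",
    [(95,), (None,), (130,)],
)

count_star, count_id, count_runtime = conn.execute(
    "SELECT COUNT(*), COUNT(MovieID), COUNT(Runtime) FROM Movies"
).fetchone()

# COUNT(*) and COUNT(MovieID) agree because a primary key is never NULL;
# COUNT(Runtime) is smaller because the NULL runtime is not counted.
print(count_star, count_id, count_runtime)  # 3 3 2
```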
11
AVG AVG(expression) Averages all data under expression Excludes NULLs (doesn't count NULL as 0). -- Averages all movie runtimes. SELECT AVG(Runtime) AS 'AvgRuntime' FROM Movies;
12
MIN and MAX MIN(expression) Returns the minimum value under expression -- Returns the minimum runtime. SELECT MIN(Runtime) AS 'Shortest Movie' FROM Movies; MAX(expression) Returns the maximum value under expression -- Returns the maximum runtime. SELECT MAX(Runtime) AS 'Longest Movie' FROM Movies;
13
SUM SUM(expression) Sums all the data under expression Excludes NULLs (doesn't count NULL as 0). -- Sums all of the runtimes. SELECT SUM(Runtime) AS 'Total Length' FROM Movies;
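The NULL handling of AVG and SUM can be confirmed with a small experiment. The sketch assumes a hypothetical Movies table with runtimes 100, 200, and NULL, and uses sqlite3 as a stand-in for MySQL. If NULL were treated as 0, the average would be 100 rather than 150.

```python
import sqlite3

# Hypothetical data: two known runtimes and one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Runtime INTEGER)")
conn.executemany(
    "INSERT INTO Movies (Runtime) VALUES (?)",
    [(100,), (200,), (None,)],
)

avg_runtime, total_runtime, shortest, longest = conn.execute(
    "SELECT AVG(Runtime), SUM(Runtime), MIN(Runtime), MAX(Runtime) FROM Movies"
).fetchone()

# The NULL row is skipped entirely: AVG divides by 2, not by 3.
print(avg_runtime, total_runtime, shortest, longest)  # 150.0 300 100 200
```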
14
More Aggregate Functions The SQL 2003 standard also defines a number of statistical aggregate functions that are also supported by MySQL STDDEV_POP, STDDEV_SAMP VAR_POP, VAR_SAMP More MySQL-specific ones are listed in the MySQL reference manual.
15
Excluding Data From Aggregation To exclude items from being aggregated, you may use the WHERE clause. Example: Count the number of male members. SELECT COUNT(*) FROM Members WHERE Gender = 'M'; Example: Count the number of female members. SELECT COUNT(*) FROM Members WHERE Gender = 'F';
16
Aggregates and the WHERE Clause The WHERE clause filters data from being aggregated (it filters the table in the FROM clause) In other words, the WHERE clause filters data BEFORE it is aggregated (counted, summed, etc.) Therefore, you MAY NOT use aggregate functions in the WHERE clause! -- This is an ERROR!!! SELECT Title FROM Movies WHERE Runtime > AVG(Runtime);
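To see the engine reject the query, here is a minimal sketch using sqlite3 as a stand-in for MySQL, with a hypothetical two-row Movies table. The exact error message differs between database engines, so the code only checks that an error is raised.

```python
import sqlite3

# Hypothetical Movies table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Title TEXT, Runtime INTEGER)"
)
conn.executemany(
    "INSERT INTO Movies (Title, Runtime) VALUES (?, ?)",
    [("Alien", 117), ("Brazil", 132)],
)

error_raised = False
try:
    # WHERE runs before aggregation, so AVG has nothing to work with yet.
    conn.execute("SELECT Title FROM Movies WHERE Runtime > AVG(Runtime)")
except sqlite3.OperationalError as exc:
    error_raised = True
    print("Rejected:", exc)
```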
17
How Aggregate Queries Work In other words, think of any aggregate query as a two-step process Process #1: Filter things you don’t want to aggregate Process #2: Aggregate (COUNT, AVG, SUM, etc.) And because the WHERE clause operates in the first process, you CANNOT use aggregate functions in the WHERE clause
18
Example Compute the average runtime for only those movies that are shorter than two hours. SELECT AVG(Runtime) AS 'Average Runtime of Short Movies' FROM Movies WHERE Runtime < 120;
19
Aggregating in Groups Can we calculate this with a single query? Yes, but we need a way to group data together Solution: Use the GROUP BY clause
20
Grouping Data You can make an aggregate function return multiple values per table by grouping the table -- Count the number of male and female members. SELECT Gender, COUNT(*) FROM Members GROUP BY Gender; Because there are two gender groups, male and female, the COUNT function will return two values, one for each gender group
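The grouped count can be tried on a small data set. The sketch below assumes a hypothetical Members table with five invented rows and runs the query through sqlite3 instead of MySQL; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical Members table with three male and two female members.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Gender TEXT)")
conn.executemany(
    "INSERT INTO Members (Gender) VALUES (?)",
    [("M",), ("F",), ("M",), ("M",), ("F",)],
)

# One output row per gender group, each with its own count.
gender_counts = conn.execute(
    "SELECT Gender, COUNT(*) FROM Members GROUP BY Gender ORDER BY Gender"
).fetchall()

print(gender_counts)  # [('F', 2), ('M', 3)]
```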
21
GROUP BY and DISTINCT Notice that GROUP BY and DISTINCT behave somewhat similarly, in that GROUP BY divides data into DISTINCT groups SELECT Gender FROM Members GROUP BY Gender; SELECT DISTINCT Gender FROM Members; However, even though they return the same results, using GROUP BY as a DISTINCT replacement is considered a misuse of GROUP BY
22
GROUP BY and DISTINCT GROUP BY works in conjunction with aggregate functions, DISTINCT does not GROUP BY affects how data is aggregated, DISTINCT has no effect at all In other words, if you are using GROUP BY, you had better have an aggregate function somewhere! I will take off points if I see you use GROUP BY as a DISTINCT replacement!
23
How GROUP BY Works GROUP BY begins by sorting the table based on the grouping attributes (in our case, Gender) If any aggregate functions are present, GROUP BY causes each aggregate to be applied per-group rather than per-table GROUP BY then condenses the table so that each group only appears once in the table (if listed) and displays any aggregated values along with it
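The sort, aggregate, and condense steps can be imitated in plain Python. This is only a conceptual sketch over hypothetical (Name, Gender) rows; it models the description above and is not a claim about how any particular database engine implements GROUP BY internally.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (Name, Gender) rows standing in for the Members table.
members = [("Ann", "F"), ("Bob", "M"), ("Cho", "F"), ("Dev", "M"), ("Eli", "M")]

# Step 1: sort on the grouping attribute so equal values sit next to each other.
sorted_members = sorted(members, key=itemgetter(1))

# Steps 2 and 3: apply the aggregate (a count) per group and
# condense each group into a single output row.
result = [
    (gender, sum(1 for _ in rows))
    for gender, rows in groupby(sorted_members, key=itemgetter(1))
]

print(result)  # [('F', 2), ('M', 3)]
```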
24
Grouping on Multiple Fields GROUP BY can use multiple fieldnames (similar to how you can sort using multiple fieldnames) SELECT Genre, MPAA, COUNT(*) FROM Movies JOIN XRefGenresMovies USING(MovieID) GROUP BY Genre, MPAA; Notice that the more fields you group by, the more results you get!
25
GROUP BY and Primary Keys Let’s say you want to display each movie along with the number of genres associated with that movie We could write SELECT Title, COUNT(Genre) FROM Movies JOIN XRefGenresMovies USING(MovieID) GROUP BY Title; However, if there are multiple movies with the same title, we will get the wrong answer!
26
GROUP BY and Primary Keys The solution is to GROUP BY the primary key, MovieID, followed by Title SELECT Title, COUNT(Genre) FROM Movies JOIN XRefGenresMovies USING(MovieID) GROUP BY MovieID, Title; We now have a query that will always work, even if there are multiple movie titles with the same name
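The effect of duplicate titles shows up clearly on a tiny data set. The sketch assumes hypothetical contents for Movies and XRefGenresMovies, including two different movies that share the title "Crash", and runs both groupings through sqlite3 rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Title TEXT)")
conn.execute("CREATE TABLE XRefGenresMovies (MovieID INTEGER, Genre TEXT)")

# Hypothetical data: movies 1 and 2 are different films with the same title.
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?)",
    [(1, "Crash"), (2, "Crash"), (3, "Heat")],
)
conn.executemany(
    "INSERT INTO XRefGenresMovies VALUES (?, ?)",
    [(1, "Drama"), (1, "Thriller"), (2, "Drama"),
     (3, "Crime"), (3, "Drama"), (3, "Thriller")],
)

# Grouping by Title alone merges the two "Crash" movies into one group.
by_title = conn.execute(
    "SELECT Title, COUNT(Genre) FROM Movies JOIN XRefGenresMovies USING(MovieID) "
    "GROUP BY Title ORDER BY Title"
).fetchall()

# Grouping by the primary key first keeps them separate.
by_key = conn.execute(
    "SELECT Title, COUNT(Genre) FROM Movies JOIN XRefGenresMovies USING(MovieID) "
    "GROUP BY MovieID, Title ORDER BY MovieID"
).fetchall()

print(by_title)  # [('Crash', 3), ('Heat', 3)]
print(by_key)    # [('Crash', 2), ('Crash', 1), ('Heat', 3)]
```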
27
Filtering Aggregated Results Using the previous example, once we have our aggregated result table, is it possible to filter out certain groups, say where COUNT(*) = 1?
28
The HAVING Clause Yes, but we must have a way of filtering results AFTER aggregation! Solution is to use the HAVING clause The HAVING clause filters AFTER aggregation (this is why you CAN use aggregate functions in the HAVING clause) The WHERE clause filters BEFORE aggregation (this is why you CANNOT use aggregate functions in the WHERE clause)
29
HAVING Clause Example Example: SELECT Genre, MPAA, COUNT(*) FROM Movies JOIN XRefGenresMovies USING(MovieID) GROUP BY Genre, MPAA HAVING COUNT(*) = 1;
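To watch HAVING drop groups, the sketch below runs the same query with and without the HAVING clause. The table contents are hypothetical, and sqlite3 stands in for MySQL; ORDER BY is added only for a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, MPAA TEXT)")
conn.execute("CREATE TABLE XRefGenresMovies (MovieID INTEGER, Genre TEXT)")

# Hypothetical data: two R-rated dramas and one PG comedy-drama.
conn.executemany("INSERT INTO Movies VALUES (?, ?)", [(1, "R"), (2, "R"), (3, "PG")])
conn.executemany(
    "INSERT INTO XRefGenresMovies VALUES (?, ?)",
    [(1, "Drama"), (2, "Drama"), (3, "Drama"), (3, "Comedy")],
)

base_query = (
    "SELECT Genre, MPAA, COUNT(*) FROM Movies "
    "JOIN XRefGenresMovies USING(MovieID) GROUP BY Genre, MPAA"
)

# All groups, before any post-aggregation filtering.
all_groups = conn.execute(base_query + " ORDER BY Genre, MPAA").fetchall()

# HAVING keeps only the groups whose aggregated count equals 1.
single_groups = conn.execute(
    base_query + " HAVING COUNT(*) = 1 ORDER BY Genre, MPAA"
).fetchall()

print(all_groups)     # [('Comedy', 'PG', 1), ('Drama', 'PG', 1), ('Drama', 'R', 2)]
print(single_groups)  # [('Comedy', 'PG', 1), ('Drama', 'PG', 1)]
```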
30
HAVING Summary Use HAVING to filter out groups in an aggregated table In a HAVING clause, you may use: aggregate functions regular functions constant values grouping attributes
31
An Advanced HAVING Problem List the country and average member age of all male members located within that country, for only those countries that have an average male member age greater than 25. Remember that nobody ever says “I'm 20.3948279 years old!”
32
Solution SELECT Country, TRUNCATE(AVG(TRUNCATE(DATEDIFF( CURDATE(), BirthDate)/365, 0)), 0) AS Average FROM Members WHERE Gender = 'M' GROUP BY Country HAVING TRUNCATE(AVG(TRUNCATE(DATEDIFF( CURDATE(), BirthDate)/365, 0)), 0) > 25;
33
Alternate Solution Remember, in MySQL you may use aliases in the HAVING clause (but not in the WHERE clause of course)… SELECT Country, TRUNCATE(AVG(TRUNCATE(DATEDIFF( CURDATE(), BirthDate)/365, 0)), 0) AS Average FROM Members WHERE Gender = 'M' GROUP BY Country HAVING Average > 25;
34
Aggregating Distinct Values A normal SELECT DISTINCT query filters out duplicates after aggregation Therefore, if a field contains duplicate values, and you aggregate on that field, SELECT DISTINCT WILL NOT filter out duplicate values from being aggregated. The solution is to use the DISTINCT keyword within the aggregate function
35
Aggregating Distinct Values List each movie title along with the number of distinct actors that appear in each movie.
36
Aggregating Distinct Values SELECT Title, COUNT(ActorID) AS 'Actors' FROM XRefActorsMovies JOIN Movies USING(MovieID) GROUP BY MovieID, Title; This is incorrect since some actors play multiple characters in each movie!
37
Aggregating Distinct Values SELECT Title, COUNT(DISTINCT ActorID) AS 'Actors' FROM XRefActorsMovies JOIN Movies USING(MovieID) GROUP BY MovieID, Title; This is the correct answer!
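The difference between COUNT(ActorID) and COUNT(DISTINCT ActorID) can be seen on a single movie. The sketch assumes hypothetical table contents, including an invented Role column to represent the characters played, and uses sqlite3 in place of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Title TEXT)")
# The Role column is a hypothetical addition used to model multiple characters.
conn.execute("CREATE TABLE XRefActorsMovies (MovieID INTEGER, ActorID INTEGER, Role TEXT)")

conn.execute("INSERT INTO Movies VALUES (1, 'Sample Movie')")
# Actor 10 plays three characters; actor 11 plays one.
conn.executemany(
    "INSERT INTO XRefActorsMovies VALUES (?, ?, ?)",
    [(1, 10, "Role A"), (1, 10, "Role B"), (1, 10, "Role C"), (1, 11, "Role D")],
)

query = (
    "SELECT Title, COUNT(ActorID), COUNT(DISTINCT ActorID) "
    "FROM XRefActorsMovies JOIN Movies USING(MovieID) GROUP BY MovieID, Title"
)
title, appearances, distinct_actors = conn.execute(query).fetchone()

# COUNT(ActorID) counts every row; COUNT(DISTINCT ActorID) counts each actor once.
print(title, appearances, distinct_actors)  # Sample Movie 4 2
```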
38
How to Solve Aggregate Problems Step #1: Write a SELECT FROM WHERE query that selects the data you want to aggregate. Step #2: Divide your data into groups using GROUP BY (if necessary) and aggregate your groups (add your aggregate functions). Step #3: Filter your groups using the HAVING clause.
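The three steps can be followed one at a time in code. This sketch is a simplification: it assumes a hypothetical Members table that stores an Age column directly instead of a BirthDate, with invented rows, and it uses sqlite3 as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplification: Age is stored directly rather than derived from BirthDate.
conn.execute("CREATE TABLE Members (Country TEXT, Gender TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [("USA", "M", 30), ("USA", "M", 40), ("USA", "F", 20),
     ("Japan", "M", 20), ("Japan", "M", 22), ("Chile", "M", 26)],
)

# Step 1: select the data to aggregate (male members only).
step1 = conn.execute(
    "SELECT Country, Age FROM Members WHERE Gender = 'M'"
).fetchall()

# Step 2: group by country and aggregate each group.
step2 = conn.execute(
    "SELECT Country, AVG(Age) FROM Members WHERE Gender = 'M' "
    "GROUP BY Country ORDER BY Country"
).fetchall()

# Step 3: filter the groups with HAVING.
step3 = conn.execute(
    "SELECT Country, AVG(Age) FROM Members WHERE Gender = 'M' "
    "GROUP BY Country HAVING AVG(Age) > 25 ORDER BY Country"
).fetchall()

print(len(step1))  # 5
print(step2)       # [('Chile', 26.0), ('Japan', 21.0), ('USA', 35.0)]
print(step3)       # [('Chile', 26.0), ('USA', 35.0)]
```

Building the query up in this order makes it easier to check each intermediate result before adding the next clause.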