Published by Clifford Hall. Modified over 9 years ago.
Slide 1: Grouping

Theory, Practice & Methodology of Relational Database Design and Programming
Copyright © Ellis Cohen 2002-2008

These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see http://www.openlineconsult.com/db
Slide 2: Overview of Lecture

- Grouped Aggregation
- Restriction and Grouping
- Group Restriction
Slide 3: Grouped Aggregation
Slide 4: Grouped Aggregation Motivation

Suppose we want to find the average and maximum salary of the employees in each department.

1. We need to separate the employees into groups (where each department is a separate group).
2. We need to find the average and maximum salary of each group.
Slide 5: SQL GROUP BY

  SELECT deptno, avg(sal) AS avgsal, max(sal) AS maxsal
  FROM Emps
  GROUP BY deptno

The SELECT clause describes the fields in the result set and how they are calculated. The GROUP BY clause describes how to group. GROUP BY generates one result tuple for each group.

Emps
  empno  deptno  sal
  7839   10      5000
  7499   30      1600
  7654   30      1250
  7698   30      2850
  7844   30      1500
  7986   50      1500
  7219   50      3100

Result
  deptno  avgsal  maxsal
  10      5000    5000
  30      1800    2850
  50      2300    3100
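To try the query on slide 5, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for the course database. The table definition and column types are assumptions; the rows are the ones shown on the slide.

```python
import sqlite3

# In-memory database holding the Emps rows from slide 5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, 5000), (7499, 30, 1600), (7654, 30, 1250), (7698, 30, 2850),
     (7844, 30, 1500), (7986, 50, 1500), (7219, 50, 3100)],
)

# One result row per department. ORDER BY is added only to make the
# row order deterministic; it is not part of the slide's query.
rows = conn.execute(
    "SELECT deptno, avg(sal) AS avgsal, max(sal) AS maxsal "
    "FROM Emps GROUP BY deptno ORDER BY deptno"
).fetchall()

for deptno, avgsal, maxsal in rows:
    print(deptno, avgsal, maxsal)
```

SQLite returns avg(sal) as a floating-point value, so the averages print as 5000.0, 1800.0 and 2300.0.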
Slide 6: SELECT with GROUP BY

  SELECT deptno, avg(sal) AS avgsal, max(sal) AS maxsal
  FROM Emps
  GROUP BY deptno

When GROUP BY is used, the SELECT clause can contain
- GROUP BY expressions
- aggregate functions

What's the SQL to count the # of employees who have each job?
Slide 7: SQL Group By Answer

What's the SQL to count the # of employees who have each job?

  SELECT job, count(*) AS knt
  FROM Emps
  GROUP BY job

Emps
  empno  deptno  job
  7839   10      CLERK
  7499   30      ANALYST
  7654   10      ANALYST
  7698   20      ANALYST
  7844   30      ANALYST
  7986   10      DEPTMGR
  7219   30      DEPTMGR

Result
  job      knt
  CLERK    1
  ANALYST  4
  DEPTMGR  2
Slide 8: GROUP and DISTINCT

Compare the results of

  SELECT job FROM Emps GROUP BY job

  SELECT DISTINCT job FROM Emps

How are the results different?
Slide 9: Answer: GROUP and DISTINCT

  SELECT job FROM Emps GROUP BY job

  SELECT DISTINCT job FROM Emps

Identical results! Use of DISTINCT in this case is preferred.
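The equivalence can be checked directly. This sketch assumes Python's sqlite3 module and the Emps rows from slide 7; the results are sorted before comparison because neither query promises a row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, job TEXT)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, "CLERK"), (7499, 30, "ANALYST"), (7654, 10, "ANALYST"),
     (7698, 20, "ANALYST"), (7844, 30, "ANALYST"), (7986, 10, "DEPTMGR"),
     (7219, 30, "DEPTMGR")],
)

# Both queries return each job exactly once.
grouped = sorted(conn.execute("SELECT job FROM Emps GROUP BY job").fetchall())
distinct = sorted(conn.execute("SELECT DISTINCT job FROM Emps").fetchall())

print(grouped == distinct)
```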
Slide 10: Tuple vs Attribute Counting

  SELECT job, count(*) AS knt
  FROM Emps
  GROUP BY job

  SELECT job, count(deptno) AS knt
  FROM Emps
  GROUP BY job

Are the results equivalent?
Slide 11: Counting NULL Values

  SELECT job, count(*) AS tknt, count(deptno) AS dknt
  FROM Emps
  GROUP BY job

Emps
  empno  deptno  job
  7839   10      CLERK
  7499   30      ANALYST
  7654   10      ANALYST
  7698   NULL    ANALYST
  7844   30      ANALYST
  7986   10      DEPTMGR
  7219   30      DEPTMGR

Result
  job      tknt  dknt
  CLERK    1     1
  ANALYST  4     3
  DEPTMGR  2     2
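A short sketch of the same comparison, assuming Python's sqlite3 module and the slide 11 rows (Python's None is stored as SQL NULL). count(*) counts tuples, while count(deptno) skips tuples whose deptno is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, job TEXT)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, "CLERK"), (7499, 30, "ANALYST"), (7654, 10, "ANALYST"),
     (7698, None, "ANALYST"),  # deptno is NULL for this employee
     (7844, 30, "ANALYST"), (7986, 10, "DEPTMGR"), (7219, 30, "DEPTMGR")],
)

rows = conn.execute(
    "SELECT job, count(*) AS tknt, count(deptno) AS dknt "
    "FROM Emps GROUP BY job"
).fetchall()

# Map each job to its (tuple count, non-NULL deptno count).
counts = {job: (tknt, dknt) for job, tknt, dknt in rows}
print(counts)
```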
Slide 12: Tuple vs Group Attribute Counting

  SELECT job, count(*) AS knt
  FROM Emps
  GROUP BY job

  SELECT job, count(job) AS knt
  FROM Emps
  GROUP BY job

Are the results equivalent?
Slide 13: Counting NULL Groups

  SELECT job, count(*) AS tknt, count(job) AS jknt
  FROM Emps
  GROUP BY job

Emps
  empno  deptno  job
  7839   10      CLERK
  7499   30      ANALYST
  7654   10      ANALYST
  7698   20      ANALYST
  7844   30      ANALYST
  7513   20      NULL
  8204   10      NULL
  7986   10      DEPTMGR
  7219   30      DEPTMGR

Result
  job      tknt  jknt
  CLERK    1     1
  ANALYST  4     4
  NULL     2     0
  DEPTMGR  2     2
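The NULL group can be reproduced with the sketch below, which assumes Python's sqlite3 module and the slide 13 rows. Employees with a NULL job form a group of their own, and count(job) for that group is 0 because every job value in it is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, job TEXT)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, "CLERK"), (7499, 30, "ANALYST"), (7654, 10, "ANALYST"),
     (7698, 20, "ANALYST"), (7844, 30, "ANALYST"),
     (7513, 20, None), (8204, 10, None),  # two employees with a NULL job
     (7986, 10, "DEPTMGR"), (7219, 30, "DEPTMGR")],
)

rows = conn.execute(
    "SELECT job, count(*) AS tknt, count(job) AS jknt "
    "FROM Emps GROUP BY job"
).fetchall()

# The NULL group comes back with Python's None as its job.
by_job = {job: (tknt, jknt) for job, tknt, jknt in rows}
print(by_job[None])
```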
Slide 14: Composite Grouping

How many employees hold each job within each department?

  SELECT deptno, job, count(*) AS knt
  FROM Emps
  GROUP BY deptno, job

Emps
  empno  deptno  job
  7839   10      CLERK
  7499   30      ANALYST
  7654   30      ANALYST
  7698   30      CLERK
  7844   30      SALESMAN
  7986   50      CLERK
  7214   50      CLERK
  7586   50      SALESMAN

Result
  deptno  job       knt
  10      CLERK     1
  30      ANALYST   2
  30      CLERK     1
  30      SALESMAN  1
  50      CLERK     2
  50      SALESMAN  1
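As an illustration of composite grouping, this sketch (assuming Python's sqlite3 module and the slide 14 rows) forms one group per distinct (deptno, job) pair. The ORDER BY is an addition to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, job TEXT)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, "CLERK"), (7499, 30, "ANALYST"), (7654, 30, "ANALYST"),
     (7698, 30, "CLERK"), (7844, 30, "SALESMAN"), (7986, 50, "CLERK"),
     (7214, 50, "CLERK"), (7586, 50, "SALESMAN")],
)

# Grouping on two columns: each (deptno, job) combination is its own group.
rows = conn.execute(
    "SELECT deptno, job, count(*) AS knt "
    "FROM Emps GROUP BY deptno, job ORDER BY deptno, job"
).fetchall()

for row in rows:
    print(row)
```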
Slide 15: Grouping & Distinct Aggregation

How many different jobs are there within each department?

  SELECT deptno, count(DISTINCT job) AS njob
  FROM Emps
  GROUP BY deptno

Emps
  empno  deptno  job
  7839   10      CLERK
  7499   30      ANALYST
  7654   30      ANALYST
  7698   30      CLERK
  7844   30      SALESMAN
  7986   50      CLERK
  7214   50      CLERK
  7586   50      SALESMAN

Result
  deptno  njob
  10      1
  30      3
  50      2

What's wrong with:

  SELECT deptno, count(job) AS njob
  FROM Emps
  GROUP BY deptno
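One way to see the problem with plain count(job) is to compute both counts side by side. The sketch assumes Python's sqlite3 module and the slide 15 rows; the column alias nemp is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, job TEXT)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, "CLERK"), (7499, 30, "ANALYST"), (7654, 30, "ANALYST"),
     (7698, 30, "CLERK"), (7844, 30, "SALESMAN"), (7986, 50, "CLERK"),
     (7214, 50, "CLERK"), (7586, 50, "SALESMAN")],
)

# count(DISTINCT job) counts each job once per department;
# count(job) counts every employee who has a non-NULL job.
rows = conn.execute(
    "SELECT deptno, count(DISTINCT job) AS njob, count(job) AS nemp "
    "FROM Emps GROUP BY deptno ORDER BY deptno"
).fetchall()

for row in rows:
    print(row)
```

Department 30 has three different jobs but four employees, so count(job) overstates the number of jobs there.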
Slide 16: Distinct Counts Problem

Suppose you executed

  CREATE TABLE DeptJobs AS
    SELECT DISTINCT deptno, job FROM Emps

What's the difference between

  SELECT deptno, count(DISTINCT job) FROM Emps GROUP BY deptno

  SELECT deptno, count(job) FROM DeptJobs GROUP BY deptno

  SELECT deptno, count(*) FROM DeptJobs GROUP BY deptno
Slide 17: Diagram for Grouping Exercise

Emps
  empno  deptno  job
  7839   10      CLERK
  7499   30      ANALYST
  7654   30      ANALYST
  7698   30      CLERK
  7844   30      SALESMAN
  7986   50      CLERK
  7214   50      CLERK
  7586   50      SALESMAN

Applying SELECT DISTINCT deptno, job FROM Emps gives

DeptJobs
  deptno  job
  10      CLERK
  30      ANALYST
  30      CLERK
  30      SALESMAN
  50      CLERK
  50      SALESMAN

Both of the following produce the same result:

  SELECT deptno, count(job) FROM DeptJobs GROUP BY deptno

  SELECT deptno, count(DISTINCT job) FROM Emps GROUP BY deptno

Result
  deptno  njob
  10      1
  30      3
  50      2
Slide 18: Distinct Counts With NULLs

  SELECT count(DISTINCT job) FROM Emps GROUP BY deptno
  -- this ignores employees with NULL jobs

  SELECT count(job) FROM DeptJobs GROUP BY deptno
  -- this also ignores employees with NULL jobs

  SELECT count(*) FROM DeptJobs GROUP BY deptno
  -- the count for a department will be one higher if any of its employees have NULL jobs
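The three counts can be compared on a small table that contains NULL jobs. This is a sketch assuming Python's sqlite3 module; the Emps rows here are hypothetical (two employees in department 30 have no job), and deptno is added to each SELECT list so the counts can be matched to departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, job TEXT)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, "CLERK"), (7499, 30, "ANALYST"), (7654, 30, "ANALYST"),
     (7698, 30, None), (7844, 30, None),  # hypothetical NULL jobs
     (7986, 50, "CLERK")],
)
# DISTINCT keeps a single (30, NULL) row for the two NULL-job employees.
conn.execute("CREATE TABLE DeptJobs AS SELECT DISTINCT deptno, job FROM Emps")

def per_dept(sql):
    """Run a grouped query and return its rows ordered by deptno."""
    return conn.execute(sql + " GROUP BY deptno ORDER BY deptno").fetchall()

distinct_in_emps = per_dept("SELECT deptno, count(DISTINCT job) FROM Emps")
count_job = per_dept("SELECT deptno, count(job) FROM DeptJobs")
count_star = per_dept("SELECT deptno, count(*) FROM DeptJobs")

print(distinct_in_emps, count_job, count_star)
```

Only count(*) on DeptJobs sees the (30, NULL) row, so department 30 comes out one higher in that query.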
Slide 19: Restriction and Grouping
Slide 20: Group with Restriction

  SELECT deptno, max(sal) AS maxsal
  FROM Emps
  WHERE ename <> 'BLAKE'
  GROUP BY deptno
  ORDER BY maxsal

The WHERE clause eliminates rows. GROUP BY deptno then forms the groups 10, 30 and 50 from the remaining rows.

Emps
  empno  ename   deptno  sal   comm
  7499   ALLEN   30      1600  300
  7654   MARTIN  30      1250  1400
  7698   BLAKE   30      2850
  7839   KING    10      5000
  7844   TURNER  30      1500  0
  7986   STERN   50      1500

Result
  DEPTNO  MAXSAL
  ------  ------
  50      1500
  30      1600
  10      5000
Slide 21: SQL Query Sequence w GROUP

  SELECT deptno, max(sal) AS maxsal    3. Projection
  FROM Emps
  WHERE ename <> 'BLAKE'               1. Restriction
  GROUP BY deptno                      2. Grouping
  ORDER BY deptno                      4. Ordering
Slide 22: Restriction Can Eliminate Groups (but only by eliminating all rows in the group!)

  SELECT deptno, max(sal) AS maxsal
  FROM Emps
  WHERE ename <> 'KING'
  GROUP BY deptno
  ORDER BY maxsal

The WHERE clause eliminates rows. KING is the only employee in department 10, so that group disappears.

Emps
  empno  ename   deptno  sal   comm
  7499   ALLEN   30      1600  300
  7654   MARTIN  30      1250  1400
  7698   BLAKE   30      2850
  7839   KING    10      5000
  7844   TURNER  30      1500  0
  7986   STERN   50      1500

Result
  DEPTNO  MAXSAL
  ------  ------
  50      1500
  30      2850
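Both restriction examples can be run with the sketch below, which assumes Python's sqlite3 module and the Emps rows from slides 20 and 22. The helper max_sal_excluding is a hypothetical name; it passes the excluded employee as a query parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emps (empno INTEGER, ename TEXT, deptno INTEGER, "
    "sal INTEGER, comm INTEGER)"
)
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?, ?, ?)",
    [(7499, "ALLEN", 30, 1600, 300), (7654, "MARTIN", 30, 1250, 1400),
     (7698, "BLAKE", 30, 2850, None), (7839, "KING", 10, 5000, None),
     (7844, "TURNER", 30, 1500, 0), (7986, "STERN", 50, 1500, None)],
)

def max_sal_excluding(ename):
    """Maximum salary per department, ignoring one employee."""
    return conn.execute(
        "SELECT deptno, max(sal) AS maxsal FROM Emps "
        "WHERE ename <> ? GROUP BY deptno ORDER BY maxsal",
        (ename,),
    ).fetchall()

without_blake = max_sal_excluding("BLAKE")  # department 30 shrinks but survives
without_king = max_sal_excluding("KING")    # department 10 loses its only row

print(without_blake)
print(without_king)
```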
Slide 23: No Group Restriction in WHERE (only tuple restriction)

Suppose I want to know the departments in which the average employee salary is greater than 2000.

WRONG:

  SELECT deptno
  FROM Emps
  WHERE avg(sal) > 2000
  GROUP BY deptno

The WHERE clause is done BEFORE grouping. It evaluates a single tuple at a time! avg(sal) can't possibly make sense applied to a single tuple.

WHERE specifies a test to be applied to each individual tuple, so it CAN'T include aggregation!

We'll see how to do this LATER!
Slide 24: Grouped Counts Problem

  SELECT deptno, count(*) AS dknt
  FROM Emps
  WHERE sal > 3000
  GROUP BY deptno

What does this return? Can any value in the dknt column be zero?
Slide 25: Grouped Counts Answer

  SELECT deptno, count(*) AS dknt
  FROM Emps
  WHERE sal > 3000
  GROUP BY deptno

Counts the number of employees in each department who make more than 3000.

No dknt value can be zero. A group with no tuples simply doesn't appear.

Suppose we do want to see a zero for depts in which no employee makes more than 3000.
Slide 26: Arranging Zero Counts

How many employees are in each department?

  SELECT deptno, count(*) AS knt
  FROM Emps
  GROUP BY deptno

... only counting employees w sal > 3000

  SELECT deptno, count(*) AS knt
  FROM Emps
  WHERE sal > 3000
  GROUP BY deptno

... showing departments with 0 such employees

  SELECT deptno,
    count(CASE WHEN sal>3000 THEN sal END) AS knt
  FROM Emps
  GROUP BY deptno

The CASE expression yields NULL for employees whose sal is not above 3000, and count ignores NULLs, so every department still appears, with a count of 0 where no employee qualifies.
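The difference between the last two queries shows up on the slide 5 data, where department 30 has nobody earning more than 3000. A minimal sketch, assuming Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, 5000), (7499, 30, 1600), (7654, 30, 1250), (7698, 30, 2850),
     (7844, 30, 1500), (7986, 50, 1500), (7219, 50, 3100)],
)

# WHERE removes the rows first, so department 30 never forms a group.
with_where = conn.execute(
    "SELECT deptno, count(*) AS knt FROM Emps "
    "WHERE sal > 3000 GROUP BY deptno ORDER BY deptno"
).fetchall()

# CASE keeps every row in its group but turns non-matching rows into NULL,
# which count(...) ignores.
with_case = conn.execute(
    "SELECT deptno, count(CASE WHEN sal > 3000 THEN sal END) AS knt "
    "FROM Emps GROUP BY deptno ORDER BY deptno"
).fetchall()

print(with_where)
print(with_case)
```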
Slide 27: Group Restriction
Slide 28: GROUP BY & HAVING

Determine the deptno and the maximum salary of those departments where the average salary > 2000.

  SELECT deptno, max(sal) AS maxsal
  FROM Emps
  GROUP BY deptno
  HAVING avg(sal) > 2000

1. Group by deptno.
2. Keep those groups whose average salary > 2000.
3. For each group, list the department number and the maximum salary of the group.

Note: We can use an aggregate function to calculate the average salary of each group.
Slide 29: Effect of Having

  SELECT deptno, max(sal) AS maxsal
  FROM Emps
  GROUP BY deptno
  HAVING avg(sal) > 2000

Emps
  empno  deptno  sal
  7839   10      5000
  7499   30      1600
  7654   30      1250
  7698   30      2850
  7844   30      1500
  7986   50      1500
  7219   50      3100

After grouping
  deptno  maxsal  avg(sal)
  10      5000    5000
  30      2850    1800
  50      3100    2300

After keeping the groups with avg(sal) > 2000
  deptno  maxsal
  10      5000
  50      3100
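The two tables on slide 29 can be regenerated with the sketch below, assuming Python's sqlite3 module and the slide's Emps rows. The first query exposes avg(sal) for every group; the second applies the HAVING test to those groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, 5000), (7499, 30, 1600), (7654, 30, 1250), (7698, 30, 2850),
     (7844, 30, 1500), (7986, 50, 1500), (7219, 50, 3100)],
)

# Every group, with the average that HAVING will test.
all_groups = conn.execute(
    "SELECT deptno, max(sal), avg(sal) FROM Emps "
    "GROUP BY deptno ORDER BY deptno"
).fetchall()

# Only the groups whose average salary exceeds 2000 survive.
kept_groups = conn.execute(
    "SELECT deptno, max(sal) AS maxsal FROM Emps "
    "GROUP BY deptno HAVING avg(sal) > 2000 ORDER BY deptno"
).fetchall()

print(all_groups)
print(kept_groups)
```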
Slide 30: SQL Projected Group Restriction

  SELECT deptno, max(sal) AS maxsal
  FROM Emps
  GROUP BY deptno
  HAVING maxsal > 2000
  ORDER BY deptno

Will NOT work on many DBs (including Oracle), because HAVING refers to the column alias maxsal.

  SELECT deptno, max(sal) AS maxsal
  FROM Emps
  GROUP BY deptno
  HAVING max(sal) > 2000
  ORDER BY deptno

This will work on all RDBs.
Slide 31: HAVING Exercise

  SELECT deptno
  FROM Emps
  GROUP BY deptno
  HAVING avg(sal) > 2000

What does this do?
Slide 32: HAVING vs WHERE

Determine the departments in which the average employee salary is greater than 2000.

RIGHT!

  SELECT deptno
  FROM Emps
  GROUP BY deptno
  HAVING avg(sal) > 2000

WRONG

  SELECT deptno
  FROM Emps
  WHERE avg(sal) > 2000
  GROUP BY deptno

Aggregate functions CANNOT appear in the WHERE clause! WHERE specifies a test to be applied to each individual tuple, so it CAN'T include aggregation.
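Running both forms makes the contrast concrete. This sketch assumes Python's sqlite3 module and the Emps rows from slide 29; SQLite is only one engine, and the exact error text differs between databases, so the code checks only that an error is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emps (empno INTEGER, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?)",
    [(7839, 10, 5000), (7499, 30, 1600), (7654, 30, 1250), (7698, 30, 2850),
     (7844, 30, 1500), (7986, 50, 1500), (7219, 50, 3100)],
)

# WRONG: an aggregate in WHERE is rejected before any row is read.
try:
    conn.execute(
        "SELECT deptno FROM Emps WHERE avg(sal) > 2000 GROUP BY deptno"
    ).fetchall()
    where_error = None
except sqlite3.Error as exc:
    where_error = exc

# RIGHT: the same test in HAVING is applied to each group.
right = conn.execute(
    "SELECT deptno FROM Emps GROUP BY deptno "
    "HAVING avg(sal) > 2000 ORDER BY deptno"
).fetchall()

print("WHERE version failed:", where_error is not None)
print(right)
```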
Slide 33: Another HAVING Exercise

  SELECT deptno
  FROM Emps
  GROUP BY deptno
  HAVING count(*) BETWEEN 2 AND 5

What does this do?
Slide 34: Counting # of Employees

  SELECT deptno
  FROM Emps
  GROUP BY deptno
  HAVING count(*) BETWEEN 2 AND 5

Generate the list of departments which have between 2 and 5 employees.
Slide 35: SQL Query Sequence w HAVING

  SELECT deptno, max(sal) AS maxsal    4. Projection
  FROM Emps
  WHERE ename <> 'BLAKE'               1. Restriction
  GROUP BY deptno                      2. Grouping
  HAVING avg(sal) > 2000               3. Group Restriction
  ORDER BY deptno                      5. Ordering
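To make the sequence tangible, the sketch below carries out the stages by hand in plain Python and checks the outcome against the same query run through sqlite3. It is an illustration of the logical order only, not of how a database engine executes the query; the Emps rows are the ones from slide 20.

```python
import sqlite3
from collections import defaultdict

emps = [
    (7499, "ALLEN", 30, 1600), (7654, "MARTIN", 30, 1250),
    (7698, "BLAKE", 30, 2850), (7839, "KING", 10, 5000),
    (7844, "TURNER", 30, 1500), (7986, "STERN", 50, 1500),
]

# Restriction (WHERE): test one tuple at a time.
restricted = [e for e in emps if e[1] != "BLAKE"]

# Grouping (GROUP BY): collect the salaries of each department.
groups = defaultdict(list)
for empno, ename, deptno, sal in restricted:
    groups[deptno].append(sal)

# Group restriction (HAVING): test one whole group at a time.
surviving = {d: sals for d, sals in groups.items()
             if sum(sals) / len(sals) > 2000}

# Projection (SELECT), then ordering (ORDER BY deptno).
by_hand = sorted((d, max(sals)) for d, sals in surviving.items())

# The same query, run by SQLite for comparison.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emps (empno INTEGER, ename TEXT, deptno INTEGER, sal INTEGER)"
)
conn.executemany("INSERT INTO Emps VALUES (?, ?, ?, ?)", emps)
by_sql = conn.execute(
    "SELECT deptno, max(sal) AS maxsal FROM Emps "
    "WHERE ename <> 'BLAKE' GROUP BY deptno "
    "HAVING avg(sal) > 2000 ORDER BY deptno"
).fetchall()

print(by_hand, by_sql)
```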
Slide 36: SQL Placement of Aggregate Functions

Where can aggregate functions go?

  SELECT deptno, max(sal)
  FROM Emps
  WHERE job <> 'CLERK'
  GROUP BY deptno
  HAVING count(*) > 3
  ORDER BY avg(sal)

In simple SQL expressions, aggregate functions CAN ONLY appear in the SELECT, HAVING & ORDER BY clauses (OK). They cannot appear in the WHERE clause (NOT HERE).

* Only if GROUP is used
Slide 37: Group Restriction Exercise

Using Emps( empno, ename, job, sal, comm, deptno ), write the SQL expression for the following:

- Show the average salary per job, excluding those jobs held by only a single employee
- Show the average salary per job, excluding those jobs found only in a single department

You can assume that every employee has a job and a salary.
Slide 38: Answer to Group Restriction Exercise

Show the average salary per job, excluding those jobs held by only a single employee

  SELECT job, avg(sal) AS avgsal
  FROM Emps
  GROUP BY job
  HAVING count(*) > 1

Show the average salary per job, excluding those jobs found only in a single dept

  SELECT job, avg(sal) AS avgsal
  FROM Emps
  GROUP BY job
  HAVING count(DISTINCT deptno) > 1
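Both answers can be exercised on a small table. The sketch assumes Python's sqlite3 module; the five Emps rows are hypothetical test data chosen so that PRESIDENT is held by one employee and CLERK appears in only one department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emps (empno INTEGER, ename TEXT, job TEXT, "
    "sal INTEGER, comm INTEGER, deptno INTEGER)"
)
# Hypothetical rows, not taken from the slides.
conn.executemany(
    "INSERT INTO Emps VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "ADAMS", "CLERK", 1000, None, 10),
     (2, "BAKER", "CLERK", 1200, None, 10),
     (3, "CLARK", "ANALYST", 3000, None, 20),
     (4, "DAVIS", "ANALYST", 3400, None, 30),
     (5, "EVANS", "PRESIDENT", 5000, None, 10)],
)

# Jobs held by more than one employee.
multi_employee = conn.execute(
    "SELECT job, avg(sal) AS avgsal FROM Emps "
    "GROUP BY job HAVING count(*) > 1 ORDER BY job"
).fetchall()

# Jobs found in more than one department.
multi_dept = conn.execute(
    "SELECT job, avg(sal) AS avgsal FROM Emps "
    "GROUP BY job HAVING count(DISTINCT deptno) > 1 ORDER BY job"
).fetchall()

print(multi_employee)
print(multi_dept)
```

CLERK passes the first test (two employees) but fails the second (both clerks are in department 10), which is why the two queries need different HAVING conditions.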