1
Concept of Aggregation in SQL
2
What does an aggregation function do?
An aggregation function performs a computation over all the values in a column and produces a single tuple in the result holding that value.
Examples of aggregation functions: AVG(), MIN(), MAX(), SUM(), COUNT()
Find the average GPA of all students:
    Select AVG(GPA) from Student;
Find the minimum GPA of the students who have applied for a CS major:
    Select MIN(GPA) from Student, Apply where Student.sID = Apply.sID and major = 'CS';
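The two queries above can be tried out with a small sketch like the following. It uses Python's built-in sqlite3 module as a convenient way to run SQL; the sample rows are invented for illustration, and only the table and column names (Student, Apply, sID, GPA, cName, major) are taken from the slides.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL);
    CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT);
    INSERT INTO Student VALUES (1, 4.0), (2, 3.0), (3, 3.5), (4, 2.5);
    INSERT INTO Apply VALUES
        (1, 'Stanford', 'CS'), (2, 'Cornell', 'EE'), (3, 'Cornell', 'CS');
""")

# An aggregate collapses a whole column into one value, so each query returns one row.
avg_gpa = conn.execute("SELECT AVG(GPA) FROM Student").fetchone()[0]

# Aggregates also work over a join: here, only the rows of CS applicants.
min_cs_gpa = conn.execute("""
    SELECT MIN(GPA) FROM Student, Apply
    WHERE Student.sID = Apply.sID AND major = 'CS'
""").fetchone()[0]

print(avg_gpa)     # 3.25
print(min_cs_gpa)  # 3.5
```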
3
Find the average GPA of the students who have applied for a CS major
    Select AVG(GPA) from Student, Apply where Student.sID = Apply.sID and major = 'CS';
In fact, this result is probably not precisely what we were looking for. Let's go back to the star version of the query:
    Select * from Student, Apply where Student.sID = Apply.sID and major = 'CS';
The issue with this query is that if a student applied to CS at several universities, the join returns one row per application, so the student's GPA is counted several times when we compute the average.
Note: presumably what we really want is to count the GPA only once for each student who applied to CS, no matter how many times they applied.
4
Answer the previous question using a subquery:
For each student, we check whether their ID is among those who applied for computer science.
    Select * from Student where sID in (select sID from Apply where major = 'CS');
This returns the students who applied to CS, with exactly one row per student. Replacing * with AVG(GPA) then computes the average correctly, counting each student's GPA once.
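To make the difference concrete, here is a minimal sketch that runs the join version and the subquery version of the average side by side. The data is invented: student 1 applies to CS at three colleges, so the join counts that GPA three times.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL);
    CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT);
    INSERT INTO Student VALUES (1, 4.0), (2, 3.0), (3, 2.0);
    INSERT INTO Apply VALUES
        (1, 'Stanford', 'CS'), (1, 'Berkeley', 'CS'), (1, 'Cornell', 'CS'),
        (2, 'Cornell', 'CS'), (3, 'Cornell', 'EE');
""")

# Join version: one row per CS application, so student 1 contributes three times.
join_avg = conn.execute("""
    SELECT AVG(GPA) FROM Student, Apply
    WHERE Student.sID = Apply.sID AND major = 'CS'
""").fetchone()[0]

# Subquery version: one row per student who has at least one CS application.
subquery_avg = conn.execute("""
    SELECT AVG(GPA) FROM Student
    WHERE sID IN (SELECT sID FROM Apply WHERE major = 'CS')
""").fetchone()[0]

print(join_avg)      # 3.75, i.e. (4.0 + 4.0 + 4.0 + 3.0) / 4
print(subquery_avg)  # 3.5,  i.e. (4.0 + 3.0) / 2
```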
5
Count function
The count function counts the number of tuples in the result.
Find the number of colleges in our database whose enrollment is greater than 15000:
    Select count(*) from College where enrollment > 15000;
Count the number of students who have applied to Cornell University:
    Select count(*) from Apply where cName = 'Cornell';
Note: we are over-counting; again, we are really counting the number of applications, not the number of applicants. To get the precise result we can rewrite the query with a subquery, exactly as we did for the average GPA of the students who applied to CS. SQL also provides a very nice feature for this problem: the special keyword "distinct" followed by the name of one or more attributes.
6
Rewrite the previous query using “distinct”
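    Select count(distinct sID) from Apply where cName = 'Cornell';
Note: count(distinct) turns out to be a very useful feature in SQL.
Count the number of students who have the same GPA as some other student:
    Select count(distinct S1.sID) from Student S1, Student S2 where S2.sID <> S1.sID and S2.GPA = S1.GPA;
With count(*) in place of count(distinct S1.sID), this query would count ordered pairs of students rather than students, so every matching pair would be counted twice.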
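The sketch below contrasts count(*) with count(distinct ...) on invented data. It also runs the self-join on Student both ways, since count(*) over that join counts ordered pairs of students, which is easy to mistake for a count of students.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL);
    CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT);
    INSERT INTO Student VALUES (1, 3.5), (2, 3.5), (3, 3.5), (4, 3.0);
    INSERT INTO Apply VALUES
        (1, 'Cornell', 'CS'), (1, 'Cornell', 'EE'),
        (2, 'Cornell', 'CS'), (3, 'Stanford', 'CS');
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# count(*) counts applications; count(distinct sID) counts applicants.
applications = scalar("SELECT COUNT(*) FROM Apply WHERE cName = 'Cornell'")
applicants = scalar("SELECT COUNT(DISTINCT sID) FROM Apply WHERE cName = 'Cornell'")

# Self-join on equal GPA: count(*) counts ordered pairs of different students,
# while count(distinct S1.sID) counts students who share a GPA with someone else.
same_gpa = "FROM Student S1, Student S2 WHERE S2.sID <> S1.sID AND S2.GPA = S1.GPA"
pairs = scalar("SELECT COUNT(*) " + same_gpa)
students = scalar("SELECT COUNT(DISTINCT S1.sID) " + same_gpa)

print(applications, applicants)  # 3 2
print(pairs, students)           # 6 3
```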
7
New query
The query computes the amount by which the average GPA of students who applied to CS exceeds the average GPA of students who didn't apply to CS.
    Select CS.avgGPA - NonCS.avgGPA
    from (select AVG(GPA) as avgGPA from Student where sID in (select sID from Apply where major = 'CS')) as CS,
         (select AVG(GPA) as avgGPA from Student where sID not in (select sID from Apply where major = 'CS')) as NonCS;
Notes: a subquery in the from clause lets you write a select-from-where expression and then use its result as if it were an actual table in the database. In the example above we use two subqueries in the from clause: one computes the average GPA of the CS applicants and the other the average GPA of the non-CS applicants.
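The following sketch runs the from-clause subquery pattern on invented data. One subtlety is worth checking by hand: the condition "sID not in (... major = 'CS')" selects students who never applied to CS, whereas "sID in (... major <> 'CS')" selects students who applied to at least one other major, which can include CS applicants. The helper gpa_gap is hypothetical and exists only to compare the two conditions.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL);
    CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT);
    INSERT INTO Student VALUES (1, 4.0), (2, 3.5), (3, 3.0), (4, 2.5);
    INSERT INTO Apply VALUES
        (1, 'Stanford', 'CS'), (2, 'Stanford', 'CS'),
        (2, 'Cornell', 'EE'), (3, 'Cornell', 'EE');
""")

def gpa_gap(non_cs_condition):
    """Average GPA of CS applicants minus the average GPA of the other group."""
    sql = f"""
        SELECT CS.avgGPA - NonCS.avgGPA
        FROM (SELECT AVG(GPA) AS avgGPA FROM Student
              WHERE sID IN (SELECT sID FROM Apply WHERE major = 'CS')) AS CS,
             (SELECT AVG(GPA) AS avgGPA FROM Student
              WHERE {non_cs_condition}) AS NonCS
    """
    return conn.execute(sql).fetchone()[0]

# Students with no CS application at all: students 3 and 4.
gap_not_applied = gpa_gap("sID NOT IN (SELECT sID FROM Apply WHERE major = 'CS')")

# Students with some non-CS application: students 2 and 3 (student 2 also applied to CS).
gap_other_major = gpa_gap("sID IN (SELECT sID FROM Apply WHERE major <> 'CS')")

print(gap_not_applied)  # 1.0, i.e. 3.75 - 2.75
print(gap_other_major)  # 0.5, i.e. 3.75 - 3.25
```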
8
Group by clause
The group by clause is used in conjunction with aggregation.
Find the number of applications to each college:
    Select cName, count(*) from Apply group by cName;
Note: effectively, grouping takes a relation and partitions it by the value of a given attribute or set of attributes. In this query we take the Apply relation and break it into groups, one per college name. For each group we return one tuple containing the college name and the number of tuples in the group.
Find the total enrollment of the colleges in each state:
    Select state, sum(enrollment) from College group by state;
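A minimal sketch of both grouping queries on invented data follows. The ORDER BY clauses are an addition, not part of the slides; they only make the output order predictable.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT);
    CREATE TABLE College (cName TEXT PRIMARY KEY, state TEXT, enrollment INTEGER);
    INSERT INTO Apply VALUES
        (1, 'Cornell', 'CS'), (2, 'Cornell', 'CS'), (3, 'Cornell', 'EE'),
        (1, 'Stanford', 'CS'), (2, 'Stanford', 'EE'), (1, 'MIT', 'CS');
    INSERT INTO College VALUES
        ('Stanford', 'CA', 15000), ('Berkeley', 'CA', 36000),
        ('MIT', 'MA', 10000), ('Cornell', 'NY', 21000);
""")

# One output row per group: Apply is partitioned by cName, then counted.
per_college = conn.execute("""
    SELECT cName, COUNT(*) FROM Apply GROUP BY cName ORDER BY cName
""").fetchall()

# Same idea with SUM: College is partitioned by state, then enrollment is summed.
per_state = conn.execute("""
    SELECT state, SUM(enrollment) FROM College GROUP BY state ORDER BY state
""").fetchall()

print(per_college)  # [('Cornell', 3), ('MIT', 1), ('Stanford', 2)]
print(per_state)    # [('CA', 51000), ('MA', 10000), ('NY', 21000)]
```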
9
More challenging queries
For each college and major combination, compute the minimum and maximum GPA of the students who have applied to that college for that major:
    Select cName, major, min(GPA), max(GPA) from Student, Apply where Student.sID = Apply.sID group by cName, major;
Find the number of colleges each student has applied to:
    Select Student.sID, count(distinct cName) from Student, Apply where Student.sID = Apply.sID group by Student.sID;
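Here is a sketch of the two queries on invented data, with sID qualified as Student.sID and an explicit from clause so that the second query runs. Student 4 has no applications and therefore does not appear in the second result, because the join drops students without a matching Apply row. The ORDER BY clauses are an addition for predictable output.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL);
    CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT);
    INSERT INTO Student VALUES (1, 4.0), (2, 3.0), (3, 3.5), (4, 2.5);
    INSERT INTO Apply VALUES
        (1, 'Stanford', 'CS'), (2, 'Stanford', 'CS'), (2, 'Stanford', 'EE'),
        (3, 'Stanford', 'EE'), (1, 'Cornell', 'EE');
""")

# Grouping by two attributes: one group per (college, major) combination.
gpa_range = conn.execute("""
    SELECT cName, major, MIN(GPA), MAX(GPA)
    FROM Student, Apply
    WHERE Student.sID = Apply.sID
    GROUP BY cName, major
    ORDER BY cName, major
""").fetchall()

# count(distinct cName) counts each college once even if the student
# applied there for several majors (student 2 applied to Stanford twice).
colleges_per_student = conn.execute("""
    SELECT Student.sID, COUNT(DISTINCT cName)
    FROM Student, Apply
    WHERE Student.sID = Apply.sID
    GROUP BY Student.sID
    ORDER BY Student.sID
""").fetchall()

print(gpa_range)
print(colleges_per_student)  # [(1, 2), (2, 1), (3, 1)]
```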
10
Having clause
The having clause is also used only in conjunction with aggregation. It lets us apply conditions to the results of aggregation functions. The having clause comes after the group by clause and checks conditions that involve an entire group, whereas the where clause applies to one tuple at a time.
Find the colleges that have fewer than five applications:
    Select cName from Apply group by cName having count(*) < 5;
Note: we group the Apply relation by college name and keep only those groups in which the number of tuples is less than five.
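As a sketch of how having filters whole groups, the example below builds an invented Apply table with six rows for Cornell, two for Stanford and one for MIT, then keeps only the groups with fewer than five rows. The ORDER BY clauses are an addition for predictable output.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT)")

rows = [(sid, "Cornell", "CS") for sid in range(1, 7)]  # 6 applications
rows += [(1, "Stanford", "CS"), (2, "Stanford", "CS")]  # 2 applications
rows += [(1, "MIT", "CS")]                              # 1 application
conn.executemany("INSERT INTO Apply VALUES (?, ?, ?)", rows)

# Group sizes before filtering.
counts = conn.execute("""
    SELECT cName, COUNT(*) FROM Apply GROUP BY cName ORDER BY cName
""").fetchall()

# HAVING is evaluated once per group, after grouping; WHERE would be
# evaluated once per row, before grouping, and cannot use COUNT(*).
small_colleges = [row[0] for row in conn.execute("""
    SELECT cName FROM Apply GROUP BY cName HAVING COUNT(*) < 5 ORDER BY cName
""")]

print(counts)          # [('Cornell', 6), ('MIT', 1), ('Stanford', 2)]
print(small_colleges)  # ['MIT', 'Stanford']
```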