Presentation is loading. Please wait.

Presentation is loading. Please wait.

Using Relational Databases and SQL Department of Computer Science California State University, Los Angeles Lecture 7:

Similar presentations


Presentation on theme: "Using Relational Databases and SQL Department of Computer Science California State University, Los Angeles Lecture 7:"— Presentation transcript:

1 Using Relational Databases and SQL Department of Computer Science California State University, Los Angeles Lecture 7:

2 Topics for Today GROUP BY HAVING‏ Set Operations Scripts Views

3 Aggregating in Groups Can we calculate this with a single query? Yes, but we need a way to group data together Solution: Use the GROUP BY clause

4 Grouping Data You can make an aggregate function return multiple values per table by grouping the table -- Count the number of male and female members. SELECT Gender, COUNT(*) FROM Members GROUP BY Gender; Because there are two gender groups, male and female, the COUNT function will return two values, one for each gender group

5 How GROUP BY Works GROUP BY begins by sorting the table based on the grouping attributes (in our case, Gender)‏ If any aggregate functions are present, GROUP BY causes each aggregate to be applied per-group rather than per-table GROUP BY then condenses the table so that each group only appears once in the table (if listed) and displays any aggregated values along with it

6 Grouping on Multiple Fields GROUP BY can use multiple fieldnames (similar to how you can sort using multiple fieldnames)‏ SELECT Genre, ArtistID, COUNT(*) FROM Titles GROUP BY Genre, ArtistID order by Genre Notice that the more fields you group by, the more results you get!

7 GROUP BY AND DISTINCT In some cases, using GROUP BY outside an aggregate would produce the same result as using DISTINCT SELECT Gender FROM Members GROUP BY Gender; SELECT DISTINCT Gender FROM Members; This is sloppy practice!

8 GROUP BY and DISTINCT GROUP BY works in conjunction with aggregate functions, DISTINCT does not GROUP BY affects how data is aggregated and thus has more work to do. DISTINCT just checks to see whether the value is a repeat In other words, if you are using GROUP BY, you had better have an aggregate function somewhere! From now on, using GROUP BY as a DISTINCT replacement is an error!

9 GROUP BY and Primary Keys Let’s say you want to display each title and the number of members who played on it. We could write SELECT Title, COUNT(M.FirstName) FROM titles JOIN XRefArtistsMembers USING(artistID) join members M using(memberID) GROUP BY Title However, if there are multiple albums with the same title, we will get garbage!

10 GROUP BY and Primary Keys The solution is to GROUP BY the primary key, TitleID, followed by Title SELECT Title, COUNT(M.FirstName) FROM titles JOIN XRefArtistsMembers USING(artistID) join members M using(memberID) GROUP BY titleID, title We now have a query that will always work, even if there are multiple titles with the same name

11 Filtering Aggregated Results Using the previous example, once we have our aggregated result table, is it possible to filter out certain groups, say where COUNT(*) = 1?

12 The HAVING Clause Yes, but we must have a way of filtering results AFTER aggregation! Solution is to use the HAVING clause The HAVING clause filters AFTER aggregation (this is why you CAN use aggregate functions in the HAVING clause) The WHERE clause filters BEFORE aggregation (this is why you CANNOT use aggregate functions in the WHERE clause)

13 HAVING Summary Using HAVING to filter out groups in an aggregated table In a HAVING clause, you may use: aggregate functions regular functions constant values grouping attributes

14 HAVING Clause Example Example: SELECT Genre, COUNT(*) FROM Titles J GROUP BY Genre HAVING COUNT(*) > 1;

15 Aggregating Distinct Values A normal SELECT DISTINCT query filters out duplicates after aggregation Therefore, if a field contains duplicate values, and you aggregate on that field, SELECT DISTINCT WILL NOT filter out duplicate values from being aggregated. This could produce incorrect answers with, for example, count() or avg()

16 Aggregating Distinct Values The solution is to use the DISTINCT keyword within the aggregate function Example: WRONG: SELECT DISTINCT COUNT(Firstname) FROM Members; There is only one value, the number 23! RIGHT: SELECT COUNT(DISTINCT(Firstname)) FROM Members;

17 Aggregating Distinct Values create table musicians( id int primary key, name varchar(30) ); create table bands( id int primary key, name varchar(30) ); create table xref( musicianId int, bandId int, instrument varchar(30), primary key ( musicianid, bandId, instrument) ); insert into musicians values(1, "Mick"); insert into musicians values(2, "Keith"); insert into musicians values(3, "Charlie"); insert into musicians values(4, "Ron"); insert into musicians values(5, "Jan"); insert into musicians values(6, "Dean"); insert into bands values(1, "Rolling Stones"); insert into bands values(2, "Jan and Dean"); insert into xref values(1, 1, "Vocals"); insert into xref values(2, 1, "Guitar"); insert into xref values(3, 1, "Drums"); insert into xref values(4, 1, "Guitar"); insert into xref values(5, 2, "Vocals"); insert into xref values(6, 2, "Vocals"); insert into xref values(6, 2, "Vuvuzuela");

18 Aggregating Distinct Values Find the number of different musicians in each band -- WRONG!!! SELECT b.name, COUNT(m.id) AS 'Musicians’FROM bands b JOIN xref x ON b.id = x.bandId JOIN musicians m ON x.musicianId = m.id GROUP BY bandID; This is incorrect since some musicians in this DB are listed more than once in xref because they play more than one instrument

19 Aggregating Distinct Values -- CORRECT!!! SELECT b.name, COUNT(distinct m.id) AS 'Musicians’ FROM bands b JOIN xref x ON b.id = x.bandId JOIN musicians m ON x.musicianId = m.id GROUP BY bandID; This is the correct answer!

20 Views Views are stored queries Views are part of the database schema Use views as though they were actual tables With complex DBs, may be better to query views than tables

21 Views Syntax: CREATE [OR REPLACE] VIEW view_name AS select_statement It’s a good practice to include the primary keys of the tables you select from. You may want to get the SELECT right first, then build the CREATE VIEW around it.

22 Views Example: CREATE VIEW SalespersonMembers AS SELECT s.lastname AS "salespersonName", s.salesID, m.lastname AS "memberName", m.memberID FROM salespeople s JOIN members m USING(salesID);

23 Views Select from views just as if they were tables: SELECT * FROM SalespersonMembers;

24 Views Benefits: Consider an application with queries and DML based on a third-party database Database manager changes the table definitions: application code breaks! Any changes require careful coordination between DB manager and application programmers Consider an app that queries against database views DB manager can change the tables but rewrite the views to maintain an unchanged interface Application programmers don’t have to worry about underlying changes

25 Union List all regions that contain members, studios, or both This is a SET operation, so there will be no duplicates. SELECT region FROM members UNION SELECT region FROM studios;

26 Summary Statistics with UNION Select 1, '< 2 minutes' As Length, Count(*) As NumTracks From Tracks Where LengthSeconds<120 Union Select 2, '2-3 minutes', Count(*)From Tracks Where LengthSeconds Between 120 and 180 Union Select 3, '>3 minutes', Count(*) From Tracks Where LengthSeconds>180;

27 UNION ALL Counts duplicates multiple time; breaks set definition of UNION Get number of electronic versions available for all tracks: Select Count(SoundFile) As Num_Electronic_Files from (Select MP3 As SoundFile From Tracks Where MP3=1 Union All Select RealAud From Tracks Where RealAud=1) as elec;

28 In Class Exercise Write SQL queries to find the following: By how many seconds does the length of the longest track in the db exceed that of the average track? List the title names and IDs with the number of tracks on each one


Download ppt "Using Relational Databases and SQL Department of Computer Science California State University, Los Angeles Lecture 7:"

Similar presentations


Ads by Google