Using Relational Databases and SQL Department of Computer Science California State University, Los Angeles Lecture 7:

Topics for Today: GROUP BY, HAVING, Set Operations, Scripts, Views

Aggregating in Groups
Can we calculate this with a single query?
Yes, but we need a way to group data together.
Solution: use the GROUP BY clause.

Grouping Data
You can make an aggregate function return one value per group, rather than one value per table, by grouping the table.
    -- Count the number of male and female members.
    SELECT Gender, COUNT(*)
    FROM Members
    GROUP BY Gender;
Because there are two gender groups, male and female, the COUNT function returns two values, one for each group.
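
The per-group count can be tried out with a minimal sketch like the one below. It uses Python's built-in sqlite3 module and a made-up five-row Members table (the rows and the extra ORDER BY are assumptions for illustration, not the course's Lyric database).

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, FirstName TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [(1, "Ann", "F"), (2, "Bob", "M"), (3, "Cal", "M"), (4, "Dee", "F"), (5, "Eli", "M")],
)

# One COUNT(*) value is produced for each distinct Gender.
rows = conn.execute(
    "SELECT Gender, COUNT(*) FROM Members GROUP BY Gender ORDER BY Gender"
).fetchall()
print(rows)  # [('F', 2), ('M', 3)]
```

Without the GROUP BY clause, the same COUNT(*) would return a single value for the whole table.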

How GROUP BY Works
Conceptually, GROUP BY begins by sorting the table on the grouping attributes (in our case, Gender).
If any aggregate functions are present, GROUP BY causes each aggregate to be applied per group rather than per table.
GROUP BY then condenses the table so that each group appears only once in the result (with its grouping values, if they are listed in the SELECT) and displays any aggregated values along with it.

Grouping on Multiple Fields
GROUP BY can use multiple field names (similar to how you can sort using multiple field names).
    SELECT Genre, ArtistID, COUNT(*)
    FROM Titles
    GROUP BY Genre, ArtistID
    ORDER BY Genre;
Notice that grouping by more fields splits the data into finer groups, so you usually get more result rows (and never fewer)!
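
To see how adding a grouping field splits groups, here is a small sketch using Python's sqlite3 module. The four Titles rows are invented for illustration, and ArtistID is added to the ORDER BY only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, Genre TEXT)"
)
# Hypothetical rows: artist 1 has two rock titles, artist 2 has one rock and one jazz.
conn.executemany(
    "INSERT INTO Titles VALUES (?, ?, ?)",
    [(1, 1, "rock"), (2, 1, "rock"), (3, 2, "rock"), (4, 2, "jazz")],
)

by_genre = conn.execute(
    "SELECT Genre, COUNT(*) FROM Titles GROUP BY Genre ORDER BY Genre"
).fetchall()
by_genre_artist = conn.execute(
    "SELECT Genre, ArtistID, COUNT(*) FROM Titles "
    "GROUP BY Genre, ArtistID ORDER BY Genre, ArtistID"
).fetchall()

print(by_genre)         # [('jazz', 1), ('rock', 3)]
print(by_genre_artist)  # [('jazz', 2, 1), ('rock', 1, 2), ('rock', 2, 1)]
```

The single rock group of three titles becomes two groups once ArtistID is part of the grouping.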

GROUP BY and DISTINCT
In some cases, using GROUP BY without an aggregate function produces the same result as using DISTINCT:
    SELECT Gender FROM Members GROUP BY Gender;
    SELECT DISTINCT Gender FROM Members;
Using GROUP BY this way is sloppy practice!

GROUP BY and DISTINCT
GROUP BY works in conjunction with aggregate functions; DISTINCT does not.
GROUP BY affects how data is aggregated and thus has more work to do.
DISTINCT just checks to see whether the value is a repeat.
In other words, if you are using GROUP BY, you had better have an aggregate function somewhere!
From now on, using GROUP BY as a DISTINCT replacement is an error!

GROUP BY and Primary Keys
Let's say you want to display each title and the number of members who played on it. We could write:
    SELECT Title, COUNT(M.FirstName)
    FROM titles
    JOIN XRefArtistsMembers USING(artistID)
    JOIN members M USING(memberID)
    GROUP BY Title;
However, if several albums share the same title, they collapse into a single group and their member counts are added together, so we will get garbage!

GROUP BY and Primary Keys
The solution is to GROUP BY the primary key, TitleID, followed by Title:
    SELECT Title, COUNT(M.FirstName)
    FROM titles
    JOIN XRefArtistsMembers USING(artistID)
    JOIN members M USING(memberID)
    GROUP BY titleID, title;
We now have a query that will always work, even if there are multiple titles with the same name.
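
The difference between the two groupings can be checked with a sketch along these lines, using Python's sqlite3 module. The schema is reduced to the columns the query needs, and the two albums that share the title 'Greatest Hits' are hypothetical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, Title TEXT);
CREATE TABLE XRefArtistsMembers (ArtistID INTEGER, MemberID INTEGER);
CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, FirstName TEXT);
-- Two different albums with the same title, by different artists.
INSERT INTO Titles VALUES (1, 1, 'Greatest Hits'), (2, 2, 'Greatest Hits');
INSERT INTO XRefArtistsMembers VALUES (1, 10), (1, 11), (2, 12);
INSERT INTO Members VALUES (10, 'Ann'), (11, 'Bob'), (12, 'Cal');
""")

JOINS = (
    "FROM Titles JOIN XRefArtistsMembers USING (ArtistID) "
    "JOIN Members M USING (MemberID) "
)

# Grouping on Title alone merges both albums into one group.
merged = conn.execute(
    "SELECT Title, COUNT(M.FirstName) " + JOINS + "GROUP BY Title"
).fetchall()

# Grouping on the primary key first keeps the albums apart.
separate = conn.execute(
    "SELECT Title, COUNT(M.FirstName) " + JOINS
    + "GROUP BY TitleID, Title ORDER BY TitleID"
).fetchall()

print(merged)    # [('Greatest Hits', 3)]
print(separate)  # [('Greatest Hits', 2), ('Greatest Hits', 1)]
```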

Filtering Aggregated Results
Using the previous example, once we have our aggregated result table, is it possible to filter out certain groups, say where COUNT(*) = 1?

The HAVING Clause
Yes, but we must have a way of filtering results AFTER aggregation!
The solution is to use the HAVING clause.
The HAVING clause filters AFTER aggregation (this is why you CAN use aggregate functions in the HAVING clause).
The WHERE clause filters BEFORE aggregation (this is why you CANNOT use aggregate functions in the WHERE clause).

HAVING Summary
Use HAVING to filter out groups in an aggregated table.
In a HAVING clause, you may use:
  aggregate functions
  regular functions
  constant values
  grouping attributes

HAVING Clause Example
Example:
    SELECT Genre, COUNT(*)
    FROM Titles
    GROUP BY Genre
    HAVING COUNT(*) > 1;
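
The before/after distinction between WHERE and HAVING can be observed directly. The following is a minimal sketch with Python's sqlite3 module and six invented Titles rows. The exact error raised for an aggregate in WHERE is SQLite-specific, so the code only checks that some sqlite3 error occurs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, Genre TEXT)")
# Hypothetical rows: three rock, two pop, one jazz.
conn.executemany(
    "INSERT INTO Titles VALUES (?, ?)",
    [(1, "rock"), (2, "rock"), (3, "rock"), (4, "pop"), (5, "pop"), (6, "jazz")],
)

# HAVING runs after grouping, so it can test the aggregate value.
popular = conn.execute(
    "SELECT Genre, COUNT(*) FROM Titles "
    "GROUP BY Genre HAVING COUNT(*) > 1 ORDER BY Genre"
).fetchall()
print(popular)  # [('pop', 2), ('rock', 3)]

# WHERE runs before grouping, so an aggregate there is rejected.
try:
    conn.execute("SELECT Genre FROM Titles WHERE COUNT(*) > 1 GROUP BY Genre")
    where_rejected = False
except sqlite3.Error:
    where_rejected = True
print(where_rejected)  # True
```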

Aggregating Distinct Values
A normal SELECT DISTINCT query filters out duplicates after aggregation.
Therefore, if a field contains duplicate values and you aggregate on that field, SELECT DISTINCT will NOT stop the duplicate values from being aggregated.
This can produce incorrect answers with, for example, COUNT() or AVG().

Aggregating Distinct Values
The solution is to use the DISTINCT keyword within the aggregate function.
Example:
WRONG:
    SELECT DISTINCT COUNT(Firstname) FROM Members;
The aggregate returns only one value (the number 23), so DISTINCT has nothing to remove!
RIGHT:
    SELECT COUNT(DISTINCT Firstname) FROM Members;
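
A quick way to see the difference is the sketch below, using Python's sqlite3 module and a hypothetical four-row Members table in which the first name 'Ann' appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Firstname TEXT)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cal")],
)

# DISTINCT is applied to the single result row, after COUNT has already run.
wrong = conn.execute("SELECT DISTINCT COUNT(Firstname) FROM Members").fetchall()

# DISTINCT inside the aggregate removes duplicates before counting.
right = conn.execute("SELECT COUNT(DISTINCT Firstname) FROM Members").fetchall()

print(wrong)  # [(4,)]
print(right)  # [(3,)]
```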

Aggregating Distinct Values
    CREATE TABLE musicians (
        id INT PRIMARY KEY,
        name VARCHAR(30)
    );
    CREATE TABLE bands (
        id INT PRIMARY KEY,
        name VARCHAR(30)
    );
    CREATE TABLE xref (
        musicianId INT,
        bandId INT,
        instrument VARCHAR(30),
        PRIMARY KEY (musicianId, bandId, instrument)
    );
    INSERT INTO musicians VALUES (1, 'Mick');
    INSERT INTO musicians VALUES (2, 'Keith');
    INSERT INTO musicians VALUES (3, 'Charlie');
    INSERT INTO musicians VALUES (4, 'Ron');
    INSERT INTO musicians VALUES (5, 'Jan');
    INSERT INTO musicians VALUES (6, 'Dean');
    INSERT INTO bands VALUES (1, 'Rolling Stones');
    INSERT INTO bands VALUES (2, 'Jan and Dean');
    INSERT INTO xref VALUES (1, 1, 'Vocals');
    INSERT INTO xref VALUES (2, 1, 'Guitar');
    INSERT INTO xref VALUES (3, 1, 'Drums');
    INSERT INTO xref VALUES (4, 1, 'Guitar');
    INSERT INTO xref VALUES (5, 2, 'Vocals');
    INSERT INTO xref VALUES (6, 2, 'Vocals');
    INSERT INTO xref VALUES (6, 2, 'Vuvuzuela');

Aggregating Distinct Values
Find the number of different musicians in each band.
    -- WRONG!!!
    SELECT b.name, COUNT(m.id) AS 'Musicians'
    FROM bands b
    JOIN xref x ON b.id = x.bandId
    JOIN musicians m ON x.musicianId = m.id
    GROUP BY bandID;
This is incorrect because some musicians in this DB are listed more than once in xref (they play more than one instrument), so they are counted once per instrument.

Aggregating Distinct Values
    -- CORRECT!!!
    SELECT b.name, COUNT(DISTINCT m.id) AS 'Musicians'
    FROM bands b
    JOIN xref x ON b.id = x.bandId
    JOIN musicians m ON x.musicianId = m.id
    GROUP BY bandID;
This is the correct answer!
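
Both versions can be run side by side against the sample data from the slides. This is a sketch using Python's sqlite3 module. The grouping is written as b.id, b.name and an ORDER BY is added so the output order is fixed, which are small assumptions on top of the slide's queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE musicians (id INT PRIMARY KEY, name VARCHAR(30));
CREATE TABLE bands (id INT PRIMARY KEY, name VARCHAR(30));
CREATE TABLE xref (
    musicianId INT, bandId INT, instrument VARCHAR(30),
    PRIMARY KEY (musicianId, bandId, instrument)
);
INSERT INTO musicians VALUES
    (1, 'Mick'), (2, 'Keith'), (3, 'Charlie'), (4, 'Ron'), (5, 'Jan'), (6, 'Dean');
INSERT INTO bands VALUES (1, 'Rolling Stones'), (2, 'Jan and Dean');
INSERT INTO xref VALUES
    (1, 1, 'Vocals'), (2, 1, 'Guitar'), (3, 1, 'Drums'), (4, 1, 'Guitar'),
    (5, 2, 'Vocals'), (6, 2, 'Vocals'), (6, 2, 'Vuvuzuela');
""")

QUERY = (
    "SELECT b.name, COUNT({counted}) AS Musicians "
    "FROM bands b "
    "JOIN xref x ON b.id = x.bandId "
    "JOIN musicians m ON x.musicianId = m.id "
    "GROUP BY b.id, b.name ORDER BY b.name"
)

# Counts one row per (musician, instrument) pair: Dean is counted twice.
naive = conn.execute(QUERY.format(counted="m.id")).fetchall()
# Counts each musician once per band.
distinct = conn.execute(QUERY.format(counted="DISTINCT m.id")).fetchall()

print(naive)     # [('Jan and Dean', 3), ('Rolling Stones', 4)]
print(distinct)  # [('Jan and Dean', 2), ('Rolling Stones', 4)]
```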

Views
Views are stored queries.
Views are part of the database schema.
Use views as though they were actual tables.
With complex DBs, it may be better to query views than tables.

Views
Syntax:
    CREATE [OR REPLACE] VIEW view_name AS select_statement
It's a good practice to include the primary keys of the tables you select from.
You may want to get the SELECT right first, then build the CREATE VIEW around it.

Views
Example:
    CREATE VIEW SalespersonMembers AS
    SELECT s.lastname AS "salespersonName", s.salesID,
           m.lastname AS "memberName", m.memberID
    FROM salespeople s
    JOIN members m USING(salesID);

Views
Select from views just as if they were tables:
    SELECT * FROM SalespersonMembers;
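
Creating and querying the view can be tried end to end with a sketch like this one, using Python's sqlite3 module. The salespeople and members rows are made up, and the tables carry only the columns the view needs. Note that SQLite accepts CREATE VIEW but not the OR REPLACE form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salespeople (salesID INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE members (memberID INTEGER PRIMARY KEY, lastname TEXT, salesID INTEGER);
INSERT INTO salespeople VALUES (1, 'Smith'), (2, 'Lee');
INSERT INTO members VALUES (100, 'Jones', 1), (101, 'Kim', 2), (102, 'Diaz', 1);

CREATE VIEW SalespersonMembers AS
    SELECT s.lastname AS "salespersonName", s.salesID,
           m.lastname AS "memberName", m.memberID
    FROM salespeople s JOIN members m USING (salesID);
""")

# The view is queried exactly like a table.
all_rows = conn.execute(
    "SELECT * FROM SalespersonMembers ORDER BY memberID"
).fetchall()
# It can also be filtered and projected like a table.
smith_members = conn.execute(
    "SELECT memberName FROM SalespersonMembers "
    "WHERE salespersonName = 'Smith' ORDER BY memberID"
).fetchall()

print(all_rows)       # [('Smith', 1, 'Jones', 100), ('Lee', 2, 'Kim', 101), ('Smith', 1, 'Diaz', 102)]
print(smith_members)  # [('Jones',), ('Diaz',)]
```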

Views
Benefits:
Consider an application with queries and DML based on a third-party database:
  The database manager changes the table definitions: application code breaks!
  Any changes require careful coordination between the DB manager and application programmers.
Consider an app that queries against database views:
  The DB manager can change the tables but rewrite the views to maintain an unchanged interface.
  Application programmers don't have to worry about underlying changes.
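
The "unchanged interface" idea can be sketched as follows with Python's sqlite3 module. The names MemberNames, people, personID, and surname are hypothetical, chosen only to show a table being restructured while the application's query text stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (memberID INTEGER PRIMARY KEY, lastname TEXT);
INSERT INTO members VALUES (1, 'Jones'), (2, 'Kim');
CREATE VIEW MemberNames AS SELECT memberID, lastname FROM members;
""")

# The application only ever runs this query against the view.
APP_QUERY = "SELECT memberID, lastname FROM MemberNames ORDER BY memberID"
before = conn.execute(APP_QUERY).fetchall()

# The DB manager restructures storage (new table and column names),
# then rewrites the view so it exposes the same column names as before.
conn.executescript("""
DROP VIEW MemberNames;
CREATE TABLE people (personID INTEGER PRIMARY KEY, surname TEXT);
INSERT INTO people SELECT memberID, lastname FROM members;
DROP TABLE members;
CREATE VIEW MemberNames AS
    SELECT personID AS memberID, surname AS lastname FROM people;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before)  # [(1, 'Jones'), (2, 'Kim')]
print(after)   # [(1, 'Jones'), (2, 'Kim')]
```

A query written directly against the members table would have failed after the change, while the view-based query is unaffected.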

Union
List all regions that contain members, studios, or both.
This is a set operation, so there will be no duplicates.
    SELECT region FROM members
    UNION
    SELECT region FROM studios;
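
The duplicate removal can be seen in a small sketch using Python's sqlite3 module. The region values are invented, and the trailing ORDER BY is added only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (memberID INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE studios (studioID INTEGER PRIMARY KEY, region TEXT);
-- 'CA' appears twice in members; 'NY' appears in both tables.
INSERT INTO members VALUES (1, 'CA'), (2, 'NY'), (3, 'CA');
INSERT INTO studios VALUES (1, 'NY'), (2, 'TX');
""")

regions = conn.execute(
    "SELECT region FROM members UNION SELECT region FROM studios ORDER BY region"
).fetchall()
print(regions)  # [('CA',), ('NY',), ('TX',)]
```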

Summary Statistics with UNION
    SELECT 1, '< 2 minutes' AS Length, COUNT(*) AS NumTracks
    FROM Tracks WHERE LengthSeconds < 120
    UNION
    SELECT 2, '2-3 minutes', COUNT(*)
    FROM Tracks WHERE LengthSeconds BETWEEN 120 AND 180
    UNION
    SELECT 3, '> 3 minutes', COUNT(*)
    FROM Tracks WHERE LengthSeconds > 180;
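
Here is a sketch of the bucketed summary using Python's sqlite3 module and six made-up track lengths. The Bucket alias and the final ORDER BY 1 are additions for this example, so that the numeric first column determines the row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tracks (TrackID INTEGER PRIMARY KEY, LengthSeconds INTEGER)")
conn.executemany(
    "INSERT INTO Tracks VALUES (?, ?)",
    [(1, 90), (2, 120), (3, 150), (4, 180), (5, 200), (6, 240)],
)

summary = conn.execute("""
    SELECT 1 AS Bucket, '< 2 minutes' AS Length, COUNT(*) AS NumTracks
    FROM Tracks WHERE LengthSeconds < 120
    UNION
    SELECT 2, '2-3 minutes', COUNT(*)
    FROM Tracks WHERE LengthSeconds BETWEEN 120 AND 180
    UNION
    SELECT 3, '> 3 minutes', COUNT(*)
    FROM Tracks WHERE LengthSeconds > 180
    ORDER BY 1
""").fetchall()

print(summary)  # [(1, '< 2 minutes', 1), (2, '2-3 minutes', 3), (3, '> 3 minutes', 2)]
```

Because BETWEEN is inclusive, tracks of exactly 120 or 180 seconds land in the middle bucket.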

UNION ALL
UNION ALL keeps duplicate rows, so a value that appears more than once is counted each time; the result is no longer a set.
Get the number of electronic versions available for all tracks:
    SELECT COUNT(SoundFile) AS Num_Electronic_Files
    FROM (SELECT MP3 AS SoundFile FROM Tracks WHERE MP3 = 1
          UNION ALL
          SELECT RealAud FROM Tracks WHERE RealAud = 1) AS elec;
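
Why UNION ALL is needed here becomes clear when the same count is run with plain UNION. The sketch below uses Python's sqlite3 module and four hypothetical tracks whose MP3 and RealAud flags are 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tracks (TrackID INTEGER PRIMARY KEY, MP3 INTEGER, RealAud INTEGER)"
)
# Two tracks have an MP3 file and two have a RealAudio file (four files in total).
conn.executemany(
    "INSERT INTO Tracks VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 0), (3, 0, 1), (4, 0, 0)],
)

COUNT_FILES = (
    "SELECT COUNT(SoundFile) AS Num_Electronic_Files FROM ("
    "SELECT MP3 AS SoundFile FROM Tracks WHERE MP3 = 1 "
    "{op} "
    "SELECT RealAud FROM Tracks WHERE RealAud = 1) AS elec"
)

with_all = conn.execute(COUNT_FILES.format(op="UNION ALL")).fetchone()[0]
with_union = conn.execute(COUNT_FILES.format(op="UNION")).fetchone()[0]

print(with_all)    # 4
print(with_union)  # 1
```

Every selected value is the number 1, so plain UNION collapses them into a single row and the count is wrong.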

In Class Exercise
Write SQL queries to find the following:
  By how many seconds does the length of the longest track in the db exceed that of the average track?
  List the title names and IDs with the number of tracks on each one.