ZEIT2301 Design of Information Systems
SQL: Computing Statistics
School of Engineering and Information Technology
Dr Kathryn Merrick
Topic 11: SQL Computing Statistics
In this lecture you will learn to use functions in SQL to compute simple statistics on data:
1. Aggregating functions
2. Ordering functions
3. String functions
4. Date functions
1. Aggregate Functions
Functions that operate on a single column (or expression) and return a single value:
COUNT – counts the number of values
SUM – returns the total of the values
AVG – returns the average of the values
MIN – returns the minimum value
MAX – returns the maximum value
Used in the SELECT clause.
NOT allowed in the WHERE clause (a very common mistake).
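To see the WHERE restriction in practice, the following is a minimal sketch that runs the queries through Python's built-in sqlite3 module. The choice of SQLite and the three sample rows are assumptions made for illustration; other database systems report the error with different wording.

```python
import sqlite3

# In-memory SQLite database; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sportClub (sport TEXT, annualBudget REAL)")
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", 5000.0), ("Chess", 800.0), ("Rowing", 3200.0)],
)

# Aggregate function in the SELECT clause: allowed.
average = conn.execute("SELECT AVG(annualBudget) FROM sportClub").fetchone()[0]

# Aggregate function directly in the WHERE clause: rejected by the database.
try:
    conn.execute(
        "SELECT sport FROM sportClub WHERE annualBudget > AVG(annualBudget)"
    )
    where_rejected = False
except sqlite3.Error as exc:
    where_rejected = True
    print("Database refused the query:", exc)

print("Average budget:", average)
```

The first query succeeds, while the second raises an error before any rows are returned.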
COUNT()
sportClub (sport, contactNo, sponsor, president, annualBudget)

How many clubs are there?
    SELECT COUNT(*) FROM sportClub;
* is a special shorthand; the query counts all rows of the table.

How many club presidents are there?
    SELECT COUNT(president) FROM sportClub;
Does not count nulls in the "president" column.

Each query returns a table with one row and one column.
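The difference between COUNT(*) and COUNT(column) only shows up when the column contains nulls. The sketch below assumes SQLite (via Python's sqlite3 module) and uses made-up club rows, one of which has no president.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub (sport TEXT, contactNo TEXT, sponsor TEXT, "
    "president TEXT, annualBudget REAL)"
)
# Made-up sample rows; Python's None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "555-0101", "Acme", "Lee", 5000.0),
        ("Chess", "555-0102", "Acme", None, 800.0),
        ("Rowing", "555-0103", "Globex", "Singh", 3200.0),
    ],
)

# COUNT(*) counts every row of the table.
club_count = conn.execute("SELECT COUNT(*) FROM sportClub").fetchone()[0]

# COUNT(president) skips the row whose president is NULL.
president_count = conn.execute(
    "SELECT COUNT(president) FROM sportClub"
).fetchone()[0]

print(club_count, president_count)
```

With this sample data the two counts differ by one, the club without a president.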
COUNT(DISTINCT)
sportClub (sport, contactNo, sponsor, president, annualBudget)

How many different club sponsors are there?
    SELECT COUNT(DISTINCT sponsor) FROM sportClub;
DISTINCT discards duplicates before counting.
(Supported by Oracle and SQL Server but not by Access.)
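Where COUNT(DISTINCT ...) is unavailable, one possible workaround is to remove duplicates in a subquery and count the remaining rows. The sketch below assumes SQLite (which supports both forms) and made-up sponsor data; the subquery alias s is a hypothetical name, and the subquery form has not been tested against Access here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sportClub (sport TEXT, sponsor TEXT)")
# Made-up sample rows: two clubs share a sponsor, one has none (NULL).
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", "Acme"), ("Chess", "Acme"), ("Rowing", "Globex"), ("Darts", None)],
)

# Direct form: duplicates and NULLs are both ignored.
distinct_direct = conn.execute(
    "SELECT COUNT(DISTINCT sponsor) FROM sportClub"
).fetchone()[0]

# Subquery form: DISTINCT keeps NULL as a value of its own, so it is
# filtered out explicitly to match the direct form.
distinct_subquery = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT sponsor FROM sportClub WHERE sponsor IS NOT NULL) AS s"
).fetchone()[0]

print(distinct_direct, distinct_subquery)
```

Both queries report the same number of different sponsors for this data.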
SUM()
sportClub (sport, contactNo, sponsor, president, annualBudget)

Each club has an annual budget. What is the total budget amount for all clubs?
    SELECT SUM(annualBudget) FROM sportClub;
Query returns a table with one row and one column.
Hint: The NRL Salary Cap for 2011 is $4.3m for the 25 highest paid players at each club.
AVG(), MIN(), MAX()
Find the average, minimum and maximum of the clubs' annual budgets:
    SELECT AVG(annualBudget), MIN(annualBudget), MAX(annualBudget) FROM sportClub;
Query returns one row with three columns.
Review: Column Name Aliases
Columns can be renamed in the result table using the AS clause to give more meaningful output.
Also useful to avoid the display of system-generated column names for calculated columns (MS Access uses "Expr1").
    SELECT SUM(annualBudget) AS TotalBudget FROM sportClub;
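The effect of AS can be checked by inspecting the column names of the result table. This is a minimal sketch assuming SQLite and made-up budget figures; the alias names avgBudget, minBudget and maxBudget are hypothetical, chosen in the style of TotalBudget.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sportClub (sport TEXT, annualBudget REAL)")
# Made-up sample budgets.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", 5000.0), ("Chess", 800.0), ("Rowing", 3200.0)],
)

cur = conn.execute(
    "SELECT AVG(annualBudget) AS avgBudget, "
    "MIN(annualBudget) AS minBudget, "
    "MAX(annualBudget) AS maxBudget, "
    "SUM(annualBudget) AS TotalBudget "
    "FROM sportClub"
)

# cursor.description holds one entry per result column; item 0 is its name.
column_names = [col[0] for col in cur.description]
row = cur.fetchone()

print(column_names)
print(row)
```

The result is a single row with four columns, each labelled with its alias rather than a system-generated name.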
The Bike Database Revisited

Bike name*   Number of riders*   Centre of mass height
Harley
Harley
Honda
Honda

Road conditions*   Coefficient of friction
Icy                0.1
Wet                0.5
Dry                0.9

Scenario ID*   Bike name   Number of riders   Road conditions   Can stoppie
1              Harley      1                  Dry               false
2              Harley      2                  Dry               false
3              Honda       1                  Dry               true
4              Honda       2                  Dry               true

Bike name*   Wheelbase
Harley       1.588
Honda        1.458
Aliasing Examples
    SELECT MIN(wheelbase) AS minWheelbase FROM Bikes;
    SELECT MAX(scenarioID) AS maxScenarioID FROM Scenarios;
    SELECT AVG(wheelbase) AS avgWheelbase FROM Bikes;
    SELECT COUNT(wheelbase) AS smallWheelbases FROM Bikes WHERE wheelbase < 1.5;
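As a rough check of the Bikes queries, the sketch below loads the two wheelbase values from the bike database into SQLite and runs three of the aliased queries. SQLite and the column name bikeName are assumptions; the slides only give the column heading "Bike name".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# bikeName is an assumed column name for the "Bike name" column.
conn.execute("CREATE TABLE Bikes (bikeName TEXT PRIMARY KEY, wheelbase REAL)")
conn.executemany(
    "INSERT INTO Bikes VALUES (?, ?)",
    [("Harley", 1.588), ("Honda", 1.458)],
)

min_wheelbase = conn.execute(
    "SELECT MIN(wheelbase) AS minWheelbase FROM Bikes"
).fetchone()[0]

avg_wheelbase = conn.execute(
    "SELECT AVG(wheelbase) AS avgWheelbase FROM Bikes"
).fetchone()[0]

# WHERE filters the rows first, then COUNT counts what is left.
small_wheelbases = conn.execute(
    "SELECT COUNT(wheelbase) AS smallWheelbases FROM Bikes WHERE wheelbase < 1.5"
).fetchone()[0]

print(min_wheelbase, round(avg_wheelbase, 3), small_wheelbases)
```

Only the Honda has a wheelbase under 1.5, so the last query counts a single row.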
Aggregating Results
The GROUP BY clause is used in conjunction with the aggregate functions to group the result set by one or more columns.
Orders (orderID, orderDate, orderPrice, customer)
Eg: to find the total value of all orders placed by each customer, we can use GROUP BY to group the rows by customer:
    SELECT customer, SUM(orderPrice) FROM Orders GROUP BY customer
Aggregating Results Solution

Orders
orderID   orderDate    orderPrice   customer
1         2008/11/12   1000         Hansen
2         2008/10/23   1600         Nilsen
3         2008/09/02   700          Hansen
4         2008/09/03   300          Hansen
5         2008/08/30   2000         Jensen
6         2008/10/04   100          Nilsen

Query result
customer   SUM(orderPrice)
Hansen     2000
Nilsen     1700
Jensen     2000
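The grouped totals can be reproduced by loading the six Orders rows into a database and running the GROUP BY query. The sketch below assumes SQLite through Python's sqlite3 module; the column types are a guess, since the slides do not state them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides list only the column names.
conn.execute(
    "CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, orderDate TEXT, "
    "orderPrice INTEGER, customer TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# One output row per customer, each with the sum of that customer's orders.
totals = dict(
    conn.execute("SELECT customer, SUM(orderPrice) FROM Orders GROUP BY customer")
)

print(totals)
```

The row order of a GROUP BY result is not guaranteed without ORDER BY, which is why the sketch collects the rows into a dictionary keyed by customer.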
Filtering Groups
Individual rows can be filtered using a WHERE clause BUT groups must be filtered using a HAVING clause.
Eg: suppose we only want to display customer order totals less than $2000:
    SELECT customer, SUM(orderPrice) FROM Orders GROUP BY customer HAVING SUM(orderPrice) < 2000

customer   SUM(orderPrice)
Nilsen     1700
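The contrast between WHERE and HAVING is easiest to see by running both against the same data. This is a sketch assuming SQLite and the six Orders rows from the earlier slide; the second query is an added illustration, not one taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, orderDate TEXT, "
    "orderPrice INTEGER, customer TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# HAVING filters whole groups, after the totals have been computed.
having_result = conn.execute(
    "SELECT customer, SUM(orderPrice) FROM Orders "
    "GROUP BY customer HAVING SUM(orderPrice) < 2000"
).fetchall()

# WHERE filters individual rows, before any grouping happens.
where_result = conn.execute(
    "SELECT customer, SUM(orderPrice) FROM Orders "
    "WHERE orderPrice < 2000 GROUP BY customer ORDER BY customer"
).fetchall()

print(having_result)
print(where_result)
```

HAVING keeps only the customer whose total is under 2000. WHERE instead drops the single order priced at 2000, so that customer disappears while the other two keep their full totals.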
In Class Exercise
What is the result of the following query on the Orders table?
    SELECT customer, SUM(orderPrice) FROM Orders WHERE customer='Hansen' OR customer='Jensen' GROUP BY customer HAVING SUM(orderPrice) > 1500

Orders
orderID   orderDate    orderPrice   customer
1         2008/11/12   1000         Hansen
2         2008/10/23   1600         Nilsen
3         2008/09/02   700          Hansen
4         2008/09/03   300          Hansen
5         2008/08/30   2000         Jensen
6         2008/10/04   100          Nilsen
2. Ordering Functions
Find the first value of the orderPrice column:
    SELECT FIRST(orderPrice) FROM Orders
Equivalent to:
    SELECT orderPrice FROM Orders ORDER BY orderID LIMIT 1
Find the last value of the orderPrice column:
    SELECT LAST(orderPrice) FROM Orders
Equivalent to:
    SELECT orderPrice FROM Orders ORDER BY orderID DESC LIMIT 1
FIRST() and LAST() are available in MS Access; the ORDER BY ... LIMIT form is for systems that support LIMIT, such as MySQL.
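The ORDER BY ... LIMIT form can be tried directly in SQLite, which supports LIMIT but has no FIRST() or LAST() functions. The sketch below assumes SQLite and reuses the orderID and orderPrice values from the Orders table, leaving out the other columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the two columns the queries need; values are from the Orders table.
conn.execute("CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, orderPrice INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(1, 1000), (2, 1600), (3, 700), (4, 300), (5, 2000), (6, 100)],
)

# "First" means the row with the smallest orderID.
first_price = conn.execute(
    "SELECT orderPrice FROM Orders ORDER BY orderID LIMIT 1"
).fetchone()[0]

# "Last" means the row with the largest orderID.
last_price = conn.execute(
    "SELECT orderPrice FROM Orders ORDER BY orderID DESC LIMIT 1"
).fetchone()[0]

print(first_price, last_price)
```

The explicit ORDER BY matters: without it, a database is free to return rows in any order, so "first" and "last" would not be well defined.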
3. String Functions
Functions that operate on strings (varchars):
UCASE() – convert a string to upper case
LCASE() – convert a string to lower case
MID() – extract characters from the middle of a string
LEN() – find the length of a string
UCASE()

Persons
personID   lastName    firstName   address        city
1          Hansen      Ola         Timoteivn 10   Sandnes
2          Svendson    Tove        Borgvn 23      Sandnes
3          Pettersen   Kari        Storgt 20      Stavanger

    SELECT UCASE(lastName) AS lastName, firstName FROM Persons

lastName    firstName
HANSEN      Ola
SVENDSON    Tove
PETTERSEN   Kari
MID()
    SELECT MID(city, 1, 4) AS SmallCity FROM Persons
Arguments: column name, start character, number of characters to extract.

SmallCity
Sand
Sand
Stav
LEN()
    SELECT LEN(address) AS LengthOfAddress FROM Persons

LengthOfAddress
12
9
9
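The names of string functions vary between database systems. As a sketch of the same three operations, the code below assumes SQLite, where the corresponding functions are UPPER(), SUBSTR() and LENGTH() rather than UCASE(), MID() and LEN(). The data is the Persons table from the UCASE() slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (personID INTEGER PRIMARY KEY, lastName TEXT, "
    "firstName TEXT, address TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# SQLite spellings: UPPER ~ UCASE, SUBSTR ~ MID, LENGTH ~ LEN.
# SUBSTR(city, 1, 4) starts at character 1 and takes 4 characters.
rows = conn.execute(
    "SELECT UPPER(lastName) AS lastName, "
    "SUBSTR(city, 1, 4) AS SmallCity, "
    "LENGTH(address) AS LengthOfAddress "
    "FROM Persons ORDER BY personID"
).fetchall()

for row in rows:
    print(row)
```

Each person produces one output row, so the two people living in Sandnes both yield "Sand".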
4. Date Functions
Functions for manipulating dates
NOW() – get the current system date and time
    SELECT productName, unitPrice, NOW() AS perDate FROM Products

Products
prod_Id   productName   unit     unitPrice
1         Jarlsberg     1000 g
          Mascarpone    1000 g
          Gorgonzola    1000 g   15.67

Query result
productName   unitPrice   perDate
Jarlsberg                 /7/ :25:02 AM
Mascarpone                /7/ :25:02 AM
Gorgonzola                /7/ :25:02 AM
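NOW() is not available in every database system. The sketch below assumes SQLite, which has no NOW() function but supports the standard CURRENT_TIMESTAMP keyword for the same purpose. Only the Gorgonzola row is loaded, since it is the one row whose price is given, and the table is reduced to the columns the query needs.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
# Reduced Products table: just the columns used by the query.
conn.execute("CREATE TABLE Products (productName TEXT, unit TEXT, unitPrice REAL)")
conn.execute("INSERT INTO Products VALUES ('Gorgonzola', '1000 g', 15.67)")

# CURRENT_TIMESTAMP stands in for NOW(); SQLite returns it as UTC text
# in the form 'YYYY-MM-DD HH:MM:SS'.
row = conn.execute(
    "SELECT productName, unitPrice, CURRENT_TIMESTAMP AS perDate FROM Products"
).fetchone()

# Parsing the text confirms that it is a well-formed date and time.
per_date = datetime.strptime(row[2], "%Y-%m-%d %H:%M:%S")

print(row[0], row[1], per_date)
```

The same timestamp is attached to every row of a single query, which matches the identical perDate values shown in the query result.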
Summary
After today's lecture you should be able to write or interpret queries that include:
Aggregating functions
Ordering functions
String functions
Date functions