Chapter 11 Group Functions (up to p.402)

Chapter 11 Group Functions (up to p.402)
Jason C. H. Chen, Ph.D. Professor of MIS School of Business Gonzaga University Spokane, WA USA

Objectives Differentiate between single-row and multiple-row functions
Use the SUM and AVG functions for numeric calculations Use the COUNT function to return the number of records containing non-NULL values Use COUNT(*) to include records containing NULL values Use the MIN and MAX functions with nonnumeric fields

Objectives (continued)
Determine when to use the GROUP BY clause to group data Identify when the HAVING clause should be used List the order of precedence for evaluating WHERE, GROUP BY, and HAVING clauses State the maximum depth for nesting group functions Nest a group function inside of a single-row function

Objectives (continued)
Calculate the standard deviation and variance of a set of data, using the STDDEV and VARIANCE functions Explain the concept of multidimensional analysis Perform enhanced aggregation grouping with the GROUPING SETS, CUBE, and ROLLUP Use composite columns and concatenated groupings in grouping operations

1. Download chapter 11 files from Bb to c:\oradata\chapter11\
Refresh the Database 1. Download chapter 11 files from Bb to c:\oradata\chapter11\ 2. Run the following script file Start c:\oradata\chapter11\JLDB_Build_11.sql

Group Functions Return one result per group of rows processed Are also called multiple-row and aggregate functions All group functions ignore NULL values except COUNT(*) Use DISTINCT to suppress duplicate values

Added Clauses Figure SELECT statement syntax

Calculates total amount stored in a numeric column for a group of rows
SUM Function Calculates total amount stored in a numeric column for a group of rows Figure Using the SUM function to calculate order profit

Calculates the average of numeric values in a specified column
AVG Function Calculates the average of numeric values in a specified column Figure Using the AVG function to calculate average profit

-- chapter 11, Figure 11-5(B); p.389
Figure Using the AVG function to calculate average profit -- chapter 11, Figure 11-5(B); p.389 SELECT TO_CHAR(AVG( retail - cost), '$999.99') "Average Profit" FROM books WHERE category = 'COMPUTER';

COUNT Function Two purposes Count non-NULL values
Count total records, including those with NULL values

COUNT Function – Non-NULL Values
Include column name in argument to count number of occurrences Figure Using the COUNT function with the DISTINCT option

COUNT Function – NULL Values
Include asterisk in argument to count number of rows Figure Using the COUNT(*) function to include NULL values

MAX Function Returns largest value
Figure Using the MAX function on numeric data

Returns the smallest value
MIN Function Returns the smallest value Figure Using the MIN function on date data

Datatypes The COUNT, MIN, and MAX functions can be used on values with character, numeric, and date datatypes

Grouping Data GROUP BY clause Used to group data
Must be used for any individual column in the SELECT clause with a group function Cannot reference column aliases

GROUP BY Example Figure Adding the GROUP BY clause

Common Error A common error is missing a GROUP BY clause for nonaggregated columns in the SELECT clause Figure Flawed query: Including both aggregate and nonaggregate columns requires a GROUP BY clause

It is an individual record (just “Profit” not “Highest Profit”)
Figure Inappropriate use of GROUP BY

Figure 11-20 Calculate the total amount due by each customer and order

Restricting Aggregated Output
HAVING clause serves as the WHERE clause for grouped data. It is used to eliminate certain groups from further consideration. 4 records are eliminated Figure Using a HAVING clause to restrict which groups are displayed

Restricting Aggregated Output (continued)
When included in the same SELECT statement, the clauses are evaluated in the order of: WHERE GROUP BY HAVING

Restricting Aggregated Output (continued)
Figure Using the WHERE, GROUP BY, and HAVING clauses

Figure 11-23 Using a HAVING clause to restrict grouped output

Figure 11-24 Filtering correctly with the WHERE and HAVING clauses

It is quite inefficient and considered poor SQL programming practice
It is quite inefficient and considered poor SQL programming practice. The statement must process ALL rows in the BOOKS table with the aggregated calculation and then eliminate categories. Figure Filtering incorrectly with the HAVING clauses Figure Filtering correctly with the WHERE and HAVING clauses

Inner function is resolved first Maximum nesting depth: 2
Nesting Functions Inner function is resolved first Maximum nesting depth: 2 Figure Nesting group functions

Exercises Practice all the examples in the text. A Script file is available on the Bb (file name: ch11Queries.sql) After completing all examples, do the HW. In-class Exercise #7 (p.424)

Homework - Hands-On Assignments
me with one attachment (Oracle_ch11_Spool_Lname_Fname.) to: with subject title of Bmis441-01_Oracle_ch11 (or Bmis441-02_Oracle_ch11) Read and Practice all examples on Chapters 11 1. Run the script files (in the folder \oradata\chapter11\): JLDB_Build_11.sql 2. Read Oracle assignment and create a script file Oracle_ch11_Lname_Fname.sql for questions (ALL EVEN problems; pp ) on “Hands-on Assignments”. Use appropriate COLUMN or other SQL commands to produce readable outputs (or your grade will be discounted –see a sample output on the Bb) 3. Execute and test one problem at a time and make sure they are all running successfully. 4. When you done, spool the script files (see next slide for spooling instructions) and the file (Oracle_ch11_Spool_Lname_Fname.txt) to me by the midnight before the next class.

How to Spool your Script and Output Files
After you tested the script file of Oracle_ch11_Lname_Fname.sql successfully, follow the instructions below to spool both script and output files: Step 0. Run the following script file from SQL*Plus (since you have created JLDB tables) Start c:\oradata\chapter11\JLDB_Build_11.sql 1. type the following on SQL> Spool c:\oradata\Oracle_ch11_Spool_Lname_Fname.txt (make sure your name is entered) 2. open Oracle_ch11_Lname_Fname.sql that you already tested 3. copy and paste all the SQL commands (including all comments) to the SQL*PLUS 4. type Spool Off on the SQL> The output should contain your personal information, all SQL commands and their solution on the .txt file and saved in C: drive (oradata\ folder) me with the spooled file (.txt) with attachment to: with subject title of Bmis441-01_Oracle_ch11 (or Bmis441-02_Oracle_ch11)

Summary The AVG, SUM, STDDEV, and VARIANCE functions are used only with numeric fields The COUNT, MAX, and MIN functions can be applied to any datatype The AVG, SUM, MAX, MIN, STDDEV, and VARIANCE functions all ignore NULL values By default, the AVG, SUM, MAX, MIN, COUNT, STDDEV, and VARIANCE functions include duplicate values

Summary (continued) The GROUP BY clause is used to divide table data into groups If a SELECT clause contains both an individual field name and a group function, the field name must also be included in a GROUP BY clause The HAVING clause is used to restrict groups in a group function Group functions can be nested to a depth of only two. The inner function is always performed first, using the specified grouping. The results of the inner function are used as input for the outer function.

Summary (continued) The STDDEV and VARIANCE functions are used to perform statistical analyses on a set of data GROUPING SETS operations can be used to perform multiple GROUP BY aggregations with a single query The CUBE extension of the GROUP BY calculates aggregations for all possible combinations or groupings of columns included The ROLLUP extension of the GROUP BY calculates increasing levels of accumulated subtotals for the column list provided Composite columns and concatenated groupings can be used in GROUPING SETS, CUBE, and ROLLUP operations The GROUP_ID function helps eliminate duplicate grouping results

STDDEV Function Figure Using the STDDEV function

Determines data dispersion within a group
VARIANCE Function Determines data dispersion within a group Figure Using the VARIANCE function

Enhanced Aggregation for Reporting
Oracle provides extensions to the GROUP BY clause, which allow both aggregation across multiple dimensions or the generation of increasing levels of subtotals with a single SELECT statement A dimension is a term used to describe any category used in analyzing data, such as time, geography, and product line Each dimension could contain various levels of aggregation; for example, the time dimension may include aggregation by month, quarter, and year; the product dimension may include product type, sales, store, region and month.

Figure Slicing a data cube
REGION Hoffer’s text (chapter 11) CUSTOMER

Excel Pivot Table Example
Figure A pivot table with two dimensions on a row

Excel Pivot Table Example (continued)
Figure A pivot table with one row and one column dimension

GROUPING SETS The grouping sets expression is the component on which the other GROUP BY extensions, ROLLUP and CUBE, are built. With this extension, you can use a single query statement to perform multiple GROUP BY clauses. The single query in Figure produces the average retail price for books in four groupings: 1) publisher (the Name column) and category, 2) category, 3) publisher, and 4) overall average. -- chapter 11, Figure 11-32; p.408 SELECT name, category, COUNT(isbn), TO_CHAR(AVG(retail), '99.99') "Avg Retail" FROM publisher JOIN books USING (pubid) WHERE pubid IN (2,3,5) GROUP BY GROUPING SETS (name, category, (name, category), ()) Try without GROUING SETS

Without GROUPING SETS -- chapter 11, Figure 11-32(a); p.408
SELECT name, category, COUNT(isbn), TO_CHAR(AVG(retail), '99.99') "Avg Retail" FROM publisher JOIN books USING (pubid) WHERE pubid IN (2,3,5) GROUP BY name, category;

Grouping Sets Figure Using a GROUPING SETS expression in a GROUP BY clause

The CUBE Extension The CUBE extension of GROUP BY instructs Oracle to perform aggregations for all possible combinations of the specified columns (e.g., two columns: Name (publisher name) and Category). The outputs matches Figure If you need only a subset of the four aggregate levels calculated, you must use the GROUPING SETS expression because the CUBE extension always performs all aggregation levels. Adding a GROUPING function to the CUBE extension (Fig ) to identify subtotal rows in the results (helpful in labeling sorting, and restricting output).

CUBE Figure Using the CUBE extension of GROUP BY

Figure 11-35 The GROUPING function returns a 1 to identify subtotal rows

The ROLLUP Extension The ROLLUP extension of GROUP BY calculates cumulative subtotals for the specified columns. If multiple columns are indicated, subtotals are performed for each column in the argument list, except the one on the far right. A grand total is also calculated.

ROLLUP Figure Using the ROLLUP extension of GROUP BY

Category is the column outside the ROLLUP and is considered the aggregate value.
A subtotal is calculated for the aggregate value as well as for each unique value of the ROLLUP column in the aggregate value – i.e., by Category and each Name in each Category . Figure Using a partial ROLLUP

Chapter 11 Group Functions (up to p.402)

Similar presentations

Presentation on theme: "Chapter 11 Group Functions (up to p.402)"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Chapter 11 Group Functions (up to p.402)

Similar presentations

Presentation on theme: "Chapter 11 Group Functions (up to p.402)"— Presentation transcript:

Similar presentations

About project

Feedback