Mark Inman U.S. Navy (Naval Sea Logistics Center) Session #213 Analytic SQL for Beginners
Speaker Qualifications
– Mark Inman, IT Specialist, U.S. Navy
– Oracle Certified Professional, Database Administrator 9i
– Has given a similar presentation to coworkers
Background
– Analytic SQL was introduced in Oracle 8i.
– Where in the Oracle documentation? Data Warehousing Guide: “SQL for Analysis” or “SQL for Analysis and Reporting”.
– Good for everyday use.
Analytic Syntax
Allowed only in the column list (or the final ORDER BY clause)
Form of an analytic expression:
– Function
– Partition Clause
– Order by Clause
– Windowing Clause
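The four parts listed above can be seen together in one expression. The sketch below is only an illustration: it runs the SQL through Python's built-in sqlite3 module as a stand-in for Oracle (this assumes SQLite 3.25 or later, which added window functions), and the `sales` table and its rows are hypothetical, not from the presentation.

```python
import sqlite3

# Hypothetical demo table and data; not part of the original presentation.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales (region text, day integer, amount integer)")
conn.executemany(
    "insert into sales values (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 20), ("east", 3, 30),
     ("west", 1, 5), ("west", 2, 15)],
)

running_totals = conn.execute(
    """
    select region, day, amount,
           sum(amount)                        -- function
             over (partition by region        -- partition clause
                   order by day               -- order by clause
                   rows between unbounded preceding
                            and current row   -- windowing clause
             ) as running_total
    from sales
    order by region, day
    """
).fetchall()

for row in running_totals:
    print(row)
```

Each row keeps its detail columns and gains a running total that restarts for every region.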
Analytic Universe
[Table: analytic function families (Ranking, Windowing Aggregate, Reporting Aggregate, RATIO_TO_REPORT, LAG/LEAD, FIRST/LAST, Linear Regression, Inverse Percentile) against the columns Non-Analytic Available, PARTITION BY, ORDER BY, WINDOWING]
Objective 1 To get aggregate and detail data in the same query – without selecting the same table twice.
Simple Example Analytic
    select object_id,
           owner,
           max(object_id) over () as max_object_id  /* ANALYTIC EXPRESSION */
    from thing
    where rownum <= 5;

    OBJECT_ID OWNER MAX_OBJECT_ID
    SYS SYS SYS SYS SYS 44
Simple Example Analytic Query Plan
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    WINDOW (BUFFER)
    2  1      COUNT (STOPKEY)
    3  2        TABLE ACCESS (FULL) OF 'THING' (TABLE)
Duplicate Non-Analytic Attempt 1
    select
      object_id
    , owner
    , max(object_id)
    from thing
    where rownum <= 5
    group by object_id;

    , owner
      *
    ERROR at line 3:
    ORA-00979: not a GROUP BY expression
Oops! OWNER is selected but is neither aggregated nor listed in the GROUP BY.
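Duplicate Non-Analytic Attempt 2
    select object_id,
           owner,
           max(object_id) max_object_id
    from thing
    where rownum <= 5
    group by object_id, owner;

    OBJECT_ID OWNER MAX_OBJECT_ID
    SYS SYS SYS SYS SYS 44
The query runs, but every row in a group shares the same OBJECT_ID, so MAX_OBJECT_ID only repeats each row's own OBJECT_ID instead of the overall maximum.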
Duplicate Non-Analytic Success in 3
    select object_id,
           owner,
           z.max_object_id
    from thing,
         ( select max(object_id) max_object_id
           from thing
           where rownum <= 5 ) z
    where rownum <= 5;

    OBJECT_ID OWNER MAX_OBJECT_ID
    SYS SYS SYS SYS SYS 44
Non-Analytic Query Plan
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    COUNT (STOPKEY)
    2  1      NESTED LOOPS
    3  2        VIEW
    4  3          SORT (AGGREGATE)
    5  4            COUNT (STOPKEY)
    6  5              TABLE ACCESS (FULL) OF 'THING' (TABLE)
    7  2        TABLE ACCESS (FULL) OF 'THING' (TABLE)
Two table scans – but so what – it is fast!
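To see that the analytic form and the two-scan workaround return the same rows, here is a minimal sketch using Python's sqlite3 module in place of Oracle (assuming SQLite 3.25 or later). The five rows are made up for illustration, and Oracle's `rownum` filter is left out because SQLite has no equivalent pseudocolumn.

```python
import sqlite3

# Hypothetical stand-in for the THING table; the values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table thing (object_id integer, owner text)")
conn.executemany(
    "insert into thing values (?, ?)",
    [(3, "SYS"), (44, "SYS"), (17, "SYSTEM"), (29, "SYS"), (8, "SYSTEM")],
)

# Analytic: detail rows plus the overall maximum, reading the table once.
analytic = conn.execute(
    """
    select object_id, owner,
           max(object_id) over () as max_object_id
    from thing
    order by object_id
    """
).fetchall()

# Non-analytic: the same result needs a second pass over the table.
non_analytic = conn.execute(
    """
    select t.object_id, t.owner, z.max_object_id
    from thing t
    cross join (select max(object_id) as max_object_id from thing) z
    order by t.object_id
    """
).fetchall()

print(analytic == non_analytic)
for row in analytic:
    print(row)
```

The empty `over ()` makes the whole result set one window, which is why every row carries the same maximum.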
SET AUTOTRACE
Statistics
– recursive calls
– db block gets
– consistent gets
– physical reads
– redo size
– bytes sent via SQL*Net to client
– bytes received via SQL*Net from client
– SQL*Net roundtrips to/from client
– sorts (memory)
– sorts (disk)
– rows processed
SET AUTOTRACE
Emphasis on …
– db block gets (“current” in tkprof)
– consistent gets (“query” in tkprof)
– sorts (memory)
– table scans from “Execution Plan”
No emphasis on …
– physical reads
– elapsed time (SET TIMING ON)
Statistics
    Stat                 Analytic  Non-Analytic
    recursive calls      1         0
    db block gets        0         0
    consistent gets      4         9
    physical reads       0         0
    redo size            0         0
    bytes sent           …605
    bytes received       …508
    SQL*Net roundtrips   2         2
    sorts (memory)       1         0
    sorts (disk)         0         0
SET AUTOTRACE
– Options: traceonly (no rows), statistics, explain
– SQL*Plus command
– PLUSTRACE role required (not granted by default)
– Documentation: SQL*Plus® User's Guide and Reference; Effective Oracle by Design by Thomas Kyte
Statistics - Scaling
[Charts: buffer gets and memory sorts for the non-analytic and analytic queries]
Objective 1 - To get aggregate and detail data in the same query – without selecting the same table twice.
– Better Performance
– Scales Better
– Smaller Query (Fewer Lines of Code)
Objective 1 - To get aggregate and detail data in the same query – without selecting the same table twice.
MAX OVER ()
Reporting Aggregate Function
    {SUM | AVG | MAX | MIN | COUNT | STDDEV | VARIANCE ...}
      ([ALL | DISTINCT] {value expression1 | *})
      OVER ([PARTITION BY value expression2[,...]])
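The optional PARTITION BY in the syntax above narrows the reporting aggregate from the whole result set to one group per row. A small sketch, again using Python's sqlite3 module as a stand-in for Oracle (SQLite 3.25+ assumed) and invented data:

```python
import sqlite3

# Hypothetical stand-in for the THING table; the values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table thing (object_id integer, owner text)")
conn.executemany(
    "insert into thing values (?, ?)",
    [(3, "SYS"), (44, "SYS"), (17, "SYSTEM"), (29, "SYS"), (8, "SYSTEM")],
)

per_owner = conn.execute(
    """
    select object_id, owner,
           max(object_id) over (partition by owner) as owner_max,
           count(*)       over (partition by owner) as owner_count,
           max(object_id) over ()                   as overall_max
    from thing
    order by owner, object_id
    """
).fetchall()

for row in per_owner:
    print(row)
```

Per-owner and overall aggregates sit side by side on every detail row, with no GROUP BY and no self-join.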
Objective 2 To compare traditional ranking and analytic ranking and show why analytic ranking is better.
Non-Analytic Top-1 Query
    select owner, object_name, object_type, last_ddl_time, rownum
    from ( select *
           from minman_dba.thing
           ORDER BY LAST_DDL_TIME ASC NULLS FIRST )
    where rownum = 1;
Analytic Top-1 Query
    select owner, object_name, subobject_name, object_type, MY_ROWNUM
    from ( select x.*,
                  ROW_NUMBER() OVER ( ORDER BY LAST_DDL_TIME ASC NULLS FIRST ) AS MY_ROWNUM
           from minman_dba.thing x )
    where MY_ROWNUM = 1;
Top-1 Queries
The two preceding queries side by side: in the non-analytic query, ROWNUM is an SQL keyword (a pseudocolumn); in the analytic query, MY_ROWNUM is an alias in the column list.
Top-1 Queries - Plans
Non-analytic (ROWNUM):
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    COUNT (STOPKEY)
    2  1      VIEW
    3  2        SORT (ORDER BY STOPKEY)
    4  3          TABLE ACCESS (FULL) OF 'THING' (TABLE)
Analytic (ROW_NUMBER):
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    VIEW
    2  1      WINDOW (SORT PUSHED RANK)
    3  2        TABLE ACCESS (FULL) OF 'THING' (TABLE)
Top-1 Queries - Statistics
    Statistics
            db block gets
        642 consistent gets
          0 physical reads
          1 sorts (memory)
          0 sorts (disk)
    Statistics
            db block gets
        642 consistent gets
          0 physical reads
          1 sorts (memory)
          0 sorts (disk)
Top-1 Queries – Two Object Types – Non-Analytic
    select owner, object_name, object_type, last_ddl_time, rownum
    from ( select *
           from minman_dba.thing
           where object_type = 'TABLE'
           order by last_ddl_time )
    where rownum = 1
    union all
    …
    select owner, object_name, object_type, last_ddl_time, rownum
    from ( select *
           from minman_dba.thing
           where object_type = 'PROCEDURE'
           order by last_ddl_time )
    where rownum = 1
Top-1 Queries – Two Object Types – Non-Analytic
    OWNER  OBJECT_NAME  OBJECT_TYP  LAST_DDL_  ROWNUM
    SYS    UNDO$        TABLE       03-FEB-06  1
    SYS    PSTUBT       PROCEDURE   03-FEB-06  1
Top-1 Queries – Two Object Types – Non-Analytic
    select y.owner, y.object_name, y.object_type, y.last_ddl_time
    from ( select object_type,
                  min(last_ddl_time) min_last_ddl_time
           from minman_dba.thing
           where object_type in ('TABLE','PROCEDURE')
           group by object_type ) x
    inner join minman_dba.thing y
      on x.min_last_ddl_time = y.last_ddl_time;

    30 rows selected
The join matches every row that shares a minimum LAST_DDL_TIME, whatever its object type, so it returns far more than one row per type.
Top-1 Queries – Two Object Types - Analytic
    select owner, object_name, object_type, last_ddl_time, my_rownum
    from ( select t.*,
                  row_number() over ( PARTITION BY OBJECT_TYPE
                                      order by last_ddl_time ) my_rownum
           from minman_dba.thing t
           WHERE OBJECT_TYPE IN ('TABLE','PROCEDURE') )
    where my_rownum = 1
Top-1 Queries – Two Object Types - Analytic
    OWNER  OBJECT_NAME  OBJECT_TYP  LAST_DDL_  MY_ROWNUM
    SYS    UNDO$        TABLE       03-FEB-06  1
    SYS    PSTUBT       PROCEDURE   03-FEB-06  1
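The partitioned top-1 pattern above can be tried on a few rows. This sketch runs it through Python's sqlite3 module rather than Oracle (SQLite 3.25+ assumed); the object names and ISO-format dates are invented so that each type has exactly one oldest row.

```python
import sqlite3

# Hypothetical rows standing in for minman_dba.thing.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table thing (owner text, object_name text, "
    "object_type text, last_ddl_time text)"
)
conn.executemany(
    "insert into thing values (?, ?, ?, ?)",
    [
        ("SYS", "T_OLD", "TABLE", "2006-02-03"),
        ("SYS", "T_NEW", "TABLE", "2007-02-25"),
        ("SYS", "P_OLD", "PROCEDURE", "2006-02-04"),
        ("SH", "P_NEW", "PROCEDURE", "2007-01-01"),
        ("SYS", "I_ANY", "INDEX", "2005-01-01"),
    ],
)

oldest_per_type = conn.execute(
    """
    select owner, object_name, object_type, last_ddl_time, my_rownum
    from ( select t.*,
                  row_number() over ( partition by object_type
                                      order by last_ddl_time ) as my_rownum
           from thing t
           where object_type in ('TABLE', 'PROCEDURE') )
    where my_rownum = 1
    order by object_type
    """
).fetchall()

for row in oldest_per_type:
    print(row)
```

Adding a third object type only means widening the IN list; the table is still read once.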
Top-1 Queries – Two Object Types – Query Plans
Non-analytic (UNION ALL):
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    UNION-ALL
    2  1      COUNT (STOPKEY)
    3  2        VIEW
    4  3          SORT (ORDER BY STOPKEY)
    5  4            TABLE ACCESS (FULL) OF 'THING' (TABLE)
    6  1      COUNT (STOPKEY)
    7  6        VIEW
    8  7          SORT (ORDER BY STOPKEY)
    9  8            TABLE ACCESS (FULL) OF 'THING' (TABLE)
Analytic (PARTITION BY):
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    VIEW
    2  1      WINDOW (SORT PUSHED RANK)
    3  2        TABLE ACCESS (FULL) OF 'THING' (TABLE)
Top-1 Queries – Two Object Types – Statistics
Non-analytic (two scans, two sorts):
    Statistics
            db block gets
       1284 consistent gets
          0 physical reads
          2 sorts (memory)
          0 sorts (disk)
Analytic (one scan, one sort):
    Statistics
            db block gets
        642 consistent gets
          0 physical reads
          1 sorts (memory)
          0 sorts (disk)
Top-1 Queries – All Object Types - Analytic
    select owner, object_name, object_type, last_ddl_time, my_rownum
    from ( select t.*,
                  row_number() over ( PARTITION BY OBJECT_TYPE
                                      order by last_ddl_time ) my_rownum
           from minman_dba.thing t )
    where my_rownum = 1
Top-1 Queries – All Object Types - Analytic
    OWNER   OBJECT_NAME               OBJECT_TYPE         LAST_DDL_  MY_ROWNUM
    SYS     C_OBJ#                    CLUSTER             03-FEB-06  1
    SYS     LOW_GROUP                 CONSUMER GROUP      03-FEB-06  1
    SYS     REGISTRY$CTX              CONTEXT             03-FEB-06  1
    SH      CUSTOMERS_DIM             DIMENSION           25-FEB-07  1
    SYS     DATA_FILE_DIR             DIRECTORY           25-FEB-07  1
    SYS     AQ$_SCHEDULER$_JOBQTAB_V  EVALUATION CONTEXT  03-FEB-06  1
    SYS     GETTVOID                  FUNCTION            03-FEB-06  1
    SYS     I_OBJ#                    INDEX               03-FEB-06  1
    SYSTEM  LOGMNRC_GTCS_PK           INDEX PARTITION     03-FEB-06  1
    EXFSYS  EXPFILTER                 INDEXTYPE           03-FEB-06  1
    SYS     /cc11c9d8_SerialVerFrame  JAVA CLASS          03-FEB-06  1
    … we are not showing the full result
RANK and DENSE_RANK
    select owner, object_name, object_type, last_ddl_time, rn, r, dr
    from ( select t.*,
                  row_number() over (partition by object_type order by last_ddl_time) RN,
                  rank()       over (partition by object_type order by last_ddl_time) R,
                  dense_rank() over (partition by object_type order by last_ddl_time) DR
           from minman_dba.thing t
           where object_type in ('PROCEDURE') )
    where rn between 1 and 10;
RANK and DENSE_RANK
    owner  object_name             last ddl time  RN  R  DR
    SYS    PSTUBT
    SYS    PSTUB
    SYS    SUBPTXT
    SYS    SUBPTXT
    SYS    ODCIINDEXINFOFLAGSDUMP
    SYS    ODCIINDEXINFODUMP
    SYS    ODCIPREDINFODUMP
    SYS    ODCIQUERYINFODUMP
    SYS    ODCICOLINFODUMP
    SYS    ODCISTATSOPTIONSDUMP
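The three ranking functions differ only in how they treat ties. A minimal sketch with invented rows, run through Python's sqlite3 module instead of Oracle (SQLite 3.25+ assumed); two procedures deliberately share the same LAST_DDL_TIME:

```python
import sqlite3

# Hypothetical procedures; P_A and P_B tie on last_ddl_time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table thing (object_name text, object_type text, last_ddl_time text)"
)
conn.executemany(
    "insert into thing values (?, ?, ?)",
    [
        ("P_A", "PROCEDURE", "2006-02-03"),
        ("P_B", "PROCEDURE", "2006-02-03"),
        ("P_C", "PROCEDURE", "2006-02-04"),
        ("P_D", "PROCEDURE", "2006-02-05"),
    ],
)

ranked = conn.execute(
    """
    select object_name, last_ddl_time,
           row_number() over (partition by object_type order by last_ddl_time) as rn,
           rank()       over (partition by object_type order by last_ddl_time) as r,
           dense_rank() over (partition by object_type order by last_ddl_time) as dr
    from thing
    order by rn
    """
).fetchall()

for row in ranked:
    print(row)
```

ROW_NUMBER always counts 1, 2, 3, 4 and breaks the tie arbitrarily; RANK gives the tied rows 1 and 1 and then skips to 3; DENSE_RANK gives 1 and 1 and continues with 2.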
Objective 2 - To compare traditional ranking and analytic ranking and show why analytic ranking is better.
– Better Performance
– Scales Better
– Smaller Query (Fewer Lines of Code)
– PARTITION BY
Objective 2 - To compare traditional ranking and analytic ranking and show why analytic ranking is better.
    ROW_NUMBER ( ) OVER ( [query_partition_clause] order_by_clause )
    RANK ( )       OVER ( [query_partition_clause] order_by_clause )
    DENSE_RANK ( ) OVER ( [query_partition_clause] order_by_clause )
Objective 3 - To show additional flexibility of analytic expressions.
    create table another_thing
    ( first_col  char(1),
      second_col number )
    /
    insert into another_thing values ('A', );
    insert into another_thing values ('A', );
    insert into another_thing values ('A', );
    insert into another_thing values ('A', );
    insert into another_thing values ('B', );
    insert into another_thing values ('B',34231);
    insert into another_thing values ('B', );
Additional Flexibility
    select first_col,
           second_col,
           row_number() over (partition by first_col order by second_col asc) my_1st_rownum
    from another_thing;

    F  SECOND_COL  MY_1ST_ROWNUM
    A  E+10        1
    A  E+10        2
    A  E+14        3
    A  E+14        4
    B
    B
    B  E+14        3
Additional Flexibility
    select first_col,
           second_col,
           row_number() over (partition by first_col order by second_col asc) my_1st_rownum,
           ROW_NUMBER() OVER (PARTITION BY FIRST_COL ORDER BY SECOND_COL DESC) MY_2ND_ROWNUM
    from another_thing;

    F  SECOND_COL  MY_1ST_ROWNUM  MY_2ND_ROWNUM
    A  E
    A  E
    A  E
    A  E
    B
    B
    B  E
Additional Flexibility
    select first_col,
           second_col,
           row_number() over (partition by first_col order by second_col) my_1st_rownum,
           row_number() over (partition by first_col order by second_col desc) my_2nd_rownum,
           ROW_NUMBER() OVER ( PARTITION BY FIRST_COL
                               ORDER BY MOD(SECOND_COL,10) ASC NULLS FIRST ) MY_3RD_ROWNUM
    from another_thing;
Additional Flexibility
    first col  second col   my 1st rownum  my 2nd rownum  my 3rd rownum
    A          3.4897E
    A          5.7865E
    A          1.7823E
    A          3.2486E
    B
    B
    B          4.3330E+14   3              1              2
Additional Flexibility – Query Plan
    Execution Plan
    0     SELECT STATEMENT Optimizer=CHOOSE
    1  0    WINDOW (SORT)
    2  1      WINDOW (SORT)
    3  2        WINDOW (SORT)
    4  3          TABLE ACCESS (FULL) OF 'ANOTHER_THING'
Additional Flexibility
    select first_col,
           second_col,
           row_number() over (partition by first_col order by second_col) my_1st_rownum,
           row_number() over (partition by first_col order by second_col desc) my_2nd_rownum,
           row_number() over ( partition by first_col
                               order by mod(second_col,10) asc nulls first ) my_3rd_rownum
    from another_thing
    order by second_col;
Additional Flexibility
    first col  second col   my 1st rownum  my 2nd rownum  my 3rd rownum
    B
    B
    A          3.4897E
    A          5.7865E
    A          1.7823E
    A          3.2486E
    B          4.3330E+14   3              1              2
Additional Flexibility – Query Plan
    Execution Plan
    0     SELECT STATEMENT Optimizer=ALL_ROWS
    1  0    SORT (ORDER BY)
    2  1      WINDOW (SORT)
    3  2        WINDOW (SORT)
    4  3          WINDOW (SORT)
    5  4            TABLE ACCESS (FULL) OF 'ANOTHER_THING' (TABLE)
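The point of these slides is that one SELECT can carry several analytic expressions, each with its own ORDER BY, plus an unrelated final ORDER BY. The sketch below reproduces that shape through Python's sqlite3 module (SQLite 3.25+ assumed). The `second_col` values are invented, and SQLite's `%` operator stands in for Oracle's MOD().

```python
import sqlite3

# Hypothetical data; the values on the original slide are not reproduced here.
conn = sqlite3.connect(":memory:")
conn.execute("create table another_thing (first_col text, second_col integer)")
conn.executemany(
    "insert into another_thing values (?, ?)",
    [("A", 13), ("A", 21), ("A", 32), ("B", 45), ("B", 54)],
)

numbered = conn.execute(
    """
    select first_col, second_col,
           row_number() over (partition by first_col
                              order by second_col)        as my_1st_rownum,
           row_number() over (partition by first_col
                              order by second_col desc)   as my_2nd_rownum,
           row_number() over (partition by first_col
                              order by second_col % 10)   as my_3rd_rownum
    from another_thing
    order by first_col, second_col
    """
).fetchall()

for row in numbered:
    print(row)
```

Each numbering follows its own ordering within the partition, and none of them is disturbed by the final ORDER BY.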
Objective 3 - To show additional flexibility of analytic expressions.
– Better Performance
– Scales Better
– Smaller Query (Fewer Lines of Code)
– PARTITION BY
– Multiple ORDER BY – Single Select
– Multiple PARTITION BY – Single Select
Items Learned in this Session
– To get aggregate and detail data in the same query – without selecting the same table twice.
– To compare traditional ranking and analytic ranking and show why analytic ranking is better.
– To show additional flexibility of analytic expressions.
Questions?
Thank You
Please fill out the evaluation.
Speaker: Mark Inman
Session Name: Analytic SQL for Beginners
Session Number: 213