The Model Clause explained
Tony Hasler, Anvil Computer Services Ltd., UKOUG Birmingham 2012



Who is Tony Hasler?
− Google Tony Hasler!
− My blog contains all the material from this presentation on the front page
− Additional blog entry (October) on the model clause

Health Warning about the model clause
− No support for Active Session History (ASH) whilst on the CPU
− Limited support for SQL Performance Monitor
− No session statistics related to modelling
− No trace events for diagnosis (that I can find)
− Performance is poor when data spills to disk
− Introduced in 10g but no new features in 11g
− The main example in chapter 22 (10g) or chapter 23 (11g) of the Data Warehousing Guide (calculating mortgage amortization) is incorrect and always has been (mortgage fact ‘Payment’ should be ‘PaymentAmt’)
Perhaps this is not a flagship component!
However, the model clause:
− Enables you to do stuff that cannot otherwise be (easily) done in SQL
− Has parallelisation capabilities that can be useful in some cases

What is in this talk and what is not
In:
− What the model clause is
− What it is used for and what it isn’t
− Syntax and semantics of the main features
− Recommendations on how to use the model clause
− A real life example
Not in, due to time constraints:
− An exhaustive list of all model clause features (this can be found in the Data Warehousing Guide, Chapter 22 (10g) or Chapter 23 (11g))
− Demos

What the model clause is and what it is used for
− Part of a query block, meaning it can appear after any SELECT keyword
− Provides spreadsheet-like capabilities inside SQL
− Can perform calculations that SQL does not otherwise implement, e.g. moving medians. Just because something is inefficient to execute doesn’t mean you don’t need to do it!
− Iteration is possible, such as with ‘feedback loops’ (see Example 4, Calculating Using Simultaneous Equations, in the Data Warehousing Guide)
− A series of calculations within the model can be sequenced manually, or dependencies can be determined by the engine (let the engine do this only when necessary, not out of laziness)

Model terms compared with Excel
− Rows/columns: Dimensions
− Worksheets: Partitions
− Formulas: Rules
− Values: Measures

Comparison of Model clause terms with Excel spreadsheet terms
| Model term | Excel term     | Differences |
| PARTITION  | Worksheet      | It is possible for formulas in one Excel worksheet to reference cells in another. It is not possible for partitions in a model clause to reference each other. |
| DIMENSION  | Row and column | In Excel there are always exactly two dimensions. In a model clause there are one or more. |
| MEASURE    | A cell value   | In Excel only one value is referenced by the dimensions. The model clause allows one or more measures. |
| RULES      | Formulas       | Nested cell references are possible with models. |
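To make the four terms concrete, here is a minimal sketch of a model clause. The sales data, the column names (country, product, yr, amount) and the forecasting rule are all hypothetical and chosen only for illustration; they do not come from the presentation.

```sql
-- Hypothetical data: one product sold in two countries over two years.
WITH sales AS (
  SELECT 'UK' AS country, 'Bolts' AS product, 2011 AS yr, 100 AS amount FROM dual UNION ALL
  SELECT 'UK', 'Bolts', 2012, 120 FROM dual UNION ALL
  SELECT 'FR', 'Bolts', 2011,  80 FROM dual UNION ALL
  SELECT 'FR', 'Bolts', 2012,  90 FROM dual
)
SELECT country, product, yr, amount
FROM   sales
MODEL
  PARTITION BY (country)        -- one "worksheet" per country
  DIMENSION BY (product, yr)    -- the address of a cell within the worksheet
  MEASURES     (amount)         -- the value held in each cell
  RULES (
    -- A "formula": forecast 2013 as 2012 plus the growth from 2011 to 2012.
    amount['Bolts', 2013] = amount['Bolts', 2012]
                          + (amount['Bolts', 2012] - amount['Bolts', 2011])
  )
ORDER BY country, product, yr;
```

The rule is evaluated once per partition, so each country gets its own 2013 row (140 for UK and 100 for FR with the data above), and a rule in one partition cannot see the cells of another.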

Overview of a query block with no model clause
Each step logically precedes the next:
− Joins and outer join predicates: only scalar functions allowed
− WHERE clause (inner join predicates and selection predicates): only scalar functions allowed
− GROUP BY clause: only scalar functions allowed
− HAVING clause: scalar and aggregate functions allowed
− Analytic calculations
− SELECT list: scalar, aggregate, and analytic functions allowed
− DISTINCT
− ORDER BY clause: scalar, aggregate, and analytic functions allowed

Overview of a query block with a model clause
Each step logically precedes the next:
− Joins, join predicates, selection predicates and GROUP BY clause: only scalar functions allowed
− HAVING clause: scalar and aggregate functions allowed
− Analytic calculations
− Model PARTITION, DIMENSION and MEASURES clauses: scalar, aggregate, and analytic functions allowed
− Model RULES clause: independent calculations
− SELECT list: only scalar functions on model outputs allowed
− DISTINCT
− ORDER BY clause: only scalar functions on model outputs allowed

Recommendations on use of the model clause
− Use factored subqueries to avoid mixing aggregate functions and analytic functions with the model clause
− Use only scalar values as inputs to the model clause
− Perform all calculations in the model clause so that the final select list is just a simple list of identifiers
− Be careful using the model clause in a subquery, as the CBO assumes that the cardinality of the output of a model clause is the same as the cardinality of the input
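A sketch of how the first three recommendations might look in practice. It uses the stock_holdings table from this presentation, but the monthly aggregation, the subquery name monthly, and the measures prev_value and value_change are hypothetical and only illustrate the shape of such a query.

```sql
WITH monthly AS (
  -- Aggregate first, in a factored subquery, so the model sees only scalar inputs.
  SELECT customer_name,
         TRUNC(business_date, 'MM') AS month_start,
         SUM(value)                 AS total_value
  FROM   stock_holdings
  GROUP  BY customer_name, TRUNC(business_date, 'MM')
)
-- The final select list is just a simple list of identifiers.
SELECT customer_name, month_start, total_value, prev_value, value_change
FROM   monthly
MODEL
  PARTITION BY (customer_name)
  DIMENSION BY (month_start)
  MEASURES (total_value, 0 AS prev_value, 0 AS value_change)
  RULES (
    -- All calculations happen inside the model. A month with no
    -- predecessor yields NULL for prev_value and value_change.
    prev_value[ANY]   = total_value[ADD_MONTHS(CV(month_start), -1)],
    value_change[ANY] = total_value[CV()] - prev_value[CV()]
  )
ORDER BY customer_name, month_start;
```

Keeping the aggregation in its own subquery means the query block that carries the model clause contains no GROUP BY or analytic functions of its own, which makes the logical order of evaluation easier to reason about.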

A simplified version of the real world problem
    CREATE TABLE stock_holdings (
        customer_name VARCHAR2 (100),
        stock_name    VARCHAR2 (100),
        business_date DATE,
        VALUE         NUMBER
    );
Calculate the moving one-year average (mean) value, the moving standard deviation, and the zscore for each customer, stock, and business date.

Mathematical terms loosely explained
− The Average (mean) is a “typical value”
− The Standard Deviation is a “typical difference between a value and the mean”
− The Zscore is the number of standard deviations that a particular value deviates from the mean. In other words, it is a measure of how unusual a value is.
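In symbols, for the values $x_1, \dots, x_N$ that fall inside one window, the three terms can be written as follows. This assumes the population form of the standard deviation, which is what the VAR_POP function used in the final query of this presentation computes; the symbols themselves are introduced here only for illustration.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2},
\qquad
z_i = \frac{x_i - \mu}{\sigma}
```

A zscore of 0 means the value equals the mean; when $\sigma = 0$ the zscore is undefined, so a query has to handle that case explicitly.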

Problem 1: Oracle doesn’t support moving intervals of months or years properly!!!
    WITH q1 AS (
        SELECT DATE ' ' + ROWNUM mydate
        FROM DUAL
        CONNECT BY LEVEL <= 400),
    q2 AS (
        SELECT mydate,
               COUNT (*) OVER (
                   ORDER BY mydate
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW) cnt
        FROM q1)
    SELECT * FROM q2 WHERE mydate = DATE ' ';
Result:
    MYDATE CNT
    10/01/
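One way to see an off-by-one in such a window is to compare the two possible start dates directly. The reference date below is hypothetical (chosen so that no leap day falls in the window), and this is only one reading of the problem on the slide: the RANGE window is inclusive at both ends, so it spans one year plus one day.

```sql
-- Hypothetical reference date: 30 June 2011.
SELECT DATE '2011-06-30' - INTERVAL '1' YEAR
         AS interval_start,      -- 30 June 2010
       ADD_MONTHS(DATE '2011-06-30' + 1, -12)
         AS add_months_start,    -- 1 July 2010
       DATE '2011-06-30' - (DATE '2011-06-30' - INTERVAL '1' YEAR) + 1
         AS interval_days,       -- 366 dates in the inclusive window
       DATE '2011-06-30' - ADD_MONTHS(DATE '2011-06-30' + 1, -12) + 1
         AS add_months_days      -- 365 dates in the inclusive window
FROM   dual;
```

Adding one day before going back twelve months moves the window start forward by a day, so the inclusive window covers exactly one year.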

Model clause solution (1)
    WITH q1 AS (
        SELECT DATE ' ' + ROWNUM mydate, 0 cnt
        FROM DUAL
        CONNECT BY LEVEL <= 400)
    SELECT mydate, cnt
    FROM q1
    MODEL RETURN UPDATED ROWS
        DIMENSION BY (mydate)
        MEASURES (cnt)
        RULES (
            cnt [DATE ' '] =
                COUNT (*) [mydate BETWEEN ADD_MONTHS (CV () + 1, -12) AND CV ()]);

Model clause solution (2)
    CREATE TABLE business_dates (
        business_date PRIMARY KEY NOT NULL,
        business_days_in_year NOT NULL,
        first_day_in_year NOT NULL,
        business_day_number NOT NULL
    ) ORGANIZATION INDEX
    AS
    WITH q1 AS (
        SELECT DISTINCT business_date,
               0 AS business_days_in_year,
               SYSDATE AS first_day_in_year,
               0 AS business_day_number
        FROM stock_holdings)
    SELECT business_date, business_days_in_year, first_day_in_year, business_day_number
    FROM q1
    MODEL
        DIMENSION BY (business_date)
        MEASURES (business_days_in_year, first_day_in_year, business_day_number)
        RULES (
            first_day_in_year[ANY] = ADD_MONTHS (CV (business_date) + 1, -12),
            business_days_in_year[ANY] =
                COUNT (*) [business_date BETWEEN first_day_in_year[CV ()] AND CV ()],
            business_day_number[ANY] = ROW_NUMBER () OVER (ORDER BY business_date));

Execution plan for previous statement
    | Id | Operation                  | Name           |
    |  0 | CREATE TABLE STATEMENT     |                |
    |  1 | LOAD AS SELECT             | BUSINESS_DATES |
    |  2 | SQL MODEL ORDERED          |                |
    |  3 | VIEW                       |                |
    |  4 | HASH UNIQUE                |                |
    |  5 | TABLE ACCESS FULL          | STOCK_HOLDINGS |
    |  6 | WINDOW (IN SQL MODEL) SORT |                |

Problem 2: avoiding data densification
− Oracle recommends the use of partitioned outer joins for data densification to simplify analytic functions
− The model clause can also be used for data densification
− In my case, however, this would have multiplied the number of rows enormously, as customers tended to hold particular stocks for about 10% of the time
− We can use the model clause to avoid this by performing multiple calculations in sequence
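For comparison, this is roughly what the densification being avoided would look like with a partitioned outer join. It is a sketch only, using the stock_holdings and business_dates tables from this presentation; treating a missing holding as a value of zero is an assumption made for the illustration.

```sql
-- Produce one row per customer, stock and business date,
-- whether or not the customer held the stock on that date.
SELECT sh.customer_name,
       sh.stock_name,
       bd.business_date,
       NVL(sh.value, 0) AS value   -- assumption: no holding counts as zero
FROM   stock_holdings sh
       PARTITION BY (sh.customer_name, sh.stock_name)
       RIGHT OUTER JOIN business_dates bd
       ON (sh.business_date = bd.business_date);
```

If a stock is held on only about 10% of business dates, this returns roughly ten times as many rows as stock_holdings contains, and that is the row multiplication the slide refers to.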

Model clause solution
    SELECT customer_name, stock_name, business_date, first_day_in_year,
           VALUE, mov_avg, mov_stdd, zscore
    FROM stock_holdings sh, business_dates bd
    WHERE sh.business_date = bd.business_date
    MODEL
        PARTITION BY (customer_name, stock_name)
        DIMENSION BY (sh.business_date)
        MEASURES (VALUE, business_days_in_year, first_day_in_year,
                  0 AS mov_avg, 0 AS mov_stdd, 0 AS zscore,
                  0 AS mov_sum, 0 AS mov_cnt)
        RULES (
            mov_sum[ANY] =
                SUM (VALUE) [business_date BETWEEN first_day_in_year[CV ()] AND CV ()],
            mov_cnt[ANY] =
                COUNT (*) [business_date BETWEEN first_day_in_year[CV ()] AND CV ()],
            mov_avg[ANY] = mov_sum[CV ()] / business_days_in_year[CV ()],
            mov_stdd[ANY] =
                SQRT ((  (VAR_POP (VALUE) [business_date BETWEEN first_day_in_year[CV ()] AND CV ()]
                          * mov_cnt[CV ()])
                       + (mov_sum[CV ()]
                          * (AVG (VALUE) [business_date BETWEEN first_day_in_year[CV ()] AND CV ()]
                             - mov_avg[CV ()])))
                      / business_days_in_year[CV ()]),
            zscore[ANY] =
                DECODE (mov_stdd[CV ()], 0, 0,
                        (VALUE[CV ()] - mov_avg[CV ()]) / mov_stdd[CV ()]));
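The mov_stdd rule can be checked with a short derivation. It assumes that business days on which the customer did not hold the stock are meant to count as a value of zero, which is what dividing by business_days_in_year rather than by mov_cnt suggests. The symbols are introduced here for illustration: $N$ is business_days_in_year, $n$ is mov_cnt, $S$ is mov_sum, $a = S/n$ is AVG(VALUE) and $v$ is VAR_POP(VALUE) over the $n$ rows present, and $\mu = S/N$ is mov_avg.

```latex
\begin{aligned}
\sum_{i=1}^{n} x_i^2 &= n\,(v + a^2)
  && \text{(definition of population variance over the } n \text{ rows present)} \\
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{n} x_i^2 - \mu^2
  && \text{(the } N - n \text{ missing days contribute zero)} \\
         &= \frac{n v + n a^2 - N \mu^2}{N} \\
         &= \frac{n v + S a - S \mu}{N}
  && \text{(since } n a = S \text{ and } N \mu = S\text{)} \\
         &= \frac{n v + S\,(a - \mu)}{N}
\end{aligned}
```

Taking the square root gives the expression in the rule: VAR_POP times mov_cnt, plus mov_sum times the difference between AVG and mov_avg, all divided by business_days_in_year.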

Parallel execution plan
    | Id   | Operation           | Name              |
    |    0 | SELECT STATEMENT    |                   |
    |    1 | PX COORDINATOR      |                   |
    |    2 | PX SEND QC (RANDOM) | :TQ10001          |
    |    3 | BUFFER SORT         |                   |
    |    4 | SQL MODEL ORDERED   |                   |
    |    5 | PX RECEIVE          |                   |
    |    6 | PX SEND HASH        | :TQ10000          |
    |    7 | NESTED LOOPS        |                   |
    |    8 | PX BLOCK ITERATOR   |                   |
    |    9 | TABLE ACCESS FULL   | STOCK_HOLDINGS    |
    |*  10 | INDEX UNIQUE SCAN   | SYS_IOT_TOP_78729 |

Summary
− The model clause allows you to build your own analytical functions and/or your own analytical windows, amongst other things
− The model clause also allows you to parallelise calculations and can be useful even if the calculations are supported without the model clause
− However, model clause aggregates will be slower than standard analytic functions
− Performance degrades rapidly when partitions spill to disk

Questions