You're Smarter than a Database: Overcoming the Optimizer's Bad Cardinality Estimates

About me
Bobby Durrett
US Foodservice
Scripts in rielle/sqltuning.zip

What you know

What the database knows

Before SQL
Example: mainframe Datacom/DB COBOL
List index names
Write loops
    read a from one index i1 where one.c=10
    while more table one rows exist
        get next row
        read b from two index i2 where two.a = one.a
        while more table two rows exist
            get next row
            print one.a,two.b
        end while
    end while

SQL
Tell what you want, not how to get it
    select one.a,two.b
    from one,two
    where one.c=10 and one.a=two.a;
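The contrast between the hand-written loops and the SQL statement can be sketched in Python, assuming the standard-library sqlite3 module as a stand-in for Datacom/DB and Oracle. The tables one and two and their columns come from the slides; the sample rows and the function names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table one (a integer, c integer);
    create table two (a integer, b integer);
    insert into one values (1, 10), (2, 20), (3, 10);
    insert into two values (1, 100), (1, 101), (3, 300);
""")


def procedural_join(conn):
    """Pre-SQL style: the programmer decides the loop order by hand."""
    rows = []
    outer = conn.execute("select a from one where c = 10 order by a").fetchall()
    for (a,) in outer:
        inner = conn.execute(
            "select b from two where a = ? order by b", (a,)
        ).fetchall()
        for (b,) in inner:
            rows.append((a, b))
    return rows


def declarative_join(conn):
    """SQL style: state the result wanted and let the database pick the plan."""
    return conn.execute(
        "select one.a, two.b from one, two "
        "where one.c = 10 and one.a = two.a "
        "order by one.a, two.b"
    ).fetchall()


print(procedural_join(conn))
print(declarative_join(conn))
```

Both functions return the same rows. In the first the access path is fixed by the programmer; in the second the database chooses it, which is where cardinality estimates start to matter.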

Pre-SQL versus SQL
Pre-SQL code
    Very efficient – runs in megabytes – VSE mainframe COBOL
    Labor intensive
SQL
    Can be inefficient – runs in gigabytes (if you are lucky!)
    Much more productive – do in minutes what took hours before – create tables

What the database doesn't know
The optimizer has a limited set of statistics that describe the data
It can miscalculate the number of rows a query will return (its cardinality)
A cardinality error can lead the optimizer to choose a slow way to run the SQL

Example plan/Cardinality
    | Id | Operation         | Name  | Rows | Cost |
    |  0 | SELECT STATEMENT  |       |   10 |    3 |
    |* 1 | TABLE ACCESS FULL | TEST1 |   10 |      |
Plan = how Oracle will run your query
Rows = how many rows the optimizer thinks that step will return
Cost = estimate of the time the query will take, a function of the number of rows

How to fix cardinality problems
Find out if it really is a cardinality issue
Determine the reason it occurred
    Single column
    Multiple columns
Choose a strategy
    Give the optimizer more information
    Override the optimizer's decision
    Change the application

Four examples
Four examples of how the optimizer calculates cardinality
Full scripts and their outputs on portal, pieces on slides – edited for simplicity

Step 1: Find out if it really is a cardinality issue
Example 1
Data
    select a,count(*) from test1 group by a;
    A COUNT(*)
Query
    select * from test1 where a=1;

Step 1: Find out if it really is a cardinality issue
Get estimated cardinality from plan
    | Id | Operation         | Name  | Rows |
    |  0 | SELECT STATEMENT  |       |   10 |
    |* 1 | TABLE ACCESS FULL | TEST1 |   10 |
Do query for actual number of rows
    select count(*) from test1 where a=1;

Step 1: Find out if it really is a cardinality issue
Plan is a tree – for each part of the plan, find its estimated cardinality and run select count(*) on the part of the query that it represents
(Diagram: plan tree of join and table nodes)
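The step-by-step comparison of estimated and actual row counts can be sketched as a walk over the plan tree. This is a minimal illustration, not a tool from the talk: the PlanStep class, the function names, and the 10x threshold are assumptions. The estimates are borrowed from the Example 4 plan later in the talk, and the actual counts follow from the data described there.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    operation: str
    estimated_rows: int  # "Rows" column of the plan
    actual_rows: int     # result of select count(*) for this part of the query
    children: list = field(default_factory=list)


def misestimate_factor(step):
    """How many times off the estimate is, in either direction."""
    est = max(step.estimated_rows, 1)
    act = max(step.actual_rows, 1)
    return max(est, act) / min(est, act)


def suspicious_steps(step, threshold=10.0):
    """Return the operations whose estimate is off by at least `threshold` times."""
    found = []
    if misestimate_factor(step) >= threshold:
        found.append(step.operation)
    for child in step.children:
        found.extend(suspicious_steps(child, threshold))
    return found


# Example 4: the join is estimated at 125K rows but returns one row.
plan = PlanStep("HASH JOIN", 125_000, 1, [
    PlanStep("TABLE ACCESS FULL SMALL", 1, 1),
    PlanStep("INDEX FAST FULL SCAN LARGEINDEX", 1_000_000, 1_000_001),
])

print(suspicious_steps(plan))
```

Only the join is flagged here: each table's own estimate is close to reality, so the error enters where the two are combined.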

Step 2: Understand the reason for the wrong cardinality
Unequal distribution of data:
    Within a single column – last name Smith or Jones
    Among multiple columns – address zip code and state

Step 2: Understand the reason for the wrong cardinality
Example 2 - Unequal distribution of values in a single column
1,000,000 rows with value 1
1 row with value 2
    select a,count(*) from TEST2 group by a;
    A COUNT(*)
    1  1000000
    2        1

Step 2: Understand the reason for the wrong cardinality
SQL statement – returns one row
    select * from TEST2 where a=2;

Step 2: Understand the reason for the wrong cardinality
Plan with wrong number of rows = 500,000
Full scan instead of range scan – 100 times slower
    | Operation            | Name       | Rows |
    | SELECT STATEMENT     |            | 500K |
    | INDEX FAST FULL SCAN | TEST2INDEX | 500K |

Step 2: Understand the reason for the wrong cardinality
Column statistics – two distinct values
    LOW HIGH NUM_DISTINCT
    1   2    2
Table statistic – total # of rows – 1,000,001
    NUM_ROWS
    1000001

Step 2: Understand the reason for the wrong cardinality
Rows in plan = (rows in table) / (distinct values of column) = 1,000,001 / 2 ≈ 500,000
The optimizer knew that there were only two values – it assumed they had an equal number of rows
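The single-column arithmetic can be checked with a few lines of Python. This is only a sketch of the uniform-distribution assumption described on the slide, not Oracle's actual code; the function name is invented, and the inputs are the TEST2 figures from the slides (1,000,001 rows, 2 distinct values).

```python
def uniform_estimate(num_rows, num_distinct):
    """Estimated rows for `column = value` if every value is equally common."""
    return num_rows / num_distinct


num_rows = 1_000_001  # NUM_ROWS for TEST2
num_distinct = 2      # NUM_DISTINCT for column A
estimate = uniform_estimate(num_rows, num_distinct)
actual = 1            # rows with a=2, per the slides

print(f"estimated about {estimate:,.1f} rows, actual {actual}")
```

The estimate of roughly half a million rows matches the 500K shown in the plan, while the query really returns one row.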

Step 2: Understand the reason for the wrong cardinality
Example 3 - Combinations of column values not equally distributed
1,000,000 rows with values 1,1
1,000,000 rows with values 2,2
1 row with values 1,2
~ Equal numbers of 1s and 2s in each column
    A B COUNT(*)
    1 1  1000000
    1 2        1
    2 2  1000000

Step 2: Understand the reason for the wrong cardinality
SQL statement – retrieves one row
    select sum(a+b) from TEST3 where a=1 and b=2;

Step 2: Understand the reason for the wrong cardinality
Plan with wrong number of rows = 500,000
Inefficient full scan
    | Operation             | Name       | Rows |
    | SELECT STATEMENT      |            |    1 |
    | SORT AGGREGATE        |            |    1 |
    | INDEX FAST FULL SCAN  | TEST3INDEX | 500K |

Step 2: Understand the reason for the wrong cardinality
Column statistics
    C LOW HIGH NUM_DISTINCT
    A 1   2    2
    B 1   2    2
Table statistic – total # of rows – 2,000,001
    NUM_ROWS
    2000001

Step 2: Understand the reason for the wrong cardinality
Rows in plan = (rows in table) / (distinct values A * distinct values B) = 2,000,001 / (2 * 2) ≈ 500,000
The optimizer assumes all four combinations (1,1),(1,2),(2,1),(2,2) are equally likely
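The multi-column case multiplies the per-column selectivities, which amounts to treating the columns as independent. The sketch below illustrates that assumption with the TEST3 figures from the slides (2,000,001 rows, two distinct values in each of A and B); the function names are invented, and this is not a reproduction of the optimizer's code.

```python
def column_selectivity(num_distinct):
    """Fraction of rows expected to match `column = value` under uniformity."""
    return 1.0 / num_distinct


def combined_estimate(num_rows, distinct_counts):
    """Estimated rows when equality predicates on several columns are ANDed
    together and the columns are assumed to be independent."""
    selectivity = 1.0
    for num_distinct in distinct_counts:
        selectivity *= column_selectivity(num_distinct)
    return num_rows * selectivity


estimate = combined_estimate(2_000_001, [2, 2])  # a=1 and b=2 on TEST3
actual = 1                                       # per the slides

print(f"estimated about {estimate:,.2f} rows, actual {actual}")
```

Each column on its own is estimated well, since about half the rows have a=1 and about half have b=2. The error comes only from assuming that the pair (1,2) is as common as the other combinations.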

Step 2: Understand the reason for the wrong cardinality
How to tell which assumption is in play?
Select count(*) for each column
    select a,count(*) from TEST3 group by a;
    select b,count(*) from TEST3 group by b;
Select count(*) for each column combination
    select a,b,count(*) from TEST3 group by a,b;
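These diagnostic queries can be tried on a scaled-down copy of TEST3. The sketch assumes Python's sqlite3 module in place of Oracle and uses 1,000 rows per common combination instead of 1,000,000; the variable names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test3 (a integer, b integer)")
# Scaled-down Example 3: many (1,1) rows, many (2,2) rows, one (1,2) row.
conn.executemany(
    "insert into test3 values (?, ?)",
    [(1, 1)] * 1000 + [(2, 2)] * 1000 + [(1, 2)],
)

# count(*) for each column on its own
by_a = dict(conn.execute("select a, count(*) from test3 group by a"))
by_b = dict(conn.execute("select b, count(*) from test3 group by b"))

# count(*) for each column combination
by_ab = {
    (a, b): n
    for a, b, n in conn.execute("select a, b, count(*) from test3 group by a, b")
}

total = sum(by_a.values())
# Rows expected for (1,2) if the two columns were independent.
expected_if_independent = by_a[1] * by_b[2] / total

print("per-column counts:", by_a, by_b)
print("combination counts:", by_ab)
print("expected (1,2) if independent:", round(expected_if_independent, 2))
print("actual (1,2):", by_ab[(1, 2)])
```

If the per-column counts are lopsided, the single-column assumption is the problem. If they are even but the combination counts are far from the independent expectation, as here, the multi-column assumption is the one in play.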

Step 3: Choose the best strategy for fixing the cardinality problem
Giving the optimizer more information
    Histograms
    SQL Profiles
Overriding optimizer decisions
    Hints
Changing the application
Try to use the optimizer as much as possible to minimize development work

Step 3: Choose the best strategy for fixing the cardinality problem
Giving the optimizer more information – using histograms
Works for unequal distribution within a single column
A histogram records the distribution of values within a column in up to 254 buckets
Works best on columns with fewer than 255 distinct values
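Why a histogram fixes Example 2 can be shown with a simplified model in Python. It assumes the case where every distinct value gets its own bucket, so the bucket count is the row estimate; the function names are invented, and Oracle's real histogram logic is more involved than this.

```python
from collections import Counter
from itertools import chain, repeat

MAX_BUCKETS = 254  # bucket limit quoted on the slide


def build_frequency_histogram(values, max_buckets=MAX_BUCKETS):
    """Count the rows per distinct value, one bucket per value."""
    counts = Counter(values)
    if len(counts) > max_buckets:
        raise ValueError("too many distinct values for one bucket per value")
    return counts


def estimate_rows(histogram, value):
    """Estimated rows for `column = value`, read straight from the bucket."""
    return histogram.get(value, 0)


# Example 2 data: 1,000,000 rows with value 1 and one row with value 2.
column_a = chain(repeat(1, 1_000_000), [2])
histogram = build_frequency_histogram(column_a)

print(estimate_rows(histogram, 1))
print(estimate_rows(histogram, 2))
```

With per-value counts available, the estimate for a=2 drops from about 500,000 to 1, which is what lets the optimizer prefer the range scan.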

Step 3: Choose the best strategy for fixing the cardinality problem
Run the gather_table_stats command to get histograms on the column – 254 is the max number of buckets
    method_opt=>'FOR ALL COLUMNS SIZE 254'

Step 3: Choose the best strategy for fixing the cardinality problem
Plan for Example 2 with correct number of rows with histogram
Uses range scan
    | Operation        | Name       | Rows |
    | SELECT STATEMENT |            |    1 |
    | INDEX RANGE SCAN | TEST2INDEX |    1 |

Step 3: Choose the best strategy for fixing the cardinality problem
Column statistics – two buckets
    LOW HIGH NUM_DISTINCT NUM_BUCKETS
    1   2    2            2
Table statistic – unchanged
    NUM_ROWS
    1000001

Step 3: Choose the best strategy for fixing the cardinality problem
Time without histograms (1 second):
    Elapsed: 00:00:01.00
Time with histograms (1/100th second):
    Elapsed: 00:00:00.01

Step 3: Choose the best strategy for fixing the cardinality problem
Giving the optimizer more information – using SQL Profiles
Works for unequal distribution among multiple columns
Includes information about the relationship between columns in the SQL – correlated columns or predicates

Step 3: Choose the best strategy for fixing the cardinality problem
SQL Tuning Advisor gathers statistics on the columns
    ...DBMS_SQLTUNE.CREATE_TUNING_TASK(...
    ...DBMS_SQLTUNE.EXECUTE_TUNING_TASK(...
Accept the SQL Profile it creates to use the new statistics
    ...DBMS_SQLTUNE.ACCEPT_SQL_PROFILE (...

Step 3: Choose the best strategy for fixing the cardinality problem
Example 3 plan with correct number of rows = 1 using SQL profile
    | Operation         | Name       | Rows | Bytes |
    | SELECT STATEMENT  |            |    1 |     6 |
    | SORT AGGREGATE    |            |    1 |     6 |
    | INDEX RANGE SCAN  | TEST3INDEX |    1 |     6 |

Step 3: Choose the best strategy for fixing the cardinality problem
Time without a profile (1 second):
    Elapsed: 00:00:01.09
Time with a profile (1/100th second):
    Elapsed: 00:00:00.01

Step 3: Choose the best strategy for fixing the cardinality problem
Overriding optimizer decisions – using hints
Example 4 has unequal distribution of column values across two tables – histograms and SQL Profiles don't work
Hint forces index range scan
Small amount of additional code – not like COBOL on the mainframe

Step 3: Choose the best strategy for fixing the cardinality problem
Example 4 - SMALL table
MANY relates to 1 – there are many rows with value 1
FEW relates to 2 – there are few with value 2
    insert into SMALL values ('MANY',1);
    insert into SMALL values ('FEW',2);

Step 3: Choose the best strategy for fixing the cardinality problem
Example 4 - LARGE table:
1,000,000 rows with value 1
1 row with value 2
    NUM COUNT(*)
    1    1000000
    2          1

Step 3: Choose the best strategy for fixing the cardinality problem
SQL statement – returns one row
    select B.NUM
    from SMALL A,LARGE B
    where A.NUM=B.NUM and A.NAME='FEW';

Step 3: Choose the best strategy for fixing the cardinality problem
Plan with wrong number of rows = 125,000
    | Operation             | Name       | Rows  |
    | SELECT STATEMENT      |            | 125K  |
    | HASH JOIN             |            | 125K  |
    | TABLE ACCESS FULL     | SMALL      | 1     |
    | INDEX FAST FULL SCAN  | LARGEINDEX | 1000K |

Step 3: Choose the best strategy for fixing the cardinality problem
Column statistics – two buckets on all columns – using histograms
    LOW HIGH NUM_DISTINCT NUM_BUCKETS
    1   2    2            2
    LOW HIGH NUM_DISTINCT NUM_BUCKETS
    FEW MANY 2            2

Step 3: Choose the best strategy for fixing the cardinality problem
Table statistics – SMALL has 2 rows, LARGE has 1,000,001
    NUM_ROWS
    2
    NUM_ROWS
    1000001

Step 3: Choose the best strategy for fixing the cardinality problem
125,000 ≈ 1,000,001 / 8
The optimizer appears to assume that all eight combinations of the three columns' values are equally likely
Can't verify the formula – the references don't include a formula with histograms
The estimate is even worse without histograms

Step 3: Choose the best strategy for fixing the cardinality problem
No SQL profile from SQL Tuning Advisor:
    There are no recommendations to improve the statement.
Neither histograms nor SQL profiles help Example 4

Step 3: Choose the best strategy for fixing the cardinality problem
Statement with hints:
    Use index
    Don't do full scan
    select /*+ INDEX(B LARGEINDEX) NO_INDEX_FFS(B LARGEINDEX) */ B.NUM
    from SMALL A,LARGE B
    where A.NUM=B.NUM and A.NAME='FEW';

Step 3: Choose the best strategy for fixing the cardinality problem
Time without a hint (1 second):
    Elapsed: 00:00:01.03
Time with a hint (1/100th second):
    Elapsed: 00:00:00.01

Step 3: Choose the best strategy for fixing the cardinality problem
Changing the application
Change your tables so that the optimizer gets your SQL's cardinality right
Requires more work designing tables, but keeps the productivity benefits of SQL

Step 3: Choose the best strategy for fixing the cardinality problem
Example 4 – moved NAME column to LARGE table and split table in two
One million (MANY,1) rows in LARGEA
One (FEW,2) row in LARGEB
Query:
    select NUM
    from (select * from largea union select * from largeb)
    where NAME='FEW';
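That the redesigned tables answer the same question as the original join can be checked on a scaled-down copy of Example 4. The sketch assumes Python's sqlite3 module in place of Oracle and 1,000 rows instead of one million; the column types are guesses, and no indexes are created, so it demonstrates only that the results match, not the plan or the timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table small (name text, num integer);
    create table large (num integer);
    create table largea (name text, num integer);
    create table largeb (name text, num integer);
    insert into small values ('MANY', 1);
    insert into small values ('FEW', 2);
    insert into large values (2);
    insert into largeb values ('FEW', 2);
""")
# Scaled down: 1,000 rows stand in for the one million rows with value 1.
conn.executemany("insert into large values (?)", [(1,)] * 1000)
conn.executemany("insert into largea values (?, ?)", [("MANY", 1)] * 1000)

# Original design: join SMALL to LARGE.
original = conn.execute(
    "select B.NUM from SMALL A, LARGE B "
    "where A.NUM = B.NUM and A.NAME = 'FEW'"
).fetchall()

# Redesign from the slide: NAME lives in the split tables LARGEA and LARGEB.
redesigned = conn.execute(
    "select NUM from (select * from largea union select * from largeb) "
    "where NAME = 'FEW'"
).fetchall()

print(original, redesigned)
```

Both queries return the single row with NUM 2. In the redesign each table holds only one kind of row, so per-table statistics describe the data accurately.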

Step 3: Choose the best strategy for fixing the cardinality problem
Plan is just as efficient as with hint:
    Number of rows = 2 (reality is 1)
    Range scan
    | Id | Operation                    | Name        | Rows |
    |  0 | SELECT STATEMENT             |             |    2 |
    |  1 | VIEW                         |             |    2 |
    |  2 | SORT UNIQUE                  |             |    2 |
    |  3 | UNION-ALL                    |             |      |
    |  4 | TABLE ACCESS BY INDEX ROWID  | LARGEA      |    1 |
    |* 5 | INDEX RANGE SCAN             | LARGEAINDEX |    1 |
    |  6 | TABLE ACCESS BY INDEX ROWID  | LARGEB      |    1 |
    |* 7 | INDEX RANGE SCAN             | LARGEBINDEX |    1 |

Step 3: Choose the best strategy for fixing the cardinality problem
Time without table change (1 second):
    Elapsed: 00:00:01.03
Time with table change (1/100th second):
    Elapsed: 00:00:00.01

Conclusion
SQL improves productivity, but the optimizer has limits
Identify cases where cardinality is wrong
Understand why the database got it wrong
    One column
    Multiple columns
Choose the best strategy to fix it
    Give the optimizer more info
    Override the optimizer's choices
    Redesign tables

References
Cost-Based Oracle Fundamentals, Jonathan Lewis
Metalink Note: Limitations of the Oracle Cost Based Optimizer
Metalink Note: Predicate Selectivity
Histograms – Myths and Facts, Wolfgang Breitling, Select Journal, Volume 13, Number 3