SQL Server Query Optimizer Cost Formulas Joe Chang



Scope - Query Optimizer
- Parse SQL
- Execution Plans
- Cost Model: rows and pages in each operation
- Data Distribution Statistics: estimate rows and pages
Sources: David Dewitt, Conor Cunningham

Query Optimizer References
- Conor Cunningham: chapter in Inside SQL Server; Conor vs. SQL
- David Dewitt: PASS 2010 Summit Keynote (search: Microsoft Jim Gray Systems Lab dewitt/download)

My material stBasedOptimizer.html ml

Paul White – Page Free Space
Inside the Optimiser: Constructing a Plan – Part 4 e-the-optimiser-constructing-a-plan-part-4.aspx
DBCC RULEON/RULEOFF
Inside the Optimizer: Plan Costing 9/01/inside-the-optimizer-plan-costing.aspx
DBCC TRACEON (3604); -- Show DBCC output
DBCC SETCPUWEIGHT(1E0); -- Default CPU weight
DBCC SETIOWEIGHT(0.6E0); -- I/O multiplier = 0.6
DBCC SHOWWEIGHTS; -- Show the settings

Execution Plan Cost Model
- Index Seek + Key Lookup vs. Table Scan
- Joins: Loop, Hash, Merge
- Updates (includes Insert & Delete): really complicated, not covered here
- Parallel Execution Plans

Why Is This Useful?
- When does the query optimizer (QO) use an index versus a table scan?
- When does it use a Loop Join, or Hash/Merge with a scan?
- Is there a difference between the Cost Model and the true cost structure?
- Should I use query hints?
- Parallel execution strategy on modern servers with 64+ cores

SQL Server Books Online
Query Governor Cost Limit: "Query cost refers to the estimated elapsed time, in seconds, required to complete a query on a specific hardware configuration."
Cost Threshold for Parallelism: "The cost refers to an estimated elapsed time in seconds required to run the serial plan on a specific hardware configuration. … SQL Server creates and runs a parallel plan for a query only when the estimated cost to run a serial plan for the same query is higher than the value set in cost threshold for parallelism."

Adventure Works Example

Estimated Execution Plan

Clustered Index Scan

Index Seek

Index Seek + Key Lookup

Heap Table

Heap Operations

The Formula – Seek, Scan
(Clustered) Index Scan, Table Scan, Index Seek: IO cost per page, CPU cost per row

Key Lookup (& Loop Join) Key/RID Lookup, Nested Loops Join IO Cost x % that require Lookup CPU Cost per Lookup per additional rows

IO Cost Model: Sequential vs. Random
Cost is elapsed time in seconds
Random = 1/320 (320 IOPS)
Sequential … = 1/1350 (1350 pages/sec, or 10.8MB/s)
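The two constants above can be turned into a small calculator. The following is a minimal Python sketch, not the optimizer's actual code: the function names are hypothetical, and it assumes (the slide does not say so) that a scan is charged one random page followed by sequential pages.

```python
from fractions import Fraction

# Constants quoted on the slide: cost is expressed as elapsed seconds.
RANDOM_IO_COST = Fraction(1, 320)       # one random page read (320 IOPS)
SEQUENTIAL_IO_COST = Fraction(1, 1350)  # one sequential page read (1350 pages/sec)
PAGE_KB = 8                             # SQL Server data page size in KB


def scan_io_cost(pages):
    """IO cost of scanning `pages` pages.

    Assumption (not stated on the slide): the first page is charged at the
    random rate and every further page at the sequential rate.
    """
    if pages <= 0:
        return 0.0
    return float(RANDOM_IO_COST + (pages - 1) * SEQUENTIAL_IO_COST)


def sequential_mb_per_sec():
    """Throughput implied by the sequential constant, using 1 MB = 1000 KB."""
    return 1350 * PAGE_KB / 1000


if __name__ == "__main__":
    print(f"random page:     {float(RANDOM_IO_COST):.6f}")
    print(f"sequential page: {float(SEQUENTIAL_IO_COST):.6f}")
    print(f"10,000-page scan IO cost: {scan_io_cost(10_000):.4f}")
    print(f"implied sequential rate: {sequential_mb_per_sec()} MB/s")
```

Using exact fractions keeps the constants readable as 1/320 and 1/1350 and avoids accumulating rounding error over large page counts.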

Key Lookup – Scan Cross-over
Ratio of pages scanned to Key Lookup (KL) rows
- 1 Key Lookup costs approximately as much as 4 pages in a scan operation
- Non-parallel plan, with other costs: cross-over at approx. 3.5 pages per KL row
- Parallel plan: closer to 4 pages per Key Lookup row
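The "approximately 4 pages" figure follows from the IO constants alone. Here is a hedged Python sketch that compares IO cost only; it ignores CPU cost and every other plan component, which is why it gives about 4.2 rather than the 3.5 to 4 observed in full plans. The function names are made up for illustration.

```python
RANDOM_IO_COST = 1 / 320       # cost of one key lookup's random page read
SEQUENTIAL_IO_COST = 1 / 1350  # cost of one page in a scan


def pages_per_key_lookup():
    """How many scanned pages cost the same as one key lookup (IO only)."""
    return RANDOM_IO_COST / SEQUENTIAL_IO_COST


def cheaper_plan(lookup_rows, table_pages):
    """Pick the cheaper access path, comparing IO cost only (an assumption)."""
    lookup_cost = lookup_rows * RANDOM_IO_COST
    scan_cost = table_pages * SEQUENTIAL_IO_COST
    return "key lookup" if lookup_cost < scan_cost else "table scan"


if __name__ == "__main__":
    print(f"pages per key lookup: {pages_per_key_lookup():.2f}")
    for rows in (1_000, 2_000, 3_000):
        print(rows, "rows on a 10,000-page table:", cheaper_plan(rows, 10_000))
```

For a 10,000-page table, the IO-only cross-over sits near 10,000 / 4.2, or roughly 2,370 lookup rows.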

Loop, Hash and Merge Joins

L H M

Sort

Loop, Hash and Merge
Cost (Fixed, Incremental)
Loop: ~ Seek + seek cost: IO, CPU
Hash: ~ *
Merge: ~ †
Many-to-Many Merge
Sort: ~ *
* Hash incremental cost depends on inner/outer source size
† Merge join incremental cost is per inner-source (IS) and outer-source (OS) row?
Merge + Sort fixed cost is approximately the same as the Hash fixed cost

Loop, Hash, Merge
Cost     Fixed     Incremental
Loop     Zero      High
Hash     High      Medium
Merge    Medium    Low
Merge Join requires the rows from both sources to be in index-sorted order.
Regular Merge only for 1-1 or 1-many
Many-to-many merge join is more expensive

Plan Cross-over Theory
[Figure: Cost versus Rows for Index Seek + Key Lookup and for Table Scan]

Theory & Actual?
[Figure: Cost versus Rows; curves labeled KL Theory, Table Scan, KL alternate reality?, KL Actual!]

Plan and Actual IO
              Random    Sequential           Ratio
Plan          320       1,350 (10.8MB/s)
Current HD    200*      12,800 (100MB/s)     64*
SAN           200       1,280 (10MB/s)       6.4
SSD           20,000    25,000               ~1
*Note: original slide incorrectly listed 640:1
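The Ratio column is sequential pages per second divided by random IOPS. This short Python sketch recomputes it from the table's own figures; the dictionary layout and function name are illustrative only.

```python
# (random IOPS, sequential pages/sec) as listed in the table above
io_profiles = {
    "Plan": (320, 1350),
    "Current HD": (200, 12800),
    "SAN": (200, 1280),
    "SSD": (20000, 25000),
}


def seq_to_random_ratio(random_iops, sequential_pages_per_sec):
    """Sequential pages that can be read in the time of one random IO."""
    return sequential_pages_per_sec / random_iops


if __name__ == "__main__":
    for name, (rnd, seq) in io_profiles.items():
        print(f"{name:<11}{seq_to_random_ratio(rnd, seq):7.2f}")
```

The cost model's ratio of about 4.2 sits far from the hard drive's 64 and well above the SSD's ratio near 1, so the planned cross-over point may not match where the true cross-over falls on a given storage system.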

Loop, Hash & Merge

Loop Join

Merge Join

Hash Join

Insert, Update & Delete
Really complicated; see material from Conor
For a large number of rows (25%?), consider dropping indexes

Delete Rows
Index foreign keys when deletes from the primary table are frequent

Parallel Execution Plans Parallel Execution Parallelism Gather, Repartition, Distribute Streams, Partitions

Parallel Execution Plan

Parallel Operations
- Distribute Streams: non-parallel source, parallel destination
- Repartition Streams: parallel source and destination
- Gather Streams: destination is non-parallel
- Bitmap

Scan

[Figure: scan plan cost at DOP 1, DOP 2, DOP 4 and DOP 8]
IO cost is the same at every DOP. CPU cost is reduced by the degree of parallelism (2X, 4X, 8X), except that there is no further reduction for DOP 16. IO contributes most of the cost!

[Figure: DOP 16 and DOP 8]

IO cost is the same. CPU cost is reduced in proportion to the degree of parallelism (last 2X excluded?).
On a weak storage system, a single thread can saturate the IO channel; additional threads will not increase IO throughput (or reduce IO duration). A very powerful storage system can provide IO in proportion to the number of threads. It might be nice if this were an optimizer option.
The IO component can be a very large portion of the overall plan cost. Not reducing the IO cost in a parallel plan may inhibit generating a favorable plan, i.e., the CPU reduction is not sufficient to offset the cost contributed by the Parallelism operations.
A parallel execution plan is more likely on larger systems (-P to fake it?)
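The costing rule described here (IO unchanged, CPU divided by the degree of parallelism, with no further reduction at DOP 16) can be sketched in Python as follows. This is an illustration under stated assumptions: the cap of 8 is taken from the observation on these slides and will differ by system, the cost of the Parallelism operators is left out, and the function and parameter names are hypothetical.

```python
def parallel_scan_cost(io_cost, cpu_cost, dop, max_cpu_divisor=8):
    """Estimated scan cost at a given degree of parallelism.

    io_cost is not reduced at all; cpu_cost is divided by the DOP, capped at
    max_cpu_divisor (assumed to be 8 here, matching the slides' observation
    that DOP 16 costs the same as DOP 8).
    """
    divisor = max(1, min(dop, max_cpu_divisor))
    return io_cost + cpu_cost / divisor


if __name__ == "__main__":
    # Made-up serial costs where IO dominates, as the slides describe.
    io_cost, cpu_cost = 10.0, 4.0
    for dop in (1, 2, 4, 8, 16):
        print(f"DOP {dop:>2}: {parallel_scan_cost(io_cost, cpu_cost, dop):.2f}")
```

With these made-up inputs the total only falls from 14.0 to 10.5, which shows why an IO-dominated plan may not drop far enough to pay for the added Parallelism operators.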

Partitioned Tables Regular Table Partitioned Tables No Repartition Streams operations!

Parallel Execution: Super Scaling
Suppose at DOP 1 a query runs for 100 seconds with one CPU fully pegged: CPU time = 100 sec, elapsed time = 100 sec.
What is the best case for DOP 2? Assuming nearly zero Repartition Streams cost: CPU time = 100 sec, elapsed time = 50 sec?
Super Scaling: CPU time decreases going from a non-parallel to a parallel plan!
No, I have not been drinking, today, yet

Super Scaling
CPU-sec goes down from DOP 1 to 2 and higher (typically 8)
[Charts: CPU normalized to DOP 1; speedup relative to DOP 1]
3.5X speedup from DOP 1 to 2
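The arithmetic behind super scaling can be made explicit. In this Python sketch the labels and the function name are my own shorthand, not terms defined by the optimizer. It relies on one fact: at DOP 2, total CPU time cannot exceed twice the elapsed time, so a 3.5X speedup from a fully CPU-bound 100-second serial run implies total CPU time below 100 seconds.

```python
def scaling_summary(serial_elapsed, parallel_elapsed, serial_cpu, parallel_cpu):
    """Return (speedup, cpu_ratio, label) for a parallel run versus DOP 1."""
    speedup = serial_elapsed / parallel_elapsed
    cpu_ratio = parallel_cpu / serial_cpu
    if cpu_ratio < 1.0:
        label = "super scaling"      # total CPU time went down
    elif speedup < 1.0:
        label = "negative scaling"   # the parallel run was slower
    else:
        label = "ordinary scaling"
    return speedup, cpu_ratio, label


if __name__ == "__main__":
    # Ideal DOP 2 case from the slide: same CPU time, half the elapsed time.
    print(scaling_summary(100.0, 50.0, 100.0, 100.0))

    # 3.5X speedup at DOP 2: CPU time is at most 2 threads x elapsed time.
    elapsed = 100.0 / 3.5
    max_cpu = 2 * elapsed
    print(scaling_summary(100.0, elapsed, 100.0, max_cpu))
```

In the second case the CPU ratio is at most about 0.57, so some of the work done at DOP 1 is simply not being done in the parallel plan.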

Most probable cause: the Bitmap operator in the parallel plan
Bitmap filters are great.
Question for Microsoft: Can I use Bitmap Filters in OLTP systems with non-parallel plans?

Negative Scaling
[Charts: query time; "speedup"]

CPU

Small Queries – Plan Cost vs. Actual
Queries 3 and 16 have a lower plan cost than Q17, but are not included.
Q4, Q6 and Q17 scale very well to DOP 4, then weakly.
Negative scaling also occurs.
[Charts: query time; plan cost]

What did I get for all that extra CPU?
Interpretation: a sharp jump in CPU means poor scaling; a disproportionate jump means negative scaling.
Query 2 is negative at DOP 2; Q4 is good; Q6 gets a speedup, but at a CPU premium; Q17 and Q20 are negative after DOP 8.
[Charts: CPU time; speedup]

Parallel Execution – Small Queries
Why? Almost no value
OLTP with 32, 64+ cores
Parallelism is good if super-scaling
Default max degree of parallelism is 0: seriously bad news, especially for small queries
Increase cost threshold for parallelism?

Parallel Settings - Strategy
Mostly for OLTP
Cost Threshold for Parallelism
- Default: Plan Cost > 5. Proposed:
- In 1997, Pentium Pro 200MHz: ~5 sec for a 50MB table (index range) scan
- Today, Xeon 5680, 3.3GHz: ~30X faster
- Parallel plan could run in milliseconds
Max Degree of Parallelism (OLTP)
- Default: 0, unrestricted. Proposed: 2-4
- Use OPTION (MAXDOP n)
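To see why the default threshold of 5 may be low for modern hardware, the slide's figures can be combined in a rough Python calculation. It assumes that one cost unit equalled about one second on the 1997 reference machine, that today's CPU is about 30X faster (the slide's estimate), and that the parallel plan scales linearly. All three are simplifications, and the function names are hypothetical.

```python
def modern_serial_seconds(plan_cost, speedup_vs_1997=30.0):
    """Rough serial run time today for a plan of the given cost."""
    return plan_cost / speedup_vs_1997


def modern_parallel_ms(plan_cost, dop, speedup_vs_1997=30.0):
    """Rough parallel run time in milliseconds, assuming ideal linear scaling."""
    return 1000.0 * modern_serial_seconds(plan_cost, speedup_vs_1997) / dop


if __name__ == "__main__":
    cost = 5  # default cost threshold for parallelism
    print(f"serial today:   {modern_serial_seconds(cost) * 1000:.0f} ms")
    for dop in (2, 4, 8):
        print(f"DOP {dop} (ideal): {modern_parallel_ms(cost, dop):.1f} ms")
```

Under these assumptions a plan right at the threshold runs in well under a second serially, so a parallel plan saves only tens of milliseconds while occupying several cores.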

Too Many Indexes
- Complicates query optimization: too many possible execution plans
- Large updates, maintenance: consider dropping indexes

Parameters and Variables
Unknown, remote source
- Remote Scan: 10,000 rows
- Remote Seek: xxx rows
Unknown >, <, BETWEEN
- > or <: 30% of rows
- BETWEEN: 1/10 of rows
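The fixed guesses for unknown values translate directly into row estimates. This small Python sketch uses only the fractions given on the slide; the function name and the way operators are passed in are illustrative, and the Remote Seek figure is left out because the slide does not give it.

```python
# Fractions and row counts as listed on the slide.
INEQUALITY_FRACTION = 0.30  # unknown > or <
BETWEEN_FRACTION = 0.10     # unknown BETWEEN
REMOTE_SCAN_ROWS = 10_000   # unknown remote source, scan


def guessed_rows(table_rows, operator):
    """Row estimate for a predicate whose comparison value is unknown."""
    op = operator.strip().upper()
    if op in (">", "<"):
        return table_rows * INEQUALITY_FRACTION
    if op == "BETWEEN":
        return table_rows * BETWEEN_FRACTION
    raise ValueError(f"no guess listed for operator {operator!r}")


if __name__ == "__main__":
    for op in (">", "<", "BETWEEN"):
        print(f"{op:>7}: {round(guessed_rows(1_000_000, op)):,} rows")
    print(f"remote scan: {REMOTE_SCAN_ROWS:,} rows")
```

Because the estimate depends only on the table size, a highly selective predicate on a variable can still be costed as if it returned 30% of the table.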

Temp tables and Table Variables