Parallel Execution Plans Joe Chang


Parallel Execution Plans
Allows a single query to use multiple processors
Query should run faster but may consume more resources
Example:
1 thread: 10 sec run time, 10 CPU-sec
2 threads: 6 sec run time, 12 CPU-sec
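
The trade-off in this example can be made explicit with a little arithmetic. The query below is only an illustrative sketch: the inputs are the figures quoted on the slide, and the column aliases are invented for this note.

```sql
-- Figures from the slide:
--   1 thread : 10 sec elapsed, 10 CPU-sec
--   2 threads:  6 sec elapsed, 12 CPU-sec
SELECT
    10.0 / 6.0        AS speedup,              -- about 1.67x shorter elapsed time
    (10.0 / 6.0) / 2  AS parallel_efficiency,  -- about 0.83 of ideal 2x scaling
    12.0 / 10.0       AS cpu_cost_ratio;       -- 1.2, i.e. 20% more CPU consumed
```

The elapsed time improves, but total CPU work goes up, which is why parallelism can reduce overall throughput on a busy system.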

Parallel Execution Configuration
Cost Threshold For Parallelism: minimum estimated plan cost at which a query is considered for parallel execution. Default 5; consider increasing for new systems.
Max Degree of Parallelism: default 0, meaning a query can use all available processors. SQL Server determines the actual level based on available memory and recent CPU usage.
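
Both settings are server-wide options that can be inspected and changed with sp_configure. The following is a minimal sketch, not a recommendation: the threshold value of 20 is an arbitrary placeholder chosen for illustration, while the MAXDOP value of 4 matches the limit suggested in the General Recommendations slide.

```sql
-- Both settings are advanced options, so expose them first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Show the current values.
EXEC sp_configure 'cost threshold for parallelism';
EXEC sp_configure 'max degree of parallelism';

-- Change them. The 20 is an arbitrary placeholder, NOT a value taken from
-- the slides; the 4 is the MAXDOP limit suggested later in the deck.
EXEC sp_configure 'cost threshold for parallelism', 20;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;
```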

Parallel Plan Operators
The Distribute Streams operator consumes a single input stream of records and produces multiple output streams. The record contents and format are not changed. Each record from the input stream appears in one of the output streams. This operator automatically preserves the relative order of the input records in the output streams. Usually, hashing is used to decide to which output stream a particular input record belongs.
The Repartition Streams operator consumes multiple streams and produces multiple streams of records. The record contents and format are not changed. Each record from an input stream is placed into one output stream. If this operator is order-preserving, then all input streams must be ordered and merged into several ordered output streams.
The Gather Streams operator consumes several input streams and produces a single output stream of records by combining the input streams. The record contents and format are not changed. If this operator is order-preserving, then all input streams must be ordered.

Execution Plan Cost Formulas
Table Scan or Index Scan
I/O: per page
CPU: per row
Index Seek – Plan Formula
I/O Cost = per additional page (≤1GB)
= per additional page (>1GB)
CPU Cost = per additional row
Bookmark Lookup – May have changed?
I/O Cost = multiple of (≤1GB)
= multiple of (>1GB)
CPU Cost = per row
Table Scan or Index Scan
IUD I/O Cost ~ – (>100 rows)
IUD CPU Cost = per row

Cost Interpretation
Time in seconds? CPU time?
sec -> 160/sec
>1350/sec (8KB) -> 169/sec (64K) -> 10.8MB/sec
SQL Server 2000 BOL: Administering SQL Server, Managing Servers, Setting Configuration Options: cost threshold for parallelism Option
Query cost refers to the estimated elapsed time, in seconds, required to execute a query on a specific hardware configuration.
Too fast for 7200RPM disk random I/Os.
About right for 1997 sequential disk transfer rate?
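
The rates on this slide can be cross-checked with simple arithmetic. The sketch below assumes the cost unit really is elapsed seconds, so each rate is the reciprocal of a per-operation cost, and it uses 1 MB = 1000 KB; the column aliases are invented for this note.

```sql
SELECT
    1.0 / 160          AS sec_per_io_at_160,    -- 0.00625 sec per I/O at 160/sec
    1.0 / 1350         AS sec_per_page_at_1350, -- about 0.00074 sec per 8KB page
    1350 * 8 / 1000.0  AS mb_per_sec_8kb,       -- 10.8 MB/sec from 8KB pages
    169 * 64 / 1000.0  AS mb_per_sec_64kb;      -- about 10.8 MB/sec from 64KB reads
```

Both the 8KB and the 64KB figures work out to the same transfer rate of roughly 10.8 MB/sec, which is the number the slide compares against disk capabilities.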

Test Table
CREATE TABLE M3A_20 (
    GroupID int NOT NULL,
    ID int NOT NULL,
    ID2 int NOT NULL,
    ID3 int NOT NULL,
    ID4 int NOT NULL,
    sID smallint NOT NULL,
    bID1 bigint NOT NULL,
    bID2 bigint NOT NULL,
    bID3 bigint NOT NULL,
    rMoney money NOT NULL,
    rDate datetime NOT NULL,
    rReal real NOT NULL,
    rDecimal decimal (9,4) NOT NULL,
    CONSTRAINT [PK_M3A_20] PRIMARY KEY CLUSTERED ( [GroupID], [ID] ) WITH FILLFACTOR = 100
)
GO

Data Population Script 1
SET NOCOUNT ON
SELECT SELECT BEGIN BEGIN TRANSACTION = BEGIN
INSERT M3A_20 (GroupID, ID, ID2, ID3, ID4, sID, bID1, bID2, bID3, rMoney, rDate, rReal, rDecimal)
VALUES ( *rand(), 10000*rand(), 10000*rand() )
IF > 0 BEGIN GOTO B END
END
COMMIT TRANSACTION
CHECKPOINT
PRINT CONVERT(varchar,GETDATE(),121) + ', row ' +
END
B: IF > 0 COMMIT TRANSACTION
PRINT '01 Complete ' + CONVERT(varchar,GETDATE(),121) + ', row ' + + ', Trancount ' +

Data Population Script 1 Notes
Double WHILE loop
Each Insert/Update/Delete statement is an implicit transaction and gets a separate transaction log entry
An explicit transaction generates a single transaction log write (max 64KB per I/O)
A single transaction for the entire loop requires an excessively large log file
Inserts are therefore grouped into intermediate-size batches
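
The double-loop batching pattern described in these notes can be sketched as follows. This is not the original script: the variable names (@BatchSize, @TotalRows, @Row, @BatchEnd), the label name, the batch size, the row count, and the inserted values are all assumptions made for illustration. Only the table definition and the overall structure come from the slides.

```sql
SET NOCOUNT ON;

-- Hypothetical variables and sizes; not the original script's values.
DECLARE @BatchSize int, @TotalRows int, @Row int, @BatchEnd int;
SELECT @BatchSize = 1000, @TotalRows = 100000, @Row = 1;

WHILE @Row <= @TotalRows                 -- outer loop: one transaction per batch
BEGIN
    BEGIN TRANSACTION;
    SET @BatchEnd = @Row + @BatchSize - 1;

    WHILE @Row <= @BatchEnd AND @Row <= @TotalRows   -- inner loop: single-row inserts
    BEGIN
        INSERT M3A_20 (GroupID, ID, ID2, ID3, ID4, sID, bID1, bID2, bID3,
                       rMoney, rDate, rReal, rDecimal)
        VALUES (1, @Row, @Row, @Row, @Row, 1, @Row, @Row, @Row,
                10000*RAND(), GETDATE(), 10000*RAND(), 10000*RAND());

        IF @@ERROR > 0 GOTO Done;        -- leave both loops on the first error
        SET @Row = @Row + 1;
    END

    COMMIT TRANSACTION;                  -- commit once per batch, not once per row
    CHECKPOINT;
END

Done:
IF @@TRANCOUNT > 0 COMMIT TRANSACTION;   -- close any transaction left open by GOTO
```

The batch size is the tuning knob: larger batches mean fewer commits, while smaller batches keep the active portion of the log from growing too large.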

Data Population Scripts 2
int = 1 <= 3 BEGIN
INSERT M3A_11 (GroupID, ID, ID2, ID3, ID4, sID, bID1, bID2, bID3, rMoney, rDate, rReal, rDecimal)
SELECT TOP GroupID, ID, ID, ID3, ID4, sID, bID1, bID2, bID3, rMoney, rDate, rReal, rDecimal
FROM M3A_20
WHERE GroupID = 1 AND ID BETWEEN + 1
CHECKPOINT
PRINT '11 Step ' + + ', ' + CONVERT(varchar,GETDATE(),121)
END
UPDATE STATISTICS M3A_01 (PK_M3A_01) WITH FULLSCAN
CREATE STATISTICS ST_01 ON M3A_01 (ID) WITH FULLSCAN, NORECOMPUTE
Primary table populated using single-row inserts in a WHILE loop; additional tables populated with INSERT / SELECT statements
Single-row inserts: ~20-30K rows/sec
INSERT / SELECT statement: ~100K+ rows/sec

Index Seek Plans Many rows returned, Non-parallel plan Parallel Execution disabled Cost: 9.34 Cost: 9.82 Cost: 4.94 Parallel Plan
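
One way to reproduce this kind of comparison is to request the estimated plan for the same statement with and without a MAXDOP 1 hint. The query below is a hypothetical stand-in, since the deck does not show the exact test query; only the table name and columns come from the Test Table slide.

```sql
-- Return estimated plans (with EstimateIO, EstimateCPU and TotalSubtreeCost
-- columns) instead of executing the statements.
SET SHOWPLAN_ALL ON;
GO

-- Hypothetical range query. The optimizer may pick a parallel plan when the
-- estimated cost exceeds the cost threshold for parallelism.
SELECT ID, rMoney
FROM M3A_20
WHERE GroupID = 1 AND ID BETWEEN 1 AND 1000000;
GO

-- Same query with parallelism disabled for this statement only.
SELECT ID, rMoney
FROM M3A_20
WHERE GroupID = 1 AND ID BETWEEN 1 AND 1000000
OPTION (MAXDOP 1);
GO

SET SHOWPLAN_ALL OFF;
GO
```

Comparing the TotalSubtreeCost of the top-level rows in the two result sets shows how the optimizer prices the parallel plan against the non-parallel one.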

Index Seek Details Non-parallel plan Parallel plan

Index Seek – Non-parallel Cost assigned to SELECT Index Seek, 1M rows in 11,115 pages (81 bytes/row, 90% Fill) I/O cost is: CPU Cost is Cost & sub-tree Cost is correct, I/O & CPU is ½ of correct value
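
The page count quoted here is consistent with the stated row size and fill level. The check below is a rough sketch that assumes 8,096 bytes of each 8KB page are available for rows after the 96-byte page header; the column aliases are invented for this note.

```sql
SELECT
    1000000.0 / 11115              AS rows_per_page,  -- about 90 rows per page
    1000000.0 / 11115 * 81         AS bytes_per_page, -- about 7,287 bytes used per page
    1000000.0 / 11115 * 81 / 8096  AS fill_fraction;  -- about 0.90
```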

Index Seek – Parallel Plan No cost assigned to SELECT Index Seek I/O and CPU cost ½ of non-parallel plan

Index Seek with Aggregate 1234

Index Seek Aggregate Parallel Plan Details

Table Scan Cost: 9.01 Cost: 8.26

Table Scan Details Non-parallel plan Parallel plan I/O cost same CPU cost ½ of non parallel plan

Table Scan Details Non-parallel plan Parallel plan No cost on Select No cost I/O cost same CPU cost ½ of non parallel plan

Parallel Plan Cost Formulas
Patterns:
CPU costs are ½ of the non-parallel plan
Index Seek I/O costs are also ½
Scan I/O cost is the same as the non-parallel plan
Parallel plan costs are based on 2 processors
Actual number of processors is determined at runtime
Overhead operations: Distribute, Repartition & Gather Streams

Hash Join Cost: 6.50 Cost: ,000 rows 15 byte OS row size

Hash Join Details Non-parallel plan Parallel plan

Hash Join Details Non-parallel plan Parallel plan

Hash Join – Non-parallel plan

Hash Join – Parallel Plan

Hash Join with I/O Cost 900,000 rows MAXDOP 1 Cost 74.1 Cost 85.1

Hash Join – Join I/O Cost 730,000 rows 740,000 rows

Hash Join - Bitmap

Hash Join Cost Formula Index Seek – Plan Formula I/O Cost = per additional page (≤1GB) = per additional page (>1GB) CPU Cost = per additional row Hash Join CPU Cost = base (2-30 rows) (100 rows) per row (parallel) per row per 4 bytes in OS per additional row in IS I/O Cost = per row over 64MB (Row Size+8) per 4 byte over 15B

Parallel Cost Formula Base Cost Repartition Stream Cost per row = Base (15 Bytes) per additional 4 Bytes Gather Stream Cost per row = Base(15) per additional 4 Bytes Dispatch

Loop Join

Loop Join Details Non-parallel plan Outer Source Parallel plan Outer Source

Loop Join Details Inner Source cost identical for both non-parallel and parallel plans

Loop Join Details Non-parallel plan Parallel plan

Merge Join

Merge Join Details Non-parallel plan Parallel plan

Merge Join Details Non-parallel plan Parallel plan

Merge Join Details Non-parallel plan Parallel plan

Index Seek + Aggregate Test
Opteron 2.2GHz/1M, Xeon 2.4GHz/512K

Index Seek + Aggregate Test, Itanium 2 Itanium 2 1.5GHz/6M

Index Seek + Aggregate Test, SUM(INT) Itanium 2 1.5GHz/6M

Index Seek + Aggregate Test, NULL Itanium 2 1.5GHz/6M

Loop Join, COUNT(*) Itanium 2 1.5GHz/6M

Hash Join, COUNT(*) Itanium 2 1.5GHz/6M

Merge Join, COUNT(*) Itanium 2 1.5GHz/6M

General Recommendations
Useful in DW, ETL, and maintenance activities
Use judgment for transaction processing: is throughput more important, or faster execution of expensive queries?
Increase Cost Threshold above the default of 5
Limit MAXDOP to 4
Verify or limit parallelism on Xeon systems with Hyper-Threading enabled

Additional Information
SQL Server Quantitative Performance Analysis
Server System Architecture
Processor Performance
Direct Connect Gigabit Networking
Parallel Execution Plans
Large Data Operations
Transferring Statistics
SQL Server Backup Performance with Imceda LiteSpeed