Comprehensive Indexing via Automated Execution Plan Analysis (ExecStats)
Joe Chang, jchang6 @ yahoo, www.qdpma.com


1 Comprehensive Indexing via Automated Execution Plan Analysis (ExecStats)
Joe Chang, jchang6 @ yahoo, www.qdpma.com

2 About Joe
SQL Server consultant since 1999
Query Optimizer execution plan cost formulas (2002)
True cost structure of SQL plan operations (2003?)
Database with distribution statistics only, no data (2004)
Decoding statblob/stats_stream: writing your own statistics
Disk IO cost structure
Tools for system monitoring, execution plan analysis
See ExecStats Download: Blog:

3 Objectives
Complexities & depth of SQL performance
Cause and effect
Focus on the execution plan
Inefficient plans: missing indexes, very large estimate/actual row discrepancies
Comprehensive index strategy: a few good indexes, but no more than necessary

4 Non-Objective
A list of rules to be followed blindly, without consideration for the underlying reason and whether the rule actually applies in the current circumstance
DBA skill: cause and effect analysis & assessment

5 Preliminary: Correct Results
Normalization: data stored once, avoid anomalies
Unique keys: avoid duplicate rows
Foreign keys: avoid orphaned rows
Incorrect architecture requires use of SELECT DISTINCT etc. to correct architecture deficiencies, which may cause performance problems as well
The correct action is to address the architecture mistakes before the performance issue.

6 Performance Big Picture
Diagram elements: SQL, Tables (natural keys), Indexes, Statistics & compile parameters, Query Optimizer, Execution Plan, Storage Engine, Hardware
Tables and SQL combined implement business logic
Natural keys with unique indexes, not SQL
Compile: row estimate propagation errors
Index and statistics maintenance policy
DOP, memory, parallel plans
Recompile: temp table / table variable
1 logic may need more than one execution plan?
API Server Cursors: open, prepare, execute, close?
Index & stats maintenance
Compile cost versus execution cost?
SET NOCOUNT: information messages
Plan cache bloat?
The Execution Plan links all the elements of performance
Index tuning alone has limited value
Over-indexing can cause problems as well

7 Indexing Principles
Good cluster key choice: grouping + unique, not too wide
Good nonclustered indexes
For key queries, not necessarily every query
Covered indexes where practical
Create and drop custom indexes for maintenance ops/special circumstances
No more indexes than necessary: update overhead, compile overhead
May tolerate occasional scans to avoid update maintenance
Note emphasis on good, not perfect

8 SQL – Plan – Index Usage
Execution Plan
dm_exec_query_stats
dm_db_index_usage_stats
sys.dm_io_virtual_file_stats (database_id, file_id)
sys.dm_os_volume_stats (database_id, file_id)
STATS_DATE (object_id, stats_id)
dm_db_stats_properties: object_id, stats_id, last_updated, rows, rows_sampled, steps, unfiltered_rows, modification_counter
dm_exec_query_profiles

9 Using DMVs – Execution Plan
dm_exec_query_stats, dm_exec_sql_text, dm_exec_query_plan, dm_exec_text_query_plan
dm_db_index_usage_stats, dm_db_index_operational_stats, dm_db_index_physical_stats
DBCC SHOW_STATISTICS, STATS_DATE (object_id, stats_id), dm_db_stats_properties
Execution plan: indexes, joins, compile parameters
System views: indexes, key columns, include list, filter, XML, columnstore etc.
sys.dm_db_stats_properties is available in SQL Server 2012 starting with Service Pack 1 and in SQL Server 2008 R2 starting with SP2: last_updated, rows, rows_sampled, steps, unfiltered_rows, modification_counter
dm_exec_query_profiles (2014): real-time query progress?
sys.dm_io_virtual_file_stats (database_id, file_id)
sys.dm_os_volume_stats (database_id, file_id)

10 Performance Analysis
Getting top SQL from dm_exec_query_stats
Manually examining top execution plans
Index reduction: dm_db_index_usage_stats
Drop unused indexes (based on a long period)
Consolidate indexes with similar keys
Infrequently used indexes? Must hunt down the SQL, possibly a low item in query stats
Can it use another index?
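The index-reduction step can be sketched in code. This is a minimal illustration, assuming rows shaped like the user_* counters of dm_db_index_usage_stats have already been fetched; the index names, the sample numbers, and the `infrequent_reads` threshold are hypothetical, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class IndexUsage:
    """One row shaped like sys.dm_db_index_usage_stats (user_* counters only)."""
    index_name: str
    user_seeks: int
    user_scans: int
    user_lookups: int
    user_updates: int


def classify(usage: IndexUsage, infrequent_reads: int = 100) -> str:
    """Label an index by how often it is read.

    infrequent_reads is a hypothetical cut-off, not a recommended value.
    """
    reads = usage.user_seeks + usage.user_scans + usage.user_lookups
    if reads == 0:
        return "unused"      # drop candidate, if observed over a long period
    if reads < infrequent_reads:
        return "infrequent"  # hunt down the SQL; can it use another index?
    return "used"


# Hypothetical sample data for illustration only.
sample = [
    IndexUsage("IX_Orders_Customer", 120_000, 15, 0, 48_000),
    IndexUsage("IX_Orders_Status", 0, 0, 0, 48_000),
    IndexUsage("IX_Orders_ShipDate", 12, 3, 0, 48_000),
]
report = {u.index_name: classify(u) for u in sample}
print(report)
```

The usage counters are reset when the SQL Server service restarts, which is one reason to judge "unused" only over a long observation period.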

11 Real World Example
July 2014
3.6B rows, 3.4TB: 1.5TB data, 1.8TB indexes
Key tables have 21, 8, 14, 4 and 14 nonclustered indexes

12 Largest Table – 21 (NC) Indexes

13 Index Reduction
Dec 2014
4.6B rows, 2.3TB: 1.7TB data, 0.5TB indexes
Key tables have 6, 4, 3, 2 and 4 nonclustered indexes
Nonclustered indexes reduced from 21 to 6

14 Index Reduction + Compression
Feb 2015
4.9B rows, 0.9TB: 0.55TB data, 0.34TB indexes
Key tables have 7, 5, 3, 2 and 4 nonclustered indexes
1 index added, 1 index awaiting removal, possibly 2

15 Compression Notes
Very high compression was achieved because all keys were 16-byte GUIDs, even on dimensions, when the natural key would have been 1, 2 or 4 bytes!
Core data + indexes:
648GB data, 230GB indexes w/o compression
174GB data, 185GB indexes w/compression
Reduction in I/O, even with SSD storage, far outweighs compression overhead!
System memory: 256GB (220)
Storage: Violin (NAND Flash)
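The compression figures above work out as follows. This is only a worked arithmetic sketch using the four sizes quoted on the slide; nothing else is assumed.

```python
# Sizes quoted on the slide, in GB.
uncompressed = {"data": 648, "indexes": 230}
compressed = {"data": 174, "indexes": 185}


def ratio(before: float, after: float) -> float:
    """Compression ratio: how many times smaller the compressed size is."""
    return before / after


total_before = sum(uncompressed.values())  # 878 GB
total_after = sum(compressed.values())     # 359 GB
saved_pct = 100 * (total_before - total_after) / total_before

for part in uncompressed:
    print(f"{part}: {ratio(uncompressed[part], compressed[part]):.2f}x")
print(f"total: {ratio(total_before, total_after):.2f}x, {saved_pct:.1f}% smaller")
```

On these numbers the data compresses about 3.7x, the indexes about 1.2x, and the combined footprint shrinks by roughly 59%.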

16 Index reduction Compression & statistics Violin HDD Storage GDC

17 Database view

18 File IO view

19 Table view columns

20 Indexes - continued
Number of execution plans that reference the index in Seeks, Scans, Lookups, Insert/Updates and Deletes
Literal identifying the execution plans that reference the index in Seeks, Scans, Lookups, Insert/Updates and Deletes

21 Query Execution Stats - 1

22 Query Execution Stats - more

23 Dataspace – Partition Scheme view
Partition View

24 Procedures and Functions
Columns: dbid, schema, object, object_id, type, create date, modify date
Number of references (NumRef)
(Literal) plan reference (from QExec Stats)
Caller reference (functions only)

25 Volumes

26

27

28 Slides not used

29 Performance Strategy
Tables: support business logic; normalization, uniqueness etc.
SQL: clear SARG, interpretable by the query optimizer
1 logic maps to X execution plans
Indexes: good cluster key choice; good nonclustered indexes, no more than necessary
Statistics: sample strategy & update frequency
Compile parameter strategy
Temp table / table variable strategy: recompile & row estimate propagation error
Parallel execution plans: DOP and CTOP (cost threshold for parallelism) strategy
Identity key / alternative: large & small customers

30 Identify (weight) important SQL statements
Stored procedure: parameter values & code path
Recompile impact for temp tables
Execution plan cross-references SQL & indexes
Actual plan is better than estimated plan
Compile parameters & skewed statistics
Automate execution plan analysis to fully cross-reference SQL to index usage
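One way to automate the cross-reference of SQL to index usage is to walk the showplan XML. This is a minimal sketch, not ExecStats itself: the plan fragment is hand-written and heavily simplified, the table and index names are hypothetical, and only the standard showplan namespace, `RelOp`/`PhysicalOp`, and `Object` elements are relied on.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"sp": "http://schemas.microsoft.com/sqlserver/2004/07/showplan"}

# Hand-written, simplified plan fragment (illustrative only, hypothetical names).
SAMPLE_PLAN = """<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
 <BatchSequence><Batch><Statements>
  <StmtSimple StatementText="SELECT * FROM dbo.Orders WHERE CustomerID = @c">
   <QueryPlan>
    <RelOp PhysicalOp="Nested Loops">
     <NestedLoops>
      <RelOp PhysicalOp="Index Seek">
       <IndexScan><Object Table="[Orders]" Index="[IX_Orders_Customer]"/></IndexScan>
      </RelOp>
      <RelOp PhysicalOp="Key Lookup">
       <IndexScan Lookup="true"><Object Table="[Orders]" Index="[PK_Orders]"/></IndexScan>
      </RelOp>
     </NestedLoops>
    </RelOp>
   </QueryPlan>
  </StmtSimple>
 </Statements></Batch></BatchSequence>
</ShowPlanXML>"""


def index_references(plan_xml: str) -> dict:
    """Map 'table.index' to the sorted physical operators that touch it."""
    refs = defaultdict(set)
    root = ET.fromstring(plan_xml)
    for relop in root.iterfind(".//sp:RelOp", NS):
        # Object elements sit directly under the operator element (IndexScan,
        # Update, ...), so look exactly two levels down; this avoids crediting
        # a parent operator with its children's indexes.
        for obj in relop.iterfind("./*/sp:Object", NS):
            index = obj.get("Index")
            if index:
                refs[f"{obj.get('Table')}.{index}"].add(relop.get("PhysicalOp"))
    return {name: sorted(ops) for name, ops in refs.items()}


print(index_references(SAMPLE_PLAN))
```

Running this over every plan collected and accumulating the results per index gives a seek/scan/lookup reference count per index, which is the kind of cross-reference the slide calls for.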

31 SQL & Execution Plan Sources
Estimated execution plan
dm_exec_query_stats: contents of plan cache + execution statistics
List of stored procedures: SELECT name FROM sys.procedures
Any SQL list: plans not in cache, to be generated
Can also execute SQL for actual plans

32 sys.dm_exec_query_stats
sql_handle: token for batch or stored procedure
statement_start_offset: sql_handle + offset = SQL statement
plan_handle: SQL (batch) can have multiple plans on recompile
query_hash: identifies queries with similar logic, differing only by literal values
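The "sql_handle + offset = SQL statement" step can be illustrated outside the database. This sketch assumes the batch text has already been retrieved (for example via dm_exec_sql_text) and mirrors the commonly documented SUBSTRING idiom: offsets are byte positions into 2-byte Unicode text, and an end offset of -1 means the end of the batch. The function name and sample batch are hypothetical.

```python
def statement_text(batch_text: str, start_offset: int, end_offset: int) -> str:
    """Slice one statement out of a batch using query-stats style byte offsets."""
    raw = batch_text.encode("utf-16-le")  # 2 bytes per UTF-16 code unit
    if end_offset == -1:
        end_offset = len(raw)             # -1 means "to the end of the batch"
    # The documented idiom takes (end - start) / 2 + 1 characters,
    # so include 2 bytes beyond end_offset.
    return raw[start_offset:end_offset + 2].decode("utf-16-le")


# Hypothetical two-statement batch.
batch = "SELECT 1; SELECT name FROM sys.procedures;"
print(statement_text(batch, 0, 16))
print(statement_text(batch, 20, -1))
```

Working in bytes rather than Python string indexes keeps the slice aligned with how the offsets are reported.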

33 sys.procedures
Get list of stored procedures in database
Functions are called from procedures?
Generate estimated execution plan for each
Default parameters
Full map of index usage to stored procedure
No trigger details in estimated plan

34 SQL List
Configuration file has SQL to retrieve the SQL list
Can be explicit SQL or stored procedures with parameters
Same procedure, multiple parameter sets, to expose different code paths (actual plan)
EXEC proc WITH RECOMPILE (estimated plan)

35 About ExecStats
General information
Execution plan sources:
dm_exec_query_stats
List of all stored procedures (estimated)
List of SQL in table (estimated or actual plan)
Trace file
Correlates execution plans to index usage: procedures, functions and triggers
Rollup of file IO stats by DB, filegroup, disk/volume, data/log
Distribution statistics
Output to Excel, sqlplan file, (SQL in txt file)

36 ExecStats Output Files
Txt: runtime info
Log: abbreviated SQL error logs
Excel: Missing Indexes DMV
SQL plan directory
This can be sent to someone who can identify and fix your problem

37 Important Items
Query cost: plan efficiency? Recompiles?
Compile parameters: skewed statistics
CPU versus duration (worker vs. elapsed time): disk IO, network transmission, parallel plan?
Execution count: network round trip?
Plan cost: parallelism
A high volume of quick queries is bad, so is excessive DOP
Index: current rows, rows at time stats were generated, sampled rows & date
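The CPU-versus-duration comparison can be expressed as a simple rule of thumb. This is a sketch under stated assumptions: inputs are total_worker_time and total_elapsed_time as reported by dm_exec_query_stats (microseconds), and the `wait_factor` cut-off is a hypothetical value chosen for illustration.

```python
def cpu_vs_duration(total_worker_time_us: int, total_elapsed_time_us: int,
                    wait_factor: float = 2.0) -> str:
    """Rough reading of worker (CPU) time against elapsed time.

    wait_factor is a hypothetical threshold, not a recommended value.
    """
    if total_worker_time_us > total_elapsed_time_us:
        # More CPU time than wall-clock time implies several threads worked at once.
        return "parallel"
    if total_elapsed_time_us > wait_factor * total_worker_time_us:
        # Mostly waiting: disk IO or network transmission are the usual suspects.
        return "waiting"
    return "cpu-bound"


print(cpu_vs_duration(8_000_000, 2_000_000))
print(cpu_vs_duration(1_000_000, 10_000_000))
print(cpu_vs_duration(1_000_000, 1_200_000))
```

The label is only a starting point for looking at the plan; it does not identify which wait or which operator is responsible.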

38 Execution Plans: estimated vs. actual
Actual: estimated cost, actual rows, DOP
Compile parameters
Actual rows/executions versus estimated
Execute stored procedure once for each possible code path, with appropriate parameters
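Comparing actual rows per execution with the estimate can be reduced to one ratio per operator. This is a minimal sketch, assuming the three numbers have already been read from an actual plan; flooring both sides at one row is a simplification chosen here to avoid division by zero.

```python
def row_estimate_error(estimated_rows: float, actual_rows: int,
                       executions: int) -> float:
    """Return the estimate/actual mismatch as a ratio >= 1, in either direction."""
    actual_per_exec = actual_rows / max(executions, 1)
    # Floor both sides at 1 row so zero-row operators do not divide by zero.
    est = max(estimated_rows, 1.0)
    act = max(actual_per_exec, 1.0)
    return max(est / act, act / est)


print(row_estimate_error(10, 50_000, 1))    # large underestimate
print(row_estimate_error(100, 1_000, 10))   # estimate matches per-execution rows
print(row_estimate_error(5_000, 0, 1))      # large overestimate
```

Dividing by executions matters for operators on the inner side of a loop join, where the estimate is per execution but actual rows are totals.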

39 Execution Plans Analysis
Predicate: index key columns do not match the full SARG
SQL has a function on the SARG, data type mismatch
Compile parameters & statistics
Actual and estimated rows/execution mismatch
Large table scans: how many rows output?
Rebinds and rewinds: key lookup
Parallelism

40 Execution Plans
Pay attention to:
Compile parameters
Large table scans: how many rows output?
Predicate search condition without a suitable index
Rebinds and rewinds: key lookup
Parallelism

41 Index Usage – missing IX, excess IX?
Index usage: seek, scan, lookup & update
Unused indexes (infrequent code?) can be dropped
Infrequent usage: check plan references
Similar indexes (leading keys); same keys, different order: check plan references, consolidate if possible
Scans on large tables or even nonclustered IX: is it real? (SELECT TOP 1 may not be a real scan)
Lookups: can these be reduced?
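Finding consolidation candidates among similar indexes can be sketched as a pairwise comparison of key column lists. The index names and key columns below are hypothetical, and the sketch deliberately ignores include columns, filters and uniqueness, all of which would need checking (along with plan references) before anything is dropped.

```python
from itertools import combinations
from typing import Optional, Sequence


def relation(a: Sequence[str], b: Sequence[str]) -> Optional[str]:
    """Describe how two index key lists overlap, or None if they do not."""
    if tuple(a) == tuple(b):
        return "duplicate keys"
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    if tuple(longer[:len(shorter)]) == tuple(shorter):
        return "leading-key prefix"
    if sorted(a) == sorted(b):
        return "same keys, different order"
    return None


# Hypothetical indexes on one table: name -> ordered key columns.
indexes = {
    "IX_A": ["CustomerID"],
    "IX_B": ["CustomerID", "OrderDate"],
    "IX_C": ["OrderDate", "CustomerID"],
    "IX_D": ["Status"],
}

candidates = []
for x, y in combinations(indexes, 2):
    rel = relation(indexes[x], indexes[y])
    if rel is not None:
        candidates.append((x, y, rel))

for row in candidates:
    print(row)
```

Each candidate pair is only a prompt to check which plans reference the two indexes; same keys in a different order may both be needed.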

42

43 SQL Server Skills & Roles
Architect: table structure, unique keys
Data architect: normalization
Developers: SQL code
Performance DBA: index + statistics maintenance
Hardware & storage

44 SQL Server Performance History
Before DMVs (SQL Server 2000):
Profiler/Trace to get top SQL
Execution plans: not really exportable
Which indexes are actually used?
Today:
Trace/Extended Events sometimes not necessary, if the dm_exec_query_stats content is good
Execution plans are exportable
Index usage stats

45 How much can be automated?
Data collection: all, of course (top resource consumers, etc.)
Assessment: sometimes (is there a problem? can it be fixed or improved?)
Fix/change: sometimes
Indexes, SQL: sometimes
Table structure, architecture: no
If problems could be solved by pushing a button, what would be the skill requirements to be a DBA?
Great accomplishments: 99% perspiration, 1% inspiration

46 Performance Approaches
Check against list of “Best Practices”
Manual DMV scripts approach: find top 5 or 10 SQL; fix it if/when there is a problem
All indexes and procedures/SQL: examine the complete set of stored procedures, or the full list of SQL statements
Good indexes for all SQL, no more indexes than necessary

47 Why bother when there are no problems?
No problems for over 1 year
Never bothered to collect performance baseline
Problem today: find it with DMVs, fix it
The problem was xxx, but why did it occur today & not before?
Probably statistics or compile parameters, but prove it?
Why ExecStats?
SQL scripts? Too much manual work
Third-party tools? Only find the problem

48 Rigorous Optimization
Table structure, SQL, client-side
Cluster key
Good (nonclustered) indexes
All indexes are actually used
No more indexes than necessary
Consolidate similar indexes: same keys, same order, or reverse order? What SQL is impacted?
Statistics update
Index maintenance
Must consider the full set of SQL/procedures in removing indexes?

49 SQL versus programming languages
SQL: great for data access, not good for everything else
When SQL becomes horribly complicated, what would the code look like in VB/Java/Cxx?
Client-side program: C#

50 Performance Information
Server, storage
OS & SQL Server settings
SQL Server
SQL, query execution statistics, execution plan
Compile parameters
Indexes and index usage statistics
Statistics sampling: when? percentage? skew?

51

