1
Statistics - What are the chances
Lori Edwards, SQL Sentry
2
Introduction
Sales Engineer for SQL Sentry since 2/2013
Previously: DBA since 2003
PASS volunteer
PASSion Award winner 2011
LinkedIn:
Blog:
2 | Statistics - What are the chances
3
Why statistics?
This was me circa 2006…
4
Agenda
What are statistics?
How and when are they created?
Statistics and the query optimizer
Demo
Statistics maintenance
Changes in SQL Server 2014
Questions
5
What are statistics?
Statistics contain information about the distribution of data within a column
Statistics can also be created on unindexed columns
For multi-column indexes, the histogram covers only the first (leading) column of the index
6
Important Concepts
Predicate: the condition in the WHERE or JOIN element of a query (filter and join predicates)
Density: relates to the number of unique values within a column. Higher density means fewer unique values. Density is 1 / (number of distinct values), so it is greater than 0 and at most 1; a density of 1 means a single distinct value
Selectivity: the percentage of rows that match a predicate. Higher selectivity indicates a lower percentage of matching rows
Cardinality: the estimated number of rows returned by a query operator. Cardinality estimates come from the density vector and histogram when available; otherwise heuristics (WAG) are used
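To make density and selectivity concrete, here is a minimal Python sketch of the arithmetic. The sample column and the helper names (`density`, `selectivity`) are made up for illustration; this mirrors the definitions only, not how SQL Server samples or stores statistics.

```python
def density(values):
    """Density = 1 / number of distinct values (higher density = fewer unique values)."""
    return 1.0 / len(set(values))


def selectivity(values, predicate):
    """Fraction of rows that satisfy the predicate."""
    matches = sum(1 for v in values if predicate(v))
    return matches / len(values)


# Hypothetical column: 8 rows, 4 distinct values.
status = ["open", "open", "open", "open", "closed", "closed", "held", "void"]

col_density = density(status)                    # 1 / 4 distinct values
avg_rows_per_value = col_density * len(status)   # density-based row estimate per value
sel_open = selectivity(status, lambda v: v == "open")  # 4 of 8 rows match
```

Note how the density-based guess (2 rows per value) is an average; it says nothing about the skew toward "open", which is the kind of detail a histogram captures.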
7
How are statistics created?
Creating indexes
AUTO CREATE STATISTICS
CREATE STATISTICS: fullscan, sample, norecompute, incremental [ only]
sp_createstats: indexonly, fullscan, norecompute, incremental [ only]
Filtered statistics
Statistics on read-only dbs or read-only snapshots
Auto Create Statistics is on by default, and it works the same way for temporary tables.
For very large tables, you may want to create statistics manually rather than have them created at runtime.
sp_createstats can be useful for data warehouses.
When creating filtered statistics, include a WHERE clause.
Statistics on read-only dbs or snapshots are temporary and are created in tempdb. The is_temporary column in the sys.stats catalog view indicates temporary statistics.
8
Viewing statistics
sp_helpstats 'tablename', 'ALL' or 'STATS'
sys.stats catalog view
STATS_DATE(object_id, stats_id)
sys.dm_db_stats_properties(object_id, stats_id)
9
Viewing statistics
DBCC SHOW_STATISTICS('tablename', statistic name)
Output is the statistics object (or stat blob)
The stat blob is made up of three distinct parts:
Header
Density Vector
Histogram
10
The Stat Blob
Header: metadata about the statistic
The update date and rows vs. rows sampled are the most important values
The density value here can be ignored; it is displayed only for backward compatibility with versions before 2008
11
The Stat Blob
Density Vector: density values for the column(s) in the statistic
For index-based statistics, all column combinations appear, including the clustered key
A density value over multiple columns is not as accurate: assuming independence, the density is found by multiplying the individual selectivities
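A small Python sketch of why multiplying per-column values can mislead when columns are correlated. The (city, state) rows and the `density` helper are invented for illustration; the comparison is against a density measured directly from the sample data, not against any particular SQL Server output.

```python
def density(rows):
    """Density = 1 / number of distinct values (or distinct value combinations)."""
    return 1.0 / len(set(rows))


# Hypothetical (city, state) pairs: the two columns are strongly correlated.
rows = [
    ("Houston", "TX"), ("Houston", "TX"), ("Austin", "TX"),
    ("Austin", "TX"), ("Tulsa", "OK"), ("Tulsa", "OK"),
]

city_density = density([city for city, _ in rows])     # 3 distinct cities
state_density = density([state for _, state in rows])  # 2 distinct states

# Independence assumption: multiply the single-column values.
assumed_pair_density = city_density * state_density

# Density of the column combination measured directly from the data.
actual_pair_density = density(rows)                    # 3 distinct pairs

# Average rows per (city, state) pair under each figure.
assumed_rows = assumed_pair_density * len(rows)
actual_rows = actual_pair_density * len(rows)
```

Because city determines state in this sample, the independence assumption predicts about one row per pair while the data actually holds two, so the estimate comes in low.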
12
The Stat Blob
Histogram: shows the data distribution for a column in a tabular format
13
The Stat Blob
Histogram
14
The Stat Blob
Histogram
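As a rough illustration of how a histogram can answer an equality predicate, here is a Python sketch. The column names RANGE_HI_KEY, EQ_ROWS and AVG_RANGE_ROWS are the ones DBCC SHOW_STATISTICS displays; the step values and the `estimate_equality` function are hypothetical, and the lookup is a simplified model rather than the optimizer's actual code.

```python
from bisect import bisect_left

# Hypothetical histogram steps: (RANGE_HI_KEY, EQ_ROWS, AVG_RANGE_ROWS).
# The first step's key is the smallest value seen in the column.
HISTOGRAM = [
    (10, 50.0, 1.0),
    (20, 30.0, 4.5),
    (40, 80.0, 2.0),
]


def estimate_equality(value, histogram=HISTOGRAM):
    """Estimate rows for `column = value` from histogram steps (simplified)."""
    keys = [step[0] for step in histogram]
    if value < keys[0] or value > keys[-1]:
        return None  # outside the histogram: no step covers this value
    i = bisect_left(keys, value)  # first step whose RANGE_HI_KEY >= value
    hi_key, eq_rows, avg_range_rows = histogram[i]
    if value == hi_key:
        return eq_rows        # value is a step boundary: use its exact count
    return avg_range_rows     # value falls inside a step: use the step average
```

For example, `estimate_equality(20)` hits a step boundary and uses that step's EQ_ROWS, while `estimate_equality(15)` falls between boundaries and uses the AVG_RANGE_ROWS of the step ending at 20.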
15
Statistics, Cardinality and the Query Optimizer
Statistics are necessary to estimate row count, or cardinality
Cardinality along with the operator cost model = total operator cost
Costing is taken into account when choosing a plan
The query optimizer’s job is not to find the best plan; it’s to find a good plan quickly
16
Statistics, Cardinality and the Query Optimizer
Getting the most from cardinality estimates
Keep it clean
Options
Parameters
Temporary tables
Column order
Using multiple columns from the same table
Table variables and Table Valued Parameters
OPTION (OPTIMIZE FOR UNKNOWN)
No statistics on multi-statement table-valued functions
No statistics created for table variables
Especially with columns within the same table.
Trace flag 2453 for table variables (available in 2012 SP2 and 2014 CU#3), or with recompile
17
Decisions affected by cardinality
Parallel or serial plans
Index seek or scan
Join algorithms, including inner/outer table selection
Spool generation
Key lookup vs. table scan
Stream or hash aggregates
Implicit conversion can impact cardinality estimates
Comparing columns within the same table can affect cardinality estimates; computed columns can help
Keep predicates as clean as possible
18
Cardinality and memory grants
Many execution plans require memory grants for specific operators (hash, sort, spool)
Memory grants = cost of operator * estimated row count
Both overestimates and underestimates can have detrimental effects
Memory grants can be viewed via sys.dm_exec_query_memory_grants
19
Demo
20
Updating statistics
In the majority of environments, AUTO UPDATE STATISTICS should be on
When does that kick in? It depends on the number of rows in the table
Updates when:
Rowcount goes from 0 to >0
Rowcount <= 500 and more than 500 changes have occurred
Rowcount > 500 and a set percentage of the rows has changed
sp_autostats on table/index/statistics object
As a note, the AUTO_UPDATE_STATISTICS_ASYNC option determines whether the update occurs when an execution plan with out-of-date statistics is called or after the query executes
Asynchronously: the query uses the old statistics (and execution plans based on them) until the statistics are rebuilt in the background
Synchronously: the query halts while the statistics are updated; depending on the size of your table this could be a few seconds or many minutes
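The row-count rules above can be sketched as a small Python function. The slide text does not preserve the percentage for tables over 500 rows; the 20% used here is the figure Microsoft documents for the classic threshold (500 + 20% of the rows), so treat it as an outside assumption. The function name `needs_auto_update` is hypothetical.

```python
def needs_auto_update(row_count, modifications):
    """Classic auto-update rule of thumb.

    row_count: rows in the table when the statistics were last gathered.
    modifications: changes to the leading statistics column since then.
    """
    if row_count == 0:
        # Table went from empty to non-empty.
        return modifications > 0
    if row_count <= 500:
        return modifications > 500
    # Assumed from Microsoft documentation: 500 + 20% of the rows.
    return modifications > 500 + row_count / 5


# A 1,000,000-row table needs more than 200,500 changes before an update fires.
big_table_stale = needs_auto_update(1_000_000, 150_000)
```

Here `big_table_stale` is false even after 150,000 changes, which is why very large tables can run on stale statistics for a long time under this rule.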
21
Updating statistics
For very large tables
Trace flag 2371
Available from SQL Server 2008 R2 SP1
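To get a feel for what a sliding threshold does on very large tables, here is a hedged Python sketch. The slide gives no formula for trace flag 2371; the square-root formula below is the one Microsoft documents for SQL Server 2016 and later at compatibility level 130, where the dynamic threshold became the default. Using it to approximate trace flag 2371 on earlier versions is an assumption, and both function names are hypothetical.

```python
import math


def classic_threshold(row_count):
    """Classic fixed rule for tables over 500 rows: 500 + 20% of the rows."""
    return 500 + row_count / 5


def dynamic_threshold(row_count):
    """Assumed dynamic rule: the lower of the classic value and sqrt(1000 * rows)."""
    return min(classic_threshold(row_count), math.sqrt(1000 * row_count))


# Compare the two rules across table sizes (values are modification counts).
comparison = {
    rows: (classic_threshold(rows), dynamic_threshold(rows))
    for rows in (1_000, 25_000, 1_000_000, 100_000_000)
}
```

Under this assumed formula a 1,000,000-row table becomes eligible for an update after roughly 31,600 changes instead of 200,500, and the gap widens as the table grows.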
22
Updating statistics manually
sp_updatestats vs UPDATE STATISTICS
sp_updatestats updates all of the statistics within a database that have had at least one modification
Note: will update with the default (or last) sample
Filtered statistics will not be optimized
For memory optimized tables, ALL statistics are updated
UPDATE STATISTICS requires a table name, with an optional index or statistics name
Updating statistics as part of index maintenance
23
Updating statistics
How important is it to keep statistics updated?
It’s important, but there are items to consider:
Does the table data change frequently (either in size or data distribution)?
How will plan recompilation impact your environment?
Is plan recompilation a factor on your hardware?
24
Cardinality Estimator changes in 2014
Independence assumption
Ascending key problem
Filtered predicates on different tables
Join estimate algorithm
Trace flags 2312 and 9481
Independence assumption
  Legacy: (count from predicate 1 * count from predicate 2) / total row count, assuming total independence
  New: selectivity of the most selective filter * sqrt(selectivity of the next most selective filter); assumes neither total independence nor total correlation
Ascending key
  Legacy: if a value fell outside of the histogram, cardinality was set at 1
  New: rows sampled * all density (average frequency) for samples. For fullscan, it uses the number of rows inserted since the last stats rebuild
Filtered predicates
  Legacy: assumed correlation
  New: assumes no correlation
Join estimation
  Legacy: step-by-step correlation in histograms
  New: uses a coarse alignment; this is where you might see more inaccurate estimations using the new CE
TF 2312: forces use of the new CE on 2012 dbs
TF 9481: forces use of the legacy CE on 2014 dbs
Trace flags can be enabled at server, session or query level; to enable at the query level, use the QUERYTRACEON hint
TF 2453 cannot be set at the query level using QUERYTRACEON
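The two independence formulas are easier to compare with numbers. The Python sketch below uses made-up selectivities and hypothetical function names. Extending the square-root pattern to a third and fourth predicate (fourth root, eighth root) follows Microsoft's description of the 2014 estimator and is not stated on the slide.

```python
def legacy_estimate(selectivities, total_rows):
    """Legacy CE: multiply all selectivities (total independence)."""
    estimate = total_rows
    for s in selectivities:
        estimate *= s
    return estimate


def backoff_estimate(selectivities, total_rows):
    """New CE sketch: most selective filter at full weight, the next at sqrt, and so on.

    Assumption beyond the slide: only the four most selective filters are
    used, with exponents 1, 1/2, 1/4 and 1/8.
    """
    estimate = total_rows
    for i, s in enumerate(sorted(selectivities)[:4]):
        estimate *= s ** (1 / 2 ** i)
    return estimate


# Hypothetical table: 1,000,000 rows; one filter matches 4%, the other 1%.
total_rows = 1_000_000
filters = [0.04, 0.01]

legacy_rows = legacy_estimate(filters, total_rows)  # 0.04 * 0.01 * 1,000,000
new_rows = backoff_estimate(filters, total_rows)    # 0.01 * sqrt(0.04) * 1,000,000
```

With these inputs the legacy formula lands at about 400 rows and the new one at about 2,000, which shows the new estimator leaning toward some correlation between the filters rather than none.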
25
Summary
Statistics provide valuable information on the data within columns
Statistics are created on indexes automatically
They can also be created manually and auto-created
Accurate statistics help the query optimizer find a good plan
View stats with sp_helpstats and DBCC SHOW_STATISTICS
Maintain stats by setting AUTO_UPDATE_STATISTICS on and (sometimes) using sp_updatestats
26
Resources
Statistics in SQL Server 2014:
AUTO_UPDATE_STATS_ASYNC option:
Trace flag 2371:
Understanding Statistics Updates:
And some additional statistics questions:
27
Questions?
me at
Or tweet
Thank you!
28
Thank You Sponsors!
Visit the Sponsor tables to enter their end of day raffles.
Turn in your completed Event Evaluation form at the end of the day in the Registration area to be entered in additional drawings.
Want more free training? Check out the Houston Area SQL Server User Group which meets on the 2nd Tuesday of each month. Details at
6/13/2015