Download presentation
Presentation is loading. Please wait.
1
Statistics And New Cardinality Estimator (CE)
Susantha Bathige (MSc,BCS,MCSA,MCITP) Sr. DBA
2
Statistics And New Cardinality Estimator
Susantha Bathige SQL Saturday #582, Melbourne 11th February 2017
3
Schedule Changes Microsoft Room Sunil Agarwal now at 09:00 instead of 16:30 Ajay Jagannathan now at 15:15 instead of 09:00 Darren Gosbell now at 16:30 instead of 15:15 Sandisk Room Leila Etaati now at 14:05 instead of 16:30 Hamish Watson now at 16:30 instead of 14:05 Insight / Ignia Room Rolf Tesmer & Kristina Rumpff now at 15:15 instead of 16:30 Martin Cairney now at 16:30 instead of 15:15
4
Housekeeping Mobile Phones Evaluations Coffee
Please set to “stun” during sessions Evaluations Please complete a session Evaluation to provide feedback to our wonderful speakers! Also complete the Event Evaluation forms – please fill them in and return them at the end of the day Coffee There a Coffee cart provided by WardyIT outside the Microsoft & Sandisk rooms if you need a caffeine hit before the next session (speakers, please remove the line about coffee if your presentation is in the afternoon)
5
Schedule Changes Microsoft Room Sunil Agarwal now at 09:00 instead of 16:30 Ajay Jagannathan now at 15:15 instead of 09:00 Darren Gosbell now at 16:30 instead of 15:15 Sandisk Room Leila Etaati now at 14:05 instead of 16:30 Hamish Watson now at 16:30 instead of 14:05 Insight / Ignia Room Rolf Tesmer & Kristina Rumpff now at 15:15 instead of 16:30 Martin Cairney now at 16:30 instead of 15:15
6
About me MCSA Database Administrator (SQL Server 2012/2014)
MCITP SQL Server 2005 Sr. DBA Specialized in SQL Server Working as a Production DBA at Pearson Has more than 12+ years of experience in SQL Server starting with SQL Server 7.0 Writer: MSSQLTips.com, Personal blog ( Active member in SQL Server User Groups in Denver and Sri Lanka Working on other database systems such as NoSQL DB systems like Cassandra and MongoDB
7
In This Session Part 1 – Introduction Part 2 – Statistics
Part 3 – Filtered Statistics Part 4 – New CE model Part 5 – Statistics Maintenance Part 6 - Recap
8
Issues With Execution Plans
Query timeout Inconsistent performance of stored procedures High tempdb utilization High CPU utilization Over utilized memory Some solution you might have tried. Query re-write makes a huge difference Recompilation is not always bad. Sometimes it's a very good thing to do. SQL Server needs to understand how much data your query has. However, SQL Server can’t do this each and every execution of the query because then it's costly. We need to do that up front and we need to have good estimates. A query plan which is going to process one row is very different to query which is going to process 1 million rows. If the optimizer has a bad input very beginning then everything downline the would not likely to benefit either. The more information the optimizer has and more accurate that info is, the better off we are. That is our goal today. The two methods that we can influence the optimizer are; statistics Indexing.
9
Execution Plan Data Access methods JOIN operators Logic
Execution plan is created based on statistical understanding of actual data. JOIN operators Logic
10
Some Concepts #t 0.5 0.25 Predicate Density
Condition which can evaluate to TRUE, FALSE or UNKNOWN. Density Uniqueness of a column. Density = 1 / No. of distinct values Ranges from 0 to 1. Estimates/Cardinality estimation Estimated no.of records return by predicates (WHERE, HAVING, JOIN) or GROUP BY operations. Selectivity Concept similar to cardinality estimation. Percentage of rows that satisfy by a predicate. Highly selective predicate returns small no.of records. #t 0.25 0.5 Predicate can be ; JOIN Filter (WHERE, HAVING)
11
Understanding Density
1 Low Density High Density High Selectivity Low Selectivity Returns less no.of records Returns high no.of records More unique values Less unique values
12
When is an Index useful? WHERE Col1='A' WHERE Col1='B' WHERE Col1='C'
That is based on the selectivity of the column. SQL Server has to make good decisions fast. In fact, its goal is not to find its best query plan but it’s to find good plan fast. How SQL Server does that, the answer to that is statistics and estimates.
13
How SQL Server generates an Exec. Plan
Parse Check for language syntax Output: Query Tree Bind Series of validation steps Columns and tables in the tree are compared Output: Algebrizer tree Optimize Evaluate different possible plans Use statistics to evaluate different plans Output: Execution Plan Execute Executes the plan
14
Optimal Non optimal Execution plan Fast execution
Less resource consumption Optimal Slow execution High resource consumption Non optimal
15
In This Session Part 1 – Introduction Part 2 – Statistics
Part 3 – Filtered Statistics Part 4 – New CE model Part 5 – Statistics Maintenance Part 6 - Recap
16
Statistics Type of object exists in the database.
Uses to generate execution plan. Can be associated with and index or they can exists with table column. Can create manually or query optimizer will create it for you. Has three main pieces; Header Density Vector Histogram Uses a method call step compression. SQL Server uses cost based algorithm. It also uses heuristics and rules.
17
Inside Statistics Object
Header Density Vector Histogram
18
DEMO - 1 Inspecting Statistics Object
19
How many records this will return
SELECT COUNT(*) FROM [Sales].[SalesOrderDetail] WHERE ProductID=870
20
Estimates vs Actuals What is actuals? Figures after the query execution. What is estimates? Pre-determined values. Why SQL Server uses estimates? To create best possible Execution plan. When it uses estimates? At query optimization stage. How SQL Server uses estimates? We will go through in detail. We calculate cardinality estimates from the leaf level of a query execution plan all the way up to the plan root. The descendant operators provide estimates to their parents.
21
DEMO - 2 Calculation of Density Values
22
Cardinality Estimator
Calculate estimated no. of row counts for each operator within a query Exec. Plan Statistics C.E Estimated row counts
23
Cardinality Estimator and Estimates
How many rows will satisfy a single filter predicate? Or multiple filter predicates? How many rows will satisfy a join predicate between two tables? How many distinct values do we expect from a specific column? A set of columns? Major factor in deciding which physical operator and plan shapes
24
DEMO - 3 How SQL Server Calculates Estimated no. of rows
25
In This Session Part 1 – Introduction Part 2 – Statistics
Part 3 – Filtered Statistics Part 4 – New CE model Part 5 – Statistics Maintenance Part 6 - Recap
26
Filtered Statistics Introduced in SQL Server 2008
DO NOT create automatically Creates with the filtered index Creates with CREATE STATISTICS along with WHERE clause Solves inaccurate estimates due to correlation between columns.
27
DEMO - 4 Filtered Statistics
28
Database Options Related To Statistics
29
Importance of accurate cardinality estimates
Memory Access Method Table Scan Index Scan Clustered Index Scan Index Seek Clustered Index Seek RID Lookup Join Type (Nested Loop, Merge, Hash)
30
In This Session Part 1 – Introduction Part 2 – Statistics
Part 3 – Filtered Statistics Part 4 – New CE model Part 5 – Statistics Maintenance Part 6 - Recap
31
New CE Model Introduced in SQL Server 2014 (C.L 120)
Major redesign of CE model after SQL Server 7.0 (C.L 70) Affect Exec. Plan quality While many workloads will benefit from the new CE, in some cases, workload performance may degrade without a specific tuning effort. You need to test your workload in new CE before enable it in production.
32
How To Activate New CE Model
ALTER DATABASE [AdventureWorks2012] SET COMPATIBILITY_LEVEL = 120; TF 2312 Server level - DBCC TRACEON(2312,-1) Session level - DBCC TRACEON(2312) Query level - QUERYTRACEON 2312 To revert – TF 9481
33
Database Compatibility Level Server or Session Level Trace Flag
Precedence Database Compatibility Level Server or Session Level Trace Flag Query Level Trace Flag
34
DEMO - 5 The impact of new CE model
35
Under Estimates and Over Estimates of Row Counts
Selection of parallel plan when serial plan would be optimal Inappropriate join strategies Inefficient index navigation strategies Inflated memory grants Selection of serial plan when parallelism would have been an optimal Inappropriate join strategies Inefficient index selection and navigation strategies
36
How do you know which CE model is using
In XML plan XE Plan property window
37
Questions Question: Let’s assume you have db1 with compatibility level of 120 and db2 with compatibility level of 110, and you have a query that references two databases. What is the CE model uses for this situation? Answer: If the query is compiled under db1, new cardinality estimator will be used. But if the query is compiled under db2, old cardinality estimator will be used.
38
Questions… Question: In SQL Server 2016 database, you want to use new features of SQL Server 2016 and also you want to use the legacy CE model. How can you achieve this? Answer: You can use Database Scoped Configuration option as below; ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;
39
In This Session Part 1 – Introduction Part 2 – Statistics
Part 3 – Filtered Statistics Part 4 – New CE model Part 5 – Statistics Maintenance Part 6 - Recap
40
How To Create Statistics
Automatically by the Query Optimizer AUTO_CREATE_STATISTICS DB option Always single column statistics Explicit creation by using CREATE STATISTICS statement Single or multi column statistics When an index is created
41
How To Update Statistics
AUTO_UPDATE_STATISTICS ON (DB option) Automatically updates when they are out of date. Synchronous (default). Happens before optimization. AUTO_UPDATE_STATISTICS OFF Asynchronous – happens after optimization. Index rebuild operation UPDATE STATISTICS statement Note: Index Reorganization does not update statistics not even index statistics. Updating statistics either manually or automatically will invalidate the related Exec. Plans and will cause a new optimization next time when the query executes. Quality of the statistics depends on the sample size. Increase the SAMPLE SIZE or FULL SCAN would benefit. Index statistics are initially creates with the FULL SCAN. AUTO_UPDATE_STATISTICS OFF Introduced in SQL Server 2005. Reduced query execution time because it does not wait to generate the new stats.
42
Statistics Maintenance
UPDATE STATISTICS dbo.SalesOrderDetail UPDATE STATISTICS dbo.SalesOrderDetail WITH SAMPLE 50 PERCENT UPDATE STATISTICS dbo.SalesOrderDetail WITH FULLSCAN, COLUMNS UPDATE STATISTICS dbo.SalesOrderDetail WITH FULLSCAN, INDEX UPDATE STATISTICS dbo.SalesOrderDetail WITH FULLSCAN UPDATE STATISTICS dbo.SalesOrderDetail WITH FULLSCAN, ALL ALTER INDEX ix_ProductID ON dbo.SalesOrderDetail REBUILD
43
In This Session Part 1 – Introduction Part 2 – Statistics
Part 3 – Filtered Statistics Part 4 – New CE model Part 5 – Statistics Maintenance Part 6 - Recap
44
Recap Up to date statistics are the main key factor for quality execution plan. Statistics needs be maintained. Multi-column statistics can be used to improve query performance when there is correlation between columns. Filtered statistics can be used to improve query performance in certain situations. Create many if needed. While many workloads will benefit from the new Cardinality Estimator changes, in some cases, workload performance may degrade without a specific tuning effort. Use parameters instead of variables in stored procedures.
45
Recap… Auto Create Statistics should be ON.
Consider asynchronous statistics update only if your facing delays in query execution. Explicitly created statistics should be maintained otherwise they will easily get outdated.
46
References SQL Server Technical Article: Research paper:
Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator Research paper: A BlackBox Approach to Query Cardinality Estimation Testing Cardinality Estimation Models in SQL Server cardinality-estimator-part-1/
47
Questions? Please make sure you visit our fantastic sponsors to get your card stamped to be in the running for a raffle prize:
48
How did we do? Please complete an Evaluation to provide feedback to our wonderful speakers! SQL Clinic Don’t forget to check out the SQL Clinic to talk directly to Microsoft staff and MVP’s about your biggest pain points or suggestions for the next versions of SQL Server Lunchtime Sponsor Sessions Learn more over lunch, come hear presentations from our gold sponsors including WardyIT, SanDisk, RockSolid and Insight Enterprises Evaluations Also complete the Event Evaluation forms – please fill them in and return them at the end of the day
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.