Four Rules For Columnstore Query Performance
Joe Obbish
SQL Saturday Madison
About Me
- Business Intelligence Developer for EHR company
- Answer questions on dba.stackexchange.com
- Blog about SQL Server at https://orderbyselectnull.com/
- Done performance tuning for thousands of queries
Scope
- I work with data warehouses that use on-disk clustered columnstore indexes (CCIs) with very few nonclustered indexes.
- Material applies to SQL Server 2016 and 2017.
The Consequences of Creative T-SQL
- Creative code can enter your database in many different ways:
  - Your (least) favorite third party reporting tool.
  - Your (least) favorite end user.
  - Your (least) favorite developer using an ORM.
- Creative code can have different consequences with CCIs compared to rowstore.
The Four Rules
1. Take Advantage of Columnstore Features
2. Define and Maintain your Table
3. Get Batch Mode
4. Avoid Some Query Patterns
Take Advantage of Columnstore Features
- Column elimination
- Rowgroup elimination
- Aggregate pushdown
- String predicate pushdown
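Column Elimination
- Compressed data is stored in columnar format, so a query only has to read the columns it references.
- SELECT * is worse than ever because it forces every column to be read.
- UPDATE queries need to read all columns.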
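As a rough illustration of column elimination, here is a toy column store in Python. The `ColumnStore` class and its methods are hypothetical names made up for this sketch; it does not model compression or SQL Server's actual storage format, only the idea that a narrow query touches fewer columns than SELECT *.

```python
# Toy column store: each column is kept as its own list (hypothetical layout).
class ColumnStore:
    def __init__(self, columns):
        self.columns = columns    # dict: column name -> list of values
        self.columns_read = []    # records which columns a query touched

    def read_column(self, name):
        self.columns_read.append(name)
        return self.columns[name]

    def sum_column(self, name):
        # Comparable to SELECT SUM(name): only one column is read.
        return sum(self.read_column(name))

    def select_star(self):
        # Comparable to SELECT *: every column is read to rebuild full rows.
        data = [self.read_column(name) for name in self.columns]
        return list(zip(*data))


store = ColumnStore({
    "id": [1, 2, 3],
    "amount": [10, 20, 30],
    "comment": ["a", "b", "c"],
})

total = store.sum_column("amount")
narrow_reads = list(store.columns_read)

store.columns_read.clear()
rows = store.select_star()
wide_reads = list(store.columns_read)

print(total, narrow_reads, wide_reads)
```

The aggregate touches one column while the SELECT * equivalent touches all three, which is the cost difference the slide warns about.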
Rowgroup Elimination
- “To determine which rows groups to eliminate, the columnstore index uses metadata to store the minimum and maximum values of each column segment for each rowgroup. When none of the column segment ranges meet the query predicate criteria, the entire rowgroup is skipped without doing any actual IO.”
- Doesn’t work for string data types.
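The min/max check described in the quote can be sketched in a few lines of Python. `RowGroup` and `scan_between` are hypothetical names for this illustration, and the three tiny rowgroups stand in for real ones that hold far more rows; this is a simplified model, not SQL Server's implementation.

```python
class RowGroup:
    """Toy rowgroup for one column, with min/max metadata kept alongside."""
    def __init__(self, values):
        self.values = values
        self.min_value = min(values)
        self.max_value = max(values)


def scan_between(rowgroups, low, high):
    """Return values in [low, high] and the number of rowgroups actually read."""
    matches = []
    scanned = 0
    for rg in rowgroups:
        # The metadata alone proves no row can qualify: skip the rowgroup.
        if rg.max_value < low or rg.min_value > high:
            continue
        scanned += 1
        matches.extend(v for v in rg.values if low <= v <= high)
    return matches, scanned


rowgroups = [
    RowGroup([1, 5, 9]),
    RowGroup([10, 14, 19]),
    RowGroup([20, 25, 29]),
]

matches, scanned = scan_between(rowgroups, 12, 21)
print(matches, scanned)
```

Only two of the three rowgroups are read here. Elimination pays off most when the data is loaded in an order that keeps each rowgroup's min/max range narrow for the filtered column.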
Aggregate Pushdown
- Calculate some aggregates at the scan level.
- MIN, MAX, AVG, SUM, COUNT, COUNT_BIG
- Only works for exact numeric data types that fit within 8 bytes.
- Might not work depending on characteristics of compressed data.
- Trivial plans can cause issues in SQL Server 2016.
String Predicate Pushdown
- “With string predicate pushdown, the query execution computes the predicate against the values in the dictionary and if it qualifies, all rows referring to the dictionary value are automatically qualified.”
- Including for completeness (no demos).
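A minimal Python sketch of the idea in the quote: evaluate the string predicate once per distinct dictionary value, then qualify rows by their dictionary codes. The function names and the sample data are hypothetical, and real columnstore dictionaries are considerably more involved.

```python
def dictionary_encode(values):
    """Replace each string with an integer code pointing into a dictionary."""
    dictionary = []
    ids = {}
    codes = []
    for v in values:
        if v not in ids:
            ids[v] = len(dictionary)
            dictionary.append(v)
        codes.append(ids[v])
    return dictionary, codes


def filter_with_dictionary(dictionary, codes, predicate):
    """Evaluate the predicate once per dictionary entry, not once per row."""
    qualifying = {i for i, v in enumerate(dictionary) if predicate(v)}
    row_positions = [pos for pos, code in enumerate(codes) if code in qualifying]
    evaluations = len(dictionary)
    return row_positions, evaluations


states = ["WI", "IL", "WI", "MN", "WI", "IL"]
dictionary, codes = dictionary_encode(states)

row_positions, evaluations = filter_with_dictionary(
    dictionary, codes, lambda s: s.startswith("W")
)
print(dictionary, codes, row_positions, evaluations)
```

Six rows are filtered with only three string comparisons; the remaining work is integer lookups against the set of qualifying codes.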
Demos 1
Rule 2: Define and Maintain your Table
Is This Bad?
Do Index Maintenance
- Columnstore maintenance is very different from rowstore index maintenance.
- Skipping maintenance can result in reduced query performance, hundreds of wasted GBs of space, and other problems.
Use the Right Data Types
- Use data types that fit your data as closely as possible.
- Different data types have different levels of support for columnstore query performance features.
Be Smart About Partitioning
- Partitioning is essential for all but small columnstore indexes.
- Partitioning can actually improve query performance on columnstore indexes!
- Avoid partition schemes with partitions that are too small or that don’t fit well with your load method.
Demos 2
Rule 3: Get Batch Mode
Batch Mode
- Query plan operators can process sets of rows at a time instead of operating row by row.
- Operate on compressed data directly.
- Take advantage of fancy CPU features.
- Avoid expensive row mode exchange operators.
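To make "sets of rows at a time" concrete, here is a small Python sketch that counts operator invocations for a filter run row by row versus in batches. The function names and the batch size of 4 are assumptions for illustration only; the sketch does not model compressed data, CPU features, or exchange operators.

```python
def filter_row_mode(rows, predicate):
    """Row mode: the operator is invoked once for every row."""
    calls = 0
    out = []
    for row in rows:
        calls += 1
        if predicate(row):
            out.append(row)
    return out, calls


def filter_batch_mode(rows, predicate, batch_size):
    """Batch mode: the operator is invoked once per batch of rows."""
    calls = 0
    out = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        calls += 1
        out.extend(r for r in batch if predicate(r))
    return out, calls


def is_even(n):
    return n % 2 == 0


rows = list(range(10))
row_result, row_calls = filter_row_mode(rows, is_even)
batch_result, batch_calls = filter_batch_mode(rows, is_even, batch_size=4)
print(row_calls, batch_calls, row_result == batch_result)
```

Both versions return the same rows, but the batch version pays the per-invocation overhead 3 times instead of 10, and the gap widens as row counts grow.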
Supported Operators
- Supported operators change by version, but nothing new in SQL Server 2017.
- Supported operators:
  - columnstore scan
  - hash match
  - compute scalar
  - sort
  - concatenation
  - top sort
  - filter
  - window aggregates
Watch out for Batch Mode Sorts
- You should generally be happy to see batch mode operators, except for batch mode sorts in some scenarios.
- Microsoft’s implementation of batch mode sorts arguably isn’t complete yet and can lead to performance problems.
- Trace flag (TF) 9347 disables them.
Stay Current on CUs
- Microsoft has released many important bug fixes related to Columnstore since 2016 SP1 base.
- Don’t stay on 2016 SP1 base!
Relevant KBs
- KB 3210747
- KB 3202425
- KB 4013111
- KB 4024860
- KB 3208460
- VSTS 11704339
- KB 3195825
- KB 4017154
- KB 4040014
- KB 3216543
- KB 4015034
- KB 4040276
Demos 3
Rule 4: Avoid Some Query Patterns
Some Queries Really Need NCIs (Nonclustered Indexes)
- String aggregation doesn’t work well with CCIs before SQL Server 2017.
- Not all joins are eligible for hash join.
- Recursive queries are not a good match for CCIs.
Ruinous Row Goals
- Think about how a query plan might change if you add TOP 1 to it.
- Blocking operators, such as many of those eligible for batch mode, become less attractive.
- Arguably CCIs are incompatible with most row goal optimizations.
Demos 4
Resources
- Columnstore indexes - Query performance
- Niko Neugebauer’s blog
- Row Goal Series by Paul White
- CCIs And Recursive CTEs by Erik Darling
- One of my CCI blog posts