1
The Five Ws of Columnstore Indexes
Madison | APR The Five Ws of Columnstore Indexes
2
SQL Saturday Madison: Silver Sponsors
3
SQL Saturday Madison: Gold Sponsor
4
SQL Saturday Madison: Gold Sponsor
5
SQL Saturday Madison After Party
Join for some networking and fun. Appetizers provided. Madison’s 119 King Street Madison, WI 53703
6
Join your local WI Chapter
FoxPASS - Appleton, WI
MADPASS - Madison, WI
Western Wisconsin PASS - Eau Claire, WI
WausauPASS - Wausau, WI
WI SSUG - Waukesha, WI
7
Save $$$ on your PASS Summit Registration
PASS Summit is the largest conference for technical professionals who leverage the Microsoft Data Platform.
November 6th – 9th, Seattle, WA
Use this code to save $150 off your registration: SSDISHN1C
Use this code to get access to all 2017 Summit sessions: SQLSTRHN1C
8
Agenda Who? What? Why? How? Where? When? Demos
9
Who? Eureka Dr. Seuss Whoville Whos Deco Trim via Amazon.com
10
Who is this guy? John Eisbrener @johnedba john@dbatlas.com
DBA: Default Blame Assignee DBA for over 10 years MSSQL, Oracle, Greenplum, Postgres Owner/Principal Consultant of a boutique consulting firm, DB Atlas
11
Who are you? DBAs Architects Developers Analysts Management Others
12
What? Knowyourmeme.com
13
What makes Columnstore Indexes Special?
What is an Index?
Key differences between Rowstore and Columnstore Indexes
Key differences between Clustered and Nonclustered Columnstore Indexes
14
What is an Index?
Structure that contains data or pointers to data
Designed to search for data efficiently
Designed to perform as a database grows in size
The type of index determines how data is stored on disk
Highly customizable
Columnstore Unofficial Versions
  SQL 2012 – Alpha
  SQL 2014 – Beta
  SQL 2016 – Version 1.0
  SQL 2017 – Version 1.1
15
Difference between Rowstore and Columnstore Indexes
Rowstore Index
  Row-wise format
  Compression is optional
  Returns all columns defined within the index
  B+ Trees
Columnstore Index
  Column-wise format
  Compression is required
  Returns only the columns needed
  Header and Data
16
Row-wise vs Column-Wise Storage
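The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the idea, not how SQL Server lays out pages on disk; the table, its column names, and the `to_column_store` helper are all hypothetical.

```python
# Hypothetical three-row table; the column names are illustrative only.
rows = [
    {"order_id": 1, "region": "WI", "amount": 10.0},
    {"order_id": 2, "region": "WI", "amount": 25.5},
    {"order_id": 3, "region": "MN", "amount": 7.25},
]


def to_column_store(rows):
    """Pivot row-wise records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_column_store(rows)

# Row-wise: summing one column still walks every full row.
total_from_rows = sum(row["amount"] for row in rows)

# Column-wise: only the "amount" list is read; the other columns are never touched.
total_from_columns = sum(columns["amount"])
```

Both totals agree, but the column-wise version reads a single list of similar values, which is also what makes the data easier to compress.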
17
Clustered vs Nonclustered
Clustered Columnstore Index (CCI)
  One per table
  Table is stored in column-wise format
  Significant table compression
  Cannot define a filter
Nonclustered Columnstore Index (NCCI)
  One per table
  Sits on top of a Heap or Clustered (Rowstore) Index
  Copy of the data; uses more space
  Can define a filter
18
Why? Youtube.com
19
Why do Columnstore Indexes work so well?
Importance of Compression
Brief Overview of Dictionary-based algorithms
Column Elimination
Rowgroup Elimination
20
Importance of Compression
Reduce limitations imposed by data storage
  Disk
  Memory
  Throughput
Proprietary Compression Algorithm
  Dictionary Based
21
Dictionary-Based Compression
Lossless
General Approach
  Build a Dictionary of Symbols (e.g. words, numbers, etc.)
  Assign minimal binary codes to each Symbol
    Smaller binary codes are assigned to more common symbols
  Replace raw data Symbols with Binary Codes to reduce the size of the data
Works best when Symbols are homogeneous
22
Dictionary-Based Compression
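A minimal sketch of the general dictionary approach, written in Python. SQL Server's actual algorithm is proprietary, so this is a generic illustration only: it uses small integers in place of variable-length binary codes, and the function names and sample column are made up for the example.

```python
from collections import Counter


def build_dictionary(values):
    """Map each distinct symbol to an integer code; common symbols get smaller codes."""
    counts = Counter(values)
    # Sort by descending frequency, then by symbol so ties are deterministic.
    ordered = sorted(counts, key=lambda symbol: (-counts[symbol], symbol))
    return {symbol: code for code, symbol in enumerate(ordered)}


def encode(values, dictionary):
    """Replace every raw symbol with its code."""
    return [dictionary[value] for value in values]


def decode(codes, dictionary):
    """Invert the dictionary to recover the original symbols (lossless)."""
    reverse = {code: symbol for symbol, code in dictionary.items()}
    return [reverse[code] for code in codes]


column = ["Madison", "Appleton", "Madison", "Wausau", "Madison", "Appleton"]
dictionary = build_dictionary(column)
codes = encode(column, dictionary)
restored = decode(codes, dictionary)
```

The fewer distinct symbols a column holds, the smaller the dictionary and the codes, which is why homogeneous column data compresses well.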
23
Column Elimination
Return only those columns used within the Query
Better compression ratios for data being returned because data is homogeneous
Column ordering in the (N)CCI Index Definition doesn’t matter; Column Elimination will happen regardless
NCCI ordering is defined by the underlying Rowstore Indexes
CCI Ascending/Descending order can be implied by how the data is loaded
  WITH (MAXDOP = 1)
  Partitioning can also help
24
Rowgroup Elimination
Also referred to as Segment Elimination
If the Segment doesn’t contain values identified within the Query Predicate, the entire Rowgroup is eliminated
Occurs prior to Column Elimination
Not utilized for LOB-based, string-based, or binary datatypes
Evaluation of the Segment Header
  Stores Min/Max of values within Segment
25
Rowgroup Elimination Example
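The min/max check can be sketched as follows. This is a simplified Python model under stated assumptions: the segment layout, the function names, and the tiny rowgroup size of 4 are hypothetical and chosen for readability, and only a simple range predicate is handled.

```python
def build_segments(values, rowgroup_size):
    """Split a column into segments and record each segment's min and max."""
    segments = []
    for start in range(0, len(values), rowgroup_size):
        chunk = values[start:start + rowgroup_size]
        segments.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return segments


def scan_between(segments, low, high):
    """Return matching values plus the number of segments actually read."""
    matches, segments_read = [], 0
    for segment in segments:
        # The header alone proves this segment cannot contain a match, so skip it.
        if segment["max"] < low or segment["min"] > high:
            continue
        segments_read += 1
        matches.extend(v for v in segment["values"] if low <= v <= high)
    return matches, segments_read


# Rowgroups of 4 rows for readability; real rowgroups hold up to 1,048,576 rows.
order_years = [2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017]
segments = build_segments(order_years, rowgroup_size=4)
matches, segments_read = scan_between(segments, 2016, 2017)
```

Because the sample data is loaded in sorted order, the ranges of the three segments do not overlap and two of them are skipped. With randomly ordered data every segment's min/max range would likely span the predicate, which is why load order matters.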
26
How? How it’s Made
27
How do Columnstore Indexes work with changing data?
Rowgroups
DeltaStore
Inserts
Deletes and Updates
Tuple Mover
ColumnStore
Batch Execution Mode
28
Rowgroups
Buckets of up to 1 million rows (1,048,576)
Can be in one of 3 states
  Open
  Closed
  Compressed
Open/Closed are stored in Row-wise format
Compressed is stored in Column-wise format
29
DeltaStore
30
Tuple Mover
31
ColumnStore
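The path a row takes from the DeltaStore to the ColumnStore can be sketched as a small state machine. This Python model is a rough illustration only: the class names and the 4-row limit are hypothetical, updates and deletes are not modeled, and the real Tuple Mover runs as a background process rather than on demand.

```python
MAX_ROWS = 4  # illustrative only; real rowgroups hold up to 1,048,576 rows


class Rowgroup:
    def __init__(self):
        self.state = "OPEN"
        self.rows = []       # row-wise storage while OPEN or CLOSED
        self.columns = None  # column-wise storage once COMPRESSED


class ColumnstoreSketch:
    def __init__(self):
        self.rowgroups = [Rowgroup()]

    def insert(self, row):
        """Inserts land in the open rowgroup, which closes when it fills up."""
        current = self.rowgroups[-1]
        current.rows.append(row)
        if len(current.rows) == MAX_ROWS:
            current.state = "CLOSED"
            self.rowgroups.append(Rowgroup())

    def tuple_mover(self):
        """Convert every CLOSED rowgroup from row-wise to column-wise form."""
        for rowgroup in self.rowgroups:
            if rowgroup.state == "CLOSED":
                names = rowgroup.rows[0].keys()
                rowgroup.columns = {n: [r[n] for r in rowgroup.rows] for n in names}
                rowgroup.rows = []
                rowgroup.state = "COMPRESSED"


index = ColumnstoreSketch()
for i in range(6):
    index.insert({"id": i, "qty": i * 10})
states_before = [rg.state for rg in index.rowgroups]
index.tuple_mover()
states_after = [rg.state for rg in index.rowgroups]
```

After six inserts the first rowgroup is full and closed while the second stays open; running the mover compresses only the closed one.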
32
Batch Execution Mode
Introduced in SQL 2012 along with Columnstore Indexes
Columnstore Index is required on the table
Only usable by certain execution plan operators
  Aggregates/Scans/Hash Matches/Window Aggregates
Passes a batch of up to 900 rows between execution plan operators
Basically a turbo button for execution plans
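Why batching helps can be sketched with two toy operators in Python. This is an analogy under stated assumptions, not the engine's implementation: `scan_in_batches` and `sum_aggregate` are hypothetical names, and the only thing measured is how many times data is handed from one operator to the next.

```python
BATCH_SIZE = 900  # upper bound quoted on the slide


def scan_in_batches(values, batch_size=BATCH_SIZE):
    """Scan operator: yield lists of up to batch_size values, not single rows."""
    for start in range(0, len(values), batch_size):
        yield values[start:start + batch_size]


def sum_aggregate(batches):
    """Aggregate operator: consume whole batches; return total and hand-off count."""
    total, handoffs = 0, 0
    for batch in batches:
        total += sum(batch)
        handoffs += 1
    return total, handoffs


values = list(range(2000))
total, handoffs = sum_aggregate(scan_in_batches(values))
```

For 2,000 rows the aggregate receives three batches, where a row-at-a-time plan would need 2,000 separate hand-offs to do the same work.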
33
Where? Lego.com
34
Where can you use Columnstore Indexes?
Datatype Restrictions
NCCI Restrictions
Optimal Workloads
  CCIs
  NCCIs
35
Datatype Restrictions
Will not work with the following datatypes
  ntext, text, and image
  nvarchar(max), varchar(max), and varbinary(max)
    Restriction is lifted for CCIs only, starting in SQL Server 2017
  rowversion (and timestamp)
  sql_variant
  CLR types (hierarchyid and spatial types)
  xml
36
NCCI Restrictions
Cannot have more than 1024 columns
Cannot be created on a view or indexed view
Cannot include a sparse column
Cannot be redefined by using the ALTER INDEX statement
  Use CREATE INDEX WITH (DROP_EXISTING = ON)
Cannot include large object (LOB) columns of type nvarchar(max), varchar(max), and varbinary(max)
37
Optimal Workloads - CCIs
Traditional DWH Fact Tables
Dimension Tables with over 1 million rows
Insert Mostly Workloads
  History Table of a Temporal Table
  Logging Tables
  Updates/Deletes < 10% of all DML
Create Nonclustered (Rowstore) Indexes on CCI
  Improve Query Performance by avoiding Full-Table Scans
Large In-Memory OLTP tables
38
Optimal Workloads - NCCIs
OLTP tables with more than 1 million rows
Tables that may feed a large number of analytical/aggregate queries
  Common tables feeding SSRS/Power BI Reports
  Tables that generate a high number of Scans
Very wide tables for which Covering Indexes are not easy to create
Tables that could benefit from being a CCI, but cannot be offline for a long period of time
39
Identify Candidate Tables
Several scripts have been developed by the community
  Niko Neugebauer (GitHub Library CISL)
  Sunil Agarwal (Microsoft Blog Post)
40
When? Apple.com
41
When to use various Columnstore Features?
Compression Delay
Filtered NCCIs
Maintenance Routines
With other features in SQL Server
42
Compression Delay Keyword
Used to delay the Tuple Mover from moving a Closed Rowgroup to a Compressed Rowgroup
Max value is 10080 minutes, or 7 days
Helpful for frequently-updated “hot” data
  Closed Rowgroups can still be updated/deleted; the data becomes immutable only once a Rowgroup is compressed
  Compressing a Closed Rowgroup requires system resources, and you may want these operations to run off-hours
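The timing rule can be sketched in Python, assuming the delay is expressed in minutes and counted from the moment a rowgroup is closed. The function name and the timestamps are hypothetical and chosen only to illustrate the check; the real engine makes this decision internally.

```python
from datetime import datetime, timedelta

MAX_COMPRESSION_DELAY_MINUTES = 10080  # maximum quoted on the slide


def eligible_for_compression(closed_at, now, delay_minutes):
    """Return True once a closed rowgroup has waited out the configured delay."""
    if not 0 <= delay_minutes <= MAX_COMPRESSION_DELAY_MINUTES:
        raise ValueError("delay must be between 0 and 10080 minutes")
    return now - closed_at >= timedelta(minutes=delay_minutes)


# Arbitrary illustrative timestamp for when a rowgroup was closed.
closed_at = datetime(2018, 1, 1, 9, 0)
one_hour_later = closed_at + timedelta(hours=1)
three_hours_later = closed_at + timedelta(hours=3)

too_early = eligible_for_compression(closed_at, one_hour_later, delay_minutes=120)
ready = eligible_for_compression(closed_at, three_hours_later, delay_minutes=120)
max_delay = timedelta(minutes=MAX_COMPRESSION_DELAY_MINUTES)
```

With a 120-minute delay the rowgroup stays in row-wise form after one hour and becomes a compression candidate after three; the maximum of 10080 minutes works out to exactly 7 days.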
43
Filtered NCCIs
Use when Compression Delay isn’t long enough
Query Engine will use what it can from the Filtered NCCI and pull remaining data from the Rowstore Index
Must redefine using CREATE NONCLUSTERED COLUMNSTORE INDEX WITH (DROP_EXISTING = ON)
Requires specific SET Options
44
Maintenance Routines
Reorganize
  Physically removes rows from a rowgroup when 10% or more of the rows have been logically deleted
  Combines one or more compressed rowgroups to increase rows per rowgroup up to the maximum of 1,048,576 rows
  Manually compresses any Closed Rowgroups
  Compresses all Closed AND Open Rowgroups when using the WITH (COMPRESS_ALL_ROW_GROUPS = ON) option
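The first two Reorganize behaviors can be sketched in Python, assuming the documented rowgroup maximum of 1,048,576 rows. This is a deliberately simplified model: the `reorganize` function and the sample row counts are hypothetical, neighbours are merged greedily, and the real engine applies more detailed rules when choosing which rowgroups to combine.

```python
MAX_ROWGROUP_ROWS = 1_048_576
DELETED_THRESHOLD = 0.10


def reorganize(rowgroups):
    """rowgroups: list of (total_rows, deleted_rows) for non-empty compressed rowgroups."""
    # Step 1: physically drop deleted rows where they make up 10% or more.
    cleaned = []
    for total, deleted in rowgroups:
        if deleted / total >= DELETED_THRESHOLD:
            cleaned.append((total - deleted, 0))
        else:
            cleaned.append((total, deleted))

    # Step 2: greedily merge neighbours while the result still fits in one rowgroup.
    merged = []
    for total, deleted in cleaned:
        if merged and merged[-1][0] + total <= MAX_ROWGROUP_ROWS:
            prev_total, prev_deleted = merged.pop()
            merged.append((prev_total + total, prev_deleted + deleted))
        else:
            merged.append((total, deleted))
    return merged


# Illustrative row counts: 5%, 25%, and 0% of rows logically deleted.
before = [(1_000_000, 50_000), (400_000, 100_000), (300_000, 0)]
after = reorganize(before)
```

In this example the first rowgroup is left alone because only 5% of its rows are deleted, the second is cleaned down to 300,000 rows, and the second and third are then combined into a single 600,000-row rowgroup.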
45
Maintenance Routines (Continued)
Rebuild
  Re-compresses all data into the columnstore
  Historically (e.g. in SQL 2012) this was the only way to reduce fragmentation
  Locks the table during the rebuild operation
    SQL 2017 introduces ONLINE rebuilds for NCCIs only
  Used primarily when there is a lot of fragmentation within the Compressed Rowgroups
46
Other features that work well with Columnstore Indexes
Temporal Tables
  CCI on History Table
Availability Groups with Read-Only Replicas
  Point your reports there!
Partitioned Tables
47
Demos OhMaGif.com