Boosting DWH-Performance with SQL Server 2016 ColumnStore Index.

Presentation transcript:

Boosting DWH-Performance with SQL Server 2016 ColumnStore Index

Introduction Markus Ehrenmüller-Jensen Business Intelligence runtastic Pluskaufstraße Pasching Austria, Europe SQL Server BI Developer Database Developer Database Admin

Agenda: Column Store; Non-Clustered Column Store Index; Clustered Column Store Index; What’s new in SQL 2016

Evolution of Microsoft Data Platform
SQL Server 2000: XML, KPIs
SQL Server 2005: Management Studio, Mirroring
SQL Server 2008: Compression, Policy-Based Mgmt, Programmability
SQL Server 2008 R2: PowerPivot, SharePoint Integration, Master Data Services
SQL Server 2012: ColumnStore Index, AlwaysOn, Data Quality Services, Power View, Cloud Connectivity
SQL Server 2014: In-Memory Across Workloads, Performance & Scale, Hybrid Cloud Optimized, HDInsight, Cloud BI

In-Memory in SQL Server: Cache (Buffer Pool); Pin-to-Memory (DBCC PINTABLE); ColumnStore ((Non-)Clustered Index); SSAS Tabular; Power Pivot; In-Memory OLTP (Memory Optimized Table, Natively Compiled Stored Procedure)

Ordinary Report

ColumnStore Index

DEMO Improve Query Performance with ColumnStore Index

How? xVelocity (VertiPaq, PowerPivot, BISM Tabular): Compression; Column-Elimination; Segment/Rowgroup-Elimination; Parallel Read Operations; Query Processor in Batch-Mode. Typically 10x faster.

xVelocity Column Store: Power Pivot; Analysis Services; SQL Server (Non-clustered Index: SQL 2012+; Clustered Index: SQL 2014+)

Compression: xVelocity In-Memory Compression Engine. Creation takes up to 1.5x longer than for a b-tree. Algorithms: Value Scale (1023, 1002 → scale 1000; 23, 2); Bit Array (x, y, z, x → 100, 010, 001, 100); Binary Bitmap; Run-Length (x, x, x, y, z, z → 3x, 1y, 2z); Dictionary (x, y, z, x → 1, 2, 3, 1); Huffman; Lempel-Ziv-Welch. The algorithm is decided after sampling (1% / max 1 Mio rows). No estimation available: sp_estimate_data_compression_savings is not supported for ColumnStore.
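The four encodings that the slide illustrates with examples can be reproduced in a few lines. This is only a minimal sketch of the ideas, not the engine's actual implementation; the function names are made up for illustration, and the inputs are the ones from the slide.

```python
from itertools import groupby


def value_scale(values, scale):
    # Keep one shared base and store only the small remainders.
    return scale, [v - scale for v in values]


def bit_array(values):
    # One-hot code per distinct value, in order of first appearance.
    distinct = list(dict.fromkeys(values))
    return [
        "".join("1" if d == v else "0" for d in distinct)
        for v in values
    ]


def run_length(values):
    # Collapse each run of equal neighbours into (count, value).
    return [(len(list(run)), value) for value, run in groupby(values)]


def dictionary(values):
    # Map each distinct value to a small 1-based id (order of first appearance).
    ids = {}
    encoded = []
    for v in values:
        if v not in ids:
            ids[v] = len(ids) + 1
        encoded.append(ids[v])
    return ids, encoded


print(value_scale([1023, 1002], 1000))                # (1000, [23, 2])
print(bit_array(["x", "y", "z", "x"]))                # ['100', '010', '001', '100']
print(run_length(["x", "x", "x", "y", "z", "z"]))     # [(3, 'x'), (1, 'y'), (2, 'z')]
print(dictionary(["x", "y", "z", "x"])[1])            # [1, 2, 3, 1]
```

Run-length encoding pays off most when equal values sit next to each other, which is one reason columns with few distinct values compress so well.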

CREATE INDEX (elapsed time): NCCI (22943 ms); CCI (22879 ms); CCI Archival (22067 ms); PAGE (5668 ms); Clustered (4925 ms); ROW (3282 ms)

INSERT (elapsed time): PAGE (72946 ms); CCI Archival (54221 ms); CCI (54123 ms); NCCI (42596 ms); ROW (21652 ms); Clustered (14905 ms); Heap (6941 ms)

Compression: Heap & NCCI (1622 MB); Heap (1382 MB); ROW (794 MB); PAGE (202 MB); CCI (25 MB); Archival (11 MB)

Dictionary: Primary (global): across all segments, mandatory, preferable. Secondary (local): per segment, optional. sys.column_store_dictionaries

ARCHIVAL Compression: LZ77 compression on top; additional ~30% of space savings; can be applied per partition
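To illustrate what "LZ77 on top" means, the sketch below runs a general-purpose compressor over data that is already dictionary-encoded. Python's zlib (DEFLATE, i.e. LZ77 plus Huffman coding) is only a stand-in here and is not the compressor SQL Server uses; the encoded segment is a hypothetical example.

```python
import zlib

# Hypothetical dictionary-encoded column segment: small ids, one byte each.
encoded_segment = bytes([1, 2, 3, 1] * 1000)

# Second compression pass on top of the already encoded data (stand-in codec).
archived = zlib.compress(encoded_segment, 9)

# Reading archived data requires decompressing it first, which costs CPU.
restored = zlib.decompress(archived)

print(len(encoded_segment), len(archived), restored == encoded_segment)
```

The extra pass trades CPU time on every read for less storage, which fits the slide's hint to apply it per partition, for example to rarely queried ones.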

Column-Elimination: Only the needed columns are read, as opposed to RowStore, where only whole rows can be read

RowStore vs. ColumnStore

Key  AlternateKey  Name                 Stock
1    AR-5381       Adjustable Race      1000
2    BA-8327       Bearing Ball         1000
3    BE-2349       Ball Bearing Cage    800
4    BE-2908       Ball Bearing Grease  800
5    BL-2036       Blade                800
6    CA-5965       LL Crankarm          500
7    CA-6738       ML Crankarm          500

RowStore: whole rows are stored on pages (Page 1 of row store, Page 2 of row store).
ColumnStore: each column is stored in its own segment (Segment for column 1, 2, 3, 4); the segments that cover the same rows form a Rowgroup.
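The difference between the two layouts can be sketched with the sample table from the slide. The code below is an illustrative model only: it counts how many values a SUM over Stock has to touch in each layout, and the function names are made up.

```python
COLUMNS = ("Key", "AlternateKey", "Name", "Stock")

ROWS = [
    (1, "AR-5381", "Adjustable Race", 1000),
    (2, "BA-8327", "Bearing Ball", 1000),
    (3, "BE-2349", "Ball Bearing Cage", 800),
    (4, "BE-2908", "Ball Bearing Grease", 800),
    (5, "BL-2036", "Blade", 800),
    (6, "CA-5965", "LL Crankarm", 500),
    (7, "CA-6738", "ML Crankarm", 500),
]


def to_column_store(rows, columns):
    # One list ("segment") per column instead of one tuple per row.
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def sum_stock_row_store(rows):
    # A row store can only hand out whole rows, so every value is read.
    total = values_read = 0
    for row in rows:
        values_read += len(row)
        total += row[3]
    return total, values_read


def sum_stock_column_store(store):
    # Only the Stock segment is touched; the other columns are eliminated.
    segment = store["Stock"]
    return sum(segment), len(segment)


print(sum_stock_row_store(ROWS))                                 # (5400, 28)
print(sum_stock_column_store(to_column_store(ROWS, COLUMNS)))    # (5400, 7)
```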

Segment: Part of a row group for a single column. # of rows per row group is the same for all segments. No sort order inside a segment. Max. ~1 Mio rows. Trimmed segment(s) because of too few rows, DOP or memory pressure. The bigger, the better the compression. The smaller, the better the segment elimination. Stored as BLOB (through 8k pages). Directory: allocation status, # of rows, min/max value. sys.column_store_segments

Segment-Elimination: Segment is the smallest unit (reading 1 Mio rows or not); decided by min/max value. Alignment.sql
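A small model shows why the min/max values matter and why the alignment of the data decides how many segments can be skipped. This is a sketch under simplifying assumptions: segments hold 4 rows instead of about 1 Mio, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    values: list
    min_value: int
    max_value: int


def build_segments(values, rows_per_segment):
    # Cut the column into segments and record min/max per segment.
    segments = []
    for start in range(0, len(values), rows_per_segment):
        chunk = values[start:start + rows_per_segment]
        segments.append(Segment(chunk, min(chunk), max(chunk)))
    return segments


def count_in_range(segments, low, high):
    # Skip every segment whose min/max range cannot contain a match.
    hits = scanned = 0
    for seg in segments:
        if seg.max_value < low or seg.min_value > high:
            continue  # eliminated without reading its rows
        scanned += 1
        hits += sum(low <= v <= high for v in seg.values)
    return hits, scanned


aligned = build_segments(list(range(1, 13)), 4)
unaligned = build_segments([1, 9, 5, 12, 2, 10, 6, 11, 3, 7, 4, 8], 4)

print(count_in_range(aligned, 6, 7))     # (2, 1): one segment scanned
print(count_in_range(unaligned, 6, 7))   # (2, 3): all segments scanned
```

Both layouts return the same result, but only the aligned one lets the filter eliminate segments.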

Parallel Read Operations

Batch-Mode SQL: Chunks of 1000 rows per batch; better CPU efficiency. Vs. row mode: every row is processed one after another
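The CPU saving comes from paying the per-call overhead of an operator once per batch instead of once per row. The following sketch only counts those calls; it is a conceptual model with made-up function names, using the batch size of 1000 rows stated on the slide.

```python
def row_mode_sum(values):
    # The operator is invoked once for every single row.
    total = calls = 0
    for v in values:
        total += v
        calls += 1
    return total, calls


def batch_mode_sum(values, batch_size=1000):
    # The operator is invoked once per batch and works on the whole chunk.
    total = calls = 0
    for start in range(0, len(values), batch_size):
        total += sum(values[start:start + batch_size])
        calls += 1
    return total, calls


values = list(range(10_000))
print(row_mode_sum(values))     # (49995000, 10000)
print(batch_mode_sum(values))   # (49995000, 10)
```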

Typically 10x faster: Heap (4291 ms); Clustered (2151 ms); ROW (1695 ms); NCCI (970 ms); PAGE (938 ms); CCI (140 ms); CCI Archival (99 ms)

Use Cases: > 1 Mio rows; aggregations, groupings & filters (DWH/OLAP); write once, read multiple times; few distinct values per column. Sweet spot: design (e.g. matching data types, no functions), star & snowflake schema, inner joins. Can substitute datamarts/aggregation tables. ROLAP & Tabular Model DirectQuery. Query Optimizer includes CS Index. Suggested_tables.sql

Restrictions (2014): Data types: ntext, text, and image, vardecimal, varchar(max) and nvarchar(max), rowversion (and timestamp), sql_variant, CLR types (hierarchyid and spatial types), xml, uniqueidentifier; Page/Row compression; Replication; Change Tracking, Change Data Capture; Filestream; Enterprise Edition only

Nonclustered ColumnStore Index (NCCI): May be combined with other indices; decide which columns to include; only one NCCI per table; redundant storage; no sort order; SQL Server 2012+

NCCI Restrictions: No constraints allowed (unique); indexed table is read-only

NCCI Best Practice: Memory, Memory, Memory; include all columns; MAXDOP > 1; fact tables and big dimension tables; choose the clustered index wisely. Update/Insert: disable/enable index; partitioning; view/UNION ALL

Clustered ColumnStore Index (CCI): Physical columnar storage (instead of row based); not really clustered (as stored in unsorted segments); no redundant storage; all columns included automatically; no other index allowed; UPDATE-able; supports more data types; switching between ROW mode and BATCH mode allowed; SQL Server 2014+

CCI: Structure: Operations: CREATE CLUSTERED COLUMNSTORE INDEX, INSERT, DELETE, REORGANIZE, REBUILD. Components: Columnstore, Deleted Bitmap, Deltastore(s)

CCI: Deltastore: Ordinary rowstore; uncompressed heap; does not benefit from batch mode; compression is an expensive operation. States: OPEN / CLOSED / INVISIBLE / TOMBSTONE / (COMPRESSED). Tuple-Mover

CCI: Tuple Mover: Closed delta store → new segment (compressed). Closedown: ~1 Mio rows for INSERT; a lower row count applies for BULK INSERT. Runs every 5 minutes; will pause 15 sec after it has done its job. Invoke REBUILD/REORGANIZE after trickle load. Update/Insert will be blocked
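The interplay of delta store and tuple mover can be pictured as a small state machine: rows trickle into an OPEN delta store, a full delta store becomes CLOSED, and a background job turns CLOSED delta stores into COMPRESSED row groups. The sketch below is a toy model, not the engine's behaviour in detail; the class name and the tiny threshold of 4 rows are assumptions for illustration.

```python
class ColumnstoreTable:
    def __init__(self, close_threshold):
        self.close_threshold = close_threshold  # rows until a delta store closes
        self.open_delta = []      # OPEN: still accepts inserts
        self.closed_deltas = []   # CLOSED: full, waiting for the tuple mover
        self.compressed = []      # COMPRESSED: immutable row groups

    def insert(self, row):
        # Trickle inserts always land in the open delta store.
        self.open_delta.append(row)
        if len(self.open_delta) >= self.close_threshold:
            self.closed_deltas.append(self.open_delta)
            self.open_delta = []

    def tuple_mover(self):
        # Background job: compress every closed delta store into a row group.
        moved = len(self.closed_deltas)
        for delta in self.closed_deltas:
            self.compressed.append(tuple(delta))
        self.closed_deltas = []
        return moved


table = ColumnstoreTable(close_threshold=4)
for i in range(10):
    table.insert(i)

print(len(table.closed_deltas), table.open_delta)   # 2 [8, 9]
print(table.tuple_mover(), table.compressed)        # 2 [(0, 1, 2, 3), (4, 5, 6, 7)]
```

Rows 8 and 9 stay in the open delta store after the run, which is why the slide recommends a REBUILD/REORGANIZE after a trickle load.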

CCI: Deleted Bitmap: Deleted rows of ColumnStore. 2 storage formats: bitmap (in memory), B-tree (on disk). Fully deleted segments are not eliminated; REBUILD to get rid of those rows
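Because compressed row groups are not changed in place, a delete only flags the row, and scans filter the flagged rows out. The sketch below models that idea with a set of row positions standing in for the bitmap; the class and method names are hypothetical.

```python
class CompressedRowgroup:
    def __init__(self, rows):
        self.rows = tuple(rows)   # compressed data is treated as immutable
        self.deleted = set()      # stand-in for the deleted bitmap

    def delete(self, predicate):
        # Logical delete: only flag the positions, the data stays in place.
        for position, row in enumerate(self.rows):
            if predicate(row):
                self.deleted.add(position)

    def scan(self):
        # Every scan has to filter out the flagged rows.
        return [row for position, row in enumerate(self.rows)
                if position not in self.deleted]

    def rebuild(self):
        # A rebuild physically drops the flagged rows.
        return CompressedRowgroup(self.scan())


group = CompressedRowgroup([10, 20, 30, 40])
group.delete(lambda value: value >= 30)

print(group.scan(), len(group.rows))   # [10, 20] 4
rebuilt = group.rebuild()
print(rebuilt.rows, rebuilt.deleted)   # (10, 20) set()
```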

CCI: Restrictions (2014): No constraints (unique, primary key, foreign key); no triggers; not all data types supported; no ISOLATION LEVEL SNAPSHOT; Enterprise Edition only

CCI Best Practice: Memory, Memory, Memory; MAXDOP > 1; fact tables and big dimension tables (suggested_tables.sql); choose the clustered index wisely; go for BULK INSERTs; INSERT locks the whole row group; DELETE locks the whole segment

ColumnStore Index: Non-clustered (SQL Server 2012+): additional index (redundancy); read-only; subset of columns; max. one NCCI per table. Clustered (SQL Server 2014+): master; update-able; all columns; no additional index allowed

ColumnStore vs. In-Memory OLTP: DWH vs. OLTP; aggregation vs. single rows; disk vs. memory; compressed vs. uncompressed

SQL Server 2016: UPDATE-able & filtered NCCI (Deleted Buffer → Deleted Bitmap); non-clustered index (B-tree) on CCI (mapping index in between); In-Memory ColumnStore Index; fewer restrictions (primary key, foreign key, all isolation levels, batch mode support, inline definition, …)

RowStore Index on CCI: Mapping index (B-tree) tracks movements of rows inside the ColumnStore. Row locator: SegmentID:TupleID (vs. FileID:PageID:SlotID)
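One way to picture the mapping index: the B-tree index keeps the row locator it was given when the row was indexed, and the mapping index remembers where that row has moved since. The sketch below is purely conceptual and makes no claim about the real on-disk structures; the class, the sample key and the locator values are assumptions.

```python
class MappingIndex:
    def __init__(self):
        self._moved = {}  # old locator -> new locator

    def record_move(self, old_locator, new_locator):
        # Called whenever a row changes its place inside the columnstore.
        self._moved[old_locator] = new_locator

    def resolve(self, locator):
        # Follow recorded moves until the current position is reached.
        while locator in self._moved:
            locator = self._moved[locator]
        return locator


# Hypothetical B-tree index: key -> (SegmentID, TupleID) at indexing time.
btree_index = {"AR-5381": (0, 0), "BA-8327": (0, 1)}

mapping = MappingIndex()
mapping.record_move((0, 0), (3, 17))   # the row was moved to another segment

print(mapping.resolve(btree_index["AR-5381"]))   # (3, 17)
print(mapping.resolve(btree_index["BA-8327"]))   # (0, 1)
```

With this indirection the B-tree entries do not have to be rewritten each time a row moves.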

Wrap up: Column Store; Non-Clustered Column Store Index; Clustered Column Store Index; What’s new in SQL 2016

Call to Action: Try Non-Clustered Column Store Index; Clustered Column Store Index

Questions? Markus Ehrenmüller-Jensen