Columnstore Index - is it the DW "Faster" switch you are looking for?

Presentation transcript:

Columnstore Index - is it the DW "Faster" switch you are looking for? Johan Ludvig Brattås Oslo – 22/2 2017

About me Johan Ludvig Brattås Johan-ludvig.brattas@capgemini.com @intoleranse Managing consultant, Capgemini Norway SQLSaturday Oslo Excel BI virtual chapter

Agenda: Columnstore basics; What’s new in 2016; What is Operational Analytics and In-Memory Analytics; When to use Clustered Columnstore; When to not use columnstore; Sofie

Columnstore basics: First introduced in SQL Server 2012 (call it version 0.4…); the current version, SQL Server 2016, is about 0.9. Hybrid in-memory technology, based on the xVelocity engine. Column store as opposed to row store. Data is grouped in rowgroups of up to 1,048,576 rows; the minimum size for a rowgroup to be compressed directly on load is 102,400 rows. Column segments within each rowgroup. Deltastore. Batch mode execution.
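To see the rowgroups and the deltastore on a real table, one option is the rowgroup DMV available from SQL Server 2016. This is a minimal sketch, assuming the fact table name used in the later slides and that a columnstore index already exists on it:

```sql
-- Lists every rowgroup of the columnstore index(es) on the table.
-- OPEN and CLOSED rowgroups live in the deltastore;
-- COMPRESSED rowgroups are stored as column segments.
SELECT  rg.partition_number,
        rg.row_group_id,
        rg.state_desc,
        rg.total_rows,        -- at most 1,048,576 per rowgroup
        rg.deleted_rows,
        rg.size_in_bytes,
        rg.trim_reason_desc   -- why a compressed rowgroup holds fewer rows than the maximum
FROM    sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE   rg.object_id = OBJECT_ID(N'dbo.FactPunktlighetRegularitet')
ORDER BY rg.partition_number, rg.row_group_id;
```

Many small COMPRESSED rowgroups or a large number of deleted rows are usually a hint that the index would benefit from maintenance.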

What’s new in 2016? Updateable nonclustered columnstore (a.k.a. Operational Analytics). In-Memory Analytics. Nonclustered rowstore index on tables with Clustered Columnstore. Improved batch mode functionality. Primary and foreign key support on tables with Clustered Columnstore Indexes. String Predicate Pushdown. Simple Aggregate Predicate Pushdown. Improved DMVs. Better index reorganization. Batch mode improvements: single-core (serial) execution plans (for under-specced systems…); batch mode support for the SORT operator; batch mode for multiple DISTINCT COUNT; support for LEFT ANTI SEMI JOIN (NOT EXISTS…).
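A few of the 2016 additions listed above can be sketched in T-SQL. The table dbo.FactDemo and every object name on it are hypothetical and only serve as illustration; the statements assume SQL Server 2016 or later:

```sql
-- Hypothetical demo table; all names are illustrative.
CREATE TABLE dbo.FactDemo
(
    FactKey   bigint NOT NULL,
    PeriodeSk int    NOT NULL,
    Amount    money  NOT NULL
);
GO
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactDemo ON dbo.FactDemo;
GO
-- Primary key on a table with a clustered columnstore index
-- (enforced through a nonclustered rowstore index).
ALTER TABLE dbo.FactDemo
    ADD CONSTRAINT PK_FactDemo PRIMARY KEY NONCLUSTERED (FactKey);
GO
-- Additional nonclustered rowstore index for selective lookups.
CREATE NONCLUSTERED INDEX IX_FactDemo_PeriodeSk
    ON dbo.FactDemo (PeriodeSk);
GO
-- Reorganize the columnstore index; COMPRESS_ALL_ROW_GROUPS also
-- compresses the rowgroups that are still in the deltastore.
ALTER INDEX CCI_FactDemo ON dbo.FactDemo
    REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
```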

What is Operational Analytics? Real-time analytics… Designed only for nonclustered columnstore; the only real case for nonclustered columnstore usage. In-Memory Analytics. Limitations: adds 20% MORE memory usage. DEMO: In-Memory Analytics
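As a rough illustration of the operational analytics pattern, an updateable nonclustered columnstore index can be placed on top of an ordinary OLTP table. The table dbo.Orders, its columns, the status value 5 and the 60-minute delay are all assumptions made for this sketch, not taken from the demo:

```sql
-- Hypothetical OLTP table; names and values are illustrative only.
CREATE TABLE dbo.Orders
(
    OrderId     int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    CustomerId  int           NOT NULL,
    OrderStatus tinyint       NOT NULL,
    OrderDate   datetime2(0)  NOT NULL,
    Amount      decimal(18,2) NOT NULL
);
GO
-- Updateable nonclustered columnstore index for analytics queries.
-- The filter keeps "hot" rows (assumed here: status <> 5) out of the index,
-- and COMPRESSION_DELAY keeps recently changed rows in the deltastore
-- for a while before they are compressed.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
    ON dbo.Orders (CustomerId, OrderStatus, OrderDate, Amount)
    WHERE OrderStatus = 5
    WITH (COMPRESSION_DELAY = 60 MINUTES);
```

The OLTP workload keeps using the clustered rowstore index, while reporting queries that aggregate over many rows can be served from the columnstore index.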

When to use Clustered Columnstore: Huge tables, like really gigantic. Ok? Lots of columns. Star-schema/normalized fact tables. Tables you want to scan, not seek. Demo
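The kind of query that fits the "scan, not seek" profile is an aggregate over a large part of a fact table. The following is not the demo query from the talk, just a sketch against the fact table named on the later slides; it assumes FORSINKELSE_MIN is numeric and ISDELETED is a 0/1 flag:

```sql
-- Report elapsed and CPU time in the Messages tab.
SET STATISTICS TIME ON;

-- Touches only three columns but reads (almost) every row:
-- a typical candidate for columnstore and batch mode.
SELECT  PeriodeSk,
        COUNT(*)             AS AntallRader,
        SUM(FORSINKELSE_MIN) AS SumForsinkelseMin
FROM    dbo.FactPunktlighetRegularitet
WHERE   ISDELETED = 0
GROUP BY PeriodeSk
ORDER BY PeriodeSk;

SET STATISTICS TIME OFF;
```

Running the same statement before and after creating each index is one simple way to produce timings comparable to the ones on the following slides.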

Some other examples on larger tables…
Existing index:
CREATE NONCLUSTERED INDEX [FactPR_Flybevegelse_ix] ON [dbo].[FactPunktlighetRegularitet] ( [FLYBEVEGELSE_ID] ASC )
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Result: (154586 row(s) affected) 00:00:27

Recommended index:
CREATE NONCLUSTERED INDEX [Test_Full] ON [dbo].[FactPunktlighetRegularitet] ([PeriodeSk],[ISDELETED])
INCLUDE ([FactKey],[FlygningArtSk],[FlyplassSk],[FlyselskapSk],[FlytypeSk],[TimeSk],[ForsinkelseSk], [OrganisasjonSk],[AvgAnkSk],[VektKlasseSk],[AvgAnkFlyplassSk],[KansellertSk],[GateSk],[HandlerSk], [FlyoppstillingsplassSk],[AvgAnkFlyoppstillingsplassSk],[FLYBEVEGELSE_ID],[ACT_FLIGHT_KEY],[ACT_FLIGHT_LEG_KEY], [FORSINKELSE_MIN],[FLYBEVEGELSE_DATO],[KALLESIGNAL],[STD],[STA],[ETD],[ETA],[OFF_BLOCK],[ON_BLOCK],[ATD],[ATA], [REGMERKE],[ANTALL_SETER],[MTOW],[NOX_UTSLIPPSVERDI],[FRAKT],[POST],[FRAKT_POST],[FORSINKELSE_MIN_OVER_15_MIN], [FORSINKELSE_MIN_UNDER_3_MIN],[Antall_forsinkelse_kjent_Tid],[Antall_forsinkelse_kjent_Tid_3], [Antall_forsinkelse_totalt],[Totalt_Antall_kjent_tid],[Antall_kanselleringer],[FLIGHT_ID], [FLYBEVEGELSE_ID_KANSELLERT],[FLYBEVEGELSE_ID_FORSINKELSE],[TOUCH_GO_COUNT],[FellesOrgSK],[motsattOrganisasjonSK], [PeriodeSTDSk],[TimeSTDSk],[PeriodeSTASk],[TimeSTASk],[PeriodeATASk],[TimeATASk],[PeriodeATDSk],[TimeATDSk], [PeriodeOnBlockSk],[TimeOnBlockSk],[PeriodeOffBlockSk],[TimeOffBlockSk],[KorrigeringPost],[KorrigeringFrakt], [FirstBag],[Lastbag],[BagsKvalitetSk],[BusGateSk],[OmradeOSLSk])
GO
Result: (154586 row(s) affected) 00:00:05

With Clustered Columnstore:
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactPunktlighetRegularitet ON [dbo].[FactPunktlighetRegularitet];
Result: (154586 row(s) affected) 00:00:09
With NONCLUSTERED Columnstore:
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactPunktlighetRegularitet ON [dbo].[FactPunktlighetRegularitet] (FactKey, FlygningArtSk, FlyplassSk, FlyselskapSk, FlytypeSk, PeriodeSk, TimeSk, ForsinkelseSk, OrganisasjonSk, AvgAnkSk, VektKlasseSk, AvgAnkFlyplassSk, KansellertSk, GateSk, HandlerSk, FlyoppstillingsplassSk, AvgAnkFlyoppstillingsplassSk, FLYBEVEGELSE_ID, ACT_FLIGHT_KEY, ACT_FLIGHT_LEG_KEY, FORSINKELSE_MIN, FLYBEVEGELSE_DATO, KALLESIGNAL, STD, STA, ETD, ETA, OFF_BLOCK, ON_BLOCK, ATD, ATA, REGMERKE, ANTALL_SETER, MTOW, NOX_UTSLIPPSVERDI, FRAKT, POST, FRAKT_POST, BatchId, FORSINKELSE_MIN_OVER_15_MIN, FORSINKELSE_MIN_UNDER_3_MIN, Antall_forsinkelse_kjent_Tid, Antall_forsinkelse_kjent_Tid_3, Antall_forsinkelse_totalt, Totalt_Antall_kjent_tid, Antall_kanselleringer, FLIGHT_ID, FLYBEVEGELSE_ID_KANSELLERT, FLYBEVEGELSE_ID_FORSINKELSE, TOUCH_GO_COUNT, FellesOrgSK, motsattOrganisasjonSK, PeriodeSTDSk, TimeSTDSk, PeriodeSTASk, TimeSTASk, PeriodeATASk, TimeATASk, PeriodeATDSk, TimeATDSk, PeriodeOnBlockSk, TimeOnBlockSk, PeriodeOffBlockSk, TimeOffBlockSk, PeriodeStopBeltSK, TimeStopBeltSK, PeriodeStartBeltSK, TimeStartBeltSK, PARK_STA, Belt_Start_Time, Belt, KorrigeringPost, KorrigeringFrakt, ISDELETED, FirstBag, Lastbag, BagsKvalitetSk, BusGateSk, OmradeOSLSk, Flygningsart)
Result: (154586 row(s) affected) 00:00:10
With both CCI and full row index:
Result: (154586 row(s) affected) 00:00:05

exec sp_spaceused 'FactPunktlighetRegularitet'
Table: FactPunktlighetRegularitet, 19627654 rows
Scenario                          reserved      data          index_size    unused
Current usage                     9095832 KB    8662720 KB    428416 KB     4696 KB
With recommended rowstore index   16164000 KB   (not shown)   7496512 KB    4768 KB
With Clustered Columnstore        2485568 KB    2485328 KB    0 KB          240 KB
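sp_spaceused reports totals for the whole table. To see how much each individual index contributes, a query along the following lines could be used; it is a sketch based on the standard catalog views, not something shown in the talk:

```sql
-- Space and row count per index on the fact table.
-- Page counts are multiplied by 8 because a page is 8 KB.
SELECT  i.name                           AS index_name,   -- NULL for a heap
        i.type_desc,
        SUM(ps.row_count)                AS row_count,
        SUM(ps.reserved_page_count) * 8  AS reserved_kb,
        SUM(ps.used_page_count) * 8      AS used_kb
FROM    sys.dm_db_partition_stats AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   ps.object_id = OBJECT_ID(N'dbo.FactPunktlighetRegularitet')
GROUP BY i.name, i.type_desc
ORDER BY reserved_kb DESC;
```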

When not to use Clustered Columnstore: Small and narrow tables. Tables used to search for single rows or small ranges. Tin can servers… It really depends…

Fraud database: 40 columns, 15 billion rows, 7 TB disk usage. Creating the Clustered Columnstore took 9 hours.
Database   Query              Comment              Feature                              Response time
MS SQL     Query 1 (MS SQL)   disk usage: 1.7 TB   partitioned + column store           44 sec
                              disk usage: 7.7 TB   partitioned + column store + index   4 sec
Oracle     Query 1 (Oracle)                        index                                21:70 sec
                                                   InMemory                             6:13 sec
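The "partitioned + column store + index" combination from the table above can be sketched as follows. The fraud database schema is not shown in the slides, so the table, columns, partition boundaries and index names here are all hypothetical:

```sql
-- Hypothetical yearly partitioning on the transaction date.
CREATE PARTITION FUNCTION pf_TransactionYear (date)
    AS RANGE RIGHT FOR VALUES ('2015-01-01', '2016-01-01', '2017-01-01');
GO
CREATE PARTITION SCHEME ps_TransactionYear
    AS PARTITION pf_TransactionYear ALL TO ([PRIMARY]);
GO
-- Hypothetical, heavily simplified fact table.
CREATE TABLE dbo.FraudTransactions
(
    TransactionId   bigint        NOT NULL,
    AccountId       bigint        NOT NULL,
    TransactionDate date          NOT NULL,
    Amount          decimal(18,2) NOT NULL
) ON ps_TransactionYear (TransactionDate);
GO
-- Partitioned clustered columnstore: compact storage, fast scans.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FraudTransactions
    ON dbo.FraudTransactions
    ON ps_TransactionYear (TransactionDate);
GO
-- Extra rowstore index for selective lookups; it costs disk space
-- but lets the optimizer seek instead of scan.
CREATE NONCLUSTERED INDEX IX_FraudTransactions_AccountId
    ON dbo.FraudTransactions (AccountId);
```

This mirrors the trade-off in the numbers above: the additional rowstore index increases disk usage considerably but can cut the response time for selective queries.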

Links http://www.nikoport.com/columnstore https://github.com/NikoNeugebauer/CISL https://github.com/NikoNeugebauer/MOSL