A Lap Around Columnstore. Martin Catherall. SQL Saturday #464, Melbourne, 20th February 2016.
About Me. Melbourne-based SQL Server Consultant with RockSolid SQL. Data Platform MVP. PASS Regional Mentor - APAC. SQL Saturday Organiser, South Island New Zealand (formerly SQLSat Christchurch). Christchurch SQL User Group Lead.
A Lap Around Columnstore Indexing - Agenda. A tour of the evolution of Columnstore: 2012 > 2014 > 2016. Real-Time Operational Analytics. Storage format and internals. DMVs / extended events for monitoring and learning.
Let’s take a step back... why? New syntax and new functionality: Columnstore. Batch mode: a new addition to operators in query plans (more on this later). Scales well. But along with columnstore came some unfamiliar language for existing functionality: Rowstores, Row Mode. CLUSTERED does not mean sorted; a clustered columnstore index is not like a clustered rowstore index.
Some Additional Terms. Vertipaq / xVelocity: also used with PowerPivot and Tabular. In-memory Analytics: the marketing term. Columnstore: the technology term. Operational Analytics: “the ability to run performant real-time analytics on a transactional workload.” In-memory (as a general term) covers In-Memory OLTP and Columnstore. Also: BPE (Buffer Pool Extension), Delayed Durability.
A Quick Overview. So, what does columnstore give us? Compression. Performance. But didn’t we have these things anyway? How often is row / page compression used? Let’s do a quick comparison.
Rowstores. The complete row is stored on a page, so when we read a row, ALL columns of that row are read into memory. Rows are stored in B-Tree format (both clustered and nonclustered). Compression is not on by default; you have to ask for it (ROW or PAGE). Compression is mostly a good thing, but maybe not always. Columnstore indexes, by contrast, are ALWAYS compressed. Data is ORDERed in the index. Additional columns can be INCLUDEd. Indexes can be filtered.
A Quick (very general) Overview.
Comparison. As the name suggests, columnstore data is held in columnar format. As there is likely to be a lot of duplication within a column, opportunities for compression are high. Columnstore compression comes by default: you always have it and you cannot disable it.
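Why duplication within a column helps compression can be shown with a toy example. The sketch below pivots a few hypothetical rows into columns and run-length encodes them. The sample data and the helper names (`to_columns`, `run_length_encode`) are made up for illustration, and this is not SQL Server's actual encoding, only a simple stand-in for the idea.

```python
from itertools import groupby

# Hypothetical sample rows: (order_id, country, status)
rows = [
    (1, "AU", "shipped"),
    (2, "AU", "shipped"),
    (3, "AU", "pending"),
    (4, "NZ", "pending"),
    (5, "NZ", "pending"),
    (6, "NZ", "shipped"),
]


def to_columns(rows):
    """Pivot row tuples into one list per column (a columnar layout)."""
    return [list(column) for column in zip(*rows)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


order_id, country, status = to_columns(rows)

# Repeated values sit next to each other in a column, so they collapse well.
encoded_country = run_length_encode(country)
encoded_status = run_length_encode(status)

print(encoded_country)  # [('AU', 3), ('NZ', 3)]
print(encoded_status)   # [('shipped', 2), ('pending', 3), ('shipped', 1)]
```

In a row layout the same values are interleaved with the other columns of each row, so runs like these never form. A query that needs only `country` can also skip the other columns entirely.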
SQL Server 2012. Introduced columnstore, specifically the Nonclustered Columnstore Index. NOTE: the NONCLUSTERED keyword was NOT required. What was the promise? More compact storage. Faster querying. What was the reality? The table was still stored as a rowstore or a heap. The index was NOT updatable. Very few operators supported Batch Mode, so queries often fell back to Row Mode. NOTE: no ordering of the columnstore is required.
SQL Server 2014. Introduced CLUSTERED columnstore indexes, which could be updated. Remember the CLUSTERED keyword (although confusion is unlikely). NONCLUSTERED COLUMNSTORE was still read-only. No mixing of the two types. With CLUSTERED columnstore, certain data types were disallowed. No additional constraints were supported on a CCI table. More operators supported batch mode. Introduced COLUMNSTORE_ARCHIVE for increased compression: it trades performance for disk space (for data accessed rarely).
SQL Server 2016 (CTP 3.3). Built on the 2012 / 2014 features. Introduced updatable NCCI. Additional indexes / constraints supported. Filtered Columnstore indexes. More data types supported. More operators supported for “Batch Mode”.
Creating Columnstore Indexes. Through the GUI, directly through T-SQL, or by scripting off from the GUI.
Columnstore format. Data in the columnstore is immutable. Delete bitmap: deletes are tracked (soft delete). Delta store: inserts are stored here. How are updates handled? As a delete plus an insert. BULK inserts. TRICKLE inserts. The Tuple Mover updates the columnstore (CS).
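The moving parts named on this slide can be sketched as a toy model. Everything here is an assumption for illustration: the class `ToyColumnstore`, its method names, and the tiny `delta_limit` threshold are hypothetical, and the real engine works on large rowgroups rather than a handful of rows.

```python
class ToyColumnstore:
    """Toy model: immutable compressed rows + delete bitmap + delta store."""

    def __init__(self, delta_limit=3):
        self.compressed = []            # existing entries are never edited in place
        self.deleted = set()            # delete bitmap: positions in `compressed`
        self.delta = []                 # delta store: recent (trickle) inserts
        self.delta_limit = delta_limit  # hypothetical threshold, tiny for the demo

    def insert(self, key, value):
        """Trickle insert: new rows land in the delta store."""
        self.delta.append((key, value))

    def delete(self, key):
        """Delta rows are removed outright; compressed rows are only flagged."""
        self.delta = [row for row in self.delta if row[0] != key]
        for position, row in enumerate(self.compressed):
            if row[0] == key:
                self.deleted.add(position)

    def update(self, key, value):
        """An update is a delete followed by an insert."""
        self.delete(key)
        self.insert(key, value)

    def tuple_mover(self):
        """Move a full delta store into the compressed area as a new batch."""
        if len(self.delta) >= self.delta_limit:
            self.compressed.extend(self.delta)
            self.delta = []

    def scan(self):
        """Live rows: compressed rows not flagged as deleted, plus delta rows."""
        live = [row for position, row in enumerate(self.compressed)
                if position not in self.deleted]
        return live + self.delta


store = ToyColumnstore(delta_limit=3)
for key, value in [(1, "a"), (2, "b"), (3, "c")]:
    store.insert(key, value)
store.tuple_mover()   # delta store is full, so its rows move to `compressed`
store.update(2, "B")  # flags the old row as deleted, new version goes to delta
store.delete(3)       # soft delete: only the bitmap changes

print(store.scan())   # [(1, 'a'), (2, 'B')]
```

Note that the compressed rows for keys 2 and 3 are still physically present after the update and delete; readers simply skip them via the bitmap.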
SQL Server 2016 – Operational Analytics. “the ability to run performant real-time analytics on a transactional workload.” Real-time analytics queries on an OLTP system. Ad-hoc queries. Will this mean cubes are going... going... gone? Any need for pre-aggregated cube queries? Will this remove Analysis Services from the equation?
Modes (Row and Batch)
References. Niko Neugebauer: multi-part series on Columnstore internals. PASS 2014 / 2015 presentations. Sunil Agarwal (MS PM, Columnstore team): anything from Sunil (Ignite / PASS sessions). SQL CAT: PASS 2015 presentation, Customer Scenarios and Best Practices.
That’s all folks - Questions?