Published by Lorena Butler. Modified over 6 years ago.
1
5/25/2018 5:29 AM
BRK3081
Delivering High Performance Analytics with Columnstore Index on Traditional DW and HTAP Workloads
Sunil Agarwal (Microsoft), Aaron Gerdeman (FIS)
© Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
2
Please evaluate this session
From your PC or tablet: visit MyIgnite
From your phone: download and use the Microsoft Ignite mobile app
Your input is important!
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
3
Agenda
    Why HTAP?
    Strategies to run HTAP
    Customer Success Story
    Common Performance Pitfalls and Recommendations
4
Traditional Analytics Architecture
Microsoft Ignite 2015
Diagram: the Presentation Layer and IIS Server (Application Tier) work against the SQL Server Database. ETL (hourly, daily, weekly) loads a separate SQL Server Relational DW Database, which feeds SQL Server Analysis Server and BI and analytics (dashboards, reporting).
Transactional statement:
    INSERT INTO <transactions> VALUES ('<upc-code>', 'flowers', $20.00)
Analytics query:
    SELECT ProductName, ExpiryDate, SUM(inventory - item_sold)
    FROM <transactions>
    WHERE [Date] >= DATEADD(day, -1, GETDATE())
    GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date])
Key Issues:
    Complex implementation
    Requires two servers (CapEx and OpEx)
    Data latency in analytics
    More businesses demand or require real-time analytics
5
Demo: Value Proposition
Sunil Agarwal
6
ColumnStore Index Performance
Promise:
    10x data compression (typical)
    Up to 100x query performance
Magic:
    Store data as columns
    BATCH mode execution
    Predicate pushdown
    Aggregate pushdown
    Efficient data elimination
Data stored as columns (diagram: columns C1 to C5, each stored as a segment within a rowgroup):
    Improved compression: data from the same domain compresses better
    Reduced I/O: fetch only the columns needed
    Improved performance: more data fits in memory
    Optimized for CPU utilization
Ideal for DW workload
7
Minimizing Data Latency for Analytics
Diagram: the Presentation Layer and IIS Server (Application Tier) work against the SQL Server Database. A columnstore index is added to that same database, and SQL Server Analysis Server and BI and analytics (dashboards, reporting) read from it directly. This is real-time analytics.
Transactional statement:
    INSERT INTO <transactions> VALUES ('<upc-code>', 'flowers', $20.00)
Analytics query:
    SELECT ProductName, ExpiryDate, SUM(inventory - item_sold)
    FROM <transactions>
    WHERE [Date] >= DATEADD(day, -1, GETDATE())
    GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date])
Benefits:
    No data latency
    No ETL
    No separate DW
Challenges:
    Minimizing impact on the OLTP workload
    Delivering performant analytics
8
Real-time Analytics: Nonclustered Columnstore Index (NCCI)
Diagram components: relational table (disk-based, clustered index/heap), btree index, and a nonclustered columnstore index (NCCI) with its delete bitmap, delete buffer, and delta rowgroups.
Key Points:
    Create an updateable nonclustered columnstore index (NCCI) for analytics queries
    Drop all other indexes that were created for analytics
    No OLTP application changes
    The columnstore index automatically keeps up with DML operations
    The query optimizer will choose the columnstore index where needed
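As a rough illustration of the key points on this slide, a nonclustered columnstore index could be added to an existing OLTP table along these lines. The table dbo.Orders, its columns, and the index name are hypothetical and not taken from the session; the sketch assumes SQL Server 2016 or later, where an NCCI is updateable. GO is the batch separator used by tools such as SSMS and sqlcmd.

```sql
-- Hypothetical OLTP table; names and columns are illustrative only.
CREATE TABLE dbo.Orders (
    order_id     INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    customer_id  INT          NOT NULL,
    order_status VARCHAR(20)  NOT NULL,
    order_date   DATETIME2(0) NOT NULL,
    amount       MONEY        NOT NULL
);
GO

-- Add an updateable NCCI covering the columns that analytics queries read.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
    ON dbo.Orders (customer_id, order_status, order_date, amount);
GO

-- The analytics query itself needs no changes; the optimizer may choose the NCCI.
SELECT customer_id, SUM(amount) AS total_amount
FROM dbo.Orders
GROUP BY customer_id;
```

The OLTP statements against dbo.Orders stay as they are; only the index is new.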
9
Real-time Analytics: Minimizing Columnstore Index overhead
Diagram components: OLTP workload, relational table (clustered index/heap) with HOT rows, btree index, and a filtered nonclustered columnstore index (NCCI) with its delete bitmap, delete buffer, and delta rowgroups.
Key Points:
    Create the columnstore index only on cold data, using a filter predicate, to minimize index maintenance
    Analytics queries access both the columnstore and the 'hot' data transparently
Example (order management application):
    create nonclustered columnstore index ….. where order_status = 'SHIPPED'
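A filtered NCCI for the order management example might look like the following sketch. The table, column list, and index name are assumptions made for illustration; only the order_status = 'SHIPPED' predicate comes from the slide.

```sql
-- Assumes a hypothetical table dbo.Orders with columns
-- customer_id, order_status, order_date, and amount.
-- Only rows that have reached the 'SHIPPED' status are indexed, so orders
-- that are still being updated (hot rows) add no columnstore maintenance.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Shipped
    ON dbo.Orders (customer_id, order_status, order_date, amount)
    WHERE order_status = 'SHIPPED';
```

A filter like this works best when the predicate marks a final state, so that a row rarely changes again after it enters the columnstore.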
10
Real-time Analytics: Minimizing Columnstore Index overhead
Diagram components: relational table (clustered index/heap) with HOT rows, btree index, and a nonclustered columnstore index (NCCI) with compression delay, showing its delete bitmap, delete buffer, and delta rowgroups.
Syntax:
    CREATE NONCLUSTERED COLUMNSTORE INDEX <name> ON <table> (<columns>)
    WITH (COMPRESSION_DELAY = 30 MINUTES)
Key Points:
    A delta rowgroup is compressed only after the COMPRESSION_DELAY duration has passed
    Minimizes or eliminates index fragmentation, because hot rows are updated or deleted while they are still in the delta rowgroup
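The delay can also be changed on an existing index. The sketch below assumes a hypothetical index NCCI_Orders on a hypothetical table dbo.Orders; the 60-minute value is arbitrary and would need tuning to how long rows stay hot in a given workload.

```sql
-- Assumes an existing nonclustered columnstore index NCCI_Orders
-- on a hypothetical table dbo.Orders.
ALTER INDEX NCCI_Orders ON dbo.Orders
    SET (COMPRESSION_DELAY = 60 MINUTES);

-- Check the current setting; sys.indexes reports the delay in minutes.
SELECT name, compression_delay
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.Orders')
  AND type_desc = 'NONCLUSTERED COLUMNSTORE';
```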
11
Real-time Analytics: Minimizing Columnstore overhead
Diagram: an AlwaysOn Availability Group with a primary replica and secondary replicas.
Real-time workload (primary replica):
    INSERT INTO <transactions> VALUES ('<upc-code>', 'flowers', $20.00)
Analytics workload (secondary replica):
    SELECT ProductName, ExpiryDate, SUM(inventory - item_sold)
    FROM <transactions>
    WHERE [Date] >= DATEADD(day, -1, GETDATE())
    GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date])
Key Points:
    Mission-critical real-time workloads are typically configured for high availability using AlwaysOn Availability Groups
    You can offload analytics to a readable secondary replica
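Making a secondary replica readable is a configuration step on the availability group. A minimal sketch follows; the availability group name AG_Trading and the replica name SQLNODE2 are hypothetical.

```sql
-- Hypothetical availability group and replica names.
-- Run on the primary replica: allow read-only connections to the secondary.
ALTER AVAILABILITY GROUP [AG_Trading]
    MODIFY REPLICA ON N'SQLNODE2'
    WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
```

Analytics clients would then connect with ApplicationIntent=ReadOnly in their connection string, so that only the reporting workload lands on the secondary.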
12
Fidelity National Information Services (FIS)
Microsoft Ignite 2016
Aaron Gerdeman
13
FIS: Real-Time Analytics for Securities Lending
Product Background:
    Securities lending market data
    ~$2 trillion daily volume
    Traders use our product for price transparency and market intelligence
Enhancement Opportunity:
    Increased analytics on our real-time data
    Provide a real-time dashboard for clients to track their trading performance
About FIS:
    FIS is a global leader in financial services technology, with a focus on retail and institutional banking, payments, asset and wealth management, risk and compliance, consulting, and outsourcing solutions. Through the depth and breadth of our solutions portfolio, global capabilities and domain expertise, FIS serves more than 20,000 clients in over 130 countries. Headquartered in Jacksonville, Fla., FIS employs more than 53,000 people worldwide and holds leadership positions in payment processing, financial software and banking solutions. Providing software, services and outsourcing of the technology that empowers the financial world, FIS is a Fortune 500 company and is a member of Standard & Poor's 500® Index. For more information about FIS, visit
14
FIS: Real-Time Analytics for Securities Lending
Starting Point:
    SQL Server 2012 with rowstore for transactional processing
    ETL batches:
        Nightly for end-of-day data
        Every 5 minutes on streaming data for near real-time analytics
    Total transaction volume: 1.2 million rows per 8 hours of trading
    App displays rich analytics on end-of-day data, but just simple real-time analytics
15
FIS: Real-Time Analytics for Securities Lending
Challenges:
    Enhanced real-time analytics would require additional ETL processing, possibly impacting the transactional workload
    The additional ETL would complicate our workflow, when we would rather simplify it
    The application data layer would become more complex as we build more specialized tables for certain analytics
    Latency exists with periodic ETL batches (and in trading, "time = money")
16
FIS: Real-Time Analytics for Securities Lending
Solution: SQL Server 2016 for HTAP
    Nonclustered columnstore index (NCCI)
    Forced compression every 100,000 rows inserted
    Excellent analytics query performance
    No impact on transactional workload
    Simple: no new ETL, no query changes, minimal database changes (just add the NCCI)
Another Option:
    In-memory OLTP table with a clustered columnstore index
    Very fast with all 1.2 million rows compressed and with natively compiled procedures
    Not as simple for us to set up as the NCCI
    Requires some query changes
17
FIS: Real-Time Analytics for Securities Lending
Performance Numbers:
    Multiple simulated dashboard queries executed in less than three seconds with the nonclustered columnstore index, about four times faster than with the rowstore index.
18
FIS: Real-Time Analytics for Securities Lending
Case Study:
    For more details on our solution, download the case study here: Financial services firm accelerates real-time analytics with SQL Server 2016
19
Columnstore Index: Are you getting the best Performance?
20
Columnstore Performance – Statistics
Scenario: Query runs slowly after index creation or data load
Diagnosis: Missing or stale statistics, requiring new statistics and recompilation
Why?
    A columnstore index has no key columns, so creating it does not build statistics on key columns
    Statistics take longer to build because data sizes are typically large
    A periodic data load can cross the statistics update threshold
Recommendations:
    Manually create statistics on the required columns
    Use async mode for statistics updates
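The two recommendations could be applied roughly as follows. The database SalesDW, the table dbo.FactSales, and the column and statistics names are hypothetical; which columns need statistics depends on the predicates and joins in the actual queries.

```sql
-- Assumes a hypothetical database SalesDW containing a table dbo.FactSales
-- with an order_date column that queries filter on.

-- Create statistics manually on a column used in predicates.
CREATE STATISTICS st_FactSales_order_date
    ON dbo.FactSales (order_date);

-- Refresh them explicitly after a large data load.
UPDATE STATISTICS dbo.FactSales st_FactSales_order_date;

-- Let automatic statistics updates run asynchronously, so a query does not
-- wait for the update (requires AUTO_UPDATE_STATISTICS to be ON).
ALTER DATABASE SalesDW SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```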
21
DEMO: Stale Statistics
Sunil Agarwal
22
Columnstore Performance: Index Fragmentation
Scenario: Query gradually slows down over time or after a large ETL
Diagnosis:
    Large number of deleted rows in compressed rowgroups
    Large number of rows in delta rowgroups
Why?
    Compressed columnstore rowgroups are read-only, so deleted rows are only marked as deleted and are NOT physically removed
    Large number of delta rowgroups
Recommendations:
    Periodically defragment the index using ONLINE REORGANIZE
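One way to check for this kind of fragmentation and then act on it is sketched below, assuming SQL Server 2016 or later. The table dbo.FactSales and the index CCI_FactSales are hypothetical names.

```sql
-- Assumes a hypothetical table dbo.FactSales with a columnstore index
-- named CCI_FactSales.

-- Inspect rowgroups: many deleted_rows in COMPRESSED rowgroups, or many
-- OPEN/CLOSED delta rowgroups, point to fragmentation.
SELECT row_group_id, state_desc, total_rows, deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales')
ORDER BY row_group_id;

-- Defragment; REORGANIZE is an online operation.
ALTER INDEX CCI_FactSales ON dbo.FactSales REORGANIZE;
```

Running the first query before and after the REORGANIZE gives a quick view of how many deleted rows were actually cleaned up.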
23
DEMO: Index Fragmentation
Sunil Agarwal
24
Columnstore Performance: Scanning Large data
Scenario: Slow query performance due to a large data scan
Diagnosis: No or only a few rowgroups are skipped
Why? Wide range between the min and max values, within each rowgroup, of the columns used in predicates
Recommendations:
    Create an ordered columnstore index:
        Identify the column frequently used in predicates
        Create a btree clustered index on that column as a first step before creating the columnstore index
    Use partitioning
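The "ordered columnstore" steps can be sketched as below. The table dbo.FactSales, its columns, and the index name are hypothetical. The approach assumes that building the columnstore index with MAXDOP = 1 from an already sorted btree index largely preserves the sort order; the order is not maintained for rows loaded later.

```sql
-- Assumes a hypothetical heap table dbo.FactSales whose order_date column
-- is frequently used in query predicates.

-- Step 1: sort the data by building a btree clustered index on that column.
CREATE CLUSTERED INDEX CCI_FactSales ON dbo.FactSales (order_date);

-- Step 2: convert it to a clustered columnstore index under the same name.
-- MAXDOP = 1 keeps the build single-threaded so rows are compressed in order.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales ON dbo.FactSales
    WITH (DROP_EXISTING = ON, MAXDOP = 1);

-- Step 3: check rowgroup elimination; the STATISTICS IO output reports
-- segment reads and segments skipped for columnstore scans.
SET STATISTICS IO ON;
SELECT SUM(amount) AS total_amount
FROM dbo.FactSales
WHERE order_date >= '2018-05-01' AND order_date < '2018-06-01';
```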
25
Rowgroup Elimination
Sunil Agarwal
26
Columnstore Performance: No Aggregate Pushdown
Scenario: Slow query performance involving aggregates
Diagnosis: Aggregate computation is not pushed down to the SCAN node
Why?
    Scenario limitations. For example, aggregate pushdown applies only to data types of 8 bytes or less
    Not supported for rows in delta rowgroups
Recommendations:
    Work around the limitations
    Remove delta rowgroups
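Both recommendations might be approached as in the following sketch. The table, index, and column names are hypothetical; quantity is assumed to be an INT column (4 bytes), which fits within the 8-byte limit.

```sql
-- Assumes a hypothetical table dbo.FactSales with a columnstore index
-- named CCI_FactSales and an INT column named quantity.

-- Remove delta rowgroups by forcing open and closed delta rowgroups
-- into compressed rowgroups.
ALTER INDEX CCI_FactSales ON dbo.FactSales
    REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

-- Aggregate over a column whose data type is 8 bytes or less,
-- so the aggregate is eligible for pushdown to the scan.
SELECT SUM(quantity) AS total_quantity, COUNT(*) AS row_count
FROM dbo.FactSales;
```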
27
DEMO: Aggregate Pushdown
Sunil Agarwal
28
Columnstore Performance: Bad Query Plan
Scenario: Slow query performance due to a suboptimal query plan
Diagnosis:
    Stream operators chosen instead of hash operators
    Batch mode operators not chosen
Why?
    Stale statistics
    Query optimizer limitations
Recommendations:
    Try a different cardinality estimator
    Provide a query operator hint
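The two recommendations map onto query hints such as the ones below. This is only a sketch: the table and columns are hypothetical, USE HINT assumes SQL Server 2016 SP1 or later, and whether either hint improves a given plan has to be verified by comparing the actual execution plans.

```sql
-- Assumes a hypothetical table dbo.FactSales with columns
-- product_id and quantity.

-- Try the legacy cardinality estimator for this query only.
SELECT product_id, SUM(quantity) AS total_quantity
FROM dbo.FactSales
GROUP BY product_id
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

-- Request hash aggregation instead of a stream aggregate.
SELECT product_id, SUM(quantity) AS total_quantity
FROM dbo.FactSales
GROUP BY product_id
OPTION (HASH GROUP);
```

Hints pin the optimizer's choices, so they are usually worth revisiting after statistics are refreshed or the server is upgraded.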
29
DEMO: Bad Query Plan
Sunil Agarwal
30
Summary
    Why Columnstore Index?
    Strategies to run HTAP
    Customer Success Story
    Common Performance Pitfalls and Recommendations