5/25/2018 5:29 AM BRK3081 Delivering High Performance Analytics with Columnstore Index on Traditional DW and HTAP Workloads Sunil Agarwal (Microsoft) Aaron Gerdeman (FIS)

Presentation transcript:

5/25/2018 5:29 AM BRK3081 Delivering High Performance Analytics with Columnstore Index on Traditional DW and HTAP Workloads Sunil Agarwal (Microsoft) Aaron Gerdeman (FIS) © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Please evaluate this session
From your PC or tablet: visit MyIgnite https://myignite.microsoft.com/evaluations
Phone: download and use the Microsoft Ignite mobile app https://aka.ms/ignite.mobileapp
Your input is important!
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Agenda
- Why HTAP?
- Strategies to run HTAP
- Customer Success Story
- Common Performance Pitfalls and Recommendations

Traditional Analytics Architecture
Diagram labels: Presentation Layer; Application Tier (IIS Server); SQL Server Database; ETL (hourly, daily, weekly); SQL Server Relational DW Database; SQL Server Analysis Server; BI and analytics (dashboards, reporting).
Transactional statement: INSERT INTO <transactions> VALUES ('<upc-code>', 'flowers', $20.00)
Analytics query: SELECT ProductName, ExpiryDate, SUM(inventory - item_sold) FROM <transactions> WHERE [Date] >= DATEADD(day, -1, GETDATE()) GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date])
Key Issues:
- Complex implementation
- Requires two servers (CapEx and OpEx)
- Data latency in analytics
- More businesses demand/require real-time analytics

Demo: Value Proposition (Sunil Agarwal)

ColumnStore Index Performance
Promise:
- 10x data compression (typical)
- Up to 100x query performance
Magic:
- Store data as columns
- BATCH mode execution
- Predicate pushdown
- Aggregate pushdown
- Efficient data elimination
Data stored as columns (diagram: columns C1 to C5, divided into rowgroups and segments):
- Improved compression: data from the same domain compresses better
- Reduced I/O: fetch only the columns needed
- Improved performance: more data fits in memory
- Optimized for CPU utilization
Ideal for DW workload

Minimizing Data Latency for Analytics
Diagram labels: Presentation Layer; Application Tier (IIS Server); SQL Server Database ("Add columnstore index"); SQL Server Analysis Server; BI and analytics (dashboards, reporting). This is real-time analytics.
Transactional statement: INSERT INTO <transactions> VALUES ('<upc-code>', 'flowers', $20.00)
Analytics query: SELECT ProductName, ExpiryDate, SUM(inventory - item_sold) FROM <transactions> WHERE [Date] >= DATEADD(day, -1, GETDATE()) GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date])
Benefits:
- No data latency
- No ETL
- No separate DW
Challenges:
- Minimizing impact on OLTP workload
- Delivering performant analytics

Real-time Analytics: Nonclustered Columnstore Index (NCCI)
Diagram labels: relational table (disk-based; clustered index/heap); B-tree index; nonclustered columnstore index (NCCI) with delta rowgroups, delete bitmap, and delete buffer.
Key Points:
- Create an updateable nonclustered columnstore index (NCCI) for analytics queries
- Drop all other indexes that were created for analytics
- No OLTP application changes
- Columnstore index automatically keeps up with DML operations
- Query Optimizer will choose the columnstore index where needed
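
The key points above can be sketched in T-SQL. This is a minimal illustration, not the schema used in the session: the table dbo.Transactions, its columns, and the index name ncci_Transactions are assumptions chosen to match the shape of the sample query on the earlier slides.

```sql
-- Hypothetical OLTP table; names and columns are illustrative only.
CREATE TABLE dbo.Transactions (
    TransactionId BIGINT IDENTITY(1,1) PRIMARY KEY,  -- rowstore clustered index serves the OLTP workload
    ProductName   NVARCHAR(100) NOT NULL,
    ExpiryDate    DATE          NOT NULL,
    Inventory     INT           NOT NULL,
    ItemSold      INT           NOT NULL,
    [Date]        DATETIME2     NOT NULL
);

-- Updateable NCCI over the columns that the analytics queries read.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_Transactions
ON dbo.Transactions (ProductName, ExpiryDate, Inventory, ItemSold, [Date]);

-- The analytics query needs no changes; the optimizer may pick the NCCI for it.
SELECT ProductName, ExpiryDate, SUM(Inventory - ItemSold) AS Remaining
FROM dbo.Transactions
WHERE [Date] >= DATEADD(day, -1, GETDATE())
GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date]);
```

Inserts, updates, and deletes against dbo.Transactions continue to use ordinary DML; the index is maintained by the engine.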

Real-time Analytics: Minimizing Columnstore Index overhead (filtered index)
Diagram labels: OLTP workload; relational table (clustered index/heap) with HOT rows; B-tree index; nonclustered columnstore index (NCCI) with delta rowgroups, delete bitmap, and delete buffer.
Key Points:
- Create the columnstore index only on cold data, using a filter predicate to minimize maintenance
- Analytics queries access both columnstore and 'hot' data transparently
- Example (Order Management Application): create nonclustered columnstore index ….. where order_status = 'SHIPPED'
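
A filled-in version of the order management example might look like the following. The slide elides the index name and column list, so dbo.Orders, its columns, and ncci_Orders_shipped are hypothetical; only the filter predicate comes from the slide.

```sql
-- Hypothetical order table; only the order_status filter is taken from the slide.
CREATE TABLE dbo.Orders (
    order_id     BIGINT IDENTITY(1,1) PRIMARY KEY,
    customer_id  INT         NOT NULL,
    order_status VARCHAR(20) NOT NULL,   -- for example 'OPEN' or 'SHIPPED'
    order_total  MONEY       NOT NULL,
    order_date   DATETIME2   NOT NULL
);

-- Filtered NCCI: only rows that have reached the cold 'SHIPPED' state are
-- kept in columnstore format, so frequently updated open orders stay out of it.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_Orders_shipped
ON dbo.Orders (customer_id, order_status, order_total, order_date)
WHERE order_status = 'SHIPPED';

-- An analytics query over all orders is written as usual, with no reference
-- to the filter; hot rows are read from the rowstore.
SELECT customer_id, SUM(order_total) AS total_spend
FROM dbo.Orders
GROUP BY customer_id;
```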

Real-time Analytics: Minimizing Columnstore Index overhead (compression delay)
Diagram labels: relational table (clustered index/heap) with HOT rows; B-tree index; nonclustered columnstore index (NCCI) with delta rowgroups, delete bitmap, and delete buffer.
Syntax: CREATE NONCLUSTERED COLUMNSTORE INDEX <name> ON <table> (<columns>) WITH (COMPRESSION_DELAY = 30 MINUTES)
Key Points:
- A delta rowgroup is only compressed after the 'Compression_Delay' duration
- Minimizes/eliminates index fragmentation
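
As a concrete instance of the syntax above, the sketch below creates an NCCI with a 30-minute delay and then changes the delay afterwards. The table dbo.OrderEvents, its columns, and the index name are assumptions for illustration; the 60-minute value is arbitrary.

```sql
-- Hypothetical table whose rows are updated for a while after insert.
CREATE TABLE dbo.OrderEvents (
    event_id   BIGINT IDENTITY(1,1) PRIMARY KEY,
    order_id   BIGINT      NOT NULL,
    event_type VARCHAR(20) NOT NULL,
    event_time DATETIME2   NOT NULL
);

-- Closed delta rowgroups wait at least 30 minutes before being compressed,
-- which gives hot rows time to settle before they become read-only.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_OrderEvents
ON dbo.OrderEvents (order_id, event_type, event_time)
WITH (COMPRESSION_DELAY = 30 MINUTES);

-- The delay can be adjusted later without rebuilding the index.
ALTER INDEX ncci_OrderEvents ON dbo.OrderEvents
SET (COMPRESSION_DELAY = 60 MINUTES);
```

A reasonable starting value is the time window in which most updates and deletes to a newly inserted row occur.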

Real-time Analytics: Minimizing Columnstore overhead (AlwaysOn Availability Group)
Diagram labels: real-time workload on the primary replica; analytics workload on a secondary replica.
Real-time workload: INSERT INTO <transactions> VALUES ('<upc-code>', 'flowers', $20.00)
Analytics workload: SELECT ProductName, ExpiryDate, SUM(inventory - item_sold) FROM <transactions> WHERE [Date] >= DATEADD(day, -1, GETDATE()) GROUP BY ProductName, ExpiryDate, DATEPART(HOUR, [Date])
Key Points:
- Mission-critical real-time workloads are typically configured for high availability using AlwaysOn Availability Groups
- You can offload analytics to a readable secondary replica
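
One way to make a secondary readable is sketched below. It assumes an availability group already exists; the group name [MyAg] and replica name N'SQLNODE2' are placeholders, not names from the session.

```sql
-- Run on the primary replica. [MyAg] and N'SQLNODE2' are placeholder names.
-- READ_ONLY admits only connections that declare read intent.
ALTER AVAILABILITY GROUP [MyAg]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

-- Analytics clients then declare read intent in their connection string,
-- for example by adding:  ApplicationIntent=ReadOnly
```

The OLTP application keeps connecting to the primary as before, so only the reporting connection strings change.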

Fidelity National Information Services (FIS) (Aaron Gerdeman)

FIS: Real-Time Analytics for Securities Lending
Product Background:
- Securities lending market data
- ~$2 trillion daily volume
- Traders use our product for price transparency and market intelligence
Enhancement Opportunity:
- Increased analytics on our real-time data
- Provide a real-time dashboard for clients to track their trading performance
About FIS: FIS is a global leader in financial services technology, with a focus on retail and institutional banking, payments, asset and wealth management, risk and compliance, consulting, and outsourcing solutions. Through the depth and breadth of our solutions portfolio, global capabilities and domain expertise, FIS serves more than 20,000 clients in over 130 countries. Headquartered in Jacksonville, Fla., FIS employs more than 53,000 people worldwide and holds leadership positions in payment processing, financial software and banking solutions. Providing software, services and outsourcing of the technology that empowers the financial world, FIS is a Fortune 500 company and is a member of Standard & Poor’s 500® Index. For more information about FIS, visit www.fisglobal.com.

FIS: Real-Time Analytics for Securities Lending (Starting Point)
- SQL Server 2012 with rowstore for transactional processing
- ETL batches: nightly for end-of-day data; every 5 minutes on streaming data for near real-time analytics
- Total transaction volume: 1.2 million rows per 8 hours of trading
- App displays rich analytics on end-of-day data, but just simple real-time analytics

FIS: Real-Time Analytics for Securities Lending (Challenges)
- Enhanced real-time analytics would require additional ETL processing, possibly impacting the transactional workload
- The additional ETL would complicate our workflow, but we would rather simplify
- The application data layer would become more complex as we build more specialized tables for certain analytics
- Latency exists with periodic ETL batches (and in trading “time=money”)

FIS: Real-Time Analytics for Securities Lending (Solution)
Solution: SQL Server 2016 for HTAP
- Non-clustered columnstore index (NCCI)
- Forced compression every 100,000 rows inserted
- Excellent analytics query performance
- No impact on transactional workload
- Simple: no new ETL; no query changes; minimal db changes (just add the NCCI)
Another Option: In-memory OLTP table with a clustered columnstore index
- Very fast on all 1.2 million rows compressed and natively compiled procedures
- Not as simple to set up for us as NCCI
- Requires some query changes

FIS: Real-Time Analytics for Securities Lending (Performance Numbers)
Multiple simulated dashboard queries executed in less than three seconds with the non-clustered columnstore index, about four times faster than with the rowstore index.

FIS: Real-Time Analytics for Securities Lending (Case Study)
For more details on our solution, download the case study here: Financial services firm accelerates real-time analytics with SQL Server 2016

Columnstore Index: Are you getting the best Performance?

Columnstore Performance: Statistics
Scenario:
- Query runs slow after index creation or data load
Diagnosis:
- Missing or stale statistics requiring new stats and recompilation
Why?
- Columnstore index has no key columns
- Statistics take longer due to typically large data sizes
- Periodic data load can cross the statistics threshold
Recommendations:
- Manually create stats on required columns
- Use async mode for update stats
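
The two recommendations could be applied as follows. This is a sketch under assumed names: dbo.SalesFact, its columns, and the statistics name st_SalesFact_sale_date are hypothetical, and FULLSCAN is one sampling choice among several.

```sql
-- Hypothetical fact table with a clustered columnstore index.
CREATE TABLE dbo.SalesFact (
    sale_date DATE  NOT NULL,
    store_id  INT   NOT NULL,
    amount    MONEY NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX cci_SalesFact ON dbo.SalesFact;

-- Recommendation 1: create statistics by hand on columns used in predicates
-- and joins, since a columnstore index has no key columns to supply them.
CREATE STATISTICS st_SalesFact_sale_date
ON dbo.SalesFact (sale_date) WITH FULLSCAN;

-- Recommendation 2: let automatic statistics updates run in the background
-- so a query does not wait for a refresh on a large table.
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- After a large load, refresh explicitly rather than waiting for the threshold.
UPDATE STATISTICS dbo.SalesFact st_SalesFact_sale_date WITH FULLSCAN;

-- Check how stale each statistics object is.
SELECT s.name, sp.last_updated, sp.rows, sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.SalesFact');
```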

DEMO: Stale Statistics (Sunil Agarwal)

Columnstore Performance: Index Fragmentation
Scenario:
- Query gradually slows down over time or after a large ETL
Diagnosis:
- Large number of deleted rows in compressed rowgroups
- Large number of rows in delta rowgroups
Why?
- Columnstore compressed rowgroups are read-only; deleted rows are NOT removed
- Large number of delta rowgroups
Recommendations:
- Periodically defragment the index using ONLINE REORGANIZE
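
The diagnosis and the recommendation map onto a DMV query and an ALTER INDEX statement. The sketch below assumes a hypothetical table dbo.Shipments with a clustered columnstore index named cci_Shipments; the pct_deleted column is a convenience calculation added here.

```sql
-- Hypothetical table and index used only for illustration.
CREATE TABLE dbo.Shipments (
    shipment_id BIGINT    NOT NULL,
    carrier_id  INT       NOT NULL,
    shipped_at  DATETIME2 NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX cci_Shipments ON dbo.Shipments;

-- Diagnosis: look for compressed rowgroups with many deleted rows
-- and for a large number of OPEN or CLOSED delta rowgroups.
SELECT row_group_id,
       state_desc,
       total_rows,
       deleted_rows,
       100.0 * deleted_rows / NULLIF(total_rows, 0) AS pct_deleted
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.Shipments')
ORDER BY row_group_id;

-- Recommendation: REORGANIZE runs online; it compresses closed delta
-- rowgroups and can remove deleted rows and merge small rowgroups.
ALTER INDEX cci_Shipments ON dbo.Shipments REORGANIZE;

-- Optionally force open delta rowgroups into compressed form as well.
ALTER INDEX cci_Shipments ON dbo.Shipments
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
```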

DEMO: Index Fragmentation (Sunil Agarwal)

Columnstore Performance: Scanning Large data
Scenario:
- Slow query performance due to large data scan
Diagnosis:
- No or few rowgroups are skipped
Why?
- Wide range in min/max values on columns used in predicates
Recommendations:
- Create an ordered columnstore index: identify a column frequently used in predicates, then use a B-tree clustered index on that column as a first step before creating the columnstore index
- Use partitioning
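
The "B-tree first" recommendation can be sketched as a two-step index build. The table dbo.FactSales, its columns, and the choice of sale_date as the ordering column are assumptions; MAXDOP = 1 is used here on the assumption that a serial conversion best preserves the sort order.

```sql
-- Hypothetical fact table; sale_date is assumed to be the common predicate column.
CREATE TABLE dbo.FactSales (
    sale_date DATE  NOT NULL,
    store_id  INT   NOT NULL,
    amount    MONEY NOT NULL
);

-- Step 1: sort the data with a rowstore clustered index on the predicate column.
CREATE CLUSTERED INDEX cci_FactSales ON dbo.FactSales (sale_date);

-- Step 2: convert it in place to a clustered columnstore index. Rows arrive
-- already sorted, so each rowgroup covers a narrow min/max range of sale_date.
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales
WITH (DROP_EXISTING = ON, MAXDOP = 1);

-- Verify: the Messages output reports segment reads and segments skipped.
SET STATISTICS IO ON;
SELECT SUM(amount) AS may_sales
FROM dbo.FactSales
WHERE sale_date >= '2018-05-01' AND sale_date < '2018-06-01';
SET STATISTICS IO OFF;
```

The ordering is not maintained for rows loaded later, so the effect can fade unless new data arrives in roughly the same order or the table is partitioned.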

Rowgroup Elimination (Sunil Agarwal)

Columnstore Performance: No Aggregate Pushdown
Scenario:
- Slow query performance involving aggregates
Diagnosis:
- Aggregate computation is not pushed down to the SCAN node
Why?
- Scenario limitation: for example, the data type must be <= 8 bytes
- Not supported for delta rowgroups
Recommendations:
- Work around the limitations
- Remove delta rowgroups

DEMO: Aggregate Pushdown (Sunil Agarwal)

Columnstore Performance: Bad Query Plan
Scenario:
- Slow query performance due to suboptimal query plan
Diagnosis:
- Stream operators instead of HASH mode operators
- Batch operator not chosen
Why?
- Stale statistics
- Query Optimizer limitation
Recommendations:
- Try a different Cardinality Estimator
- Provide a query operator hint
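
Both recommendations can be tried per query with an OPTION clause. In the sketch below the table dbo.Trades and its columns are hypothetical, and the particular hints are examples picked for illustration rather than the ones shown in the demo; USE HINT needs SQL Server 2016 SP1 or later.

```sql
-- Hypothetical table with a clustered columnstore index.
CREATE TABLE dbo.Trades (
    trade_id    BIGINT    NOT NULL,
    security_id INT       NOT NULL,
    quantity    INT       NOT NULL,
    trade_time  DATETIME2 NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX cci_Trades ON dbo.Trades;

-- Recommendation 1: compile this one query with the legacy cardinality estimator
-- and compare the resulting plan and run time against the default.
SELECT security_id, SUM(quantity) AS total_quantity
FROM dbo.Trades
GROUP BY security_id
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

-- Recommendation 2: ask for a hash aggregate instead of a stream aggregate.
SELECT security_id, SUM(quantity) AS total_quantity
FROM dbo.Trades
GROUP BY security_id
OPTION (HASH GROUP);
```

Hints pin a plan shape, so it is worth refreshing statistics first and keeping a hint only if the plan stays poor.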

DEMO: Bad Query Plan (Sunil Agarwal)

Summary
- Why Columnstore Index?
- Strategies to run HTAP
- Customer Success Story
- Common Performance Pitfalls and Recommendations
