SQL Server 2016 Operational Analytics

Strategic sponsors
Silver sponsors

Łukasz Grala
Microsoft MVP Data Platform | MCT | MCSE
Architect and mentor: Data Platform & Business Intelligence solutions
Trainer: Data Platform and Business Intelligence
University lecturer
Author: webcasts and publications
Leader: PLSSUG Poznań
PhD student at Poznan University of Technology, Faculty of Computing Science (topics: database and data warehouse architecture, data mining, machine learning)
lukasz@grala.biz
lukasz@sqlexpert.pl

Marcin Szeliga
Data Philosopher
BI Expert and Consultant
Data Platform Architect
20 years of experience with SQL Server
Ph.D. candidate at Politechnika Śląska
marcin@sqlexpert.pl

Microsoft platform leads the way on-premises and cloud

Leader in 2014 for Gartner Magic Quadrants:
  Operational Database Management Systems
  Data Warehouse Database Management Systems
  Business Intelligence and Analytics Platforms
  x86 Server Virtualization
  Cloud Infrastructure as a Service
  Enterprise Application Platform as a Service
  Public Cloud Storage

Sources:
  Magic Quadrant leader in Operational Database Management Systems
    http://www.datastax.com/gartner-magic-quadrant-odbms
    https://www.gartner.com/doc/2877117/magic-quadrant-operational-database-management (paywall)
  Magic Quadrant leader in Data Warehouse Database Management Systems
    http://www.odbms.org/2014/03/2014-gartner-magic-quadrant-data-warehouse-database-management-systems/
    https://www.gartner.com/doc/2678018/magic-quadrant-data-warehouse-database (paywall)
  Magic Quadrant leader in Business Intelligence and Analytics Platforms
    http://www.tableausoftware.com/gartner-magic-quadrant-2014
    https://www.gartner.com/doc/2668318/magic-quadrant-business-intelligence-analytics (paywall)
  Magic Quadrant for x86 Server Virtualization
    http://tcwd.net/vblog/wp-content/uploads/2014/07/2014-3year.png
    http://www.gartner.com/technology/reprints.do?id=1-1WR6HLK&ct=140703&st=sb
  Magic Quadrant for Cloud Infrastructure as a Service
    http://www.gartner.com/technology/reprints.do?id=1-1UM941C&ct=140529&st=sb
  Magic Quadrant for Enterprise Application Platform as a Service
    http://www.gartner.com/technology/reprints.do?id=1-1P502BX&ct=140108&st=sb
  Gartner Magic Quadrant for Public Cloud Storage
    http://www.gartner.com/technology/reprints.do?id=1-1WWSLMM&ct=140709&st=sb

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Do more. Achieve more.
  Mission-critical performance
  Deeper insights across data
  Hyperscale cloud

SQL Server 2016 improvements

Performance
  Operational analytics: insights on operational data; works with in-memory OLTP and disk-based OLTP
  In-memory OLTP enhancements: greater T-SQL surface area, terabytes of memory supported, and greater number of parallel CPUs
  Query data store: monitor and optimize query plans
  Native JSON: expanded support for JSON data
  Temporal database support: query data as points in time

Security
  Always Encrypted: sensitive data remains encrypted at all times with ability to query
  Row-level security: apply fine-grained access control to table rows
  Dynamic data masking: real-time obfuscation of data to prevent unauthorized access
  Other enhancements: audit success/failure of database operations; TDE support for storage of in-memory OLTP tables; enhanced auditing for OLTP with ability to track history of record changes

Availability
  Enhanced AlwaysOn: three synchronous replicas for auto failover across domains; round-robin load balancing of replicas; automatic failover based on database health; DTC for transactional integrity across database instances with AlwaysOn; support for SSIS with AlwaysOn

Scalability
  Enhanced database caching: cache data with automatic, multiple TempDB files per instance in multi-core environments

Notes:
  Performance: enhanced in-memory performance with up to 30x faster transactions, more than 100x faster queries than disk-based relational databases, and real-time operational analytics.
  Security upgrades: Always Encrypted technology helps protect your data at rest and in motion, on-premises and in the cloud, with master keys sitting with the application, without any application changes.
  High availability: even higher availability and performance of your AlwaysOn secondaries than in SQL Server 2014, with the ability to have up to 3 synchronous replicas, DTC support, and round-robin load balancing of the secondaries.
  Scalability: enhanced database caching across multiple cores and support for Windows Server 2016, which efficiently scales compute, networking, and storage in both physical and virtual environments.

What does operational mean?

Refers to operational workload (i.e. OLTP)

Examples:
  Enterprise Resource Planning (ERP): inventory, order, sales
  Machine data: data from machine operations on the factory floor
  Online stores (e.g. Amazon, Expedia)
  Stock/security trades

Mission critical:
  No downtime (high availability): impact on revenue
  Low latency and high transaction throughput

What does analytics mean?

Studying past data (e.g. operational, social media) to identify potential trends
Analyzing the effects of certain decisions or events (e.g. an ad campaign)
Analyzing past/current data to predict outcomes (e.g. credit score)

Goals:
  Enhance the business by gaining knowledge to make improvements or changes

Source: MIT Sloan Management Review

Traditional BI architecture (Microsoft Ignite 2015)

[Diagram labels: Presentation Layer (IIS Server), Application Tier, SQL Server Database, ETL (hourly, daily, weekly), SQL Server Relational DW Database, SQL Server Analysis Server, BI and analytics (dashboards, reporting)]

Key issues:
  Complex implementation
  Requires two servers (CapEx and OpEx)
  Data latency in analytics
  More businesses demand/require real-time analytics

Minimizing data latency for analytics

[Diagram labels: Presentation Layer (IIS Server), Application Tier, SQL Server Database, SQL Server Analysis Server, BI and analytics (dashboards, reporting); add analytics-specific indexes]

Benefits:
  No data latency
  No ETL
  No separate DW

Challenges:
  Analytics queries are resource intensive and can cause blocking
  How to minimize impact on operational workload
  Sub-optimal execution of analytics on relational schema

This is OPERATIONAL ANALYTICS

SQL Server 2016

Operational Analytics

Ability to run analytics queries concurrently with the operational workload using the same schema

Not a replacement for:
  Extreme analytics query performance, which is possible using customized schemas (e.g. star/snowflake) and pre-aggregated cubes
  Data coming from non-relational sources
  Data coming from multiple relational sources requiring integrated analytics

Operational Analytics with SQL Server 2016

Goals:
  Minimal impact on operational workload with concurrent analytics
  Performant analytics on operational schema

Achieved using columnstore index

Quick Recap: Columnstore Index

Data stored as rows:
  Ideal for OLTP
  Efficient operation on small set of rows

Data stored as columns (C1, C2, C3, C4, C5, …):
  Improved compression: data from the same domain compresses better
  Reduced I/O: fetch only columns needed
  Improved performance: more data fits in memory
  Optimized for CPU utilization
  Ideal for DW workload
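
The contrast between the two layouts can be sketched in a few lines of Python. This is only an illustration of the idea, not SQL Server's storage format: the sample table, its column names, and the run-length encoder are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical sample table; names and values are made up for illustration.
rows = [
    {"order_id": 1, "status": "SHIPPED", "qty": 5},
    {"order_id": 2, "status": "SHIPPED", "qty": 3},
    {"order_id": 3, "status": "SHIPPED", "qty": 7},
    {"order_id": 4, "status": "OPEN", "qty": 2},
]

def to_columns(row_store):
    """Pivot a list of row dicts into one list per column."""
    return {name: [r[name] for r in row_store] for name in row_store[0]}

def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

columns = to_columns(rows)

# Row layout: each whole row is visited to pick out qty.
total_from_rows = sum(r["qty"] for r in rows)

# Column layout: the same sum reads only the qty column.
total_from_columns = sum(columns["qty"])

# Values from one domain sit together, so a simple encoding shrinks them.
encoded_status = run_length_encode(columns["status"])
```

Both sums give the same answer; the difference is how much data has to be touched to get it, which is why the column layout suits scans and aggregates while the row layout suits operations on a small set of rows.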

Clustered Columnstore Performance: TPC-H

Operational analytics disk-based tables

Operational Analytics with columnstore index

[Diagram labels: Relational Table (Clustered Index/Heap), Btree Index, Nonclustered columnstore index (NCCI), Delete bitmap, Delta rowgroups]

Key points:
  Create an updateable nonclustered columnstore index (NCCI) for analytics queries
  Drop all other indexes that were created for analytics
  No application changes
  Columnstore index is maintained just like any other index
  Query optimizer will choose columnstore index where needed

Minimizing CSI overhead

[Diagram labels: DML Operations, Relational Table (Clustered Index/Heap), Btree Index, HOT, Nonclustered columnstore index (NCCI) as filtered index, Delete bitmap, Delta rowgroups]

Key points:
  Create the columnstore index only on cold data, using a filter predicate to minimize maintenance
  Analytics query accesses both columnstore and 'hot' data transparently
  Example (order management application): CREATE NONCLUSTERED COLUMNSTORE INDEX ….. WHERE order_status = 'SHIPPED'
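
The hot/cold split can be modeled with a small Python sketch. It is a toy model under stated assumptions, not how SQL Server implements a filtered columnstore index: the class, its fields, and the write counter are hypothetical, and delta rowgroups and the delete bitmap are left out.

```python
COLD_STATUS = "SHIPPED"  # mirrors the filter predicate in the slide's example

class FilteredColumnstoreTable:
    """Toy model: a row store plus a column copy of the cold rows only."""

    def __init__(self):
        self.rows = {}            # order_id -> (status, amount)
        self.cold_amounts = {}    # order_id -> amount, cold rows only
        self.columnstore_writes = 0

    def upsert(self, order_id, status, amount):
        # Every write goes to the row store.
        self.rows[order_id] = (status, amount)
        if status == COLD_STATUS:
            # Only rows matching the filter pay the columnstore cost.
            self.cold_amounts[order_id] = amount
            self.columnstore_writes += 1
        elif self.cold_amounts.pop(order_id, None) is not None:
            # A row that leaves the filter must be removed from the copy.
            self.columnstore_writes += 1

    def total_amount(self):
        # Cold part comes from the column copy, hot part from the row store.
        cold = sum(self.cold_amounts.values())
        hot = sum(a for s, a in self.rows.values() if s != COLD_STATUS)
        return cold + hot

table = FilteredColumnstoreTable()
table.upsert(1, "OPEN", 10.0)      # hot: no columnstore work
table.upsert(2, "SHIPPED", 20.0)   # cold: one columnstore write
table.upsert(1, "PACKED", 12.0)    # still hot: no columnstore work
table.upsert(1, "SHIPPED", 12.0)   # becomes cold: one columnstore write
```

Four DML operations cause only two columnstore writes, while `total_amount()` still covers every row, which is the trade the filtered index is meant to make.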

Operational analytics for in-memory tables

Operational Analytics with columnstore on In-Memory Tables

[Diagram labels: In-Memory OLTP Table, Hash Index, Range Index, Updateable CCI, DRT, Tail (hot, like delta rowgroup)]

No explicit delta rowgroup:
  Rows (tail) not in columnstore stay in the in-memory OLTP table
  No columnstore index overhead when operating on tail
  Background task migrates rows from tail to columnstore in chunks of 1 million rows not changed in the last 1 hour
  Deleted Rows Table (DRT) tracks deleted rows

Columnstore data fully resident in memory
Persisted together with operational data
No application changes required

Columnstore index on in-memory table overhead

DML operations on In-Memory OLTP

Insert:
  Insert row into HK
Delete:
  Hash or Range Index: seek row(s) to be deleted; delete the row
  HK-CCI: delete the row in HK; if row in TAIL then return, else insert <colstore-RID> into DRT
Update:
  Hash or Range Index: seek the row(s); update (delete/insert)
  HK-CCI: update (delete/insert) in HK
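
The delete path listed above (tail rows are simply removed, migrated rows get their row ID recorded in the DRT) can be sketched as a toy Python model. All class and field names are hypothetical, and `migrate_tail` runs on demand here rather than as the background task described in the deck.

```python
class InMemoryTableWithCCI:
    """Toy model of an in-memory table with a columnstore, tail, and DRT."""

    def __init__(self):
        self.table = {}      # key -> value: the in-memory OLTP table (HK)
        self.tail = set()    # keys not yet migrated into the columnstore
        self.colstore = {}   # rid -> (key, value): columnstore copy
        self.rid_of = {}     # key -> rid for migrated rows
        self.drt = set()     # rids of columnstore rows that were deleted
        self._next_rid = 0

    def insert(self, key, value):
        # New rows land in the table only; they belong to the tail.
        self.table[key] = value
        self.tail.add(key)

    def migrate_tail(self):
        # Stand-in for the background task that moves tail rows in chunks.
        for key in sorted(self.tail):
            self.colstore[self._next_rid] = (key, self.table[key])
            self.rid_of[key] = self._next_rid
            self._next_rid += 1
        self.tail.clear()

    def delete(self, key):
        del self.table[key]
        if key in self.tail:
            # Row never reached the columnstore: nothing more to do.
            self.tail.discard(key)
            return
        # Row is in the columnstore: record its RID in the DRT.
        self.drt.add(self.rid_of.pop(key))

    def update(self, key, value):
        # An update is a delete followed by an insert.
        self.delete(key)
        self.insert(key, value)

    def scan(self):
        # Analytics sees columnstore rows not in the DRT, plus the tail.
        live = [v for rid, (_, v) in self.colstore.items() if rid not in self.drt]
        return sorted(live + [self.table[k] for k in self.tail])

t = InMemoryTableWithCCI()
t.insert("a", 1)
t.insert("b", 2)
t.migrate_tail()     # "a" and "b" now live in the columnstore
t.insert("c", 3)
t.delete("c")        # tail row: the DRT is not touched
t.update("a", 10)    # migrated row: old RID goes to the DRT, new row to the tail
```

After these steps the DRT holds one row ID, the tail holds the new version of `a`, and a scan returns only the live values.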

Minimizing this overhead

[Diagram labels: DML Operations, In-Memory OLTP Table, Hash Index, Range Index, Updateable CCI, DRT, Tail (hot, like delta rowgroup)]

Keep hot data only in in-memory tables
Example: data stays hot for 1 day, 1 week…

Query processing

Demo time

Performance improvements

Scan type                 Elapsed time (s)   Speedup
Row store scan, interop   44.441
Row store scan, native    28.445             1.6x
CSI scan, interop         0.802              55.4x

Insert, update, delete costs and query time

Operation             Update: elapsed (s) with CSI   Update: elapsed (s) no CSI   Update: increase %   Query: elapsed (s)   Query: increase %
CSI scan, interop                                                                                      0.802                BASE
Insert 400 000 rows   53.5                           47.8                         11.9%                0.869                8.4%
Update 400 000 rows   42.4                           28.9                         46.7%                1.181                47.3%
Delete 400 000 rows   38.3                           30.5                         25.6%                1.231                53.5%
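
The percentage columns can be reproduced from the raw timings. The short Python check below assumes one reading of the table: the DML increase compares elapsed time with and without the CSI, and the query increase compares the scan time after each operation against the 0.802 s base. Variable names are illustrative.

```python
QUERY_BASE_S = 0.802  # CSI scan, interop, before any DML

# operation -> (DML time with CSI, DML time without CSI, query time after DML)
measurements = {
    "insert": (53.5, 47.8, 0.869),
    "update": (42.4, 28.9, 1.181),
    "delete": (38.3, 30.5, 1.231),
}

def percent_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

increases = {
    op: (
        round(percent_increase(with_csi, no_csi), 1),        # DML overhead
        round(percent_increase(query_s, QUERY_BASE_S), 1),   # query slowdown
    )
    for op, (with_csi, no_csi, query_s) in measurements.items()
}
```

Under that reading the computed values match the slide: inserts are the cheapest to maintain, while updates and deletes cost more and also slow the subsequent scan the most.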

Single thread insert and update

Operation      Rows affected   Row store (s)   Secondary CSI (s)   Primary CSI (s)
1000 updates   10 000          0.893           1.400               6.866
10% insert     18M             233.9           566                 291.4
2% update      3.96M           123.2           314.3               275.9

Single thread scan

                     Millions of rows   Row store   Secondary CSI   Primary CSI
Newly built          180                99.1        4.7             1.71
After 1000 updates                      99.4        5.4             1.75
After 10% inserts    198                108.7       14.5            9.5
After 2% updates                        109.5       16.8            10.0

Elapsed time per lookup in a B-tree index

Columns projected   B-tree over CSI, without mapping (ms)   B-tree over CSI, with mapping (ms)   B-tree over B-tree, no compression (ms)   B-tree over B-tree, page compression (ms)
4                   3.92                                    5.28                                 2.41                                      3.65
8                   4.33                                    5.73                                 2.32                                      3.85
20                  6.67                                    8.07                                 2.55                                      4.44

Overhead on data loading of having an NCI

New partition size (million rows)   Time for bulk load into CSI (ms)   Time to create B-tree (ms)   Ratio (index creation/bulk load)
1                                   5 327                              66                           1.24%
5                                   5 335                              77                           1.45%
10                                  5 354                              83                           1.55%

Comparing performance

Operation                 No SIMD (billions of values/s)   SIMD (billions of values/s)   Speedup
Bit unpacking 6 bits      2.08                             11.55                         5.55x
Bit unpacking 12 bits     1.91                             9.76                          5.11x
Bit unpacking 21 bits     1.96                             5.29                          2.70x
Compaction 32 bits        1.24                             6.70                          5.40x
Range predicate 16 bits   0.94                             11.42
Sum 16 bit values         2.86                             14.46                         5.06x
128-bit bitmap filter     0.97                                                           11.77x
64KB bitmap filter        1.01                             2.37                          2.35x
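
For readers unfamiliar with the bit unpacking operation measured above, here is a scalar (non-SIMD) sketch in Python. It only illustrates what packing fixed-width values and unpacking them means; it is not SQL Server's implementation, and the function names are assumptions made for the example.

```python
def pack(values, bits):
    """Pack non-negative ints that each fit in `bits` bits into one int."""
    packed = 0
    for i, v in enumerate(values):
        if v < 0 or v >> bits:
            raise ValueError(f"{v} does not fit in {bits} bits")
        packed |= v << (i * bits)
    return packed

def unpack(packed, bits, count):
    """Recover `count` values of width `bits` from a packed int."""
    mask = (1 << bits) - 1
    return [(packed >> (i * bits)) & mask for i in range(count)]

values = [5, 63, 0, 17]           # every value fits in 6 bits
packed = pack(values, 6)          # 4 values stored in 24 bits
restored = unpack(packed, 6, len(values))
```

A SIMD version performs the shift-and-mask step on many values per instruction, which is where the speedups in the table come from.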

Query performance (1)

Q1-Q4: select count(*) from LINEITEM where <predicate>

Predicate or aggregation                              Duration SQL2014 (ms)   Duration (ms), unlabeled   Speedup   Billions of rows per s
L_ORDERKEY = 235236                                   220                     140                        1.57x     12.9
L_QUANTITY = 1900                                     664                     68                         9.76x     26.5
L_SHIPMODE='AIR'                                      694                     147                        4.72x     12.2
L_SHIPDATE between '01.01.1997' and '01.01.1998'      512                     87                         5.89x     20.7

Query performance (2)

Q5-Q6: select count(*) from PARTSUPP where <predicate>

Predicate or aggregation                              Duration SQL2014 (ms)   Duration (ms), unlabeled   Speedup   Billions of rows per s
PS_AVAILQTY < 10                                      50                      27                         1.85x     8.9
PS_AVAILQTY = 10                                      45                      15                         3.00x     16

Q7-Q8: select <aggregates> from LINEITEM

Predicate or aggregation                              Duration SQL2014 (ms)   Duration (ms), unlabeled   Speedup   Billions of rows per s
avg(L_DISCOUNT)                                       1272                    196                        6.49x     9.1
avg(L_DISCOUNT), min(L_ORDERKEY), max(L_ORDERKEY)     1978                    356                        5.56x     5.1

Columnstore index overhead

DML operations on OLTP workload

Insert:
  BTREE (NCI): insert row into btree
  Nonclustered columnstore index (NCCI): insert row into btree (delta store)
Delete:
  BTREE (NCI): seek row(s) to be deleted; delete the row
  NCCI: seek for the row in the delta stores (there can be multiple); if row found, then delete; otherwise insert the key into delete row buffer
Update:
  BTREE (NCI): seek the row(s)
  NCCI: delete the row (steps same as above); insert the updated row into delta store

Availability Groups as data warehouse

[Diagram labels: AlwaysOn Availability Group, Primary Replica, Secondary Replica]

Key points:
  Mission-critical operational workloads are typically configured for high availability using AlwaysOn Availability Groups
  You can offload analytics to a readable secondary replica

Source: https://msdn.microsoft.com/en-us/library/hh710054(v=sql.130).aspx

To configure an AlwaysOn availability group to support read-only routing in SQL Server 2016, you can use either Transact-SQL or PowerShell. Read-only routing refers to the ability of SQL Server to route qualifying read-only connection requests to an available AlwaysOn readable secondary replica (that is, a replica that is configured to allow read-only workloads when running under the secondary role). To support read-only routing, the availability group must possess an availability group listener. Read-only clients must direct their connection requests to this listener, and the client's connection strings must specify the application intent as "read-only." That is, they must be read-intent connection requests.
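
On the client side, a read-intent request comes down to one extra connection string keyword. The Python helper below is a hedged sketch: `ApplicationIntent` is the keyword SQL Server client drivers use for this, but the helper function, the listener name, the database name, and the port are hypothetical, and authentication settings are omitted.

```python
def build_connection_string(listener, database, read_only):
    """Assemble a key=value connection string aimed at an AG listener."""
    parts = {
        "Server": f"tcp:{listener},1433",   # hypothetical listener and port
        "Database": database,
        "ApplicationIntent": "ReadOnly" if read_only else "ReadWrite",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"

# Analytics connections declare read intent and may be routed to a secondary.
reporting = build_connection_string("ag-listener.example.com", "Sales", read_only=True)

# Operational connections keep the default intent and stay on the primary.
oltp = build_connection_string("ag-listener.example.com", "Sales", read_only=False)
```

Both connections target the same listener; only the declared intent differs, so the analytics workload can be moved off the primary without changing where clients connect.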

Minimizing data latency for analytics

[Diagram labels: Presentation Layer (IIS Server), Application Tier, SQL Server Database, SQL Server Analysis Server, BI and analytics (dashboards, reporting); add analytics-specific indexes]

SSAS Enterprise Readiness: Tabular

New DirectQuery:
  DirectQuery for Oracle, Teradata, APS
  DirectQuery support for MDX queries (Excel tools)
High-end server hardware

Source: https://msdn.microsoft.com/en-us/library/bb522628(v=sql.130).aspx

Summary: OA with SQL Server 2016

Analytics in real time with no data latency
Rich set of options to control impact on operational workload
Industry-leading solution integrating in-memory OLTP with in-memory analytics
No application changes required
