Turbocharge your Data Warehouse Queries with Columnstore Indexes Len Wyatt Program Manager Microsoft Corporation DBI313.



Demo: Columnstores speed up queries

Overview of Columnstore Index

[Figure: a table's columns C1 to C6, each stored separately.]

[Figure: columns C1 to C6 divided into segments, with a row group marked across the columns.]

[Figure: example table with columns OrderDateKey, ProductKey, StoreKey, RegionKey, Quantity, SalesAmount, shown in several arrangements.]
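The layout in these figures can be sketched in a few lines of Python. This is only an illustration of the idea, not how SQL Server stores data: the function names, the tiny row-group size, and the sample values are all made up, and only the column names come from the slide.

```python
from typing import Dict, List

ROW_GROUP_SIZE = 2  # hypothetical; chosen tiny so the result is easy to inspect


def build_row_groups(rows: List[dict], group_size: int = ROW_GROUP_SIZE) -> List[Dict[str, list]]:
    """Split rows into row groups and store each column of a group as one segment."""
    row_groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # One segment per column: that column's values for this row group only.
        row_groups.append({col: [row[col] for row in chunk] for col in chunk[0]})
    return row_groups


def sum_column(row_groups: List[Dict[str, list]], column: str):
    """Aggregate one column by reading only that column's segments."""
    return sum(sum(group[column]) for group in row_groups)


# Made-up sample rows using three of the column names from the slide.
rows = [
    {"StoreKey": 1, "Quantity": 6, "SalesAmount": 30.0},
    {"StoreKey": 2, "Quantity": 1, "SalesAmount": 5.0},
    {"StoreKey": 1, "Quantity": 2, "SalesAmount": 10.0},
]
row_groups = build_row_groups(rows)
total_quantity = sum_column(row_groups, "Quantity")
```

A query that needs only Quantity touches the Quantity segments and never reads StoreKey or SalesAmount, which is the main reason a column layout reduces scan cost.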

[Figure: a batch object, made up of column vectors and a bitmap of qualifying rows.]
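A rough Python model of such a batch, assuming that a filter only flips bits in the bitmap and leaves the column vectors untouched. The class and function names are hypothetical and the values are invented; the real engine's batch format is not described in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Batch:
    columns: Dict[str, List]  # one vector per column, all of the same length
    qualifying: List[bool]    # bitmap: True while the row still qualifies


def apply_filter(batch: Batch, column: str, predicate: Callable) -> None:
    """Clear the bit of every row that fails the predicate; vectors stay as they are."""
    vector = batch.columns[column]
    batch.qualifying = [q and predicate(v) for q, v in zip(batch.qualifying, vector)]


def sum_qualifying(batch: Batch, column: str):
    """Sum a column over the rows whose bit is still set."""
    return sum(v for q, v in zip(batch.qualifying, batch.columns[column]) if q)


# Made-up values for two of the columns named on the earlier slides.
batch = Batch(
    columns={"StoreKey": [1, 2, 1, 3], "Quantity": [6, 1, 2, 1]},
    qualifying=[True] * 4,
)
apply_filter(batch, "StoreKey", lambda key: key == 1)
quantity_for_store_1 = sum_qualifying(batch, "Quantity")
```

Operators work on a whole vector at a time and pass the same batch along, so per-row overhead is paid once per batch rather than once per row.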

Make sure most of the work of the query happens in batch mode

Loading Columnstores Effectively

Optimizing database and index design

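Before (string key stored in the fact table):
  Columns: Date, LicenseNum, Measure
  Rows shown: XYZ, ABC

After (integer key in the fact table, strings moved to a lookup table):
  Columns: Date, LicenseId, Measure
  Lookup table columns: LicenseId, LicenseNum
    1  XYZ123
    2  ABC777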
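The slide appears to show a string column being replaced by an integer key plus a lookup table. A minimal Python sketch of that transformation follows; the function name is hypothetical, the Date and Measure values are invented, and only the two LicenseNum values and their ids come from the slide.

```python
from typing import Dict, List, Tuple


def split_out_lookup(rows: List[dict], column: str, key_name: str) -> Tuple[List[dict], List[dict]]:
    """Replace a string column with an integer key and return (fact rows, lookup rows)."""
    ids: Dict[str, int] = {}
    fact = []
    for row in rows:
        value = row[column]
        if value not in ids:
            ids[value] = len(ids) + 1  # assign keys 1, 2, 3, ... in first-seen order
        new_row = {k: v for k, v in row.items() if k != column}
        new_row[key_name] = ids[value]
        fact.append(new_row)
    lookup = [{key_name: key, column: value} for value, key in ids.items()]
    return fact, lookup


# Date and Measure values are made up for the example.
rows = [
    {"Date": 20120101, "LicenseNum": "XYZ123", "Measure": 10},
    {"Date": 20120102, "LicenseNum": "ABC777", "Measure": 20},
    {"Date": 20120103, "LicenseNum": "XYZ123", "Measure": 30},
]
fact, lookup = split_out_lookup(rows, "LicenseNum", "LicenseId")
```

The fact table then carries only the small integer LicenseId, and the strings are stored once in the lookup table.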

Optimizing queries

Common workarounds

Demo: Example need for a workaround

Make sure most of the work of the query happens in batch mode

Original query (6.4 sec elapsed, 55 CPU-seconds):

    select m.Title, COUNT(p.IP) PurchaseCount
    from Media m
    left outer join Purchase p on p.MediaId = m.MediaId
    group by m.Title
    order by COUNT(p.IP) desc

Workaround (0.2 sec elapsed, 1.9 CPU-seconds):

    with T (Title, PurchaseCount) as (
        select m.Title, COUNT(p.IP) PurchaseCount
        from Media m
        join Purchase p on p.MediaId = m.MediaId
        group by m.Title
    )
    select distinct m.Title, ISNULL(T.PurchaseCount, 0) as PurchaseCount
    from Media m
    left outer join T on m.Title = T.Title
    order by ISNULL(T.PurchaseCount, 0) desc;
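The outer-join rewrite on this slide should return the same rows as the original query. The sketch below checks that on a few made-up rows using Python's built-in `sqlite3`. SQLite has no columnstore indexes or batch mode, so this says nothing about the timings; it also uses `coalesce` where the slide uses T-SQL's `ISNULL`, and sorts in Python instead of using `order by`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Media (MediaId INTEGER, Title TEXT);
    CREATE TABLE Purchase (MediaId INTEGER, IP TEXT);
    -- Made-up rows; 'Gamma' has no purchases and must still appear with 0.
    INSERT INTO Media VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO Purchase VALUES (1, '10.0.0.1'), (1, '10.0.0.2'), (2, '10.0.0.3');
""")

# Original shape: aggregate directly over the outer join.
outer_join_sql = """
    select m.Title, count(p.IP) as PurchaseCount
    from Media m left outer join Purchase p on p.MediaId = m.MediaId
    group by m.Title
"""

# Workaround shape: aggregate over an inner join first, then outer join the
# small aggregated result back to Media to restore titles with no purchases.
rewritten_sql = """
    with T (Title, PurchaseCount) as (
        select m.Title, count(p.IP)
        from Media m join Purchase p on p.MediaId = m.MediaId
        group by m.Title
    )
    select distinct m.Title, coalesce(T.PurchaseCount, 0) as PurchaseCount
    from Media m left outer join T on m.Title = T.Title
"""

outer_join_rows = sorted(conn.execute(outer_join_sql).fetchall())
rewritten_rows = sorted(conn.execute(rewritten_sql).fetchall())
```

The expensive join and aggregation over Purchase are done with an inner join, and the outer join only has to handle the much smaller aggregated result.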

Original query, using IN or EXISTS (3.0 sec elapsed, 32 CPU-seconds):

    select p.Date, count(*)
    from Purchase p
    where p.MediaId in (select MediaId from MediaStudyGroup)
    group by p.Date
    order by p.Date;

    --or--

    select p.Date, count(*)
    from Purchase p
    where exists (select m.MediaId from MediaStudyGroup m where m.MediaId = p.MediaId)
    group by p.Date
    order by p.Date;

Workaround, using a join (0.05 sec elapsed, 0.3 CPU-seconds):

    select p.Date, count(*)
    from Purchase p
    join MediaStudyGroup m on p.MediaId = m.MediaId
    group by p.Date
    order by p.Date;
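A join is interchangeable with IN or EXISTS only when MediaId is unique in MediaStudyGroup; with duplicates, the join would count a purchase once per matching row. The `sqlite3` sketch below assumes that uniqueness (declared here as a primary key, which the slides do not show) and checks that the three forms agree on made-up data. It does not reproduce the timings, since SQLite has no batch mode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase (Date INTEGER, MediaId INTEGER);
    -- Assumption: MediaId is unique in the study group.
    CREATE TABLE MediaStudyGroup (MediaId INTEGER PRIMARY KEY);
    INSERT INTO Purchase VALUES (20120101, 1), (20120101, 2), (20120101, 3), (20120102, 1);
    INSERT INTO MediaStudyGroup VALUES (1), (2);
""")

in_sql = """
    select p.Date, count(*) from Purchase p
    where p.MediaId in (select MediaId from MediaStudyGroup)
    group by p.Date order by p.Date
"""
exists_sql = """
    select p.Date, count(*) from Purchase p
    where exists (select m.MediaId from MediaStudyGroup m where m.MediaId = p.MediaId)
    group by p.Date order by p.Date
"""
join_sql = """
    select p.Date, count(*) from Purchase p
    join MediaStudyGroup m on p.MediaId = m.MediaId
    group by p.Date order by p.Date
"""

in_rows = conn.execute(in_sql).fetchall()
exists_rows = conn.execute(exists_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
```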

View that combines the main table with a delta table using UNION ALL:

    create view vPurchase as
    select * from Purchase
    union all
    select * from DeltaPurchase;

Query over the view joined to one table (batch mode, 0.1 sec elapsed):

    select p.date, d.DayNumOfMonth, count(*)
    from vPurchase as p, Date d
    where p.Date = d.DateId
    group by p.date, d.DayNumOfMonth;

Query over the view joined to two tables (row mode, 19 sec elapsed):

    select p.date, d.DayNumOfMonth, m.Genre, count(*)
    from vPurchase p, Date d, Media m
    where p.Date = d.DateId and m.MediaId = p.MediaId
    group by p.date, d.DayNumOfMonth, m.Genre

Workaround: aggregate each table separately, then combine (batch mode, 0.3 sec elapsed):

    with MainSummary (date, DayNumOfMonth, Genre, c) as (
        select p.date, d.DayNumOfMonth, m.Genre, count(*) c
        from Purchase p, Date d, Media m
        where p.Date = d.DateId and m.MediaId = p.MediaId
        group by p.date, d.DayNumOfMonth, m.Genre
    ),
    DeltaSummary (date, DayNumOfMonth, Genre, c) as (
        select p.date, d.DayNumOfMonth, m.Genre, count(*) c
        from DeltaPurchase p, Date d, Media m
        where p.Date = d.DateId and m.MediaId = p.MediaId
        group by p.date, d.DayNumOfMonth, m.Genre
    ),
    CombinedSummary (date, DayNumOfMonth, Genre, c) as (
        --union all across the output of the two queries
        select * from MainSummary
        UNION ALL
        select * from DeltaSummary
    )
    --group by to aggregate the data.
    select t.date, t.DayNumOfMonth, t.Genre, sum(c) as c
    from CombinedSummary as t
    group by t.date, t.DayNumOfMonth, t.Genre;
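The UNION ALL workaround aggregates each table on its own and then sums the partial counts. Here is a small `sqlite3` check, on made-up rows, that this gives the same result as aggregating over the combined rows. The table name Date is quoted as a precaution, the view is written inline as a subquery, and nothing here measures batch versus row mode, which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase (Date INTEGER, MediaId INTEGER);
    CREATE TABLE DeltaPurchase (Date INTEGER, MediaId INTEGER);
    CREATE TABLE "Date" (DateId INTEGER, DayNumOfMonth INTEGER);
    CREATE TABLE Media (MediaId INTEGER, Genre TEXT);
    INSERT INTO Purchase VALUES (1, 10), (1, 10), (2, 11);
    INSERT INTO DeltaPurchase VALUES (1, 10), (2, 10);
    INSERT INTO "Date" VALUES (1, 1), (2, 2);
    INSERT INTO Media VALUES (10, 'Drama'), (11, 'Comedy');
""")

# Aggregate over the combined rows (the shape of the query against vPurchase).
combined_sql = """
    select p.Date, d.DayNumOfMonth, m.Genre, count(*) as c
    from (select * from Purchase union all select * from DeltaPurchase) p,
         "Date" d, Media m
    where p.Date = d.DateId and m.MediaId = p.MediaId
    group by p.Date, d.DayNumOfMonth, m.Genre
"""

# Aggregate each table separately, union the summaries, then sum the counts.
pushed_down_sql = """
    with MainSummary (Date, DayNumOfMonth, Genre, c) as (
        select p.Date, d.DayNumOfMonth, m.Genre, count(*)
        from Purchase p, "Date" d, Media m
        where p.Date = d.DateId and m.MediaId = p.MediaId
        group by p.Date, d.DayNumOfMonth, m.Genre
    ),
    DeltaSummary (Date, DayNumOfMonth, Genre, c) as (
        select p.Date, d.DayNumOfMonth, m.Genre, count(*)
        from DeltaPurchase p, "Date" d, Media m
        where p.Date = d.DateId and m.MediaId = p.MediaId
        group by p.Date, d.DayNumOfMonth, m.Genre
    ),
    CombinedSummary (Date, DayNumOfMonth, Genre, c) as (
        select * from MainSummary union all select * from DeltaSummary
    )
    select t.Date, t.DayNumOfMonth, t.Genre, sum(c) as c
    from CombinedSummary as t
    group by t.Date, t.DayNumOfMonth, t.Genre
"""

combined_rows = sorted(conn.execute(combined_sql).fetchall())
pushed_down_rows = sorted(conn.execute(pushed_down_sql).fetchall())
```

The rewrite works for count(*) because counts over disjoint inputs add up; an aggregate such as count(distinct ...) could not be recombined with a plain sum.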

Original query, a scalar aggregate (1.0 sec elapsed, 15 CPU-seconds):

    select count(*) from Purchase

Workaround, group first and then sum the groups (0.06 sec elapsed, 0.3 CPU-seconds):

    with CountByDate (Date, c) as (
        select Date, count(*)
        from Purchase
        group by Date
    )
    select sum(c) from CountByDate;
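The two forms on this slide agree whenever Purchase has at least one row. A short `sqlite3` sketch with made-up rows shows this, along with the one edge case where they differ: on an empty table, count(*) returns 0 while the sum over zero groups returns NULL. The helper names are hypothetical, and the timings are not reproduced.

```python
import sqlite3

scalar_sql = "select count(*) from Purchase"
grouped_sql = """
    with CountByDate (Date, c) as (
        select Date, count(*) from Purchase group by Date
    )
    select sum(c) from CountByDate
"""


def run_both(dates):
    """Load the given Date values into a fresh Purchase table and run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Purchase (Date INTEGER)")
    conn.executemany("INSERT INTO Purchase VALUES (?)", [(d,) for d in dates])
    scalar = conn.execute(scalar_sql).fetchone()[0]
    grouped = conn.execute(grouped_sql).fetchone()[0]
    conn.close()
    return scalar, grouped


populated = run_both([20120101, 20120101, 20120102])
empty = run_both([])
```

If the table can be empty, wrapping the final sum in a null-handling function keeps the result at 0.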

Original query, with two distinct counts in one query (26 sec elapsed, 31 CPU-seconds):

    select p.Date,
           count(distinct p.UserId) as UserIdCount,
           count(distinct p.MediaId) as MediaIdCount
    from Purchase p, Media m
    where p.MediaId = m.MediaId
      and m.Category in ('Horror')
    group by p.Date;

Workaround, one distinct count per subquery, joined on Date (0.5 sec elapsed, 6 CPU-seconds):

    with DistinctMediaIds (Date, MediaIdCount) as (
        select p.Date, count(distinct p.MediaId) as MediaIdCount
        from Purchase p, Media m
        where p.MediaId = m.MediaId
          and m.Category in ('Horror')
        group by p.Date
    ),
    DistinctUserIds (Date, UserIdCount) as (
        select p.Date, count(distinct p.UserId) as UserIdCount
        from Purchase p, Media m
        where p.MediaId = m.MediaId
          and m.Category in ('Horror')
        group by p.Date
    )
    select m.Date, m.MediaIdCount, u.UserIdCount
    from DistinctMediaIds m
    join DistinctUserIds u on m.Date = u.Date
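Splitting the two distinct counts into separate subqueries should not change the numbers, only the column order of the output. The `sqlite3` sketch below checks this on made-up rows; the sample data and variable names are invented, and the timings are not reproduced because SQLite has no batch mode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase (Date INTEGER, UserId INTEGER, MediaId INTEGER);
    CREATE TABLE Media (MediaId INTEGER, Category TEXT);
    INSERT INTO Media VALUES (1, 'Horror'), (2, 'Horror'), (3, 'Comedy');
    INSERT INTO Purchase VALUES
        (1, 100, 1), (1, 100, 2), (1, 101, 1), (1, 103, 1), (1, 102, 3),
        (2, 100, 1), (2, 100, 1);
""")

# Original shape: both distinct counts in one aggregation.
single_query_sql = """
    select p.Date,
           count(distinct p.UserId) as UserIdCount,
           count(distinct p.MediaId) as MediaIdCount
    from Purchase p, Media m
    where p.MediaId = m.MediaId and m.Category in ('Horror')
    group by p.Date
"""

# Workaround shape: one distinct count per CTE, joined back together on Date.
split_query_sql = """
    with DistinctMediaIds (Date, MediaIdCount) as (
        select p.Date, count(distinct p.MediaId)
        from Purchase p, Media m
        where p.MediaId = m.MediaId and m.Category in ('Horror')
        group by p.Date
    ),
    DistinctUserIds (Date, UserIdCount) as (
        select p.Date, count(distinct p.UserId)
        from Purchase p, Media m
        where p.MediaId = m.MediaId and m.Category in ('Horror')
        group by p.Date
    )
    select m.Date, m.MediaIdCount, u.UserIdCount
    from DistinctMediaIds m join DistinctUserIds u on m.Date = u.Date
"""

single_rows = sorted(conn.execute(single_query_sql).fetchall())
# Reorder (Date, MediaIdCount, UserIdCount) to (Date, UserIdCount, MediaIdCount).
split_rows = sorted((d, users, media) for d, media, users in conn.execute(split_query_sql))
```

Both subqueries group by the same Date values under the same filter, so the inner join on Date loses no rows.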

Summary