Thomas Kejser Senior Program Manager Microsoft Corp. Introducing Parallel Data Warehouse (The project formerly known as Madison)

2 Agenda The Typical problem with data warehouses MPP vs SMP SQL Server Parallel Data Warehouse Hardware architecture Query Processing Data Loading My email: tkejser@microsoft.com

3 Introducing Parallel Data Warehouse The Typical Problem with Data Warehouses

4 SQL Data Warehouse – Yours? Big SAN Big 128-core Server Connected together What’s wrong with this picture?

5 System out of balance This server can consume 32 GB/sec of I/O, but the SAN can only deliver 2 GB/sec Even when the SAN is dedicated to the SQL Data Warehouse, which it often isn't Lots of disks for random IOPS BUT limited controllers → limited I/O bandwidth System is typically I/O bound Queries are slow, DBA complains about storage And he is right! Result: significant investment, not delivering performance

6 Balanced System: CPU Simple example: Assume TPCH query 2 is your average query Run the query on a test server with data fully cached in memory Execute parallel query using typical MAXDOP Observe 100% CPU on 4 cores Time the query and observe # pages read Per Core Consumption = (# Logical Reads* 8K)/(CPU Time)
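The per-core consumption formula on this slide can be worked through as a small calculation. This is a minimal sketch, assuming 8 KB pages (8192 bytes) and a CPU time in seconds that is already summed across the cores used by the query; the function name and the sample numbers are illustrative, not measurements from the presentation.

```python
PAGE_SIZE_BYTES = 8 * 1024  # one SQL Server page is 8 KB


def per_core_consumption_mb_per_sec(logical_reads: int, cpu_time_sec: float) -> float:
    """Per-core consumption = (# logical reads * 8 KB) / CPU time, in MB/s."""
    if cpu_time_sec <= 0:
        raise ValueError("cpu_time_sec must be positive")
    bytes_read = logical_reads * PAGE_SIZE_BYTES
    return bytes_read / cpu_time_sec / (1024 * 1024)


# Hypothetical test run: 1,024,000 logical reads and 40 s of total CPU time.
rate = per_core_consumption_mb_per_sec(1_024_000, 40.0)
print(f"{rate:.1f} MB/s per core")
```

With these made-up inputs the result is 200 MB/s per core. Dividing by total CPU time rather than elapsed time is what makes the figure a per-core rate.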

7 You can get more sophisticated … Queries performing complex calculations, format conversions, multi-dimension hash joins, etc. will be more CPU-intensive, i.e. complex queries will consume data at a slower per-core rate than simpler queries You may have to do a weighted average of typical queries Remember to take nightly ETL loads into consideration, if any... Is this realistically going to happen in your company? We did the homework for you... Your mileage will vary.
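The weighted average mentioned on this slide could be computed as follows. This is a sketch under one stated assumption: each weight is the share of total CPU time spent on that query class, which makes the arithmetic mean equal to total bytes consumed divided by total CPU time. The query classes and rates are hypothetical.

```python
def weighted_per_core_rate(workload):
    """workload: iterable of (mb_per_sec_per_core, cpu_time_share) pairs."""
    workload = list(workload)
    total_weight = sum(weight for _, weight in workload)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(rate * weight for rate, weight in workload) / total_weight


# Hypothetical mix: simple scans, typical joins, complex calculations.
mix = [(300.0, 0.5), (200.0, 0.3), (100.0, 0.2)]
print(round(weighted_per_core_rate(mix), 1))
```

If the weights instead describe the share of data scanned per query class, a harmonic mean would be the appropriate average.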

8 Introducing Parallel Data Warehouse Fast Track

9 Fast Track Data Warehouse Components Software: SQL Server 2008 Enterprise Windows Server 2008 Hardware: Tight specifications for servers, storage and networking ‘Per core’ building block Configuration guidelines: Physical table structures Indexes Compression SQL Server settings Windows Server settings Loading

10 The number we scale on … We’ve measured a mix of TPCH queries that reflect a ‘prototype’ Data Warehouse workload Concluded that SQL Server 2008 on current x64 cores consumes ~200 MB/sec per core on average for this workload We use this as a basis for the published reference architectures Your mileage will vary!
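The ~200 MB/sec per-core figure turns into simple sizing arithmetic. This is a rough sketch, assuming the rate scales linearly with core count and that 1 GB/sec equals 1000 MB/sec; the function names are made up for illustration.

```python
PER_CORE_MB_PER_SEC = 200  # average quoted on the slide for the measured TPCH mix


def required_io_mb_per_sec(cores: int, per_core: int = PER_CORE_MB_PER_SEC) -> int:
    """I/O bandwidth the storage must deliver to keep `cores` busy."""
    return cores * per_core


def cores_kept_busy(io_mb_per_sec: int, per_core: int = PER_CORE_MB_PER_SEC) -> int:
    """Number of cores a given I/O bandwidth can feed at the per-core rate."""
    return io_mb_per_sec // per_core


print(required_io_mb_per_sec(8))   # bandwidth needed by an 8-core server
print(cores_kept_busy(2000))       # cores a 2 GB/sec SAN can feed
```

Under these assumptions an 8-core server needs 1600 MB/sec, and the 2 GB/sec SAN from the "System out of balance" slide feeds about 10 cores.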

11 Microsoft DW Solutions SSRS SSAS SSIS Microsoft & Partner Services

12 Symmetric Multi-Processing vs. Massively Parallel Processing SMP: HW advancements increasing ability to scale up, but scaling limited by design. High-end SMP very expensive. Extremely high concurrency for simple workloads. With less than 1-2 TB of data SMP will almost always be better; at higher sizes, it depends. Suited to OLTP, Transactional, Data Warehousing. MPP: HW advancements increasing ability to scale out. Scaling to 1 PB+. Scale-out is relatively low cost. Relatively high concurrency for complex workloads. > 2 TB up to 1 PB for DW workloads. Suited to Data Warehousing (esp. VLDB, complex workloads).

13 PDW: No Assembly Required Software Servers Storage arrays Network switches Cables Licenses Power distribution units Racks Comes fully assembled Software is installed at the factory Fully configured

14 Basic Building Blocks Compute Nodes Handles the CPU cycles required to answer queries Storage Nodes Stores data using Fiber Attached Disks. Scaled to support CPU with enough throughput Other nodes More about those later

15 Anatomy of a Compute Node Pre-configured For Each SQL Server Instance On Each Compute Node. Drives Configured As RAID1 To Avoid Appliance Failover for a Single Drive Failure IBM Compute Nodes Will Have 1 LUN (1 RAID1 Pair) Dell Compute Nodes Will Have 2 LUNs (2 RAID1 Pairs) HP Compute Nodes Will Have 3 LUNs (3 RAID1 Pairs) TempDB: Sort-work Area For Data Loading Into Clustered Index Tables Work Area for PDW Temporary Work Files Spill Area For Hash Joins Not Fitting Into Memory

16 Anatomy of a Storage Node Pre-configured 4 RAID10 Pairs for Primary User Data 1 RAID10 Pair for Database Logs 2 LUNs Are Spread Across Each RAID Pair User Databases are Separate Physical SQL Server Databases Staging Database (Optional) Used for Loading & to Minimize Fragmentation

17 More Node Types Backup node: Stores backup files from the appliance Can be logged into by authorized Windows users Can be augmented with 3rd party H/W and S/W Landing Zone: Used as a holding place for data to be loaded Can be logged into by authorized Windows users Can be augmented with 3rd party H/W and S/W Management node: Runs the Windows domain controller (Active Directory) Used for deploying patches to all nodes in the appliance Holds images in case a node needs reimaging

18 Putting It All Together - PDW Control Node Failover Protection: Redundant Control Node Redundant Compute Node Cluster Failover Redundant Array of Inexpensive Databases Spare Node

19 Software Architecture (diagram) Control Node: MPP Engine, IIS, Data Movement Service, SQL Server holding DW Authentication, DW Configuration, DW Schema and TempDB. Compute Nodes: SQL Server with User Data, Data Movement Service. Landing Zone Node: Data Movement Service. Clients: Query Tool (DWSQL), Admin Console (Internet Explorer), MS BI (AS, RS), other 3rd party tools over OLEDB, ODBC, ADO.Net, JDBC.

20 Create Database CREATE DATABASE database_name WITH ( AUTOGROW = ON, REPLICATED_SIZE = 1024, DISTRIBUTED_SIZE = 16384, LOG_SIZE = 300 )

21 Distribution and Replication Database: Distributed & Replicated Tables. Date Dim: D_DATE_SK, D_DATE_ID, D_DATE, D_MONTH, … Item: I_ITEM_SK, I_ITEM_ID, I_REC_START_DATE, I_ITEM_DESC, … Store Sales: ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_store_sk, ss_promo_sk, ss_quantity, … Promotion: P_PROMO_SK, P_PROMO_ID, P_START_DATE_SK, P_END_DATE_SK, … Store: S_STORE_SK, S_STORE_ID, S_REC_START_DATE, S_REC_END_DATE, S_STORE_NAME, … Customer: C_CUSTOMER_SK, C_CUSTOMER_ID, C_CURRENT_ADDR, … Customer Demographics: CD_DEMO_SK, CD_GENDER, CD_MARITAL_STATUS, CD_EDUCATION, … Diagram: the dimension tables (C, I, D, CD, S, P) are repeated on every node; Store Sales (SS) is spread across them.
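The difference between a distributed and a replicated table can be sketched in a few lines. This is an illustration only: `zlib.crc32` stands in for whatever hash function the appliance really uses (the slides do not say), and the node count, helper names and sample rows are hypothetical.

```python
import zlib


def node_for(key, node_count):
    """Map a distribution key to a node with a deterministic stand-in hash."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, key_column, node_count):
    """Hash-distribute rows: each row lands on exactly one node."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key_column], node_count)].append(row)
    return nodes


def replicate(rows, node_count):
    """Replicate rows: every node receives a full copy."""
    return [list(rows) for _ in range(node_count)]


# Hypothetical sample data: a fact table and a small dimension table.
store_sales = [{"ss_item_sk": i % 5, "ss_quantity": i} for i in range(20)]
date_dim = [{"d_date_sk": d} for d in range(3)]

sales_by_node = distribute(store_sales, "ss_item_sk", 4)
dates_by_node = replicate(date_dim, 4)
print([len(n) for n in sales_by_node], [len(n) for n in dates_by_node])
```

Because rows with the same key always land on the same node, a join or aggregation on the distribution column can be answered locally, while a replicated dimension can be joined on any node without moving data.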

22 Table Creation CREATE TABLE table_name [ ( { } [,...n ] ) [ AS SELECT select_criteria ] [ WITH ( ) ] [;] ::= column_name [ NULL | NOT NULL ] ::= type_name [ ( precision [, scale ] ) ] ::= { [ CLUSTER_ON ( column_name [,...n ] ) ], [ DISTRIBUTE_ON ( column_name ) ] | [ REPLICATE ], [ PARTITION_ON column_name ( RANGE { LEFT | RIGHT } FOR VALUES { [ boundary_value [,...n] ] ) ) ] } Type class: types supported. Integers: tinyint, smallint, int, bigint. Floating point: float, real. Character: char, varchar, nchar, nvarchar. Date & time: date, time, datetime, datetime2, datetimeoffset, timestamp, smalldatetime. Fixed point: decimal, money, smallmoney. Binary: binary, varbinary (8192). Other: uniqueidentifier (?)

23 Create Table – Behind the Scenes Create Table store_sales with distribute_on (ss_item_sk) partition_on(ss_sold_date_sk) cluster_on (ss_sold_date_sk) 8K 8 Filegroups (one per core) - 1 Table per Filegroup 12 Partitions (ss_sold_date_sk) N-number of Pages Row

24 Physical File Layout (Per Compute Node)

25 MPP Query Processing Control Node Query Rewritten Into Steps That Run Efficiently On Compute Nodes ODBC/JDBC SQL92 with Analytical Extensions Distribution-incompatible Joins Resolved Using High Speed Dynamic Re-distribution Select location, year, sum(b.sales_amt) from customer a, sales b where b.sales > 500 and a.custid = b.custid group by 2,1 order by 1,2

26 MPP Execution Plans The MPP engine creates parallel execution plans from client SQL The plans can include the following types of operations: SQL operations: used to pass SQL directly to SQL Server on 1 or more nodes. DMS operations: used to move data among the nodes in an appliance for further processing. Temp table operations: used to stage data for further processing. Return operations: push data back to the client. Simple plans may include just one type of operation. Complex plans may include all of these operations. Plans are executed serially, one step at a time.

27 Example Schema Date Dim: D_DATE_SK, D_DATE_ID, D_DATE, D_MONTH, … Item: I_ITEM_SK, I_ITEM_ID, I_REC_START_DATE, I_ITEM_DESC, … Store Sales: ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_store_sk, ss_promo_sk, ss_quantity, … Promotion: P_PROMO_SK, P_PROMO_ID, P_START_DATE_SK, P_END_DATE_SK, … Store: S_STORE_SK, S_STORE_ID, S_REC_START_DATE, S_REC_END_DATE, S_STORE_NAME, … Customer: C_CUSTOMER_SK, C_CUSTOMER_ID, C_CURRENT_ADDR, … Customer Demographics: CD_DEMO_SK, CD_GENDER, CD_MARITAL_STATUS, CD_EDUCATION, … Sales table distributed on customer and partitioned by time.

28 Distribution Compatible Query SELECT CustomerId, SUM(Amount) AS TotalSales, SUM(Quantity) AS TotalUnitsSold FROM Sales s JOIN Item i ON s.ItemId = i.ItemId WHERE SaleDate BETWEEN '2009-08-01' AND '2009-08-31' AND Description LIKE '%gadgets%' GROUP BY CustomerId ORDER BY CustomerId;

29 MPP Query Plan Step 1 – On each compute node: SELECT s.[customerid], sum(s.[amount]) AS totalsales, sum(s.[quantity]) AS totalunitssold FROM [tpch_3].[dbo].[h_sales_34] s JOIN [tpch_3].[dbo].item_37 I ON (s.[itemid] = i.[itemid]) WHERE (s.[saledate] BETWEEN '2009-08-01' AND '2009-08-31' and i.[description] like '%gadgets%') GROUP BY s.[customerid] ORDER BY s.[customerid];
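A toy simulation shows why this query needs only a single step. This is a minimal sketch, assuming Sales is hash-distributed on the customer column and Item is replicated to every node; the modulo placement, node count, row layout and sample data are hypothetical and not PDW internals.

```python
from collections import defaultdict

NODE_COUNT = 3  # hypothetical appliance size

# Item is assumed replicated: every node can look up any item locally.
ITEMS = {1: "blue gadgets", 2: "widgets"}

# (customer_id, item_id, sale_date, amount, quantity); sample data only.
SALES = [
    (10, 1, "2009-08-03", 5.0, 1),
    (10, 1, "2009-08-09", 7.0, 2),
    (11, 2, "2009-08-04", 9.0, 1),
    (12, 1, "2009-08-20", 4.0, 3),
    (12, 1, "2009-09-01", 8.0, 1),
]


def place(rows):
    """Distribute Sales on customer_id (modulo stands in for the real hash)."""
    nodes = [[] for _ in range(NODE_COUNT)]
    for row in rows:
        nodes[row[0] % NODE_COUNT].append(row)
    return nodes


def node_query(local_sales):
    """Step 1 on one node: join to the local Item copy, filter, group by customer."""
    totals = defaultdict(lambda: [0.0, 0])
    for cust, item, date, amount, qty in local_sales:
        if "2009-08-01" <= date <= "2009-08-31" and "gadgets" in ITEMS[item]:
            totals[cust][0] += amount
            totals[cust][1] += qty
    return {cust: tuple(vals) for cust, vals in totals.items()}


def run_query(rows):
    """Union the per-node results; no re-aggregation is needed."""
    result = {}
    for local in place(rows):
        partial = node_query(local)
        # Each customer lives on exactly one node, so keys never collide.
        assert not (result.keys() & partial.keys())
        result.update(partial)
    return sorted(result.items())


print(run_query(SALES))
```

Because the GROUP BY column is also the distribution column, each node's groups are already final, and the only remaining work is merging the per-node results in order.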

30 Query 1 Processing Flow (diagram) Query Tool connects to the Control Node. Control Node: MPP Engine (Parse SQL, Validate & Authorize, Build MPP Plan, Execute Plan, Return Data to Client), Data Movement Service, SQL Server holding DW Authentication, DW Configuration, DW Schema and TempDB. Compute Node 1 … Compute Node N: SQL Server, Data Movement Service, User Data.

31 Reshuffling the data SELECT SaleDate, SUM(Amount) AS TotalSales, SUM(Quantity) AS TotalUnitsSold FROM Sales s JOIN Item i ON s.ItemId = i.ItemId WHERE SaleDate BETWEEN '2009-08-01' AND '2009-08-31' AND Description LIKE '%gadgets%' GROUP BY SaleDate ORDER BY SaleDate;

32 MPP Query Plan Step 1 – Create temp table on control node CREATE TABLE [tempdb].[dbo].Q_[TEMP_ID_6760] ( saledate DATE, totalsales DECIMAL(38, 2), totalunitssold INTEGER ) WITH (DATA_COMPRESSION = PAGE); Step 2 – Run on each compute node SELECT s.[saledate], sum(s.[amount]) AS totalsales, sum(s.[quantity]) AS totalunitssold FROM [tpch_3].[dbo].[h_sales_34] s JOIN [tpch_3].[dbo].item_37 i ON (s.[itemid] = i.[itemid]) WHERE (s.[saledate] BETWEEN '2009-08-01' AND '2009-08-31' and i.[description] like '%gadgets%') GROUP BY s.[saledate]

33 MPP Query Plan continued Step 3: SELECT [saledate], sum([totalsales]) AS totalsales, sum([totalunitssold]) AS totalunitssold FROM [tempdb].[dbo].Q_[TEMP_ID_6760] GROUP BY [saledate] ORDER BY [saledate] Step 4: DROP TABLE [tempdb].[dbo].Q_[TEMP_ID_6760];
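The four-step plan amounts to a two-phase aggregation: each compute node produces partial sums per sale date, and the control node re-aggregates them. The following sketch simulates that flow with plain Python lists; the per-node rows are hypothetical, already filtered by date and description, and the list named `temp_table` merely stands in for the temp table on the control node.

```python
from collections import defaultdict

# (customer_id, sale_date, amount, quantity) rows held by three hypothetical nodes.
NODE_ROWS = [
    [(10, "2009-08-03", 5.0, 1), (13, "2009-08-04", 2.0, 1)],
    [(11, "2009-08-03", 9.0, 2)],
    [(12, "2009-08-04", 4.0, 3), (15, "2009-08-03", 1.0, 1)],
]


def partial_aggregate(rows):
    """Step 2: GROUP BY saledate on one compute node."""
    acc = defaultdict(lambda: [0.0, 0])
    for _cust, date, amount, qty in rows:
        acc[date][0] += amount
        acc[date][1] += qty
    return [(date, amount, qty) for date, (amount, qty) in acc.items()]


def final_aggregate(partial_rows):
    """Step 3: sum the partial rows per date and order by date."""
    acc = defaultdict(lambda: [0.0, 0])
    for date, amount, qty in partial_rows:
        acc[date][0] += amount
        acc[date][1] += qty
    return [(date, amount, qty) for date, (amount, qty) in sorted(acc.items())]


temp_table = []                        # Step 1: create the temp table
for rows in NODE_ROWS:                 # Step 2: every node adds its partial sums
    temp_table.extend(partial_aggregate(rows))
result = final_aggregate(temp_table)   # Step 3: final GROUP BY on the control node
temp_table.clear()                     # Step 4: drop the temp table
print(result)
```

The second aggregation is needed because a given sale date can appear on several nodes when the table is distributed on customer, so the per-node groups are only partial.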

34 Reshuffling – Query Processing Flow (diagram) Query Tool connects to the Control Node. Control Node: MPP Engine (Parse SQL, Validate & Authorize, Build MPP Plan, Execute Plan, Return Data to Client), Data Movement Service, SQL Server holding DW Authentication, DW Configuration, DW Schema and TempDB. Compute Nodes: SQL Server, Data Movement Service, User Data.

35 Data Loading (diagram) Text File, Landing Zone Node, Control Node, Spare Node. Tables Are Hash Distributed Or Replicated

36 Data Loader Process Load File → Bulk Insert → Partitioned Staging Table (Heap) → Insert-Select → Partitioned Final Table (CIDX). Sort each BATCH in memory or TempDB; sort each partition in memory or TempDB. Bulk Insert Phase: Trace Flags: None; BATCHSIZE: Calculated; TABLOCK: ON; TempDB: Entire BATCHSIZE for Sort; TempDB Log: Minimal; StageDB Log: Minimal; ROLLBACK: Commits per BATCHSIZE, Rollback to last BATCH Only. Insert-Select Phase: Trace Flags: 610 per NUMA Session; MAXDOP: 1 Per NUMA Session; TABLOCK: OFF; TempDB: Entire PARTITION for sort; TempDB Log: Minimal; UserDB Log: Twice Data File Size; ROLLBACK: Commits Full TRANSACTION, Rollback Full TRANSACTION.

37 © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
