Published by Billy Bellar. Modified over 9 years ago.
1
Data Warehousing Features in SQL Server 2008
James Rowland-Jones @jrowlandjones
© Copyright 2009 EMC Corporation. All rights reserved.
2
Official DW Feature Set in SQL 2008
Columns: Build, Manage, Deliver Insight
SQL Server RDBMS: MERGE statement; Change data capture (CDC); Minimally logged INSERT; Backup compression; Star join performance; Faster parallel query on partitioned tables; GROUPING SETS; Resource governor; Data compression; Partition-aligned indexed views
Integration Services: Lookup performance; Pipeline performance
Analysis Services: Backup; MDX Query Performance: Block Computation; Query & Write-back Performance; Scalable Shared Database
Reporting Services: Reporting scalability; Server scalability
3
JRJ’s DW Feature Set in SQL 2008
Columns: Build, Manage, Deliver Insight
SQL Server RDBMS: MERGE statement; Change data capture (CDC); Minimally logged INSERT & TF 610; NEW Data Types; Backup compression; Star join performance; Faster parallel query on partitioned tables; Few Outer Rows Parallelism; GROUPING SETS; ISOWEEK in DATEPART; Resource governor; Data compression; Partition-aligned indexed views; Partition index rebuilds; Filtered Indexes
Integration Services: Lookup performance; Pipeline performance; Data Profiling Task
Analysis Services: Backup; MDX Query Performance: Block Computation; Query & Writeback Performance; Scalable Shared Database
Reporting Services: Reporting scalability; Server scalability
4
What We’ll Focus On
Columns: Build, Manage, Deliver Insight
SQL Server RDBMS: MERGE statement; Change data capture (CDC); Minimally logged INSERT & TF 610; NEW Data Types; Backup compression; Star join performance; Faster parallel query on partitioned tables; Few Outer Rows Parallelism; GROUPING SETS; ISOWEEK in DATEPART; Resource governor; Data compression; Partition-aligned indexed views; Partition index rebuilds; Filtered Indexes
Integration Services: Lookup performance; Pipeline performance; Data Profiling Task
Analysis Services: Backup; MDX Query Performance: Block Computation; Query & Writeback Performance; Scalable Shared Database
Reporting Services: Reporting scalability; Server scalability
5
Data Compression
Enterprise Edition only
Row and Page Compression
Compression ratio 2:1 to 3:1 (roughly 50% to 70% reduction in data)
Can be applied to a table, an index, or a subset of their partitions
Estimate savings: exec sp_estimate_data_compression_savings
Max row size plus compression overhead must not exceed 8060 bytes
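A minimal sketch of how the estimate and the rebuild might look in T-SQL. The table name dbo.Tbl_Fact_Sales is borrowed from the query slides later in the deck, and the partition number is an assumption for illustration only.

```sql
-- Estimate the saving from PAGE compression for every index and
-- partition of the table (NULL means "all").
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Tbl_Fact_Sales',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Compress every partition of the table with PAGE compression.
ALTER TABLE dbo.Tbl_Fact_Sales
REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = PAGE);

-- Or rebuild a single partition with ROW compression
-- (partition 4 is an assumed partition number).
ALTER TABLE dbo.Tbl_Fact_Sales
REBUILD PARTITION = 4
WITH (DATA_COMPRESSION = ROW);
```

Running the estimate first is cheap compared with a rebuild, so it is a reasonable way to decide which partitions are worth compressing.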
6
Compression Alert
7
Monitoring Compression
SQL Server, Access Methods object (performance counters)
– Page compression attempts/sec
– Pages compressed/sec
Compression statistics for individual partitions
– Dynamic Management Function sys.dm_db_index_operational_stats
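The monitoring points above could be queried roughly as follows. This is a sketch that assumes it is run in the database holding the compressed tables; the filter and ordering are illustrative choices, not part of the original slides.

```sql
-- Per-partition page compression activity for the current database.
SELECT OBJECT_NAME(ios.object_id)          AS table_name,
       ios.index_id,
       ios.partition_number,
       ios.page_compression_attempt_count,
       ios.page_compression_success_count
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
WHERE ios.page_compression_attempt_count > 0
ORDER BY table_name, ios.index_id, ios.partition_number;

-- The two Access Methods counters, read from inside SQL Server.
-- For "/sec" counters this DMV reports a cumulative value, so a rate
-- needs two samples taken some time apart.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Access Methods%'
  AND counter_name IN ('Page compression attempts/sec',
                       'Pages compressed/sec');
```

A large gap between attempts and successes on a partition suggests that PAGE compression is costing CPU there without saving much space.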
8
DEMO TIME
Resource Governor (Quickly)
Data Compression
9
P & P
Partitioning
Parallelism
10
Partitioning & Parallelism
Partition Table Parallelism
Few Outer Rows Parallelism
Partition-Aligned Indexed Views
– SQL 2005 behaviour: needs to be dropped before switch
– Switch Partition: pulls across indexed view
Rebuild index partition
11
What is a Partitioned Table?
[Diagram: one table split into partitions P1, P2, P3, P4]
SELECT SUM(Sales_Qty) AS Sales_Qty,
       SUM(Sale_Amt) AS Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050703' AND '20050716'
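As a sketch of what sits behind such a table, the following creates a four-partition table on weekly boundaries. Every name here (pf_SalesWeek, ps_SalesWeek, Tbl_Fact_Sales_Demo) and every boundary value is hypothetical, and an integer yyyymmdd date key is assumed.

```sql
-- Three boundary values give four partitions (P1 to P4).
-- RANGE RIGHT means each boundary value belongs to the partition on its right.
CREATE PARTITION FUNCTION pf_SalesWeek (int)
AS RANGE RIGHT FOR VALUES (20050703, 20050710, 20050717);

-- Map every partition to the PRIMARY filegroup to keep the sketch simple.
CREATE PARTITION SCHEME ps_SalesWeek
AS PARTITION pf_SalesWeek ALL TO ([PRIMARY]);

-- The table is partitioned on its date key.
CREATE TABLE dbo.Tbl_Fact_Sales_Demo
(
    date_id   int   NOT NULL,
    Sales_Qty int   NOT NULL,
    Sale_Amt  money NOT NULL
) ON ps_SalesWeek (date_id);

-- Show how many rows fall into each partition.
SELECT $PARTITION.pf_SalesWeek(date_id) AS partition_number,
       COUNT(*)                         AS row_count
FROM dbo.Tbl_Fact_Sales_Demo
GROUP BY $PARTITION.pf_SalesWeek(date_id)
ORDER BY partition_number;
```

With these boundaries, a query for 20050703 through 20050716 touches exactly two partitions, which is the situation the following slides examine.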
12
The “Problem” in SQL 2005
Rows      Executes  StmtText
1         1         SELECT SUM([Sales_Qty]) [Sales_Qty],SUM([Sale_Amt]) [Sales_Amount] FROM [SalesDB].[dbo].[Tbl_Fact_Sales] WHERE [date_id]>=@1 AND [date_id]<=@2
0         0           |--Compute Scalar(DEFINE:([Expr1002]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010] END, [Expr1003]=CASE WHEN [globalagg1012]=(0) THEN NULL ELSE [globalagg1014] END))
1         1             |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009]), [globalagg1012]=SUM([partialagg1011]), [globalagg1014]=SUM([partialagg1013])))
2         1               |--Parallelism(Gather Streams)
2         12                |--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1009]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1011]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt]), [partialagg1013]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt])))
20577235  12                  |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1006]) PARTITION ID:([PtnIds1006]))
2         12                    |--Parallelism(Distribute Streams, Demand Partitioning)
2         1                     |    |--Constant Scan(VALUES:(((80)),((81))))
20577235  2                     |--Index Seek(OBJECT:([SalesDB].[dbo].[Tbl_Fact_Sales].[IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd] AS [ss]), SEEK:([ss].[SK_Date_ID] >= (20050703) AND [ss].[SK_Date_ID] <= (20050716)) ORDERED FORWARD PARTITION ID:([PtnIds1006]))
13
Partitioning & Parallelism Compared
[Diagram: partitions P1 to P4 with partition P2 highlighted, shown once for SQL Server 2005 and once for SQL Server 2008]
14
Work Around for SQL Server 2005
SELECT SUM(Sales_Qty) AS Sales_Qty,
       SUM(Sale_Amt) AS Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050703' AND '20050709'
UNION
SELECT SUM(Sales_Qty) AS Sales_Qty,
       SUM(Sale_Amt) AS Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id BETWEEN '20050710' AND '20050716'
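The per-partition queries return one row each, so a single total still has to be computed from them. One possible way to do that, sketched here as an assumption rather than as the presenter's method, is to wrap the branches in a derived table. UNION ALL is used in place of UNION so that two partitions with identical totals are not collapsed into one row. Whether the optimizer keeps the per-partition plan shape in this form would need to be checked against the actual execution plan.

```sql
-- Sum the per-partition subtotals into one result row.
SELECT SUM(p.Sales_Qty)    AS Sales_Qty,
       SUM(p.Sales_Amount) AS Sales_Amount
FROM (
    -- First partition's date range
    SELECT SUM(Sales_Qty) AS Sales_Qty,
           SUM(Sale_Amt)  AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050703' AND '20050709'

    UNION ALL

    -- Second partition's date range
    SELECT SUM(Sales_Qty) AS Sales_Qty,
           SUM(Sale_Amt)  AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050710' AND '20050716'
) AS p;
```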
15
Few Outer Rows Parallelism
SQL 2005
– One thread given per page of rows on a nested loop join
SQL 2008
– One thread given per row on a nested loop join
Good for joins to the Date dimension
Microsoft internal DW scale benchmark: performance increase of 30%
SELECT d.Date_Desc,
       SUM(f.Sale_Amt * f.Sales_Qty)
FROM Tbl_Fact_Store_Sales f
JOIN Tbl_Dim_Date d
  ON f.sk_date_id = d.sk_date_id
WHERE d.date_value BETWEEN '10/1/2004' AND '10/7/2004'
GROUP BY d.Date_Desc
16
Workarounds for SQL Server 2005
STUFF YOUR ROW
– Add a JUNK column on the Date dimension to force one row per page
CLUSTER ON A GUID
– Add a column and populate it with GUIDs to encourage rows onto separate pages
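The first workaround might be sketched as below. The column and constraint names are hypothetical, the dbo schema is assumed, and the sketch assumes the existing rows are narrow enough that adding 5000 bytes keeps them under the 8060-byte row limit.

```sql
-- Hypothetical filler column: a fixed-width char(5000) makes every row
-- wider than half of an 8 KB page, so no two rows can share a page.
ALTER TABLE dbo.Tbl_Dim_Date
ADD Junk_Filler char(5000) NOT NULL
    CONSTRAINT DF_Tbl_Dim_Date_Junk_Filler DEFAULT ('');

-- Rebuild so that every existing row is rewritten at its new width
-- (assumes the table has a clustered index).
ALTER INDEX ALL ON dbo.Tbl_Dim_Date REBUILD;
```

The cost is a much larger dimension table, which is usually tolerable for a date dimension with only a few thousand rows.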
17
Partition Aligned Indexed Views
The big chore was “sliding” a table with an indexed view on it.
In 2005 the indexed view needed to be dropped first.
In 2008 it does not.
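For orientation, a sliding window step could look roughly like this. All object names and boundary values are hypothetical. The sketch assumes a RANGE RIGHT partition function on an integer date key, an always-empty first and last partition, and an empty archive table with a matching structure on the same filegroup. The alignment requirements for the indexed view itself are not shown.

```sql
-- 1. Switch the oldest populated partition out to the archive table.
--    This is a metadata operation; no rows are copied.
ALTER TABLE dbo.Tbl_Fact_Sales
SWITCH PARTITION 2 TO dbo.Tbl_Fact_Sales_Archive;

-- 2. Remove the boundary of the partition that is now empty.
ALTER PARTITION FUNCTION pf_SalesWeek()
MERGE RANGE (20050703);

-- 3. Name the filegroup for the next partition, then add a new boundary
--    at the leading edge of the window.
ALTER PARTITION SCHEME ps_SalesWeek
NEXT USED [PRIMARY];

ALTER PARTITION FUNCTION pf_SalesWeek()
SPLIT RANGE (20050724);
```

Keeping the partitions at both ends empty means the merge and split move no data.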
18
IT’S DEMO TIME
Sliding Window with Indexed View in Place
Rebuild Partitioned Index
Filtered Indexes
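The demo scripts are not part of this transcript. As a rough illustration of the last two items, with hypothetical index names, an assumed partition number and an assumed integer date key:

```sql
-- Rebuild one partition of a partitioned index rather than the whole index.
ALTER INDEX IX_Tbl_Fact_Sales_Date
ON dbo.Tbl_Fact_Sales
REBUILD PARTITION = 3;

-- A filtered index covers only the rows that match its WHERE clause,
-- so it is smaller and cheaper to maintain than a full index.
CREATE NONCLUSTERED INDEX IX_Tbl_Fact_Sales_Recent
ON dbo.Tbl_Fact_Sales (date_id)
INCLUDE (Sales_Qty, Sale_Amt)
WHERE date_id >= 20050703;
```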
19
STAR JOINS: “Optimized” Bitmap Filters
What is a Bitmap filter
– In memory structure (no index overhead)
– Created dynamically
– Typically quite small in size
Bitmap Filter SQL 2005
– What it was in 2005...
– Hash or Merge JOIN
Optimised Bitmap Filter SQL 2008
– Enterprise Edition
– Parallel Query
– Hash JOIN only
– Fact table must have > 100 pages
– Single column join (no PK/FK relationship requirement) (integer needed for optimized)
– Dimension input cardinalities are smaller than fact input cardinalities
– Look for the Bitmap Warning event for missed opportunities to use a bitmap
20
Minimally Logged Inserts & TF 610
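A hedged sketch of the two cases this slide title covers. The table names are hypothetical. Minimal logging also depends on the database using the SIMPLE or BULK_LOGGED recovery model and on further conditions not listed here.

```sql
-- Heap target: in SQL Server 2008 a TABLOCK hint allows
-- INSERT ... SELECT to be minimally logged.
INSERT INTO dbo.Tbl_Fact_Sales_Stage WITH (TABLOCK)
       (date_id, Sales_Qty, Sale_Amt)
SELECT date_id, Sales_Qty, Sale_Amt
FROM dbo.Tbl_Fact_Sales_Source;

-- Indexed target: trace flag 610 enables minimal logging for inserts
-- into a table that has a clustered index. Here it is switched on for
-- the current session only and switched off afterwards.
DBCC TRACEON (610);

INSERT INTO dbo.Tbl_Fact_Sales
       (date_id, Sales_Qty, Sale_Amt)
SELECT date_id, Sales_Qty, Sale_Amt
FROM dbo.Tbl_Fact_Sales_Stage;

DBCC TRACEOFF (610);
```

Comparing transaction log growth with and without the hint or trace flag is the simplest way to confirm that a particular load was minimally logged.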
21
Bulk Load Methods Compared
22
FOR THE FINAL TIME
STAR JOINS
Minimally Logged INSERTS