Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data Management and Index Options for SQL Server Data Warehouses Atlanta MDF.

Similar presentations


Presentation on theme: "Data Management and Index Options for SQL Server Data Warehouses Atlanta MDF."— Presentation transcript:

1 Data Management and Index Options for SQL Server Data Warehouses Atlanta MDF

2 Lance England http://lance-england.com

3 Goal To present options to consider for your data warehouse. The options can be used separately or combined.

4 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

5 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

6 Data Warehouse Overview

7 Dimensional Model

8 Dimension Tables Many columns, denormalized for speed and simplicity Have both surrogate keys and business keys Often capture changes over time (slowly changing dimension)

9 Fact Tables A few columns that capture the measures of interest Many key columns to dimension tables Can grow very large To improve the scalability and manageability of large tables, we can partition the data. We can do this with partitioned views and/or partitioned tables.

10 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

11 Partitioned Views (SQL 7) Multiple tables, but presented as a single logical entity with a view CHECK constraint on each table enables partition elimination in queries Use UNION ALL in view definition to prevent engine from attempting de-duplication

12 Partitioned Views (SQL 7) Pros Adding/dropping tables is easy Able to update statistics for a single partition (table) Can partition on multiple columns Cons Changing schema/indexes is multiplied by number of tables

13 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

14 Partitioned Tables (SQL 2005) The TABLE is the single logical entity, with partitioning abstracted away by the storage engine Implemented with partition function and partition scheme Can participate in partition elimination

15 Partitioned Tables (SQL 2005) Pros ALTER only one table Uniform indexes help query analyzer Can load/drop partitions with partition switching Cons Cant update statistics at partition level Partition switching more complicated than adding/dropping tables Single partitioning column only

16 Partition Switching Quickly add/remove partitions with data Create a separate schema-identical table (with a CHECK constraint if switching IN) Minimally logged meta-data operation

17 SPLIT/MERGE Warning Avoid split or merge of populated partitions. Splitting or merging populated partitions can be extremely inefficient, as this may cause as much as 4 times more log generation, and also cause severe locking. http://sqlcat.com/sqlcat/b/top10lists/archive/2008/02/06/top-10-best- practices-for-building-a-large-scale-relational-data-warehouse.aspx

18 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

19 Compression (SQL 2008) Trading CPU usage for less I/O (more rows per page on disk and cache) Object-level, not a database setting Two types: Row or Page compression Most opinions are to opt for page compression first, and scale back to row compression (or none) if CPU usage is unacceptable

20 Row Compression Removes unused space at data-type level – Variable-length storage for numbers and dates – Variable-length storage for CHAR, NCHAR Optimizes storage of NULL and zero

21 Page Compression 1.Row compression 2.Prefix compression – replaces repeated sequences in each column with a reference to the CI structure 3.Dictionary compression – replaces repeated sequences on the page with a reference to the CI structure

22 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

23 Indexed Views (SQL 2005) Great for aggregating data Long list of restrictions (see link): When aggregating – Clustering key can only reference columns in GROUP BY – View definition must contain COUNT_BIG(*) Can participate in partition switching (with extra steps) http://msdn.microsoft.com/en-us/library/ms191432.aspx

24 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

25 Filtered Indexes (SQL 2008) Index with a WHERE clause Filtering column does not have to be included in index Best for – Sparse columns (many NULLs) – Heterogeneous columns with categories of data (think FK columns) – Columns with ranges of values such as dollar amounts, time, and dates.

26 Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

27 Columnstore Indexes (SQL 2012) Pros Stores data by column; more efficient page retrieval Optimized compression Segment elimination (~1 million rows/per) Batch-mode evaluation – Query operators pass batch of 1000 instead of row-by-row (not all operators supported) – Affected by #of cores and MAXDOP setting

28 Columnstore Indexes Cons Table cant be updated once created Workaround choices: – Disable index/load/rebuild – Partition switching – Partitioned view Not good for single-record SEEK queries

29 Clustered Columnstore (SQL 2014) Less storage: Not a copy of data Updatable Improved batch mode support

30 Clustered Columnstore

31 Recap Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

32 Linkapalooza DATA WAREHOUSE Microsoft EDW Architecture, Guidance and Deployment Best Practices http://msdn.microsoft.com/en-us/library/hh147624.aspx The Data Loading Performance Guide http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx Top 10 Best Practices for Building a Large Scale Relational Data Warehouse http://sqlcat.com/sqlcat/b/top10lists/archive/2008/02/06/top-10-best- practices-for-building-a-large-scale-relational-data-warehouse.aspx

33 Linkapalooza PARTITIONED TABLES Partitioned Table and Index Strategies Using SQL Server 2008 http://technet.microsoft.com/en-us/library/dd578580%28v=sql.100%29.aspx Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server http://technet.microsoft.com/en-us/library/cc966457.aspx Table Partitioning Resources http://www.brentozar.com/sql/table-partitioning-resources/ Query Processing Enhancements on Partitioned Tables and Indexes http://msdn.microsoft.com/en-us/library/ms345599.aspx

34 Linkapalooza DATA COMPRESSION Data Compression: Strategy, Capacity Planning and Best Practices http://msdn.microsoft.com/en-us/library/dd894051%28v=sql.100%29.aspx

35 Linkapalooza INDEXED VIEWS Creating Indexed Views (long list of restrictions) http://msdn.microsoft.com/en-us/library/ms191432.aspx Partition Switching with Indexed Views http://technet.microsoft.com/en-us/library/bb964715(v=sql.105).aspx FILTERED INDEXES Filtered Index Design Guidelines http://msdn.microsoft.com/en-us/library/cc280372(v=sql.100).aspx

36 Linkapalooza COLUMNSTORE INDEXES Books Online http://msdn.microsoft.com/en-us/library/gg492088.aspx Columnstore Internals http://rusanu.com/2012/05/29/inside-the-sql-server-2012-columnstore- indexes/ SQL Server Columnstore Performance Tuning http://social.technet.microsoft.com/wiki/contents/articles/4995.sql-server- columnstore-performance- tuning.aspx#Ensuring_use_of_the_Fast_Batch_Mode_of_Query_Execution

37 Linkapalooza CLUSTERED COLUMNSTORE INDEXES (SQL 2014) http://rusanu.com/2013/06/11/sql-server-clustered-columnstore-indexes-at- teched-2013/ TechEd Video http://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/DBI- B322#fbid=rAFPjiEmlNt

38 Thanks! http://lance-england.com


Download ppt "Data Management and Index Options for SQL Server Data Warehouses Atlanta MDF."

Similar presentations


Ads by Google