Indexing Your Data Warehouse Troy Gallant, MTA

Agenda
- A little about me
- Indexing review
- Enterprise Data Warehouse (EDW) vs. OLTP
- EDW structure
- EDW indexing
- Too many / too few
- Considerations
- Dimension / fact indexing
- Maintenance

Bio
- 15 years as a database professional
- Last 2 yrs in NYC, all previous in Jax
- Microsoft MTA certified
- Speaker: 16x SQL Saturday, 4x JSSUG
- Working on MS in IT Mgmt
- Twitter:
- LinkedIn:
- Website:

Indexing Review
- Broad definition
- What an index DOES do
- What an index DOESN'T do

Types of Indexes
- Heap*
- Clustered
- Non-clustered
- Non-clustered w/ included columns
- Unique
- Full-text
- Spatial
- Filtered
- XML
- Columnstore
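As a rough illustration, several of the index types listed above might be created like this in T-SQL. The table dbo.SalesOrder, its columns, and all index names are hypothetical and are not from the presentation. Filtered indexes assume SQL Server 2008 or later and columnstore indexes assume SQL Server 2012 or later.

```sql
-- Hypothetical table, used only to illustrate index types.
-- With no clustered index it starts life as a heap.
CREATE TABLE dbo.SalesOrder
(
    OrderID      INT           NOT NULL,
    CustomerID   INT           NOT NULL,
    OrderDate    DATE          NOT NULL,
    ShippedDate  DATE          NULL,
    TotalDue     DECIMAL(12,2) NOT NULL
);

-- Clustered and unique: the table's rows are stored in OrderID order.
CREATE UNIQUE CLUSTERED INDEX CIX_SalesOrder_OrderID
    ON dbo.SalesOrder (OrderID);

-- Non-clustered with included columns: keyed on CustomerID, with
-- OrderDate and TotalDue carried at the leaf level only.
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerID
    ON dbo.SalesOrder (CustomerID)
    INCLUDE (OrderDate, TotalDue);

-- Filtered: indexes only the rows that have not shipped yet.
CREATE NONCLUSTERED INDEX IX_SalesOrder_Unshipped
    ON dbo.SalesOrder (OrderDate)
    WHERE ShippedDate IS NULL;

-- Non-clustered columnstore. Before SQL Server 2016 this makes the
-- table read-only until the index is dropped or disabled.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_SalesOrder
    ON dbo.SalesOrder (CustomerID, OrderDate, TotalDue);
```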

EDW vs. OLTP (pt. 1)
- EDW definition
  - Single, complete, consistent
  - Decision-support
  - Integrate divergent information
  - Historical

EDW vs. OLTP (pt. 2)
- Comparisons
  - Integrated data vs. application-specific
  - Current/Historical data vs. current data
  - Non-volatile vs. updated
  - Encoded vs. descriptive
  - Detailed/summarized vs. raw

EDW Structure
- Source
- Staging
- Storage
  - Dimensions
  - Fact tables
- Presentation

EDW Indexing (pt. 1)
- Too few indexes
  - Data loads quickly
  - Query response time (QRT) suffers
- Too many indexes
  - Data loads slowly
  - QRT improves
  - Storage requirements increase

EDW Indexing (pt. 2)
- Major considerations
  - Warehouse type
  - Size of tables
  - Access
    - How?
    - Who?
    - What?
  - Storage requirements
  - Response-time expectations

EDW Indexing (pt. 3)
- Dimensions
  - Clustered index on business/natural key
    - Identifier from the source system
    - Enhances response time when this business key is used in a WHERE clause
  - Non-clustered index(es) (NCI)
    - Surrogate key
      - Usually the primary key
      - Meaningful only within the warehouse, not to the source system
      - Will expedite loads
    - Other columns found to be accessed frequently in searches, sorting, or grouping
    - Consider columns included in a hierarchy
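One way the dimension guidance above could look in T-SQL is sketched below. The table dbo.DimCustomer, its columns, and the index names are assumptions made up for illustration, and the geography hierarchy stands in for whatever hierarchy a real dimension has.

```sql
-- Hypothetical customer dimension; every name here is illustrative.
CREATE TABLE dbo.DimCustomer
(
    CustomerKey    INT IDENTITY(1,1) NOT NULL,  -- surrogate key
    CustomerCode   VARCHAR(20)       NOT NULL,  -- business/natural key from the source system
    CustomerName   NVARCHAR(100)     NOT NULL,
    Country        NVARCHAR(50)      NOT NULL,
    StateProvince  NVARCHAR(50)      NOT NULL,
    City           NVARCHAR(50)      NOT NULL,
    -- Primary key on the surrogate key, declared NONCLUSTERED so the
    -- clustered index stays available for the business key.
    CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey)
);

-- Clustered index on the business key, which helps queries and
-- lookups that filter on CustomerCode in a WHERE clause.
CREATE CLUSTERED INDEX CIX_DimCustomer_CustomerCode
    ON dbo.DimCustomer (CustomerCode);

-- NCI on columns that form a hierarchy, ordered from the top level down.
CREATE NONCLUSTERED INDEX IX_DimCustomer_Geography
    ON dbo.DimCustomer (Country, StateProvince, City);
```

The clustered index is left non-unique in this sketch so the same layout still works if the dimension later keeps several versions of a row per business key.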

EDW Indexing (pt. 4)
- Date & time dimensions
  - No business key
  - Consider a smart PK and cluster on it
    - YYYYMMDD
    - HHMMSSSS
  - A smart key retains proper sort order and simplifies range queries: because the PK already contains the date/time, you need one fewer join
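A minimal sketch of a smart date key follows, assuming an integer key in YYYYMMDD form. The tables dbo.DimDate and dbo.FactOrder and their columns are hypothetical and exist only to show the range query at the end.

```sql
-- Hypothetical date dimension clustered on a smart integer key.
CREATE TABLE dbo.DimDate
(
    DateKey       INT      NOT NULL,  -- YYYYMMDD, e.g. 20240131
    FullDate      DATE     NOT NULL,
    CalendarYear  SMALLINT NOT NULL,
    MonthNumber   TINYINT  NOT NULL,
    CONSTRAINT PK_DimDate PRIMARY KEY CLUSTERED (DateKey)
);

-- Derive the smart key from the date itself; CONVERT style 112
-- formats a date as yyyymmdd.
INSERT INTO dbo.DimDate (DateKey, FullDate, CalendarYear, MonthNumber)
SELECT CONVERT(INT, CONVERT(CHAR(8), d.FullDate, 112)),
       d.FullDate,
       YEAR(d.FullDate),
       MONTH(d.FullDate)
FROM (VALUES (CAST('20240131' AS DATE)),
             (CAST('20240201' AS DATE))) AS d (FullDate);

-- Hypothetical fact table that carries the same smart key.
CREATE TABLE dbo.FactOrder
(
    DateKey      INT           NOT NULL,
    SalesAmount  DECIMAL(12,2) NOT NULL
);

-- Range query on the fact table alone: the key already encodes the
-- date, so no join to dbo.DimDate is needed.
SELECT SUM(f.SalesAmount) AS JanuarySales
FROM dbo.FactOrder AS f
WHERE f.DateKey BETWEEN 20240101 AND 20240131;
```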

EDW Indexing (pt. 5)
- Type 2 SCD (slowly changing dimension)
  - Consider adding a 4-part NCI that includes…
    - The business key
    - The record begin date
    - The record end date
    - The surrogate key
  - CREATE NONCLUSTERED INDEX MyDim_CoveringIndex ON MyDim (NaturalKEY, RecordStartDate) INCLUDE (RecordEndDate, SurrogateKEY)
    - Table name MyDim is assumed here from the index name
  - Can be very useful during ETL as well as for historical queries
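To show why such an index helps during ETL, here is a hedged sketch of a surrogate-key lookup it can cover. The column names follow the slide. The table name dbo.MyDim, the data types, the sample values, and the convention that a NULL RecordEndDate marks the current row are all assumptions.

```sql
-- Hypothetical Type 2 dimension; column names follow the slide,
-- the table name and data types are assumed.
CREATE TABLE dbo.MyDim
(
    SurrogateKEY     INT IDENTITY(1,1) NOT NULL,
    NaturalKEY       VARCHAR(20)       NOT NULL,
    SomeAttribute    NVARCHAR(100)     NULL,
    RecordStartDate  DATE              NOT NULL,
    RecordEndDate    DATE              NULL   -- NULL = current row (assumed convention)
);

CREATE NONCLUSTERED INDEX MyDim_CoveringIndex
    ON dbo.MyDim (NaturalKEY, RecordStartDate)
    INCLUDE (RecordEndDate, SurrogateKEY);

-- ETL-style lookup: which version of the row was in effect on a given date?
DECLARE @NaturalKey VARCHAR(20) = 'CUST-001',   -- sample value, hypothetical
        @AsOfDate   DATE        = '20240115';   -- sample value, hypothetical

SELECT d.SurrogateKEY
FROM dbo.MyDim AS d
WHERE d.NaturalKEY = @NaturalKey
  AND d.RecordStartDate <= @AsOfDate
  AND (d.RecordEndDate IS NULL OR d.RecordEndDate > @AsOfDate);
```

Every column the query touches is in the index key or its INCLUDE list, so the lookup can be answered from the index without reading the base table.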

EDW Indexing (pt. 6)
- Fact table
  - Similar to indexing a dimension, with an eye towards partitioning
  - Usually best to cluster on the date key or date/time key
  - If the table is partitioned on a date column, use that column as the clustering key
  - Create NCIs on each of the FKs in the fact table
  - Consider combining the FK and date key (in that order) to enhance query response
  - Watch storage requirements
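The fact-table points above might translate into something like the following sketch. All object names and the yearly boundary values are hypothetical, and the example assumes an edition of SQL Server that supports table partitioning and an integer YYYYMMDD date key.

```sql
-- Partition by year on the integer date key (boundary values are illustrative).
CREATE PARTITION FUNCTION pfSalesByYear (INT)
    AS RANGE RIGHT FOR VALUES (20230101, 20240101, 20250101);

-- For simplicity, every partition is mapped to the PRIMARY filegroup.
CREATE PARTITION SCHEME psSalesByYear
    AS PARTITION pfSalesByYear ALL TO ([PRIMARY]);

-- Hypothetical fact table, partitioned on the date key.
CREATE TABLE dbo.FactSales
(
    DateKey      INT           NOT NULL,
    CustomerKey  INT           NOT NULL,  -- FK to a customer dimension
    ProductKey   INT           NOT NULL,  -- FK to a product dimension
    Quantity     INT           NOT NULL,
    SalesAmount  DECIMAL(12,2) NOT NULL
) ON psSalesByYear (DateKey);

-- Cluster on the partitioning column.
CREATE CLUSTERED INDEX CIX_FactSales_DateKey
    ON dbo.FactSales (DateKey)
    ON psSalesByYear (DateKey);

-- One NCI per foreign key, each combined with the date key (FK first).
-- With no ON clause these are built on the table's partition scheme.
CREATE NONCLUSTERED INDEX IX_FactSales_CustomerKey_DateKey
    ON dbo.FactSales (CustomerKey, DateKey);

CREATE NONCLUSTERED INDEX IX_FactSales_ProductKey_DateKey
    ON dbo.FactSales (ProductKey, DateKey);
```

Each additional NCI adds storage and slows loads, which is the trade-off the last bullet warns about.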

Modifying the Scheme
- Over time your data warehouse will change to accommodate what's happening in your organization
- Use tried-and-true transactional methods for tuning indexes…
  - Database Engine Tuning Advisor (DTA)
  - Execution plans
  - DMVs
    - sys.dm_db_index_usage_stats
    - sys.dm_db_index_operational_stats
    - sys.dm_db_missing_index_details
    - sys.dm_db_missing_index_columns
    - sys.dm_db_missing_index_group_stats
    - sys.dm_db_missing_index_groups
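Two starting-point queries against the DMVs listed above are sketched here. The column choices and sort orders are one reasonable option, not the presenter's own scripts. The counters in these views are reset when the instance restarts, so read them after a representative workload has run.

```sql
-- How each index in the current database is used: many updates with few
-- seeks/scans/lookups suggests an index that costs more than it returns.
SELECT  OBJECT_NAME(i.object_id) AS TableName,
        i.name                   AS IndexName,
        s.user_seeks,
        s.user_scans,
        s.user_lookups,
        s.user_updates
FROM    sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
        ON  s.object_id   = i.object_id
        AND s.index_id    = i.index_id
        AND s.database_id = DB_ID()
WHERE   OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY s.user_updates DESC;

-- Indexes the optimizer wished it had, roughly ranked by potential benefit.
SELECT  d.statement          AS TableName,
        d.equality_columns,
        d.inequality_columns,
        d.included_columns,
        gs.user_seeks,
        gs.avg_user_impact
FROM    sys.dm_db_missing_index_details     AS d
JOIN    sys.dm_db_missing_index_groups      AS g
        ON g.index_handle = d.index_handle
JOIN    sys.dm_db_missing_index_group_stats AS gs
        ON gs.group_handle = g.index_group_handle
WHERE   d.database_id = DB_ID()
ORDER BY gs.user_seeks * gs.avg_user_impact DESC;
```

Treat the missing-index output as suggestions to evaluate rather than a list to create blindly, since overlapping suggestions can often be merged into one index.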

Thank you!!! Twitter: LinkedIn: Web: