Published by Shanna Thornton. Modified over 8 years ago.
1
Indexing Your Data Warehouse
Troy Gallant, MTA
2
Agenda
- A little about me
- Indexing review
- Enterprise Data Warehouse (EDW) vs. OLTP
- EDW structure
- EDW indexing
  - Too many / too few
  - Considerations
  - Dimension / fact indexing
- Maintenance
4
Bio
- 15 years as a database professional
- Last 2 years in NYC, all previous in Jax
- Microsoft MTA certified
- Speaker: 16x SQL Saturday, 4x JSSUG
- Working on MS in IT Management
- Twitter: @GratefulDBA
- LinkedIn: https://www.linkedin.com/in/tgallant
- Website: http://www.troygallant.com
- Email: tgallant@outlook.com
5
Indexing Review
- Broad definition
- What an index DOES do
- What an index DOESN'T do
7
Types of Indexes
- Heap*
- Clustered
- Non-clustered
- Non-clustered w/ included columns
- Unique
- Full-text
- Spatial
- Filtered
- XML
- Columnstore
8
EDW vs. OLTP (pt. 1)
- EDW definition
  - Single, complete, consistent
  - Decision-support
  - Integrate divergent information
  - Historical
9
EDW vs. OLTP (pt. 2)
- Comparisons
  - Integrated data vs. application-specific
  - Current/historical data vs. current data
  - Non-volatile vs. updated
  - Encoded vs. descriptive
  - Detailed/summarized vs. raw
11
EDW Structure
- Source
- Staging
- Storage
  - Dimensions
  - Fact tables
- Presentation
12
EDW Indexing (pt. 1)
- Too few indexes
  - Data loads quickly
  - Query response time (QRT) suffers
- Too many indexes
  - Data loads slowly
  - QRT improves
  - Storage requirements increase
14
EDW Indexing (pt. 2)
- Major considerations
  - Warehouse type
  - Size of tables
  - Access
    - How?
    - Who?
    - What?
  - Storage requirements
  - Response-time expectations
16
EDW Indexing (pt. 3)
- Dimensions
  - Clustered index on the business/natural key
    - Identifier from the source system
    - Enhances response time when this business key is used in a WHERE clause
  - NCI(s)
    - Surrogate key
      - Usually the primary key
      - Meaningful only within the warehouse, not to the source system
      - Will expedite loads
    - Other columns found to be accessed frequently in searches, sorting, or grouping
    - Consider columns included in a hierarchy
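A minimal T-SQL sketch of the dimension pattern on this slide. The table dbo.DimCustomer, its columns, and the index names are all hypothetical and chosen only for illustration. The clustered index is deliberately non-unique so the same layout still works if the dimension later keeps several rows per business key.

```sql
-- Hypothetical customer dimension; every name here is illustrative.
CREATE TABLE dbo.DimCustomer
(
    CustomerKey   INT IDENTITY(1,1) NOT NULL,  -- surrogate key
    CustomerID    NVARCHAR(20)      NOT NULL,  -- business/natural key from the source system
    CustomerName  NVARCHAR(100)     NOT NULL,
    City          NVARCHAR(50)      NULL,
    StateProvince NVARCHAR(50)      NULL,
    Country       NVARCHAR(50)      NULL,
    -- Primary key on the surrogate key, backed by a nonclustered index
    CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey)
);

-- Clustered index on the business key, for WHERE CustomerID = ... lookups
CREATE CLUSTERED INDEX CIX_DimCustomer_CustomerID
    ON dbo.DimCustomer (CustomerID);

-- NCI on columns that form a hierarchy and are often filtered or grouped on
CREATE NONCLUSTERED INDEX IX_DimCustomer_Geography
    ON dbo.DimCustomer (Country, StateProvince, City);
```

The primary key has to be declared NONCLUSTERED explicitly, because SQL Server otherwise makes a primary key clustered when the table has no clustered index yet.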
18
EDW Indexing (pt. 4)
- Date & time dimensions
  - No business key
  - Consider a smart PK and cluster on it
    - YYYYMMDD
    - HHMMSSSS
  - A smart key retains proper sort order and simplifies range queries: because the PK already contains the date/time, you need one fewer join
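As a rough illustration of the smart-key idea, the sketch below clusters a date dimension on an integer YYYYMMDD key and then filters a fact table on that key directly. The tables dbo.DimDate and dbo.FactOrders and their columns are assumptions made for the example, not names from the slides.

```sql
-- Hypothetical date dimension clustered on a smart YYYYMMDD key.
CREATE TABLE dbo.DimDate
(
    DateKey      INT      NOT NULL,  -- smart key, e.g. 20240315
    FullDate     DATE     NOT NULL,
    CalendarYear SMALLINT NOT NULL,
    MonthNumber  TINYINT  NOT NULL,
    CONSTRAINT PK_DimDate PRIMARY KEY CLUSTERED (DateKey)
);

-- Hypothetical fact table that stores the same smart key as its date FK.
CREATE TABLE dbo.FactOrders
(
    OrderDateKey INT           NOT NULL,
    SalesAmount  DECIMAL(18,2) NOT NULL
);

-- Deriving the key from a date: CONVERT style 112 formats as yyyymmdd.
DECLARE @d DATE = '2024-03-15';
SELECT CONVERT(INT, CONVERT(CHAR(8), @d, 112)) AS DateKey;

-- Range query on the fact table alone: the key already encodes the date,
-- so no join to dbo.DimDate is needed to filter on a date range.
SELECT SUM(f.SalesAmount) AS Q1Sales
FROM dbo.FactOrders AS f
WHERE f.OrderDateKey BETWEEN 20240101 AND 20240331;
```

Because YYYYMMDD integers sort in calendar order, the BETWEEN predicate covers exactly the first quarter of 2024.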
20
EDW Indexing (pt. 5)
- Type 2 SCD (slowly changing dimension)
  - Consider adding a four-part NCI that includes:
    - The business key
    - The record begin date
    - The record end date
    - The surrogate key

    CREATE NONCLUSTERED INDEX MyDim_CoveringIndex
        ON MyDim (NaturalKEY, RecordStartDate)
        INCLUDE (RecordEndDate, SurrogateKEY)

    (The table name MyDim is assumed from the index name; the statement needs a table name after ON.)
  - Can be very useful during ETL as well as for historical queries
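To show why this index helps during ETL, here is a hedged sketch of a surrogate-key lookup it can cover. The table definition, the sample values, and the convention that a row is in effect from RecordStartDate up to but not including RecordEndDate (with open rows ending on 9999-12-31) are assumptions for the example; only the index columns come from the slide.

```sql
-- Hypothetical Type 2 dimension; column types are assumptions.
CREATE TABLE dbo.MyDim
(
    SurrogateKEY    INT IDENTITY(1,1) NOT NULL,
    NaturalKEY      NVARCHAR(20)      NOT NULL,
    RecordStartDate DATE              NOT NULL,
    RecordEndDate   DATE              NOT NULL,  -- assumption: current rows use '9999-12-31'
    SomeAttribute   NVARCHAR(50)      NULL,
    CONSTRAINT PK_MyDim PRIMARY KEY NONCLUSTERED (SurrogateKEY)
);

-- The four-part index from the slide, on the assumed table.
CREATE NONCLUSTERED INDEX MyDim_CoveringIndex
    ON dbo.MyDim (NaturalKEY, RecordStartDate)
    INCLUDE (RecordEndDate, SurrogateKEY);

-- ETL lookup: which version of the row was in effect on the transaction date?
DECLARE @NaturalKey NVARCHAR(20) = N'CUST-001';  -- hypothetical business key value
DECLARE @TxnDate    DATE         = '2024-03-15';

SELECT d.SurrogateKEY
FROM dbo.MyDim AS d
WHERE d.NaturalKEY      =  @NaturalKey
  AND d.RecordStartDate <= @TxnDate
  AND d.RecordEndDate   >  @TxnDate;
```

Every column the query touches is in the index key or its INCLUDE list, so the lookup can be answered from the index without reading the base table.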
22
EDW Indexing (pt. 6)
- Fact table
  - Similar to indexing a dimension, with an eye towards partitioning
  - Usually best to cluster on the date key or date/time key
  - If the table is partitioned on a date column, use that column as the clustering key
  - Create NCIs on each of the FKs in the fact table
  - Consider combining the FK and date key (in that order) to enhance query response
  - Watch storage requirements
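One possible shape for the fact-table guidance above, sketched in T-SQL. The table dbo.FactSales, its columns, and the index names are hypothetical, and partitioning is left out to keep the example short.

```sql
-- Hypothetical fact table; all names are illustrative.
CREATE TABLE dbo.FactSales
(
    OrderDateKey INT           NOT NULL,  -- FK to the date dimension (YYYYMMDD)
    CustomerKey  INT           NOT NULL,  -- FK to the customer dimension
    ProductKey   INT           NOT NULL,  -- FK to the product dimension
    Quantity     INT           NOT NULL,
    SalesAmount  DECIMAL(18,2) NOT NULL
);

-- Cluster on the date key so date-range scans read contiguous pages.
CREATE CLUSTERED INDEX CIX_FactSales_OrderDateKey
    ON dbo.FactSales (OrderDateKey);

-- One NCI per dimension FK, each combined with the date key (FK first).
CREATE NONCLUSTERED INDEX IX_FactSales_CustomerKey_OrderDateKey
    ON dbo.FactSales (CustomerKey, OrderDateKey);

CREATE NONCLUSTERED INDEX IX_FactSales_ProductKey_OrderDateKey
    ON dbo.FactSales (ProductKey, OrderDateKey);
```

Each additional NCI adds to both storage and load time, which is the trade-off the last bullet on the slide warns about.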
24
Modifying the Scheme
- Over time your data warehouse will change to accommodate what's happening in your organization
- Use tried-and-true transactional methods for tuning indexes:
  - DTA (Database Engine Tuning Advisor)
  - Execution plans
  - DMVs
    - sys.dm_db_index_usage_stats
    - sys.dm_db_index_operational_stats
    - sys.dm_db_missing_index_details
    - sys.dm_db_missing_index_columns
    - sys.dm_db_missing_index_group_stats
    - sys.dm_db_missing_index_groups
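Two example queries against the DMVs listed above, offered as a starting point rather than a complete tuning script. The first compares reads and writes per index in the current database; the second lists missing-index suggestions ordered by a simple seeks-times-impact weighting, which is an assumption of this sketch and not a ranking from the slides.

```sql
-- Reads vs. writes per index: rarely read but heavily written indexes
-- are candidates for review.
SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name                   AS IndexName,
       ISNULL(us.user_seeks, 0) + ISNULL(us.user_scans, 0)
           + ISNULL(us.user_lookups, 0) AS Reads,
       ISNULL(us.user_updates, 0)       AS Writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY Reads ASC, Writes DESC;

-- Missing-index suggestions for the current database.
SELECT d.statement AS TableName,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       gs.user_seeks,
       gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
      ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
      ON gs.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY gs.user_seeks * gs.avg_user_impact DESC;
```

These DMVs are cleared when the SQL Server instance restarts, so the numbers only reflect activity since the last restart. In a warehouse it is worth checking them after a full load and reporting cycle has run.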
25
Thank you!!!
- Twitter: @GratefulDBA
- LinkedIn: https://www.linkedin.com/in/tgallant
- Web: http://troygallant.com
- Email: tgallant@outlook.com