Performance-Tuning for Extract/Transform/Load (ETL) Operations Presented by : Ken Stuber
About Ken Data Architect at Jewelers Mutual Insurance Has worked with SQL Server since 1997 Has supported Data Warehousing on SQL platform for 7+ years Missed MCM by one lab exam question. Grrr…. Will try again with whatever new advanced cert MS comes up
First… know your goals BASELINE SLAs Refresh Frequency Process Prioritization
Then … Design for Performance Data Grain Source Data Local Re-startable Parallel processing Merge or Rebuild Dims?
Database and Server Settings Recovery Model Isolation Max Memory Power Option MAXDOP
Getting to your Source Data If Local… Backup & Restore SAN Snap or Clone AA Readable 2 nd Snapshot of Mirror Replication Log Shipping CDC? Performance Trade-off of local vs Remote Always TEST!
Bulk Loading Data Reduce IO w/ Minimal Logging Methods sys.dm_io_virtual_file_stats TABLOCK vs Trace Flag 610
Functions CPU killer UDF vs CLR vs In-Line
Indexing Source Staging Dest.
Statistics Bad Stats = Catastrophic Performance Update after every build FULLSCAN if you can afford it
Compression? I/O Reduction at cost of CPU
Partitioning? Can Eliminate the L. of E.T.L. Increases Complexity Can Benefit Query performance too
Other Advanced Features Hekaton? Updatable Columnstore Indexes?
Hardware Considerations More RAM!!! IO or CPU bound? Know your IO Subsystem SSDs Multiple files TempDB local
Tuning existing ETL
Resources Microsoft CAT Teams Data Loading Performance Guide Microsoft CAT Teams Data Loading Performance Guide SQL Magazine: How to find your most expensive queriesHow to find your most expensive queries SAN Performance Tuning with SQLIO Columnstore Insert/Update Performance