Performance Tuning SSIS


Presentation on theme: "Performance Tuning SSIS"— Presentation transcript:

1 Performance Tuning SSIS
Brian Knight, CEO Pragmatic Works Brian

2 About the Ugly Guy Speaking
SQL Server MVP. Founder of Pragmatic Works. Co-Founder of BIDN.com, SQLServerCentral.com and SQLShare.com. Written more than a dozen books on SQL Server.

3 Integration Services in Action
Diagram labels: GeoSpatial Data; Semi structured; Legacy data: binary files; Application database; SQL Server Integration Services; GeoSpatial Components; Custom source; Standard sources; Data-cleansing components; Merges; Data mining; Warehouse; Reports; Mobile data; Cube.
Source = Use the right tool for the job. Transform = Smaller trips, because I still had to come back for more dirt. Destination = Bring the car closer, because there's less to run. Devin
Integration is a seamless, manageable operation. Source, prepare, and load data in a single, auditable process. Scale to handle heavy and complex data requirements.

4 Advanced Session

5 Today’s Problems with Integration
Integration today: increasing data volumes, increasingly diverse sources. Requirements reached the Tipping Point: low-impact source extraction, efficient transformation, bulk loading techniques.
When my brother entered the field, he had pansy data to deal with, like 100 MB. Now, though, we have real data problems. How many of you have databases over a TB in your environment? That's right. We now have to deal with terabytes of data and keep it in sync live. My brother only had to deal with flat files and Excel spreadsheets; we now have to go to the mainframe directly without impacting it. Devin

6 Tuning Decisions
Choose the right tool for the job. Don't be afraid to use T-SQL. Will parallelism work? Brian

7 Source Optimization
Flat files: when available, use Fast Parse. OLE DB sources: change the network packet size. Use T-SQL whenever possible in the OLE DB Source: joining, NULL handling, WHERE clauses.
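As a rough illustration of pushing the join, NULL handling, and WHERE clause into the source query, here is a minimal Python sketch that uses the standard-library sqlite3 module as a stand-in for SQL Server. The table names, column names, and the 'Unknown' default are hypothetical and not from the presentation; the point is only that the database returns rows already joined, defaulted, and filtered, so the data flow has less to do.

```python
import sqlite3

def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'East'), (2, NULL), (3, 'West');
        INSERT INTO orders VALUES (10, 1, 50.0), (11, 2, 75.0), (12, 3, 5.0), (13, 1, NULL);
        """
    )
    return conn

# Join, NULL handling, and filtering all happen in the source query,
# so only the rows the pipeline needs ever leave the database.
SOURCE_QUERY = """
    SELECT o.id,
           COALESCE(c.region, 'Unknown') AS region,
           COALESCE(o.amount, 0.0)       AS amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE COALESCE(o.amount, 0.0) >= ?
    ORDER BY o.id
"""

def extract(conn: sqlite3.Connection, min_amount: float) -> list:
    """Run the source query and return the already-shaped rows."""
    return conn.execute(SOURCE_QUERY, (min_amount,)).fetchall()

if __name__ == "__main__":
    for row in extract(build_demo_db(), 10.0):
        print(row)
```

The same query shape would sit in the SQL command of an OLE DB Source; COALESCE is used here because it is available in both SQLite and T-SQL.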

8 Impact of Compression on ETL
* Not official Microsoft results.

9 Tuning the Source
Demo: connection manager tuning; flat file tuning; OLE DB Source tuning. Brian – 10 min

10 Transform Components The Pipeline presents the buffer to each downstream component Devin

11 SSIS Data Flow Architecture
Synchronous vs. Non Synchronous. Cards example. © 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

12 Case Study: Patterns Devin 83 seconds 105 seconds

13 Source Data Extraction
Extracting data from the source is expensive. Efficient extraction is key to improving ETL performance. Involves bulk loading data into staging areas or warehouse. Time consuming & resource intensive: triggers (synchronous IO penalty), timestamp columns (schema changes), complex queries (delayed IO penalty), custom (ISV, mirror, snapshot, …). Incremental data load is key to efficient extraction. Need to know what changed at source since a point in time. Expensive lookups to determine changed columns. Providing information up front about which columns changed will improve efficiency. Devin
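To make "what changed at source since a point in time" concrete, here is a minimal Python sketch of the timestamp-column (high-watermark) approach listed above, using sqlite3 as a stand-in source. The table `src`, the column `modified_at`, and the integer timestamps are hypothetical. It is one of the traditional techniques the slide mentions, not Change Data Capture itself.

```python
import sqlite3

def make_source() -> sqlite3.Connection:
    """Hypothetical source table with a last-modified column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE src (id INTEGER PRIMARY KEY, val TEXT, modified_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO src VALUES (?, ?, ?)",
        [(1, "a", 100), (2, "b", 105), (3, "c", 110)],
    )
    return conn

def extract_changes(conn: sqlite3.Connection, last_watermark: int):
    """Return rows modified after last_watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, val, modified_at FROM src "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_watermark,),
    ).fetchall()
    # Advance the watermark only when something was actually extracted.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark

if __name__ == "__main__":
    source = make_source()
    initial, mark = extract_changes(source, 0)      # first load: everything
    source.execute("UPDATE src SET val = 'b2', modified_at = 120 WHERE id = 2")
    delta, mark = extract_changes(source, mark)     # later load: only row 2
    print(len(initial), delta, mark)
```

Note that this approach needs the extra column on every source table and says nothing about which columns changed or about deleted rows, which is part of why the slide calls these approaches expensive.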

14 SQL Server 2008: Change Data Capture (CDC)
Information about what changed at the source. Changes captured from the log asynchronously. Enabled per table. CDC APIs provide access to change data. Diagram labels: OLTP, Change Tables, Data Warehouse. Brian
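A minimal sketch of what a consumer of change data might do, written in Python rather than T-SQL and under stated assumptions: each change row is reduced to a hypothetical (operation, key, row) tuple, and the warehouse table is modelled as a dict. The operation codes match the `__$operation` values in SQL Server CDC change tables (1 = delete, 2 = insert, 3 = update with old values, 4 = update with new values); everything else is illustrative.

```python
# Operation codes as they appear in the __$operation column of a CDC change table.
DELETE, INSERT, UPDATE_BEFORE, UPDATE_AFTER = 1, 2, 3, 4

def apply_changes(target: dict, changes: list) -> dict:
    """Apply change rows, given in commit order, to an in-memory target table.

    Each change is a hypothetical (operation, key, row) tuple.
    """
    for operation, key, row in changes:
        if operation == DELETE:
            target.pop(key, None)
        elif operation in (INSERT, UPDATE_AFTER):
            target[key] = row
        # UPDATE_BEFORE carries the old values only, so there is nothing to apply.
    return target

if __name__ == "__main__":
    warehouse = {1: {"name": "old"}}
    changes = [
        (INSERT, 2, {"name": "new"}),
        (UPDATE_BEFORE, 1, {"name": "old"}),
        (UPDATE_AFTER, 1, {"name": "renamed"}),
        (DELETE, 2, {"name": "new"}),
    ]
    print(apply_changes(warehouse, changes))
```

Because the change rows already say what happened to each key, the load does not need the expensive lookups that a full comparison against the warehouse would require.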

15 Change Data Capture
Demo: traditional CDC with SSIS; integrating CDC in 2008. Devin – 8-10 min

16 Lookup Component Three modes of operation
Full Cache: for small lookup datasets. No Cache: for volatile lookup datasets. Partial Cache: for large lookup datasets. Tradeoff: memory vs. performance. Use cascaded Lookups. Merge Join may be an alternative. Devin
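The memory versus performance tradeoff between the three modes can be sketched in Python. This is an illustration only, not how the Lookup component is implemented: `ReferenceTable` is a hypothetical stand-in for the reference database that counts round trips, and the partial cache uses `functools.lru_cache` (least recently used eviction) as a stand-in for the component's own eviction policy.

```python
from functools import lru_cache

class ReferenceTable:
    """Hypothetical reference data source that counts round trips."""

    def __init__(self, rows: dict):
        self._rows = dict(rows)
        self.round_trips = 0

    def fetch_one(self, key):
        self.round_trips += 1
        return self._rows.get(key)

    def fetch_all(self) -> dict:
        self.round_trips += 1
        return dict(self._rows)

def lookup_full_cache(ref: ReferenceTable, keys: list) -> list:
    """Load the whole reference set once, then resolve every key in memory."""
    cache = ref.fetch_all()
    return [cache.get(k) for k in keys]

def lookup_no_cache(ref: ReferenceTable, keys: list) -> list:
    """Query the reference source for every incoming row."""
    return [ref.fetch_one(k) for k in keys]

def lookup_partial_cache(ref: ReferenceTable, keys: list, max_entries: int) -> list:
    """Query on a miss and keep at most max_entries results in memory."""
    cached_fetch = lru_cache(maxsize=max_entries)(ref.fetch_one)
    return [cached_fetch(k) for k in keys]

if __name__ == "__main__":
    data = {1: "East", 2: "West", 3: "North"}
    keys = [1, 2, 1, 1, 3, 2]
    for name, run in [
        ("full", lookup_full_cache),
        ("none", lookup_no_cache),
        ("partial", lambda r, k: lookup_partial_cache(r, k, 2)),
    ]:
        ref = ReferenceTable(data)
        run(ref, keys)
        print(name, ref.round_trips)
```

Full cache pays memory for the whole reference set but makes a single trip; no cache uses almost no memory but makes one trip per row; partial cache sits in between, bounded by the cache size.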

17 SQL Server 2008: Lookup Transform
Hydrate cache files for large data sets. Can reuse the cache. Can load the cache during the day and use it in the nightly ETL. Brian

18 Demo
Cascading lookup optimizations; cache file lookup. Devin – 5 min

19 Data Destinations
Use “Fast Load” or SQL Server Destination. Table Lock on insert operations. Trace flags for improvement. Old principles still apply. Devin

20 Destination Tuning Devin Demo

21 Building a Work Queue System
Create a work queue table. Create a loop that runs over the work queue, constantly checking out work. Spawn it x times with a batch file.
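The three steps can be sketched in Python under some loud assumptions: sqlite3 stands in for the SQL Server work queue table, threads stand in for the package instances a batch file would spawn, and a lock stands in for the atomic check-out a real queue table would need. The table and column names are hypothetical.

```python
import sqlite3
import threading

def create_queue(tasks: list) -> sqlite3.Connection:
    """Step 1: create the work queue table and load the pending work."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE work_queue ("
        "id INTEGER PRIMARY KEY, file_name TEXT, "
        "status TEXT DEFAULT 'pending', worker TEXT)"
    )
    conn.executemany(
        "INSERT INTO work_queue (file_name) VALUES (?)", [(t,) for t in tasks]
    )
    conn.commit()
    return conn

def check_out(conn, lock, worker: str):
    """Claim one pending item so that no two workers get the same row."""
    with lock:
        row = conn.execute(
            "SELECT id, file_name FROM work_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE work_queue SET status = 'in_progress', worker = ? WHERE id = ?",
            (worker, row[0]),
        )
        conn.commit()
        return row

def worker_loop(conn, lock, worker: str, process) -> None:
    """Step 2: loop over the queue, checking out work until none is left."""
    while True:
        item = check_out(conn, lock, worker)
        if item is None:
            break
        process(item[1])
        with lock:
            conn.execute("UPDATE work_queue SET status = 'done' WHERE id = ?", (item[0],))
            conn.commit()

def run(tasks: list, worker_count: int, process) -> list:
    """Step 3: spawn the loop worker_count times and wait for all workers."""
    conn = create_queue(tasks)
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker_loop, args=(conn, lock, f"worker-{i}", process))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return conn.execute(
        "SELECT status, COUNT(*) FROM work_queue GROUP BY status"
    ).fetchall()

if __name__ == "__main__":
    processed = []
    print(run([f"file{i}.dat" for i in range(8)], 4, processed.append))
```

The design point is that the workers never coordinate with each other; they only coordinate through the queue table, so adding capacity is a matter of starting more copies of the same loop.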

22 Demo Results Here is what our first run looked like with each task being processed in sequence by a single package instance.

23 Demo Results This is what our second run looked like with 2 processes working in parallel. As you can see, the tasks get completed in batches of two and the total demo run time drops by nearly half, from about 64 seconds to about 36 seconds.

24 Demo Results And here is our third run with 4 processes working in parallel. The time for individual tasks has risen from 8 or 9 seconds to 13 or 14 seconds while the total run time has dropped from about 36 seconds to about 28 seconds.

25 Demo Results Finally, here is a run of the demo with 8 processes. As all tasks get worked on simultaneously, the time for each task has risen to about 27 seconds and the total run time is almost the same as the run with 4 processes. What’s happened here is that we’ve hit a disk I/O bottleneck as all 8 processes contend with each other to read their data files from the disk. To solve this problem, we would want to spread the files across separate disks and controllers or move to a faster disk technology.
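The run times quoted in these results can be turned into speedup and parallel efficiency figures with a few lines of Python. The timings are the approximate totals from the slides (64, 36, and 28 seconds); the 8-process run is left out because the slides only say it was almost the same as the 4-process run. The function names are just for this sketch.

```python
# Approximate total run times from the demo slides: processes -> seconds.
RUNS = {1: 64.0, 2: 36.0, 4: 28.0}

def speedup(baseline_seconds: float, seconds: float) -> float:
    """How many times faster a run is than the single-process baseline."""
    return baseline_seconds / seconds

def efficiency(baseline_seconds: float, seconds: float, processes: int) -> float:
    """Speedup divided by process count; 1.0 would be perfect scaling."""
    return speedup(baseline_seconds, seconds) / processes

if __name__ == "__main__":
    baseline = RUNS[1]
    for processes, seconds in RUNS.items():
        print(
            processes,
            round(speedup(baseline, seconds), 2),
            round(efficiency(baseline, seconds, processes), 2),
        )
```

Going from 1 to 2 processes gives roughly a 1.78x speedup (about 89% efficiency), while 4 processes give roughly 2.29x (about 57%), which is consistent with the disk I/O contention described for the 8-process run already starting to show at 4.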

26 Parallel Load Demo

27 Managing Resources
Logging events to watch pipeline internals: PipelineExecutionPlan, PipelineExecutionTree, BufferSizeTuning. System Monitor to track I/O issues: Buffers In Use tracks how many buffers are presently being used; Buffers Spooled tracks how many 10 MB buffers have been spooled to disk. Brian
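To get a feel for how row width interacts with a 10 MB buffer, here is a small Python sketch. It assumes the default data flow settings of DefaultBufferSize = 10 MB and DefaultBufferMaxRows = 10,000, and it uses a simplified rule (rows per buffer is the smaller of the row cap and what fits in the buffer). The engine's real sizing logic has more inputs, so treat this as a back-of-the-envelope estimate only.

```python
# Assumed defaults for the data flow task properties.
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024   # bytes (10 MB)
DEFAULT_BUFFER_MAX_ROWS = 10_000

def rows_per_buffer(
    row_bytes: int,
    buffer_size: int = DEFAULT_BUFFER_SIZE,
    max_rows: int = DEFAULT_BUFFER_MAX_ROWS,
) -> int:
    """Simplified estimate of how many rows fit in one pipeline buffer."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return min(max_rows, buffer_size // row_bytes)

if __name__ == "__main__":
    for width in (200, 4000):
        print(width, rows_per_buffer(width))
```

With a narrow 200-byte row the row cap is the limit, while a wide 4,000-byte row is limited by the buffer size, which is why trimming unused columns at the source lets each buffer carry more rows.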

28 Measuring Performance
Perfmon Brian – 6 min

29 Location Consider the following configuration… Where should SSIS run?
(Licensing issues aside) Diagram labels: SQL Server 1, SQL Server 2, SSIS Server. Brian © 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

30 WSRM
Windows System Resource Manager (WSRM) can throttle CPU and memory. Creates a soft throttle. Can be scheduled so SSIS gets priority on weekends and nights. Only activates the policy if resources begin to become constrained (about 70%). WSRM is free with Windows Server 2003 Enterprise Edition and included in Windows Server 2008. Brian

31 WSRM
Demo: creating a soft schedule cap. Brian

32 Summary
Planning: Don’t underestimate the power of the whiteboard! Use the right tool for the right job. Leverage the power of the engine. Patterns and Practices: Understand best practices, but don’t be afraid to experiment. Devin/Brian

33 The End Already? Questions @BrianKnight

