1
Designing SSIS Packages for Performance
Eleanor Stahura and Erin Dempster
2
Thank you Sponsors!
Visit the Sponsor Booths
Lots of Raffle Prizes!
Get your parking paid via Sponsor Bingo
Platinum Sponsor:
Gold Sponsors:
3
PASSMN – News/Info
Sponsors:
Thanks to all our sponsors of 2018!
We need Sponsors for Nov/Dec 2018 and 2019!
Special thanks to our annual sponsor:
Board Member Elections in November/December: Your chance to help out the MN SQL community!
4
November 5th Through November 9th
Join the brightest data professionals focused on the Microsoft Data Platform!
Pre-Conference Sessions – Monday/Tuesday
Conference – Wednesday through Friday
5
SQLSaturday #796 – After Party
Location: 4th Floor of Mall of America
Time: 6:30PM – 10PM
There will be drinks and appetizers as well as free game cards and bowling!
Hang out with some new friends you’ve made.
6
The Presenters
7
Eleanor Stahura
6 years' experience
Database, Data Warehouse & Report development
SQL Server 2008 – 2016
SSIS, SSAS, SSRS / Power BI
Current grad student at University of St. Thomas (MS Software Engineering)
8
Erin Dempster
15 years' experience
Certified since 2004 (MCDBA on SQL Server 2000)
Transactional and Analytical developer
Application developer (VB 6 and C# .NET)
Database Administrator (SQL 2008 – 2014)
Current grad student at Dakota State University (MS Analytics)
9
Outline
10
Outline
Performance in SSIS
Different Types of Blocking
Dimension ETL Optimization
Fact ETL Optimization
11
Scenario
Operations needs to be able to track inventory by day.
Incremental inventory extracts are available to be consumed.
New customers are coming in every day.
Other customers are updating their contact information.
Reports need to reflect the current customers and their attributes.
12
Scenario
Build an SSIS package to run today
…and tomorrow
…and next month
…and next year
…and the data volume is growing every day
…and other things are running on the server
13
Building Strong SSIS Packages
More than just getting them to work
14
Why Does Performance Matter?
Most obvious: faster = better
Grows more happily
Makes less mess
Plays better with others
15
Start Thinking Performance
16
Blocking
The degree to which a single row of data can be processed independently from other rows
17
Types of SSIS Blocking
18
Fully Blocking
Requires the entire data stream
Increased memory usage
Generally decreases performance
Includes: Sort, Fuzzy Lookup, Aggregate
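To make "requires the entire data stream" concrete, here is a minimal Python sketch of a Sort-like step. It is an analogy only, not how the SSIS engine is implemented; `source`, `blocking_sort`, and the `counter` dictionary are hypothetical names introduced for illustration.

```python
from typing import Iterable, Iterator


def source(n: int) -> Iterator[dict]:
    """Hypothetical source: yields rows one at a time, in descending key order."""
    for key in range(n, 0, -1):
        yield {"key": key}


def blocking_sort(rows: Iterable[dict], counter: dict) -> Iterator[dict]:
    """Sort-like step: must buffer every input row before it can emit the first one."""
    buffered = []
    for row in rows:
        buffered.append(row)
        counter["buffered"] = len(buffered)  # rows held in memory so far
    buffered.sort(key=lambda r: r["key"])
    yield from buffered


counter = {"buffered": 0}
out = blocking_sort(source(1000), counter)
first = next(out)  # asking for a single row forces all 1000 rows to be read
print(first, counter["buffered"])
```

The first output row cannot be known until the last input row has arrived, which is why memory use grows with the size of the data set rather than with the size of a buffer.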
19
Video - Fully Blocking
20
Execution Results – Sort and Merge
Time Elapsed: 1 minute 52 seconds
Number of records: 2.67 million
Memory used: 650 MB
What happened?
2 Sort transformations
All records stored in memory
21
Semi Blocking
Doesn’t require the entire data stream, but new thread(s) are created to run asynchronously
Includes: Merge Join, Pivot/Unpivot, Union All
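As a rough illustration of why a Merge Join sits between the two extremes, the sketch below joins two inputs that are already sorted by key (SSIS Merge Join also requires sorted inputs). It emits rows before either input is exhausted, but it has to hold a row from one side while it waits for the other to catch up. The function name `merge_join` and the sample rows are assumptions, and the right-hand keys are assumed to be unique.

```python
from typing import Iterable, Iterator, Tuple


def merge_join(
    left: Iterable[Tuple[int, str]], right: Iterable[Tuple[int, str]]
) -> Iterator[Tuple[int, str, str]]:
    """Inner join of two inputs pre-sorted by key; right keys assumed unique."""
    right_iter = iter(right)
    current = next(right_iter, None)
    for l_key, l_val in left:
        # Advance the right side until its key catches up with the left key.
        while current is not None and current[0] < l_key:
            current = next(right_iter, None)
        if current is None:
            return  # right side exhausted: no further matches are possible
        if current[0] == l_key:
            yield (l_key, l_val, current[1])


left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right_rows = [(2, "X"), (3, "Y"), (4, "Z")]
joined = list(merge_join(left_rows, right_rows))
print(joined)
```

Only the current row from each side is held at any moment, so memory stays small, but the output cannot keep pace with a single input the way a row-by-row transformation can.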
22
One of these is not like the others
Non-Blocking
Data stream is processed as it’s received
Minimizes memory utilization
Generally (but not always) fast transformations
Includes: Conditional Split, Derived Column, Lookup, Slowly Changing Dimension
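A small Python sketch of "processed as it's received": each step below handles one row and passes it on without looking at any other row. The step names mirror the SSIS transformations loosely; the conditional split is simplified to a single-output filter, and `counted_source` and the column names are hypothetical.

```python
from typing import Callable, Iterable, Iterator


def counted_source(n: int, counter: dict) -> Iterator[dict]:
    """Hypothetical source that records how many rows have been read so far."""
    for i in range(1, n + 1):
        counter["read"] += 1
        yield {"id": i, "qty": i * 10}


def derived_column(rows: Iterable[dict]) -> Iterator[dict]:
    """Adds a computed column to each row independently of every other row."""
    for row in rows:
        yield {**row, "qty_doubled": row["qty"] * 2}


def conditional_split(
    rows: Iterable[dict], predicate: Callable[[dict], bool]
) -> Iterator[dict]:
    """Simplified to one output: passes along only the rows matching the predicate."""
    for row in rows:
        if predicate(row):
            yield row


counter = {"read": 0}
pipeline = conditional_split(
    derived_column(counted_source(1000, counter)),
    lambda r: r["id"] % 2 == 0,
)
first = next(pipeline)  # only as many source rows are read as are needed
print(first, counter["read"])
```

Here the first output row is available after two source rows have been read, no matter how many rows the source eventually produces.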
23
Video – Slowly Changing Dimension
24
Execution Results – SCD Transform
Time Elapsed: 3 minutes 46 seconds
Number of records: 166k
What happened?
Each record is queried against the DB
Inserts and updates occurring at the same time
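The pattern described in "What happened?" can be sketched in Python with an in-memory SQLite database. This is an illustration of row-by-row processing in general, not the internals of the SCD transformation; the table `dim_customer`, its columns, and the sample rows are hypothetical, and changes are treated as simple overwrites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [(1, "a@old.example"), (2, "b@example.com")],
)

stats = {"round_trips": 0}


def run(sql: str, params: tuple = ()) -> sqlite3.Cursor:
    """Executes one statement and counts it as one trip to the database."""
    stats["round_trips"] += 1
    return conn.execute(sql, params)


incoming = [(1, "a@new.example"), (2, "b@example.com"), (3, "c@example.com")]

for customer_id, email in incoming:
    # One SELECT per incoming row, followed by a single-row INSERT or UPDATE.
    existing = run(
        "SELECT email FROM dim_customer WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    if existing is None:
        run("INSERT INTO dim_customer VALUES (?, ?)", (customer_id, email))
    elif existing[0] != email:
        run(
            "UPDATE dim_customer SET email = ? WHERE customer_id = ?",
            (email, customer_id),
        )
conn.commit()
print(stats["round_trips"])
```

Three incoming rows cost five statements here. The number of statements grows with the number of rows, and the reads and writes are interleaved against the same table.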
25
Demos
26
Video – Non-Blocking Dim Package
27
Execution Results – Non-Blocking Dim
Time Elapsed: 18 seconds
Previously 3 minutes 46 seconds
Number of records: 166k
What happened?
Lookup retrieved all records
Updates moved out of data flow
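One way to picture "Lookup retrieved all records" and "Updates moved out of data flow" is the Python and SQLite sketch below: the dimension is read once into memory, new rows are inserted in bulk, and changed rows are written to a staging table and applied with a single set-based UPDATE afterwards. The table names `dim_customer` and `stg_update`, the columns, and the overwrite-in-place handling of changes are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [(1, "a@old.example"), (2, "b@example.com")],
)

incoming = [(1, "a@new.example"), (2, "b@example.com"), (3, "c@example.com")]

# One query loads the whole dimension into memory, like a fully cached lookup.
cache = dict(conn.execute("SELECT customer_id, email FROM dim_customer"))

inserts, updates = [], []
for customer_id, email in incoming:
    if customer_id not in cache:
        inserts.append((customer_id, email))   # new customer
    elif cache[customer_id] != email:
        updates.append((customer_id, email))   # changed attribute

# New rows are loaded in bulk.
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", inserts)

# Changed rows are staged, then applied in one set-based statement.
conn.execute("CREATE TEMP TABLE stg_update (customer_id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO stg_update VALUES (?, ?)", updates)
conn.execute(
    """
    UPDATE dim_customer
    SET email = (SELECT s.email FROM stg_update s
                 WHERE s.customer_id = dim_customer.customer_id)
    WHERE customer_id IN (SELECT customer_id FROM stg_update)
    """
)
conn.commit()

result = conn.execute(
    "SELECT customer_id, email FROM dim_customer ORDER BY customer_id"
).fetchall()
print(result)
```

The comparison against existing rows happens in memory, so the number of statements sent to the database no longer depends on how many rows arrive.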
28
Video – Non-Blocking Fact Package
29
Execution Results – Sort and Merge
Time Elapsed: 56 seconds
Previously 1 minute 52 seconds
Number of records: 2.67 million
Memory used: 48 MB
What happened?
Lookups loaded at the start
In-memory comparison
30
Just because it can doesn’t mean it should
What does SSIS do best?
Aggregate
Fuzzy Grouping
Fuzzy Lookup
Row Sampling
Sort
31
Clear the bottlenecks
32
Final Notes
Practice. Then practice more.
Test with larger data sets
Understand the larger system configurations
33
Questions
34
Thank You
Elle Stahura
Erin Dempster