Designing SSIS Packages for Performance


Designing SSIS Packages for Performance
Eleanor Stahura and Erin Dempster

Thank you Sponsors!
- Platinum Sponsor:
- Gold Sponsors:
- Visit the Sponsor Booths
- Lots of Raffle Prizes!
- Get your parking paid via Sponsor Bingo

PASSMN – News/Info
- Sponsors: Thanks to all our sponsors of 2018! We need sponsors for Nov/Dec 2018 and 2019!
- Special thanks to our annual sponsor:
- Board Member Elections in November/December: your chance to help out the MN SQL community!

November 5th Through November 9th
- Join the brightest data professionals focused on the Microsoft Data Platform!
- Pre-Conference Sessions: Monday/Tuesday
- Conference: Wednesday through Friday

SQLSaturday #796 – After Party
- Location: 4th Floor of Mall of America
- Time: 6:30 PM – 10 PM
- There will be drinks and appetizers as well as free game cards and bowling!
- Hang out with some new friends you’ve made.

The Presenters

Eleanor Stahura
- 6 years' experience
- Database, Data Warehouse & Report development
- SQL Server 2008 – 2016 (SSIS, SSAS, SSRS / Power BI)
- Current grad student at University of St. Thomas (MS Software Engineering)

Erin Dempster
- 15 years' experience
- Certified since 2004 (MCDBA on SQL Server 2000)
- Transactional and Analytical developer
- Application developer (VB 6 and C# .NET)
- Database Administrator (SQL 2008 – 2014)
- Current grad student at Dakota State University (MS Analytics)

Outline
- Performance in SSIS
- Different Types of Blocking
- Dimension ETL Optimization
- Fact ETL Optimization

Scenario
- Operations needs to be able to track inventory by day.
- Incremental inventory extracts are available to be consumed.
- New customers are coming in every day.
- Other customers are updating their contact information.
- Reports need to reflect the current customers and their attributes.

Scenario
- Build an SSIS package to run today
- …and tomorrow
- …and next month
- …and next year
- …and the data volume is growing every day
- …and other things are running on the server

Building Strong SSIS Packages
- More than just getting them to work

Why Does Performance Matter?
- Most obvious: faster = better
- Grows more happily (copes as the data volume grows)
- Makes less mess
- Plays better with others (shares the server with whatever else is running)

Start Thinking Performance

Blocking
- Describes the degree to which a single row of data can be processed independently of other rows. The less independent the rows, the more blocking the transformation.

Types of SSIS Blocking

Fully Blocking
- Requires entire data stream
- Increased memory usage
- Generally decreases performance
- Includes: Sort, Fuzzy Lookup, Aggregate

Video - Fully Blocking

Execution Results – Sort and Merge
- Time elapsed: 1 minute 52 seconds
- Number of records: 2.67 million
- Memory used: 650 MB
- What happened? 2 Sort transformations; all records stored in memory

Semi Blocking
- Doesn’t require the entire data stream, but new thread(s) are created to run asynchronously
- Includes: Merge Join, Pivot/Unpivot, Union All

Non-Blocking
- Data stream is processed as it’s received
- Minimizes memory utilization
- Generally (but not always) fast transformations
- Includes: Conditional Split, Derived Column, Lookup, Slowly Changing Dimension
- One of these is not like the others
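The three blocking categories above can be illustrated outside SSIS. The following is a minimal Python sketch, not SSIS code: generators stand in for data flow transformations, and the helper names (`counted`, `rows_read_before_first_output`) are hypothetical. It counts how many input rows each kind of transformation has to pull before it can emit its first output row.

```python
import heapq
from typing import Dict, Iterable, Iterator


def counted(rows: Iterable[int], stats: Dict[str, int]) -> Iterator[int]:
    """Pass rows through unchanged, counting how many have been pulled."""
    for row in rows:
        stats["read"] += 1
        yield row


def derived_column(rows: Iterable[int]) -> Iterator[int]:
    """Non-blocking: each row is transformed and emitted on its own."""
    for row in rows:
        yield row * 10


def merge_sorted(left: Iterable[int], right: Iterable[int]) -> Iterator[int]:
    """Semi-blocking: output starts early, but only once both inputs supply a row."""
    return heapq.merge(left, right)


def full_sort(rows: Iterable[int]) -> Iterator[int]:
    """Fully blocking: the whole stream is buffered before any row is emitted."""
    yield from sorted(rows)


def rows_read_before_first_output(transform, *sources) -> int:
    """Return how many input rows were consumed to produce one output row."""
    stats = {"read": 0}
    output = iter(transform(*(counted(source, stats) for source in sources)))
    next(output)
    return stats["read"]


left, right = [1, 3, 5, 7, 9], [2, 4, 6, 8, 10]
print("derived column:", rows_read_before_first_output(derived_column, left))
print("merge:", rows_read_before_first_output(merge_sorted, left, right))
print("sort:", rows_read_before_first_output(full_sort, left + right))
```

The derived-column stand-in needs a single row, the sort needs every row, and the merge falls in between. That ordering is the point of the sketch; the real SSIS engine works on buffers of rows rather than single rows.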

Video – Slowly Changing Dimension

Execution Results – SCD Transform
- Time elapsed: 3 minutes 46 seconds
- Number of records: 166k
- What happened? Each record is queried against the DB; inserts and updates occur at the same time

Demos

Video – Non-Blocking Dim Package

Execution Results – Non-Blocking Dim
- Time elapsed: 18 seconds (previously 3 minutes 46 seconds)
- Number of records: 166k
- What happened? Lookup retrieved all records; updates moved out of the data flow
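The two changes named in these results (a lookup that retrieves all records up front, and updates applied outside the data flow) can be sketched in Python with the standard `sqlite3` module. This is an illustration under assumptions, not the presenters' package: the table and column names are hypothetical, changes are treated as simple overwrites, and the staging table followed by one set-based UPDATE is one possible way to move updates out of the row-by-row path.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory dimension table with two existing customers."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, email TEXT)"
    )
    db.executemany(
        "INSERT INTO dim_customer VALUES (?, ?)",
        [(1, "a@old.example"), (2, "b@example.com")],
    )
    return db


def load_dimension(db: sqlite3.Connection, incoming):
    """Split incoming rows into inserts and updates using a cached lookup."""
    # One query loads the existing rows, instead of one query per incoming row.
    cache = dict(db.execute("SELECT customer_id, email FROM dim_customer"))

    inserts, updates = [], []
    for customer_id, email in incoming:
        if customer_id not in cache:
            inserts.append((customer_id, email))
        elif cache[customer_id] != email:
            updates.append((customer_id, email))

    # New rows are inserted in bulk.
    db.executemany("INSERT INTO dim_customer VALUES (?, ?)", inserts)

    # Changed rows are staged, then applied with a single set-based UPDATE.
    db.execute(
        "CREATE TEMP TABLE stage_update (customer_id INTEGER PRIMARY KEY, email TEXT)"
    )
    db.executemany("INSERT INTO stage_update VALUES (?, ?)", updates)
    db.execute(
        """
        UPDATE dim_customer
        SET email = (SELECT s.email FROM stage_update s
                     WHERE s.customer_id = dim_customer.customer_id)
        WHERE customer_id IN (SELECT customer_id FROM stage_update)
        """
    )
    db.execute("DROP TABLE stage_update")
    db.commit()
    return len(inserts), len(updates)
```

A call such as `load_dimension(make_db(), [(1, "a@new.example"), (3, "c@example.com")])` issues one SELECT for the lookup regardless of how many rows arrive, which is the contrast with the per-record queries of the SCD transform.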

Video – Non-Blocking Fact Package

Execution Results – Non-Blocking Fact (replacing Sort and Merge)
- Time elapsed: 56 seconds (previously 1 minute 52 seconds)
- Number of records: 2.67 million
- Memory used: 48 MB (previously 650 MB)
- What happened? Lookups loaded at the start; in-memory comparison
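The idea of loading lookups at the start and comparing in memory can be sketched as follows. This is a hedged Python illustration rather than the package shown in the demo: the column layout, the function names, and the `-1` key for unmatched rows are all assumptions. Only the small dimension lookup is held in memory, while fact rows stream through one at a time, so nothing has to be sorted.

```python
from typing import Dict, Iterable, Iterator, Tuple

FactRow = Tuple[str, str, int]      # (product_code, day, quantity)
KeyedRow = Tuple[int, str, int]     # (product_key, day, quantity)


def load_lookup(dim_rows: Iterable[Tuple[int, str]]) -> Dict[str, int]:
    """Build the in-memory lookup once: business key -> surrogate key."""
    return {business_key: surrogate_key for surrogate_key, business_key in dim_rows}


def resolve_keys(
    fact_rows: Iterable[FactRow],
    product_keys: Dict[str, int],
    unknown_key: int = -1,          # assumed placeholder for unmatched rows
) -> Iterator[KeyedRow]:
    """Stream fact rows, swapping each business key for its surrogate key."""
    for product_code, day, quantity in fact_rows:
        yield (product_keys.get(product_code, unknown_key), day, quantity)


# The lookup is loaded before any fact row is read.
product_keys = load_lookup([(10, "WIDGET"), (11, "GADGET")])

extract = [
    ("GADGET", "2018-10-06", 40),
    ("WIDGET", "2018-10-06", 15),
    ("GIZMO", "2018-10-06", 3),     # not in the dimension yet
]
for row in resolve_keys(extract, product_keys):
    print(row)
```

Because `resolve_keys` is a generator, memory use depends on the size of the lookup rather than on the number of fact rows, which is consistent with the drop in memory reported above.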

Just because it can doesn’t mean it should
What does SSIS do best?
- Aggregate
- Fuzzy Grouping
- Fuzzy Lookup
- Row Sampling
- Sort

Clear the bottlenecks

Final Notes
- Practice. Then practice more.
- Test with larger data sets
- Understand the larger system configurations

Questions

Thank You
- Elle Stahura: estahura@teamscs.com
- Erin Dempster: edempster@teamscs.com