Instrumenting, Monitoring and Auditing of SSIS ETL Solutions SQL Bits 2009 - Manchester Davide Mauri



EXEC sp_help 'Davide Mauri'
– MCDBA, MCAD, MCT
– Microsoft SQL Server MVP
– Has worked with SQL Server since version 6.5
– Has worked on BI since 2003
– President of UGISS (Italian SQL Server User Group)
– Solid Quality Mentors, Italian Subsidiary

Agenda
ETL Story
Logging SSIS in MS way
Workarounds
Logging SSIS in MY way
– The developer’s corner
Adding value to log data
© 2007 Solid Quality Mentors

ETL Story

ETL processes grow in complexity
Since packages won’t be run from BIDS in production, you need something that helps you to:
– Understand what went wrong when a package didn’t work as expected
  – And maybe this happens only at night…
– Monitor the performance of your package to forecast whether it will stay within a given time window

Logging SSIS in MS way: Nice things
Flexibility
– You can decide what to log and where to log it
– Many ready-to-use log providers are available
– It is available out of the box
  – Well, you have to remember to activate it

Logging SSIS in MS way: Not-so-nice things
Logging needs to be set up within the package
– If you need to change logging, you have to edit the package in Visual Studio
Too little information is given
– No variable values
– No expression results
– No information on the Data Flow
– Chains of packages (>=2) are not handled very well
  – Using Parent Package Variables to propagate the logging configuration (e.g. the log file path) causes problems
  – You simply lose information when there are more than 2 packages in the chain

Logging SSIS in MS way: Things that don’t work as expected
DTExec seems to let you control logging at runtime
– Unfortunately, you need a properly configured connection manager in advance

Logging SSIS in MS way: “Improved” things from 2005 to 2008
Some new features were added with SQL 2008
– SQLDumper
  – Far too much detailed information on the one hand, and again too little on the other

Logging SSIS in MS way: Conclusions
It doesn’t really help you understand what went wrong
– Too little information is given
Hey, I’d like to log the Data Flow too! Do I really have to do everything by hand?
– This can take a lot of time!
I want to change my logged data. How can I do it without having to open the package in BIDS and go through release-test-deploy?
– You can’t!

Logging SSIS in MS way: Workarounds
Use a specific task (Script, Custom or Execute SQL) before and after each task you want to instrument
Create an event handler for each event you want to log (e.g. PreExecute, PostExecute)
– Better still, use a tool to create SSIS templates and standardize them
  – Like MDDE (Metadata-driven ETL)

Logging SSIS in MS way: DEMO 1. The usual way

Logging SSIS in MY way: Learn from BIDS
Basically, I’d like to have all the information that BIDS gives you, but outside BIDS.
Now, if BIDS can, WE can
– No magic here, you just need to know the APIs!
  – It’s a little bit complex… but we’ll simplify things here
The key is the Execute method of the Package class
– In particular, the overload that takes an IDTSEvents interface parameter
  – Whose documentation is not very rich

Logging SSIS in MY way: Developer’s corner
IDTSEvents is implemented by the base class DefaultEvents
We have to create a custom event handler class deriving from DefaultEvents and then override all the default event handlers
Use an instance of the newly created class as a parameter for the Execute method of the Package object
– Now all events will be intercepted by our event handler!
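
The derive-and-override pattern described on this slide can be illustrated with a small, language-neutral model. The sketch below is Python, not the .NET SSIS API: `DefaultEvents` here is only a stand-in for the real base class, and `LoggingEvents`, `FakePackage` and the method names are hypothetical names chosen for illustration.

```python
class DefaultEvents:
    """Stand-in for the SSIS base class: every handler is a no-op."""

    def on_pre_execute(self, executable):
        pass

    def on_post_execute(self, executable):
        pass

    def on_error(self, source, description):
        pass


class LoggingEvents(DefaultEvents):
    """Custom handler: overrides the callbacks and records every event."""

    def __init__(self):
        self.records = []

    def on_pre_execute(self, executable):
        self.records.append(("PreExecute", executable))

    def on_post_execute(self, executable):
        self.records.append(("PostExecute", executable))

    def on_error(self, source, description):
        self.records.append(("Error", source, description))


class FakePackage:
    """Hypothetical package: runs its tasks and notifies the handler."""

    def __init__(self, tasks):
        self.tasks = tasks

    def execute(self, events):
        for name, action in self.tasks:
            events.on_pre_execute(name)
            try:
                action()
            except Exception as exc:
                events.on_error(name, str(exc))
            events.on_post_execute(name)


def failing_task():
    raise RuntimeError("source file not found")


handler = LoggingEvents()
package = FakePackage([("Load staging", lambda: None), ("Load facts", failing_task)])
package.execute(handler)  # the handler instance is passed to execute
for record in handler.records:
    print(record)
```

The point of the design is that the package itself is untouched: all instrumentation lives in the handler object handed to the execute call.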

Logging SSIS in MY way: Developer’s corner
The event handler methods can call a custom method to log data
– Beware! The SSIS runtime makes heavy use of threads
– We have to deal with the fact that our class is used by different threads at the same time
  – We have to be sure that race conditions cannot occur
  – We have to be fast, to avoid impacting performance too much
– Log the minimum for all events except errors
– Log everything we can for errors
  – They should never happen
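
One common way to meet both constraints (no race conditions, minimal time spent inside the callback) is to have the handlers only enqueue a record and let a single writer thread do the actual logging. The following is a minimal Python sketch of that idea, not DTLoggedExec's actual implementation; `EventLog` and its methods are hypothetical names.

```python
import queue
import threading


class EventLog:
    """Callbacks enqueue records; one writer thread persists them."""

    def __init__(self):
        self._queue = queue.Queue()  # thread-safe hand-off between threads
        self.lines = []              # touched only by the writer thread
        self._writer = threading.Thread(target=self._drain)
        self._writer.start()

    def log_event(self, event, source, details=None):
        # Log the minimum for normal events, everything for errors.
        if event != "OnError":
            details = None
        self._queue.put((event, source, details))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the writer
                break
            self.lines.append(item)

    def close(self):
        self._queue.put(None)
        self._writer.join()


log = EventLog()


def worker(task_id):
    # Simulates one runtime thread firing many events.
    for _ in range(1000):
        log.log_event("OnPostExecute", f"Task {task_id}", {"variables": "dropped"})


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

log.log_event("OnError", "Task 3", {"User::FileName": "sales.csv"})
log.close()
print(len(log.lines))
```

Because only the writer thread appends to `lines`, no explicit lock is needed around the log storage; the queue is the only shared structure.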

Logging SSIS in MY way: Developer’s corner
All containers raise events
Inside each event handler method we can access all the runtime information for that container
– Variables
– Connections
– Configurations
– Properties
  – And their expressions

Logging SSIS in MY way: Developer’s corner
Variables: use the Variables collection available in each container
Connections: use the Connections collection (of ConnectionManager objects) available in the Package class
Configurations: use the Configurations collection in the Package class
– The EnableConfigurations property also tells you whether a package will try to look for “default” configurations

Logging SSIS in MY way: Developer’s corner
Extracting properties is a bit tricky…
– First, we have to ask the container for its properties through the Properties collection of the IDTSPropertiesProvider interface
– For each property, we have to call GetValue on the property, passing as a parameter the object the property comes from (!!!)

Logging SSIS in MY way: Developer’s corner
Now, for the Control Flow, we’re done. What about Data Flows?
There is no specific native logging infrastructure...
...but BIDS is able to show us how many rows flow between components
– …so this information is available somewhere!
The Data Flow is able to generate events through the FireCustomEvent method

Logging SSIS in MY way: Developer’s corner
Custom events are described by the EventInfo class
– Every container has an EventInfos property (a collection of EventInfo)
The key event here is the “OnPipelineRowsSent” Data Flow custom event
– Here we have an array of objects that contains interesting things
  – For this event the array contains 8 entries

Logging SSIS in MY way: Developer’s corner
OnPipelineRowsSent payload
– Source Object (e.g. System.__ComObject)
– DataFlow Object ID (e.g. 140)
– DataFlow Object Name (e.g. OLE DB Source Output)
– Object ID (e.g. 134)
– Object Name (e.g. TransformationName)
– Input Object ID (e.g. 135)
– Input Object Name (e.g. Derived Column Input)
– Row Count (e.g. 744)
Not documented in EventInfo
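
Since the payload is a positional array, it helps to give the eight entries names as soon as the event arrives. Here is a small Python sketch that does so, using the order and example values listed on this slide; `RowsSent` and `parse_rows_sent` are hypothetical names, and the field names simply mirror the slide.

```python
from typing import NamedTuple


class RowsSent(NamedTuple):
    """Named view of the 8-entry OnPipelineRowsSent argument array."""
    source_object: object
    dataflow_object_id: int
    dataflow_object_name: str
    object_id: int
    object_name: str
    input_object_id: int
    input_object_name: str
    row_count: int


def parse_rows_sent(arguments):
    # The order is taken from the slide; reject anything with another shape.
    if len(arguments) != 8:
        raise ValueError(f"expected 8 entries, got {len(arguments)}")
    return RowsSent(*arguments)


# Example values from the slide; object() stands in for the COM source object.
payload = [object(), 140, "OLE DB Source Output", 134,
           "TransformationName", 135, "Derived Column Input", 744]
event = parse_rows_sent(payload)
print(event.dataflow_object_name, "->", event.input_object_name, event.row_count)
```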

Logging SSIS in MY way: Developer’s corner
So, by filtering on custom events we’re able to profile the entire Data Flow!
– On a per-buffer basis
We can also count how many times a Data Flow has been invoked when it is placed inside a For Loop or Foreach Loop container
– Together with the knowledge of variable values, this tells us the impact of each iteration
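
As a rough illustration of what such profiling amounts to, the Python sketch below aggregates a stream of already-captured events: it sums per-buffer row counts for each path and counts how many times each Data Flow started. The event tuples, the task name "Load Sales" and the `profile` function are made up for the example and do not reflect the DTLoggedExec log format.

```python
from collections import Counter, defaultdict


def profile(events):
    """Summarise captured events: rows and buffers per path, runs per Data Flow."""
    rows = defaultdict(int)  # (data flow, path) -> total rows
    buffers = Counter()      # (data flow, path) -> number of buffers seen
    runs = Counter()         # data flow -> number of invocations
    for kind, data_flow, payload in events:
        if kind == "OnPreExecute":
            runs[data_flow] += 1
        elif kind == "OnPipelineRowsSent":
            path, row_count = payload
            rows[(data_flow, path)] += row_count
            buffers[(data_flow, path)] += 1
    return dict(rows), dict(buffers), dict(runs)


# Two loop iterations of the same Data Flow, three buffers in total.
events = [
    ("OnPreExecute", "Load Sales", None),
    ("OnPipelineRowsSent", "Load Sales", ("OLE DB Source Output", 744)),
    ("OnPipelineRowsSent", "Load Sales", ("OLE DB Source Output", 256)),
    ("OnPreExecute", "Load Sales", None),
    ("OnPipelineRowsSent", "Load Sales", ("OLE DB Source Output", 500)),
]
rows, buffers, runs = profile(events)
print(rows, buffers, runs)
```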

Logging SSIS in MY way: DEMO 2. Show me the code!

Logging SSIS in MY way: DTLoggedExec
The result is DTLoggedExec
– Current version beta
It logs everything needed
– Package version
– Variable values
– Property expressions
– Data Flow profile

Logging SSIS in MY way: DTLoggedExec
Additional features
– Handles long package chains correctly
– Supports the majority of DTExec options
– Pluggable architecture
  – Easy to create custom Log Providers
  – In the future it will also be possible to add custom Data Flow profilers
Supported platforms
– All platforms and architectures are supported
  – 2005, 2008
  – x86, x64, IA64

Logging SSIS in MY way: DEMO 3. Test it!

Logging SSIS in MY way: DTLoggedExec DB
Profiled data from Data Flow packages can be huge… better to put it into a database
DTLoggedExec comes with a full set of scripts and batch files to create a dedicated database and bulk load the data
– Currently only data profiled from Data Flows can be imported
– In the near future, data from the CSV log provider will also have its place here
  – 99% done, testing in progress

Logging SSIS in MY way: DEMO 4. Load profiled data

Logging SSIS in MY way: Performance?
Control Flow
– Performance is affected by the amount of logging you decide to have
Data Flow
– Impact of Data Flow profiling: < 5%
DTLoggedExec can be improved to have even less impact if needed
– Better buffering

Logging SSIS in MY way: Support
DTLoggedExec is released under a Creative Commons license
– Anyone can contribute
Official homepage
– Wiki with documentation
Download, source code, issues and forum

Adding value
“Native” auditing
– When, by whom and how was a row imported into my DWH?
Performance monitoring of a single package
– Or Data Flow
Performance monitoring over time
Easy to monitor discarded rows
– Very useful in dashboards
Monitor SLAs
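
Once run durations are stored, monitoring over time and SLA checks become simple queries over the log. A minimal Python sketch, assuming the log has already been reduced to (run date, duration in seconds) pairs: the dates, durations, the 30-minute SLA and the `sla_report` helper are all invented for illustration.

```python
from datetime import date


def sla_report(runs, sla_seconds):
    """Return the runs that exceeded the SLA and the average duration."""
    breaches = [(day, secs) for day, secs in runs if secs > sla_seconds]
    average = sum(secs for _, secs in runs) / len(runs)
    return breaches, average


# Illustrative nightly runs: (run date, duration in seconds).
runs = [
    (date(2009, 3, 1), 1500),
    (date(2009, 3, 2), 1620),
    (date(2009, 3, 3), 1910),
]
breaches, average = sla_report(runs, sla_seconds=1800)
print(breaches, round(average, 1))
```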

Logging SSIS in MY way: DEMO 5. Adding value

Questions & Answers
DTLoggedExec

References
DTLoggedExec
Jamie Thomson, “Custom Logging Using Event Handlers”
– http://blogs.conchango.com/jamiethomson/archive/2005/06/11/SSIS_3A00_-Custom-Logging-Using-Event-Handlers.aspx
Andy Leonard, “ETL Instrumentation”
– http://sqlblog.com/blogs/andy_leonard/archive/tags/ETL+Instrumentation/default.aspx

Thanks!
DTLoggedExec