Patterns and Best Practices in SSIS


1 Patterns and Best Practices in SSIS
or, how to keep your DBA happy with your crazy-ass ETL

2 It’s all about me... I’m a SpeakingMentor !

3 Obligatory LOTR reference…

4 Mantra
C# is *not* the only way to initialise a variable
C# is *not* the only way to move files
C# is *not* the only way to call a web service
C# scripts are opaque to the SSIS runtime
C#... <sigh>

5 Not a pipeline
Data does not get passed to components (cough)
Components manipulate blocks of data (true)

6 What you *think* happens….
Get data -> Replace nulls -> Conditional Split -> Merge -> Sort

7 What actually happens…
Get data -> Conditional Split -> Replace nulls -> Sort -> Merge

8 SSIS not being a pipeline

9 It’s all about speed…
There are two transformation types:
Synchronous – fastest (streaming and row-based)
Asynchronous – slower
And three ‘modes’:
Non-blocking
Semi-blocking
Full blocking
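The three modes can be illustrated outside SSIS with plain Python generators. This is only an analogy, not how the SSIS engine is implemented: the function names and the `rows_read_before_first_output` helper are made up for this sketch, and each "row" is just an integer.

```python
import heapq
from typing import Callable, Dict, Iterable, Iterator, List


def non_blocking(rows: Iterable[int]) -> Iterator[int]:
    """Emit each row as soon as it arrives (think Derived Column)."""
    for row in rows:
        yield row * 10


def semi_blocking(left: Iterable[int], right: Iterable[int]) -> Iterator[int]:
    """Merge two already-sorted inputs (think Merge).

    Output can start early, but only once a row is available from each input.
    """
    yield from heapq.merge(left, right)


def full_blocking(rows: Iterable[int]) -> Iterator[int]:
    """Read every input row before emitting the first one (think Sort)."""
    yield from sorted(rows)


def counted(rows: Iterable[int], counter: Dict[str, int]) -> Iterator[int]:
    """Wrap a source so we can see how many rows a transform has pulled."""
    for row in rows:
        counter["read"] += 1
        yield row


def rows_read_before_first_output(
    transform: Callable[[Iterable[int]], Iterator[int]], data: List[int]
) -> int:
    """Return how many input rows were consumed to produce one output row."""
    counter = {"read": 0}
    next(transform(counted(data, counter)))
    return counter["read"]


if __name__ == "__main__":
    data = [3, 1, 2]
    print("non-blocking read:", rows_read_before_first_output(non_blocking, data))
    print("full blocking read:", rows_read_before_first_output(full_blocking, data))
    print("semi-blocking merge:", list(semi_blocking([1, 4], [2, 3])))
```

The streaming transform needs one input row to produce its first output row, while the sort has to drain the whole input first. That waiting is the cost the 'blocking' label refers to.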

10 Non-blocking synchronous streaming transforms
Audit, Cache Transform, Character Map, Conditional Split, Copy Column, Data Conversion, Derived Column, Multicast, Percent Sampling, Row Count, Lookup

11 Non-blocking synchronous row transforms
DQS Cleansing, Export Column, Import Column, OLE DB Command, Script Task, SCD, Lookup

12 Wait, there’s two ‘Lookup’s ?
Lookups are non-blocking streaming transforms when the ‘Full Cache’ option is used for the lookup data
Using the ‘Partial Cache’ or ‘No Cache’ option makes the Lookup a row-based transform, which is necessarily slower
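Why the cache mode matters can be sketched in Python by counting round trips to the reference data. Everything here is hypothetical and simplified: `ReferenceTable` stands in for the lookup's reference query, and the partial-cache version ignores the memory limits and eviction that a real partial cache would have.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Row = Tuple[str, Optional[str]]


class ReferenceTable:
    """Stand-in for the reference data; counts how often it is queried."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)
        self.queries = 0

    def fetch_all(self) -> Dict[str, str]:
        self.queries += 1
        return dict(self._data)

    def fetch_one(self, key: str) -> Optional[str]:
        self.queries += 1
        return self._data.get(key)


def lookup_full_cache(keys: Iterable[str], ref: ReferenceTable) -> Iterator[Row]:
    """Load the reference data once, then match rows in memory."""
    cache = ref.fetch_all()
    for key in keys:
        yield key, cache.get(key)


def lookup_no_cache(keys: Iterable[str], ref: ReferenceTable) -> Iterator[Row]:
    """Query the reference data for every incoming row."""
    for key in keys:
        yield key, ref.fetch_one(key)


def lookup_partial_cache(keys: Iterable[str], ref: ReferenceTable) -> Iterator[Row]:
    """Query only for keys not seen before (simplified: unbounded cache)."""
    cache: Dict[str, Optional[str]] = {}
    for key in keys:
        if key not in cache:
            cache[key] = ref.fetch_one(key)
        yield key, cache[key]


def run(lookup, keys: List[str], data: Dict[str, str]) -> Tuple[List[Row], int]:
    """Run one lookup strategy and report its results and query count."""
    ref = ReferenceTable(data)
    results = list(lookup(keys, ref))
    return results, ref.queries
```

All three strategies return the same matches; only the number of queries against the reference data differs, which is where the row-based modes lose time.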

13 Semi-blocking asynchronous transforms
Data Mining, Merge, Merge Join, Pivot, Unpivot, Union All, Term Lookup

14 Full blocking asynchronous transforms
Aggregate, Fuzzy Grouping, Fuzzy Lookup, Row Sampling, Sort, Term Extraction, Script Task

15 Wait, there’s two ‘Script Task’s ??
Script tasks (in the data flow, strictly the Script Component) are non-blocking when they’re using an outside resource (i.e. not the data that you’re working on)
They become blocking when they collect a dataset before sending it on to a destination
To make one asynchronous, set the ‘SynchronousInputID’ property on the output to ‘None’

16 Large Data Sets
BufferTempStoragePath
BLOBTempStoragePath
These Data Flow Task properties can either be set in your package template, or injected into the .dtsx

17 Really ? You can just inject stuff ?
Yes
Find and replace the default entries with your custom requirement… any decent text editor will do.
BLOBTempStoragePath="F:\astDisk\WithLoadsOfSpace_temp"
bufferTempStoragePath="F:\astDisk\WithLoadsOfSpace_temp"
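If a text editor is not an option (say, many packages at once), the same find-and-replace could be scripted. A minimal Python sketch, assuming both attributes already appear in the .dtsx text with exactly the names shown on the slide; `inject_temp_paths` is a made-up helper name, and the `<pipeline …>` element in the sample is only an illustration. Text without the attributes is returned unchanged.

```python
import re
from xml.sax.saxutils import escape

# Matches either attribute together with its current (possibly empty) value.
_TEMP_PATH_ATTR = re.compile(
    r'\b(BLOBTempStoragePath|bufferTempStoragePath)="[^"]*"'
)


def inject_temp_paths(dtsx_text: str, temp_path: str) -> str:
    """Return dtsx_text with both temp-storage attributes set to temp_path."""
    # Escape XML special characters so the result stays a valid attribute value.
    safe_path = escape(temp_path, {'"': "&quot;"})
    # A function replacement avoids backslashes in the path being read as
    # regex escape sequences.
    return _TEMP_PATH_ATTR.sub(
        lambda match: f'{match.group(1)}="{safe_path}"', dtsx_text
    )


if __name__ == "__main__":
    # Illustrative fragment only; a real package has far more around it.
    sample = '<pipeline BLOBTempStoragePath="" bufferTempStoragePath="" version="1">'
    print(inject_temp_paths(sample, r"F:\astDisk\WithLoadsOfSpace_temp"))
```

To apply it to a real package, read the file as text, pass it through the function, and write the result to a copy first so the original can be restored if the package fails to open.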

18 Things that make you go hmmm
Logging
Package Checkpoints
Location, location, location
Expressions
Parallel Operations
‘IsSorted=True’ property
Auditing
Bad Data

19 Here’s one I prepared earlier…
Time to dissect

20 Take-aways – You should know..
Your data
Your environment
Transform types
Ingest / Egress of data
What you are trying to achieve
When to give up and use C#

