Making Peace With SSIS: A Reluctant User’s Guide
Joshua Lynn
SQLGuy@PocketJoshua.com
My Favorite Resources
  Andyleonard.blog
  www.mssqltips.com
  www.timmitchell.net
  Google => Stackoverflow.com
Making Peace With SSIS: A Reluctant User’s Guide
Part I – Getting Started
Lessons from Robotics
  Engineering is choosing
  Every choice comes with pros & cons
  Failure is not an option, it’s a requirement
  Programmers are the 5th player
  You’re done playing when you deploy
Lessons from Robotics
  Trade-offs: Time, Ease, Weight, Cost, Speed, Upkeep
  Examples: NASA going to the Moon, SSIS choices
What is SSIS? SQL Server Integration Services
  Integration Services can extract and transform data from a wide variety of sources such as XML data files, flat files, and relational data sources, and then load the data into one or more destinations. - Microsoft
Ok, But What Is SSIS Really?
  Software tool for ETL and workflow
    ETL = Extraction, Transformation, and Loading of data
  Execution environment
    Where packages are run
    Server or standalone
  Development tools
  Included with SQL Server
    No additional license required
  The execution environment is independent of the SSIS server.
  If you own SQL Server, you own SSIS.
  An SSIS package can be run without an SSIS server or SQL Server, directly from Visual Studio or utilities.
  There are benefits to using SSIS as a server service, but there are also benefits to using it standalone.
Where Does It Fit in the SQL Server World?
  SSIS (as server)
  Data Sources & Destinations
  The diagram shows a traditional deployment with SSIS as a server.
  SSISDB lives on SQL Server and stores the package catalog & logging.
  Data sources and destinations can be different versions of SQL Server, or not SQL Server at all.
  The SSIS execution environment can interact with the OS: file system, executables, BAT & PS scripts.
  Yes, linked servers can make connections to other data sources, and T-SQL can do some of the same things.
Why Should You Care?
  A bridge between the SQL Server db engine and the outside world
  T-SQL can only do so much
  SSIS works differently than SQL (in memory, by rows not sets)
  You already own it
  Security & logging built in
  It extends what you can do with SQL Server out of the box
  It keeps getting better
    If you’ve tried it before, it may be better now
Why Is It So Annoying?
  Development: a mind of its own
  Deployment: easy if you know what to do, otherwise not so much
  Execution black box: getting better
  Complicated history of the product
    DTS (old, not to be confused with dtsx) vs SSIS
    Dev tools: BIDS / SSDT-BI / SSDT
  Documentation and books are weak
  “Can’t I use {X} to do the same thing?” - Yes
Why I’m the Wrong Person to Talk About SSIS
  I’m not an SSIS expert
  Often frustrated by SSIS
  Can do many things in T-SQL, often faster to develop and process
Why I’m the Right Person to Talk About SSIS
  I’m not an SSIS expert
  Often frustrated by SSIS
  Can do many things in T-SQL, often faster to develop and process
  A personal journey
  My goal is to get you to think about SSIS in a way that can be useful to you without having to clear a high threshold to make it work.
  Present at PSSUG, tell your journey
How I Think About SSIS
What Does It Do Better Than T-SQL?
  Output files
  Logistics
  Manipulate external resources
  Parallelism
How I Think About SSIS
  Third-party tools make it much better
  With Visual Studio it can be a stand-alone tool for ad-hoc tasks with a UI
    Like a SQL script but with more reach
  There is no requirement to use all of it
  Removes the need for xp_cmdshell or custom solutions using .NET CLR
  Allows integration between two worlds: T-SQL and the OS / external applications
  Security and logging
Why Did I Use SSIS?
  T-SQL is great for data, bad at interacting with the rest of the world
  SSIS bridged the T-SQL data manipulation world with external resources
Why Did I Use SSIS?
  Are there other solutions? Yes
  What SSIS offers:
    Security
    Logging
    Scheduling
    Data driven & parameters
    Extensible
    Purpose built for ETL
  Are they better? Maybe; as always, the answer is “it depends”
    Cost (buy vs build)
    Sustainability
Rules For Making Peace With SSIS How I came to terms with SSIS
Rules For Making Peace With SSIS
Use SSIS for what it’s good at
  ETL: data in/out
  Maintenance: sure, but there are other ways that may be better
  Workflow to bridge data and OS/app functions
  The Data Import Wizard is SSIS
  Maintenance Plans are SSIS
Rules For Making Peace With SSIS
Use what you need
  You don’t have to use all of it
  You don’t have to master all of it to gain a benefit
  Use the right tool for the right job, especially when there are multiple ways to do the same thing
Rules For Making Peace With SSIS
If there is a better way, then use it
  There are multiple ways to do what SSIS does, as well as multiple ways within SSIS.
  T-SQL and sprocs can do the heavy lifting of ETL just fine
  SSIS can also be used to orchestrate a process: the workflow, not just the data flow
Rules For Making Peace With SSIS Build static package first then make dynamic with variables
Rules For Making Peace With SSIS
Work with Visual Studio and SSDT, not against them
  The dev environment has a mind of its own sometimes
  Let it do the heavy lifting when you can
Rules For Making Peace With SSIS
Good code is good code
  Build the ETL package first
  Build a workflow package to call the ETL package
  Use parameters and variables
SSIS Development
Development Essentials
  Install SSDT for Visual Studio
  Configure VS for SSIS development
  Anatomy of an SSIS solution
How To Get Started
  SQL Server Dev Edition
  SSMS – SQL Server Management Studio
  Visual Studio 2015 or 2017 Community Edition
  SSDT – SQL Server Data Tools (stand-alone installer)
  If something doesn’t work, uninstall and try again
  VS takes a long time to install
SSDT – SQL Server Data Tools
  SQL Server Data Tools is a modern development tool for building SQL Server relational databases, Azure SQL databases, Analysis Services (AS) data models, Integration Services (IS) packages, and Reporting Services (RS) reports. With SSDT, you can design and deploy any SQL Server content type with the same ease as you would develop an application in Visual Studio.
  For most users, SQL Server Data Tools (SSDT) is installed during Visual Studio installation. Installing SSDT using the Visual Studio installer adds the base SSDT functionality, so you still need to run the SSDT standalone installer to get AS, IS, and RS tools.
  SSDT is needed for the version of Visual Studio you are running
    SSDT for VS 2015 has a release number of 17.x
    SSDT for VS 2017 has a release number of 15.x
  Functionality varies a bit by VS version but more by the target server of the SSIS package
  https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt?view=sql-server-2017
SSDT & Visual Studio 2019
  SSIS is now installed from the Visual Studio Marketplace
  SSRS & SSAS have been installable from the VS Marketplace as of VS 2017
  https://techcommunity.microsoft.com/t5/SQL-Server-Integration-Services/New-Delivery-Model-for-SQL-Server-Data-Tools-in-Visual-Studio/ba-p/479289
Blue-backed labels are standard Visual Studio items; red-backed labels are SSIS-specific.
Visual Studio Layout
  To start over when the workspace gets messed up (and it will):
    Window menu
    Reset Window Layout
Where Is the SSIS Toolbox?!
  SSIS menu > SSIS Toolbox
  Control Flow tab > right-click > SSIS Toolbox from the context menu
  NOTE: Context changes the context menu
Context Changes Context Menu
  Control Flow: design surface, item
  Data Flow: design surface, item
  * Also true for Connection Managers
Anatomy of a SSIS Solution Solutions, Projects, Packages, Oh my!
Target Server vs Project Version
  Target Server = SSIS platform
    Where the package is deployed and executes
    Not the version of the server you are connecting to for data I/O
  ProjectVersion = XML schema version of the dtsx in use
    Can be upgraded but, to the best of my knowledge, not downgraded
    VS 2015 & 2017 may use the same ProjectVersion
  * Connection Managers have their own configuration for each type of data source/destination
Setting Target Server
  NOTE: Setting the Target Server changes functionality
  VS 2015 & 2017 with SSDT support development with backwards compatibility to SQL Server 2012
  This is for the SSIS runtime environment, not the connection
  Deployment to the runtime environment must match the Target Server version
  Target Server version affects functionality, tasks, and components
Anatomy of a Package, Project, and Dev Environment
  Control Flow
  Data Flow
  Parameters
  Variables
  Tasks
  Sources
  Destinations
  Connections
  Properties
  These are the SSIS concepts you need to understand to get started in development.
  Security and deployment are also needed.
Control Flow vs Data Flow
  Control Flow = sequence of workflow
    A simple ETL project would have one Control Flow task: a Data Flow task that extracts data to a file destination
    Double-click the Data Flow Task to open the Data Flow tab for that task
    OR switch to the Data Flow tab and select the Data Flow from the dropdown
  Data Flow = process of ETL for data
Sources, Destinations, & Connections
  Connections are how SSIS interacts with external data
  Sources & Destinations are SSIS’s adapters to the connection
  Example: Flat File connection (defines the structure of the flat file)
    Flat File Source to read from the file
    Flat File Destination to write to a file
Sources, Destinations, & Connections
  The connection defines what SSIS can do with the data
  Example: SQLOLEDB connection (like a SQL client connection string)
    The connection allows commands
    OLE DB sources and destinations can execute T-SQL, sprocs, or CRUD data from a table, but behave differently as used in the data flow
  Not to be confused with
Dev Tip
  If you start with a source that has data, SSIS will pick the SSIS data types
  If you start with an empty destination, SSIS will convert to the destination type for you
    True for SQL source but not destination
    True for text-based files
    Semi-true for Excel source and destination
Demo 1: The Useless Demo
Extract from db to flat file
SSIS Demo: 01 - The useless Demo
  Steps:
    New SSIS project
    Add Data Flow Task
    Add OLE DB Source
      New Connection Manager: Server Name = XPS15-JOSHUA, Database = SSIS_Demo
    Add Flat File Destination
      Show without connecting workflow
      Connect the output of the source
      Double-click the Flat File Destination
      Add new connection manager: C:\Temp\SSIS Demo\My First Output File From SSIS.txt
      Check “Column names in the first data row”
      On the Columns screen, set the column delimiter to tab
      OK out
    Show target folder
    Run package
  Why is it useless?
    All data type handling was implicit and simple
    Output matched source exactly
    Static references
    No deployment
SSIS Data Types
  SSIS uses its own data types regardless of source or destination
  Converting types is sometimes necessary as part of the data flow
  https://docs.microsoft.com/en-us/sql/integration-services/data-flow/integration-services-data-types?view=sql-server-2017
  http://wiki.melissadata.com/index.php?title=FAQ%3ASSIS%3AData_Type_Conversions
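As a rough illustration of why conversions show up in a data flow, here is a minimal Python sketch of a type lookup. The mapping is a small, commonly cited subset and not the full table (the Microsoft data types page on this slide is the authority); the helper name `needs_conversion` is hypothetical and not part of SSIS.

```python
# Illustrative subset only: SSIS data flow types and the SQL Server
# types they commonly line up with. Not an exhaustive or official table.
SSIS_TO_SQL = {
    "DT_I4": "int",
    "DT_I8": "bigint",
    "DT_R8": "float",
    "DT_BOOL": "bit",
    "DT_STR": "varchar",
    "DT_WSTR": "nvarchar",
    "DT_DBDATE": "date",
    "DT_DBTIMESTAMP": "datetime",
    "DT_NUMERIC": "decimal",
    "DT_GUID": "uniqueidentifier",
}


def needs_conversion(ssis_type: str, destination_sql_type: str) -> bool:
    """Hypothetical helper: True when the SSIS column type does not line up
    with the destination column type according to the lookup above."""
    expected = SSIS_TO_SQL.get(ssis_type)
    return expected != destination_sql_type.lower()


# A flat file column read as DT_STR headed for an int column needs a conversion step.
print(needs_conversion("DT_STR", "int"))        # True
print(needs_conversion("DT_WSTR", "NVARCHAR"))  # False
```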
Converting Data Demo
Import flat file (text) to db with data types
  SSIS Demo: 02 - IMP - NOAA Hourly TAB to Table
  SSIS Demo: 03 - IMP - NOAA Hourly TAB to Table Convert Data Type
  Notice:
    Setting the data type of the source
    Implicit vs explicit conversion
    Replacing vs adding columns
    Keeping the name for output
    Derived vs converted
    SQL Task vs SSIS (do you trust the data source?)
    Nulls as data in SSIS
Dev Tip
  If it can be named, name it with something meaningful
  {Also annotations, control flow labels, and groups}
  https://www.mssqltips.com/sqlservertip/3110/sql-server-integration-services-ssis-inline-documentation-methods/
SSIS Variables & Parameters
  Parameters
    External interface to the package
    Cannot be altered inside the package
    Good for passing in values for runtime
  Variables
    Internal to the package
    Values can be expressions evaluated at runtime
    Values can change at runtime via a task
SSIS Variables & Parameters
VS
T-SQL with Variables & Parameters {called from SSIS}
{Related but not the same}
SSIS Variables & Parameters
SSIS Demo: 04 - EXP - NOAA Hly to Flat File with Params.dtsx
  Notice:
    Parameters mapped to variables
    Parameters set as defaults but can be set externally
    Variables with expressions
    Possibility of using an Expression Task
    Use of “\\” in path names to escape \
    Expression for the connection string of the Flat File Destination
    Use of File System Task with a variable
  Ignore: SSIS calling T-SQL with parameters & variables
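The “\\” point is easy to trip over, so here is a minimal Python sketch of the escaping rule: inside a double-quoted SSIS expression string literal, a backslash is written as \\ and a double quote as \". The function names and the variable names `OutputFolder` and `FileName` are hypothetical, chosen only for illustration; they are not taken from the demo package.

```python
def to_ssis_string_literal(text: str) -> str:
    """Render text as a double-quoted SSIS expression string literal,
    doubling backslashes and escaping double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def connection_string_expression(folder_var: str, file_var: str) -> str:
    """Build the text of an expression that joins two user variables
    with a path separator (variable names are hypothetical)."""
    separator = to_ssis_string_literal("\\")
    return f"@[User::{folder_var}] + {separator} + @[User::{file_var}]"


# What you would type as a literal for the demo's output folder:
print(to_ssis_string_literal(r"C:\Temp\SSIS Demo"))
# What a connection string expression built from two variables could look like:
print(connection_string_expression("OutputFolder", "FileName"))
```

The first call prints the folder with every backslash doubled; the second prints an expression whose middle term is the literal `"\\"`, which SSIS evaluates to a single backslash.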
Dev Tip
  To duplicate a package within a project:
    Right-click on the package to duplicate
    Copy from the context menu
    Right-click on the SSIS Packages folder
    Paste from the context menu
  Notice the change to DTS:DTSID in the dtsx files
Creating Templates
  Create package
  Save to: C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\CommonExtensions\Microsoft\SSIS\ProjectItems\DataTransformationProject\DataTransformationItems
  Restart Visual Studio
  Add New Item to the project
Reordering & Renaming Output
  Break the connection from the source to the next component
  Clear the output columns
  Add back and alias the output from the source in order
  Build a new destination
  Fix the broken bits in between
  Update properties
  {This is one way but not the only way}
  SSIS Demo: 05 - Reorder Output (Copy 04)
  Check the connection string
Importing Nulls to db
  3 things for importing nulls:
    Set the data source to treat missing data as NULL, not the default of the data type
    Preserve NULL in any data conversions or derived fields
    Set the db destination to treat NULLs as NULLs and not the default
  07 - IMP - NOAA Hourly CSV with Null Handeling
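To make the first of the three points concrete, here is a small Python sketch (not SSIS itself) of the difference between keeping a missing field as NULL and silently replacing it with the data type's default. The column names and the `retain_null` switch are hypothetical stand-ins for the source setting.

```python
import csv
import io
from typing import Optional


def parse_temperature(raw: str, retain_null: bool) -> Optional[float]:
    """Convert one text field; an empty field becomes None (NULL)
    or the type default 0.0, depending on the switch."""
    if raw.strip() == "":
        return None if retain_null else 0.0
    return float(raw)


# Two readings; the second one is missing in the file.
sample = "station,temp_c\nA1,21.5\nA2,\n"
rows = list(csv.DictReader(io.StringIO(sample)))

kept = [parse_temperature(r["temp_c"], retain_null=True) for r in rows]
defaulted = [parse_temperature(r["temp_c"], retain_null=False) for r in rows]

print(kept)       # [21.5, None]
print(defaulted)  # [21.5, 0.0]
```

The second result is the dangerous one: a missing reading has turned into a real-looking 0.0, and nothing downstream can tell the difference.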
Execute SQL With SQL Parameters & Variables
  Allows SQL to be called with variables from the SSIS runtime
  Use with T-SQL code or a stored procedure call
  Can be used for SQL data sources
  https://docs.microsoft.com/en-us/sql/integration-services/map-query-parameters-to-variables-in-an-execute-sql-task?view=sql-server-2014
Execute SQL Task With SQL Parameters
Execute SQL With SQL Parameters & Variables
04 - EXP - NOAA Hly to Flat File with Params
  Look at the Data Flow OLE DB data source
  Look at the relationship between T-SQL variables & SSIS variables
    INPUT and OUTPUT
    Result set
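With an OLE DB connection, the query uses ? as the parameter marker and the mapping is by position (parameter names 0, 1, 2, ...). Python's built-in sqlite3 module happens to use the same ? marker, so the sketch below uses it as an analogy for the mapping idea only; it is not how SSIS runs the query. The table, columns, and variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hourly (station TEXT, obs_hour INTEGER, temp_c REAL)")
conn.executemany(
    "INSERT INTO hourly VALUES (?, ?, ?)",
    [("A1", 0, 20.0), ("A1", 1, 21.5), ("B2", 0, 18.0)],
)

# Stand-ins for SSIS variables; the names are hypothetical.
ssis_variables = {"User::Station": "A1", "User::MinHour": 1}

# Ordered mapping, in the spirit of a parameter mapping list:
# position 0 feeds the first ?, position 1 feeds the second ?.
parameter_mapping = ["User::Station", "User::MinHour"]

query = (
    "SELECT station, obs_hour, temp_c FROM hourly "
    "WHERE station = ? AND obs_hour >= ?"
)
args = [ssis_variables[name] for name in parameter_mapping]
result = conn.execute(query, args).fetchall()
print(result)  # [('A1', 1, 21.5)]
```

Because the mapping is positional, swapping the two entries in `parameter_mapping` would silently bind the wrong values, which is the usual source of confusion with ? markers.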
Specifying Results from Stored Procedures
06 - EXP - NOAA Hly to Flat File with Stored Procedure Return
  WITH RESULT SETS syntax is needed when the returned data structure cannot be determined
  https://www.mssqltips.com/sqlservertip/2356/overview-of-with-result-sets-feature-of-sql-server-2012/
Conditional Flow - Tasks
  There is no IF, CASE, or other conditional task flow component
  There are logical flow controls built in
  08 - Controlling flow.dtsx
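Assuming the built-in logical flow controls here are precedence constraints (the connectors between tasks), the decision each one makes can be sketched in a few lines of Python. The function names are hypothetical; the option names mirror the constraint values (Success, Failure, Completion) and the four evaluation operations offered in the precedence constraint editor.

```python
def constraint_met(value: str, task_result: str) -> bool:
    """Completion passes regardless of outcome; otherwise the
    constraint value must match the upstream task's result."""
    if value == "Completion":
        return True
    return value == task_result


def should_run_next(evaluation: str, value: str, task_result: str,
                    expression: bool) -> bool:
    """Decide whether the downstream task runs, given the evaluation
    operation, the constraint value, the upstream result ("Success" or
    "Failure"), and the already-evaluated Boolean expression."""
    outcome = constraint_met(value, task_result)
    if evaluation == "Constraint":
        return outcome
    if evaluation == "Expression":
        return expression
    if evaluation == "ExpressionAndConstraint":
        return outcome and expression
    if evaluation == "ExpressionOrConstraint":
        return outcome or expression
    raise ValueError(f"unknown evaluation operation: {evaluation}")


# IF-like branching: run the next task only when the previous one
# succeeded AND a row-count check (the expression) came out true.
print(should_run_next("ExpressionAndConstraint", "Success", "Success", True))   # True
print(should_run_next("ExpressionAndConstraint", "Success", "Success", False))  # False
```

Two constraints leaving the same task with opposite expressions give the effect of an IF/ELSE without a dedicated conditional task.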
Controlling Flow - Data
  Conditional Split data flow component
    Default pathway
    Conditional, CASE-like additional pathways
  09 - IMP - NOAA Hourly TAB to UnPivot Table.dtsx
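The CASE-like behavior can be sketched in Python: conditions are checked in order, a row goes to the first output whose condition is true, and anything left over goes to the default output. This is an illustration of the routing logic only; the output names, column names, and thresholds are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Condition = Tuple[str, Callable[[Row], bool]]


def conditional_split(rows: List[Row], conditions: List[Condition],
                      default_output: str = "Default") -> Dict[str, List[Row]]:
    """Route each row to the first output whose condition is true,
    or to the default output when none match."""
    outputs: Dict[str, List[Row]] = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break  # first match wins, like CASE
        else:
            outputs[default_output].append(row)
    return outputs


rows = [
    {"station": "A1", "temp_c": -2.0},
    {"station": "A1", "temp_c": 35.0},
    {"station": "B2", "temp_c": 12.0},
]
conditions = [
    ("Freezing", lambda r: r["temp_c"] <= 0),
    ("Hot", lambda r: r["temp_c"] >= 30),
]
split = conditional_split(rows, conditions)
print({name: len(items) for name, items in split.items()})
# {'Freezing': 1, 'Hot': 1, 'Default': 1}
```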
Calling Sub-Packages
  Ability to run a package from another package
  Use parameters
Looping Containers
  Allow for dynamic variation
    For each member of a set
    Until a condition is met
  A bit more like programming
  10 - For Each Loop from Record set - EXP to many files.dtsx
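The “for each member of a set, export a file” pattern looks like this when written as ordinary code. It is a Python analogy for what the loop container does with a record set, not the demo package itself; the record layout and file naming are hypothetical.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def export_per_member(records: List[Dict[str, object]], out_dir: Path) -> List[Path]:
    """Write one tab-delimited file per distinct station: the outer loop
    plays the role of the loop container, the inner write the data flow."""
    members = sorted({str(r["station"]) for r in records})
    written = []
    for member in members:
        target = out_dir / f"hourly_{member}.txt"
        with target.open("w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t")
            writer.writerow(["station", "obs_hour", "temp_c"])
            for r in records:
                if r["station"] == member:
                    writer.writerow([r["station"], r["obs_hour"], r["temp_c"]])
        written.append(target)
    return written


records = [
    {"station": "A1", "obs_hour": 0, "temp_c": 20.0},
    {"station": "A1", "obs_hour": 1, "temp_c": 21.5},
    {"station": "B2", "obs_hour": 0, "temp_c": 18.0},
]

with tempfile.TemporaryDirectory() as tmp:
    files = export_per_member(records, Path(tmp))
    names = [f.name for f in files]
    line_counts = [len(f.read_text().splitlines()) for f in files]

print(names)        # ['hourly_A1.txt', 'hourly_B2.txt']
print(line_counts)  # [3, 2]
```

In SSIS terms, the current member would be held in a variable and the file name would come from an expression on the connection string, which is why the variables and expressions material comes first.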
Script Task & Component
  Good way to expand capabilities
  Access to C# and .NET without SQLCLR
  Script Component
    Source
    Destination
    Transformation
  11 - Script Transform - EXP - NOAA Hly to Flat File.dtsx
  https://www.purplefrogsystems.com/blog/2011/07/pattern-matching-in-ssis-using-regular-expressions-and-the-script-component/
Dev Tip: Debugging
  Errors and Warnings
  Output Window
  Execute package
    Play button
    Right-click package
  Set breakpoints
  Execute Task
  Doesn’t work in Data Flow
Dev Tip
  Inline activities are terrific in SSIS
    Conditionals
    UnPivot Transform
    Derived fields
      Can only replace a column if it is the same data type
      How to add a static field
    Variables
    Split
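For anyone who has not used an unpivot before, here is a minimal Python sketch of the reshaping it performs: one wide row with a column per hour becomes one row per hour. This shows the shape change only and is not the SSIS component; the column names (`h00`, `h01`, `obs_hour`, `temp_c`) are hypothetical.

```python
from typing import Any, Dict, List


def unpivot(rows: List[Dict[str, Any]], key_columns: List[str],
            value_columns: List[str], name_column: str = "obs_hour",
            value_column: str = "temp_c") -> List[Dict[str, Any]]:
    """Turn each wide row into one long row per value column, carrying
    the key columns along and recording which column the value came from."""
    long_rows = []
    for row in rows:
        for col in value_columns:
            long_row = {key: row[key] for key in key_columns}
            long_row[name_column] = col
            long_row[value_column] = row[col]
            long_rows.append(long_row)
    return long_rows


wide = [{"station": "A1", "h00": 20.0, "h01": 21.5}]
long_form = unpivot(wide, ["station"], ["h00", "h01"])
for item in long_form:
    print(item)
# {'station': 'A1', 'obs_hour': 'h00', 'temp_c': 20.0}
# {'station': 'A1', 'obs_hour': 'h01', 'temp_c': 21.5}
```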
Deployment
  Configurations
  Visual Studio direct
  From SSMS
  PowerShell
  Deployment Wizard {Native Domain Only}
  Build project -> .ISPAC file
    From SSMS
    PowerShell
    Sprocs & T-SQL