SQL Server Integration Services Deep Dive Warren Stevens-Baytopp BI Architect - GijimaAst

Integration Services: Why ETL Matters

- Merge data from heterogeneous data stores:
  - Text files
  - Mainframes
  - Spreadsheets
  - Multiple RDBMS
- Refresh data in data warehouses and data marts
- Cleanse data before loading to remove errors
- High-speed load of data into online transaction processing (OLTP) and online analytical processing (OLAP) databases
- Send status notifications on success/failure
- Build BI into a data transformation process without the need for redundant staging environments
- Automate data-administrative functions

SSIS Overview

- Data sources can be diverse, including custom or scripted adapters.
- Transformation components shape and modify data in many ways.
- Data is routed by rules or error conditions for cleansing and conforming.
- Flows can be as complex as your business rules, but highly concurrent.
- And finally, data can be loaded in parallel to many varied destinations.

So Where to Now…

Data Sources – Excel
- Common problem: not all data comes through correctly.
- By default, Excel determines each column's type using a "majority type" rule, so values of the minority type in a column can be lost.
- Overcome this by forcing a mixed type in the data connector.
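To see why values go missing, here is a rough Python simulation of a majority-type rule. It is not the Excel driver's actual algorithm: the function names (`cell_type`, `guess_column_type`, `read_column`), the sample size of 8 rows, and the sample column are all assumptions made for illustration. Values that do not match the guessed type come back as `None`, unless the column is forced to be read as text, which mimics the mixed-type setting.

```python
from collections import Counter

def cell_type(value):
    """Classify a raw cell as 'number' or 'text'."""
    return "number" if isinstance(value, (int, float)) else "text"

def guess_column_type(cells, sample_size=8):
    """Pick the most common type among the first sample_size cells (assumed rule)."""
    counts = Counter(cell_type(c) for c in cells[:sample_size])
    if not counts:
        return "text"  # empty column: fall back to text
    return counts.most_common(1)[0][0]

def read_column(cells, force_text=False):
    """Return the column as a typed reader would see it.

    Without force_text, cells that disagree with the majority type become None.
    With force_text, every cell is kept and returned as a string.
    """
    if force_text:
        return [str(c) for c in cells]
    col_type = guess_column_type(cells)
    return [c if cell_type(c) == col_type else None for c in cells]

# Hypothetical column: mostly numeric codes with two text codes mixed in.
column = [1001, 1002, 1003, "A-17", 1005, "B-02", 1007, 1008]

print(read_column(column))                   # the text codes are dropped
print(read_column(column, force_text=True))  # every value survives as text
```

The trade-off in the forced-text mode is that every value arrives as a string, so numeric columns need an explicit conversion later in the data flow.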

Excel Demo

Data Sources Continued

Data Sources – Verifying Connectivity / Availability
- ETL tasks run through some of the steps and then fail on connectivity issues.
- Why would you want to check for this?
- Use a Script Task to check before the main work starts.
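The idea of a pre-flight check can be sketched as follows. This is a language-neutral illustration in Python, not the Script Task code from the demo (which is not included in the transcript); the function names `can_connect` and `preflight` and the example host are hypothetical. It only tests that a TCP port accepts connections, which is a weaker check than actually logging in to the source.

```python
import socket

def can_connect(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def preflight(endpoints):
    """Return the (host, port) pairs that could not be reached."""
    return [(host, port) for host, port in endpoints if not can_connect(host, port)]

# Example usage (hypothetical server name; 1433 is the default SQL Server port):
# unreachable = preflight([("dbserver.example.local", 1433)])
# if unreachable:
#     raise SystemExit(f"Aborting before any work is done: {unreachable}")
```

Running a check like this as the first step means the package fails before any data has been moved, rather than halfway through a load.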

Scripting Demo

Scripting Demo: Let's Test It

Scripting Demo

Data Sources Continued

Data Sources – OLE DB Provider
- Selecting a table or view from the dropdown as a source. So what is the problem with this?
- Replace it with what?
  - SELECT * FROM [TABLENAME]: not much better, or is it?
  - SELECT [field list] FROM [TABLENAME]: lower resource usage

OLE DB Provider Demo

- If a table is selected, SSIS issues an OPENROWSET.
- If a SQL statement is used, SSIS issues sp_executesql.

Sourcing Data

- Common requirement: get all data from one table that does not exist in another.
  - Get all rows from a staging table where the business key is not in the dimension table.
- Conventional T-SQL

Sourcing Data: Conventional T-SQL

    INSERT INTO DIM_DATE
    SELECT s.*
    FROM STG_DATE s
    LEFT OUTER JOIN DIM_DATE d ON s.DateID = d.DateID
    WHERE d.DateID IS NULL

    INSERT INTO DIM_DATE
    SELECT s.*
    FROM STG_DATE s
    WHERE DateID NOT IN (SELECT DISTINCT DateID FROM DIM_DATE d)
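The two query shapes can be compared on a toy data set. The sketch below runs them in SQLite through Python rather than in SQL Server, and the `DateName` column and sample rows are assumptions made for illustration. It also shows one behavioural difference worth knowing: if the dimension key column contains a NULL, the NOT IN form returns no rows, while the LEFT OUTER JOIN form is unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STG_DATE (DateID INTEGER, DateName TEXT);
    CREATE TABLE DIM_DATE (DateID INTEGER, DateName TEXT);
    INSERT INTO STG_DATE VALUES (20070101, 'Mon'), (20070102, 'Tue'), (20070103, 'Wed');
    INSERT INTO DIM_DATE VALUES (20070101, 'Mon');
""")

# Same shape as the first statement: keep staging rows with no match.
LEFT_JOIN = """
    SELECT s.DateID FROM STG_DATE s
    LEFT OUTER JOIN DIM_DATE d ON s.DateID = d.DateID
    WHERE d.DateID IS NULL
    ORDER BY s.DateID
"""

# Same shape as the second statement: keep staging keys not in the dimension.
NOT_IN = """
    SELECT s.DateID FROM STG_DATE s
    WHERE s.DateID NOT IN (SELECT DateID FROM DIM_DATE)
    ORDER BY s.DateID
"""

def new_keys(sql):
    """Run a query and return the DateID values it selects."""
    return [row[0] for row in conn.execute(sql)]

left_join_keys = new_keys(LEFT_JOIN)
not_in_keys = new_keys(NOT_IN)
print(left_join_keys, not_in_keys)  # both forms agree on clean data

# Add a NULL key to the dimension and run both queries again.
conn.execute("INSERT INTO DIM_DATE VALUES (NULL, 'Unknown')")
left_join_keys_with_null = new_keys(LEFT_JOIN)
not_in_keys_with_null = new_keys(NOT_IN)
print(left_join_keys_with_null, not_in_keys_with_null)
```

The NULL behaviour follows from SQL's three-valued logic, so it is a reason to confirm that the business key is declared NOT NULL before relying on the NOT IN form.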

Sourcing Data

- Common requirement: get all data from one table that does not exist in another.
  - Get all rows from a staging table where the business key is not in the dimension table.
- Conventional T-SQL
- Using SSIS

Sourcing Data: Using SSIS

- Merge Join
  - Same as the first T-SQL statement
  - Requires a Sort and a Conditional Split
- Lookup
  - Uses the built-in SSIS functionality
  - Less coding
  - Uses the error output as the valid records
- Speed comparisons

Lookups

- Exact matching: want data that matches a specific field.
  - Normal usage of Lookup
- Range comparisons: want data that falls between two values.
  - The caching SQL statement
  - Mapping of parameters

Lookups – Range Comparisons
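A range lookup boils down to a parameterized query in which the incoming value is tested against a lower and an upper bound. The sketch below shows that shape using SQLite from Python; it is not the Lookup component's own configuration, and the `PriceBand` table, its columns, and the `lookup_band` helper are hypothetical names chosen for the example. The `?` placeholder plays the role of the mapped parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PriceBand (BandName TEXT, MinAmount INTEGER, MaxAmount INTEGER);
    INSERT INTO PriceBand VALUES
        ('Low', 0, 99),
        ('Medium', 100, 999),
        ('High', 1000, 999999);
""")

# The incoming column value is bound to the single ? parameter.
RANGE_LOOKUP = "SELECT BandName FROM PriceBand WHERE ? BETWEEN MinAmount AND MaxAmount"

def lookup_band(amount):
    """Return the band whose range contains amount, or None if no band matches."""
    row = conn.execute(RANGE_LOOKUP, (amount,)).fetchone()
    return row[0] if row else None

for amount in (50, 100, 5000, -1):
    print(amount, lookup_band(amount))
```

A value that falls outside every range returns no row, which corresponds to the no-match case that a lookup has to route somewhere.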

Date and Time Handling

Date formatting
- Construct a date string in the format "YYYYMMDD HHMISS"
- Get the month name
- Get the formatted month and year in the form mmm (yyyy)
- Create a file using a date in the form yyyy-mm-dd
- Create a file using a date in the form yyyy-mm-dd for yesterday's date
- A simple yyyymmdd format, with two ways of doing it

Date Formatting: YYYYMMDD HHMISS

    (DT_STR,4,1252)DATEPART("yyyy",OldDate)
    + RIGHT("0" + (DT_STR,2,1252)DATEPART("mm",OldDate),2)
    + RIGHT("0" + (DT_STR,2,1252)DATEPART("dd",OldDate),2)
    + " "
    + RIGHT("0" + (DT_STR,2,1252)DATEPART("hh",OldDate),2)
    + RIGHT("0" + (DT_STR,2,1252)DATEPART("mi",OldDate),2)
    + RIGHT("0" + (DT_STR,2,1252)DATEPART("ss",OldDate),2)

Date Formatting: Month Name

    (MONTH(NewDate) == 1 ? "January" :
     MONTH(NewDate) == 2 ? "February" :
     MONTH(NewDate) == 3 ? "March" :
     MONTH(NewDate) == 4 ? "April" :
     MONTH(NewDate) == 5 ? "May" :
     MONTH(NewDate) == 6 ? "June" :
     MONTH(NewDate) == 7 ? "July" :
     MONTH(NewDate) == 8 ? "August" :
     MONTH(NewDate) == 9 ? "September" :
     MONTH(NewDate) == 10 ? "October" :
     MONTH(NewDate) == 11 ? "November" :
     MONTH(NewDate) == 12 ? "December" : "InvalidMonth")

Date Formatting: mmm (yyyy)

    (MONTH(OldDate) == 1 ? "Jan" :
     MONTH(OldDate) == 2 ? "Feb" :
     MONTH(OldDate) == 3 ? "Mar" :
     MONTH(OldDate) == 4 ? "Apr" :
     MONTH(OldDate) == 5 ? "May" :
     MONTH(OldDate) == 6 ? "Jun" :
     MONTH(OldDate) == 7 ? "Jul" :
     MONTH(OldDate) == 8 ? "Aug" :
     MONTH(OldDate) == 9 ? "Sep" :
     MONTH(OldDate) == 10 ? "Oct" :
     MONTH(OldDate) == 11 ? "Nov" :
     MONTH(OldDate) == 12 ? "Dec" : "ERR")
    + " (" + (DT_WSTR,4)YEAR(OldDate) + ")"

Date Formatting: Text File with YYYY-MM-DD

    "C:\\Temp\\ErrorCodes\\"
    + (DT_WSTR,4)YEAR(NewDate) + "-"
    + RIGHT("0" + (DT_WSTR,2)MONTH(NewDate), 2) + "-"
    + RIGHT("0" + (DT_WSTR,2)DAY(NewDate), 2)
    + ".txt"

Date Formatting: Same, but for Yesterday

    "C:\\Temp\\ErrorCodes\\"
    + (DT_WSTR,4)YEAR(DATEADD("dd", -1, OldDate)) + "-"
    + RIGHT("0" + (DT_WSTR,2)MONTH(DATEADD("dd", -1, OldDate)), 2) + "-"
    + RIGHT("0" + (DT_WSTR,2)DAY(DATEADD("dd", -1, OldDate)), 2)
    + ".txt"

Note: the DATEADD function is applied in each portion of the expression, so that the year, month, and day all refer to yesterday.
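Why the offset has to be applied to every part becomes clear at a month boundary. The following Python sketch mirrors the file-name logic for illustration only; `error_file_path` and its parameters are hypothetical names, and the folder is the one used in the expression. Subtracting the day first and then formatting the whole date is the equivalent of wrapping each part in the same date arithmetic.

```python
from datetime import date, timedelta

def error_file_path(run_date, days_back=0, folder="C:\\Temp\\ErrorCodes\\"):
    """Build <folder>yyyy-mm-dd.txt for the date days_back days before run_date."""
    target = run_date - timedelta(days=days_back)  # shift the whole date first
    return folder + target.strftime("%Y-%m-%d") + ".txt"

# On 1 March 2007, yesterday's file belongs to February, not to "March 0".
print(error_file_path(date(2007, 3, 1)))
print(error_file_path(date(2007, 3, 1), days_back=1))
```

Subtracting 1 from the day number alone would produce a day of 0 on the first of the month, and the month and year would still be wrong.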

Date Formatting: A Simple YYYYMMDD

    (DT_WSTR,4)YEAR(OldDate)
    + RIGHT("0" + (DT_WSTR,2)MONTH(OldDate), 2)
    + RIGHT("0" + (DT_WSTR,2)DAY(OldDate), 2)

OR

    (DT_WSTR,8)((YEAR(OldDate) * 10000) + (MONTH(OldDate) * 100) + DAY(OldDate))
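The two approaches give the same result because multiplying the year by 10000 and the month by 100 shifts them into the right decimal positions, which pads the month and day implicitly. A small Python sketch of both, for illustration (the function names `yyyymmdd_concat` and `yyyymmdd_arith` and the sample date are hypothetical, and it assumes a four-digit year):

```python
from datetime import datetime

def yyyymmdd_concat(d):
    """String approach: pad month and day with a leading zero, keep the last two characters."""
    return str(d.year) + ("0" + str(d.month))[-2:] + ("0" + str(d.day))[-2:]

def yyyymmdd_arith(d):
    """Arithmetic approach: build one integer, then convert it to a string once."""
    return str(d.year * 10000 + d.month * 100 + d.day)

sample = datetime(2007, 3, 5)
print(yyyymmdd_concat(sample))  # 20070305
print(yyyymmdd_arith(sample))   # 20070305
```

For 5 March 2007 the arithmetic is 2007 * 10000 + 3 * 100 + 5 = 20070000 + 300 + 5 = 20070305.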

Some Performance Tuning Tips

- Only select the columns that you need.
- Use a SQL Server Destination instead of an OLE DB Destination (this requires the package to run on the same machine as the SQL Server instance).
- If using an OLE DB Destination, use the "table or view – fast load" data access mode.
- Use standardized naming conventions.
- Where possible, filter your data in the source adapter rather than using a Conditional Split transform component.
- LOOKUP components will generally work quicker than MERGE JOIN components where the two can be used for the same task.
- Use caching in your LOOKUP components where possible. It makes them quicker. Just watch that you are not grabbing too many resources.
- Use Sequence containers to organize package structure into logical units of work.

For More Information…

- Visit TechNet
- Learn more about SSIS on MSDN
- Whitepapers and downloads of custom transformations: http://download.microsoft.com
- Jamie Thomson's Blog
- Donald Farmer's Blog

Thank you to our Partners for their support of TechDays 2007