Easy ETL with Andrzej Kukuła – Marcin Szeliga –

Agenda  What’s ETL  SSIS – pros and cons  Example case  Traditional approach  Novel approach  Biml and BimlScript  Benefits

ETL  Extract, Transform, Load  Data extraction from OLTP systems, denormalization, conversion, modelling and loading into a Data Warehouse, Operational Data Store or Data Mart  Also ELT, ETLT  Performance reasons  Historic data handling (SCD)  Better analysis capabilities
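
To make the three steps concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is not from the talk: the table names (`customers`, `orders`, `dw_sales`) and the sample rows are hypothetical, and a real SSIS load would look quite different. It only shows extraction from normalized tables, a small conversion, and a load into a denormalized target.

```python
import sqlite3

def build_demo_oltp():
    """Create a tiny, hypothetical OLTP schema plus an empty warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount_cents INTEGER);
        CREATE TABLE dw_sales (order_id INTEGER, customer_name TEXT, amount REAL);
        INSERT INTO customers VALUES (1, ' Alice '), (2, 'Bob');
        INSERT INTO orders VALUES (1, 1, 1250), (2, 2, 800);
        """
    )
    return conn

def run_etl(conn):
    """Extract joined rows, transform them, load them; return the row count."""
    # Extract: read from the normalized source tables (denormalizing via a join).
    rows = conn.execute(
        "SELECT o.id, c.name, o.amount_cents "
        "FROM orders o JOIN customers c ON c.id = o.customer_id "
        "ORDER BY o.id"
    ).fetchall()
    # Transform: clean up names and convert cents to currency units.
    transformed = [
        (order_id, name.strip().upper(), cents / 100.0)
        for order_id, name, cents in rows
    ]
    # Load: write the flattened rows into the warehouse table.
    conn.executemany(
        "INSERT INTO dw_sales (order_id, customer_name, amount) VALUES (?, ?, ?)",
        transformed,
    )
    conn.commit()
    return len(transformed)
```

Calling `run_etl(build_demo_oltp())` loads two rows into `dw_sales`.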

SQL Server Integration Services  Well-known technology for ETL, integration, interfacing, data movement automation  First choice for staging/data loading process  Ability to perform sophisticated data manipulation  Runtime and Integration Services Catalog  Integrated with SQL Server  Lots of components  Very good performance

Problems with SSIS  Package design process in SSDT-BI  Manual  Not generic  Not developer-friendly  Doesn’t allow code reuse: repetitive work of implementing many similar packages  No support for version control  Frustrating, slow, boring, painful, error-prone, expensive  At a low level  DTSX is almost unreadable and unmodifiable

Problems with SSIS  Metadata management is difficult  Substantial development time is spent on solving dreaded metadata issues  Is this something we should really focus on?

More problems with SSIS  Automatic generation of packages is difficult  DTS API is not trivial  EzAPI ditto, and it is no longer updated  Writing the target DTSX XML by hand is close to impossible in a reasonable time  Hand-written packages won’t run or won’t open in SSDT most of the time (the infamous message "Package Load error 0xC in CPackage::LoadFromXML")

Executive Problems with SSIS  The ETL process is  Slow  Expensive  Difficult to implement  Difficult to maintain  Difficult to adapt to changes  One new column introduced in one source table can take weeks or even months to implement

Example Package

Example DTSX  Is this a reasonable amount of code to accomplish the task?

What if…? …we changed the rules of the game and, instead of creating SSIS packages by hand, just gave the computer recipes for making the packages for us?  Recipe language that is easy to learn and use  With smart default values and default behavior  Able to use a programming language to make recipes more dynamic and easier to adapt to changes in databases and business requirements  No need to bother with metadata (most of the time)

What if…?  Have the full power of the .NET Framework available  Organize recipes into templates and libraries for reuse  Build SSIS packages automatically, in a repeatable way  Use version control to track code changes  Use CI and CD to automate deployments
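
The "recipe" idea can be sketched outside of Biml itself. The Python below is an illustration only, not the speakers' code: it loops over a hypothetical list of table names and emits Biml-style XML with one simple load package per table. The connection names `Source` and `Target`, the `dbo` schema and the package naming scheme are assumptions, and the connections are assumed to be declared elsewhere in the project. In a real project the same loop would normally be written as BimlScript (C# embedded in the Biml file).

```python
import xml.etree.ElementTree as ET

# XML namespace used by Biml documents.
BIML_NS = "http://schemas.varigence.com/biml.xsd"

def build_biml(tables, source="Source", target="Target"):
    """Return Biml text containing one load package per table name.

    `source` and `target` are hypothetical connection names.
    """
    root = ET.Element("Biml", {"xmlns": BIML_NS})
    packages = ET.SubElement(root, "Packages")
    for table in tables:
        package = ET.SubElement(
            packages, "Package",
            {"Name": f"Load_{table}", "ConstraintMode": "Linear"},
        )
        tasks = ET.SubElement(package, "Tasks")
        dataflow = ET.SubElement(tasks, "Dataflow", {"Name": f"DFT_{table}"})
        transformations = ET.SubElement(dataflow, "Transformations")
        # Source component: read every column of the table.
        src = ET.SubElement(
            transformations, "OleDbSource",
            {"Name": "Src", "ConnectionName": source},
        )
        ET.SubElement(src, "DirectInput").text = f"SELECT * FROM dbo.{table}"
        # Destination component: write to a table of the same name.
        dst = ET.SubElement(
            transformations, "OleDbDestination",
            {"Name": "Dst", "ConnectionName": target},
        )
        ET.SubElement(dst, "ExternalTableOutput", {"Table": f"dbo.{table}"})
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(build_biml(["Customer", "Product"]))
```

Adding a table to the list yields another package without any manual design work, and the generator is plain text that can be kept under version control.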

Enter Biml  Is this a reasonable amount of code to accomplish the task?

What’s Biml?  Business Intelligence Markup Language  A really easy XML-based language to describe BI assets  Connections  Tables, Views  SSIS Packages, SSIS Projects  Dimensions, Measure Groups, Cubes  and more…  Available straight in SSDT-BI for free!  All you need to begin is the BIDS Helper add-in  With Biml IntelliSense  Demo!

But wait, there’s more!  The ability to instrument Biml with C#/VB.NET code  Full power of the .NET Framework and all available libraries at your disposal  Supports including and calling other Biml files, and referencing external .NET assemblies  Even allows extending the built-in Biml .NET classes  Available in SSDT with BIDS Helper

How does it work?  Biml + BimlScript source files → C#/VB compiler → RootNode propagation → single, in-memory, expanded, compiled and merged Biml → code generator → …

RootNode  The model of all assets in the project (what we want the database and packages to look like)  Used within C#/VB code  RootNode can be read and written

BimlScript Features  "Layered" expansion based on "tiers"  Makes it easy to prepare resources before using them (e.g. fetch from a metadata store)  Tier n+1 sees everything in tier n in already expanded form  No limit on the number of tiers
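
The tier rule can be illustrated with a small Python model. This is a sketch of the idea only, not how the Biml compiler is implemented: the `root` dictionary is a stand-in for RootNode, and the two template functions are hypothetical. Templates are grouped by tier, tiers run in ascending order, and a template sees only what earlier tiers have already produced.

```python
import copy
from collections import defaultdict

def expand(templates):
    """Expand (tier, template) pairs in tier order and return the merged model."""
    root = {"tables": [], "packages": []}  # stand-in for RootNode
    by_tier = defaultdict(list)
    for tier, template in templates:
        by_tier[tier].append(template)
    for tier in sorted(by_tier):
        # Every template in this tier sees the same snapshot: the fully
        # expanded output of all lower tiers, and nothing from its own tier.
        snapshot = copy.deepcopy(root)
        outputs = [template(snapshot) for template in by_tier[tier]]
        for output in outputs:
            for kind, assets in output.items():
                root.setdefault(kind, []).extend(assets)
    return root

def declare_tables(root):
    """Lower-tier template: declare tables (e.g. fetched from a metadata store)."""
    return {"tables": ["Customer", "Product"]}

def build_packages(root):
    """Higher-tier template: one package per table that is already declared."""
    return {"packages": [f"Load_{name}" for name in root["tables"]]}
```

With `expand([(2, build_packages), (1, declare_tables)])` the packages are generated for both tables regardless of the order in which the templates are listed. If both templates were placed in the same tier, `build_packages` would see no tables and produce nothing, which is why resources are prepared in a lower tier.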

BimlScript Features  Just to name a few  Ability to dynamically fetch the database schema while generating packages  Easy implementation of custom metadata-based processing logic  Automatic data type conversions in the Data Flow Task  SCD handling when loading dimensions  SQL Server’s built-in extended properties have proven to be a good metadata store  Can be easily adapted to new/changed requirements  Spectacular effects need just a few lines of C# code  Demo!
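
As an illustration of metadata-based logic, the Python sketch below compares source and target column metadata and works out which columns would need a type conversion in the data flow. It is an assumption-laden example, not the speakers' implementation: the column dictionaries are hypothetical, and in BimlScript the metadata would be fetched from the database during generation and the result would typically decide whether a conversion step is emitted.

```python
def plan_conversions(source_cols, target_cols):
    """Return {column: (source_type, target_type)} for columns whose types differ.

    Both arguments map column names to type names. Columns that exist on
    only one side are ignored here.
    """
    plan = {}
    for name, target_type in target_cols.items():
        source_type = source_cols.get(name)
        if source_type is not None and source_type != target_type:
            plan[name] = (source_type, target_type)
    return plan

# Hypothetical metadata, as it might be read from the two databases.
source = {"Id": "int", "Name": "varchar(50)", "Created": "datetime"}
target = {"Id": "int", "Name": "nvarchar(50)", "Created": "datetime", "LoadDate": "datetime"}

conversions = plan_conversions(source, target)
```

Here only `Name` differs between the two sides, so only that column would get a conversion; when a column type changes in the source, regenerating the packages picks up the change without manual edits.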

More features  Transformers and "Frameworks"  Ability to change the way code is generated  DDL generation  MSBuild integration  CI with TFS Build, TeamCity immediately possible  Ability to "reverse engineer" DTSX packages into Biml

How to benefit from it?  BI Project Decision Makers  Faster initial delivery  Lower cost of change  Immediate Biml/Mist ROI  BI Architects  Reusable Design Patterns with Biml/BimlScript  One project compatible with SQL Server  Easily manage large BI code base, tasks, and issues using TFS  Plan for BI Continuous Integration/Continuous Delivery

How to benefit from it?  ETL developers  Fast Biml learning curve  Generate your DTSX faster with Biml instead of drag’n’drop  Embrace DRY in BI development  Use proper version control to manage your sources  BI consultants  Increased productivity  Build your Biml/BimlScript code library and reuse it in different projects  No runtime license costs for customers

Q&A

Sponsors