Implementing an ETL Solution for Incremental Data Load in Microsoft SQL Server
Ganesh Lohani, Sr. Data Analyst, Lockheed Martin, ganeshlohani@Hotmail.com



Business Case/Requirement
- Northwind operates 10 call centers in the United States and runs 24 hours a day, 7 days a week.
- Management wants to build a Data Warehouse as a single source of reporting data for all call centers.
- The report/dashboard should be refreshed every 15 minutes, which means new data must be loaded into the Data Warehouse every 15 minutes.
- Suppose you work for the company as a SQL/BI/ETL Developer and your manager asks you: What are the different ways to do an incremental data load using SQL and SSIS in a SQL Server environment?
- How do you respond to your manager?

Agenda/Learning Outcome
After attending this session, you will be able to answer the following question: What are the methods for doing an incremental data load in an on-premises SQL Server environment?
The following methods will be discussed in this session:
- Left Join
- Merge Statement
- Look Up Transformation
- Merge Join Transformation
- Slowly Changing Dimension (SCD)
- Change Data Capture (CDC)
- Demo of some of these methods

Let's Get Started
ETL (Extract, Transform, and Load) is an essential task for a Business Intelligence Developer, especially when working in a Data Warehouse environment.
[Diagram: a simple Business Intelligence environment with Source1, Source2, Source3, Operational Data Store (ODS), Data Warehouse/Data Mart, and Report/Dashboard]

What is ETL
- ETL is the process of moving data from point A to point B, with some kind of transformation applied between the two points.
- The terms Source, Transformation, and Destination are used in ETL language.
- SSIS is a tool used for the ETL process.
- Source: SQL table, Excel file, flat file
- Transformation: Look Up, Derived Column, Conditional Split
- Destination: SQL table, flat file (text and CSV), Excel
[Diagram: Source1 (SQL table), Source2 (Excel file), and Source3 (text file) with the Operational Data Store (ODS)]

Two Common Types of Data Load Pattern
Full Load
- A simple ETL process.
- Deletes the destination data and loads all source data into the destination.
- Typically used for the initial data load in a DW environment and for small data sets.
- Takes more time when the source data set is large.
- History is lost.
Incremental Load
- A relatively complex ETL process, but it requires less time to process data.
- Processes only new or updated records.
- Typically used in a DW environment for larger data sets.
- History is maintained.
- Most of the time, a date/time stamp is used to identify the incremental data.
In this session: Point A (Source) is the call center data and Point B (Destination) is the Data Warehouse.
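The timestamp-driven incremental load described above can be sketched as follows. This is a minimal illustration, not the presenter's demo: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs anywhere, and the table names (`source_calls`, `dw_calls`), the `modified_at` column, and the `incremental_load` helper are all hypothetical.

```python
import sqlite3

def incremental_load(conn, last_loaded):
    """Copy only source rows modified after the last_loaded watermark."""
    rows = conn.execute(
        "SELECT call_id, agent, modified_at FROM source_calls "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_loaded,),
    ).fetchall()
    # INSERT OR REPLACE is SQLite syntax: new keys are inserted, existing keys overwritten.
    conn.executemany(
        "INSERT OR REPLACE INTO dw_calls (call_id, agent, modified_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    # New watermark: the latest timestamp seen, or the old one if nothing changed.
    return rows[-1][2] if rows else last_loaded

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_calls (call_id INTEGER PRIMARY KEY, agent TEXT, modified_at TEXT);
    CREATE TABLE dw_calls     (call_id INTEGER PRIMARY KEY, agent TEXT, modified_at TEXT);
    INSERT INTO source_calls VALUES
        (1, 'Ann', '2024-01-01 09:00:00'),
        (2, 'Bob', '2024-01-01 09:10:00');
""")

# First run: everything is newer than the initial watermark.
watermark = incremental_load(conn, "1900-01-01 00:00:00")

# A later 15-minute window: one new call and one updated call.
conn.execute("INSERT INTO source_calls VALUES (3, 'Cy', '2024-01-01 09:20:00')")
conn.execute(
    "UPDATE source_calls SET agent = 'Bea', modified_at = '2024-01-01 09:25:00' "
    "WHERE call_id = 2"
)
watermark = incremental_load(conn, watermark)

dw_rows = conn.execute("SELECT call_id, agent FROM dw_calls ORDER BY call_id").fetchall()
```

The second run touches only calls 2 and 3; call 1 is never re-read. A real job would also persist the watermark between runs, for example in a small control table.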

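Incremental Data Load: Method 1
SQL Left Join:
- A left join returns all rows from the left table and the matching rows from the right table.
- With the source as the left table and the destination as the right table, source rows that have no match in the destination are the new rows to load.
- Use an Execute SQL Task in SSIS to implement this method.
- Source and destination tables must be on the same server.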
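As a rough sketch of the left join method, the snippet below inserts only the source rows that have no counterpart in the destination. It runs against an in-memory SQLite database through Python's `sqlite3` module purely for convenience; the `src_customer` and `dw_customer` tables are invented for illustration, and the `INSERT ... SELECT ... LEFT JOIN ... WHERE ... IS NULL` statement is plain SQL of the kind that would go inside an Execute SQL Task.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dw_customer  (customer_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO src_customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO dw_customer  VALUES (1, 'Ann');
""")

# Source rows (left) with no match in the destination (right) come back
# with NULL on the right side; those are the new rows to insert.
conn.execute("""
    INSERT INTO dw_customer (customer_id, name)
    SELECT s.customer_id, s.name
    FROM src_customer AS s
    LEFT JOIN dw_customer AS d ON d.customer_id = s.customer_id
    WHERE d.customer_id IS NULL
""")
conn.commit()

loaded_ids = [row[0] for row in
              conn.execute("SELECT customer_id FROM dw_customer ORDER BY customer_id")]
```

Running the statement a second time inserts nothing, because every source key now has a match. Note that this form only detects new rows; updated rows need a separate comparison.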

Incremental Data Load: Method 2
Merge Join Transformation:
- The Merge Join transformation lets us join data from more than one data source.
- It is similar to performing a join in T-SQL.
- Use this method when joining data from different data sources in the SSIS pipeline.
- Data must be sorted in order to use the Merge Join transformation.
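The sorting requirement is easier to see with a small model of how a merge join walks its inputs. The following Python sketch is not SSIS code; it is a hypothetical `merge_join` function that assumes both inputs are sorted by a unique key, and it shows how the joined output can be split into new and changed rows.

```python
def merge_join(left, right):
    """Full outer merge join of two lists of (key, value) sorted by unique key.

    Yields (key, left_value, right_value); a missing side is None.
    """
    i = j = 0
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk == rk:            # key on both sides
            yield lk, lv, rv
            i += 1
            j += 1
        elif lk < rk:           # key only on the left
            yield lk, lv, None
            i += 1
        else:                   # key only on the right
            yield rk, None, rv
            j += 1
    # Whatever remains on either side has no partner.
    for lk, lv in left[i:]:
        yield lk, lv, None
    for rk, rv in right[j:]:
        yield rk, None, rv

source = [(1, "Ann"), (2, "Bea"), (4, "Dee")]        # sorted by key
destination = [(1, "Ann"), (2, "Bob"), (3, "Cy")]    # sorted by key

joined = list(merge_join(source, destination))
new_rows = [(k, s) for k, s, d in joined if d is None]
changed_rows = [(k, s) for k, s, d in joined
                if s is not None and d is not None and s != d]
```

Because each input is read once, front to back, the comparison `lk < rk` is only meaningful when both inputs arrive in key order. That single pass is why unsorted inputs cannot be joined this way.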

Incremental Data Load: Method 3
SQL Merge Statement:
A combination of three SQL statements: INSERT, UPDATE, and DELETE.
- INSERT: the row exists in the source table but not in the target table.
- UPDATE: the source row matches a target row on the primary key, but at least one other column differs.
- DELETE: the row exists in the target table but not in the source table.
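The three branches above can be illustrated with a runnable sketch. SQLite, used here through Python's `sqlite3` module so the example is self-contained, has no MERGE statement, so the sketch emulates the branches with three separate statements in one transaction; in T-SQL a single MERGE with `WHEN MATCHED`, `WHEN NOT MATCHED BY TARGET`, and `WHEN NOT MATCHED BY SOURCE` clauses would play the same role. The `src` and `tgt` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE tgt (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO src VALUES (1, 'Ann', 'Austin'), (2, 'Bob', 'Boston'), (4, 'Dee', 'Denver');
    INSERT INTO tgt VALUES (1, 'Ann', 'Austin'), (2, 'Bob', 'Berlin'), (3, 'Cy', 'Cairo');
""")

with conn:  # one transaction, so the three steps succeed or fail together
    # Matched on key and a non-key column differs: UPDATE.
    # (Plain <> ignores NULLs; nullable columns would need extra handling.)
    conn.execute("""
        UPDATE tgt
        SET name = (SELECT s.name FROM src AS s WHERE s.id = tgt.id),
            city = (SELECT s.city FROM src AS s WHERE s.id = tgt.id)
        WHERE EXISTS (
            SELECT 1 FROM src AS s
            WHERE s.id = tgt.id AND (s.name <> tgt.name OR s.city <> tgt.city)
        )
    """)
    # In source but not in target: INSERT.
    conn.execute("""
        INSERT INTO tgt (id, name, city)
        SELECT s.id, s.name, s.city FROM src AS s
        WHERE NOT EXISTS (SELECT 1 FROM tgt AS t WHERE t.id = s.id)
    """)
    # In target but not in source: DELETE.
    conn.execute(
        "DELETE FROM tgt WHERE NOT EXISTS (SELECT 1 FROM src AS s WHERE s.id = tgt.id)"
    )

merged = conn.execute("SELECT id, name, city FROM tgt ORDER BY id").fetchall()
```

After the run the target mirrors the source: row 2 is updated, row 4 is inserted, and row 3 is deleted. In a warehouse that must keep history, the DELETE branch is often left out.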

Incremental Data Load: Method 4
Look Up Transformation:
- The Lookup transformation performs lookups by joining data in input columns with columns in a reference dataset.
- The reference dataset can be a cache file, an existing table or view, a new table, or the result of an SQL query.
- If there is no matching entry in the reference dataset, no join occurs.
- If there are multiple matches in the reference table, the Lookup transformation returns only the first match returned by the lookup query.
- Useful for dimension data loads and small datasets.
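To make the lookup behavior concrete, here is a small Python model of it, offered as a sketch rather than a description of SSIS internals. The `lookup_split` function, the `dw_key` column, and the sample rows are all hypothetical; the reference rows are cached in a dictionary, only the first reference row per key is kept, and input rows are split into matched and unmatched sets.

```python
def lookup_split(input_rows, reference_rows, key):
    """Split input rows into (matched, unmatched) against a cached reference set."""
    cache = {}
    for ref in reference_rows:
        # Keep only the first reference row per key (first-match behavior).
        cache.setdefault(ref[key], ref)
    matched, unmatched = [], []
    for row in input_rows:
        ref = cache.get(row[key])
        if ref is None:
            unmatched.append(row)   # no match: candidate for INSERT
        else:
            # Match: enrich the input row with the looked-up column.
            matched.append({**row, "dw_key": ref["dw_key"]})
    return matched, unmatched

reference = [
    {"customer_id": 1, "dw_key": 101},
    {"customer_id": 2, "dw_key": 102},
    {"customer_id": 2, "dw_key": 999},   # duplicate key: ignored after the first match
]
incoming = [
    {"customer_id": 2, "name": "Bob"},
    {"customer_id": 3, "name": "Cy"},
]

matched, unmatched = lookup_split(incoming, reference, "customer_id")
```

In an incremental load, the unmatched rows are the new records to insert, while the matched rows can be compared against the destination to find updates. The in-memory cache is also why this approach suits small reference sets.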

Incremental Data Load: Method 5
Slowly Changing Dimension (SCD) Transformation:
- Some data attributes change over time.
- Slowly Changing Dimensions (SCD) are dimensions that have data that slowly changes. For example, you may have a dimension in your database that tracks the sales records of your company's salespeople.
- The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name.
- The Type 2 method tracks historical data by creating multiple records in the dimension tables with separate keys.
- The Type 3 method tracks changes using separate columns.
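The difference between Type 1 and Type 2 can be shown on a toy dimension held in a Python list. This is only a sketch under assumed column names (`surrogate_key`, `business_key`, `start_date`, `end_date`); the helper functions are hypothetical, and Type 3 is not shown.

```python
from datetime import date

def apply_type1(dim_rows, business_key, changes):
    """Type 1: overwrite attributes in place; no history is kept."""
    for row in dim_rows:
        if row["business_key"] == business_key:
            row.update(changes)

def apply_type2(dim_rows, business_key, changes, effective):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    current = next(r for r in dim_rows
                   if r["business_key"] == business_key and r["end_date"] is None)
    current["end_date"] = effective
    new_row = {**current, **changes,
               "surrogate_key": max(r["surrogate_key"] for r in dim_rows) + 1,
               "start_date": effective, "end_date": None}
    dim_rows.append(new_row)

dim_salesperson = [{
    "surrogate_key": 1, "business_key": "E100", "name": "Jon Smith",
    "region": "East", "start_date": date(2020, 1, 1), "end_date": None,
}]

# Spelling correction: Type 1 overwrites the name, no new row.
apply_type1(dim_salesperson, "E100", {"name": "John Smith"})

# Region change that reporting must track: Type 2 adds a new version.
apply_type2(dim_salesperson, "E100", {"region": "West"}, date(2024, 6, 1))
```

After both calls the dimension holds two rows for the same salesperson: the expired East row and the current West row, each with its own surrogate key, so facts recorded before June 2024 still join to the East version.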

Incremental Data Load: Method 6
Change Data Capture:
- Another way to do an incremental data load, if the system supports Change Data Capture technology.
- Available in SQL Server 2008 and newer versions.
- Change Data Capture must be enabled on the database and on each table to be tracked.
- SQL Server Agent must be started and running for CDC to work correctly.
- Change Data Capture records insert, update, and delete activity in separate change tables.
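Once changes have been captured, the load step amounts to replaying them against the destination in order. The Python sketch below illustrates that idea only; it does not read a real change table. The list of dictionaries stands in for captured rows, the `apply_changes` helper and the `lsn`, `id`, and `row` field names are hypothetical, and the operation codes follow the values SQL Server documents for the `__$operation` column of a CDC change table.

```python
# Operation codes as documented for SQL Server CDC change tables (__$operation).
DELETE, INSERT, UPDATE_BEFORE, UPDATE_AFTER = 1, 2, 3, 4

def apply_changes(target, changes):
    """Replay captured changes, in log order, onto a target dict keyed by primary key."""
    for change in sorted(changes, key=lambda c: c["lsn"]):
        op, key = change["operation"], change["id"]
        if op == DELETE:
            target.pop(key, None)
        elif op in (INSERT, UPDATE_AFTER):
            target[key] = change["row"]
        # UPDATE_BEFORE carries the old values and needs no action here.
    return target

warehouse = {1: {"agent": "Ann"}, 2: {"agent": "Bob"}}

captured = [
    {"lsn": 3, "operation": DELETE,        "id": 1, "row": None},
    {"lsn": 1, "operation": INSERT,        "id": 3, "row": {"agent": "Cy"}},
    {"lsn": 2, "operation": UPDATE_BEFORE, "id": 2, "row": {"agent": "Bob"}},
    {"lsn": 2, "operation": UPDATE_AFTER,  "id": 2, "row": {"agent": "Bea"}},
]

apply_changes(warehouse, captured)
```

Sorting by the log sequence number matters because the same key can change several times within one load window, and only the ordered replay leaves the destination in the final state.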

Demo
1. Full Load
2. Incremental Load
- SQL Left Outer Join
- SSIS Merge Join Transformation
- SQL Merge Statement
- SSIS Look Up Transformation
- SSIS Slowly Changing Dimension
- Change Data Capture

Conclusion
What incremental data load methods are available using SQL and SSIS?
- Left Join
- Merge Statement
- Look Up Transformation
- Merge Join Transformation
- Slowly Changing Dimension (SCD)
- Change Data Capture
Question: What is the best way to do the incremental data load in SQL Server?
The answer is: it depends on the following factors:
- Data source
- Complexity of business rules and data transformations
- Project timeline
- Hardware/software environment (SQL Server, SSIS server)
- Company policy and procedure
- Developer skill sets

Q & A
Thank you for attending the session!
What feedback do you have for me?
Questions: ganeshlohani@Hotmail.com (Ganesh Lohani)