Incrementally Moving to the Cloud Using Biml


1 Incrementally Moving to the Cloud Using Biml
Scott Currie, Varigence

2 Agenda
Azure Data Factory: What is Azure Data Factory?; Scenarios for using ADF with Biml; ADF in Biml
Azure Feature Pack for SSIS
Cloud Data Movement Workflows (in general)

3 What is Azure Data Factory?
“Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data. Just like a manufacturing factory that runs equipment to take raw materials and transform them into finished goods, Data Factory orchestrates existing services that collect raw data and transform it into ready-to-use information.”

4 Ehhhhhhhh…
Think of Azure Data Factory (as it currently stands) as being like SQL Agent in the cloud.
Data movement is pretty useful.
Data transformation? Not so much.
Amount of configuration is rather heavy.
Most of the development must be done by hand with JSON.

5 Let’s take a look at… Azure Data Factory

6 Scenarios for Using ADF with Biml
Equivalent of simple staging for on-premises / cloud hybrid scenarios
Orchestration of AzureDW workflows
Orchestration and autogeneration of big data workflows: Hadoop, U-SQL
Failover and surge strategies

7 Azure Data Factory in Biml
Let’s take a look at… Azure Data Factory in Biml

8 Biml Workflow

9 Azure Feature Pack for SSIS
Connection Managers: Azure Storage Connection Manager, Azure Subscription Connection Manager
Tasks: Azure Blob Upload Task, Azure Blob Download Task, Azure HDInsight Hive Task, Azure HDInsight Pig Task, Azure HDInsight Create Cluster Task, Azure HDInsight Delete Cluster Task
Data Flow Components: Azure Blob Source, Azure Blob Destination
Azure Blob Enumerator: Foreach Azure Blob Enumerator

10 Cloud Data Movement Workflows (in general)
Migrating Data Options
Real World Migration Scenario
Migrating Data to the Cloud

11 Migrating Data Options (Azure)
Load Data with Azure Data Factory: Move data from OnPrem to Azure Storage Blob to SQL Data Warehouse
Load data with PolyBase in SQL Data Warehouse: Load data into Azure Storage Blob using AzCopy; Load data into SQL Data Warehouse using PolyBase
Load data with BCP in SQL Data Warehouse: Import data into SQL Data Warehouse using BCP

12 Loading Data
Diagram labels: Blob Storage, Data Factory, SQL DWH; Amazon S3 (bucket with objects), Snowball, AWS Import/Export, Amazon Redshift
Technical metadata
Technical metadata (ETL process metadata, back room metadata, transformation metadata) is a representation of the ETL process. It stores data mapping and transformations from source systems to the data warehouse and is mostly used by data warehouse developers, specialists and ETL modellers. Most commercial ETL applications provide a metadata repository with an integrated metadata management system to manage the ETL process definition. The definition of technical metadata is usually more complex than the business metadata and it sometimes involves multiple dependencies.
The technical metadata can be structured in the following way:
Source Database - or system definition. It can be a source system database, another data warehouse, file system, etc.
Target Database - Data Warehouse instance
Source Tables - one or more tables which are input to calculate a value of the field
Source Columns - one or more columns which are input to calculate a value of the field
Target Table - target DW table and column are always single in a metadata repository.
Target Column - target DW column
Transformation - the descriptive part of a metadata entry. It usually contains a lot of information, so it is important to use a common standard throughout the organisation to keep the data consistent.
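One way to picture the technical metadata structure listed above is as a single record per target column. The sketch below is illustrative only: the class name `TechnicalMetadataEntry`, its field names, and the sample values (`SalesDb`, `DW`, `dbo.Orders`, and so on) are assumptions made for this example and do not come from the presentation or from any particular metadata repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TechnicalMetadataEntry:
    """One mapping from source columns to a single target column (hypothetical shape)."""
    source_database: str        # source system, another warehouse, a file system, etc.
    target_database: str        # data warehouse instance
    source_tables: List[str]    # one or more input tables
    source_columns: List[str]   # one or more input columns
    target_table: str           # exactly one target table per entry
    target_column: str          # exactly one target column per entry
    transformation: str         # free-text description of the mapping rule


# Sample values are invented for illustration.
entry = TechnicalMetadataEntry(
    source_database="SalesDb",
    target_database="DW",
    source_tables=["dbo.Orders", "dbo.OrderLines"],
    source_columns=["Orders.OrderDate", "OrderLines.Quantity"],
    target_table="FactSales",
    target_column="TotalQuantity",
    transformation="SUM(Quantity) grouped by OrderDate",
)

print(f"{entry.target_table}.{entry.target_column} <- {', '.join(entry.source_columns)}")
```

Keeping the target side as single values while the source side is a list mirrors the rule that a target table and column are always single in the repository.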

13 Load data with PolyBase in SQL Data Warehouse
Blob Storage PolyBase SQL DWH

14 Load data with BCP in SQL Data Warehouse
bcp DimDate2 in C:\Temp\DimDate2.txt -S <Server Name> -d <Database Name> -U <Username> -P <password> -q -c -t ','
BCP, Data Migration Wizard, SQL DWH
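When the same bcp import has to be repeated for many tables, it can help to assemble the command from parameters. The sketch below is a minimal illustration, not the presenter's method: the helper name `build_bcp_import_command` is hypothetical, it uses only the flags that appear in the command on this slide, and the connection values are placeholders.

```python
import shlex
from typing import List


def build_bcp_import_command(table: str, data_file: str, server: str,
                             database: str, username: str, password: str,
                             field_terminator: str = ",") -> List[str]:
    """Return the argv list for a character-mode bcp import (hypothetical helper)."""
    return [
        "bcp", table, "in", data_file,
        "-S", server,
        "-d", database,
        "-U", username,
        "-P", password,
        "-q",                    # quoted identifiers
        "-c",                    # character data type
        "-t", field_terminator,  # field terminator
    ]


# Placeholder connection values; replace them before running anything.
command = build_bcp_import_command(
    table="DimDate2",
    data_file=r"C:\Temp\DimDate2.txt",
    server="<Server Name>",
    database="<Database Name>",
    username="<Username>",
    password="<password>",
)
print(shlex.join(command))
```

The list form can be handed to `subprocess.run(command, check=True)` on a machine where bcp is installed, which avoids shell quoting problems with the field terminator.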

15 Show me the code already!!!
Aren’t you the guy that live codes in presentations? Show me the code already!!!

16 I tried that, but it was slow

17 The Need for Speed
10 terabytes of data will take more than 10 days to transfer over a dedicated 100 Mbps connection.
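The transfer-time figure can be checked with a quick back-of-the-envelope calculation. The sketch below is an assumption-laden estimate rather than a measurement: the function name `transfer_days` and the 80% utilization figure are introduced here for illustration, and real throughput depends on protocol overhead and link contention.

```python
def transfer_days(size_bytes: float, link_bits_per_second: float,
                  utilization: float = 1.0) -> float:
    """Days needed to move size_bytes over a link at the given average utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    seconds = (size_bytes * 8) / (link_bits_per_second * utilization)
    return seconds / 86_400  # seconds per day


TEN_TB = 10 * 10**12         # decimal terabytes
TEN_TIB = 10 * 2**40         # binary tebibytes
LINK_100_MBPS = 100 * 10**6  # bits per second

print(f"10 TB,  100% utilization: {transfer_days(TEN_TB, LINK_100_MBPS):.2f} days")
print(f"10 TiB, 100% utilization: {transfer_days(TEN_TIB, LINK_100_MBPS):.2f} days")
print(f"10 TB,   80% utilization: {transfer_days(TEN_TB, LINK_100_MBPS, 0.8):.2f} days")
```

At a perfectly saturated link, 10 decimal terabytes takes roughly 9.3 days. Counting in binary units, or assuming the link is only about 80% usable, pushes the estimate past 10 days, which is consistent with the claim on the slide.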

18 Real World Example of Moving to the Cloud

19 Load Data with SSIS (ETL)
STUFF

20 Load Data with SSIS (ETL) without STUFF
AdoNet Postgres OleDb

21 Load Data Pattern without STUFF
UTF-8 10X Blob Storage PolyBase SQL DWH

22 New Load Data Pattern without STUFF

23 Questions?

24 Load Data with Biml Pattern without STUFF
UTF-8 Blob Storage PolyBase SQL DWH

