Published by Rosa Clark. Modified over 8 years ago.
1
Loading Data in Azure: Azure Data Factory
2
What is Azure Data Factory? Azure Data Factory is a cloud service that orchestrates, manages, and monitors the integration and transformation of structured and unstructured data from on-premises and cloud sources at scale.
3
What is Azure Data Factory? I’d call it PaaS
4
Most like: SSIS, DTS, Informatica. Works between other cloud services and on-prem. Sources, Destinations, Transformations.
5
Is it just SSIS in the Cloud?
6
Another kind of MVP: Minimum Viable Product. Big Data scenario. Emphasis on new tech, JSON based.
7
Where: portal.azure.com, New > Data + Analytics > Data Factory
8
Azure Pricing: Cloud/On Prem, Activities, Data Movement Units. https://azure.microsoft.com/en-us/pricing/details/data-factory/ https://azure.microsoft.com/en-us/documentation/articles/data-factory-copy-activity-performance/#cloud-data-movement-units
9
Data Movement Units: The cloud data movement unit is a measure that represents the power (combination of CPU, memory and network resource allocation) of a single unit in the Azure Data Factory service that is used to perform a cloud-to-cloud copy operation. It is configurable.
10
Three Main Elements: Linked Services (think Connection Managers); Datasets (schemas; think mapping of Data Flows); Pipelines (think Data Flows), which are made up of Activities (types of Data Flows).
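As a rough sketch of how these elements refer to each other by name, the Python below models a linked service, two datasets, and a pipeline as plain dictionaries and checks that every reference resolves. The names NorthWindStg and OnPremActorSrce are taken from the dataset example in this deck; AzureStg, AzureActorDest, CopyActorPipeline, and the helper unresolved_references are hypothetical, and the dictionaries are simplified stand-ins, not the full ADF JSON schema.

```python
# Simplified stand-ins for ADF definitions (NOT the full ADF JSON schema).
# "AzureStg", "AzureActorDest" and "CopyActorPipeline" are hypothetical names.
linked_services = {
    "NorthWindStg": {"kind": "on-prem SQL Server"},
    "AzureStg": {"kind": "cloud storage"},
}

# Each dataset points at the linked service that knows how to connect.
datasets = {
    "OnPremActorSrce": {"linkedServiceName": "NorthWindStg"},
    "AzureActorDest": {"linkedServiceName": "AzureStg"},
}

# A pipeline groups activities; each activity names its input and output datasets.
pipeline = {
    "name": "CopyActorPipeline",
    "activities": [
        {
            "name": "CopyActor",
            "type": "Copy",
            "inputs": ["OnPremActorSrce"],
            "outputs": ["AzureActorDest"],
        }
    ],
}


def unresolved_references(linked_services, datasets, pipeline):
    """Return a sorted list of names that are referenced but never defined."""
    missing = set()
    for dataset in datasets.values():
        if dataset["linkedServiceName"] not in linked_services:
            missing.add(dataset["linkedServiceName"])
    for activity in pipeline["activities"]:
        for name in activity["inputs"] + activity["outputs"]:
            if name not in datasets:
                missing.add(name)
    return sorted(missing)


print(unresolved_references(linked_services, datasets, pipeline))
```

An empty list means every dataset points at a defined linked service and every activity points at defined datasets. It is the same kind of dependency check you would do by eye before deploying.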
11
Getting around the ADF Interface
12
Main Dev Environments: Author and Deploy (Portal); Copy Data (Portal, preview); Diagram; Monitor and Manage; Visual Studio
13
Author and Deploy
14
Copy Data (Wizardish): opens in a new tab in the browser
15
Monitor and Manage: opens in a new tab in the browser
16
Diagram
17
Visual Studio Extension
18
JSON (pronounced Jay-Sahn): JavaScript Object Notation. http://json.org/
19
JSON is built on two structures: (1) A collection of name/value pairs, written { }. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array. (2) An ordered list of values, written [ ]. In most languages, this is realized as an array, vector, list, or sequence. Source: JavaScript Object Notation, http://json.org
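To make the two structures concrete, here is a minimal sketch using Python's standard json module. The sample document is made up for illustration. A JSON object comes back as a dict and a JSON array comes back as a list.

```python
import json

# A made-up JSON document: one object containing a string, an array and a number.
text = '{"name": "Actor", "tags": ["on-prem", "sql"], "retries": 3}'

doc = json.loads(text)

# Record the Python type each JSON value was parsed into.
kinds = {key: type(value).__name__ for key, value in doc.items()}

print(type(doc).__name__)  # the outer { } object
print(kinds)               # the [ ] array shows up as a list

# Serializing and parsing again gives back an equal structure.
round_trip = json.loads(json.dumps(doc))
```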
20
JSON in ADF, Dataset Example:
{
  "name": "OnPremActorSrce",
  "properties": {
    "published": false,
    "type": "SqlServerTable",
    "linkedServiceName": "NorthWindStg",
    "typeProperties": {
      "tableName": "Actor"
    },
    "availability": {
      "frequency": "Day",
      "interval": 1
    },
    "policy": {
      "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
      }
    }
  }
}
21
JSON specific to ADF: https://msdn.microsoft.com/en-us/library/azure/dn835050.aspx
22
Data Gateways & ADF: ADF supplies the key. Install the Gateway on each on-prem resource (server, laptop, etc.). A resource can only store one key for use by ADF, so that usually means there can be only one data factory per resource.
23
Data Management Gateway Configuration Manager: http://www.microsoft.com/en-us/download/details.aspx?id=39717 Instructions on use: https://azure.microsoft.com/en-us/documentation/articles/data-factory-move-data-between-onprem-and-cloud/#using-the-data-gateway-step-by-step-walkthrough For on-prem machines: load the Gateway on the machine. Then go to the Azure Data Factory and create the Linked Service Gateway there. Get the key from the ADF linked service, then copy and paste it into the final step of the Gateway setup on the on-prem machine. The Gateway is for the entire machine. Linked services use that gateway and must be configured for each service, e.g. SQL databases. Be patient: the refresh rate is slow and can make it seem like it didn't work when it did.
24
Slices: Each unit of data consumed and produced by an activity run is called a data slice. Each slice has a StartTime and EndTime, which are accessible to the pipeline activity via ADF system variables (WindowStart and WindowEnd in this example): "sqlReaderQuery": "$$Text.Format('select * from MyTable where timestampcolumn >= \\'{0:yyyy-MM-dd HH:mm}\\' AND timestampcolumn < \\'{1:yyyy-MM-dd HH:mm}\\'', WindowStart, WindowEnd)"
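The windowing idea can be tried out locally. The sketch below is a plain-Python emulation, not ADF code and not an ADF API: it splits a date range into daily windows and fills each window's start and end into the same query text, using the same yyyy-MM-dd HH:mm layout. The helper names daily_slices and render_query and the January 2016 dates are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def daily_slices(start, end):
    """Yield (window_start, window_end) pairs, one per day, covering [start, end).

    Assumes end - start is a whole number of days.
    """
    current = start
    while current < end:
        yield current, current + timedelta(days=1)
        current += timedelta(days=1)


def render_query(window_start, window_end):
    """Fill one window into the query, formatted as yyyy-MM-dd HH:mm."""
    fmt = "%Y-%m-%d %H:%M"
    return (
        "select * from MyTable "
        f"where timestampcolumn >= '{window_start.strftime(fmt)}' "
        f"AND timestampcolumn < '{window_end.strftime(fmt)}'"
    )


# Three daily slices; the dates are arbitrary.
slices = list(daily_slices(datetime(2016, 1, 1), datetime(2016, 1, 4)))
for window_start, window_end in slices:
    print(render_query(window_start, window_end))
```

Because each window is closed at the start and open at the end, consecutive slices never select the same row twice, which is what makes this pattern usable for incremental loads.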
25
Using Slices: http://blogs.msdn.com/b/bigdatasupport/archive/2016/01/24/incremental-data-load-from-azure-table-storage-to-azure-sql-using-azure-data-factory.aspx
26
Visual Studio Extension: Azure SDK 2.7 and above for Visual Studio 2013. You get templates. You can reverse engineer. You can connect to your factory and deploy from VS. Came out July 22, 2015. Enables source control!
27
Resources: Simple SIMPLE tutorial: https://azure.microsoft.com/en-us/documentation/articles/data-factory-get-started/ Wee Hyong Tok’s webcast: https://info.microsoft.com/Webnar-Introduction-to-Azure-Data-Factory.html Reza Rad’s blog: http://www.radacad.com/blog Understanding Azure Storage: https://azure.microsoft.com/en-us/documentation/videos/azure-storage-5-minute-overview/
28
Loading ADL with ADF
29
https://azure.microsoft.com/en-us/blog/creating-big-data-pipelines-using-azure-data-lake-and-azure-data-factory/
30
Loading ADL with ADF