Custom Activities in Azure Data Factory

Presentation transcript:

Custom Activities in Azure Data Factory. Presented by Jared Zagelbaum, Senior Consultant, Blue Granite

Introduction
About me:
- Microsoft Data Platform since 2008
- Azure since [...]
- Azure MCSE Data & Analytics
- Microsoft Certificate in Data Science (R)
- Senior Consultant with Blue Granite
- Recent projects (last 6 months): Manufacturing, Logistics
- Technologies implemented: Power BI, SQL DW, ADF, SSIS (BIML), SSAS, SQL Server, Azure Data Lake, DevOps / CI

Objectives
- Understand when to use a custom activity
- Know how to go about creating one
- Save you some pain with undocumented things I've encountered
- Appreciate the scope of what you can really do with ADF v2 orchestrating Azure services

Agenda
- Azure prerequisites for custom activities
- Overview of Azure Batch
- Implementation of custom activities in Azure Data Factory (v1 and v2)
- Review the use cases for custom activities in Azure Data Factory
- ADFv1 Deep Dive
  - Setting up a development environment for ADFv1 custom activities
  - Developing a custom activity for ADF v1
  - Deployment and debugging
- ADFv2 Deep Dive
  - Developing a custom activity for ADF v2 (much more fun version)

Azure Batch
Azure Batch creates and manages a pool of compute nodes (virtual machines), installs the applications you want to run, and schedules jobs to run on the nodes. There is no cluster or job scheduler software to install, manage, or scale. There is no additional charge for using Batch; you pay only for the underlying resources consumed, such as the virtual machines, storage, and networking. Batch works well with intrinsically parallel (also known as "embarrassingly parallel") workloads, where the applications can run independently and each instance completes part of the work.
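To make "intrinsically parallel" concrete, here is a minimal local sketch in Python. It does not call Azure Batch at all: the thread pool is only a stand-in for a pool of compute nodes, and `process_chunk` and `run_parallel` are hypothetical names chosen for illustration. The point is that each chunk can be processed without knowing anything about the others.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """One independent unit of work: it needs nothing from any other chunk."""
    return sum(x * x for x in chunk)


def run_parallel(data, chunk_size, workers=4):
    """Split the data, process every chunk independently, then combine."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # The workers stand in for compute nodes; order of completion is irrelevant.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_parallel(list(range(10)), chunk_size=3))
```

Because no chunk depends on another, adding workers (or, in Batch, nodes) changes only how long the run takes, not the result.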

Custom Activities Compared: ADF v1 vs v2

Differences | ADFv2 Custom Activity | ADFv1 (Custom) DotNet Activity
--- | --- | ---
How custom logic is defined | By providing an executable | By implementing a .NET DLL
Execution environment of the custom logic | Windows or Linux | Windows (.NET Framework 4.5.2)
Executing scripts | Supports executing scripts directly (for example "cmd /c echo hello world" on a Windows VM) | Requires implementation in the .NET DLL
Dataset required | Optional | Required to chain activities and pass information
Pass information from activity to custom logic | Through ReferenceObjects (LinkedServices and Datasets) and ExtendedProperties (custom properties) | Through ExtendedProperties (custom properties), Input, and Output Datasets
Retrieve information in custom logic | Parses activity.json, linkedServices.json, and datasets.json stored in the same folder as the executable | Through the .NET SDK (.NET Framework 4.5.2)
Logging | Writes directly to STDOUT | Implementing Logger in the .NET DLL

Custom Activities Compared: ADF v1 vs v2
ADFv1:
- Execution restricted to a single activity run (no opportunity to scale within an activity definition)
ADFv2:
- Can run in parallel / scale out easily via control activities
- Can run packaged executables if callable from the command line (Linux or Windows), not just scripts!
- Must use a cloud-hosted integration runtime and Azure Batch
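The v2 column above (parse activity.json from the executable's folder, log to STDOUT) can be sketched as a small Python script. This is an illustration under assumptions, not the presenter's code: the `typeProperties` / `extendedProperties` key path follows the layout I recall from Microsoft's published sample and should be checked against a real activity.json, and the function names are hypothetical. A missing file is treated as "no properties" so the same script also runs outside Batch.

```python
import json
from pathlib import Path


def load_extended_properties(work_dir="."):
    """Read the custom properties ADF v2 drops next to the executable.

    Assumption: they live under typeProperties.extendedProperties.
    """
    activity_file = Path(work_dir) / "activity.json"
    if not activity_file.exists():
        # Not running under ADF/Batch (for example, a local dry run).
        return {}
    activity = json.loads(activity_file.read_text(encoding="utf-8"))
    return activity.get("typeProperties", {}).get("extendedProperties", {})


def main(work_dir="."):
    props = load_extended_properties(work_dir)
    # In v2, anything written to STDOUT is the activity's log.
    print(f"custom activity started with {len(props)} extended properties")
    for name in sorted(props):
        print(f"property: {name}")
    return 0


if __name__ == "__main__":
    main()
```

Only property names are printed here, on the assumption that values may hold connection strings or other secrets that should stay out of logs.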

Key Takeaways
ADFv1:
- Custom (.NET) activities are designed to interact with datasets that require specific access methods / transformation rules.
- Azure Batch is used as an anonymizer of resources more than for its actual potential to scale.
- Requires .NET 4.5.2, the IDotNetActivity interface, and the NuGet package Microsoft.Azure.Management.DataFactories. If you need a custom activity, you're basically building it from scratch.
ADFv2:
- Run any executable: self-compiled, script, or packaged executable (with command arguments), on Windows or Linux.
- Control activities leverage the full power of Azure Batch to scale out parallel workloads.
- "No holds barred": not expected to produce or transform a dataset.
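As a rough illustration of how a control activity fans a custom activity out across Azure Batch in ADFv2, the sketch below builds a pipeline definition as a Python dictionary and prints it as JSON. The field names (`Custom`, `ForEach`, `isSequential`, `batchCount`, `command`) follow my recollection of the ADF v2 pipeline JSON and should be verified against the current schema; the pipeline, parameter, and linked service names are made up for the example.

```python
import json


def custom_activity(name, command, batch_linked_service):
    """A v2 Custom activity: just a command line to run on a Batch node."""
    return {
        "name": name,
        "type": "Custom",
        "linkedServiceName": {
            "referenceName": batch_linked_service,  # hypothetical name
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"command": command},
    }


def fan_out(name, items_expression, inner_activity, batch_count):
    """Wrap an activity in a ForEach that runs its iterations in parallel."""
    return {
        "name": name,
        "type": "ForEach",
        "typeProperties": {
            "isSequential": False,      # run iterations concurrently
            "batchCount": batch_count,  # upper bound on concurrent iterations
            "items": {"value": items_expression, "type": "Expression"},
            "activities": [inner_activity],
        },
    }


def build_pipeline():
    inner = custom_activity(
        "RunTask", "cmd /c echo hello world", "AzureBatchLinkedService"
    )
    loop = fan_out("ForEachInput", "@pipeline().parameters.inputFiles", inner, 10)
    return {
        "name": "ParallelCustomPipeline",
        "properties": {
            "parameters": {"inputFiles": {"type": "Array"}},
            "activities": [loop],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_pipeline(), indent=2))
```

The custom activity itself stays a single command; the parallelism comes entirely from the surrounding control activity, which is the contrast with v1 drawn above.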

Use cases for custom activities
ADFv1:
- You need to access a source or service not supported with native components
- You need to perform a specific compute task on "small data"
ADFv2:
- All of the ADFv1 use cases
- You want to run an SMP application based on conditions / wall clock and possibly have the output of the application trigger additional actions
- You want batch processes all logging to a common system
- You are filling in the holes unsupported in the current SSIS lift and shift: https://docs.microsoft.com/en-us/sql/integration-services/lift-shift/ssis-azure-validate-packages

ADFv1 Deep Dive

ADFv1
- Adding custom code to projects and deployment is fairly easy with Data Factory Tools for Visual Studio 2015
- Debugging a .NET class library requires additional work
- Developing pipelines and activities is all JSON
- Debugging in ADFv1 is centralized

ADFv2 Deep Dive

ADFv2
- Use any development environment you want, heck, even any framework, as long as it's SMP-based
- No slick tooling for deployment like in v1 = slightly more work
- Debugging locally doesn't require much refactoring, if any
- Developing pipelines and activities is helped by the visual editor initially
- Debugging in ADFv2 is buggy. It's still in preview!
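One way to see why local debugging needs little refactoring: a v2 custom activity is just an executable that reads files from its working folder and writes to STDOUT, so a local harness only has to recreate that folder. The sketch below is an assumption-laden example, not a documented tool. `run_locally`, `task.py`, and the `greeting` property are hypothetical, and the activity.json layout is the one I recall from Microsoft's sample.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# A stand-in custom activity: reads activity.json, logs to STDOUT.
SCRIPT = '''
import json
with open("activity.json", encoding="utf-8") as f:
    activity = json.load(f)
print(activity["typeProperties"]["extendedProperties"]["greeting"])
'''


def run_locally(script_source, activity):
    """Run the script in a scratch folder laid out like a Batch task folder."""
    with tempfile.TemporaryDirectory() as work_dir:
        work = Path(work_dir)
        (work / "activity.json").write_text(json.dumps(activity), encoding="utf-8")
        (work / "task.py").write_text(script_source, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, "task.py"],
            cwd=work,
            capture_output=True,
            text=True,
            check=True,  # raise if the task exits non-zero
        )
    return result.stdout


if __name__ == "__main__":
    fake_activity = {"typeProperties": {"extendedProperties": {"greeting": "hello"}}}
    print(run_locally(SCRIPT, fake_activity), end="")
```

The same script file can then be uploaded unchanged; only the folder contents differ between the local run and the Batch run.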

Session Summary
- ADFv1 and v2 both use Azure Batch to run custom activities:
  - v1 mostly for convenience
  - v2 fer reelz yo: you can use it to run parallel tasks at enormous scale (along with your Azure bill)
- ADFv1 had a dream of things being nice, neat, tumbling windows, where custom activities had a certain place in this tiny little world
- ADFv2 lets you run pretty much any workload and access pretty much any data source without restriction to platform or scale, and orchestrates everything into a single service. Kids, you can drive the car now.

Evaluation
Thanks for attending and filling out the evaluations for all the sessions you go to today. They really matter to the presenters!