Azure Data Factory + SSIS: Migrating your ETLs to the Cloud
Jose Chinchilla, MCSE
Data Analytics Lead, AgileThought
jose.chinchilla@agilethought.com
linkedin.com/in/josechinchilla
@sqljoe
Agenda
- Azure Data Factory (ADF)
- Integration Runtime (IR)
- ETL migration scenarios to ADFv2
- Demo:
  - Configuring ADF-SSIS IR
  - Deploying, executing and monitoring an SSIS Project to ADF
  - Executing a Copy Activity
What is Azure Data Factory (ADF)?
Azure Data Factory
- ADFv1 and ADFv2 (current)
- Data Integration / ETL web service
- 70+ data source connectors
- Drag and drop UI to author data pipelines
- Schedule (trigger), run and monitor pipeline executions
- SSIS-like Control and Data Flow (preview)
Azure Data Factory ~ SSIS + SQL Agent
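The SQL Agent half of this analogy maps to ADFv2 triggers. As a rough sketch only, the snippet below builds a daily schedule trigger definition as a Python dict and prints it as JSON. The trigger name, pipeline name and start time are hypothetical placeholders, and the property layout follows the ADFv2 trigger JSON as documented by Microsoft; check it against the current documentation before using it.

```python
import json


def build_schedule_trigger(trigger_name, pipeline_name, start_time_utc):
    """Return a daily ADFv2 schedule trigger definition as a plain dict.

    All names passed in are illustrative placeholders, not values from the talk.
    """
    return {
        "name": trigger_name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                # Run once per day, starting at the given UTC timestamp.
                "recurrence": {
                    "frequency": "Day",
                    "interval": 1,
                    "startTime": start_time_utc,
                    "timeZone": "UTC",
                }
            },
            # The pipeline(s) this trigger starts, much like a SQL Agent job step.
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    },
                    "parameters": {},
                }
            ],
        },
    }


if __name__ == "__main__":
    trigger = build_schedule_trigger(
        "DailyLoadTrigger", "LoadSalesPipeline", "2019-01-01T02:00:00Z"
    )
    print(json.dumps(trigger, indent=2))
```

The resulting JSON is the kind of definition the ADF authoring UI generates when a schedule is attached to a pipeline.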
Azure Data Factory (ADF)

Feature                  ADFv1                      ADFv2
GUI development          Limited                    SSIS-like
Activities               Limited, required custom   More out-of-the-box
Linked services
SSIS Package execution   Not supported*             Full support
On-prem sources          Requires VPN or ER         Self-hosted IR

Compare versions: https://docs.microsoft.com/en-us/azure/data-factory/compare-versions
ADFv2 Features
- Pipelines
- Activities
- Datasets (Source & Sink)
- Connections
  - Linked Services
  - Integration Runtime
- Triggers
ADFv2 Pipeline with single activity
[Diagram: Linked Service > Source Dataset > Copy Activity (inside the ADFv2 Pipeline) > Sink Dataset > Linked Service]
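The single-activity pipeline on this slide can be sketched as JSON definitions. The example below is a minimal illustration, not the demo's actual pipeline: every name (datasets, linked services, pipeline) is a hypothetical placeholder, and the source and sink types (`BlobSource`, `SqlSink`) are just one assumed pairing. It shows how each dataset points at a linked service and how the Copy Activity points at the two datasets.

```python
import json


def build_dataset(name, dataset_type, linked_service_name, type_properties):
    """Return an ADFv2 dataset definition bound to a linked service."""
    return {
        "name": name,
        "properties": {
            "type": dataset_type,
            "linkedServiceName": {
                "referenceName": linked_service_name,
                "type": "LinkedServiceReference",
            },
            "typeProperties": type_properties,
        },
    }


def build_copy_pipeline(pipeline_name, source_dataset, sink_dataset):
    """Return a pipeline with one Copy Activity from source to sink dataset."""
    return {
        "name": pipeline_name,
        "properties": {
            "activities": [
                {
                    "name": "CopySourceToSink",
                    "type": "Copy",
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    # Assumed pairing: blob files copied into an Azure SQL table.
                    "typeProperties": {
                        "source": {"type": "BlobSource"},
                        "sink": {"type": "SqlSink"},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    source = build_dataset(
        "SourceBlobDataset", "AzureBlob", "BlobStorageLinkedService",
        {"folderPath": "input", "fileName": "sales.csv"},
    )
    sink = build_dataset(
        "SinkSqlDataset", "AzureSqlTable", "AzureSqlLinkedService",
        {"tableName": "dbo.Sales"},
    )
    pipeline = build_copy_pipeline("CopySalesPipeline", source["name"], sink["name"])
    print(json.dumps([source, sink, pipeline], indent=2))
```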
Integration Runtime
Integration Runtime
Provides data integration capabilities across different network environments:
- Data movement: Move data between data stores in public network and data stores in private network (on-premises or virtual private network). It provides support for built-in connectors, format conversion, column mapping, and performant and scalable data transfer.
- Activity dispatch: Dispatch and monitor transformation activities running on a variety of compute services such as Azure HDInsight, Azure Machine Learning, Azure SQL Database, SQL Server, and more.
- SSIS package execution: Natively execute SQL Server Integration Services (SSIS) packages in a managed Azure compute environment.
https://docs.microsoft.com/en-us/azure/data-factory/concepts-integration-runtime
Integration Runtime
- Azure-SSIS
  - Deploy, monitor and manage SSIS packages
  - Integration Services Catalog (SSISDB)
- Azure
  - Azure to Azure or other public networks
  - Always one by default, aka AutoResolve
- Self-Hosted
  - Azure to private networks
  - On-prem or private virtual network
- Linked Self-Hosted
  - Shared IR with other ADFv2
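Which runtime a connection uses is set on the linked service. The sketch below assumes an on-prem SQL Server source reached through a Self-Hosted IR; the linked service name, IR name and connection string are hypothetical placeholders. When no IR is named, the `connectVia` block is left out, in which case ADF falls back to the default AutoResolve Azure IR.

```python
import json


def build_sql_server_linked_service(name, connection_string, ir_name=None):
    """Return a SQL Server linked service definition.

    ir_name: name of a Self-Hosted IR to route through (hypothetical).
             If None, no connectVia block is emitted and the default
             AutoResolve Azure IR applies.
    """
    properties = {
        "type": "SqlServer",
        "typeProperties": {"connectionString": connection_string},
    }
    if ir_name is not None:
        properties["connectVia"] = {
            "referenceName": ir_name,
            "type": "IntegrationRuntimeReference",
        }
    return {"name": name, "properties": properties}


if __name__ == "__main__":
    linked_service = build_sql_server_linked_service(
        "OnPremSqlServer",
        # Placeholder only; real credentials belong in Azure Key Vault.
        "Data Source=<server>;Initial Catalog=<database>;Integrated Security=True",
        ir_name="MySelfHostedIR",
    )
    print(json.dumps(linked_service, indent=2))
```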
https://docs.microsoft.com/en-us/azure/data-factory/concepts-integration-runtime
ETL Migration Scenarios
Common Migration Strategies
Ordered from least to most intrusive:
- Rehost "as is" (lift and shift)
- Replatform (lift, tinker and shift)
- Refactor
- Rearchitect
- Rebuild
- Replace
Migration goals
- Eliminate infrastructure overhead
- Reduce costs
- Scale up/down as needed
- Eliminate re-writing and re-architecting ETLs
- …what else?
ETL Migration Scenarios
Lift & shift SSIS packages
- Deploy, monitor and run SSIS packages from ADF-SSIS IR
- Change environment variable values (connection managers, credentials, etc.)

"Lift and shift is a strategy for moving an application or operation from one environment to another – without redesigning the app. In the lift-and-shift approach, certain workloads and tasks can be moved from on-premises storage to the cloud…"
https://whatis.techtarget.com/definition/lift-and-shift
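Once a project is deployed to SSISDB, the lift-and-shift scenario above comes down to an Execute SSIS Package activity that runs on the Azure-SSIS IR. The sketch below is an assumption-laden illustration: the IR name, package path and environment path are hypothetical, and the property names (`packageLocation`, `environmentPath`, `loggingLevel`, `runtime`) follow the activity JSON in Microsoft's documentation as best recalled, so verify them against the current docs. The `environmentPath` property is where environment-specific values such as connection strings and credentials get swapped in.

```python
import json


def build_execute_ssis_activity(activity_name, ir_name, package_path,
                                environment_path=None):
    """Return an Execute SSIS Package activity definition as a dict.

    package_path:     "<folder>/<project>/<package>.dtsx" in SSISDB (hypothetical).
    environment_path: "<folder>/<environment>" in SSISDB, or None to skip.
    """
    type_properties = {
        # Run on the Azure-SSIS IR rather than the default Azure IR.
        "connectVia": {
            "referenceName": ir_name,
            "type": "IntegrationRuntimeReference",
        },
        "runtime": "x64",
        "loggingLevel": "Basic",
        "packageLocation": {"packagePath": package_path},
    }
    if environment_path is not None:
        type_properties["environmentPath"] = environment_path
    return {
        "name": activity_name,
        "type": "ExecuteSSISPackage",
        "typeProperties": type_properties,
    }


if __name__ == "__main__":
    activity = build_execute_ssis_activity(
        "RunLoadSalesPackage",
        "MyAzureSsisIR",
        "ETL/SalesProject/LoadSales.dtsx",
        environment_path="ETL/Production",
    )
    print(json.dumps(activity, indent=2))
```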
ETL Migration Scenarios
Replatform / Rearchitect
- Run stored procedure based ETLs using ADF CopyActivity
- Stage on-prem data in a Data Lake or Blob storage
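One way to picture this scenario: a Copy Activity whose source is a stored procedure on the on-prem SQL Server and whose sink is Blob storage used as a staging area. The sketch below assumes exactly that pairing; the dataset names, procedure name and parameter are hypothetical, and the `SqlSource` properties (`sqlReaderStoredProcedureName`, `storedProcedureParameters`) should be confirmed against the connector documentation.

```python
import json


def build_stored_proc_copy_activity(activity_name, source_dataset, sink_dataset,
                                    procedure_name, procedure_parameters=None):
    """Return a Copy Activity that reads from a stored procedure into blob storage.

    procedure_parameters: mapping of parameter name to string value (hypothetical).
    """
    source = {
        "type": "SqlSource",
        "sqlReaderStoredProcedureName": procedure_name,
    }
    if procedure_parameters:
        # Each parameter is written as {"value": ..., "type": ...}; strings assumed.
        source["storedProcedureParameters"] = {
            key: {"value": value, "type": "String"}
            for key, value in procedure_parameters.items()
        }
    return {
        "name": activity_name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {"source": source, "sink": {"type": "BlobSink"}},
    }


if __name__ == "__main__":
    activity = build_stored_proc_copy_activity(
        "StageDailySales",
        "OnPremSalesDataset",
        "StagingBlobDataset",
        "dbo.usp_ExtractDailySales",
        {"LoadDate": "2019-01-01"},
    )
    print(json.dumps(activity, indent=2))
```

The source dataset here would sit on a linked service that connects through a Self-Hosted IR, since the SQL Server is on-prem.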
Demo
Q&A
Stay Connected
www.agilethought.com
www.linkedin.com/company/AgileThought
@AgileThought
If you have questions or would like more information, feel free to contact me via email: jose.chinchilla@agilethought.com
Links and References
- Azure Data Factory Documentation
  https://docs.microsoft.com/en-us/azure/data-factory/
- Create a trigger that runs a pipeline in response to an event
  https://docs.microsoft.com/en-us/azure/data-factory/how-to-create-event-trigger