Azure Data Factory + SSIS: Migrating your ETLs to the Cloud
Jose Chinchilla, MCSE
Data Analytics Lead, AgileThought
jose.chinchilla@agilethought.com
linkedin.com/in/josechinchilla
@sqljoe
Agenda
- Azure Data Factory (ADF)
- Integration Runtime (IR)
- ETL migration scenarios to ADFv2
- Demo:
  - Configuring the ADF-SSIS Integration Runtime
  - Deploying, executing and monitoring an SSIS Project to ADF
  - Executing a Copy Activity
What is Azure Data Factory (ADF)?
Azure Data Factory
- 3 versions:
  - ADFv1
  - ADFv2 (current)
  - ADFv2 with Data Flow (preview)
- Data Integration / ETL web service
- 70+ data source connectors
- Drag and drop UI to author data pipelines
- Schedule (trigger), run and monitor pipeline executions
- SSIS-like Control and Data Flow (preview)
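The "schedule (trigger)" bullet can be made concrete with a small sketch. The Python below only builds a dictionary in the general shape of an ADFv2 schedule-trigger definition; the helper name `schedule_trigger`, the trigger name, the pipeline name and the start time are hypothetical, and the property names are written from memory of the public JSON format, so they should be checked against the current documentation before use.

```python
import json


def schedule_trigger(name, pipeline_name, start_time, frequency="Day", interval=1):
    """Build a schedule-trigger definition as a plain dict (hypothetical helper)."""
    return {
        "name": name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": frequency,   # e.g. "Hour", "Day", "Week"
                    "interval": interval,     # run every <interval> <frequency>
                    "startTime": start_time,  # ISO 8601 timestamp
                    "timeZone": "UTC",
                }
            },
            # The pipeline(s) that this trigger starts.
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    }
                }
            ],
        },
    }


# Assumed names, for illustration only.
trigger = schedule_trigger("DailyLoadTrigger", "CopyPipeline", "2019-01-01T02:00:00Z")
print(json.dumps(trigger, indent=2))
```

The printed JSON is only a starting point; in practice the trigger is usually authored in the ADF UI.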
Azure Data Factory (ADF)

Feature                | ADFv1                    | ADFv2
GUI development        | Limited                  | SSIS-like
Activities             | Limited, required custom | More out-of-the-box
Linked services        |                          |
SSIS package execution | Not supported*           | Full support
On-prem sources        | Requires VPN or ER       | Self-hosted IR
Data Flows             | Not supported            | Preview

Compare versions: https://docs.microsoft.com/en-us/azure/data-factory/compare-versions
ADFv2 Features
- Pipelines
- Activities
- Data Flows (preview)
- Datasets (Source & Sink)
- Connections
  - Linked Services
  - Integration Runtime
- Triggers
ADFv2 Pipeline with Copy Data activity
[Diagram: Linked Service -> Source Dataset -> Copy Activity (inside an ADFv2 Pipeline) -> Sink Dataset -> Linked Service]
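The relationship between linked services, datasets and the Copy activity can be sketched in a few lines of Python. This is an illustrative model only: the helper functions, the resource names (`BlobStore`, `SqlDb`, `SourceDataset`, `SinkDataset`) and the chosen source/sink types are assumptions, connection details are left out, and the dictionaries only approximate the JSON that ADFv2 stores.

```python
import json


def linked_service(name, ls_type):
    # Connection details (connection strings, keys) are deliberately omitted.
    return {"name": name, "properties": {"type": ls_type, "typeProperties": {}}}


def dataset(name, ds_type, linked_service_name):
    # A dataset points at the linked service that knows how to connect.
    return {
        "name": name,
        "properties": {
            "type": ds_type,
            "linkedServiceName": {
                "referenceName": linked_service_name,
                "type": "LinkedServiceReference",
            },
        },
    }


def copy_pipeline(name, source_ds, sink_ds, source_type, sink_type):
    # The Copy activity reads from an input dataset and writes to an output dataset.
    activity = {
        "name": "CopySourceToSink",
        "type": "Copy",
        "inputs": [{"referenceName": source_ds, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_ds, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": source_type},
            "sink": {"type": sink_type},
        },
    }
    return {"name": name, "properties": {"activities": [activity]}}


# Assumed resource names and types, for illustration only.
services = [linked_service("BlobStore", "AzureBlobStorage"),
            linked_service("SqlDb", "AzureSqlDatabase")]
datasets = [dataset("SourceDataset", "AzureBlob", "BlobStore"),
            dataset("SinkDataset", "AzureSqlTable", "SqlDb")]
pipeline = copy_pipeline("CopyPipeline", "SourceDataset", "SinkDataset",
                         "BlobSource", "SqlSink")

# Every dataset used by the pipeline must exist, and every dataset must
# reference an existing linked service.
service_names = {s["name"] for s in services}
dataset_names = {d["name"] for d in datasets}
copy = pipeline["properties"]["activities"][0]
used = {ref["referenceName"] for ref in copy["inputs"] + copy["outputs"]}
references_ok = used <= dataset_names and all(
    d["properties"]["linkedServiceName"]["referenceName"] in service_names
    for d in datasets
)
print(json.dumps(pipeline, indent=2))
print("references resolve:", references_ok)
```

The reference check at the end mirrors the chain in the diagram: pipeline to dataset to linked service.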
ADFv2 Pipeline with Data Flow activity
Integration Runtime
Integration Runtime
Provides data integration capabilities across different network environments:
- Data movement: Move data between data stores in public networks and data stores in private networks (on-premises or virtual private network). It provides support for built-in connectors, format conversion, column mapping, and performant and scalable data transfer.
- Activity dispatch: Dispatch and monitor transformation activities running on a variety of compute services such as Azure HDInsight, Azure Machine Learning, Azure SQL Database, SQL Server, and more.
- SSIS package execution: Natively execute SQL Server Integration Services (SSIS) packages in a managed Azure compute environment.
https://docs.microsoft.com/en-us/azure/data-factory/concepts-integration-runtime
Integration Runtime
- Self-Hosted
  - Azure to private networks
  - On-prem or private virtual network
- Azure-SSIS
  - Deploy, monitor and manage SSIS packages
  - Integration Services Catalog (SSISDB)
- Linked Self-Hosted
  - Shared IR with other ADFv2
- Azure
  - Azure to Azure or other public networks
  - Always one by default, aka AutoResolve Integration Runtime
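As a rough decision aid, the list above can be condensed into a small function. This is a simplification for illustration, not an official selection rule: the function name and its two boolean inputs are hypothetical, and it ignores the Linked Self-Hosted option, which shares an existing Self-Hosted IR with other data factories.

```python
def choose_integration_runtime(runs_ssis_packages, touches_private_network):
    """Pick an IR type from two yes/no questions (simplified, hypothetical helper)."""
    if runs_ssis_packages:
        # SSIS packages are deployed to and executed on the Azure-SSIS IR.
        return "Azure-SSIS"
    if touches_private_network:
        # On-prem or private virtual network data stores need a Self-Hosted IR.
        return "Self-Hosted"
    # Azure to Azure or other public networks: the default AutoResolve IR.
    return "Azure (AutoResolve)"


print(choose_integration_runtime(True, False))
print(choose_integration_runtime(False, True))
print(choose_integration_runtime(False, False))
```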
https://docs.microsoft.com/en-us/azure/data-factory/concepts-integration-runtime
ETL Migration Scenarios
Common Migration Strategies (in order of increasing effort)
- Rehost "as is" (lift and shift)
- Replatform (lift, tinker and shift)
- Refactor
- Rearchitect
- Rebuild
- Replace
Migration goals
- Eliminate infrastructure overhead
- Reduce costs
- Scale up/down as needed
- Eliminate re-writing and re-architecting ETLs
- …what else?
ETL Migration Scenarios: Lift & shift SSIS packages
- Deploy, monitor and run SSIS packages from ADF-SSIS IR
- Change environment variable values (connection managers, credentials, etc.)

"Lift and shift is a strategy for moving an application or operation from one environment to another – without redesigning the app. In the lift-and-shift approach, certain workloads and tasks can be moved from on-premises storage to the cloud…"
https://whatis.techtarget.com/definition/lift-and-shift
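To illustrate running a deployed package from a pipeline, here is a hedged sketch of an Execute SSIS Package activity built as a Python dictionary. The helper name, the IR name `MyAzureSsisIR`, and the folder, project, package and environment names are all placeholders; the property names follow my recollection of the documented activity JSON and should be verified against the current documentation.

```python
import json


def execute_ssis_package_activity(name, ir_name, package_path, environment_path=None):
    """Build an Execute SSIS Package activity as a plain dict (hypothetical helper)."""
    type_properties = {
        # The Azure-SSIS IR that hosts SSISDB and runs the package.
        "connectVia": {
            "referenceName": ir_name,
            "type": "IntegrationRuntimeReference",
        },
        # Path inside the SSIS catalog: <folder>/<project>/<package>.dtsx
        "packageLocation": {"packagePath": package_path},
        "loggingLevel": "Basic",
    }
    if environment_path is not None:
        # SSIS catalog environment holding connection strings, credentials, etc.
        type_properties["environmentPath"] = environment_path
    return {
        "name": name,
        "type": "ExecuteSSISPackage",
        "typeProperties": type_properties,
    }


# Placeholder names, for illustration only.
activity = execute_ssis_package_activity(
    "RunLoadPackage",
    "MyAzureSsisIR",
    "DemoFolder/DemoProject/LoadSales.dtsx",
    environment_path="DemoFolder/Cloud",
)
print(json.dumps(activity, indent=2))
```

Pointing the same package at a different catalog environment is how connection managers and credentials are switched without editing the package itself.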
ETL Migration Scenarios: Replatform / Re-architect
- Run stored-procedure-based ETLs using the ADF Copy Activity
- Stage on-prem data in a Data Lake or Blob storage
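One way to picture this scenario is a Copy activity whose source is a stored procedure on the on-prem SQL Server and whose sink is Blob storage used as a staging area. The sketch below is an assumption-laden illustration: the helper name, the dataset names and the procedure name `dbo.usp_ExtractSales` are invented, the on-prem linked service (which would connect through a Self-Hosted IR) is not shown, and the property names should be checked against the Copy activity documentation.

```python
import json


def staged_copy_activity(name, source_dataset, sink_dataset, stored_procedure):
    """Copy the result set of a stored procedure into Blob staging (hypothetical helper)."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # The existing ETL logic stays in the stored procedure.
            "source": {
                "type": "SqlSource",
                "sqlReaderStoredProcedureName": stored_procedure,
            },
            # Rows are landed as files in Blob storage for later processing.
            "sink": {"type": "BlobSink"},
        },
    }


# Invented names, for illustration only.
stage_activity = staged_copy_activity(
    "StageSalesExtract",
    "OnPremSalesDataset",
    "StagingBlobDataset",
    "dbo.usp_ExtractSales",
)
print(json.dumps(stage_activity, indent=2))
```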
Demo
Q&A
Stay Connected
www.agilethought.com
www.linkedin.com/company/AgileThought
@AgileThought
If you have questions or would like more information, feel free to contact me via email: jose.chinchilla@agilethought.com
Links and References
- Azure Data Factory Documentation: https://docs.microsoft.com/en-us/azure/data-factory/
- Create a trigger that runs a pipeline in response to an event: https://docs.microsoft.com/en-us/azure/data-factory/how-to-create-event-trigger
- ADFv2 with Data Flow samples: https://github.com/kromerm/adfdataflowdocs/