Michael French Principal Consultant 5/18/2019

Presentation transcript:

Event Driven ELT
Michael French, Principal Consultant, 5/18/2019

About Me
Principal Consultant, Pragmatic Works
20+ years of IT experience
B.S. in Applied Mathematics, Kent State University
SQLSaturday presenter
Community volunteer

Introduction: Goals
Architecture Overview: Migrating from Traditional Architectures
Life of a File: How do I function without SSIS?
Demo: Event Driven ELT

Traditional Data Architecture for BI Programs
[Diagram, steps 1-7, framed by Audit, Balance & Control and Data Governance. Stages: Source; Extract & Load; Raw Data Store; Transform; Structure; Semantic Layer; Data Delivery. Components shown: Source 1 to Source 6 (on-prem SQL Server, API call, SFTP); SSIS; Azure SQL DB; Azure SQL DB views; SSAS; Power BI.]
Link to traditional data architecture: https://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation
Link to Azure data architecture: https://www.delorabradish.com/modeling-for-bi/data-architecture-for-azure-bi-programs

Why Migrate to Azure?
Cost (scale up, scale down)
Offset limited local IT resources
Event-based file ingestion
Unstructured data
Large data volumes
Near-real-time requirements
Data science capabilities
Development time to production
Support for large audiences
Mobile collaboration
File-based history (SCD2 equivalent)

Azure Data Architecture for BI Programs
[Diagram, steps 1-13. Stage labels: Source; Data Pull or Push; Temporary Store; Raw Data Store; Transform & Load; Standardized Data Store; Enterprise; Multi-file Consolidation to Data Models; Semantic Layer; Delivery; Data Science.
Components shown: Source 1 to Source 7 (cloud and on-prem); SFTP; API calls; Self-hosted Integration Runtime; Azure Logic App SFTP File Watcher; Azure Data Factory pipeline ingestion "orchestrators"; Azure Blob Storage; Azure Function ABS Watcher; Databricks; Azure Data Lake; PolyBase; T-SQL; Spark; Azure SQL DW; Azure SQL DB; Cosmos DB (unstructured); AAS; Power BI; PBI logs; AI and ML tools.
Annotations: permanent current file plus deltas (separate new, update, and delete files); generate current version file and separate delta files; dimensional model; subject area OLAP model; logical model plus metadata; dashboards, workbooks, reports; analyze, visualize; Azure Logic App and SQL Server procedure event logging to Cosmos DB or Azure SQL Database.]
Link to traditional data architecture: https://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation
Link to Azure data architecture: https://www.delorabradish.com/modeling-for-bi/data-architecture-for-azure-bi-programs

Azure Data Architecture ~ Traditional Comparison
[Diagram, steps 1-13: the Azure data architecture with the traditional stack (SSIS, SQL DB, Tabular, PBI) labeled alongside. Stage labels: Source; Data Pull or Push; Temporary Store; Raw Data Store; Transform & Load; Standardized Data Store; Enterprise; Multi-file Consolidation to Data Models; Semantic Layer; Delivery.
Components shown: Source 1 to Source 7 (cloud and on-prem); SFTP; API calls; Self-hosted Integration Runtime; Azure Logic App SFTP File Watcher; Azure Data Factory pipeline ingestion "orchestrators"; Azure Blob Storage; Databricks; Azure Data Lake; PolyBase; T-SQL; Spark; Azure SQL DW; Azure SQL DB; Cosmos DB; AAS; Power BI; PBI logs.
Annotations: dimensional model; subject area OLAP model; dashboards, workbooks, reports; analyze, visualize; Azure Logic App and SQL Server procedure event logging to Cosmos DB or Azure SQL Database.]

Azure Data Architecture ~ Value Add
[Diagram, steps 1-13: the full Azure data architecture with the traditional stack (SSIS, SQL DB, Tabular, PBI) labeled alongside. Beyond the components on the Traditional Comparison slide, this version also shows: Azure Function ABS Watcher; Data Science with AI and ML tools; logical model plus metadata; unstructured data in Cosmos DB; permanent current file plus deltas (separate new, update, and delete files); generate current version file and separate delta files.]

Life of a File: Talking Points
Azure Data Factory orchestrator
Listening for new files in Azure Logic Apps
Preprocessing in Azure Blob Storage
Current and historical files in Azure Data Lake
Azure Data Warehouse ingestion

Azure Logic App ~ SFTP Listener: Push from Source
[Diagram, steps 1-3 (architecture steps 2-3: SFTP File Watchers). Stage labels: Source; Data Pull or Push; Temporary Data Store; Raw Data Store. Sources shown: Source 5 and Source 6, via SFTP.
Flow shown:
1. Logic App SFTP File Watcher: SFTP file added or changed.
2. Logic App log event: SFTP file found (Azure Database stored procedure: log file found; Event Hub: send event).
3. Logic App log event and call ADF pipeline: Azure Data Factory SFTP Orchestrator writes to Azure Blob Storage.]
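
The listener flow on this slide (a file is added or changed, the event is logged, then the orchestrator pipeline is called) can be sketched in plain Python. This is only an illustration under assumptions: `FileEvent`, `handle_file_event`, and the `trigger_pipeline` callback are hypothetical names, the in-memory `event_log` list stands in for the stored-procedure and Event Hub logging named on the slide, and no Azure SDK is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FileEvent:
    """Hypothetical description of a file that appeared on the SFTP source."""
    path: str
    change: str  # "added" or "changed"


def handle_file_event(
    event: FileEvent,
    event_log: List[Dict[str, str]],
    trigger_pipeline: Callable[[str], str],
) -> str:
    """Log that a file was found, call the orchestrator, then log the call."""
    # Step 1: record that the watcher found a file.
    event_log.append({"step": "file_found", "path": event.path, "change": event.change})
    # Step 2: hand the file to the orchestrator (stand-in for calling an ADF pipeline).
    run_id = trigger_pipeline(event.path)
    # Step 3: record the pipeline call, keeping the run id for auditing.
    event_log.append({"step": "pipeline_called", "path": event.path, "run_id": run_id})
    return run_id


def fake_trigger(path: str) -> str:
    """Assumed stand-in for the real pipeline call; returns a made-up run id."""
    return "run-for-" + path


log: List[Dict[str, str]] = []
run = handle_file_event(FileEvent("inbound/sales.csv", "added"), log, fake_trigger)
```

Passing the trigger in as a callback keeps the watcher logic testable without any cloud resources; a real implementation would swap in whatever mechanism actually starts the pipeline.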

Azure Data Factory Orchestrator: Scheduled Pull from Source (traditional SSIS)
[Diagram, steps 1-3 (architecture steps 2-3: ADF Orchestrator). Stage labels: Source; Data Pull or Push; Temporary Store; Raw Data Store. Sources shown: Source 1 to Source 4 (cloud and on-prem), API calls, Self-hosted Integration Runtime; Azure Logic App SFTP File Watcher.
Azure Data Factory triggered pipeline activities: Get Start Date (Azure Database stored procedure); Copy Dataset (to Azure Blob Storage); Update Run Date.
Logging: Logic App log event after every activity; Event Hub send event.]
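
The three pipeline activities named on this slide (Get Start Date, Copy Dataset, Update Run Date) resemble a watermark-based incremental pull. The sketch below assumes that reading and assumes the order get, copy, update; the `incremental_pull` function, the `watermarks` dictionary (standing in for the Azure Database stored procedures), and the sample rows are all hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Row = Tuple[str, datetime]  # (record id, last-modified timestamp)


def incremental_pull(
    source_rows: List[Row],
    watermarks: Dict[str, datetime],
    source_name: str,
    run_time: datetime,
) -> List[Row]:
    """Copy rows modified since the last run, then advance the watermark."""
    # Get Start Date: the previous run date, or the earliest date on a first run.
    start = watermarks.get(source_name, datetime.min)
    # Copy Dataset: keep rows changed after the start date, up to this run time.
    copied = [row for row in source_rows if start < row[1] <= run_time]
    # Update Run Date: reached only if the copy above did not raise.
    watermarks[source_name] = run_time
    return copied


rows = [
    ("a", datetime(2019, 5, 1)),
    ("b", datetime(2019, 5, 10)),
    ("c", datetime(2019, 5, 17)),
]
marks = {"source1": datetime(2019, 5, 5)}
first = incremental_pull(rows, marks, "source1", datetime(2019, 5, 18))
second = incremental_pull(rows, marks, "source1", datetime(2019, 5, 19))
```

Updating the run date only after the copy succeeds means a failed run is retried from the same start date rather than silently skipping rows.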

Azure Blob Storage ~ Preprocessing
[Diagram, steps 1-6 with sub-steps 2b, 2c and 5a-5e. Stage labels: Source; Temporary Data Store; Raw Data Store.
Elements shown: Source; Azure Data Factory; Azure Blob Storage tempContainer; Azure Function; two branches, "No Deletes Needed" and "Unapproved Departments Must Delete" ("HDInsight of ADFgen2 Delete.py /or/ Pipeline", Azure SQL Database); cleansed CSV file; Azure Blob Storage finalContainer; Azure Function ABS Watcher.
If a file is found: Logic App; Cosmos DB; Logic App; Data Factory; Data Lake Store; HDInsight. A full or incremental load parameter is passed to the ADL Orchestrator.]
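
One reading of the "Unapproved Departments Must Delete" branch is that rows belonging to unapproved departments are removed before the cleansed CSV file reaches the final container. The sketch below illustrates that idea only; it is not the `Delete.py` named on the slide. The `remove_unapproved` function, the `department` column name, and the sample data are assumptions.

```python
import csv
import io
from typing import Set


def remove_unapproved(csv_text: str, approved: Set[str], column: str = "department") -> str:
    """Return the CSV text with only rows whose department is approved."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None:
        # Empty input: nothing to cleanse.
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep the row only when its department is on the approved list.
        if row[column] in approved:
            writer.writerow(row)
    return out.getvalue()


raw = "id,department,amount\n1,Sales,10\n2,Legal,20\n3,Sales,30\n"
cleansed = remove_unapproved(raw, {"Sales"})
```

Working on text in and text out keeps the filter independent of where the file lives, so the same function could sit behind a blob download and upload.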

Azure Data Factory Orchestrator: Scheduled Pull from Source
[Diagram, steps 1-6 with sub-steps 5a-5e: no preprocessing needed. Stage labels: Source; Temporary Data Store; Raw Data Store.
Elements shown: Source; Azure Data Factory; Azure Blob Storage finalContainer; Azure Function; Azure SQL Database; Azure Function ABS Watcher.
If a file is found: Logic App; Cosmos DB; Logic App; Data Factory; Data Lake Store; HDInsight. A full or incremental load parameter is passed to the ADL Orchestrator.]

Same Song, Second Verse
[Diagram, steps 1-6 with sub-steps 5a-5e: some ingestion method feeds the same downstream flow. Stage labels: Temporary Data Store; Raw Data Store.
Elements shown: some ingestion method; Azure Blob Storage finalContainer; Azure Function; Azure Function ABS Watcher.
If a file is found: Logic App; Cosmos DB; Logic App; Data Factory; Data Lake Store; HDInsight. A full or incremental load parameter is passed to the ADL Orchestrator.]

Azure Data Lake Ingestion: For All Sources
[Diagram, architecture steps 3-6. Stage labels: Temporary Data Store; Raw Data Store; Transform & Load; Standardized Data Store. Annotations: generate current version file plus separate delta files; current file plus deltas (separate new, update, and delete files).
Flow shown (step 4, ABS File Watcher on the root container):
Azure Blob Storage, Azure Function ABS Watcher: ABS file added or changed.
Logic App log event: ABS file found (Event Hub: send event).
Logic App log event and call ADF pipeline: Azure Data Factory ADL Orchestrator writes to Azure Data Lake Store.
Other components shown: Databricks; Azure Data Lake.]

Azure Data Factory Orchestrator: One Orchestrator Pipeline for All Sources
[Diagram. Source: Azure Blob. Destination: Azure Data Lake Store.
ADL Orchestrator Pipeline runs an Ingestion Pipeline and an AsIs Pipeline.
PySpark steps shown: create row-level checksum; create delta files; create AsIs files.
Outputs: separate new, changed, and deleted files; a single "AsIs" current file.
All ADF metadata logging: Logic App log event (success or failure) or Event Hub send event.]
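
The row-level checksum and delta-file steps can be illustrated with a small sketch. The slide names PySpark; this example swaps in plain standard-library Python so it runs anywhere, and the choice of SHA-256, the `row_checksum` and `split_deltas` names, the `id` key column, and the sample records are all assumptions rather than the presenter's implementation.

```python
import hashlib
from typing import Dict, List, Tuple

Record = Dict[str, str]


def row_checksum(row: Record) -> str:
    """Hash every column of a row in a stable (sorted) column order."""
    payload = "\x1f".join(f"{name}={row[name]}" for name in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def split_deltas(
    previous: List[Record], current: List[Record], key: str
) -> Tuple[List[Record], List[Record], List[Record]]:
    """Compare two versions of a file and return (new, changed, deleted) rows."""
    prev_sums = {row[key]: row_checksum(row) for row in previous}
    curr_rows = {row[key]: row for row in current}
    # New: key appears only in the current file.
    new = [row for k, row in curr_rows.items() if k not in prev_sums]
    # Changed: key appears in both files but the checksum differs.
    changed = [
        row for k, row in curr_rows.items()
        if k in prev_sums and row_checksum(row) != prev_sums[k]
    ]
    # Deleted: key appears only in the previous file.
    deleted = [row for row in previous if row[key] not in curr_rows]
    return new, changed, deleted


old = [{"id": "1", "name": "Ann"}, {"id": "2", "name": "Bob"}, {"id": "3", "name": "Cy"}]
cur = [{"id": "1", "name": "Ann"}, {"id": "2", "name": "Bobby"}, {"id": "4", "name": "Di"}]
new_rows, changed_rows, deleted_rows = split_deltas(old, cur, key="id")
```

Here the current list plays the role of the single "AsIs" file, and the three returned lists correspond to the separate new, changed, and deleted files.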

Azure Data Warehouse Ingestion: For All Sources
[Diagram, architecture steps 6-9. Stage labels: Standardized Data Store (current file plus deltas: separate new, update, and delete files); Transform & Load; Enterprise Data Store; Multi-file Consolidation to Data Models.
Annotations: 3NF schema, a subject-area-specific integrated data hub with historical tracking; OLAP schema.
Flow shown: Azure Data Lake Store; Azure Data Factory Orchestrator executes a series of stored procedures; Azure SQL Data Warehouse external tables; Azure SQL Data Warehouse 3NF tables; Azure SQL Data Warehouse logging tables; Event Hub send event.
Other components shown: Azure Data Lake; PolyBase; T-SQL; Azure SQL DB or ADW (steps 8 and/or 9).]
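
The orchestrator on this slide executes a series of stored procedures and writes to logging tables. A rough sketch of that pattern follows: run the steps in order, log each outcome, and stop at the first failure. Stopping on failure is an assumption, as are the `run_steps` function, the step names, and the use of plain Python callables and a list in place of stored procedures and a logging table.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], int]]  # (procedure name, callable returning rows affected)


def run_steps(steps: List[Step], log_table: List[Dict[str, object]]) -> bool:
    """Run each step in order, logging success or failure; stop on first failure."""
    for name, procedure in steps:
        try:
            rows = procedure()
        except Exception as exc:
            # Record the failure and skip the remaining steps.
            log_table.append({"step": name, "status": "failure", "detail": str(exc)})
            return False
        log_table.append({"step": name, "status": "success", "detail": rows})
    return True


def load_external() -> int:
    """Hypothetical step: pretend 100 rows were staged from external tables."""
    return 100


def load_3nf() -> int:
    """Hypothetical step that fails, to show the failure path."""
    raise ValueError("bad key")


def load_olap() -> int:
    """Hypothetical step that is never reached in this example."""
    return 50


log_rows: List[Dict[str, object]] = []
ok = run_steps(
    [("load_external", load_external), ("load_3nf", load_3nf), ("load_olap", load_olap)],
    log_rows,
)
```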

Cloud Tools (Tool: Purpose)
1. Azure Logic Apps: SFTP "watcher"; event logging; Blob storage and data lake delete methodologies; notifications; automatic emails; Cosmos DB document upload and deletions
2. Azure Function: Azure Blob Storage "listener"
3. Azure Event Hub: event handling
4. Azure Blob Storage: temporary work space
5. Azure Data Factory: process flow orchestrators; data copy; QA methodologies

Cloud Tools (continued)
6. Databricks: data processing and write to Azure Data Lake; other pre-processing data requirements
7. HDInsight: originally implemented, but replaced with Databricks
8. Azure Data Lake: delta files (change data capture at the file level); current "AsIs" files; data science self-service; Power BI self-service
9. Cosmos DB SQL API: logging; ELT metadata
10. Azure Key Vault: supports Dev/QA/Prod migration

Cloud Tools (continued)
11. Azure SQL Database: ELT metadata
12. Azure SQL Data Warehouse: both Inmon and Kimball data stores (loosely speaking)
13. Azure Analysis Services: tabular semantic layer
14. Power BI: reporting and self-service

Development Tools (Tool: Purpose)
1. Visual Studio Python project: auto-generate the file-level metadata for complete file ingestion to Azure Data Lake
2. Visual Studio Azure Data Warehouse project: Team Foundation Server source code control for Azure Data Warehouses
3. Visual Studio Logic App project: Team Foundation Server or Git source code control for Azure Logic Apps
4. Visual Studio Database project: Team Foundation Server or Git source code control for Azure SQL Databases
5. GitHub: source code control for Azure Data Factory and Databricks

Demo

Have Any Questions?

Additional Resources
Azure Messaging Services
Azure Every Day: http://blog.pragmaticworks.com/choosing-the-right-tools-for-elt-workloads-in-the-cloud
Colleague site: http://www.delorabradish.com
Contact me: MFrench@PragmaticWorks.com

Training Delivery Options
Bootcamps: week-long deep dive
Workshops: one-day training primer
On-Demand Training: web-based subscription training

Power BI Managed Services 2019
User support
Skills and development
Ecosystem management: plan, configure, remediate
Systems monitor: daily validation of your Power BI ecosystem