Cloud Systems of Intelligence design patterns Making it real for Product Engineering Session: BRK3321 Larry Persaud – Principal Lead : AI+R Abhinav Mithal – Senior Manager : AI + R September 26th, 2017
Why are you here? Why should you care? AI-powered organizations are getting ahead, fast! Massive compute, powerful algorithms, DNN in the cloud. Massive amounts of information fueling powerful applications. How to lead and sustain digital transformation.
Business is being transformed by three trends: Cloud, Intelligence, Big Data
Successful Companies Achieve Broad Market Adoption Find It Nail It Scale It Beachhead Early Adopters Broad Market Solution Design Patterns Path toward Market Maker Products
Data Science Solution Workflow
Stages: Data Sources, Ingest, Prepare, Analyze, Serve, Consume
Steps: Understand business goals and questions; discover and gather data; ingest data; understand data; transform data; create model; deploy model; monitor and maintain model; debug, fix, enhance, etc.; respond to changes/lessons; share results with business owners
Also shown: Apps, Services, Data Engines, Collaboration & Version Control, Documentation
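The workflow stages above can be read as a chain of functions, each taking the previous stage's output. The following is a minimal sketch under that assumption; the record layout, the function bodies, and the toy scoring rule are hypothetical and are not taken from the slides.

```python
from functools import reduce
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def ingest(records: List[Record]) -> List[Record]:
    """Ingest: copy raw records from a source (here, an in-memory list)."""
    return [dict(r) for r in records]


def prepare(records: List[Record]) -> List[Record]:
    """Prepare: drop records with a missing reading."""
    return [r for r in records if r.get("reading") is not None]


def analyze(records: List[Record]) -> List[Record]:
    """Analyze: attach a toy score (a placeholder for a real model)."""
    return [{**r, "score": r["reading"] * 2} for r in records]


def serve(records: List[Record]) -> List[Record]:
    """Serve: keep only the fields a consuming app needs."""
    return [{"id": r["id"], "score": r["score"]} for r in records]


def run_pipeline(records: List[Record], stages: List[Stage]) -> List[Record]:
    """Apply each stage in order, feeding its output to the next stage."""
    return reduce(lambda data, stage: stage(data), stages, records)


raw = [{"id": 1, "reading": 3.0}, {"id": 2, "reading": None}]
result = run_pipeline(raw, [ingest, prepare, analyze, serve])
print(result)
```

Keeping each stage as a separate function makes it easy to swap one implementation for another without touching the rest of the chain.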
Building Systems of Intelligence on Azure Information Management Big Data Stores Machine Learning and Analytics Intelligence People Data Sources Azure Storage Data Lake Store Machine Learning Cognitive Services Data Factory SQL Data Warehouse Data Lake Analytics Bot Framework Apps Web Mobile Bots Data Catalog Apps HDInsight (Hadoop and Spark) Event Hubs Cosmos DB Cortana Kafka for HDI Sensors and devices Stream Analytics Dashboards & Visualizations Automated Systems Analysis Services Power BI Intelligence Data Action
Demo: Interactive Voice Response Bot
Starting with Customers
Industry and Use Cases
Manufacturing: Quality assurance; predictive maintenance; fault detection
Finance: Real-time fraud prevention; financial forecasting; financial analysis
Retail: Personalized ad recommendations; computer vision; AI bots
Health Care: Insurance fraud prevention; patient population management; equipment PM, etc.
Design Patterns and Use Cases
Batch Big Data Processing @ Scale: Modern DW (e.g. telemetry reporting, revenue reporting); data prep & integration for data-driven SaaS; data prep for ML scoring (e.g. predictive maintenance, product recommendations)
Stream Processing @ Scale: IoT streaming vehicle/driver insights & reporting; real-time fraud prevention; IoT device monitoring / anomaly detection
Cloud Experimentation and Operationalization: Personalized ad recommendations; fraud detection; computer vision; AI bots
Several Architectures… (too many doing the same thing)
Big Data Processing @ Scale Approach Ingest Store Prepare & Transform Analyze Stage Consume Import/Export Data @ Rest Cosmos DB ADF Blob ADF BI/Exploration Tools AzCopy ADLS HDI Spark HDI Spark SQL DW Application Code/Store Kafka SQL DW ADLA ADLA SQL DB Archive (Blob, ADLS, etc.) Streaming Data IoT Hub ASA Cosmos DB [Batch] VM [Batch] VM AS Event Hubs HDI/Storm SQL DW SQL DW SQL/IaaS Service Bus Event Proc. Host AML SQL Orcas MRS Orch. & Monitor across stages: Data Flow
Another Approach Ingest Store Prepare & Transform Analyze Stage Consume Import/Export Data @ Rest ADF Blob ADF Cosmos DB BI/Exploration Tools AzCopy ADLS HDI Spark HDI Spark SQL DW Application Code/Store Kafka SQL DW ADLA ADLA SQL DB Archive (Blob, ADLS, etc.) Streaming Data IoT Hub ASA Cosmos DB [Batch] VM [Batch] VM AS Event Hubs HDI/Storm SQL DW SQL DW SQL/IaaS Service Bus Event Proc. Host AML SQL Orcas MRS Orch. & Monitor across stages: Data Flow
Yet Another Approach Ingest Store Prepare & Transform Analyze Stage Consume Import/Export Data @ Rest ADF Cosmos DB Blob ADF BI/Exploration Tools AzCopy ADLS HDI Spark HDI Spark SQL DW Application Code/Store Kafka SQL DW ADLA ADLA SQL DB Archive (Blob, ADLS, etc.) Streaming Data IoT Hub ASA Cosmos DB [Batch] VM [Batch] VM AS Event Hubs HDI/Storm SQL DW SQL DW SQL/IaaS SQL Orcas Service Bus Event Proc. Host AML MRS Orch. & Monitor across stages: Data Flow
Mode | Original Transactions | Processing (Spark) | Distribution (to SQLDB – P15)
Stream | 7 million rows | 10 min @ 5 nodes | 4-5 min
Batch @ Scale | 300 million rows | 42 min @ 12 nodes | 2 hr (ADF); 8 hr (direct from Spark)
Batch @ Massive Scale | 3000 million rows | 52 min @ 50 nodes | 20+ hr (ADF)
Note: Dark blue boxes form an optimal path for big data processing at scale.
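To compare the three Spark processing figures above on a common footing, one option is to normalize them to rows per node-minute. This is a derived figure, not one reported on the slide, and it assumes the reported times cover the whole run on the stated node count. The workloads differ (stream versus batch), so the comparison is only indicative.

```python
# (mode, rows processed, minutes, Spark nodes) as reported on the slide
runs = [
    ("Stream", 7_000_000, 10, 5),
    ("Batch @ Scale", 300_000_000, 42, 12),
    ("Batch @ Massive Scale", 3_000_000_000, 52, 50),
]


def rows_per_node_minute(rows: int, minutes: float, nodes: int) -> float:
    """Normalize a run to rows processed per node per minute."""
    return rows / (minutes * nodes)


throughput = {
    mode: rows_per_node_minute(rows, minutes, nodes)
    for mode, rows, minutes, nodes in runs
}

for mode, value in throughput.items():
    print(f"{mode}: {value:,.0f} rows per node-minute")
```

Under these assumptions the stream run works out to 140,000 rows per node-minute, and the two batch runs to roughly 595,000 and 1,154,000.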
Pattern 1: Batch Big Data + ML @ Scale
What is Big Data Processing?
Stage: Processes
Data Ingestion: Connect and collect data from various data sources; capture data from real-time data streams; schedule and monitor data collection
Harmonize the Data: Add structure to raw data; combine data in various formats and structures; create a common data schema for downstream consumption
Featurize the Data: Prepare data aggregates and summaries; create time series lags; create feature vectors
Analyze and Predict: Create ML predictions; create alerts; retrain the ML models
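The "create time series lags" and "create feature vectors" steps above can be illustrated on a small series. This is a minimal sketch in plain Python; the function names and the sample readings are hypothetical, and a production pipeline would more likely do this in Spark or a similar engine.

```python
from typing import List, Optional, Sequence


def make_lags(series: Sequence[float], n_lags: int) -> List[List[Optional[float]]]:
    """Return one feature vector per time step: [value, lag_1, ..., lag_n].

    Lags that would reach before the start of the series are None.
    """
    vectors = []
    for t, value in enumerate(series):
        lags = [series[t - k] if t - k >= 0 else None for k in range(1, n_lags + 1)]
        vectors.append([value] + lags)
    return vectors


def drop_incomplete(vectors: List[List[Optional[float]]]) -> List[List[float]]:
    """Keep only vectors where every lag is available."""
    return [v for v in vectors if None not in v]


readings = [10.0, 12.0, 11.0, 15.0]  # hypothetical sensor readings
features = drop_incomplete(make_lags(readings, n_lags=2))
print(features)
```

Each remaining vector holds the current value followed by the two previous values, which is the shape a model would consume for scoring.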
Big Data Processing @ Scale (Lambda Architecture)
Use Cases: Build a modern DW; data preparation for data-driven SaaS apps; data preparation for ML scoring; prepare and transform any data, on demand or recurring
Azure Services: Data Factory; Event Hubs; Stream Analytics; Data Lake Store / Blob; HDInsight / ADLA / Batch; SQL DW / SQL DB / AS / Cosmos DB
The Pattern
Ingest: move data from any source (on-prem or cloud) reliably at scale
Store: enable large-scale batch processing at low $/GB (vs. traditional DWs), with support for unstructured and/or semi-structured data
Prepare & Transform: normalize and clean data at scale; shape data to the target (e.g. dims & facts)
Analyze: enrich data (via joins, ML model execution, etc.)
Stage: put "consumption-ready" data into the optimal store for the scenario (e.g. DW for BI, OLTP store for an app, etc.)
Default path: Ingest Store Prepare & Transform Analyze Stage Consume Disk Imp/Exp Data @ Rest ADF Blob HDI Spark HDI Spark ADF DW / AS BI/Exploration Tools Streaming Data Event Hubs ASA VM Cosmos DB Application Code/Store Blob Archive (Blob, ADLS, etc.) Orch. & Monitor across stages: ADF
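The slide names the Lambda architecture without spelling it out. As a rough illustration of the general idea, a batch layer recomputes a view from stored data, a speed layer covers events that arrived after the last batch run, and a serving layer merges the two. The event schema, cutoff, and function names below are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str]  # (timestamp, device_id); hypothetical schema


def batch_view(events: Iterable[Event], cutoff: int) -> Counter:
    """Batch layer: recompute counts from all stored events up to the cutoff."""
    return Counter(device for ts, device in events if ts <= cutoff)


def speed_view(events: Iterable[Event], cutoff: int) -> Counter:
    """Speed layer: count only events newer than the last batch run."""
    return Counter(device for ts, device in events if ts > cutoff)


def merge_views(batch: Counter, speed: Counter) -> Dict[str, int]:
    """Serving layer: merge both views to answer queries."""
    return dict(batch + speed)


events = [(1, "a"), (2, "b"), (3, "a"), (4, "a")]
cutoff = 2  # timestamp of the last completed batch run
merged = merge_views(batch_view(events, cutoff), speed_view(events, cutoff))
print(merged)
```

In the pattern above, the batch role would roughly correspond to the Data @ Rest path and the speed role to the Streaming Data path, though the slide does not state that mapping explicitly.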
Pattern 2: Cognitive Search (AI Powered Content Match)
IVR Bot Architecture
Stages: Ingest, Prepare, Analyze, Publish, Consume
Stage annotations: normalize, clean; text analysis; bot logic; incoming calls
Components: Skype Client; Bot Connector; Web App (bot SDK); Bing Speech (speech-to-text); LUIS (Language Understanding Intelligent Service); Azure Search; Azure SQL; Cosmos DB
Data paths: Real Time; Batch ETL
Data labels: product inventory; product matching + synonyms; session state
The diagram numbers its steps 1 to 6.
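The "product matching + synonyms" and "session state" labels above suggest that a caller's transcribed utterance is expanded with synonyms before it is matched against the product inventory, with each turn recorded per caller. The sketch below illustrates that idea with in-memory stand-ins. The synonym table, inventory, and function names are hypothetical, and no Azure Search, LUIS, or Cosmos DB API is called.

```python
from typing import Dict, List, Set

# Hypothetical synonym table and inventory; a real system would load these
# from the search index and the product database.
SYNONYMS: Dict[str, Set[str]] = {"laptop": {"notebook"}, "tv": {"television"}}
INVENTORY: List[str] = ["Contoso notebook 13", "Fabrikam television 40", "Contoso phone"]

session: Dict[str, List[str]] = {}  # stands in for a session-state store


def expand(terms: List[str]) -> Set[str]:
    """Add known synonyms to the set of query terms."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def match_products(utterance: str, inventory: List[str]) -> List[str]:
    """Return products sharing at least one (synonym-expanded) term with the utterance."""
    terms = expand(utterance.lower().split())
    return [p for p in inventory if terms & set(p.lower().split())]


def handle_turn(caller_id: str, utterance: str) -> List[str]:
    """Match one transcribed utterance and record it in the caller's session."""
    matches = match_products(utterance, INVENTORY)
    session.setdefault(caller_id, []).append(utterance)
    return matches


first = handle_turn("caller-1", "I need a laptop")
print(first)
```

Here the utterance is assumed to be already transcribed; in the architecture above that step belongs to the speech-to-text component.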
AI Stack for Intelligent Match and Search
Technologies: Cognitive APIs, AML; Azure Search + Custom Analyzers; Bot Framework + Azure Web Apps; Skype Client
Query/Result Orchestration (Azure Web App): query enrichment (preprocess search query); result enrichment (post-process matching documents)
Flow: 1 Incoming query; 2 Enrich query; 3 Execute query; 4 Matched documents; 5 Enrich/rank results; 6 Final search results
Cognitive APIs (one or more): Custom Recognizer (speech-to-text); LUIS (detect product entities); Azure ML (classification, topics, etc.)
Azure Search: analyzers (custom text processing); keyword match; textual similarity; filter (structured metadata)
Technical Properties: Scalable e2e system indexing 300MM documents at 60 QPS. Pluggable query enrichment and result enrichment techniques to solve cognitive search, match, and rank problems.
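The "pluggable query enrichment and result enrichment" property above can be sketched as an orchestrator that takes two lists of functions, one applied before the search and one applied after it. Everything below is a hypothetical stand-in: the enrichers, the in-memory keyword search, and the ranking rule are illustrative only and do not represent Azure Search or Cognitive Services APIs.

```python
from typing import Callable, List, Sequence

QueryEnricher = Callable[[str], str]
ResultEnricher = Callable[[List[str]], List[str]]

FILLER_WORDS = {"please", "find", "me", "some"}  # hypothetical stop list


def lowercase(query: str) -> str:
    """Query enricher: normalize case."""
    return query.lower()


def strip_filler(query: str) -> str:
    """Query enricher: remove conversational filler words."""
    return " ".join(w for w in query.split() if w not in FILLER_WORDS)


def search(query: str, documents: Sequence[str]) -> List[str]:
    """Stand-in search: return documents containing every query term."""
    terms = set(query.split())
    return [d for d in documents if terms <= set(d.lower().split())]


def shortest_first(results: List[str]) -> List[str]:
    """Result enricher: a toy ranking that prefers shorter documents."""
    return sorted(results, key=len)


def orchestrate(
    query: str,
    documents: Sequence[str],
    query_enrichers: Sequence[QueryEnricher],
    result_enrichers: Sequence[ResultEnricher],
) -> List[str]:
    """Enrich the query, execute it, then enrich and rank the results."""
    for enrich in query_enrichers:
        query = enrich(query)
    results = search(query, documents)
    for enrich in result_enrichers:
        results = enrich(results)
    return results


documents = ["red running shoes for trail", "red shoes", "blue shoes"]
final = orchestrate(
    "Please find RED shoes", documents, [lowercase, strip_filler], [shortest_first]
)
print(final)
```

Because enrichers are plain callables, an entity detector or classifier could be added to either list without changing the orchestrator itself.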
Data and AI
Cloud AI Stack AI Applications Services Frameworks Processing Cognitive Services AML Web Services BOT Framework Inferencing Frameworks Docker Machine Learning Toolkits CNTK Tensorflow ML Server Scikit-Learn Other Libs. Spark, SQL, Other Engines Processing Model & Experimentation Management Spark DSVM AI Batch Training ACS PROSE Data Wrangling Infrastructure EDGE Storage CPUs FPGA COSMOS DB ADLS SQL DB SQL DW BLOB GPUs IOT
Data & AI | Building AI Cloud System of Intelligence APIs Frameworks COGNITIVE SERVICES | BOT FRAMEWORK | AML WEB SERVICE Batch ML Toolkits Interface & Interop Services CNTK Tensorflow ML Server Scikit-Learn Trigger-based Machine Learning Model Management & Experimentation ML Experiment &Train Processing Spark AI Batch DSVM Real-time Storage Data Preparation BLOB SQL DB ADLS COSMOS DB Infrastructure SQL DW OTHER Batch Data Stream Data … Data Ingest
Conclusion / Key Takeaways
Key Takeaways: MSFT recommended architectures; MSFT 1st-party enterprise solutions; MSFT solution journey; AI-powered orgs are reaping benefits in cost reduction, improved quality, and a stronger bottom line.
9/12/2018 10:21 PM © 2017 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Objective: Product Building Guide: “Find, Nail, Scale”; Customer Journey for Data Big and Small; Deep Dive AI Solutions; AI Stack, Data and Product Journey; Conclusion
Pattern Design Processes: Ingest, Transform, Prepare, Analyze, Serve, Consume