ML in Azure Databricks. Mahesh Balija, Cloud Solution Architect. 4/15/2019 1:36 PM. © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Agenda: Azure Databricks; Higgs Boson Challenge; Azure ML Service
Machine Learning Process. Stages: Vectorization, Modelling, Testing, Explainability, Operationalize. Inputs and outputs: Raw Data, Vectors, Model, Accuracy, Trust
Scale-out ML
Databricks ML
Spark ML Concepts
Spark ML Pipelines: Estimator (Fit), Transformer (Transform), ML Pipeline
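The Estimator/Transformer relationship on this slide can be sketched in plain Python. This is an illustrative sketch only, not Spark's actual classes: the names `Scaler`, `ScalerModel`, `Thresholder`, `Pipeline`, and `PipelineModel` are hypothetical, and the data is made up. The idea it shows is that calling fit on an Estimator returns a Transformer, and that a pipeline chains such stages.

```python
# Plain-Python sketch of the Estimator/Transformer contract (not Spark's classes).
from statistics import mean, pstdev


class ScalerModel:
    """Transformer: applies the statistics learned during fit."""

    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma

    def transform(self, rows):
        return [(x - self.mu) / self.sigma for x in rows]


class Scaler:
    """Estimator: fit() learns from data and returns a Transformer."""

    def fit(self, rows):
        # Fall back to 1.0 so constant input does not divide by zero.
        return ScalerModel(mean(rows), pstdev(rows) or 1.0)


class Thresholder:
    """Stateless Transformer: needs no fitting."""

    def __init__(self, cutoff):
        self.cutoff = cutoff

    def transform(self, rows):
        return [1 if x > self.cutoff else 0 for x in rows]


class PipelineModel:
    """Transformer produced by Pipeline.fit(): runs fitted stages in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Estimator that chains stages; fit() returns a PipelineModel."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            # Estimators are fitted first; plain Transformers are used as-is.
            model = stage.fit(rows) if hasattr(stage, "fit") else stage
            rows = model.transform(rows)
            fitted.append(model)
        return PipelineModel(fitted)


train = [1.0, 2.0, 3.0, 4.0, 5.0]
model = Pipeline([Scaler(), Thresholder(cutoff=0.0)]).fit(train)
predictions = model.transform([0.0, 3.0, 6.0])
```

In this sketch, each stage sees the output of the previous stage during fit, so the fitted pipeline can later be applied to new data with a single transform call.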
Data Structures. Scale-Out: Dataset, DataFrame, RDD. Single-Node: Pandas DataFrame, NumPy Arrays
Azure Databricks & Scikit-Learn Source: http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
Azure Machine Learning: Technical Details
Data Science Challenges
Azure ML Service Workspace, Key Artifacts: Models, Images, Experiments, Deployment, Pipelines, Data Stores, Compute Target
AML Compute Targets
Compute Targets
AML Deployment Choices
Deploy Models
Auto-ML
Hyper-Parameter Tuning Process. Step 1: Algorithm Selection. Step 2: Tune the config. Step 3: Test model accuracy. Repeat: Re-process the Feature Vectors
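The manual tuning loop on this slide can be sketched as a small grid search in plain Python. Everything here is an assumption for illustration: the toy 1-D data, the choice of a k-nearest-neighbour classifier as the selected algorithm, and the grid values. Re-processing the feature vectors is not covered by the sketch.

```python
from itertools import product

# Toy 1-D data (hypothetical): the label is 1 when x is large.
train = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
         (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
valid = [(0.8, 0), (1.8, 0), (2.4, 0), (2.9, 1), (3.2, 1), (4.2, 1)]


def knn_predict(x, k, weighted):
    """Step 1 (algorithm selection): a k-nearest-neighbour classifier."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = {0: 0.0, 1: 0.0}
    for nx, label in neighbours:
        # Optionally weight each vote by inverse distance to the query point.
        votes[label] += 1.0 / (abs(nx - x) + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


def accuracy(k, weighted):
    """Step 3: test model accuracy on held-out data."""
    hits = sum(knn_predict(x, k, weighted) == y for x, y in valid)
    return hits / len(valid)


# Step 2: tune the config by trying every combination in the grid.
grid = {"k": [1, 3, 5, 7], "weighted": [False, True]}
results = [
    ({"k": k, "weighted": w}, accuracy(k, w))
    for k, w in product(grid["k"], grid["weighted"])
]
best_config, best_accuracy = max(results, key=lambda r: r[1])
```

The cost of this loop grows with the product of the grid sizes, which is the effort that automated tuning aims to take off the data scientist.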
Hyper-parameter tuning with Auto ML. Phases: Configure, Run, Test. Automated steps: Select Algorithm, Choose Compute Target, Hyper-parameter tuning. Sit back & relax: Best Model Produced! https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train
Model Explainability
Husky vs. Wolf: because of the features highlighted in the figure, the model predicts a husky as a wolf
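One simple way to see which input drives a prediction like this is occlusion: replace one feature at a time with a baseline value and measure how much the score drops. The sketch below is a toy under stated assumptions. The linear "wolf" scorer, its weights, the feature names (including `snow_background`), and the feature values are all hypothetical and are not taken from the slide's figure or from any real model.

```python
# Hypothetical toy classifier: scores "wolf" from three hand-made image features.
WEIGHTS = {"snow_background": 2.5, "pointed_ears": 0.4, "fur_colour": 0.3}
BIAS = -1.5


def wolf_score(features):
    """Linear score; a positive value means the toy model says 'wolf'."""
    return BIAS + sum(WEIGHTS[name] * value for name, value in features.items())


def occlusion_attribution(features, baseline=0.0):
    """Score drop when each feature in turn is replaced by a baseline value."""
    full = wolf_score(features)
    return {
        name: full - wolf_score({**features, name: baseline})
        for name in features
    }


# A husky photographed in snow (feature values are made up for illustration).
husky_in_snow = {"snow_background": 1.0, "pointed_ears": 1.0, "fur_colour": 0.5}
score = wolf_score(husky_in_snow)
attribution = occlusion_attribution(husky_in_snow)
top_feature = max(attribution, key=attribution.get)
```

In this toy setup the largest attribution falls on the background feature rather than on the animal, which is the kind of finding that undermines trust in an otherwise accurate model.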
Azure ML SDK APIs: https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py
Azure Machine Learning Service: Train and Deploy on Cloud or On-Prem. Options shown: Project Brainwave; Cloud-based Databricks (or) Azure Notebooks; Databricks / AML Compute; ACI/AKS; On-Prem (Local); DSVM/DLVM; Jupyter Notebook Server (or) IDEs; Azure IoT Edge
Data Science Project Workflow
Steps: Create Azure ML Service Workspace; Configure compute plane; Setup development environment; Run the experiments; Register the model in Model Registry; Containerize the model; Check-in container image; Deploy into Production
Tools and artifacts: Azure Portal/Python SDK; Azure ML SDK; Model Versioning; Docker container; ACR
Best of Databricks and AML Service
Higgs Boson Dataset
God Particle
Tau Particle: Signal (S) vs. Background (B)
Training Data
Columns: EventId,DER_mass_MMC,DER_mass_transverse_met_lep,DER_mass_vis,DER_pt_h,DER_deltaeta_jet_jet,DER_mass_jet_jet,DER_prodeta_jet_jet,DER_deltar_tau_lep,DER_pt_tot,DER_sum_pt,DER_pt_ratio_lep_tau,DER_met_phi_centrality,DER_lep_eta_centrality,PRI_tau_pt,PRI_tau_eta,PRI_tau_phi,PRI_lep_pt,PRI_lep_eta,PRI_lep_phi,PRI_met,PRI_met_phi,PRI_met_sumet,PRI_jet_num,PRI_jet_leading_pt,PRI_jet_leading_eta,PRI_jet_leading_phi,PRI_jet_subleading_pt,PRI_jet_subleading_eta,PRI_jet_subleading_phi,PRI_jet_all_pt,Weight,Label
Sample row: 100000,138.47,51.655,97.827,27.98,0.91,124.711,2.666,3.064,41.928,197.76,1.582,1.396,0.2,32.638,1.017,0.381,51.626,2.273,-2.414,16.824,-0.277,258.733,2,67.435,2.15,0.444,46.062,1.24,-2.475,113.497,0.00265331133733,s
Sources: https://archive.ics.uci.edu/ml/datasets/HIGGS https://www.kaggle.com/c/higgs-boson/data
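A minimal sketch of how a row in this layout could be split into an event id, a feature vector, a weight, and a numeric label, using only Python's standard `csv` module. The header and sample row are copied from the slide. The helper name `parse_events` is hypothetical, and mapping `s` to 1 (signal) and anything else to 0 (background) is an assumption.

```python
import csv
import io

# Header and first row as shown on the slide.
SAMPLE = (
    "EventId,DER_mass_MMC,DER_mass_transverse_met_lep,DER_mass_vis,DER_pt_h,"
    "DER_deltaeta_jet_jet,DER_mass_jet_jet,DER_prodeta_jet_jet,DER_deltar_tau_lep,"
    "DER_pt_tot,DER_sum_pt,DER_pt_ratio_lep_tau,DER_met_phi_centrality,"
    "DER_lep_eta_centrality,PRI_tau_pt,PRI_tau_eta,PRI_tau_phi,PRI_lep_pt,"
    "PRI_lep_eta,PRI_lep_phi,PRI_met,PRI_met_phi,PRI_met_sumet,PRI_jet_num,"
    "PRI_jet_leading_pt,PRI_jet_leading_eta,PRI_jet_leading_phi,"
    "PRI_jet_subleading_pt,PRI_jet_subleading_eta,PRI_jet_subleading_phi,"
    "PRI_jet_all_pt,Weight,Label\n"
    "100000,138.47,51.655,97.827,27.98,0.91,124.711,2.666,3.064,41.928,197.76,"
    "1.582,1.396,0.2,32.638,1.017,0.381,51.626,2.273,-2.414,16.824,-0.277,"
    "258.733,2,67.435,2.15,0.444,46.062,1.24,-2.475,113.497,0.00265331133733,s\n"
)


def parse_events(text):
    """Yield (event_id, features, weight, label) for each CSV row."""
    for row in csv.DictReader(io.StringIO(text)):
        event_id = int(row.pop("EventId"))
        weight = float(row.pop("Weight"))
        # Assumption: 's' marks signal (1); any other value is background (0).
        label = 1 if row.pop("Label") == "s" else 0
        # Whatever remains after popping the three columns above is a feature.
        features = {name: float(value) for name, value in row.items()}
        yield event_id, features, weight, label


events = list(parse_events(SAMPLE))
event_id, features, weight, label = events[0]
```

Keeping `EventId` and `Weight` out of the feature dictionary avoids accidentally training on an identifier or on the per-event weight.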
It's Demo Time!
Call for Action (Subject: Link)
Azure Databricks: https://azure.microsoft.com/en-us/services/databricks/
Azure ML Service: https://azure.microsoft.com/en-us/services/machine-learning-service/
Azure ML SDK: https://docs.microsoft.com/en-us/python/api/overview/azure/ml/intro?view=azure-ml-py
Azure Citadel: https://azurecitadel.com/data-ai/azure-databricks-workshop/
GitHub Labs: https://github.com/mabalija/MachineLearningNotebooks
AML Service Labs: https://github.com/mabalija/AML-service-labs
We’d love your feedback: Aka.ms/SQLBits19