A platform for the Complete Machine Learning Lifecycle

Slides:



Advertisements
Similar presentations
Introduction to Maven 2.0 An open source build tool for Enterprise Java projects Mahen Goonewardene.
Advertisements

Agenda  Why Azure Resource Manager  What has already been enabled  Questions/Feedback.
Cross Platform Mobile Backend with Mobile Services James
A Brief Overview by Aditya Dutt March 18 th ’ Aditya Inc.
Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar.
Windows Azure Conference 2014 Deploy your Java workloads on Windows Azure.
Contents 1.Introduction, architecture 2.Live demonstration 3.Extensibility.
Workforce Scheduling Release 5.0 for Windows Implementation Overview OWS Development Team.
Build and Deployment Process Understand NCI’s DevOps and continuous integration requirements Understand NCI’s build and distribution requirements.
 Cloud Computing technology basics Platform Evolution Advantages  Microsoft Windows Azure technology basics Windows Azure – A Lap around the platform.
Dato Confidential 1 Danny Bickson Co-Founder. Dato Confidential 2 Successful apps in 2015 must be intelligent Machine learning key to next-gen apps Recommenders.
Azure ML in SSIS An introduction to Azure Machine Learning Through the eyes of an SSIS developer David Söderlund – SolidQ Nordic
Canadian Bioinformatics Workshops
Autodesk Dev Days 2015 The road ahead DevDays 2015
Containers as a Service with Docker to Extend an Open Platform
PROTECT | OPTIMIZE | TRANSFORM
Dockerize OpenEdge Srinivasa Rao Nalla.
OpenLegacy Training Day Four Introduction to Microservices
ITCS-3190.
BigDL Deep Learning Library on HDInsight
Deploy, Manage, and Scale Your Apps with OpsWorks, Elastic Beanstalk, and CodeDeploy Part 1 – Elastic Beanstalk © 2017 Amazon Web Services, Inc. and.
Infrastructure Orchestration to Optimize Testing
Working With Azure Batch AI
Google App Engine Mandeep Singh (37926)
Vidcoding Introduces Scalable Video and TV Encoding in the Cloud at an Affordable Price by Utilizing the Processing Power of Azure Batch MICROSOFT AZURE.
Amazon Storage- S3 and Glacier
Docker Birthday #3.
Overall Architecture and Component Model
Evaluating state of the art in AI
Cloud Data platform (Cloud Application Development & Deployment)
Platform as a Service.
Logo here Module 3 Microsoft Azure Web App. Logo here Module Overview Introduction to App Service Overview of Web Apps Hosting Web Applications in Azure.
Database Testing in Azure Cloud
OpenNebula Offers an Enterprise-Ready, Fully Open Management Solution for Private and Public Clouds – Try It Easily with an Azure Marketplace Sandbox MICROSOFT.
September 11, Ian R Brooks Ph.D.
Using docker containers
High Performance Data Scientist
Open Source Toolkit for Turn-Key AI Cluster (Introduction)
Machine Learning Platform Life-Cycle Management
Principal Product Manager Oracle Data Science Platform
Data science and machine learning at scale, powered by Jupyter
Introduction to Apache
Using JDeveloper.
: Infrastructure for Complete Machine Learning Lifecycle
Module 01 ETICS Overview ETICS Online Tutorials
Azure Enables Mobility, Easy Sync and Share, and Allows Companies to Retain Data Control MINI-CASE STUDY “Azure provides the full stack of technology that.
Overview of big data tools
In this session… Introduce what we’re talking about
Saranya Sriram Developer Evangelist | Microsoft
Using the Microsoft AI Platform for next generation applications
Technical Capabilities
Serverless Architecture in the Cloud
How to Improve Releasing Efficiency via i18N/L10n Test Automation.
Presented by : Chirag Dani & Dhaval Shah
Last.Backend is a Continuous Delivery Platform for Developers and Dev Teams, Allowing Them to Manage and Deploy Applications Easier and Faster MICROSOFT.
CONTINUOUS INTEGRATION –WHY WE DO IT?
Configuration management suite
Agenda Need of Cloud Computing What is Cloud Computing
IST346: Virtualization and Containerization
Node.js Test Automation using Oracle Developer Cloud- Simplified
Server & Tools Business
Enol Fernandez & Giuseppe La Rocca EGI Foundation
DBOS DecisionBrain Optimization Server
A DevOps process for deploying R to production
Deploying machine learning models at scale
Docker for DBAs SQL Saturday 8/17/2019.
SQL Server 2019 Bringing Apache Spark to SQL Server
Visual Data Flows – Azure Data Factory v2
Visual Data Flows – Azure Data Factory v2
Presentation transcript:

A platform for the Complete Machine Learning Lifecycle Corey Zumar March 27th, 2019

Outline Overview of ML development challenges How MLflow tackles these challenges MLflow components Demo How to get started - Quick overview of MLflow. - Some of the concepts were covered in Matei's keynote talk on Machine Leaning and AI.

Machine Learning Development is Complex

ML Lifecycle Data Prep Raw Data Training Deploy Tuning Scale Tuning Delta μ λ θ Tuning Scale Data Prep Model Exchange μ λ θ Tuning Raw Data Training Scale Scale Deploy Governance Scale 0. ML dev process: raw data -> prep (ETL and feature computation) -> training -> deployment aspects to ML dev cycle that makes it complex Various tools. Best tool for each problem. Multiple tools. Comparison to tradition s/w development. Various steps tuning and optimization: play with longer windows for feature, remember learning rate… Scale: use all the data to improve results Larger teams, isolation => model and data format for exchange. Changes in data storage format -> trickle in ETL changes -> …. Large teams, friction. Governance of models and data: various models in testing. Audit an older model – observe data, retune it???

Custom ML Platforms Can we provide similar benefits in an open manner? Facebook FBLearner, Uber Michelangelo, Google TFX Standardize the data prep / training / deploy loop: if you work with the platform, you get these! Limited to a few algorithms or frameworks Tied to one company’s infrastructure Can we provide similar benefits in an open manner? How are the other big players in Machine Learning scaling out? Invested in platform team POSITIVES : simplification by standardization. A set of tools for data prep, training and deploy. NEGATIVE : limited models. Inflexibility of algorithms. Tied to company’s infra. Hard to open it to other users or open source these systems.

Introducing Open machine learning platform Works with any ML library & language Runs the same way anywhere (e.g. any cloud) Designed to be useful for 1 or 1000+ person orgs Mlflow brings about an innovation – as an open ML platform. Work with any ML library/algorithm, language of choice. Not boxed into limited set of tools - Design philosophy: API first, modular. Submit runs in the same way anywhere. Local machine, any type of cloud Reduce complexity and learning curve. Make it easy for 1 person to use it and scalable to 100K of user in an org.

MLflow Components Tracking Projects Models Standard packaging format for reproducible ML runs Folder of code + data files with a “MLproject” description file MLflow Components Tracking Record and query experiments: code, configs, results, …etc Projects Packaging format for reproducible runs on any platform Models General model format that supports diverse deployment tools

Key Concepts in Tracking Parameters: key-value inputs to your code Metrics: numeric values (can update over time) Artifacts: arbitrary files, including data and models Source: training code that ran Version: version of the training code Tags and Notes: any additional information

MLflow Tracking Tracking Server Tracking APIs (REST, Python, Java, R) UI API Tracking APIs (REST, Python, Java, R)

Record and query experiments: code, configs, results, …etc Standard packaging format for reproducible ML runs Folder of code + data files with a “MLproject” description file MLflow Tracking Tracking Record and query experiments: code, configs, results, …etc import mlflow with mlflow.start_run(): mlflow.log_param("layers", layers) mlflow.log_param("alpha", alpha) # train model mlflow.log_metric("mse", model.mse()) mlflow.log_artifact("plot", model.plot(test_df)) mlflow.tensorflow.log_model(model)

MLflow backend stores Entity Store Artifact Repository FileStore (local filesystem) SQLStore (via SQLAlchemy) REST Store Artifact Repository S3 backed store Azure Blob storage Google Cloud storage DBFS artifact repo

Demo Goal: Classify hand-drawn digits Instrument Keras training code with MLflow tracking APIs Run training code as an MLflow Project Deploy an MLflow Model for real-time serving

MLflow Projects Motivation Diverse set of training tools Result: ML code is difficult to productionize. Diverse set of environments

MLflow Projects Project Spec Local Execution Remote Execution Code Config Remote Execution Dependencies Data Project: specify code & dependencies Soundtrack: Kubernetes, or EC2, or Databricks & logos

MLflow Projects Packaging format for reproducible ML runs Any code folder or GitHub repository Optional MLproject file with project configuration Defines dependencies for reproducibility Conda (+ R, Docker, …) dependencies can be specified in MLproject Reproducible in (almost) any environment Execution API for running projects CLI / Python / R / Java Supports local and remote execution

Example MLflow Project my_project/ ├── MLproject │ │ │ │ │ ├── conda.yaml ├── main.py └── model.py ... conda_env: conda.yaml entry_points: main: parameters: training_data: path lambda: {type: float, default: 0.1} command: python main.py {training_data} {lambda} $ mlflow run git://<my_project>

Demo Goal: Classify hand-drawn digits Instrument Keras training code with MLflow tracking APIs Run training code as an MLflow Project Deploy an MLflow Model for real-time serving

MLflow Models Motivation ML Frameworks Inference Code Batch & Stream Scoring Serving Tools

MLflow Models ML Frameworks Inference Code Model Format Flavor 1 Flavor 2 Batch & Stream Scoring Standard for ML models ML Frameworks Serving Tools

MLflow Models Packaging format for ML Models Any directory with MLmodel file Defines dependencies for reproducibility Conda environment can be specified in MLmodel configuration Model creation utilities Save models from any framework in MLflow format Deployment APIs CLI / Python / R / Java

Example MLflow Model mlflow.tensorflow.log_model(...) run_id: 769915006efd4c4bbd662461 time_created: 2018-06-28T12:34 flavors: tensorflow: saved_model_dir: estimator signature_def_key: predict python_function: loader_module: mlflow.tensorflow my_model/ ├── MLmodel │ │ │ │ │ └ estimator/ ├─ saved_model.pb └─ variables/ ... Usable with Tensorflow tools / APIs Usable with any Python tool mlflow.tensorflow.log_model(...)

Model Flavors Example PyTorch Train a model mlflow.pytorch.log_model() predict = mlflow.pyfunc.load_pyfunc(…) predict(input_dataframe) Flavor 1: Pyfunc Model Format Flavor 2: PyTorch model = mlflow.pytorch.load_model(…) with torch.no_grad(): model(input_tensor)

Model Flavors Example predict = mlflow.pyfunc.load_pyfunc(…) predict(input_dataframe)

Demo Goal: Classify hand-drawn digits Instrument Keras training code with MLflow tracking APIs Run training code as an MLflow Project Deploy an MLflow Model for real-time serving

Get started with MLflow pip install mlflow to get started Find docs & examples at mlflow.org tinyurl.com/mlflow-slack

0.9.0 Release MLflow 0.9.0 was released this week! Major features: Tracking server supports SQL via SQLAlchemy Pluggable tracking server backends Docker environments for Projects Custom python models

Ongoing MLflow Roadmap UI scalability improvements (1.0) X-coordinate logging for metrics & batched logging (1.0) Fluent API for Java and Scala (1.0) Packaging projects with build steps (1.0+) Better environment isolation when loading models (1.0) Improved model schemas (1.0) ✔

Thank you!