MLflow: A Platform for the Complete Machine Learning Lifecycle
Corey Zumar, March 27th, 2019
Outline
- Overview of ML development challenges
- How MLflow tackles these challenges
- MLflow components
- Demo
- How to get started
A quick overview of MLflow. Some of the concepts were covered in Matei's keynote talk on Machine Learning and AI.
Machine Learning Development is Complex
ML Lifecycle
[Diagram: Raw Data -> Data Prep -> Training -> Deploy, with tuning, scale, model/data exchange, and governance concerns at every stage]
The ML development process runs from raw data through data prep (ETL and feature computation) to training and deployment, and several aspects of this cycle make it complex. Teams use many different tools, picking the best tool for each problem, in contrast to traditional software development. Each step involves tuning and optimization: experimenting with longer windows for a feature, remembering the right learning rate, and so on. Scale matters, since using all the data improves results. Larger teams need isolation, which requires model and data formats for exchange; a change in the data storage format trickles into ETL changes and beyond, creating friction in large teams. Finally, there is governance of models and data: many models are in testing at once, and auditing an older model means observing its data and possibly retuning it.
Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
- Standardize the data prep / training / deploy loop: if you work with the platform, you get these benefits
- Limited to a few algorithms or frameworks
- Tied to one company's infrastructure
Can we provide similar benefits in an open manner?
How are the other big players in machine learning scaling out? They have invested in platform teams. The positive is simplification by standardization: one set of tools for data prep, training, and deployment. The negatives are a limited set of models, inflexibility in algorithms, and being tied to the company's infrastructure, which makes it hard to open these systems to other users or open-source them.
Introducing MLflow
- Open machine learning platform
- Works with any ML library & language
- Runs the same way anywhere (e.g. any cloud)
- Designed to be useful for 1 or 1000+ person orgs
MLflow's innovation is being an open ML platform: it works with any ML library, algorithm, and language of choice, so you are not boxed into a limited set of tools. The design philosophy is API-first and modular. You submit runs the same way anywhere, from a local machine to any cloud. This reduces complexity and the learning curve: it is easy for one person to use, yet scales to 100K users in an org.
MLflow Components
- Tracking: record and query experiments: code, configs, results, etc.
- Projects: packaging format for reproducible runs on any platform
- Models: general model format that supports diverse deployment tools
Key Concepts in Tracking
- Parameters: key-value inputs to your code
- Metrics: numeric values (can update over time)
- Artifacts: arbitrary files, including data and models
- Source: training code that ran
- Version: version of the training code
- Tags and Notes: any additional information
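A minimal sketch of how these concepts map onto MLflow's fluent logging API (the tag, parameter values, and artifact file name below are placeholders); source and version are captured automatically from the executing code:

    import mlflow

    with mlflow.start_run():
        mlflow.set_tag("team", "digit-classification")   # tags/notes: free-form annotations
        mlflow.log_param("alpha", 0.01)                   # parameters: key-value inputs
        for mse in [0.9, 0.5, 0.3]:
            mlflow.log_metric("mse", mse)                 # metrics: numeric, can be updated over time
        mlflow.log_artifact("confusion_matrix.png")       # artifacts: any output file (must exist locally)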
MLflow Tracking
[Diagram: Tracking APIs (REST, Python, Java, R) send run data to the Tracking Server, which exposes a UI and an API]
MLflow Tracking
Record and query experiments: code, configs, results, etc.

    import mlflow

    with mlflow.start_run():
        mlflow.log_param("layers", layers)
        mlflow.log_param("alpha", alpha)
        # train model
        mlflow.log_metric("mse", model.mse())
        mlflow.log_artifact("plot", model.plot(test_df))
        mlflow.tensorflow.log_model(model)
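The same tracking data can be queried back; a minimal sketch using the Python client (the experiment ID and filter string are illustrative, and the exact search arguments vary across MLflow releases):

    from mlflow.tracking import MlflowClient

    client = MlflowClient()  # talks to the configured tracking URI
    runs = client.search_runs(experiment_ids=["0"], filter_string="metrics.mse < 1.0")
    for run in runs:
        print(run.info.run_id, run.data.params, run.data.metrics)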
MLflow Backend Stores
Entity Store
- FileStore (local filesystem)
- SQLStore (via SQLAlchemy)
- REST Store
Artifact Repository
- S3-backed store
- Azure Blob storage
- Google Cloud storage
- DBFS artifact repo
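For example, a tracking server can pair a SQL entity store with an S3 artifact repository; a hedged sketch (the bucket name and port are placeholders, and flag availability depends on the MLflow version):

    $ mlflow server \
        --backend-store-uri sqlite:///mlflow.db \
        --default-artifact-root s3://my-bucket/mlflow-artifacts \
        --host 0.0.0.0 --port 5000

Clients then point at the server, e.g. with mlflow.set_tracking_uri("http://localhost:5000") or the MLFLOW_TRACKING_URI environment variable.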
Demo
Goal: classify hand-drawn digits
- Instrument Keras training code with MLflow tracking APIs
- Run training code as an MLflow Project
- Deploy an MLflow Model for real-time serving
MLflow Projects: Motivation
- Diverse set of training tools
- Diverse set of environments
- Result: ML code is difficult to productionize
MLflow Projects
[Diagram: a Project Spec bundles code, config, dependencies, and data, and can be executed locally or remotely, e.g. on Kubernetes, EC2, or Databricks]
A project specifies its code & dependencies.
MLflow Projects
Packaging format for reproducible ML runs
- Any code folder or GitHub repository
- Optional MLproject file with project configuration
Defines dependencies for reproducibility
- Conda (+ R, Docker, …) dependencies can be specified in MLproject
- Reproducible in (almost) any environment
Execution API for running projects
- CLI / Python / R / Java
- Supports local and remote execution
Example MLflow Project

    my_project/
    ├── MLproject
    ├── conda.yaml
    ├── main.py
    └── model.py

MLproject file:

    conda_env: conda.yaml
    entry_points:
      main:
        parameters:
          training_data: path
          lambda: {type: float, default: 0.1}
        command: python main.py {training_data} {lambda}

Run it with:

    $ mlflow run git://<my_project>
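Projects can also be launched from the Python execution API; a minimal sketch against the example project above (the data path and lambda value are placeholders):

    import mlflow

    submitted = mlflow.projects.run(
        uri="my_project",                     # local folder or Git URL
        entry_point="main",
        parameters={"training_data": "data.csv", "lambda": 0.2},
    )
    print(submitted.run_id)                   # the tracking run created for this execution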
Demo
Goal: classify hand-drawn digits
- Instrument Keras training code with MLflow tracking APIs
- Run training code as an MLflow Project
- Deploy an MLflow Model for real-time serving
MLflow Models: Motivation
[Diagram: many ML frameworks, each needing custom inference code for every serving tool and batch & stream scoring target]
MLflow Models
[Diagram: a standard model format with multiple flavors sits between ML frameworks and the downstream consumers: inference code, batch & stream scoring, and serving tools]
MLflow Models
Packaging format for ML models
- Any directory with an MLmodel file
Defines dependencies for reproducibility
- Conda environment can be specified in MLmodel configuration
Model creation utilities
- Save models from any framework in MLflow format
Deployment APIs
- CLI / Python / R / Java
Example MLflow Model

    my_model/
    ├── MLmodel
    └── estimator/
        ├── saved_model.pb
        └── variables/

MLmodel file:

    run_id: 769915006efd4c4bbd662461
    time_created: 2018-06-28T12:34
    flavors:
      tensorflow:
        saved_model_dir: estimator
        signature_def_key: predict
      python_function:
        loader_module: mlflow.tensorflow

Created with mlflow.tensorflow.log_model(...). The tensorflow flavor makes the model usable with TensorFlow tools / APIs; the python_function flavor makes it usable with any Python tool.
Model Flavors Example
Train a PyTorch model, then log it:

    mlflow.pytorch.log_model(…)

Flavor 1: pyfunc

    predict = mlflow.pyfunc.load_pyfunc(…)
    predict(input_dataframe)

Flavor 2: PyTorch

    model = mlflow.pytorch.load_model(…)
    with torch.no_grad():
        model(input_tensor)
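A more complete sketch of the two-flavor round trip, assuming a recent MLflow version (the tiny linear model and inputs are placeholders, and exact log_model/load_model signatures differ slightly across releases):

    import mlflow
    import mlflow.pytorch
    import mlflow.pyfunc
    import pandas as pd
    import torch

    model = torch.nn.Linear(4, 1)             # stand-in for a trained PyTorch model

    with mlflow.start_run() as run:
        mlflow.pytorch.log_model(model, "model")

    model_uri = "runs:/{}/model".format(run.info.run_id)

    # Flavor 1: pyfunc -- generic interface, DataFrame in, predictions out
    pyfunc_model = mlflow.pyfunc.load_model(model_uri)
    preds = pyfunc_model.predict(pd.DataFrame([[0.1, 0.2, 0.3, 0.4]], dtype="float32"))

    # Flavor 2: native PyTorch -- returns the torch.nn.Module itself
    torch_model = mlflow.pytorch.load_model(model_uri)
    with torch.no_grad():
        out = torch_model(torch.zeros(1, 4))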
Model Flavors Example

    predict = mlflow.pyfunc.load_pyfunc(…)
    predict(input_dataframe)
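The pyfunc flavor is also what MLflow's deployment tools build on; for instance, in recent MLflow versions the models CLI can serve a logged model over REST (the command name and flags have changed across releases, and the run ID below is a placeholder):

    $ mlflow models serve -m runs:/<run_id>/model -p 5000
    # scoring requests are then POSTed to http://localhost:5000/invocations
    # (the expected JSON payload format depends on the MLflow version)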
Demo
Goal: classify hand-drawn digits
- Instrument Keras training code with MLflow tracking APIs
- Run training code as an MLflow Project
- Deploy an MLflow Model for real-time serving
Get Started with MLflow
- pip install mlflow to get started
- Find docs & examples at mlflow.org
- Join the community Slack: tinyurl.com/mlflow-slack
0.9.0 Release
MLflow 0.9.0 was released this week! Major features:
- Tracking server supports SQL backends via SQLAlchemy
- Pluggable tracking server backends
- Docker environments for Projects
- Custom Python models
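"Custom Python models" wrap arbitrary inference logic in the pyfunc interface; a minimal sketch (the AddN class and artifact path are illustrative, and keyword names vary slightly across releases):

    import mlflow
    import mlflow.pyfunc

    class AddN(mlflow.pyfunc.PythonModel):
        """A trivial custom model that adds a constant to every input value."""
        def __init__(self, n):
            self.n = n

        def predict(self, context, model_input):
            # model_input is typically a pandas DataFrame
            return model_input.apply(lambda column: column + self.n)

    with mlflow.start_run():
        mlflow.pyfunc.log_model(artifact_path="add_n_model", python_model=AddN(5))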
Ongoing MLflow Roadmap
- UI scalability improvements (1.0)
- X-coordinate logging for metrics & batched logging (1.0)
- Fluent API for Java and Scala (1.0)
- Packaging projects with build steps (1.0+)
- Better environment isolation when loading models (1.0)
- Improved model schemas (1.0)
Thank you!