Published by Herbert Schulze. Modified over 6 years ago.
1
Machine Learning Platform Life-Cycle Management
Hope (Xinwei) Wang, Software Engineer at Intuit
2
ABOUT ME
Software Engineer at Intuit (Small Business Group) since Jan. 2017
M.S. in Biomedical Engineering from University of Southern California (USC), Los Angeles (Dec. 2016)
Self-taught, self-motivated programmer
LinkedIn: Hope (Xinwei) Wang
3
ABOUT INTUIT
You may have heard of or used some of Intuit’s products.
We make financial software for consumers, small businesses, and the self-employed. Our mission is to power prosperity around the world. One of our main products at Intuit is QuickBooks: accounting software for small and medium-sized businesses that offers accounting applications, including accepting business payments, managing and paying bills, and payroll functions.
4
OVERVIEW
What is a machine learning platform?
What is the ML platform lifecycle?
Why ML platform lifecycle management?
Artifacts and their associations
Use cases at Intuit
5
What is a Machine Learning Platform?
Manages the entire lifecycle of an ML model
Automates and accelerates the delivery of ML applications
7
ML Platform Life-Cycle
Data Ingestion
Data Discovery
Feature Engineering
Model Development
Model Training
Model Scoring
8
Data Discovery
Metadata/catalog tool
Accessible data source (raw attributes & data lineage)
Data Lake
9
Feature Engineering
Metadata/catalog tool
Output: features (reproducible, reusable)
Feature Repository
10
Model Development
Notebooks
Collaborative environment
Access to the data lake
11
Model Training
Supports the ability for:
Being triggered either manually or via automation
Creation and management of training sets
Re-training
Optimizing hyperparameter tuning by running model training in parallel
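The parallel hyperparameter tuning mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, not the platform's actual implementation: the search space, the `train_and_score` stand-in, and the use of a thread pool are all assumptions. A real platform would more likely fan training runs out to separate processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical search space; names and values are illustrative only.
GRID = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4]}


def train_and_score(params):
    """Stand-in for a real training run: returns a made-up validation score."""
    # Toy objective that peaks at learning_rate=0.1 and prefers deeper models.
    return -abs(params["learning_rate"] - 0.1) + 0.01 * params["depth"]


def parallel_grid_search(grid, max_workers=4):
    """Evaluate every grid combination concurrently and return the best one."""
    keys = sorted(grid)
    candidates = [
        dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(train_and_score, candidates))
    best_score, best_params = max(zip(scores, candidates), key=lambda pair: pair[0])
    return best_params, best_score


best_params, best_score = parallel_grid_search(GRID)
print(best_params, best_score)
```

Because each candidate is trained independently, the same structure applies whether the workers are threads, processes, or containers on a cluster.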
12
Model Scoring
Supports online/offline scoring (depending on the use case)
Can be triggered either manually or via automation
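One way to picture the online/offline distinction is a single scoring function reused by both paths: online scoring answers one request at a time, while offline scoring maps the same function over a batch. The sketch below assumes a toy logistic model; the weights and the feature name are hypothetical.

```python
import math

# Hypothetical model weights; a real platform would load a serialized trained model.
WEIGHTS = {"bias": -1.0, "help_clicks_last_7d": 0.5}


def score_one(features):
    """Online path: score a single request and return a probability."""
    z = WEIGHTS["bias"] + WEIGHTS["help_clicks_last_7d"] * features["help_clicks_last_7d"]
    return 1.0 / (1.0 + math.exp(-z))


def score_batch(rows):
    """Offline path: apply the same scoring logic to a whole batch of rows."""
    return [score_one(row) for row in rows]


online_result = score_one({"help_clicks_last_7d": 2})
offline_results = score_batch(
    [{"help_clicks_last_7d": 0}, {"help_clicks_last_7d": 2}, {"help_clicks_last_7d": 6}]
)
print(online_result, offline_results)
```

Sharing one scoring function between the two paths keeps online and batch predictions consistent for the same input.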
13
Big Mess!
No central artifact management solution
Hard to reuse existing features/data/algorithms/tooling
Inability to scale to large datasets
Lack of automation/orchestration across the ML life-cycle
Lack of rigor/discipline in the ML development life-cycle
Slows down delivery of machine learning applications
14
Ideal Status: Artifact Management
Optimize data scientists’ engineering process
Tie ML components together into a cohesive platform that supports the life-cycle of ML artifacts end-to-end
Increase the efficiency of delivering ML predictions at scale
15
Data Artifacts: features, training sets
Model Artifacts: model code, trained models, performance metrics, hyperparameter values
Environment Artifacts: languages & language versions, packages & package versions
17
CONTAINERIZATION!
Environment in Container
18
Benefits of Containerization
Flexibility: each model has its own specific environment
Consistency: a model behaves the same throughout the life-cycle
19
[Diagram of artifacts and their associations. Labels: Feature Set Definition, Features, Model Artifacts, Data Artifacts, Environment Artifacts, Hyper-parameter values, Model Training Datasets, Scheduled Re-training/performance benchmarks, Metrics Definition, Environment Metadata, Performance Metrics, Trained Models, Container, Model Scoring]
20
Model Code
Developed in notebooks
Multiple versions
Each version is associated with an externalized environment artifact
21
Environment Artifacts
Environment must be consistent across development, training, and scoring
Externalized as metadata
Model/execution environments are constructed from metadata and deployed into containers (Docker, YARN, Conda, etc.)
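As a rough illustration of constructing an environment from externalized metadata, the sketch below renders Dockerfile text from a small metadata dictionary. The metadata schema, the package pins, and the `/model` working directory are assumptions made for this example; the slides do not describe the actual format used at Intuit.

```python
# Hypothetical environment artifact; the schema is an assumption for illustration.
ENV_METADATA = {
    "language": "python",
    "language_version": "3.11",
    "packages": {"numpy": "1.26.4", "scikit-learn": "1.4.2"},
}


def render_dockerfile(env):
    """Turn environment metadata into Dockerfile text with pinned versions."""
    if env["language"] != "python":
        raise ValueError("this sketch only handles Python environments")
    pins = " ".join(
        f"{name}=={version}" for name, version in sorted(env["packages"].items())
    )
    lines = [
        f"FROM python:{env['language_version']}-slim",
        f"RUN pip install --no-cache-dir {pins}",
        "WORKDIR /model",
        "COPY . /model",
    ]
    return "\n".join(lines) + "\n"


dockerfile_text = render_dockerfile(ENV_METADATA)
print(dockerfile_text)
```

Because the image is derived from the metadata alone, the same artifact can rebuild an identical environment for development, training, and scoring.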
22
Examples of containers/virtual environments
Tailored to the environment (built based on externalized environment metadata)
Used for model development, training, and execution
Container/virtual environment: Usage
Docker container: model development, model training, online scoring
YARN container: on Spark cluster, distributed training, batch offline scoring
Conda environment
23
Features
Used as data input to the model
Stored in a discoverable feature repository
Metadata defines the model-specific feature sets
24
Trained Models
Serialized, weighted model files
Associated with a version of the model code and a training set
25
Training Sets
Datasets used to train, validate, and test the model
Associated with a trained model
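The association between a trained model, the model code version, and the training set could be recorded along these lines. This is only a sketch: the record fields and the use of truncated SHA-256 content hashes as version identifiers are assumptions, not something the slides specify.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(data: bytes) -> str:
    """Content hash used here as a stand-in version identifier."""
    return hashlib.sha256(data).hexdigest()[:12]


@dataclass(frozen=True)
class TrainedModelRecord:
    """Links a serialized model to the code version and training set behind it."""

    model_id: str
    code_version: str
    training_set_id: str


def register(model_bytes: bytes, code_bytes: bytes, training_bytes: bytes) -> TrainedModelRecord:
    """Build the association record from the raw bytes of each artifact."""
    return TrainedModelRecord(
        model_id=fingerprint(model_bytes),
        code_version=fingerprint(code_bytes),
        training_set_id=fingerprint(training_bytes),
    )


record = register(b"serialized-weights", b"def predict(x): ...", b"x,y\n1,0\n2,1\n")
same_inputs = register(b"serialized-weights", b"def predict(x): ...", b"x,y\n1,0\n2,1\n")
new_data = register(b"serialized-weights", b"def predict(x): ...", b"x,y\n1,0\n2,1\n3,1\n")
print(record)
```

With content-derived identifiers, any change to the training data yields a new `training_set_id`, so it is always possible to tell which data a given model was trained on.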
26
Feature Set Definitions
Define which feature sets the model requires
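A feature set definition can be thought of as metadata that selects named features from the feature repository. The sketch below uses an in-memory dictionary as a stand-in repository; the feature names, model name, and entity ids are all hypothetical.

```python
# Hypothetical in-memory feature repository: feature name -> values keyed by entity id.
FEATURE_REPOSITORY = {
    "days_since_signup": {"u1": 12, "u2": 340},
    "invoices_last_30d": {"u1": 3, "u2": 41},
    "help_clicks_last_7d": {"u1": 5, "u2": 0},
}

# Hypothetical feature set definition attached to one model as metadata.
FEATURE_SET_DEFINITION = {
    "model": "contextual_help_v1",
    "features": ["help_clicks_last_7d", "days_since_signup"],
}


def build_feature_vector(definition, repository, entity_id):
    """Return the features a model requires, in the order its definition lists them."""
    missing = [name for name in definition["features"] if name not in repository]
    if missing:
        raise KeyError(f"features not found in repository: {missing}")
    return [repository[name][entity_id] for name in definition["features"]]


vector = build_feature_vector(FEATURE_SET_DEFINITION, FEATURE_REPOSITORY, "u1")
print(vector)  # [5, 12]
```

Keeping the definition separate from the model code lets several models reuse the same stored features without duplicating the feature engineering.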
27
Hyperparameter Values
Values set before the learning process begins
Model-specific
28
Metric Definition
Defines the metrics to collect and the thresholds to evaluate models against.
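A metric definition of this kind might be checked as in the sketch below. The definition format (metric name mapped to a minimum acceptable value) and the threshold numbers are assumptions chosen for illustration.

```python
# Hypothetical metric definition: metric name -> minimum acceptable value.
METRIC_DEFINITION = {"precision": 0.80, "recall": 0.70}


def evaluate_against_definition(definition, measured):
    """Return (passed, failures); failures maps metric -> (measured, threshold)."""
    failures = {}
    for name, threshold in definition.items():
        value = measured.get(name)
        # A metric that was never collected counts as a failure.
        if value is None or value < threshold:
            failures[name] = (value, threshold)
    return (not failures, failures)


passed, failures = evaluate_against_definition(
    METRIC_DEFINITION, {"precision": 0.85, "recall": 0.64}
)
print(passed, failures)
```

In this example the model clears the precision threshold but misses the recall threshold, so it would not pass the definition.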
29
Performance Metrics
Metrics to evaluate model effectiveness
Model metrics include: ROC curve, confusion matrix, F1 score, precision, recall, etc.
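For a binary classifier, several of the metrics listed here follow directly from the confusion matrix. The sketch below computes them in plain Python on a small made-up set of labels; in practice a library such as scikit-learn would normally be used.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Derive precision, recall, and F1 from the confusion-matrix counts."""
    tp, fp, fn, _ = confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_matrix(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(counts, precision, recall, f1)  # (3, 1, 1, 3) 0.75 0.75 0.75
```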
30
Scheduled Re-training & Performance Benchmarks
Automate the re-training and deployment of updated models
Model-specific
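A model-specific re-training policy could combine a schedule with a performance benchmark, roughly as sketched below. The policy fields, the 30-day interval, and the F1 threshold are hypothetical values chosen for this example.

```python
from datetime import datetime, timedelta

# Hypothetical per-model policy: retrain on a schedule or when F1 drops too low.
RETRAIN_POLICY = {"max_model_age": timedelta(days=30), "min_f1": 0.75}


def retrain_reasons(policy, trained_at, current_f1, now):
    """Return the list of reasons to retrain; an empty list means no action."""
    reasons = []
    if now - trained_at > policy["max_model_age"]:
        reasons.append("model older than scheduled interval")
    if current_f1 < policy["min_f1"]:
        reasons.append("performance below benchmark")
    return reasons


now = datetime(2024, 3, 1)
stale = retrain_reasons(RETRAIN_POLICY, datetime(2024, 1, 15), 0.80, now)
degraded = retrain_reasons(RETRAIN_POLICY, datetime(2024, 2, 20), 0.60, now)
healthy = retrain_reasons(RETRAIN_POLICY, datetime(2024, 2, 20), 0.80, now)
print(stale, degraded, healthy)
```

An orchestrator could run a check like this on a schedule and trigger the training pipeline whenever the returned list is non-empty.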
31
Model Training in Docker Container
[Diagram labels: Docker Base Image, Docker file, Environment Artifact, Model Docker Image, Host Server, Feature Discovery & Model Development, Model Docker Container, Trained Model, Model Code, Training Datasets]
33
Model Development & Training & Tuning
34
Self-help Service in QuickBooks
[Diagram labels: Get Contextual Help, Personalization Service, Clickstream Data, Feature Service, Score Data, Online Scoring Service, Batch Source, Kafka, Spark Streaming, Online Store, Online Features, Offline Features, Lake, Feature Repository, Offline Spark Job]
35
Example: Online Scoring Service Deployment Diagram
[Diagram components: Auto-Deployment Tool, Training Environment, Model Repository, Code Repository, Deploy Template, Trained Model, Serialized Model, Model-Specific Deploy Code/Metadata, Online Scoring Worker Node, Docker Container, Environment Metadata, Model-Specific Docker Image, Docker File, Docker Build, Docker Registry, Docker Base Image. Actions: Merge, Save, Wrap, Deploy, Build]
43
Thank you
Hope (Xinwei) Wang
LinkedIn: