Intelligent Services Serving Machine Learning

Presentation transcript:

1 Intelligent Services Serving Machine Learning
Joseph E. Gonzalez, Assistant Professor, UC Berkeley; Co-Founder, Dato Inc.

2 Contemporary Learning Systems
Big Data → Training → Models

3 Contemporary Learning Systems
Dato Create, MLlib, BIDMach, Oryx 2, VW (Vowpal Wabbit), LIBSVM, MLC

4 What happens after we train a model?
Data → Training → Model. The model then drives actions, conference papers, dashboards, and reports.

6 Suggesting Items at Checkout
Fraud detection, cognitive assistance, the Internet of Things: these applications must be low-latency, personalized, and rapidly changing.

7 Data → Train → Model

8 Data → Train → Model → Serve → Actions → Adapt (back to Data)

9 Machine Learning → Intelligent Services

10 The Life of a Query in an Intelligent Service
A request ("items like x") arrives at the web serving tier and is passed to the intelligent service, which performs a model lookup (model info: μ, σ, ρ, α, β, …), feature lookups (user data, product info), and a top-k query. The top items are returned and rendered into the new page, and user feedback (the preferred item) flows back to the service along with the content request.
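To make the query lifecycle concrete, here is a minimal sketch of how such a handler might be structured. It is illustrative only, not the AMPLab or Dato implementation; the model store, feature store, feedback log, and all method names are hypothetical stand-ins.

```python
import time

def handle_query(request, model_store, feature_store, k=10):
    """Hypothetical handler for an 'items like x' request (names are illustrative)."""
    # 1. Model lookup: fetch the model (and its metadata) for this application.
    model = model_store.lookup(request.model_id)

    # 2. Feature lookup: user data and product info the model needs.
    user = feature_store.get_user(request.user_id)

    # 3. Top-k query: score candidate items and keep the best k.
    scores = {c: model.score(user, feature_store.get_item(c))
              for c in feature_store.candidates(request.item_id)}
    top_items = sorted(scores, key=scores.get, reverse=True)[:k]
    return top_items  # rendered into the new page by the web serving tier

def record_feedback(feedback_log, request, preferred_item):
    # 4. Feedback (the preferred item) flows back to adapt the model later.
    feedback_log.append((time.time(), request.user_id, preferred_item))
```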

11 Essential Attributes of Intelligent Services
Responsive: intelligent applications are interactive.
Adaptive: ML models are out-of-date the moment learning is done.
Manageable: many models are created by many people.

12 Responsive: Now and Always
Compute predictions in < 20 ms for complex models and queries (top-k queries, feature lookups such as SELECT * FROM users JOIN items, click_logs, pages WHERE …), under heavy query load and in the presence of system failures.

13 Experiment: End-to-end Latency in Spark MLlib
Pipeline: HTTP request → feature transformation → evaluate model → encode prediction to JSON → HTTP response.

14 End-to-end Latency for Digits Classification
784-dimensional input, served using MLlib and Dato; latency measured in milliseconds over 1,000 queries.
NOP: Avg = 5.5, P99 = 20.6
Single Logistic Regression: Avg = 21.8, P99 = 38.6
Decision Tree: Avg = 22.4, P99 = 63.8
One-vs-all LR (10-class): Avg = 137.7, P99 = 217.7
100-Tree Random Forest: Avg = 50.5, P99 = 73.4
500-Tree Random Forest: Avg = 172.6, P99 = 268.7
AlexNet CNN: Avg = 418.7, P99 = 549.8
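Average and 99th-percentile latencies like those above can be gathered with a simple timing harness such as the sketch below. This is not the benchmark code used in the talk; the `predict` callable is a hypothetical stand-in for the full HTTP round trip (request, feature transformation, model evaluation, JSON encoding, response).

```python
import time
import statistics

def benchmark(predict, queries, n=1000):
    """Measure average and P99 end-to-end latency (in ms) over n queries."""
    latencies = []
    for q in queries[:n]:
        start = time.perf_counter()
        predict(q)  # full round trip: request -> features -> model -> response
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    p99 = latencies[int(0.99 * len(latencies)) - 1]
    return statistics.mean(latencies), p99

# Hypothetical usage: avg_ms, p99_ms = benchmark(model_service.predict, digit_queries)
```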

15

16 Adaptive to Change at All Scales
Two axes: granularity of data (population ↔ session) and rate of change (months ↔ minutes). Customer interests evolve within a session (e.g., shopping for Mom vs. shopping for me), so we need to collect the right data to make the best predictions.

17 Adaptive to Change at All Scales
At the population level, the law of large numbers means statistics change slowly, over months (new users and products, evolving interests and trends), so we can rely on efficient offline retraining with high-throughput systems.

18 Adaptive to Change at All Scales
At the session level, data is small and rapidly changing (minutes): low latency and online learning are required, and the system is sensitive to feedback bias.

19 The Feedback Loop: an Opportunity for Bandit Algorithms
I once looked at cameras on Amazon; ever since, my Amazon homepage shows similar cameras and accessories. This feedback loop is an opportunity for bandit algorithms, but bandits present new challenges: their computational overhead complicates caching and indexing.

20 Exploration / Exploitation Tradeoff
Systems that take actions can adversely bias future data: an opportunity for bandits! But bandits present new challenges: they complicate caching and indexing, tuning, and counterfactual reasoning.
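For concreteness, here is a minimal epsilon-greedy bandit sketch for choosing which item to show. The talk does not prescribe a specific algorithm, so this is purely illustrative and ignores the caching, indexing, and counterfactual issues noted above.

```python
import random
from collections import defaultdict

class EpsilonGreedy:
    """Minimal epsilon-greedy bandit over a fixed set of items (illustrative only)."""
    def __init__(self, items, epsilon=0.1):
        self.items = list(items)
        self.epsilon = epsilon
        self.counts = defaultdict(int)     # times each item was shown
        self.rewards = defaultdict(float)  # cumulative reward (e.g., clicks)

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the current best estimate.
        if random.random() < self.epsilon:
            return random.choice(self.items)
        return max(self.items, key=lambda i:
                   self.rewards[i] / self.counts[i] if self.counts[i] else float("inf"))

    def update(self, item, reward):
        # Record observed feedback so future data is not purely self-confirming.
        self.counts[item] += 1
        self.rewards[item] += reward
```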

21 Management: Collaborative Development
Teams of data scientists working on similar tasks produce "competing" features and models, with complex model dependencies: a cat photo feeds a cat classifier (isCat) and an animal classifier (isAnimal), whose outputs feed a cuteness predictor (Cute!).
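A toy sketch of such a dependency chain (all names hypothetical): the downstream cuteness predictor consumes the outputs of two upstream models, so retraining or replacing either one silently changes its inputs.

```python
class CutenessPredictor:
    """Toy example of a model that depends on the outputs of other teams' models."""
    def __init__(self, cat_classifier, animal_classifier, cuteness_model):
        # Upstream models are dependencies: changing either one changes our inputs.
        self.cat_classifier = cat_classifier        # maintained by team A
        self.animal_classifier = animal_classifier  # maintained by team B
        self.cuteness_model = cuteness_model

    def predict(self, photo_features):
        is_cat = self.cat_classifier.predict(photo_features)
        is_animal = self.animal_classifier.predict(photo_features)
        return self.cuteness_model.predict([is_cat, is_animal])
```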

22 Predictive Services
UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.

23 Active Research Project
Predictive Services is an active research project in the UC Berkeley AMPLab.

24 Velox Model Serving System
[CIDR’15, LearningSys’15] Focuses on the multi-task learning (MTL) domain: spam classification, content recommendation scoring, and localized anomaly detection (e.g., per session: Session 1, Session 2, …).

25 Velox Model Serving System
[CIDR’15, LearningSys’15] Personalized models (multi-task learning): a "separate" model mapping input to output for each user/context.

26 Velox Model Serving System
[CIDR’15, LearningSys’15] Personalized models (multi-task learning): the model is split into a shared feature model and a per-user personalization model.

27 Hybrid Offline + Online Learning
The model is split into a shared feature model and a per-user personalization model.
Update the feature functions offline using batch solvers: leverage high-throughput systems (Apache Spark) and exploit the slow change in population statistics.
Update the user weights online: simple to train, a more robust model, and it addresses rapidly changing user statistics.
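A minimal sketch of the split, assuming (as in the Velox papers) a prediction of the form w_u · f(x; θ): the shared feature function f is retrained offline in batch, while each user's weight vector w_u gets cheap online updates. The code is illustrative, not the Velox implementation, and the feature function below is a made-up example.

```python
import numpy as np

def predict(x, feature_fn, user_weights):
    """Personalized prediction: shared feature model + per-user linear weights."""
    return float(user_weights @ feature_fn(x))

def online_update(x, y, feature_fn, user_weights, lr=0.05):
    """Cheap online step: adjust only the user's weights; the feature function stays
    fixed until the next offline batch retrain (e.g., in Apache Spark)."""
    f = feature_fn(x)
    err = user_weights @ f - y          # squared-loss gradient for a linear model
    return user_weights - lr * err * f

# Hypothetical usage with a 3-dimensional feature function:
feature_fn = lambda x: np.array([x, x ** 2, 1.0])
w_u = np.zeros(3)
w_u = online_update(2.0, 5.0, feature_fn, w_u)
```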

28 Hybrid Online + Offline Learning Results
Plots comparing full offline retraining with the hybrid approach as user preferences change: similar test error, substantially faster training for the hybrid approach.

29 Evaluating the Model
Split the input and cache the feature evaluation.

30 Evaluating the Model: Caching Feature Evaluation
Split the input and cache feature evaluations across users; use approximate feature hashing and anytime feature evaluation.

31 Feature Caching
For a new input: hash the input, compute the feature, and store the result in the feature hash table.

32 LSH Cache Coarsening
Hash a new input with an LSH hash function: a false cache collision in the feature hash table would make us use the wrong value!

33 LSH Cache Coarsening
Locality-sensitive caching: hash the new input with the LSH function and, on a false cache collision, use the cached (nearby) value anyway.
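A minimal sketch of locality-sensitive caching using random-hyperplane (SimHash) buckets: nearby inputs map to the same coarse key, so a "false" collision returns the cached feature value of a similar input instead of recomputing. This illustrates the idea only, not the system's implementation; coarseness is controlled by the number of hyperplanes.

```python
import numpy as np

class LSHFeatureCache:
    """Cache expensive feature evaluations, keyed by a coarse SimHash of the input."""
    def __init__(self, feature_fn, dim, n_planes=16, seed=0):
        self.feature_fn = feature_fn
        self.planes = np.random.default_rng(seed).normal(size=(n_planes, dim))
        self.table = {}  # coarse hash key -> cached feature value

    def _key(self, x):
        # Fewer hyperplanes => coarser buckets => more (possibly false) collisions.
        bits = (self.planes @ x) > 0
        return bits.tobytes()

    def get(self, x):
        key = self._key(x)
        if key not in self.table:          # cache miss: compute and store
            self.table[key] = self.feature_fn(x)
        return self.table[key]             # on a collision, reuse a nearby input's value
```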

34 Anytime Predictions
Compute features asynchronously; if a particular feature does not arrive in time, use an estimator instead. The system is always able to render a prediction by the latency deadline (cf. Emma Strubell and Andrew McCallum, ACL 2015).
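A minimal sketch of anytime feature evaluation using Python's standard concurrent.futures: features are computed asynchronously, and any feature that has not arrived by the shared deadline is replaced by a precomputed estimate (e.g., its expected value), so a prediction can always be rendered on time. The names and the fallback strategy are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# A long-lived pool, as a serving system would keep between requests.
POOL = ThreadPoolExecutor(max_workers=8)

def anytime_features(feature_fns, x, fallbacks, deadline_s=0.020):
    """Compute features asynchronously; substitute an estimate for any feature
    that misses the shared deadline, so a prediction is always rendered on time."""
    start = time.monotonic()
    futures = {name: POOL.submit(fn, x) for name, fn in feature_fns.items()}
    values = {}
    for name, fut in futures.items():
        remaining = max(0.0, deadline_s - (time.monotonic() - start))
        try:
            values[name] = fut.result(timeout=remaining)
        except TimeoutError:
            values[name] = fallbacks[name]  # estimator used instead of the late feature
    return values
```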

35 Coarsening + Anytime Predictions
Plot: prediction quality as the feature hash is coarsened (from no coarsening to overly coarsened) and as more features are computed by the deadline (versus approximating the remainder with their expectation); the best results lie between the extremes. Check out our poster!

36 Part of Berkeley Data Analytics Stack
Training components (Spark Streaming, Spark SQL, BlinkDB, GraphX, MLbase / ML library) and management + serving components (Velox: Model Manager and Prediction Service), running on Spark, Mesos, Tachyon, and HDFS, S3, …

37 Predictive Services
UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.

38 Dato Predictive Services
Production-ready model serving and management system.
Elastic scaling and load balancing of docker.io containers.
AWS CloudWatch metrics and reporting.
Serves Dato Create models, scikit-learn, and custom Python.
Distributed shared caching: scale out to address latency.
REST management API. Demo?

39 Responsive, Adaptive, Manageable
Predictive Services, UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.
Key insights: caching, bandits, and management; online/offline learning; latency vs. accuracy.

40 The Future of Learning Systems
Data → Train → Model → Serve → Actions → Adapt (closing the loop back to Data).

41 Thank You
Joseph E. Gonzalez, joseph@dato.com; Co-Founder @ Dato; Assistant Professor, UC Berkeley.

