Download presentation
Presentation is loading. Please wait.
1
Intelligent Services Serving Machine Learning
Joseph E. Gonzalez Assistant UC Berkeley Dato Inc.
2
Contemporary Learning Systems
Training Data Models Big Connect to political bias story ----- Meeting Notes (2/3/15 11:29) ----- Example of Data and Model use amazon model
3
Contemporary Learning Systems
Create MLlib BIDMach Oryx 2 VW LIBSVM MLC
4
Data Model What happens after we train a model? Training Drive Actions
Connect to political bias story ----- Meeting Notes (2/3/15 11:29) ----- Example of Data and Model use amazon model Conference Papers Dashboards and Reports
5
Data Model What happens after we train a model? Training Drive Actions
Connect to political bias story ----- Meeting Notes (2/3/15 11:29) ----- Example of Data and Model use amazon model Conference Papers Dashboards and Reports
6
Suggesting Items at Checkout
Fraud Detection Cognitive Assistance Internet of Things Low-Latency Personalized Rapidly Changing
7
Data Model Train Connect to political bias story
----- Meeting Notes (2/3/15 11:29) ----- Example of Data and Model use amazon model
8
Actions Data Model Adapt Serve Train Connect to political bias story
----- Meeting Notes (2/3/15 11:29) ----- Example of Data and Model use amazon model
9
Machine Learning Intelligent Services
10
The Life of a Query in an Intelligent Service
Lookup Model Model Info Request: Items like x Top-K Query μ σ ρ ∑ ∫ α β math Feature Lookup Intelligent Service User Data Top Items New Page Images … Web Serving Tier Feature Lookup User Product Info Feedback Feedback: Preferred Item Content Request
11
Essential Attributes of Intelligent Services
Responsive Intelligent applications are interactive Adaptive ML models out-of-date the moment learning is done Manageable Many models created by multiple people
12
Responsive: Now and Always
Compute predictions in < 20ms for complex Models Queries Top K Features SELECT * FROM users JOIN items, click_logs, pages WHERE … under heavy query load with system failures.
13
Experiment: End-to-end Latency in Spark MLlib
Evaluate Model To JSON HTTP Req. Feature Trans. 4 HTTP Response Encode Prediction
14
Latency measured in milliseconds
End-to-end Latency for Digits Classification 784 dimension input Served using MLlib and Dato Inc. NOP (Avg = 5.5, P99 = 20.6) Single Logistic Regression (Avg = 21.8, P99 = 38.6) 100 Tree Random Forrest (Avg = 50.5, P99 = 73.4) Count out of 1000 500 Tree Random Forrest (Avg = , P99 = 268.7) 500 Tree Random Forrest (Avg = 172.6, P99 = 268.7) Decision Tree (Avg = 22.4, P99 = 63.8) One-vs-all LR (10-class) (Avg = 137.7, P99 = 217.7) AlexNet CNN (Avg = 418.7, P99 = 549.8) Latency measured in milliseconds
16
Adaptive to Change at All Scales
Population Granularity of Data Session Shopping for Mom Shopping for Me : New features are Customer interests evolve within sessions Need to collect the right data to make the best predictions. Make clear the modeling assumptions here. Think about change in statistics Months Rate of Change Minutes
17
Adaptive to Change at All Scales
Population Granularity of Data Session Population Law of Large Numbers Change Slow Rely on efficient offline retraining High-throughput Systems Shopping for Mom Shopping for Me : New features are Customer interests evolve within sessions Need to collect the right data to make the best predictions. Months Rate of Change Minutes Months New users and products Evolving interests and trends
18
Adaptive to Change at All Scales
Population Granularity of Data Session Small Data Rapidly Changing Low Latency Online Learning Sensitive to feedback bias Shopping for Mom Shopping for Me : New features are Customer interests evolve within sessions Need to collect the right data to make the best predictions. Months Rate of Change Minutes
19
The Feedback Loop Opportunity for Bandit Algorithms
My Amazon Homepage The Feedback Loop I once looked at cameras on Amazon … Similar cameras and accessories Opportunity for Bandit Algorithms Bandits present new challenges: computation overhead complicates caching + indexing
20
Exploration / Exploitation Tradeoff
Systems that can take actions can adversely bias future data. Opportunity for Bandits! Bandits present new challenges: Complicates caching + indexing tuning + counterfactual reasoning
21
Management: Collaborative Development
Teams of data-scientists working on similar tasks “competing” features and models Complex model dependencies: Cat Photo Cat Classifier Animal Classifier isCat isAnimal Cuteness Predictor Cute!
22
Daniel Crankshaw, Xin Wang, Joseph Gonzalez
Predictive Services UC Berkeley AMPLab Daniel Crankshaw, Xin Wang, Joseph Gonzalez Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan
23
Active Research Project
Predictive Services UC Berkeley AMPLab Daniel Crankshaw, Xin Wang, Joseph Gonzalez Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan Active Research Project
24
Velox Model Serving System
[CIDR’15, LearningSys’15] Focuses on the multi-task learning (MTL) domain Spam Classification Content Rec. Scoring Session 1: Session 2: Localized Anomaly Detection
25
Velox Model Serving System
[CIDR’15, LearningSys’15] Personalized Models (Mulit-task Learning) Input Output “Separate” model for each user/context.
26
Velox Model Serving System
[CIDR’15, LearningSys’15] Personalized Models (Mulit-task Learning) Split Feature Model Personalization Model
27
Hybrid Offline + Online Learning
Update feature functions offline using batch solvers Leverage high-throughput systems (Apache Spark) Exploit slow change in population statistics Split Feature Model Personalization Model Update the user weights online: Simple to train + more robust model Address rapidly changing user statistics
28
Hybrid Online + Offline Learning Results
Similar Test Error Substantially Faster Training Hybrid Offline Full Offline Full Hybrid User Pref. Change
29
Evaluating the Model Cache Feature Evaluation Split Input
30
Evaluating the Model Cache Feature Evaluation
Split Feature Caching Across Users Input Approximate Feature Hashing Anytime Feature Evaluation
31
Feature Caching Feature Hash Table New input: Compute feature:
Hash input: Store result in table Feature Hash Table Alex notes; use a cover tree….
32
LSH Cache Coarsening Feature Hash Table Use Wrong Value!
New input Use Wrong Value! Hash new input: False cache collision LSH hash fn. Feature Hash Table Alex notes; use a cover tree….
33
LSH Cache Coarsening Feature Hash Table Locality-Sensitive Hashing:
Hash new input: False cache collision Use Value Anyways! Req. LSH Locality-Sensitive Caching: Alex notes; use a cover tree….
34
Always able to render a prediction by the latency deadline
Anytime Predictions Compute features asynchronously: if a particular element does not arrive use estimator instead Always able to render a prediction by the latency deadline __ __ __ Andrew McCallum Emma Streubel – “ACL 2015”
35
Coarsening + Anytime Predictions
Coarser Hash Overly Coarsened No Coarsening More Features Approx. Expectation Best Better Alex idea: Hashing with some idea of the frequency of the key make more space Map keys to different spaces to depending on key frequency Improve physical cache locality by building different hash spaces. Prefix code 000001${h(x)} truncate to bits Make zeros proportional to frequency of key Overlay with deadline misses and consider making deadline sooner to see impact on rightside of plot Checkout our poster!
36
Part of Berkeley Data Analytics Stack
Training Management + Serving Spark Streaming BlinkDB Graph X MLbase Velox Spark SQL ML library Model Manager Prediction Service Spark Mesos HDFS, S3, … Tachyon
37
Daniel Crankshaw, Xin Wang,
Predictive Services UC Berkeley AMPLab Daniel Crankshaw, Xin Wang, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan
38
Dato Predictive Services
Production ready model serving and management system Elastic scaling and load balancing of docker.io containers AWS Cloudwatch Metrics and Reporting Serves Dato Create models, scikit-learn, and custom python Distributed shared caching: scale-out to address latency REST management API: Demo?
39
Responsive Adaptive Manageable Caching, Bandits, & Management
Predictive Services UC Berkeley AMPLab Daniel Crankshaw, Xin Wang, Joseph Gonzalez Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan Responsive Adaptive Manageable Key Insights: Caching, Bandits, & Management Online/Offline Learning Latency vs. Accuracy
40
Actions Data Model Future of Learning Systems Adapt Serve Train
Connect to political bias story ----- Meeting Notes (2/3/15 11:29) ----- Example of Data and Model use amazon model Train Data Model
41
Joseph E. Gonzalez Thank You joseph@dato.com, Co-Founder @ Dato
Assistant UC Berkeley Dato
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.