Intelligent Services: Serving Machine Learning
Joseph E. Gonzalez
jegonzal@cs.berkeley.edu; Assistant Professor @ UC Berkeley
joseph@dato.com; Co-Founder @ Dato Inc.
Contemporary Learning Systems: Big Training Data → Models
Contemporary Learning Systems: MLlib, Create, MLC, LIBSVM, VW, Oryx 2, BIDMach
Training Data → Model. What happens after we train a model? Dashboards and reports, conference papers, drive actions.
Suggesting Items at Checkout, Fraud Detection, Cognitive Assistance, Internet of Things: low-latency, personalized, rapidly changing.
Data → Train → Model
Data → Train → Model → Serve → Actions, with Adapt feeding back into training.
Machine Learning → Intelligent Services
The Life of a Query in an Intelligent Service: a content request ("items like x") arrives at the web serving tier, which issues a top-K query to the intelligent service. The service looks up user data, product info, and model info, performs feature lookups, evaluates the model, and returns the top items for the new page (images, etc.). Feedback (the preferred item) flows back into the service.
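A minimal sketch of this request path, with toy in-memory dictionaries standing in for the user-data, item, and model lookups (all names here are hypothetical, not part of any real serving system):

```python
import heapq

# Toy stores standing in for the user-data and product-info lookups.
USER_FEATURES = {"u1": [1.0, 0.0]}
ITEM_FEATURES = {"a": [0.9, 0.1], "b": [0.1, 0.9], "c": [0.5, 0.5]}

def top_k(user, k):
    """Serve "items like x": look up the user's features, score every
    candidate item with a simple dot product, and return the best k."""
    u = USER_FEATURES[user]                       # user-data lookup
    scored = ((sum(ui * ii for ui, ii in zip(u, item)), name)
              for name, item in ITEM_FEATURES.items())
    return [name for _, name in heapq.nlargest(k, scored)]

print(top_k("u1", 2))
```

Real systems replace the dot product with full model evaluation and the dictionaries with feature stores, but the lookup → score → top-K shape is the same.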
Essential Attributes of Intelligent Services. Responsive: intelligent applications are interactive. Adaptive: ML models are out of date the moment learning is done. Manageable: many models created by multiple people.
Responsive: Now and Always. Compute predictions in < 20 ms for complex models and queries (top-K, features, SELECT * FROM users JOIN items, click_logs, pages WHERE …), under heavy query load, even with system failures.
Experiment: End-to-End Latency in Spark MLlib. Pipeline: HTTP Request → Feature Transformation → Evaluate Model → Encode Prediction to JSON → HTTP Response.
End-to-End Latency for Digits Classification (784-dimension input; latency in milliseconds over 1000 queries; served using MLlib and Dato Inc.):
- NOP: Avg = 5.5, P99 = 20.6
- Single Logistic Regression: Avg = 21.8, P99 = 38.6
- Decision Tree: Avg = 22.4, P99 = 63.8
- One-vs-all LR (10-class): Avg = 137.7, P99 = 217.7
- 100 Tree Random Forest: Avg = 50.5, P99 = 73.4
- 500 Tree Random Forest: Avg = 172.6, P99 = 268.7
- AlexNet CNN: Avg = 418.7, P99 = 549.8
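The shape of this experiment can be reproduced with a small harness that times each stage of the request path over 1000 queries. The model and stage functions below are trivial stand-ins, not the MLlib pipeline measured on the slide:

```python
import json
import statistics
import time

def predict(features):
    # Stand-in for real model evaluation (e.g., an MLlib model).
    return sum(features) > 0.5

def handle_request(body):
    """Time the full path: decode, featurize, evaluate, encode."""
    start = time.perf_counter()
    x = json.loads(body)["pixels"]                # HTTP request -> JSON decode
    feats = [p / 255.0 for p in x]                # feature transformation
    label = predict(feats)                        # model evaluation
    resp = json.dumps({"label": int(label)})      # encode prediction
    return resp, (time.perf_counter() - start) * 1000.0

body = json.dumps({"pixels": [0] * 784})          # 784-dimension digits input
latencies = []
for _ in range(1000):
    _, ms = handle_request(body)
    latencies.append(ms)

avg = statistics.mean(latencies)
p99 = sorted(latencies)[989]                      # 990th of 1000 ≈ 99th pct.
print(f"avg={avg:.2f}ms p99={p99:.2f}ms")
```

Even this NOP-like pipeline shows the pattern in the table above: tail (P99) latency sits well above the average, and every stage added to the path pushes both up.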
Adaptive to Change at All Scales. [Chart: rate of change (months → minutes) vs. granularity of data (population → session); e.g., shopping for Mom vs. shopping for me.]
Adaptive to Change at All Scales. Drivers: new users and products; evolving interests and trends. At the population scale, the law of large numbers makes change slow: rely on efficient offline retraining on high-throughput systems (timescale of months).
Adaptive to Change at All Scales. At the session scale: small data, rapidly changing, low latency. This calls for online learning, which is sensitive to feedback bias (shopping for Mom vs. shopping for me).
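A sketch of session-scale online learning: a logistic model updated with one SGD step per observation. This hand-rolled learner is only an illustration of the online-update pattern, not any particular serving system's implementation:

```python
import math
import random

class OnlineLogistic:
    """Minimal online logistic regression: one SGD step per observation."""
    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def predict_proba(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # Gradient of the log-loss for one example: (p - y) * x
        p = self.predict_proba(x)
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * (p - y) * xi

random.seed(0)
model = OnlineLogistic(dim=2)
for _ in range(2000):
    x = [random.uniform(-1, 1), 1.0]       # one feature plus a bias term
    y = 1.0 if x[0] > 0 else 0.0           # true concept: sign of x[0]
    model.update(x, y)                     # adapt immediately on feedback
```

Because each update costs O(dim), the model can adapt within a session; the feedback-bias caveat on the slide applies, since the labels the model sees depend on what it previously served.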
The Feedback Loop: I once looked at cameras on Amazon… now my Amazon homepage shows similar cameras and accessories. An opportunity for bandit algorithms, but bandits present new challenges: computational overhead complicates caching + indexing.
Exploration/Exploitation Tradeoff. Systems that can take actions can adversely bias future data. An opportunity for bandits! But bandits present new challenges: they complicate caching + indexing and tuning + counterfactual reasoning.
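For illustration, an epsilon-greedy bandit, one of the simplest algorithms in this family (the slide does not prescribe a specific one): it mostly exploits the best-known item but keeps exploring, so future training data is not biased entirely toward past winners:

```python
import random

class EpsilonGreedy:
    """Epsilon-greedy bandit: mostly exploit the best-known item,
    occasionally explore to keep future data informative."""
    def __init__(self, n_items, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_items
        self.values = [0.0] * n_items      # running mean reward per item

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))     # explore
        return max(range(len(self.counts)),
                   key=lambda i: self.values[i])          # exploit

    def feedback(self, item, reward):
        self.counts[item] += 1
        # Incremental mean update.
        self.values[item] += (reward - self.values[item]) / self.counts[item]

random.seed(42)
true_ctr = [0.05, 0.20, 0.10]              # hidden click-through rates
bandit = EpsilonGreedy(n_items=3)
for _ in range(5000):
    i = bandit.select()
    bandit.feedback(i, 1.0 if random.random() < true_ctr[i] else 0.0)
```

Note the serving-system pain the slide points at: `select()` must run at request time, and because its answer depends on evolving `values`, responses are no longer pure functions of the input, which is what complicates caching and indexing.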
Management: Collaborative Development. Teams of data scientists working on similar tasks produce "competing" features and models, with complex model dependencies: e.g., an Animal Classifier (isAnimal) feeding a Cat Classifier (isCat), feeding a Cuteness Predictor (Cat Photo → Cute!).
Predictive Services. UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.
Predictive Services: an active research project in the UC Berkeley AMPLab.
Velox Model Serving System: focuses on the multi-task learning (MTL) domain [CIDR'15, LearningSys'15]. Example tasks: spam classification, content recommendation scoring, localized anomaly detection (per session).
Velox Model Serving System: personalized models (multi-task learning) [CIDR'15, LearningSys'15]. A "separate" model for each user/context maps input to output.
Velox Model Serving System [CIDR'15, LearningSys'15]: personalized models (multi-task learning) are split into a personalization model and a shared feature model.
Hybrid Offline + Online Learning. With the split into a personalization model and a feature model: update the user weights online (simple to train, a more robust model, addresses rapidly changing user statistics); update the feature functions offline using batch solvers (leverage high-throughput systems such as Apache Spark, exploit slow change in population statistics).
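A toy version of the split. A fixed random projection stands in for the offline-trained shared feature model, and per-user weights over those features are updated online with SGD on squared error. All specifics here are illustrative, not Velox's actual implementation:

```python
import random

random.seed(0)
DIM, K = 5, 3

# Shared feature model: retrained offline in batch (e.g., nightly on Spark).
# A fixed random projection stands in for the learned feature functions.
PROJ = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(K)]

def features(x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in PROJ]

class SplitModel:
    """Per-user weights over the shared features, updated online."""
    def __init__(self):
        self.user_w = {}                         # user id -> K weights

    def predict(self, user, x):
        w = self.user_w.setdefault(user, [0.0] * K)
        return sum(wi * fi for wi, fi in zip(w, features(x)))

    def update(self, user, x, y, lr=0.05):
        # One SGD step on squared error; touches only this user's weights.
        f = features(x)
        w = self.user_w.setdefault(user, [0.0] * K)
        err = sum(wi * fi for wi, fi in zip(w, f)) - y
        for i in range(K):
            w[i] -= lr * err * f[i]

# Simulate one user whose utility is a fixed mix of the shared features.
def true_utility(x):
    f = features(x)
    return 0.7 * f[0] - 0.2 * f[1]

model = SplitModel()
for _ in range(2000):
    x = [random.uniform(-1, 1) for _ in range(DIM)]
    model.update("alice", x, true_utility(x))
```

The online update is cheap (O(K) per observation) and robust because only a small weight vector is learned per user, while the expensive feature model changes on the slow, offline timescale.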
Hybrid Online + Offline Learning Results: test error similar to full retraining, with substantially faster training (comparing hybrid, offline, and full approaches under user preference change).
Evaluating the Model Split: input → feature evaluation, backed by a feature cache.
Evaluating the Model Split: feature caching across users, anytime feature evaluation, approximate feature hashing.
Feature Caching: hash the input, compute the feature, and store it in a feature hash table; an identical new input is served from the cache.
LSH Cache Coarsening: hash a new input with an LSH hash function; a collision in the feature hash table would mean using the wrong value.
LSH Cache Coarsening: with locality-sensitive hashing, nearby inputs collide, so locality-sensitive caching uses the cached value anyway; similar inputs have similar features.
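A sketch of locality-sensitive caching using a SimHash-style sign LSH (the specific hash family is my choice for illustration). Near-duplicate inputs tend to land in the same bucket, so the cache deliberately returns a nearby input's feature value instead of recomputing:

```python
import random

random.seed(1)
DIM, BITS = 4, 8
# Random hyperplanes for a sign-based (SimHash-style) LSH function.
PLANES = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def lsh(x):
    """Cache key = the input's sign pattern against the random hyperplanes;
    nearby inputs tend to share the same bucket."""
    return tuple(int(sum(p * xi for p, xi in zip(plane, x)) > 0)
                 for plane in PLANES)

CALLS = [0]
def expensive_feature(x):
    CALLS[0] += 1                      # count real feature evaluations
    return sum(xi * xi for xi in x)

cache = {}
def cached_feature(x):
    key = lsh(x)
    if key not in cache:               # miss: compute and store
        cache[key] = expensive_feature(x)
    return cache[key]                  # hit: reuse a nearby input's value

a = [1.0, 0.5, -0.2, 0.3]
b = [1.01, 0.51, -0.19, 0.31]          # near-duplicate of a
cached_feature(a)                      # computes the feature
cached_feature(a)                      # exact hit: no recomputation
hit = lsh(a) == lsh(b)                 # b usually shares a's bucket, so
# cached_feature(b) would return a's slightly "wrong" cached value
```

Using fewer hash bits coarsens the buckets: more cache hits, but the reused values drift further from the exact features, which is the tradeoff the next slide quantifies.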
Anytime Predictions. Compute features asynchronously; if a particular element does not arrive in time, use an estimator instead. The system is always able to render a prediction by the latency deadline.
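One way to sketch anytime prediction: launch the feature computations in a thread pool and, when the deadline hits, substitute a precomputed expectation for anything still missing. The feature functions and stored means below are made up for illustration:

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

FEATURE_MEANS = [0.5, 0.1, -0.3]       # precomputed expectations E[f_i]

def slow_feature(i, x):
    time.sleep(random.uniform(0.005, 0.05))   # simulated variable latency
    return x * (i + 1)

def anytime_predict(x, weights, deadline_s):
    """Launch all feature computations; at the deadline, substitute the
    stored expectation for any feature that has not yet arrived."""
    pool = ThreadPoolExecutor(max_workers=len(weights))
    futures = [pool.submit(slow_feature, i, x) for i in range(len(weights))]
    start = time.perf_counter()
    feats = []
    for i, fut in enumerate(futures):
        remaining = deadline_s - (time.perf_counter() - start)
        try:
            feats.append(fut.result(timeout=max(remaining, 0.0)))
        except TimeoutError:
            feats.append(FEATURE_MEANS[i])    # fall back to E[f_i]
    pool.shutdown(wait=False)                 # don't block on stragglers
    return sum(w * f for w, f in zip(weights, feats))
```

With a generous deadline every feature arrives and the prediction is exact; with a tight one the model still answers on time, degrading accuracy gracefully rather than missing the deadline.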
Latency vs. accuracy tradeoff: no coarsening vs. coarsening + anytime predictions vs. overly coarsened. A coarser hash trades exact features for an approximate expectation; coarsening + anytime predictions performs best. Check out our poster!
Part of the Berkeley Data Analytics Stack: Spark Streaming, Spark SQL, GraphX, ML library, BlinkDB, MLbase, and Velox (training, management + serving: Model Manager, Prediction Service), on top of Spark, Mesos, Tachyon, and HDFS, S3, …
Predictive Services. UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.
Dato Predictive Services: a production-ready model serving and management system.
- Elastic scaling and load balancing of docker.io containers
- AWS CloudWatch metrics and reporting
- Serves Dato Create models, scikit-learn, and custom Python
- Distributed shared caching: scale out to address latency
- REST management API (demo?)
Predictive Services. UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan. Key insights: Responsive, Adaptive, Manageable — latency vs. accuracy; online/offline learning; caching, bandits, & management.
Future of Learning Systems: Data → Train → Model → Serve → Actions → Adapt (closing the loop).
Thank You. Joseph E. Gonzalez: jegonzal@cs.berkeley.edu, Assistant Professor @ UC Berkeley; joseph@dato.com, Co-Founder @ Dato.