Intelligent Services: Serving Machine Learning
Joseph E. Gonzalez
jegonzal@cs.berkeley.edu; Assistant Professor @ UC Berkeley
joseph@dato.com; Co-Founder @ Dato Inc.
Contemporary Learning Systems: Big Training Data → Models
Contemporary Learning Systems: MLlib, Create, MLC, LIBSVM, VW, Oryx 2, BIDMach
Training Data → Model. What happens after we train a model? Dashboards and reports, conference papers, drive actions.
Suggesting Items at Checkout, Fraud Detection, Cognitive Assistance, Internet of Things: low-latency, personalized, rapidly changing.
Data → Train → Model
Data → Train → Model → Serve → Actions, with Adapt feeding back into training.
Machine Learning → Intelligent Services
The Life of a Query in an Intelligent Service: a content request ("items like x") arrives at the web serving tier, which issues a top-K query to the intelligent service. The service looks up user data, product info, and model info, performs feature lookups, evaluates the model, and returns the top items for the new page (images, etc.). Feedback (the preferred item) flows back into the service.
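A minimal sketch of this request path, with toy in-memory dictionaries standing in for the user-data, item, and model lookups (all names here are hypothetical, not part of any real serving system):

```python
import heapq

# Toy stores standing in for the user-data and product-info lookups.
USER_FEATURES = {"u1": [1.0, 0.0]}
ITEM_FEATURES = {"a": [0.9, 0.1], "b": [0.1, 0.9], "c": [0.5, 0.5]}

def top_k(user, k):
    """Serve "items like x": look up the user's features, score every
    candidate item with a simple dot product, and return the best k."""
    u = USER_FEATURES[user]                       # user-data lookup
    scored = ((sum(ui * ii for ui, ii in zip(u, item)), name)
              for name, item in ITEM_FEATURES.items())
    return [name for _, name in heapq.nlargest(k, scored)]

print(top_k("u1", 2))
```

Real systems replace the dot product with full model evaluation and the dictionaries with feature stores, but the lookup → score → top-K shape is the same.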
Essential Attributes of Intelligent Services. Responsive: intelligent applications are interactive. Adaptive: ML models are out of date the moment learning is done. Manageable: many models created by multiple people.
Responsive: Now and Always. Compute predictions in < 20 ms for complex models and queries (top-K, features, SELECT * FROM users JOIN items, click_logs, pages WHERE …), under heavy query load, even with system failures.
Experiment: End-to-End Latency in Spark MLlib. Pipeline: HTTP Request → Feature Transformation → Evaluate Model → Encode Prediction to JSON → HTTP Response.
End-to-End Latency for Digits Classification (784-dimension input; latency in milliseconds over 1000 queries; served using MLlib and Dato Inc.):
- NOP: Avg = 5.5, P99 = 20.6
- Single Logistic Regression: Avg = 21.8, P99 = 38.6
- Decision Tree: Avg = 22.4, P99 = 63.8
- One-vs-all LR (10-class): Avg = 137.7, P99 = 217.7
- 100 Tree Random Forest: Avg = 50.5, P99 = 73.4
- 500 Tree Random Forest: Avg = 172.6, P99 = 268.7
- AlexNet CNN: Avg = 418.7, P99 = 549.8
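The shape of this experiment can be reproduced with a small harness that times each stage of the request path over 1000 queries. The model and stage functions below are trivial stand-ins, not the MLlib pipeline measured on the slide:

```python
import json
import statistics
import time

def predict(features):
    # Stand-in for real model evaluation (e.g., an MLlib model).
    return sum(features) > 0.5

def handle_request(body):
    """Time the full path: decode, featurize, evaluate, encode."""
    start = time.perf_counter()
    x = json.loads(body)["pixels"]                # HTTP request -> JSON decode
    feats = [p / 255.0 for p in x]                # feature transformation
    label = predict(feats)                        # model evaluation
    resp = json.dumps({"label": int(label)})      # encode prediction
    return resp, (time.perf_counter() - start) * 1000.0

body = json.dumps({"pixels": [0] * 784})          # 784-dimension digits input
latencies = []
for _ in range(1000):
    _, ms = handle_request(body)
    latencies.append(ms)

avg = statistics.mean(latencies)
p99 = sorted(latencies)[989]                      # 990th of 1000 ≈ 99th pct.
print(f"avg={avg:.2f}ms p99={p99:.2f}ms")
```

Even this NOP-like pipeline shows the pattern in the table above: tail (P99) latency sits well above the average, and every stage added to the path pushes both up.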
Adaptive to Change at All Scales. [Chart: rate of change (months → minutes) vs. granularity of data (population → session); e.g., shopping for Mom vs. shopping for me.]
Adaptive to Change at All Scales. Drivers: new users and products; evolving interests and trends. At the population scale, the law of large numbers makes change slow: rely on efficient offline retraining on high-throughput systems (timescale of months).
Adaptive to Change at All Scales. At the session scale: small data, rapidly changing, low latency. This calls for online learning, which is sensitive to feedback bias (shopping for Mom vs. shopping for me).
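A sketch of session-scale online learning: a logistic model updated with one SGD step per observation. This hand-rolled learner is only an illustration of the online-update pattern, not any particular serving system's implementation:

```python
import math
import random

class OnlineLogistic:
    """Minimal online logistic regression: one SGD step per observation."""
    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def predict_proba(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # Gradient of the log-loss for one example: (p - y) * x
        p = self.predict_proba(x)
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * (p - y) * xi

random.seed(0)
model = OnlineLogistic(dim=2)
for _ in range(2000):
    x = [random.uniform(-1, 1), 1.0]       # one feature plus a bias term
    y = 1.0 if x[0] > 0 else 0.0           # true concept: sign of x[0]
    model.update(x, y)                     # adapt immediately on feedback
```

Because each update costs O(dim), the model can adapt within a session; the feedback-bias caveat on the slide applies, since the labels the model sees depend on what it previously served.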
The Feedback Loop: I once looked at cameras on Amazon… now my Amazon homepage shows similar cameras and accessories. An opportunity for bandit algorithms, but bandits present new challenges: computational overhead complicates caching + indexing.
Exploration/Exploitation Tradeoff. Systems that can take actions can adversely bias future data. An opportunity for bandits! But bandits present new challenges: they complicate caching + indexing and tuning + counterfactual reasoning.
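For illustration, an epsilon-greedy bandit, one of the simplest algorithms in this family (the slide does not prescribe a specific one): it mostly exploits the best-known item but keeps exploring, so future training data is not biased entirely toward past winners:

```python
import random

class EpsilonGreedy:
    """Epsilon-greedy bandit: mostly exploit the best-known item,
    occasionally explore to keep future data informative."""
    def __init__(self, n_items, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_items
        self.values = [0.0] * n_items      # running mean reward per item

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))     # explore
        return max(range(len(self.counts)),
                   key=lambda i: self.values[i])          # exploit

    def feedback(self, item, reward):
        self.counts[item] += 1
        # Incremental mean update.
        self.values[item] += (reward - self.values[item]) / self.counts[item]

random.seed(42)
true_ctr = [0.05, 0.20, 0.10]              # hidden click-through rates
bandit = EpsilonGreedy(n_items=3)
for _ in range(5000):
    i = bandit.select()
    bandit.feedback(i, 1.0 if random.random() < true_ctr[i] else 0.0)
```

Note the serving-system pain the slide points at: `select()` must run at request time, and because its answer depends on evolving `values`, responses are no longer pure functions of the input, which is what complicates caching and indexing.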
Management: Collaborative Development. Teams of data scientists working on similar tasks produce "competing" features and models, with complex model dependencies: e.g., an Animal Classifier (isAnimal) feeding a Cat Classifier (isCat), feeding a Cuteness Predictor (Cat Photo → Cute!).
Predictive Services. UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.
Predictive Services: an active research project in the UC Berkeley AMPLab.
Velox Model Serving System: focuses on the multi-task learning (MTL) domain [CIDR'15, LearningSys'15]. Example tasks: spam classification, content recommendation scoring, localized anomaly detection (per session).
Velox Model Serving System: personalized models (multi-task learning) [CIDR'15, LearningSys'15]. A "separate" model for each user/context maps input to output.
Velox Model Serving System [CIDR'15, LearningSys'15]: personalized models (multi-task learning) are split into a personalization model and a shared feature model.
Hybrid Offline + Online Learning. With the split into a personalization model and a feature model: update the user weights online (simple to train, a more robust model, addresses rapidly changing user statistics); update the feature functions offline using batch solvers (leverage high-throughput systems such as Apache Spark, exploit slow change in population statistics).
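A toy version of the split. A fixed random projection stands in for the offline-trained shared feature model, and per-user weights over those features are updated online with SGD on squared error. All specifics here are illustrative, not Velox's actual implementation:

```python
import random

random.seed(0)
DIM, K = 5, 3

# Shared feature model: retrained offline in batch (e.g., nightly on Spark).
# A fixed random projection stands in for the learned feature functions.
PROJ = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(K)]

def features(x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in PROJ]

class SplitModel:
    """Per-user weights over the shared features, updated online."""
    def __init__(self):
        self.user_w = {}                         # user id -> K weights

    def predict(self, user, x):
        w = self.user_w.setdefault(user, [0.0] * K)
        return sum(wi * fi for wi, fi in zip(w, features(x)))

    def update(self, user, x, y, lr=0.05):
        # One SGD step on squared error; touches only this user's weights.
        f = features(x)
        w = self.user_w.setdefault(user, [0.0] * K)
        err = sum(wi * fi for wi, fi in zip(w, f)) - y
        for i in range(K):
            w[i] -= lr * err * f[i]

# Simulate one user whose utility is a fixed mix of the shared features.
def true_utility(x):
    f = features(x)
    return 0.7 * f[0] - 0.2 * f[1]

model = SplitModel()
for _ in range(2000):
    x = [random.uniform(-1, 1) for _ in range(DIM)]
    model.update("alice", x, true_utility(x))
```

The online update is cheap (O(K) per observation) and robust because only a small weight vector is learned per user, while the expensive feature model changes on the slow, offline timescale.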
Hybrid Online + Offline Learning Results: test error similar to full retraining, with substantially faster training (comparing hybrid, offline, and full approaches under user preference change).
Evaluating the Model Split: input → feature evaluation, backed by a feature cache.
Evaluating the Model Split: feature caching across users, anytime feature evaluation, approximate feature hashing.
Feature Caching: hash the input, compute the feature, and store it in a feature hash table; an identical new input is served from the cache.
LSH Cache Coarsening: hash a new input with an LSH hash function; a collision in the feature hash table would mean using the wrong value.
LSH Cache Coarsening: with locality-sensitive hashing, nearby inputs collide, so locality-sensitive caching uses the cached value anyway; similar inputs have similar features.
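A sketch of locality-sensitive caching using a SimHash-style sign LSH (the specific hash family is my choice for illustration). Near-duplicate inputs tend to land in the same bucket, so the cache deliberately returns a nearby input's feature value instead of recomputing:

```python
import random

random.seed(1)
DIM, BITS = 4, 8
# Random hyperplanes for a sign-based (SimHash-style) LSH function.
PLANES = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def lsh(x):
    """Cache key = the input's sign pattern against the random hyperplanes;
    nearby inputs tend to share the same bucket."""
    return tuple(int(sum(p * xi for p, xi in zip(plane, x)) > 0)
                 for plane in PLANES)

CALLS = [0]
def expensive_feature(x):
    CALLS[0] += 1                      # count real feature evaluations
    return sum(xi * xi for xi in x)

cache = {}
def cached_feature(x):
    key = lsh(x)
    if key not in cache:               # miss: compute and store
        cache[key] = expensive_feature(x)
    return cache[key]                  # hit: reuse a nearby input's value

a = [1.0, 0.5, -0.2, 0.3]
b = [1.01, 0.51, -0.19, 0.31]          # near-duplicate of a
cached_feature(a)                      # computes the feature
cached_feature(a)                      # exact hit: no recomputation
hit = lsh(a) == lsh(b)                 # b usually shares a's bucket, so
# cached_feature(b) would return a's slightly "wrong" cached value
```

Using fewer hash bits coarsens the buckets: more cache hits, but the reused values drift further from the exact features, which is the tradeoff the next slide quantifies.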
Anytime Predictions. Compute features asynchronously; if a particular element does not arrive in time, use an estimator instead. The system is always able to render a prediction by the latency deadline.
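One way to sketch anytime prediction: launch the feature computations in a thread pool and, when the deadline hits, substitute a precomputed expectation for anything still missing. The feature functions and stored means below are made up for illustration:

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

FEATURE_MEANS = [0.5, 0.1, -0.3]       # precomputed expectations E[f_i]

def slow_feature(i, x):
    time.sleep(random.uniform(0.005, 0.05))   # simulated variable latency
    return x * (i + 1)

def anytime_predict(x, weights, deadline_s):
    """Launch all feature computations; at the deadline, substitute the
    stored expectation for any feature that has not yet arrived."""
    pool = ThreadPoolExecutor(max_workers=len(weights))
    futures = [pool.submit(slow_feature, i, x) for i in range(len(weights))]
    start = time.perf_counter()
    feats = []
    for i, fut in enumerate(futures):
        remaining = deadline_s - (time.perf_counter() - start)
        try:
            feats.append(fut.result(timeout=max(remaining, 0.0)))
        except TimeoutError:
            feats.append(FEATURE_MEANS[i])    # fall back to E[f_i]
    pool.shutdown(wait=False)                 # don't block on stragglers
    return sum(w * f for w, f in zip(weights, feats))
```

With a generous deadline every feature arrives and the prediction is exact; with a tight one the model still answers on time, degrading accuracy gracefully rather than missing the deadline.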
Latency vs. accuracy tradeoff: no coarsening vs. coarsening + anytime predictions vs. overly coarsened. A coarser hash trades exact features for an approximate expectation; coarsening + anytime predictions performs best. Check out our poster!
Part of the Berkeley Data Analytics Stack: Spark Streaming, Spark SQL, GraphX, ML library, BlinkDB, MLbase, and Velox (training, management + serving: Model Manager, Prediction Service), on top of Spark, Mesos, Tachyon, and HDFS, S3, …
Predictive Services. UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.
Dato Predictive Services: a production-ready model serving and management system.
- Elastic scaling and load balancing of docker.io containers
- AWS CloudWatch metrics and reporting
- Serves Dato Create models, scikit-learn, and custom Python
- Distributed shared caching: scale out to address latency
- REST management API (demo?)
Predictive Services. UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan. Key insights: Responsive, Adaptive, Manageable — latency vs. accuracy; online/offline learning; caching, bandits, & management.
Future of Learning Systems: Data → Train → Model → Serve → Actions → Adapt (closing the loop).
Thank You. Joseph E. Gonzalez: jegonzal@cs.berkeley.edu, Assistant Professor @ UC Berkeley; joseph@dato.com, Co-Founder @ Dato.