Intelligent Services Serving Machine Learning

Similar presentations
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki

QA practitioners viewpoint
Book Recommender System Guided By: Prof. Ellis Horowitz Kaijian Xu Group 3 Ameet Nanda Bhaskar Upadhyay Bhavana Parekh.
1 Evaluation Rong Jin. 2 Evaluation  Evaluation is key to building effective and efficient search engines usually carried out in controlled experiments.
Imbalanced data David Kauchak CS 451 – Fall 2013.
Supervised Learning Techniques over Twitter Data Kleisarchaki Sofia.
1 NETE4631 Cloud deployment models and migration Lecture Notes #4.
Turning Data into Value Ion Stoica CEO, Databricks (also, UC Berkeley and Conviva) UC BERKELEY.
Information Retrieval in Practice
Overview of Search Engines
Outline | Motivation| Design | Results| Status| Future
Walter Hop Web-shop Order Prediction Using Machine Learning Master’s Thesis Computational Economics.
URLDoc: Learning to Detect Malicious URLs using Online Logistic Regression Presented by : Mohammed Nazim Feroz 11/26/2013.
«Tag-based Social Interest Discovery» Proceedings of the 17th International World Wide Web Conference (WWW2008) Xin Li, Lei Guo, Yihong Zhao Yahoo! Inc.,
Tyson Condie.
Search Engines and Information Retrieval Chapter 1.
MobiStore - Mobile internet File Storage Platform Chetna Kaur.
Streaming Predictions of User Behavior in Real-Time Ethan Dereszynski (Webtrends) Eric Butler (Cedexis) OSCON 2014.
What are the main differences and commonalities between the IS and DA systems? How information is transferred between tasks: (i) IS it may be often achieved.
1 CS 294: Big Data System Research: Trends and Challenges Fall 2015 (MW 9:30-11:00, 310 Soda Hall) Ion Stoica and Ali Ghodsi (
윤언근, DataMining lab. The Web has grown exponentially in size but this growth has not been isolated to good-quality pages: spamming and.
Today Ensemble Methods. Recap of the course. Classifier Fusion
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
A Novel Local Patch Framework for Fixing Supervised Learning Models Yilei Wang 1, Bingzheng Wei 2, Jun Yan 2, Yang Hu 2, Zhi-Hong Deng 1, Zheng Chen 2.
Security Analytics Thrust Anthony D. Joseph (UCB) Rachel Greenstadt (Drexel), Ling Huang (Intel), Dawn Song (UCB), Doug Tygar (UCB)
Resilient Distributed Datasets: A Fault- Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
Joseph E. Gonzalez Assistant UC Berkeley Dato Inc. Intelligent Services Serving Machine.
Accurate WiFi Packet Delivery Rate Estimation and Applications Owais Khan and Lili Qiu. The University of Texas at Austin 1 Infocom 2016, San Francisco.
Dato Confidential 1 Danny Bickson Co-Founder. Dato Confidential 2 Successful apps in 2015 must be intelligent Machine learning key to next-gen apps Recommenders.
Data Summit 2016 H104: Building Hadoop Applications Abhik Roy Database Technologies - Experian LinkedIn Profile:
IncApprox The marriage of incremental and approximate computing Pramod Bhatotia Dhanya Krishnan, Do Le Quoc, Christof Fetzer, Rodrigo Rodrigues* (TU Dresden.
Image taken from: slideshare
Big Data Analytics and HPC Platforms
Dan Crankshaw UCB RISE Lab Seminar 10/3/2015
Machine Learning with Spark MLlib
Big Data is a Big Deal!.
Katie Goldstein Christopher Bilz Brian Northern Michael Yang
Prediction Serving Joseph E. Gonzalez Asst. Professor, UC Berkeley
International Conference on Data Engineering (ICDE 2016)
Machine Learning overview Chapter 18, 21
RISE to the challenges of intelligent systems
College of Engineering
Microsoft Ignite /22/2018 3:27 PM BRK2121
Erasmus University Rotterdam
Microsoft Build /20/2018 5:17 AM © 2016 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY,
Introduction to Spark.
Spark Software Stack Inf-2202 Concurrent and Data-Intensive Programming Fall 2016 Lars Ailo Bongo
R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016.
Intro to Machine Learning
Layer-wise Performance Bottleneck Analysis of Deep Neural Networks
StreamApprox Approximate Stream Analytics in Apache Spark
CMPT 733, SPRING 2016 Jiannan Wang
Random Sampling over Joins Revisited
MARMIND’s New Service Delivers a Single Centralized Marketing Plan That Connects Teams, Campaigns and Outcomes by Using the Power of the Azure Platform.
Team 2: Graham Leech, Austin Woods, Cory Smith, Brent Niemerski
XtremeData on the Microsoft Azure Cloud Platform:
Parallel Analytic Systems
Overview of big data tools
Efficient Document Analytics on Compressed Data:
PROJECTS SUMMARY PRESNETED BY HARISH KUMAR JANUARY 10,2018.
Approaching an ML Problem
Where Intelligence Lives & Intelligence Management
100+ Machine Learning Models running live: The approach
CSE 491/891 Lecture 25 (Mahout).
Logistics Project proposals are due by midnight tonight (two pages)
Course Summary Joseph E. Gonzalez
Natural Language Processing (NLP) Systems Joseph E. Gonzalez
CMPT 733, SPRING 2017 Jiannan Wang
The Student’s Guide to Apache Spark
Modeling IDS using hybrid intelligent systems
Presentation transcript:

Intelligent Services Serving Machine Learning. Joseph E. Gonzalez: jegonzal@cs.berkeley.edu, Assistant Professor @ UC Berkeley; joseph@dato.com, Co-Founder @ Dato Inc.

Contemporary Learning Systems: Big Training Data → Models.

Contemporary Learning Systems for model creation: Create, MLlib, BIDMach, Oryx 2, VW, LIBSVM, MLC.

What happens after we train a model? Data → Training → Model → drive actions, conference papers, dashboards and reports.

Suggesting items at checkout, fraud detection, cognitive assistance, the Internet of Things: these applications are low-latency, personalized, and rapidly changing.

Data → Train → Model.

Data → Train → Model → Serve → Actions → Adapt: serving and adaptation close the loop around training.

Machine Learning → Intelligent Services.

The Life of a Query in an Intelligent Service: a content request arrives at the web serving tier, which sends the intelligent service a request such as "items like x"; the service looks up the model and its info, looks up features (user data, product info), evaluates the model (a top-K query over the math), and returns the top items used to build the new page (images, …); the user's response comes back as feedback (the preferred item).

Essential Attributes of Intelligent Services: Responsive (intelligent applications are interactive), Adaptive (ML models are out of date the moment learning is done), and Manageable (many models created by multiple people).

Responsive: Now and Always. Compute predictions in < 20 ms for complex models and queries (top-K, feature lookups such as SELECT * FROM users JOIN items, click_logs, pages WHERE …), under heavy query load and in the presence of system failures.

Experiment: End-to-end Latency in Spark. The measured pipeline: HTTP request → feature transformation → evaluate the MLlib model → encode the prediction to JSON → HTTP response.
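To make the measured pipeline concrete, here is a minimal timing-harness sketch for a single prediction request; the featurize function, model object, and JSON payload shape are hypothetical stand-ins, not the benchmark code used for the numbers below.

import json
import time

def serve_request(raw_request, featurize, model):
    """Time each stage of one prediction request (illustrative harness)."""
    timings = {}

    t0 = time.perf_counter()
    features = featurize(json.loads(raw_request))      # feature transformation
    timings["featurize_ms"] = (time.perf_counter() - t0) * 1e3

    t1 = time.perf_counter()
    prediction = model.predict(features)               # model evaluation
    timings["predict_ms"] = (time.perf_counter() - t1) * 1e3

    t2 = time.perf_counter()
    # Assumes the prediction is JSON-serializable (e.g., a plain int or float).
    response = json.dumps({"prediction": prediction})  # encode prediction
    timings["encode_ms"] = (time.perf_counter() - t2) * 1e3

    timings["total_ms"] = (time.perf_counter() - t0) * 1e3
    return response, timings

Averages and P99 latencies like those on the next slide would be collected over many such calls.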

End-to-end Latency for Digits Classification: 784-dimension input, served using MLlib and Dato, counts out of 1000 queries, latency measured in milliseconds.
NOP: Avg = 5.5, P99 = 20.6
Single Logistic Regression: Avg = 21.8, P99 = 38.6
Decision Tree: Avg = 22.4, P99 = 63.8
100-Tree Random Forest: Avg = 50.5, P99 = 73.4
One-vs-all LR (10-class): Avg = 137.7, P99 = 217.7
500-Tree Random Forest: Avg = 172.6, P99 = 268.7
AlexNet CNN: Avg = 418.7, P99 = 549.8

Adaptive to Change at All Scales. The granularity of data ranges from the whole population down to a single session, and the rate of change ranges from months to minutes: new features appear over time, and customer interests evolve even within a session (shopping for Mom vs. shopping for me). We need to collect the right data to make the best predictions.

At the population scale, the law of large numbers means statistics change slowly (over months: new users and products, evolving interests and trends), so we rely on efficient offline retraining with high-throughput systems.

At the session scale, data is small and rapidly changing (over minutes), so we need low-latency online learning, which is sensitive to feedback bias.

The Feedback Loop: I once looked at cameras on Amazon, and now my Amazon homepage shows similar cameras and accessories. Predictions shape the data the system collects next, which is an opportunity for bandit algorithms, although their computational overhead complicates caching and indexing.

Exploration / Exploitation Tradeoff: systems that take actions can adversely bias future data. This is an opportunity for bandits! But bandits present new challenges: they complicate caching and indexing, tuning, and counterfactual reasoning.
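As a concrete, if simplified, illustration of the bandit opportunity, here is a minimal epsilon-greedy sketch for choosing which item to surface; the arm set and the reward signal are hypothetical, and a production system would more likely use contextual bandits.

import random
from collections import defaultdict

class EpsilonGreedy:
    """Epsilon-greedy bandit: usually exploit the best-looking arm, sometimes explore."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = defaultdict(int)
        self.values = defaultdict(float)   # running mean reward per arm

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(self.arms)                   # explore
        return max(self.arms, key=lambda a: self.values[a])   # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

Because the chosen arm depends on accumulated feedback, responses can no longer be cached and reused blindly, which is exactly the caching and counterfactual-reasoning challenge noted above.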

Management: Collaborative Development. Teams of data scientists work on similar tasks with "competing" features and models, and complex model dependencies arise: for example, a cat photo feeds both a cat classifier and an animal classifier, whose isCat and isAnimal outputs feed a cuteness predictor that declares the photo cute.

Predictive Services, an active research project in the UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.

Velox Model Serving System [CIDR'15, LearningSys'15] focuses on the multi-task learning (MTL) domain: spam classification, content recommendation scoring, and localized anomaly detection, with predictions personalized per user session (Session 1, Session 2, …).

Velox Model Serving System [CIDR'15, LearningSys'15]: Personalized Models (Multi-task Learning) maintain a "separate" input-to-output model for each user/context.

Velox Model Serving System [CIDR'15, LearningSys'15]: each personalized model is split into a shared feature model and a per-user personalization model.
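One common way to write such a split model (the notation here is assumed, not from the slides) is

\hat{y}_u(x) = w_u^\top f(x; \theta)

where f(x; θ) is the shared feature model with parameters θ retrained offline in batch, and w_u is the small personalization weight vector for user u that can be updated online.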

Hybrid Offline + Online Learning. The feature functions are updated offline using batch solvers, leveraging high-throughput systems (Apache Spark) and exploiting the slow change in population statistics. The per-user weights of the personalization model are updated online, which is simple to train, yields a more robust model, and addresses rapidly changing user statistics.
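A minimal sketch of the online half of this hybrid scheme under the split-model notation above: the shared feature function is held fixed between offline retrainings, and only the per-user weights move. The function names, the squared loss, and the plain SGD step are illustrative assumptions.

import numpy as np

def update_user_weights(w_u, x, y, featurize, lr=0.1):
    """One online SGD step on the per-user weights for a squared loss.

    featurize(x) plays the role of the shared feature model f(x; theta):
    it is trained offline and treated as fixed here; only w_u changes.
    """
    z = np.asarray(featurize(x), dtype=float)   # shared features for this input
    error = float(np.dot(w_u, z)) - y           # prediction error
    return w_u - lr * error * z                 # gradient step on 0.5 * error ** 2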

Hybrid Online + Offline Learning Results: the hybrid approach reaches test error similar to full offline retraining while training substantially faster, and it tracks changes in user preferences (the plot compares Full Offline and Hybrid as user preferences change).

Evaluating the Model: because the feature model is shared, feature evaluation on an input can be cached across users; further techniques include approximate feature hashing and anytime feature evaluation.

Feature Caching with a feature hash table: for each new input, hash the input, compute the feature, and store the result in the table so later requests with the same input reuse it.
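A minimal sketch of such a feature hash table; hashing the repr of the raw input and the featurize callable are illustrative assumptions.

import hashlib

class FeatureCache:
    """Cache shared-feature evaluations keyed by an exact hash of the raw input."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.table = {}

    def _key(self, raw_input):
        # Assumes repr(raw_input) is a stable, deterministic encoding of the input.
        return hashlib.sha1(repr(raw_input).encode("utf-8")).hexdigest()

    def get(self, raw_input):
        key = self._key(raw_input)                       # hash the input
        if key not in self.table:
            self.table[key] = self.featurize(raw_input)  # compute the feature
        return self.table[key]                           # reuse the stored result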

LSH Cache Coarsening: if the feature hash table is keyed by a locality-sensitive hash function, a new input can collide with a previously cached input that is merely nearby, a false cache collision, and the lookup returns the wrong value.

Locality-Sensitive Caching: with locality-sensitive hashing the colliding cached input is close to the new input, so we use the cached value anyway, trading a small loss in accuracy for lower latency.
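A minimal sketch of locality-sensitive caching for real-valued inputs, using random-hyperplane (SimHash-style) signatures as the coarse cache key; the signature length and the featurize callable are illustrative assumptions.

import numpy as np

class LSHFeatureCache:
    """Coarsened cache: nearby inputs collide on purpose and share one cached feature."""

    def __init__(self, featurize, dim, n_bits=16, seed=0):
        self.featurize = featurize
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))  # random hyperplanes
        self.table = {}

    def _signature(self, x):
        # Sign pattern of projections onto the hyperplanes = the LSH bucket id.
        bits = (self.planes @ np.asarray(x, dtype=float)) > 0
        return bits.tobytes()

    def get(self, x):
        key = self._signature(x)
        if key not in self.table:              # miss: compute and store
            self.table[key] = self.featurize(x)
        return self.table[key]                 # hit may return a near-neighbor's value

Fewer bits give coarser buckets and therefore more, but more approximate, cache hits: this is the coarsening knob discussed below.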

Anytime Predictions: compute features asynchronously, and if a particular feature does not arrive in time, use an estimator instead, so the service is always able to render a prediction by the latency deadline.
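A minimal sketch of anytime feature evaluation: each feature is computed in its own task, and any feature that misses the shared deadline is replaced by a precomputed estimate (for example its mean). The thread-pool approach and all names are illustrative assumptions.

import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeout

def anytime_features(x, feature_fns, fallbacks, deadline_s=0.02):
    """Return one value per feature; features that miss the deadline use their fallback."""
    start = time.perf_counter()
    pool = ThreadPoolExecutor(max_workers=len(feature_fns))
    futures = [pool.submit(fn, x) for fn in feature_fns]
    values = []
    for future, fallback in zip(futures, fallbacks):
        remaining = deadline_s - (time.perf_counter() - start)
        try:
            values.append(future.result(timeout=max(remaining, 0)))
        except FuturesTimeout:
            values.append(fallback)        # use the estimator instead of waiting
    pool.shutdown(wait=False)              # do not block on stragglers
    return values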

Coarsening + Anytime Predictions: prediction quality is best with no coarsening and more features, degrades when the hash is overly coarsened, and approximating missing features by their expectation sits in between (the plot sweeps hash coarseness against prediction quality). Check out our poster!

Part of the Berkeley Data Analytics Stack: training components (Spark Streaming, Spark SQL, GraphX, MLbase and the ML library, BlinkDB) and management + serving components (Velox: Model Manager and Prediction Service), running on Spark, Mesos, Tachyon, and HDFS, S3, …

Predictive Services, UC Berkeley AMPLab: Daniel Crankshaw, Xin Wang, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.

Dato Predictive Services: a production-ready model serving and management system. Elastic scaling and load balancing of docker.io containers; AWS CloudWatch metrics and reporting; serves Dato Create models, scikit-learn, and custom Python; distributed shared caching that scales out to address latency; a REST management API. Demo?

Predictive Services (UC Berkeley AMPLab): Responsive, Adaptive, Manageable. Key insights: caching, bandits, and management; online/offline learning; the latency vs. accuracy tradeoff. Daniel Crankshaw, Xin Wang, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan.

Future of Learning Systems: Data → Train → Model → Serve → Actions → Adapt, closing the loop from serving back into training.

Thank You. Joseph E. Gonzalez: jegonzal@cs.berkeley.edu, Assistant Professor @ UC Berkeley; joseph@dato.com, Co-Founder @ Dato.