Published by Raymond Lucas. Modified over 6 years ago.
1
100+ Machine Learning Models running live: The approach
Lucas Bernardi - Principal Data Scientist
2
Mission to empower people to experience the world
28+ million reported listings
5.6+ million are homes, apartments and other unique places to stay
141+ thousand destinations
1.5+ million room nights/day
Terabytes of data every day
200+ Machine Learning Models Deployed
4
Machine Learning
5
Machine Learning: Personalization, NLP, Recommendations, Metric Learning,
Ranking, Vision, Prediction
6
Why do we need 100s of models?
7
Continuous Learning.
8
Continuous Learning.
1. Product Team has an Idea
2. Turn the idea into a Hypothesis
3. Design an Experiment
4. Build an ML Model when necessary
5. Run the Experiment
6. Learn from Results
7. Repeat
9
Insight. About 30% of searches done by users travelling with kids have no information about children
10
Hypothesis: They forget their Children
12
Experiment.
13
How do we support the demand?
14
RS: A Central Repository for Machine Learning Models.
Deploy: Data Scientists can easily deploy their models.
Discover: Product teams can find new and existing models and use them in their products.
Consume: Developers invoke the model through a standard call to the Repository.
Monitor: Monitor the health of the model in production.
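The four repository roles can be sketched as a tiny in-memory registry. This is only an illustration under stated assumptions: the class `ModelRepository`, its method names, and the `family_trip` model are hypothetical and do not describe the actual RS interface.

```python
from typing import Callable, Dict, List


class ModelRepository:
    """Hypothetical in-memory stand-in for a central model repository."""

    def __init__(self) -> None:
        self._models: Dict[str, Callable[[dict], float]] = {}
        self._call_counts: Dict[str, int] = {}

    def deploy(self, name: str, predict_fn: Callable[[dict], float]) -> None:
        # Deploy: register a prediction function under a unique name.
        self._models[name] = predict_fn
        self._call_counts[name] = 0

    def discover(self) -> List[str]:
        # Discover: list every model currently available.
        return sorted(self._models)

    def predict(self, name: str, features: dict) -> float:
        # Consume: one standard call, whatever the model is built with.
        self._call_counts[name] += 1
        return self._models[name](features)

    def health(self, name: str) -> dict:
        # Monitor: expose a basic usage counter (a real system tracks far more).
        return {"model": name, "calls": self._call_counts[name]}


repo = ModelRepository()
# Hypothetical model: guess a family trip when more than one room is requested.
repo.deploy("family_trip", lambda f: 0.9 if f.get("rooms", 1) > 1 else 0.2)
score = repo.predict("family_trip", {"rooms": 2})
```

Because callers only depend on `predict(name, features)`, the model behind a name can be swapped without changing product code.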
15
Diversity Gives us Strength
Programming Languages, Libraries, Backgrounds
16
Decouple Training from Prediction.
18
Decouple Training from Prediction: Lookup Tables
Table that maps Input to Predictions
A request just requires a lookup
Fast, Scalable, Reliable
Tradeoff diagram labels: Feature Space Complexity, Training Flexibility / Model Complexity
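A minimal sketch of the lookup-table idea: all learning happens offline, and serving is a single dictionary lookup. The key layout (country, device), the averaging rule, and the fallback `default` are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]


def build_lookup_table(rows: Iterable[Tuple[Key, float]]) -> Dict[Key, float]:
    """Offline step: store the average label observed for each input key."""
    sums: Dict[Key, float] = {}
    counts: Dict[Key, int] = {}
    for key, label in rows:
        sums[key] = sums.get(key, 0.0) + label
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(table: Dict[Key, float], key: Key, default: float = 0.0) -> float:
    """Online step: a plain lookup, with a fallback for unseen inputs."""
    return table.get(key, default)


# Hypothetical training rows: ((country, device), label).
rows = [(("NL", "mobile"), 1), (("NL", "mobile"), 0), (("US", "desktop"), 1)]
table = build_lookup_table(rows)
```

The table grows with the number of distinct inputs, which is why this approach only suits small feature spaces.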
20
Decouple Training from Prediction: Generalized Linear Models
Prediction(X) = F(<W, T(X)>)
Learn W: Naive Bayes Classifier, Logistic Regression, Linear SVM, Linear Regression, Poisson Regression, Neg Binomial Regression, Quantile Regression, Beta Regression
Choose T: Bucketing, Substitution, Interaction
Choose F:
Identity: Continuous Regression
Sigmoid: Probabilistic Classification
Exponential: Discrete Regression
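The formula Prediction(X) = F(<W, T(X)>) can be sketched as one generic prediction function where only W, T, and F vary per model. The bucket edges and weights below are made-up values; in practice W would come from whichever training library the data scientist prefers.

```python
import math
from typing import Callable, List, Sequence


def identity(z: float) -> float:
    return z


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bucketize(x: float, edges: Sequence[float]) -> List[float]:
    """T: one-hot encode the bucket that x falls into (edges must be sorted)."""
    index = sum(1 for edge in edges if x >= edge)
    return [1.0 if i == index else 0.0 for i in range(len(edges) + 1)]


def predict(w: Sequence[float], t_x: Sequence[float],
            link: Callable[[float], float]) -> float:
    """Compute F(<W, T(X)>): an inner product followed by the link function F."""
    z = sum(wi * xi for wi, xi in zip(w, t_x))
    return link(z)


edges = [18, 65]          # hypothetical bucket boundaries for a numeric feature
w = [-1.0, 0.0, 2.0]      # hypothetical learned weights, one per bucket

probability = predict(w, bucketize(30, edges), sigmoid)    # classification
regression = predict(w, bucketize(70, edges), identity)    # continuous output
rate = predict(w, bucketize(10, edges), math.exp)          # count-style output
```

The serving side needs only the weights and the names of T and F, so it stays the same whether W came from logistic regression, a linear SVM, or Poisson regression.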
23
Decouple Training from Prediction: Generalized Linear Models
Ranking(X) = arg sort_{i ∈ I} <W_i, T(X, i)>
Learn W_i:
Softmax, One-vs-All, etc.: Multiclass Classification
Cost Sensitive Classification: Multilabel Classification
Word2Vec, GloVe: Cosine / Euclidean k-Nearest Neighbours
Matrix Factorization(s): Recommender Systems
Tradeoff diagram labels: Training Flexibility / Feature Space Complexity, Model Complexity
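The ranking form scores every candidate item with an inner product and sorts the items by score. The sketch below assumes descending order (the slide does not state the direction) and, for simplicity, a T(X, i) that returns the same request features for every item; the item names and weights are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def rank(item_weights: Dict[str, Sequence[float]],
         transform: Callable[[str], Sequence[float]]) -> List[str]:
    """Return item ids ordered by descending score <W_i, T(X, i)>."""
    scores = {item: dot(w, transform(item)) for item, w in item_weights.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-item weight vectors W_i.
item_weights = {
    "hotel_a": [1.0, 0.0],
    "hotel_b": [0.0, 1.0],
    "hotel_c": [0.5, 0.5],
}
x = [0.2, 0.9]  # hypothetical request features


def t(item: str) -> Sequence[float]:
    # Simplest possible T(X, i): ignore the item and return X unchanged.
    return x


ranking = rank(item_weights, t)
```

With embeddings or factorization vectors as W_i, the same function covers nearest-neighbour search and recommendation.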
25
Decouple Training from Prediction: Beyond
Tree Based Models
Neural Networks
Your unique awesome algorithm
Tradeoff diagram labels: Model Complexity / Feature Space Complexity, Training Flexibility
26
Second Challenge: Monitoring.
Missing Information: Labels are only available for a subsample of the population
Delayed Information: Labels are only available after weeks
Changing Information: Labels and Feature Space distribution are not stationary
27
Response Distribution Analysis.
Histogram of the Probabilities the model outputs for each presented example
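Building such a histogram needs only the model's outputs, not the labels. A minimal sketch, assuming equal-width bins over [0, 1] (the bin count is an arbitrary choice):

```python
from typing import Iterable, List


def response_histogram(probabilities: Iterable[float], bins: int = 10) -> List[int]:
    """Count model outputs per equal-width bin over the interval [0, 1]."""
    counts = [0] * bins
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(p * bins), bins - 1)
        counts[index] += 1
    return counts


histogram = response_histogram([0.1, 0.3, 0.6, 0.9, 1.0], bins=4)
```

Every served prediction contributes to the counts, including examples whose labels never arrive.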
28
Response Distribution Analysis: An Omniscient Model
Is always right, and it knows it
29
Response Distribution Analysis: A Confused Model
Main Characteristics: Single mode, Central mode, No stable point
Potential Root Causes: High Bayes Error
30
Response Distribution Analysis: An Overconfident Model
Main Characteristics: Extreme mode, Single mode, High frequency mode
Potential Root Causes: Cold start, Outliers, Wrong Feature Scaling
31
Response Distribution Analysis: A Maybe-Good Model
Main Characteristics: Bi-modal, Smooth, Wide Support, Single stable point
33
Response Distribution Analysis: Too Good to be True
34
Response Distribution Analysis: Robots Travel for Leisure
35
Second Challenge: Monitoring.
Missing Information → Global Feedback: Considers all examples, even those for which labels will never be available
Delayed Information → Immediate Feedback: It can be computed as soon as the model makes predictions
Changing Information → Responsive Feedback: The histogram is very sensitive to changes
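One way to act on the histogram's sensitivity is to compare the histogram from a reference period with the current one. The sketch below uses total variation distance; this metric is an assumption chosen for illustration, and the slides do not say which comparison, if any, is used.

```python
from typing import List, Sequence


def normalize(counts: Sequence[int]) -> List[float]:
    """Turn raw bin counts into relative frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]


def total_variation(counts_a: Sequence[int], counts_b: Sequence[int]) -> float:
    """Return a distance in [0, 1]: 0 for identical shapes, 1 for disjoint ones."""
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must use the same bins")
    freq_a, freq_b = normalize(counts_a), normalize(counts_b)
    return 0.5 * sum(abs(a - b) for a, b in zip(freq_a, freq_b))


# Hypothetical bin counts from two monitoring windows.
last_week = [8, 2]
today = [5, 5]
shift = total_variation(last_week, today)
```

A large distance does not say what went wrong, only that the response distribution moved and deserves a closer look.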
36
100+ Machine Learning Models Running Live.
The approach: Continuous Learning, Centralized ML Repository, Principled Tradeoffs, Heuristics
37
Thank you! Check out our blog! booking.ai