100+ Machine Learning Models running live: The approach

1 100+ Machine Learning Models running live: The approach
Lucas Bernardi - Principal Data Scientist

2 Mission to empower people to experience the world
28+ million reported listings
5.6+ million are homes, apartments and other unique places to stay
141+ thousand destinations
1.5+ million room nights/day
Terabytes of data every day
200+ Machine Learning Models Deployed

3

4 Machine Learning

5 Machine Learning
Personalization, NLP, Recommendations, Metric Learning, Ranking, Vision, Prediction

6 Why do we need 100s of models?

7 Continuous Learning.

8 Continuous Learning.
1. Product Team has an Idea
2. Turn the Idea into a Hypothesis
3. Design an Experiment
4. Build an ML Model when necessary
5. Run the Experiment
6. Learn from Results
7. Repeat

9 Insight. About 30% of searches done by users travelling with kids have no information about children

10 Hypothesis : They forget their Children

11 Hypothesis : They forget their Children

12 Experiment.

13 How do we support the demand?

14 RS: A Central Repository for Machine Learning Models.
Deploy: Data Scientists can easily deploy their models.
Discover: Product teams can find new and existing models and use them in their products.
Consume: Developers invoke the model through a standard call to the Repository.
Monitor: Monitor the health of the model in production.

15 Diversity Gives Us Strength
Programming Languages, Libraries, Backgrounds

16 Decouple Training from Prediction.

17 Decouple Training from Prediction: Lookup Tables
Table that maps Input to Predictions
A request just requires a lookup
Fast, Scalable, Reliable

18 Decouple Training from Prediction: Lookup Tables
Table that maps Input to Predictions
A request just requires a lookup
Fast, Scalable, Reliable
[Tradeoff diagram: Feature Space Complexity; Training Flexibility / Model Complexity]
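
The lookup-table idea on this slide can be sketched roughly as follows. This is only an illustration, not the speaker's implementation: the function names `train_lookup_table` and `predict`, the use of the mean label as the stored prediction, and the `default` fallback for unseen inputs are all assumptions made for the example.

```python
from typing import Dict, Hashable, Iterable, Tuple


def train_lookup_table(
    examples: Iterable[Tuple[Hashable, float]]
) -> Dict[Hashable, float]:
    """Offline step (assumed): store the mean label seen for each input key."""
    sums: Dict[Hashable, Tuple[float, int]] = {}
    for key, label in examples:
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + label, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


def predict(table: Dict[Hashable, float], key: Hashable, default: float = 0.0) -> float:
    """Online step: a request is just a dictionary lookup."""
    return table.get(key, default)


# Hypothetical training data: (country, device) -> did the user convert?
examples = [
    (("NL", "mobile"), 1.0),
    (("NL", "mobile"), 0.0),
    (("US", "desktop"), 1.0),
]
table = train_lookup_table(examples)
```

Because serving only needs the finished table, training can happen in any language or library, which is the decoupling the slide title refers to. The cost is that every input the model should handle has to appear as a key, so the feature space must stay small.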

19 Decouple Training from Prediction: Generalized Linear Models
Prediction(X) = F(<W, T(X)>)

20 Decouple Training from Prediction: Generalized Linear Models
Prediction(X) = F(<W, T(X)>)
Learn W: Naive Bayes Classifier, Logistic Regression, Linear SVM, Linear Regression, Poisson Regression, Neg Binomial Regression, Quantile Regression, Beta Regression
Choose T: Bucketing, Substitution, Interaction
Choose F:
Identity: Continuous Regression
Sigmoid: Probabilistic Classification
Exponential: Discrete Regression
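
A minimal sketch of serving Prediction(X) = F(<W, T(X)>), assuming the weights W were already learned elsewhere. The names `LINKS`, `bucket_transform` and `predict` are hypothetical, and the one-hot bucketing shown is just one possible choice of T.

```python
import math
from typing import Callable, Dict, List, Sequence


def inner(w: Sequence[float], t: Sequence[float]) -> float:
    """Inner product <W, T(X)>."""
    return sum(wi * ti for wi, ti in zip(w, t))


# Choices of F named on the slide.
LINKS: Dict[str, Callable[[float], float]] = {
    "identity": lambda z: z,                         # continuous regression
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-z)),  # probabilistic classification
    "exponential": math.exp,                          # discrete (count) regression
}


def bucket_transform(x: float, edges: Sequence[float]) -> List[float]:
    """One possible T (assumed): one-hot encode the bucket that x falls into."""
    index = sum(1 for edge in edges if x >= edge)
    features = [0.0] * (len(edges) + 1)
    features[index] = 1.0
    return features


def predict(x: float, w: Sequence[float], edges: Sequence[float], link: str) -> float:
    """Serve a prediction from learned weights w, a transform T and a link F."""
    return LINKS[link](inner(w, bucket_transform(x, edges)))


edges = [3.0, 7.0]     # hypothetical bucket boundaries
w = [0.0, 1.0, 2.0]    # hypothetical learned weights, one per bucket
```

The serving side only needs W, the identifier of T and the identifier of F, so any of the training methods listed on the slide can produce a model that the same prediction code can run.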

21 Decouple Training from Prediction: Generalized Linear Models
Ranking(X) = arg sort i ∈ I <Wi, T(X,i)>

22 Decouple Training from Prediction: Generalized Linear Models
Ranking(X) = arg sort i ∈ I <Wi, T(X,i)>
Learn Wi:
Softmax, One-vs-All, etc: Multiclass Classification
Cost Sensitive Classification: Multilabel Classification
Word2Vec, GloVe: Cosine / Euclidean k-Nearest Neighbours
Matrix Factorization(s): Recommender Systems

23 Decouple Training from Prediction: Generalized Linear Models
Ranking(X) = arg sort i ∈ I <Wi, T(X,i)>
Learn Wi:
Softmax, One-vs-All, etc: Multiclass Classification
Cost Sensitive Classification: Multilabel Classification
Word2Vec, GloVe: Cosine / Euclidean k-Nearest Neighbours
Matrix Factorization(s): Recommender Systems
[Tradeoff diagram: Training Flexibility / Feature Space Complexity; Model Complexity]
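
The ranking formula Ranking(X) = arg sort i ∈ I <Wi, T(X,i)> could be served along these lines. This is a sketch under assumptions: the slide does not say whether the sort is ascending or descending (highest score first is assumed here), and `rank`, `shared_transform` and the toy weights are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence


def inner(w: Sequence[float], t: Sequence[float]) -> float:
    """Inner product <Wi, T(X, i)>."""
    return sum(wi * ti for wi, ti in zip(w, t))


def rank(
    x: Sequence[float],
    item_weights: Dict[Hashable, Sequence[float]],
    transform: Callable[[Sequence[float], Hashable], Sequence[float]],
) -> List[Hashable]:
    """Score every item i with <Wi, T(X, i)> and sort, best first (assumed order)."""
    scores = {
        item: inner(w, transform(x, item)) for item, w in item_weights.items()
    }
    return sorted(scores, key=lambda item: scores[item], reverse=True)


def shared_transform(x: Sequence[float], item: Hashable) -> Sequence[float]:
    """Simplest possible T (assumed): ignore the item and pass x through."""
    return list(x)


# Hypothetical per-item weight vectors, e.g. item embeddings or class weights.
item_weights = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank([2.0, 3.0], item_weights, shared_transform)
```

Multiclass classifiers, embedding models and matrix factorization all reduce to this shape at prediction time: a table of per-item vectors Wi plus an inner product and a sort.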

24 Decouple Training from Prediction: Beyond
Tree Based Models
Neural Networks
Your unique awesome algorithm

25 Decouple Training from Prediction: Beyond
Tree Based Models
Neural Networks
Your unique awesome algorithm
[Tradeoff diagram: Model Complexity / Feature Space Complexity; Training Flexibility]

26 Second Challenge: Monitoring.
Missing Information: Labels are only available for a subsample of the population
Delayed Information: Labels are only available after weeks
Changing Information: Labels and Feature Space distribution are not stationary

27 Response Distribution Analysis.
Histogram of the Probabilities the model outputs for each presented example
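
As a rough illustration of the histogram described on this slide, the sketch below bins the probabilities a binary classifier outputs and normalizes the counts. The function name `response_histogram` and the choice of ten equal-width bins are assumptions, not details from the talk.

```python
from typing import List, Sequence


def response_histogram(probabilities: Sequence[float], bins: int = 10) -> List[float]:
    """Fraction of predictions falling into each equal-width bin of [0, 1]."""
    counts = [0] * bins
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
        # p == 1.0 would land in bin `bins`, so clamp it into the last bin.
        index = min(int(p * bins), bins - 1)
        counts[index] += 1
    total = len(probabilities)
    if total == 0:
        return [0.0] * bins
    return [count / total for count in counts]


# Hypothetical model outputs for four served requests.
histogram = response_histogram([0.05, 0.05, 0.95, 1.0])
```

No labels are involved, so the histogram can be computed for every served request as soon as the prediction is made.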

28 Response Distribution Analysis: An Omniscient Model
Is always right, and it knows it

29 Response Distribution Analysis: A Confused Model
Main Characteristics: Single mode, Central mode, No stable point
Potential Root Causes: High Bayes Error

30 Response Distribution Analysis: An Overconfident Model
Main Characteristics: Extreme mode, Single mode, High frequency mode
Potential Root Causes: Cold start, Outliers, Wrong Feature Scaling

31 Response Distribution Analysis: A Maybe-Good Model
Main Characteristics: Bi-modal, Smooth, Wide Support, Single stable point
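
The shape descriptions on the preceding slides (single central mode, single extreme mode, bi-modal with wide support) could be turned into automated checks roughly as below. Everything here is an assumption made for illustration: the slides give no thresholds, so the middle-third rule, the 0.5 mass cut-off, the 0.5 support cut-off and the names `count_modes` and `describe_shape` are hypothetical, and the "stable point" property is not modelled.

```python
from typing import List, Sequence


def count_modes(hist: Sequence[float]) -> List[int]:
    """Indices of bins strictly higher than both neighbours (edges compare to -1)."""
    modes = []
    for i, value in enumerate(hist):
        left = hist[i - 1] if i > 0 else -1.0
        right = hist[i + 1] if i < len(hist) - 1 else -1.0
        if value > left and value > right:
            modes.append(i)
    return modes


def describe_shape(hist: Sequence[float]) -> str:
    """Map a normalized response histogram to one of the slide's labels (assumed rules)."""
    n = len(hist)
    modes = count_modes(hist)
    support = sum(1 for value in hist if value > 0) / n
    if len(modes) == 1:
        peak = modes[0]
        if n / 3 <= peak < 2 * n / 3:
            return "confused"        # single, central mode
        if peak in (0, n - 1) and hist[peak] >= 0.5:
            return "overconfident"   # single, extreme, high-frequency mode
    if len(modes) == 2 and support >= 0.5:
        return "maybe-good"          # bi-modal with wide support
    return "unclassified"


# Hypothetical normalized histograms with ten bins each.
confused = [0.0, 0.0, 0.1, 0.2, 0.4, 0.2, 0.1, 0.0, 0.0, 0.0]
overconfident = [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
maybe_good = [0.3, 0.15, 0.05, 0.02, 0.01, 0.02, 0.05, 0.1, 0.1, 0.2]
```

Rules like these only flag candidates for a human to inspect. As the "Maybe-Good" label suggests, a healthy-looking histogram is not proof that the model is correct.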

32 Response Distribution Analysis: A Maybe-Good Model

33 Response Distribution Analysis: Too Good to be True

34 Response Distribution Analysis: Robots Travel for Leisure

35 Second Challenge: Monitoring.
Missing Information -> Global Feedback: Considers all examples, even those for which labels will never be available
Delayed Information -> Immediate Feedback: It can be computed as soon as the model makes predictions
Changing Information -> Responsive Feedback: The histogram is very sensitive to changes

36 100+ Machine Learning Models Running Live: The approach
Continuous Learning
Centralized ML Repository
Principled Tradeoffs
Heuristics

37 Thank you! Check out our blog! booking.ai

