Published by Daniel Gallagher. Modified over 9 years ago.
1
Probabilistic Machine Learning in Computational Advertising
Thore Graepel, Thomas Borchert, Ralf Herbrich and Joaquin Quiñonero Candela
Online Services and Advertising, Microsoft Research Cambridge, UK
NIPS 2009 – December 2009
2
Outline
Online Advertising and Paid Search
AdPredictor™: Predicting User Clicks on Ads
[Appendix]
– Model shrinking
– Parallel training
3
ONLINE ADVERTISING AND PAID SEARCH
4
Advertising Industry Business: Size
[Figure: annual expenditure (in billion USD, axis 0 to 600) by year, 2001 to 2006, for Outdoor, Cinema, Radio, TV, Print and Online advertising, with GDP Denmark (2006) and Microsoft Revenue (2008) marked for comparison. Data: World Advertising Research Center Report 2007]
5
Advertising Industry Business: Growth
[Figure: annual growth (axis -20% to 50%) by year, 2001 to 2006, for Outdoor, Cinema, Radio, TV, Print and Online advertising. Data: World Advertising Research Center Report 2007]
7
Importance of accurate probability estimates:

                                      Ad 1     Ad 2     Ad 3
Bid                                  $1.00    $2.00    $0.10
Click probability                    * 10%    * 4%     * 50%
Display to users (expected bid)     =$0.10   =$0.08   =$0.05
Charge advertisers (per click)       $0.80    $1.25    $0.05

Increase user satisfaction by better targeting
Fairer charges to advertisers
Increase revenue by showing ads with high click-thru rate
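The numbers on this slide are consistent with ranking ads by expected bid (bid times click probability) and charging each advertiser just enough per click to match the expected bid of the ad ranked below. The slide does not name the pricing rule, so the sketch below assumes that rule; the ad labels `A`, `B`, `C` and the function name `rank_and_price` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str       # hypothetical label, not from the slide
    bid: float      # advertiser's bid per click, in USD
    p_click: float  # estimated click probability


def rank_and_price(ads):
    """Rank ads by expected bid; price each click from the runner-up (assumed rule)."""
    ranked = sorted(ads, key=lambda ad: ad.bid * ad.p_click, reverse=True)
    results = []
    for i, ad in enumerate(ranked):
        expected_bid = ad.bid * ad.p_click
        if i + 1 < len(ranked):
            runner_up = ranked[i + 1]
            # Smallest per-click price that keeps this ad's expected bid
            # level with the runner-up's expected bid.
            charge = runner_up.bid * runner_up.p_click / ad.p_click
        else:
            # No runner-up: the slide shows $0.05 here, but the rule behind
            # that figure is not stated, so it is left undefined.
            charge = None
        results.append((ad.name, expected_bid, charge))
    return results


ads = [Ad("A", 1.00, 0.10), Ad("B", 2.00, 0.04), Ad("C", 0.10, 0.50)]
auction = rank_and_price(ads)
for name, expected_bid, charge in auction:
    print(name, round(expected_bid, 2), charge if charge is None else round(charge, 2))
```

Because both the ranking and the charge divide or multiply by the click probability, an error in that estimate changes which ads are shown and what advertisers pay.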
8
The Scale of Things
Realistic training set for proof of concept: 7,000,000,000 impressions
2 weeks of CPU time during training: 2 wks × 7 days × 86,400 sec/day = 1,209,600 seconds
Learning algorithm speed requirement:
– 5,787 impression updates / sec
– 172.8 μs per impression update
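The speed requirement follows directly from dividing the number of impressions by the available seconds. A minimal check of that arithmetic (variable names are illustrative):

```python
impressions = 7_000_000_000            # training set size from the slide
seconds = 2 * 7 * 86_400               # two weeks of CPU time, in seconds

updates_per_second = impressions / seconds
microseconds_per_update = 1e6 / updates_per_second

print(seconds)                          # total time budget in seconds
print(int(updates_per_second))          # whole updates needed per second
print(round(microseconds_per_update, 1))  # time budget per update in microseconds
```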
9
ADPREDICTOR Bayesian Linear Probit Regression
10
Impression Level Predictions
11
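One Weight per Feature Value
[Figure: one weight per value of each feature. Client IP: 102.34.12.201, 15.70.165.9, 221.98.2.187, 92.154.3.86. Match Type: Exact Match, Broad Match. Position: ML-1, SB-1, SB-2. The weights of the active values are summed (+) to give pClick.]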
12
Click Potential
Linear: click potential = sum of feature click contributions
[Figure: the contributions of PageNumber/DisplayPosition/ReturnedAds = 0/ML-1/2, ListingId = 798831 and ClientIP = 98.0.101.23 add up to the impression click potential on the click potential axis; the axis is split at 0 into click and no click.]
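The linear click potential can be sketched as a lookup of one weight per active feature value, followed by a sum. The feature names and values below are taken from the slide; the weight values are invented purely for illustration, and the function name `click_potential` is hypothetical.

```python
# One weight per (feature, value) pair. The numeric weights are made up.
weights = {
    ("PageNumber/DisplayPosition/ReturnedAds", "0/ML-1/2"): 0.3,
    ("ListingId", "798831"): -0.8,
    ("ClientIP", "98.0.101.23"): 0.1,
}

# An impression activates exactly one value per feature.
impression = {
    "PageNumber/DisplayPosition/ReturnedAds": "0/ML-1/2",
    "ListingId": "798831",
    "ClientIP": "98.0.101.23",
}


def click_potential(impression, weights):
    """Sum the weights of the active feature values; unseen values contribute 0."""
    return sum(weights.get((feature, value), 0.0)
               for feature, value in impression.items())


potential = click_potential(impression, weights)
print(potential)
```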
13
Gaussian Noise
Probit: area under Gaussian tail as a function of click potential
P(click) = P(potential > 0)
[Figure: Gaussian noise centred on the impression click potential, drawn on the click potential axis, which is split at 0 into click and no click.]
14
Probit
Probit: area under Gaussian tail as a function of click potential
[Figure: P(click|Impression), on an axis up to 100%, as a function of the impression click potential.]
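If the noise added to the click potential is Gaussian with zero mean, the area of the tail above 0 is the Gaussian cumulative distribution function evaluated at the potential divided by the noise standard deviation. A minimal sketch, where `noise_std` is an assumed parameter (the slides do not give its value):

```python
import math


def probit_click_probability(potential, noise_std=1.0):
    """P(potential + noise > 0) for zero-mean Gaussian noise with std noise_std."""
    return 0.5 * (1.0 + math.erf(potential / (noise_std * math.sqrt(2.0))))


for potential in (-2.0, -0.4, 0.0, 0.4, 2.0):
    print(potential, probit_click_probability(potential))
```

A potential of exactly 0 gives a click probability of one half, and the probability approaches 100% as the potential grows.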
15
Modelling Uncertainty
[Figure: click potential axis with 0 marked, showing the features PageNumber/DisplayPosition/ReturnedAds = 0/ML-1/2, ListingId = 798831 and ClientIP = 98.0.101.23.]
16
Uncertainty about the Potential
[Figure: the impression click potential on the click potential axis, with 0 marked.]
17
Probability of Click
[Figure: probability of click, on an axis up to 100%, against the impression click potential.]
18
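Uncertainty: Bayesian Probabilities
[Figure: one uncertain weight per value of each feature. Client IP: 102.34.12.201, 15.70.165.9, 221.98.2.187, 92.154.3.86. Match Type: Exact Match, Broad Match. Position: ML-1, SB-1, SB-2. The active weights are summed (+) to give a distribution p(pClick).]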
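One way to make the uncertainty concrete is to keep a Gaussian belief (mean and variance) for every weight. Assuming the beliefs are independent and the noise is Gaussian with standard deviation `beta`, the sum of active weights plus noise is again Gaussian, so the predicted click probability has a closed form. The sketch below rests on those assumptions; `beta`, the prior, and the belief values are illustrative and not taken from the slides.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predict_click(active, beliefs, prior=(0.0, 1.0), beta=1.0):
    """Click probability when each active weight has a Gaussian (mean, variance) belief."""
    total_mean = 0.0
    total_var = beta ** 2               # noise variance
    for key in active:
        mean, var = beliefs.get(key, prior)  # unseen values fall back to the prior
        total_mean += mean
        total_var += var
    return normal_cdf(total_mean / math.sqrt(total_var))


# Illustrative beliefs: (mean, variance) per (feature, value).
beliefs = {
    ("ClientIP", "102.34.12.201"): (0.2, 0.5),
    ("MatchType", "Exact Match"): (0.4, 0.1),
    ("Position", "ML-1"): (0.6, 0.05),
}
active = list(beliefs)

p_uncertain = predict_click(active, beliefs)
# Same means with zero variance: the prediction moves further from 50%.
p_certain = predict_click(active, {k: (m, 0.0) for k, (m, _) in beliefs.items()})
print(p_uncertain, p_certain)
```

Larger variances pull the prediction towards 50%, which is how uncertainty about the weights shows up in the predicted click probability.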
19
Principled Exploration
20
Training Algorithm in Action
[Figure: two panels, Prediction and Training/Update, each showing weights w1 and w2, a sum node (+), and variables z and c, with the outcomes Click and No Click.]
21
Posterior Updates for the Click Event
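The update equations themselves did not survive in this transcript. As a hedged sketch, the code below uses the standard moment-matching update for Gaussian weight beliefs under a probit likelihood; it is an assumption that this matches what the slide showed. The names `update`, `beta` and the prior values are illustrative.

```python
import math


def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def update(beliefs, active, clicked, prior=(0.0, 1.0), beta=1.0):
    """Update (mean, variance) beliefs of the active weights after one impression."""
    y = 1.0 if clicked else -1.0
    current = [beliefs.get(key, prior) for key in active]
    total_var = beta ** 2 + sum(var for _, var in current)
    total_std = math.sqrt(total_var)
    t = y * sum(mean for mean, _ in current) / total_std
    # Correction terms of the truncated Gaussian. For strongly negative t,
    # normal_cdf(t) underflows; a production version would need a stable form.
    v = normal_pdf(t) / normal_cdf(t)
    w = v * (v + t)
    for key, (mean, var) in zip(active, current):
        new_mean = mean + y * (var / total_std) * v
        new_var = var * (1.0 - (var / total_var) * w)
        beliefs[key] = (new_mean, new_var)
    return beliefs


beliefs = {}
active = [("ClientIP", "98.0.101.23"), ("ListingId", "798831")]
after_click = dict(update(beliefs, active, clicked=True))
after_no_click = dict(update(beliefs, active, clicked=False))
print(after_click)
print(after_no_click)
```

In this form the step size for each weight scales with its variance, so weights that are still uncertain move more than weights that have been observed often.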
22
Client IP: Mean & Variance
23
Calibrated Predictions
24
Joint Updates vs. Independent Aggregation Naive Bayes
25
adPredictor Wrap Up
– Automatic learning rate
– Calibrated: 2% prediction means 2% clicks
– Use of many features, even if correlated
– Modelling the uncertainty explicitly
– Natural exploration mode
– Parallelizable with approximate inference
26
Thank you! thoreg@microsoft.com
27
APPENDIX
28
Dealing with Millions of Variables
Observation 1: Large variable bags follow a power-law w.r.t. frequency of items
Observation 2: Weight posteriors of rare items are close to their prior
Idea:
1. Initially, the belief of each new item is compactly represented by one (and the same) prior
2. After observing an item for the first time, the posterior is allocated
3. At regular intervals, all weight posteriors with a small deviation from the prior are removed
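The three steps above can be sketched with a dictionary that stores only allocated posteriors and falls back to a single shared prior. The slide does not say how "deviation from the prior" is measured; the sketch assumes the KL divergence between the Gaussian posterior and the Gaussian prior, and the class name `SparseBeliefs` and the threshold value are hypothetical.

```python
import math


def kl_to_prior(mean, var, prior_mean, prior_var):
    """KL divergence from the N(mean, var) posterior to the N(prior_mean, prior_var) prior."""
    return 0.5 * (var / prior_var
                  + (mean - prior_mean) ** 2 / prior_var
                  - 1.0
                  + math.log(prior_var / var))


class SparseBeliefs:
    def __init__(self, prior=(0.0, 1.0)):
        self.prior = prior        # step 1: one shared prior for all unseen items
        self.posteriors = {}      # step 2: allocated on first observation

    def get(self, key):
        return self.posteriors.get(key, self.prior)

    def set(self, key, mean, var):
        self.posteriors[key] = (mean, var)

    def prune(self, threshold):
        """Step 3: drop posteriors that are still close to the prior."""
        prior_mean, prior_var = self.prior
        keep = {key: (mean, var)
                for key, (mean, var) in self.posteriors.items()
                if kl_to_prior(mean, var, prior_mean, prior_var) >= threshold}
        removed = len(self.posteriors) - len(keep)
        self.posteriors = keep
        return removed


model = SparseBeliefs()
model.set("rare item", 0.01, 0.99)      # barely moved from the prior
model.set("frequent item", 1.5, 0.2)    # far from the prior
removed = model.prune(threshold=0.01)
print(removed, sorted(model.posteriors))
```

After pruning, a removed item reads back as the prior, so dropping it changes predictions only slightly while freeing its storage.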
29
Naïve Approach – Shared Memory
Does not scale
– Constant contention for locks
– Some features are very frequent
– Synchronization issues
[Figure: Training Node 1 (Impression A: MSNH11, 10.0.0.1, USA (etc)) and Training Node 2 (Impression B: MSNH11, Canada, 10.0.1.25 (etc)) both Update the same Model File. Conflict!]
30
Proposal: Approximate Learning
[Figure: Train Node 1 (Impression A: MSNH11, 10.0.0.1, USA (etc)) and Train Node 2 (Impression B: MSNH11, Canada, 10.0.1.25 (etc)) each Update independently; Merge Deltas, then Update the Final Model File.]
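The slide does not say how deltas are merged. One plausible sketch, stated as an assumption: each node trains on its own copy of a shared base model, and the per-weight changes are added in the natural parameters of the Gaussian (precision and precision times mean), where independent evidence combines additively. The function names and the numeric beliefs are illustrative.

```python
def to_natural(mean, var):
    """(mean, variance) -> (precision, precision * mean)."""
    return 1.0 / var, mean / var


def from_natural(precision, precision_mean):
    return precision_mean / precision, 1.0 / precision


def merge_deltas(base, node_models, prior=(0.0, 1.0)):
    """Add each node's change relative to the base model, in natural parameters."""
    keys = set(base)
    for model in node_models:
        keys.update(model)
    merged = {}
    for key in keys:
        base_belief = base.get(key, prior)
        base_tau, base_pi = to_natural(*base_belief)
        tau, pi = base_tau, base_pi
        for model in node_models:
            # A node that never saw this key contributes a zero delta.
            node_tau, node_pi = to_natural(*model.get(key, base_belief))
            tau += node_tau - base_tau
            pi += node_pi - base_pi
        merged[key] = from_natural(tau, pi)
    return merged


base = {"MSNH11": (0.0, 1.0)}
node_1 = {"MSNH11": (0.5, 0.5), "USA": (0.2, 0.8)}        # saw Impression A
node_2 = {"MSNH11": (-0.25, 0.5), "Canada": (-0.1, 0.9)}  # saw Impression B
final_model = merge_deltas(base, [node_1, node_2])
print(final_model)
```

Features seen by only one node keep that node's posterior, while a shared feature such as `MSNH11` accumulates evidence from both nodes without any lock on a shared model file. The result is approximate because each node updated against a stale copy of the shared weights.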