Published by Patrick Horton. Modified over 9 years ago.
1
Catching the Drift: Learning Broad Matches from Clickthrough Data
Sonal Gupta, Mikhail Bilenko, Matthew Richardson
University of Texas at Austin, Microsoft Research
2
Introduction
Keyword-based online advertising: bidded keywords are extracted from context
  Context: query (search ads) or page (content ads)
Broad matching: expanding keywords via a keyword-to-keywords mapping
  Example: electric cars → tesla, hybrids, toyota prius, golf carts
Broad matching benefits advertisers (increased reach, less campaign tuning), users (more relevant ads), and the ad platform (higher monetization)
[Figure: Query or Web Page → Keyword Extraction → Extracted Keywords (kw 1, kw 2, …, kw n) → Broad Match Expansion → Expanded Keywords (kw 1 → kw 11, kw 12; …; kw n → kw n1, kw n2) → Ad Selection and Ranking → Selected Ads (Ad 1, Ad 2, …, Ad k)]
3
Identifying Broad Matches
Good keyword mappings retrieve relevant ads that users click
How to measure what is relevant and likely to be clicked?
  Human judgments: expensive, hard to scale
  Past user clicks: provide click data for kw → kw' when the user was shown ad(kw') in the context of kw
    Highly available, less trustworthy
What similarity functions may indicate relevance of kw → kw'?
  Syntactic (edit distance, TF-IDF cosine, string kernels, …)
  Co-occurrence (in documents, query sessions, bid campaigns, …)
  Expanded representation (search result snippets, category bags, …)
4
Approach
Task: train a learner to estimate p(click | kw → kw') for any kw → kw'
Data: (kw, kw', click) triples from clickthrough logs, where kw → kw' was suggested by previous broad match mappings
Features
  Convert each pair to a feature vector capturing similarities etc.:
  (kw → kw') → [ϕ1(kw, kw'), …, ϕn(kw, kw')], where ϕi(kw, kw') can be any function of kw, kw', or both
  For each triple, create an instance: (ϕ(kw, kw'), click)
Learner: max-margin averaged perceptron (strong theory, very efficient)
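The learner named on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `score` and `train_averaged_perceptron`, the `epochs` and `margin` parameters, the absence of a bias term, and the mapping of click / no click to +1 / -1 are all assumptions. The slides also do not say how the perceptron score is turned into p(click), so that step is left out.

```python
from typing import List, Sequence, Tuple

Instance = Tuple[Sequence[float], int]  # (feature vector phi(kw, kw'), click in {0, 1})


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear score w . x for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_averaged_perceptron(
    instances: List[Instance], epochs: int = 5, margin: float = 1.0
) -> List[float]:
    """Margin perceptron whose output is the average of all intermediate weight vectors."""
    n = len(instances[0][0])
    w = [0.0] * n       # current hypothesis
    w_sum = [0.0] * n   # running sum of hypotheses, one term per example seen
    steps = 0
    for _ in range(epochs):
        for x, click in instances:
            y = 1.0 if click else -1.0
            # Update whenever the example is misclassified or inside the margin.
            if y * score(w, x) <= margin:
                w = [wi + y * xi for wi, xi in zip(w, x)]
            w_sum = [si + wi for si, wi in zip(w_sum, w)]
            steps += 1
    return [si / steps for si in w_sum]


# The two instances from the "Creating an Instance" slide.
toy = [([0.78, 0.001, 0.9], 1), ([0.05, 0.02, 0.2], 0)]
w_avg = train_averaged_perceptron(toy)
```

On this toy data the clicked pair ends up with a higher score than the non-clicked one, which is all the example is meant to show.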
5
Example: Creating an Instance
Historical broad match clickthrough data:
  kw                 kw'                ad(kw')                     click event
  digital slr        canon rebel        Canon Rebel Kit for $499    click
  seattle baseball   mariners tickets   Mariners season tickets     no click
Feature functions:
  Original kw        Broad match kw'    ϕ1     ϕ2      ϕ3
  digital slr        canon rebel        0.78   0.001   0.9
  seattle baseball   mariners tickets   0.05   0.02    0.2
Instances:
  [0.78 0.001 0.9], 1
  [0.05 0.02 0.2], 0
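The mapping from a log row to an instance might look like the sketch below. The three feature functions (`token_jaccard`, `char_similarity`, `length_ratio`) are hypothetical stand-ins chosen only because they are easy to compute; they are not the ϕ1, ϕ2, ϕ3 of the slide and will not reproduce its numbers.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def token_jaccard(kw: str, kw2: str) -> float:
    """Overlap of the word sets of the two keywords (hypothetical feature)."""
    a, b = set(kw.split()), set(kw2.split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def char_similarity(kw: str, kw2: str) -> float:
    """Character-level similarity in [0, 1] from difflib (hypothetical feature)."""
    return SequenceMatcher(None, kw, kw2).ratio()


def length_ratio(kw: str, kw2: str) -> float:
    """Shorter length divided by longer length (hypothetical feature)."""
    longest = max(len(kw), len(kw2))
    return min(len(kw), len(kw2)) / longest if longest else 1.0


FEATURES: List[Callable[[str, str], float]] = [token_jaccard, char_similarity, length_ratio]


def make_instance(kw: str, kw2: str, clicked: bool) -> Tuple[List[float], int]:
    """Build (phi(kw, kw'), click) from one clickthrough-log row."""
    return [f(kw, kw2) for f in FEATURES], int(clicked)


log_rows = [
    ("digital slr", "canon rebel", True),
    ("seattle baseball", "mariners tickets", False),
]
instances = [make_instance(*row) for row in log_rows]
```

Any function of kw, kw', or both can be appended to `FEATURES` without changing the rest of the pipeline, which is the property the slide emphasizes.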
6
Experiments
Data
  2 months of previous broad match ads from Microsoft Content Ads logs
  1 month for training, 1 month for testing
  68 features (syntactic, co-occurrence based, etc.); greedy feature selection
Metrics
  LogLoss
  LogLoss Lift: difference between the obtained LogLoss and that of an oracle that has access to the empirical p(click | kw → kw') in the test set
  CTR and revenue results in a live test with users
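The two offline metrics can be sketched as below. The LogLoss formula itself did not survive in this transcript, so the code assumes the standard average binary cross-entropy with natural logarithms; the function names and the `eps` clipping are likewise assumptions.

```python
import math
from typing import Sequence


def log_loss(labels: Sequence[int], probs: Sequence[float], eps: float = 1e-12) -> float:
    """Average binary cross-entropy of predicted click probabilities (assumed definition)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def log_loss_lift(
    labels: Sequence[int], model_probs: Sequence[float], oracle_probs: Sequence[float]
) -> float:
    """Model LogLoss minus oracle LogLoss; smaller means closer to the oracle."""
    return log_loss(labels, model_probs) - log_loss(labels, oracle_probs)


# Made-up toy data: the oracle predicts the empirical click rate of each mapping.
labels = [1, 0, 0, 1]
oracle = [0.5, 0.5, 0.5, 0.5]
model = [0.6, 0.5, 0.4, 0.4]
lift = log_loss_lift(labels, model, oracle)
```

Under this reading, a lift of zero means the model matches the oracle that sees the test-set click rates.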
7
Results
8
Live Test Results
Use CTR prediction to maximize expected revenue
Re-rank mappings to incorporate revenue
+18% revenue, -2% CTR
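One plausible reading of "re-rank mappings to incorporate revenue" is to sort candidate mappings by predicted click probability times revenue per click. The slide does not give the formula, so this is an assumption; the tuple layout, the function name, and the numbers are made up for illustration.

```python
from typing import List, Tuple

# (broad match kw', predicted p(click | kw -> kw'), revenue per click); all values hypothetical
Candidate = Tuple[str, float, float]


def rank_by_expected_revenue(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by p(click) * revenue per click, highest first."""
    return sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)


candidates = [
    ("tesla", 0.10, 0.50),    # higher CTR, lower revenue per click
    ("hybrids", 0.05, 2.00),  # lower CTR, higher revenue per click
]
ranked = rank_by_expected_revenue(candidates)
```

In this toy case the lower-CTR mapping ranks first because its expected revenue is higher, which is the kind of trade-off that can raise revenue while slightly lowering CTR.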
9
Online Learning with Amnesia
Advertisers, campaigns, bidded keywords, and delivery contexts change very rapidly: high concept drift
  Recent data is more informative
Goal: utilize older data while capturing changes in distributions
Averaged Perceptron doesn't capture drift
Solution: Amnesiac Averaged Perceptron
  Exponential weight decay when averaging hypotheses
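A rough sketch of "exponential weight decay when averaging hypotheses" follows. The exact update rule is not given on the slides, so the form used here (a weighted average in which the hypothesis from t steps ago has weight decay**t) is an assumption, as are the names `train_amnesiac_averaged_perceptron`, `decay`, and `margin`. With `decay=1.0` it reduces to a plain averaged perceptron.

```python
from typing import Iterable, List, Sequence, Tuple


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def train_amnesiac_averaged_perceptron(
    stream: Iterable[Tuple[Sequence[float], int]],
    n_features: int,
    decay: float = 0.99,
    margin: float = 1.0,
) -> List[float]:
    """Single-pass margin perceptron with exponentially decayed hypothesis averaging."""
    w = [0.0] * n_features
    num = [0.0] * n_features  # decayed sum of hypotheses
    den = 0.0                 # decayed count, normalizes the average
    for x, click in stream:
        y = 1.0 if click else -1.0
        if y * dot(w, x) <= margin:
            w = [wi + y * xi for wi, xi in zip(w, x)]
        # Older hypotheses fade by a factor of `decay` at every step.
        num = [decay * a + wi for a, wi in zip(num, w)]
        den = decay * den + 1.0
    return [a / den for a in num] if den else w


# Toy drift: the same feature predicts a click at first, then stops doing so.
drifting = [([1.0], 1)] * 50 + [([1.0], 0)] * 50
plain = train_amnesiac_averaged_perceptron(drifting, 1, decay=1.0)
amnesiac = train_amnesiac_averaged_perceptron(drifting, 1, decay=0.5)
```

On this stream the plain average is still slightly positive after the drift, while the amnesiac average has already followed the recent negative examples. No examples are stored; only the running decayed sum is kept.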
10
Results
  Model                                           -LogLoss   LogL Lift
  Prior                                           0.6572     0.1224
  Feature Selection + Online Learning + Amnesia   0.5709     0.0361
  Online + Feature Selection, No Amnesia          0.6033     0.0685
  Online + Amnesia, No Feature Selection          0.6563     0.1215
  Feature Selection + Amnesia, Weekly Batch       0.5948     0.0600
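As a consistency check on the table, the lift column can be subtracted from the LogLoss column. If the lift is the difference from the oracle's LogLoss, as the Experiments slide defines it, every row should imply the same oracle value. The short script below does that arithmetic on the numbers from the table; the variable names are arbitrary.

```python
# (LogLoss, LogLoss Lift) per model, copied from the results table.
results = {
    "Prior": (0.6572, 0.1224),
    "Feature Selection + Online Learning + Amnesia": (0.5709, 0.0361),
    "Online + Feature Selection, No Amnesia": (0.6033, 0.0685),
    "Online + Amnesia, No Feature Selection": (0.6563, 0.1215),
    "Feature Selection + Amnesia, Weekly Batch": (0.5948, 0.0600),
}

# Oracle LogLoss implied by each row: LogLoss minus lift.
implied_oracle = {model: round(ll - lift, 4) for model, (ll, lift) in results.items()}

# Model with the smallest gap to the oracle.
best_model = min(results, key=lambda m: results[m][1])
```

Every row implies an oracle LogLoss of 0.5348, and the full system (feature selection, online learning, and amnesia) is the closest to it.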
11
Contributions and Conclusions
Learning broad matches from implicit feedback
  Combining arbitrary similarity measures/features
  Using clickthrough logs as implicit feedback
Amnesiac Averaged Perceptron
  Exponentially weighted averaging: distant examples "fade out"
  Online learning adapts to market dynamics
12
Thank You!
13
Features and Feature Selection
Co-occurrence feature examples:
  User search sessions: keywords searched within 10 mins
  Advertiser campaigns: keywords co-bidded by the same advertiser
Past clickthrough rates of original and broad matched keywords
Various syntactic similarities
Various existing broad matching lists
and so on…
Feature Selection:
  A total of 68 features
  Greedy feature selection
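The slide says only "greedy feature selection", so the sketch below assumes the common forward variant: start with no features and repeatedly add the one that lowers a held-out loss the most, stopping when nothing helps. The `evaluate` callback, the function name, and the toy loss are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def greedy_forward_selection(
    n_features: int,
    evaluate: Callable[[List[int]], float],
    max_features: Optional[int] = None,
) -> Tuple[List[int], float]:
    """Add one feature index at a time while the loss returned by `evaluate` improves."""
    selected: List[int] = []
    best_loss = evaluate(selected)
    remaining = set(range(n_features))
    while remaining and (max_features is None or len(selected) < max_features):
        # Try each remaining feature and keep the one with the lowest loss.
        cand_loss, cand = min((evaluate(selected + [f]), f) for f in remaining)
        if cand_loss >= best_loss:
            break  # no remaining feature improves the loss
        selected.append(cand)
        remaining.remove(cand)
        best_loss = cand_loss
    return selected, best_loss


# Toy loss: each feature contributes a fixed gain; feature 1 hurts.
gains = {0: 0.3, 1: -0.1, 2: 0.2}


def toy_loss(subset: List[int]) -> float:
    return 1.0 - sum(gains[f] for f in subset)


chosen, loss = greedy_forward_selection(3, toy_loss)
```

In practice `evaluate` would train the learner on the chosen subset and return a validation LogLoss, which makes each step cost one training run per remaining feature.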
14
Additional Information
Estimation of the expected value of a click over all the ads shown for a broad match mapping: E(p(click(ad(kw))|q))
Query Expansion vs. Broad Matching
  Our broad matching algorithm can be extended for query expansion
  But broad matching is for a fixed set of bidded keywords
Forgetron vs. Amnesiac Averaged Perceptron
  Forgetron maintains a budgeted set of support vectors: stores examples explicitly and does not take into account all the data
  AAP: weighted average over all the examples, no need to store examples explicitly
15
Results
  Model                                           -LogLoss   LogL Lift
  Prior                                           0.6572     0.1224
  Feature Selection + Online Learning + Amnesia   0.5709     0.0361
  Online + Amnesia, No Feature Selection          0.6563     0.1215
  Feature Selection + Amnesia, Weekly Batch       0.5948     0.0600
  Online + Feature Selection, No Amnesia          0.6033     0.0685
16
Amnesiac Averaged Perceptron