
Ramakrishnan Srikant, Sugato Basu, Ni Wang, Daryl Pregibon


Estimating Relevance of Search Engine Results
Use CTR (click-through rate) data.
Pr(click) = Pr(examination) × Pr(click | examination)
The second factor, Pr(click | examination), is the relevance. We need user browsing models to estimate Pr(examination).

Notation
Φ(i): the result shown at position i
E_i: the event that the user examines position i
C_i: the event that the user clicks the result at position i

Examination Hypothesis (Richardson et al., WWW 2007)
Pr(C_i = 1) = Pr(E_i = 1) Pr(C_i = 1 | E_i = 1), with Pr(E_i = 1) = α_i
α_i: position bias. It depends solely on the position, and can be estimated by looking at the CTR of the same result shown in different positions.
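A minimal sketch (not from the paper) of how position bias could be estimated under the examination hypothesis, by comparing the CTR of the same (query, result) pair shown at different positions; the log format and field names are hypothetical.

```python
from collections import defaultdict

def estimate_position_bias(log, reference_pos=1):
    """log: iterable of (query, result, position, clicked) tuples (hypothetical format)."""
    impressions = defaultdict(int)   # (query, result, position) -> impressions
    clicks = defaultdict(int)        # (query, result, position) -> clicks
    for query, result, pos, clicked in log:
        impressions[(query, result, pos)] += 1
        clicks[(query, result, pos)] += int(clicked)

    # For each position, average the ratio CTR(pos) / CTR(reference_pos)
    # over (query, result) pairs that were shown at both positions.
    ratios = defaultdict(list)
    for (query, result, pos), n in impressions.items():
        ref = (query, result, reference_pos)
        if pos == reference_pos or ref not in impressions:
            continue
        ctr_pos = clicks[(query, result, pos)] / n
        ctr_ref = clicks[ref] / impressions[ref]
        if ctr_ref > 0:
            ratios[pos].append(ctr_pos / ctr_ref)

    alpha = {reference_pos: 1.0}
    for pos, rs in ratios.items():
        alpha[pos] = sum(rs) / len(rs)   # relative position bias alpha_pos
    return alpha
```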

Using Prior Clicks
[Figure: two result lists R1–R5 with different click patterns.] With clicks on results 1 and 3, Pr(E_5 | C_1, C_3) = 0.5; with a click only on result 1, Pr(E_5 | C_1) = 0.3. The positions of prior clicks change the estimated probability of examining position 5.

Examination depends on prior clicks
Cascade model
Dependent click model (DCM)
User browsing model (UBM) [Dupret & Piwowarski, SIGIR 2008]: more general and more accurate than Cascade and DCM; conditions Pr(examination) on the closest prior click.
Bayesian browsing model (BBM) [Liu et al., KDD 2009]: same user behavior model as UBM; uses the Bayesian paradigm for relevance.

User browsing model (UBM)
Use the position of the closest prior click to predict Pr(examination):
Pr(E_i = 1 | C_1:i-1) = α_i β_i,p(i)
Pr(C_i = 1 | C_1:i-1) = Pr(E_i = 1 | C_1:i-1) Pr(C_i = 1 | E_i = 1)
where α_i is the position bias and p(i) is the position of the closest prior click. Prior clicks do not affect relevance.
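An illustrative sketch (not the authors' code) of the log-likelihood of an observed click vector under UBM, assuming the parameters α, β, and the relevance table are already given.

```python
import math

# alpha[i]: position bias; beta[(i, p)]: examination correction given the
# closest prior click at position p (p = 0 encodes "no prior click");
# rel[d]: Pr(C_i = 1 | E_i = 1) for document d. All are assumed inputs.
def ubm_log_likelihood(docs, clicks, alpha, beta, rel):
    ll = 0.0
    prior_click_pos = 0
    for i, (doc, clicked) in enumerate(zip(docs, clicks), start=1):
        p_exam = alpha[i] * beta[(i, prior_click_pos)]   # Pr(E_i = 1 | C_1:i-1)
        p_click = p_exam * rel[doc]                      # Pr(C_i = 1 | C_1:i-1)
        ll += math.log(p_click if clicked else 1.0 - p_click)
        if clicked:
            prior_click_pos = i
    return ll
```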

Other Related Work
Examination depends on prior clicks and prior relevance: click chain model (CCM), general click model (GCM).
Post-click models: dynamic Bayesian model, session utility model.


Constant Relevance Assumption
The Cascade model, DCM, UBM, BBM, CCM, and GCM all implicitly assume that relevance is independent of prior clicks and that relevance is constant across query instances.
Query = “Canon S90”. Aggregate relevance: relevance to a query.
Query instance = “Canon S90” issued by a specific user at a specific point in time. Instance relevance: relevance to a query instance.

The query string does not fully capture user intent.
Query “Canon S90”, user ready to buy: sponsored results very relevant, CTR high.
Query “Canon S90”, user wants to learn: sponsored results less relevant, CTR low.

Prior clicks signal relevance... not just Pr(examination).
Query “Canon S90”, click on the first sponsored result: user likely ready to buy, so the second sponsored result is very relevant.
Query “Canon S90”, no click on the first sponsored result: user likely wants to learn, so the second sponsored result is less relevant.

Testing Constant Relevance
If we know that Pr(examination) ≈ 1, then relevance ≈ CTR, so we can test whether relevance is independent of prior clicks.
When is Pr(examination) ≈ 1? Users scan from top to bottom, so if there is a click below position i, then Pr(E_i = 1) ≈ 1.

Statistical Power of the Test
For a specific position i, restrict attention to query instances where Pr(examination) ≈ 1 (there is a click below i), and split them into S_0 (no clicks above i) and S_1 (a click above i).
If relevance is independent of prior clicks, we expect Lift = Pr(C_i = 1 | S_1) / Pr(C_i = 1 | S_0) ≈ 1.
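A minimal sketch (not from the paper) of this test: keep only query instances with a click below position i, split them by whether there was a click above i, and compare the two CTRs. The session format (one 0/1 click vector per instance) is a hypothetical convenience.

```python
def lift_at_position(sessions, i):
    """sessions: list of click vectors; clicks[j] == 1 means position j+1 was clicked."""
    ctr = {"S0": [0, 0], "S1": [0, 0]}        # group -> [clicks at i, instances]
    for clicks in sessions:
        if not any(clicks[i:]):               # require a click below position i
            continue                          # so that Pr(E_i = 1) ~ 1
        group = "S1" if any(clicks[: i - 1]) else "S0"
        ctr[group][0] += clicks[i - 1]
        ctr[group][1] += 1
    # Assumes both groups are non-empty in the data.
    ctr_s1 = ctr["S1"][0] / ctr["S1"][1]
    ctr_s0 = ctr["S0"][0] / ctr["S0"][1]
    return ctr_s1 / ctr_s0                    # ~1 if relevance ignores prior clicks
```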

The data speaks…
Over all configs: Lift = / (99% confidence interval).
[Graph: lift by position for the config with 3 top ads and 8 right-hand-side ads. T2 = 2nd top ad, R3 = 3rd RHS ad, etc.]

Pure Relevance
Max Examination
Joint Relevance Examination (JRE)

Pure Relevance
Any change in Pr(C_i = 1) when conditioned on other clicks is attributed solely to a change in instance relevance.
The number of clicks on other results is used as a signal of instance relevance; the positions of those clicks are not used, only the count.
Yields the same aggregate relevance estimates as the baseline model (which does not use co-click information).
Pr(C_i = 1 | C_≠i, E_i = 1) = r_Φ(i) δ_n(i), where n(i) = number of clicks in other positions.
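An illustrative sketch of the pure-relevance conditional click probability; the relevance table and the δ multipliers are assumed to have been learned separately.

```python
# rel[d]: aggregate relevance r_Phi(i) of doc d; delta[n]: multiplier for
# having n clicks on the other results; p_exam: baseline Pr(E_i = 1), which
# in this model does not depend on the other clicks.
def pure_relevance_click_prob(doc, other_clicks, rel, delta, p_exam=1.0):
    n = sum(other_clicks)                 # n(i): count of clicks at other positions
    return p_exam * rel[doc] * delta[n]   # Pr(C_i = 1 | C_!=i) = Pr(E_i = 1) r_Phi(i) delta_n(i)
```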

Max Examination
Like UBM/BBM, but also uses information about clicks below position i: Pr(examination) ≈ 1 if there is a click below i.
UBM/BBM: Pr(E_i = 1 | C_1:i-1) = α_i β_i,p(i), where p(i) = position of closest prior click.
Max-examination: Pr(E_i = 1 | C_≠i) = α_i β_i,e(i).
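A rough sketch of a max-examination-style examination probability. The slide does not spell out e(i), so this code simply treats examination as (near) certain when there is a click below i and falls back to a UBM-style term otherwise; that reading is an assumption.

```python
def max_examination_prob(i, clicks, alpha, beta):
    if any(clicks[i:]):                        # a click below position i
        return 1.0                             # Pr(E_i = 1) ~ 1 (assumed shortcut)
    p = 0                                      # closest prior click position (0 = none)
    for j in range(i - 1):
        if clicks[j]:
            p = j + 1
    return alpha[i] * beta[(i, p)]             # UBM-style term when no click below i
```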

Joint Relevance Examination (JRE)
Combines the features of the pure-relevance and max-examination models: CTR changes can be caused both by changes in examination and by changes in instance relevance.
Pr(E_i = 1 | C_≠i) = α_i β_i,e(i)
Pr(C_i = 1 | C_≠i, E_i = 1) = r_Φ(i) δ_n(i)
Pr(C_i = 1 | C_≠i) = Pr(E_i = 1 | C_≠i) Pr(C_i = 1 | E_i = 1, C_≠i)
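An illustrative JRE sketch that reuses the hypothetical helpers above: both the examination term and the relevance term are allowed to depend on the clicks at the other positions.

```python
def jre_click_prob(i, doc, clicks, alpha, beta, rel, delta):
    p_exam = max_examination_prob(i, clicks, alpha, beta)   # Pr(E_i = 1 | C_!=i)
    other = clicks[: i - 1] + clicks[i:]                     # clicks at positions != i
    n = sum(other)
    p_rel = rel[doc] * delta[n]                              # Pr(C_i = 1 | E_i = 1, C_!=i)
    return p_exam * p_rel                                    # Pr(C_i = 1 | C_!=i)
```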


Predicting CTR
Models: the baseline is Google's production system for predicting the relevance of sponsored results; it does not use co-click information. It is compared against UBM/BBM, max-examination, pure-relevance, and JRE.
Data: a 10% sample of a week of data, split between training and testing.
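A minimal sketch of the three offline metrics reported on the next slides, computed from predicted click probabilities and observed clicks (the exact aggregation used in the paper may differ).

```python
import math

def evaluate(predicted, observed):
    """predicted: list of Pr(click) in (0, 1); observed: list of 0/1 clicks."""
    n = len(predicted)
    abs_err = sum(abs(p - c) for p, c in zip(predicted, observed)) / n
    sq_err = sum((p - c) ** 2 for p, c in zip(predicted, observed)) / n
    log_lik = sum(math.log(p if c else 1.0 - p) for p, c in zip(predicted, observed)) / n
    return {"absolute_error": abs_err, "squared_error": sq_err, "log_likelihood": log_lik}
```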

Absolute Error
[Chart: absolute error of each model. Baseline: Google's production system.]

Log Likelihood
[Chart: log likelihood of each model.]

Squared Error
[Chart: squared error of each model.]


Predicting Relevance vs. Predicting CTR
If model A is more accurate than model B at predicting CTR, wouldn't A also be better at predicting (aggregate) relevance?

Counter-Example
CTR prediction accuracy: pure-relevance >> max-examination >> baseline.
Relevance prediction accuracy: either pure-relevance == baseline >> max-examination, or max-examination >> pure-relevance == baseline.

Intuition
Predicting CTR: only the product Pr(examination) × relevance needs to be right.
Predicting relevance: credit must be correctly assigned between examination and relevance. Incorrectly assigning credit can improve CTR prediction while making the relevance estimates less accurate.
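A tiny worked example with hypothetical numbers: two models that assign credit differently predict the same CTR, yet one of them is far off on relevance.

```python
true_exam, true_rel = 0.5, 0.40          # ground truth: CTR = 0.20
model_a = {"exam": 0.5, "rel": 0.40}     # correct credit assignment
model_b = {"exam": 0.8, "rel": 0.25}     # wrong credit, same product

for name, m in [("A", model_a), ("B", model_b)]:
    ctr = m["exam"] * m["rel"]
    print(name, "predicted CTR:", ctr, "relevance error:", abs(m["rel"] - true_rel))
# Both predict CTR = 0.2, but model B's relevance estimate is off by 0.15.
```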

Predicting relevance
Run an experiment on live traffic. Sponsored results are ranked by bid × relevance, so more accurate relevance estimates should yield higher CTR and revenue: results with higher relevance are placed in positions with higher Pr(examination).
Baseline/pure-relevance had better revenue and CTR than max-examination; the results were statistically significant.
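An illustrative sketch of the ranking rule the live experiment relies on: ordering sponsored results by bid × estimated relevance. The (ad_id, bid, estimated_relevance) tuples are a hypothetical representation.

```python
def rank_ads(ads):
    # Sort by bid x estimated relevance, highest first.
    return sorted(ads, key=lambda ad: ad[1] * ad[2], reverse=True)

ads = [("ad1", 2.0, 0.10), ("ad2", 1.0, 0.30)]
print([a[0] for a in rank_ads(ads)])   # ['ad2', 'ad1']: 1.0*0.30 > 2.0*0.10
```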

Conclusions
Changes in CTR when conditioned on other clicks are also due to instance relevance, not just examination.
New user browsing models that incorporate this insight are more accurate.
Evaluating user browsing models solely through offline analysis of CTR prediction can be problematic; use human ratings or live experiments.

Future Work
What about organic search results?
Quantitatively assigning credit between instance relevance and examination (the features are correlated).
Generalize pure-relevance and JRE to incorporate information about the relevance of prior results, or the user's satisfaction with the prior clicked results.


Scan order