Download presentation
Presentation is loading. Please wait.
Published byJoanna Barrett Modified over 9 years ago
1
An Experimental Comparison of Click Position-Bias Models Nick Craswell Onno Zoeter Michael Taylor Bill Ramsey Microsoft Research
2
Position Bias Top-ranked search results get more clicks This position bias occurs because: –...users sometimes blindly click on early results? –...users are less likely to view lower ranks? –...users click the first relevant thing they see? A model for position bias allows: – List data Debiased evaluation of a result – Per-result data Evaluate a list
3
Summary A.Four alternate hypotheses for explaining position bias – Including a `cascade’ model B.A large-scale data gathering effort C.Evaluation: Which model best explains data? – Which models fail and how – Cascade model succeeds, at early ranks D.Conclusions
4
A. HYPOTHESES
5
Hypothesis 1: No Bias Our baseline – c di is P( Click=True | Document=d, Position=i ) – r d is P( Click=True | Document=d ) Why this baseline? – We know that r d is part of the explanation – Perhaps, for ranks 9 vs 10, it’s the main explanation – It is a bad explanation at rank 1 e.g. Eye tracking Attractiveness of summary ~= Relevance of result
6
Hypothesis 2: Blind Clicks There are two types of user/interaction 1.Click based on relevance 2.Click based on rank (blindly) A.k.a. the OR model: – Clicks arise from relevance OR position
7
Hypothesis 3: Examination Users are less likely to look at lower ranks, therefore less likely to click This is the AND model – Clicks arise from relevance AND examination – Probability of examination does not depend on what else is in the list
8
Hypothesis 4: Cascade Users examine the results in rank order At each document d – Click with probability r d – Or continue with probability (1-r d )
9
Cascade Model Example 500 users typed a query 0 click on result A in rank 1 100 click on result B in rank 2 100 click on result C in rank 3 Cascade (with no smoothing) says: 0 of 500 clicked A r A = 0 100 of 500 clicked B r B = 0.2 100 of remaining 400 clicked C r C = 0.25 This may seem different from the formulation on the previous slide, but is precisely equivalent
10
B. DATA COLLECTION
11
Flipping Adjacent Results Do adjacent flips in the top 10 – 9 types of flip: 1-2, 2-3,..., 9-10. An “experiment”: query, URL A, URL B, rank m A&B originate from m&m+1, though maybe not that order Equally likely to show AB and BA Controlled experiment: We only vary the position 108 thousand experiments with real users – Because it’s real users, adjacent flips Our experiment requires flips, but our models do not
12
Our Dataset logodds(p)=log(p/(1-p))
13
Blind-Click & Examination Hypotheses Are “Broken” Blind-Click: Rank 1 might have 0 clicks Examination: Rank 2 might have 100% clicks Learn our parameters to stay within bounds: – Blind-Click: makes no adjustment – Examination: 2 1 is 3.5%, while 4 3 is 9.0%. Something in rank 2 had c d2 =0.966 Need some other way to stay within bounds
14
Non-Hypothesis: “Logistic” The shape of the data suggests a Logistic model This is related to logistic regression
15
Measurement Given click information for AB, predict clicks in order BA: – 4 events : Click B, Click A, click both, click neither 10-fold cross validation
16
C. RESULTS
17
Main Results Best possible: Given the true click counts for ordering BA
18
Results by Rank
20
Cascade Errors Predictions are closer to diagonal, with less spread Not perfect
21
D. Conclusions + Future Work Surprisingly, we reject the simple AND/OR – Users do not click randomly on rank 1 – Users do not have a fixed examination curve Cascade model works well – Particularly for 1-2 and 2-3 flips Cascade model is basic. In future could model: – Users who click multiple results – Users who abandon their search – Different types of user or search?
22
THANK YOU
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.