Published by Maurice Hancock. Modified over 9 years ago.
1 Click Chain Model in Web Search
Fan Guo, Carnegie Mellon University
PPT revised and presented by Xin Xin
2 Outline
Background and motivation
Designing a click model
Algorithms
Experiments
3
4 How to utilize users’ feedback to improve search engine results?
5 Diverse User Feedback
Click-through
Browser action
Dwelling time
Explicit judgment
Other page elements
6 Web Search Click Log
Auto-generated data keeping important information about search activity.
Query: cikm
Session ID: f851c5af178384d12f3d
Position  URL                                                        Click
1         cikm2008.org                                               1
2         www.cikm.org                                               0
3         www.cikm.org/2002                                          0
4         www.fc.ul.pt/cikm2007                                      0
5         www.comp.polyu.edu.hk/conference/cikm2009                  1
6         cikmconference.org                                         0
7         Ir.iit.edu/cikm2004                                        0
8         www.informatik.uni-trier.de/~ley/db/conf/cikm/index.html   0
9         www.tzi.de/CIKM2005                                        0
10        www.cikm.com                                               0
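As a minimal sketch, one session of such a log can be held as a query, a session ID, and a list of (position, URL, click) rows. The class and field names below are illustrative assumptions, not the actual log schema; the values are copied from the example session on this slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Impression:
    position: int  # rank of the result on the page, starting at 1
    url: str
    click: int     # 1 if the result was clicked, 0 otherwise


@dataclass
class Session:
    session_id: str
    query: str
    impressions: List[Impression]

    def clicked_positions(self) -> List[int]:
        """Return the positions that received a click, in rank order."""
        return [imp.position for imp in self.impressions if imp.click == 1]


# Hypothetical in-memory form of the example session.
rows = [
    (1, "cikm2008.org", 1),
    (2, "www.cikm.org", 0),
    (3, "www.cikm.org/2002", 0),
    (4, "www.fc.ul.pt/cikm2007", 0),
    (5, "www.comp.polyu.edu.hk/conference/cikm2009", 1),
    (6, "cikmconference.org", 0),
    (7, "Ir.iit.edu/cikm2004", 0),
    (8, "www.informatik.uni-trier.de/~ley/db/conf/cikm/index.html", 0),
    (9, "www.tzi.de/CIKM2005", 0),
    (10, "www.cikm.com", 0),
]
session = Session(
    session_id="f851c5af178384d12f3d",
    query="cikm",
    impressions=[Impression(p, u, c) for p, u, c in rows],
)
print(session.clicked_positions())
```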
7 A real world example
8 How large is the click log?
– Search logs: 10+ TB/day
– In existing publications:
  [Craswell+08]: 108k sessions
  [Dupret+08]: 4.5M sessions (21 subsets * 216k sessions)
  [Guo+09a]: 8.8M sessions from 110k unique queries
  [Guo+09b]: 8.8M sessions from 110k unique queries
  [Chapelle+09]: 58M sessions from 682k unique queries
  [Liu+09a]: 0.26PB data from 103M unique queries
9 Intuition to Utilize Clicks
Adapt ranking to user clicks.
[Figure: # of clicks received]
10 Position Bias Problem
[Figure: # of clicks received]
11 Problem Definition
Given a click log data set, compute the user-perceived relevance of each query-document pair. The solution should be:
  – Aware of position bias and context dependency
  – Scalable to terabyte-scale data
  – Incremental, so that it stays up to date
12 Outline
Background and motivation
Designing a click model
Algorithms
Experiments
13 Examination Hypothesis
A document must be examined before it can be clicked.
The (conditional) probability of a click upon examination depends on document relevance.
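Read as a formula, the hypothesis says P(click) = P(examined) * P(click | examined), with no click possible without examination. The sketch below only illustrates that product; the function name and the numbers are made up. It also shows one way to read the position bias problem: two documents with the same click probability upon examination receive very different click rates when they are examined with different probabilities.

```python
def click_probability(p_examined: float, p_click_given_examined: float) -> float:
    """P(C=1) = P(E=1) * P(C=1 | E=1); an unexamined document is never clicked."""
    return p_examined * p_click_given_examined


# Same click probability upon examination (0.5), different examination probabilities.
# Both examination probabilities are invented for illustration.
top_result = click_probability(0.9, 0.5)
low_result = click_probability(0.2, 0.5)
print(top_result, low_result)
```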
14 Cascade Hypothesis
The first document is always examined.
First-order Markov property:
  – Examination at position (i+1) depends only on examination and click at position i.
Examination follows a strict linear order: position i, then position (i+1).
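Under these assumptions the examination probability can be propagated position by position. The following is a minimal sketch, not the model's actual parameterization: `continue_after_skip` and `continue_after_click` are hypothetical constants for the probability of moving on to the next document after a non-click and after a click, and `relevance` stands for the click probability upon examination.

```python
from typing import List


def examination_probabilities(relevance: List[float],
                              continue_after_skip: float,
                              continue_after_click: float) -> List[float]:
    """Return P(E_i = 1) for each position under a first-order Markov chain."""
    probs = []
    p_exam = 1.0  # the first document is always examined
    for r in relevance:
        probs.append(p_exam)
        # Position i+1 is examined only if position i was examined and the
        # user chose to continue; that choice depends on the click at i only.
        p_continue = (1.0 - r) * continue_after_skip + r * continue_after_click
        p_exam *= p_continue
    return probs


# Special case: always continue after a skip, always stop after a click.
exam = examination_probabilities([0.5, 0.5, 0.5],
                                 continue_after_skip=1.0,
                                 continue_after_click=0.0)
print(exam)
```

With these two extreme settings the user scans down the list until the first click and then stops, so the examination probability halves at each position in this example.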
15 User Behavior Description
[Flowchart: Examine the Document, then Click? (Yes/No). After either answer the user decides See Next Doc?: if Yes, the next document is examined; if No, Done.]
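The loop in this flowchart can be simulated directly. This is a simplified sketch under stated assumptions: a single fixed continuation probability is used after a click and another after a non-click, and both parameter names are hypothetical; the slides do not give the actual values or functional form.

```python
import random
from typing import List


def simulate_session(relevance: List[float],
                     continue_after_skip: float,
                     continue_after_click: float,
                     rng: random.Random) -> List[int]:
    """Simulate one pass down the result list and return a 0/1 click vector."""
    clicks = [0] * len(relevance)
    for i, r in enumerate(relevance):
        # "Examine the Document", then "Click?"
        clicked = rng.random() < r
        clicks[i] = int(clicked)
        # "See Next Doc?" with a probability that depends on the click
        p_next = continue_after_click if clicked else continue_after_skip
        if rng.random() >= p_next:
            break  # "Done": lower positions stay unexamined and unclicked
    return clicks


rng = random.Random(0)  # fixed seed so repeated runs give the same session
print(simulate_session([0.6, 0.3, 0.2, 0.1], 0.9, 0.5, rng))
```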
16 Click Chain Model
[Figure: graphical model with relevance nodes R1 to R5, examination nodes E1 to E5, and click nodes C1 to C5, continuing beyond position 5; annotated with the Examination Hypothesis and the Cascade Hypothesis]
17 Outline
Background and motivation
Designing a click model
Algorithms
Experiments
18 A Coin-Toss Example for Bayesian Framework
Density function (not normalized), from the prior through the successive posteriors:
  $x^{1}(1-x)^{0}$
  $x^{2}(1-x)^{0}$
  $x^{3}(1-x)^{0}$
  $x^{3}(1-x)^{1}$
  $x^{4}(1-x)^{1}$
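The sequence of densities on this slide corresponds to multiplying in one factor per toss: x for a head, (1-x) for a tail. A minimal sketch follows, assuming a uniform prior and the toss sequence H, H, H, T, H, which is inferred from the exponents rather than stated on the slide. With a uniform prior, the normalized posterior is a Beta(heads+1, tails+1) distribution, whose mean is (heads+1)/(heads+tails+2).

```python
from fractions import Fraction
from typing import List, Tuple

Counts = Tuple[int, int]  # (exponent of x, exponent of (1-x))


def update(counts: Counts, outcome: str) -> Counts:
    """Multiply the unnormalized density by x for 'H' or by (1-x) for 'T'."""
    heads, tails = counts
    return (heads + 1, tails) if outcome == "H" else (heads, tails + 1)


def posterior_mean(counts: Counts) -> Fraction:
    """Mean of Beta(heads+1, tails+1), the posterior under a uniform prior."""
    heads, tails = counts
    return Fraction(heads + 1, heads + tails + 2)


counts: Counts = (0, 0)        # uniform prior: x^0 (1-x)^0
history: List[Counts] = []
for outcome in "HHHTH":        # assumed toss sequence
    counts = update(counts, outcome)
    history.append(counts)

print(history)                 # exponents after each toss
print(posterior_mean(counts))  # posterior mean after all five tosses
```

Only the two exponents need to be stored, which is what makes the update incremental: each new observation changes one counter.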
19 Click Data Example
Density function (not normalized), from the prior through the successive posteriors:
  $x^{1}(1-x)^{0}(1-0.6x)^{0}(1+0.3x)^{1}(1-0.5x)^{0}(1-0.2x)^{0}\cdots$
  $x^{1}(1-x)^{1}(1-0.6x)^{0}(1+0.3x)^{1}(1-0.5x)^{0}(1-0.2x)^{0}\cdots$
  $x^{2}(1-x)^{1}(1-0.6x)^{0}(1+0.3x)^{2}(1-0.5x)^{0}(1-0.2x)^{0}\cdots$
  $x^{3}(1-x)^{1}(1-0.6x)^{1}(1+0.3x)^{2}(1-0.5x)^{0}(1-0.2x)^{0}\cdots$
  $x^{3}(1-x)^{1}(1-0.6x)^{1}(1+0.3x)^{2}(1-0.5x)^{1}(1-0.2x)^{0}\cdots$
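As in the coin-toss case, the posterior here is fully described by a vector of integer exponents, one per factor. The sketch below evaluates such a density and normalizes it numerically on a grid. It covers only the six factors visible on the slide (the trailing "…" is not modeled), and the function names and the midpoint-rule integration are illustrative choices, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Each factor is written as (a + b*x): x, (1-x), (1-0.6x), (1+0.3x), (1-0.5x), (1-0.2x).
FACTORS: List[Tuple[float, float]] = [
    (0.0, 1.0), (1.0, -1.0), (1.0, -0.6), (1.0, 0.3), (1.0, -0.5), (1.0, -0.2),
]


def unnormalized_density(x: float, exponents: Sequence[int]) -> float:
    """Product of the factors raised to their integer exponents."""
    value = 1.0
    for (a, b), n in zip(FACTORS, exponents):
        value *= (a + b * x) ** n
    return value


def posterior_mean(exponents: Sequence[int], grid_size: int = 10000) -> float:
    """Mean of the normalized density on [0, 1], using a midpoint-rule grid."""
    step = 1.0 / grid_size
    xs = [(k + 0.5) * step for k in range(grid_size)]
    weights = [unnormalized_density(x, exponents) for x in xs]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


# Exponents of the last density on the slide.
final_exponents = (3, 1, 1, 2, 1, 0)
print(posterior_mean(final_exponents))
```

Updating the posterior for a new session then amounts to incrementing a few counters, which fits the scalability and incremental-update goals stated in the problem definition.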
20 Estimating P(C|Ri)
21-25 [Figure, shown on each of slides 21 to 25: the Click Chain Model graph with nodes R1 to R5, E1 to E5, and C1 to C5, marked with the values 0, 1, 0, 1]
26 Putting them together
27 Alpha Estimation
28 Outline
Background and motivation
Designing a click model
Algorithms
Experiments
29 Data Set
Collected over 2 weeks in July 2008.
Preprocessing:
  – Discard no-click sessions for fair comparison.
  – Remove the 178 most frequent queries.
Split into training/test sets according to time stamps.
30 Data Set
After preprocessing:
  – 110,630 distinct queries
  – 4.8M/4.0M query sessions in the training/test set
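The preprocessing steps listed on the two Data Set slides can be sketched as a small filter-and-split function. The session layout (dicts with "query", "timestamp", and "clicks" keys), the function name, and the order of the two filtering steps are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter
from typing import Dict, List, Tuple

SessionRecord = Dict[str, object]  # hypothetical layout: query, timestamp, clicks


def preprocess(sessions: List[SessionRecord],
               n_frequent_to_drop: int,
               split_time: float) -> Tuple[List[SessionRecord], List[SessionRecord]]:
    """Drop no-click sessions and the most frequent queries, then split by time."""
    # 1. Discard sessions without any click.
    with_clicks = [s for s in sessions if any(s["clicks"])]
    # 2. Remove the n most frequent queries (the slides use n = 178).
    query_counts = Counter(s["query"] for s in with_clicks)
    frequent = {q for q, _ in query_counts.most_common(n_frequent_to_drop)}
    kept = [s for s in with_clicks if s["query"] not in frequent]
    # 3. Earlier sessions go to training, later ones to test.
    train = [s for s in kept if s["timestamp"] < split_time]
    test = [s for s in kept if s["timestamp"] >= split_time]
    return train, test
```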
31 Metric
Efficiency:
  – Computational time
Effectiveness:
  – Perplexity
  – Log-likelihood
  – Click prediction
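The slides name the effectiveness metrics without giving formulas. The sketch below uses definitions that are common in the click-model literature and should be read as assumptions: perplexity at one position is 2 raised to the negative mean log2-likelihood of the observed clicks (lower is better, 1 is perfect), and log-likelihood is averaged over sessions (higher is better, 0 is perfect).

```python
import math
from typing import Sequence


def perplexity(clicks: Sequence[int], predicted: Sequence[float]) -> float:
    """Click perplexity at one position.

    clicks: observed 0/1 clicks, one per session.
    predicted: predicted click probabilities, strictly between 0 and 1.
    """
    total = 0.0
    for c, q in zip(clicks, predicted):
        total += c * math.log2(q) + (1 - c) * math.log2(1.0 - q)
    return 2.0 ** (-total / len(clicks))


def average_log_likelihood(session_probabilities: Sequence[float]) -> float:
    """Mean natural-log probability that the model assigns to each observed session."""
    return sum(math.log(p) for p in session_probabilities) / len(session_probabilities)


# A model that always predicts 0.5 is no better than a coin flip.
print(perplexity([1, 0, 1], [0.5, 0.5, 0.5]))
```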
32 Competitors
UBM: User Browsing Model (Dupret et al., SIGIR’08)
DCM: Dependent Click Model (WSDM’09)
33 Results - Time
Environment: Unix server, 2.8GHz cores, MATLAB R2008b.
                CCM       UBM       DCM
Time            9.8 min   333 min   5.4 min
Ratio to CCM    1.0       34        0.55
34 Results – Perplexity
[Figure: perplexity comparison, with "Worse" and "Better" direction labels]
35 Results – Log Likelihood
[Figure: log-likelihood comparison, with "Better" and "Worse" direction labels]
36 First Clicked Position
37 Last Clicked Position
38 The End