Click Chain Model in Web Search

Fan Guo 5/1/2019 Speaking Skills Talk


Click Logs
Auto-generated data keeping important information about search activity.
Query: csd speaking club
Time: 04/09/2009, 19:44:30
[Table: Rank/Position (1 to 10), URL of Document, Click. The only URL recovered from the slide is webapps.cs.cmu.edu/speakerclub/]

Problem Definition
Given a click log data set, compute user-perceived relevance for each query-document pair.
Query: csd speaking club
Session Index: 103
[Table, input: Rank/Position, Document Idx, Click, grouped into Impression Data and Click Data for each session]
[Table, output: Document Idx (1 to 8), Relevance (unknown, shown as "?")]

Problem Definition
Given a click log data set, compute user-perceived relevance for each query-document pair.
Different types of relevance:
- 1
- Excellent / Good / Fair / Bad
- 0.75 in competitors
- [Figure] in Click Chain Model

How to interpret clicks effectively and efficiently?

Eye-Tracking User Study
[Figure: fixation heat map]

Overall: fixation is biased towards higher ranks, and so are the clicks.
For each position: fixation and clicks depend on the context.
[Figure: Normal Impression vs. Reversed Impression, from (Joachims et al., 2007) in ACM TOIS]

Problem Definition (Recap)
Given a click log data set, compute user-perceived relevance for each query-document pair, and the solution should be
- Aware of the position bias and context dependency
- Scalable to Terabyte data
- Incremental to stay updated

Applications (1)
Automated Ranking Alterations
[Figure: result list annotated with the values 0.72, 0.20, 0.05, 0.08, 0.90, 0.10]

Applications (2)
Measure and Monitor Search Engine Performance
Session Performance Score = 0.62
We know that user behavior is context dependent.
[Figure: result list annotated with the values 0.72, 0.20, 0.05, 0.08, 0.90, 0.10]

Applications (3)
Inspiring related topics in sponsored search

Roadmap
- Background and Motivation
- Model and Algorithms
- Experimental Evaluation
- Related Work and Conclusion

Our Approach
- User behavior assumptions
- Graphical modeling techniques

Examination Hypothesis
- A document must be examined before being clicked.
- The click decision comes after the examination decision, as formulated in (Richardson et al., WWW’07).
- Consider the top 10 results only; ignore other page elements.

Examination Hypothesis
For each position,
  P(Click=1) = P(Examination=1) * Relevance
  Relevance = P(Click=1|Examination=1)
Hint: correct the position bias by deriving P(Examination).
Cover exceptions.
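The hint can be illustrated with a minimal sketch, assuming the examination probability for each position is already known. The function name `debias_ctr` and all numbers below are hypothetical and are not taken from the talk.

```python
def debias_ctr(clicks, impressions, p_exam):
    """Estimate Relevance = P(Click=1 | Examination=1) for one position.

    Under the examination hypothesis,
        P(Click=1) = P(Examination=1) * Relevance,
    so Relevance = CTR / P(Examination=1).
    """
    if impressions <= 0 or not 0.0 < p_exam <= 1.0:
        raise ValueError("need impressions > 0 and 0 < p_exam <= 1")
    ctr = clicks / impressions
    # Clamp to 1: sampling noise can push the ratio above 1.
    return min(ctr / p_exam, 1.0)


# Hypothetical numbers: the same document shown at two positions.
top = debias_ctr(clicks=300, impressions=1000, p_exam=1.0)    # raw CTR 0.30
lower = debias_ctr(clicks=90, impressions=1000, p_exam=0.3)   # raw CTR 0.09
print(top, lower)
```

Although the raw click-through rates differ by more than a factor of three, both positions yield roughly the same relevance estimate once the assumed examination probability is divided out.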

Cascade Hypothesis
The user scans through documents and makes decisions in strict linear order.

User Behavior Description
[Flowchart]
- Examine the Document, then decide: Click?
- Click = No: See Next Doc? Yes: examine the next document. No: Done.
- Click = Yes: See Next Doc? Yes: examine the next document. No: Done.
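The flowchart can be read as a small generative process. The sketch below simulates it under stated assumptions: the first document is always examined, an examined document is clicked with probability equal to its relevance, and the chance of seeing the next document depends only on whether the current one was clicked. The function name and both continuation probabilities are hypothetical placeholders; the talk does not give their values.

```python
import random


def simulate_session(relevances, p_next_after_skip, p_next_after_click, rng):
    """Simulate one top-down session; return a 0/1 click list per position."""
    clicks = [0] * len(relevances)
    for i, rel in enumerate(relevances):
        # The document at position i is examined; click with probability rel.
        clicked = rng.random() < rel
        clicks[i] = int(clicked)
        # "See Next Doc?" depends on whether this document was clicked.
        p_next = p_next_after_click if clicked else p_next_after_skip
        if rng.random() >= p_next:
            break  # Done: lower positions stay unexamined and unclicked.
    return clicks


rng = random.Random(0)  # fixed seed so the run is repeatable
session = simulate_session([0.7, 0.2, 0.5, 0.1], 0.9, 0.6, rng)
print(session)
```

Because examination stops at the first "No" on "See Next Doc?", every position below that point is recorded as a non-click regardless of its relevance, which is the position bias the model has to correct for.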

Our Approach (Recap)
User behavior assumptions
- capture the context dependency
- construct descriptive user models
Graphical modeling techniques
- visualize and decode the (in)dependence relationship
- derive efficient algorithms

Click Chain Model
[Figure: graphical model with relevance nodes R1 to R5, examination nodes E1 to E5 and click nodes C1 to C5, continuing to lower positions]

Relevance Inference
Given a query and all its click data, compute the posterior for each possible j [formulas not recovered from the slide].
Then focus on the click probability for a particular session, and look at the different cases.

Click Chain Model
[Figure: the graphical model (R1 to R5, E1 to E5, C1 to C5, ...) annotated with the Cascade Hypothesis and the Examination Hypothesis]

[Figure: the graphical model with two nodes marked 1]
Multiplying by this term decreases the relevance.

[Figure: the graphical model with two nodes marked 1]
We have mixed feelings.

[Figure: two further cases of the graphical model, each with two nodes marked 1]

Putting them together

A Quick Example
Here we are interested in R3.
[Figure: click sequences over C1, C2, C3, C4]
Mean(R3) = 0.52
Std(R3) = 0.22

Summary of Algorithms (1)
Procedures:
1. Initialize (2*10+2) counts for each pair;
2. Go through the click log once and update the counts;
3. Compute parameter values and get β values;
4. Ready to output results (using numerical integration if necessary).
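The last step, numerical integration, can be sketched as follows. This assumes the unnormalized posterior over a relevance value R in [0, 1] is a product of simple factors, R^a (1-R)^b (1+β_k R)^(n_k), with the exponents coming from the counts; that functional form, the function name and the sample numbers are assumptions made for illustration, not formulas taken from the slides.

```python
import math


def posterior_moments(n_click, n_skip, beta_terms, steps=10_000):
    """Mean and std of R under an unnormalized density on [0, 1].

    Assumed density: R**n_click * (1-R)**n_skip * prod((1 + beta*R)**count),
    where beta_terms is a list of (beta, count) pairs with beta > -1.
    Integration uses the midpoint rule with `steps` cells.
    """
    h = 1.0 / steps
    z = m1 = m2 = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        density = r ** n_click * (1.0 - r) ** n_skip
        for beta, count in beta_terms:
            density *= (1.0 + beta * r) ** count
        z += density            # normalizing constant (up to the factor h)
        m1 += r * density       # first moment
        m2 += r * r * density   # second moment
    mean = m1 / z
    variance = max(m2 / z - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Hypothetical counts and beta values for one query-document pair.
mean, std = posterior_moments(3, 4, [(-0.4, 6), (0.2, 2)])
print(f"mean={mean:.3f} std={std:.3f}")
```

Since the density depends on the click log only through the exponents, new sessions can be absorbed by incrementing counts and re-running the integration. For very large counts the product would underflow, so a practical version would likely work with logarithms.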

Summary of Algorithms (2)
Sanity check: the solution should be
- Aware of the position bias and context dependency
- Scalable to Terabyte data: single pass, linear
- Incremental to stay updated: update counts

Data Set
Collected in 2 weeks in July 2008.
Preprocessing:
- Discard no-click sessions for fair comparison.
- 178 most frequent queries removed.
- Split into training/test sets according to time stamps (split done for each query).

Data Set
After preprocessing:
- 110,630 distinct queries;
- 4.8M/4.0M query sessions in the training/test set.

Metric
Efficiency: computational time.
Effectiveness (resort to an indirect measure): click prediction. With known document identities in the test set, use the relevance and parameters learned on the training set to predict clicks.

Competitors
UBM: User Browsing Model (Dupret et al., SIGIR’08)
- More parameters
- Iterative, more expensive algorithm
DCM: Dependent Click Model (WSDM’09)
- Modeling 1+ clicks per session
Both of them give point estimates.

Results - Time
Environment: Unix Server, 2.8GHz cores, MATLAB R2008b.

Model   Time      Ratio to CCM
CCM     9.8 min   1
UBM     333 min   34
DCM     5.4 min   0.55

Results – Perplexity
Perplexity: quality of click prediction for each position individually.
- Random Guess (pH=0.5): 2.00
- Best Guess (pH=0.8): 1.65
- Ground Truth (Cheating): 1.00
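The three reference values can be reproduced with a short sketch, assuming the usual definition of click perplexity at one position: 2 raised to the negative average log2-likelihood of the observed clicks. The function name is hypothetical, and the coin with P(H) = 0.8 is represented by 8 heads in 10 tosses.

```python
import math


def click_perplexity(clicks, probs):
    """Perplexity at one position: 2 ** (-(1/N) * sum of log2-likelihoods)."""
    total = 0.0
    for c, q in zip(clicks, probs):
        # q is the predicted click probability for this observation.
        total += math.log2(q if c else 1.0 - q)
    return 2.0 ** (-total / len(clicks))


# A coin with P(H) = 0.8, observed as 8 heads and 2 tails.
tosses = [1] * 8 + [0] * 2
random_guess = click_perplexity(tosses, [0.5] * 10)
best_guess = click_perplexity(tosses, [0.8] * 10)
ground_truth = click_perplexity(tosses, [float(t) for t in tosses])
print(round(random_guess, 2), round(best_guess, 2), round(ground_truth, 2))
# prints: 2.0 1.65 1.0
```

Lower is better, and 1.0 is reachable only by a predictor that already knows every outcome, which is why that row is labeled as cheating.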

Results – Perplexity
[Figure: perplexity plot, with the axis direction marked from Better to Worse]

Results – Perplexity
Average Perplexity over top 10 positions.

Model   Perplexity   Equiv. PH   Improv.
CCM     1.1479       0.0309      -
UBM     1.1577       0.0334      7.5%
DCM     1.1590       0.0337      8.3%

Results – Log Likelihood
Log-likelihood: log of the chance to recover the entire click vector out of 2^10 possibilities.

Model   LL       Likelihood   Improv.
CCM     -1.171   0.3100       -
UBM     -1.264   0.2826       9.7%
DCM     -1.302   0.2719       14%
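The likelihood and improvement figures follow from the log-likelihood values by simple arithmetic. The sketch below assumes natural logarithms and takes the improvement to be the ratio of CCM's likelihood to the competitor's, minus one; both are assumptions that happen to reproduce the reported percentages.

```python
import math

# Log-likelihood values as reported on the slide.
log_likelihood = {"CCM": -1.171, "UBM": -1.264, "DCM": -1.302}

# Likelihood = exp(LL), assuming natural logarithms.
likelihood = {model: math.exp(ll) for model, ll in log_likelihood.items()}

# Relative improvement of CCM over each competitor.
improvement = {
    model: likelihood["CCM"] / likelihood[model] - 1.0
    for model in ("UBM", "DCM")
}

for model, value in likelihood.items():
    print(f"{model}: likelihood = {value:.4f}")
for model, gain in improvement.items():
    print(f"CCM over {model}: {gain:.1%}")
```

Up to rounding of the reported log-likelihoods, this yields about 0.310 for CCM, 0.283 for UBM and 0.272 for DCM, with gains of roughly 9.7% and 14%.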

Results – Log Likelihood
[Figure: log-likelihood plot, with the axis direction marked from Worse to Better]
Smoothing helps.

Related Work
User behavior study and hypothesis
- Eye-tracking Study (Joachims et al., KDD’05, ACM TOIS)
- Examination Hypothesis (Richardson et al., WWW’07)
- Cascade Hypothesis (Craswell et al., WSDM’08)
Other click models
- Logistic Regression (Dupret et al., SIGIR’08)
- Dynamic Bayesian Network (Chapelle et al., WWW’09)
- Bayesian Browsing Model (KDD’09, to appear)
Chronological order; details in the paper.
Affiliations as listed on the slide: Cornell, MSR, MSR; Yahoo!, MSR, Yahoo!, MSR

Conclusion
Click Chain Model
- A probabilistic approach to interpret clicks.
- Both scalable and incremental.
- Bayesian approach to model relevance.
Future Directions
- Validation/Bucket Test.
- More on context dependency.
- Other page elements?

Chao Liu, Anitha Kannan, Tom Minka, Mike Taylor, Yi-Min Wang, Christos Faloutsos

Thank you :-)

Results – Perplexity (by Freq)
[Figure: perplexity by query frequency, with the axis direction marked from Better to Worse]

Examination/Click Distribution

Predicting First/Last Clicks
Root-mean-square error in predicting the first/last clicked position for the test data.
Two approaches (bias/variance tradeoff):
- EXPectation: using the expected value (bias)
- SIMulation: drawing a sample from the model (variance)

First Clicked Position

Last Clicked Position
