Context-Sensitive IR Using Implicit Feedback
Xuehua Shen, Bin Tan, ChengXiang Zhai
Department of Computer Science, University of Illinois at Urbana-Champaign
2 Problem of Context-Independent Search
Example ambiguous queries: "Jaguar" could mean a car, an animal, or software; "Apple" could mean software or chemistry.
3 Put Search in Context
Other context info: dwelling time, mouse movement, clickthrough, query history.
e.g., "Apple software", hobby, …
4 Problem Definition
Observed: the user's query history Q_1, Q_2, ..., Q_{k-1} and clickthrough history C_1 = {C_{1,1}, C_{1,2}, C_{1,3}, ...}, C_2 = {C_{2,1}, C_{2,2}, C_{2,3}, ...}, ...
Hidden: the user information need behind the current query Q_k.
How to model and use all the information?
e.g., query: "Apple software"; clicked result: "Apple - Mac OS X: The Apple Mac OS X product page. Describes features in the current version of Mac OS X, a screenshot gallery, latest software downloads, and a directory of..."
5 Outline
Four contextual statistical language models
Experiment design and results
Summary and future work
6 Retrieval Model
Basis: unigram language model + KL divergence.
The current query Q_k is estimated as a query model θ_{Q_k}; each document D as a document model θ_D; a similarity measure between the two produces the ranked results.
Contextual search: update the query model using the user's query history and clickthrough history.
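A rough sketch of this basis: rank documents by the cross entropy between the query model and a smoothed document model, which is rank-equivalent to negative KL divergence. The Dirichlet smoothing choice, the function names, and the small floor constant are illustrative assumptions, not part of the slide.

```python
import math

def lm(text_counts, collection_probs, mu=2000.0):
    """Dirichlet-smoothed unigram language model from raw term counts.
    (Smoothing method and mu value are assumptions for this sketch.)"""
    total = sum(text_counts.values())
    vocab = set(text_counts) | set(collection_probs)
    return {w: (text_counts.get(w, 0) + mu * collection_probs.get(w, 1e-9))
               / (total + mu)
            for w in vocab}

def kl_score(query_model, doc_model):
    """Score a document by sum_w p(w|theta_Q) * log p(w|theta_D).
    Dropping the query-model entropy term leaves this cross entropy,
    which ranks documents identically to negative KL divergence."""
    return sum(p * math.log(doc_model.get(w, 1e-9))
               for w, p in query_model.items() if p > 0)
```

A document whose model assigns high probability to the query-model terms scores higher, which is the behavior the contextual models below exploit by reshaping the query model.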
7 Fixed Coefficient Interpolation (FixInt)
Average the query history models (Q_1, ..., Q_{k-1}) and the clickthrough history models (C_1, ..., C_{k-1}).
Linearly interpolate the two history models.
Linearly interpolate the current query Q_k with the combined history model.
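A minimal sketch of the three steps above, assuming the two coefficients are named alpha (current query vs. history) and beta (clickthrough history vs. query history), with the settings reported on the results slide as defaults; all function names are hypothetical.

```python
def average_models(models):
    """Uniformly average a list of unigram models (step 1: history averaging)."""
    if not models:
        return {}
    vocab = set().union(*models)
    return {w: sum(m.get(w, 0.0) for m in models) / len(models) for w in vocab}

def fixint(query_model, query_history, click_history, alpha=0.1, beta=1.0):
    """FixInt sketch: mix the averaged clickthrough and query-history models
    with fixed weight beta, then interpolate the current query model with
    fixed weight alpha. Both weights stay constant regardless of the query."""
    h_q = average_models(query_history)
    h_c = average_models(click_history)
    vocab = set(query_model) | set(h_q) | set(h_c)
    return {w: alpha * query_model.get(w, 0.0)
               + (1 - alpha) * (beta * h_c.get(w, 0.0)
                                + (1 - beta) * h_q.get(w, 0.0))
            for w in vocab}
```

Because alpha and beta are fixed in advance, every query gets the same history weight, which is the limitation the Bayesian variants on the next slides address.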
8 Bayesian Interpolation (BayesInt)
Average the query history (Q_1, ..., Q_{k-1}) and clickthrough history (C_1, ..., C_{k-1}) and use them as a Dirichlet prior on the current query Q_k.
Intuition: if the current query Q_k is longer, we should trust Q_k more.
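A sketch of that intuition, assuming the prior weights are named mu (query history) and nu (clickthrough history): the history models act as Dirichlet pseudo-counts added to the current query's raw term counts, so a longer query contributes more real counts and automatically dominates the prior. Names and defaults are assumptions.

```python
def bayesint(query_counts, h_q, h_c, mu=0.2, nu=5.0):
    """BayesInt sketch: the averaged query-history model h_q and clickthrough
    model h_c form a Dirichlet prior with concentrations mu and nu.
    The longer the current query (larger q_len), the less the prior matters."""
    q_len = sum(query_counts.values())
    vocab = set(query_counts) | set(h_q) | set(h_c)
    return {w: (query_counts.get(w, 0)
                + mu * h_q.get(w, 0.0)
                + nu * h_c.get(w, 0.0)) / (q_len + mu + nu)
            for w in vocab}
```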
9 Online Bayesian Update (OnlineUp)
Update the model sequentially as each query (Q_1, Q_2, ...) and clickthrough (C_1, C_2, ...) arrives.
Intuition: continuous belief update about the user information need.
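A sketch of the sequential update, assuming the query and clickthrough concentration parameters are named mu and nu (the values on the results slide): the current belief serves as the Dirichlet prior for each new piece of evidence, so older evidence decays a little with every step. Function names and the event encoding are hypothetical.

```python
def dirichlet_update(counts, prior_model, mu):
    """One Bayesian step: new evidence counts smoothed by the current belief
    acting as a Dirichlet prior with concentration mu."""
    total = sum(counts.values())
    vocab = set(counts) | set(prior_model)
    return {w: (counts.get(w, 0) + mu * prior_model.get(w, 0.0)) / (total + mu)
            for w in vocab}

def online_update(events, mu=5.0, nu=15.0):
    """OnlineUp sketch: fold each ('query', counts) or ('click', counts)
    event into the belief in arrival order; evidence folded in earlier is
    repeatedly discounted, i.e., it decays."""
    belief = None
    for kind, counts in events:
        if belief is None:
            total = sum(counts.values())
            belief = {w: c / total for w, c in counts.items()}  # MLE start
        else:
            belief = dirichlet_update(counts, belief,
                                      mu if kind == "query" else nu)
    return belief or {}
```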
10 Batch Bayesian Update (BatchUp)
Queries Q_1, Q_2, ..., Q_k are folded in online, but the clickthrough data C_1, C_2, ..., C_{k-1} are applied together in one batch.
Intuition: the evidence from clickthrough data may not decay.
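A self-contained sketch of this variant, with parameter names mu (query-history weight) and nu (clickthrough weight) assumed from the results slide: past queries are folded in sequentially, but all clickthrough counts are pooled and applied once, so click evidence never decays.

```python
def dirichlet_step(counts, prior, weight):
    """Smooth new evidence counts with the current belief as Dirichlet prior."""
    total = sum(counts.values())
    vocab = set(counts) | set(prior)
    return {w: (counts.get(w, 0) + weight * prior.get(w, 0.0)) / (total + weight)
            for w in vocab}

def batch_update(queries, clicks, mu=2.0, nu=15.0):
    """BatchUp sketch: sequential pass over the query history (so older
    queries decay), then a single batch step over the pooled clickthrough
    counts (so clicks do not decay relative to one another)."""
    total = sum(queries[0].values())
    belief = {w: c / total for w, c in queries[0].items()}  # MLE of first query
    for q in queries[1:]:
        belief = dirichlet_step(q, belief, mu)
    pooled = {}
    for c in clicks:
        for w, n in c.items():
            pooled[w] = pooled.get(w, 0) + n
    return dirichlet_step(pooled, belief, nu) if pooled else belief
```

Keeping the clicks in one batch is what distinguishes BatchUp from OnlineUp, and it matches the result on the next slides that BatchUp outperforms OnlineUp.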
11 Data Set of Evaluation
Data collection: TREC AP88-90
Topics: 30 hard TREC topics
System: search engine + RDBMS
Context: query and clickthrough history of 3 participants
12 Experiment Design
Models: FixInt, BayesInt, OnlineUp, and BatchUp
Performance comparison: Q_k alone vs. Q_k + H_Q + H_C (with query and clickthrough history)
Evaluation metrics: MAP and precision at 20 documents (pr@20)
13 Overall Effect of Search Context
Improvement of Q_k + H_Q + H_C over Q_k alone (MAP / pr@20):
  FixInt (α=0.1, β=1.0):    Q_3 +72.4% / +32.6%   Q_4 +66.2% / +15.5%
  BayesInt (μ=0.2, ν=5.0):  Q_3 +93.8% / +39.4%   Q_4 +78.2% / +19.9%
  OnlineUp (μ=5.0, ν=15.0): Q_3 +67.7% / +20.2%   Q_4 +47.8% / +6.9%
  BatchUp (μ=2.0, ν=15.0):  Q_3 +92.4% / +39.4%   Q_4 +77.2% / +16.4%
Interaction history helps the system improve retrieval accuracy.
BayesInt is better than FixInt; BatchUp is better than OnlineUp.
14 Using Clickthrough Data Only
BayesInt (μ=0.0, ν=5.0); improvement of Q_k + H_C over Q_k alone (MAP / pr@20), three experimental settings:
  Q_3 +81.9% / +37.1%   Q_4 +72.6% / +18.1%
  Q_3 +23.8% / +23.0%   Q_4 +15.7% / -4.1%
  Q_3 +99.7% / +42.4%   Q_4 +67.2% / +13.9%
Clickthrough data can improve retrieval accuracy even for unseen relevant docs.
Clickthrough data corresponding to non-relevant docs are also useful for feedback.
15 Sensitivity of BatchUp Parameters
BatchUp is stable across different parameter settings.
Best performance is achieved at μ=2.0, ν=15.0.
16 Summary
Proposed four contextual language models that exploit user interaction history for contextual search.
Constructed an evaluation dataset based on TREC data (AP88-90).
Experiment results show that user interaction history, especially clickthrough data, can improve retrieval accuracy.
17 Future Work
Study a general framework for interactive information retrieval.
Study more sophisticated models to incorporate context information.
Build a client-side system to capture and exploit user context information.
18 Thank You!
The End