Context-Sensitive Information Retrieval Using Implicit Feedback. Xuehua Shen, Bin Tan, ChengXiang Zhai (Department of Computer Science, University of Illinois at Urbana-Champaign). Presented by Chia-Hao Lee.

2 Outline
Introduction
Problem Definition
Language Models for Context-Sensitive Information Retrieval
–Basic retrieval model
–Fixed Coefficient Interpolation (FixInt)
–Bayesian Interpolation (BayesInt)
–Online Bayesian Updating (OnlineUp)
–Batch Bayesian Updating (BatchUp)
Experiments
Conclusions and Future Work

3 Introduction In most existing information retrieval models, the retrieval problem is treated as involving a single query and a set of documents. From a single query, however, the retrieval system can obtain only very limited clues about the user’s information need. An optimal retrieval system should therefore try to exploit as much additional context information as possible to improve retrieval accuracy, whenever such information is available.

4 Introduction There are many kinds of context that we can exploit. Relevance feedback is known to be effective for improving retrieval accuracy. However, relevance feedback requires that a user explicitly provide feedback information, such as specifying the category of the information need or marking a subset of retrieved documents as relevant.

5 Introduction A major advantage of implicit feedback is that we can improve retrieval accuracy without requiring any user effort. For example, if the current query is “java”, it would be impossible, without any extra information, to know whether it refers to the Java programming language or the island of Java in Indonesia; implicit feedback such as the user’s previous queries in the same session can help resolve such ambiguity.

6 Problem Definition There are two kinds of context information we can use for implicit feedback: short-term context and long-term context. Short-term context is the immediate surrounding information which sheds light on a user’s current information need in a single session. A session can be considered as a period consisting of all interactions for the same information need.

7 Problem Definition In a single search session, a user may interact with the search system several times. During these interactions, the user would continuously modify the query. Therefore, for the current query, there is a query history associated with it, which consists of the preceding queries issued by the same user in the current session. Indeed, our work has shown that the short-term query history is useful for improving retrieval accuracy.

8 Problem Definition A user would presumably frequently click on some documents to view them. We refer to the data associated with these actions as the clickthrough history. The clickthrough data may include the title, summary, and perhaps also the content and location of each clicked document. Our work has shown positive results using similar clickthrough information.

9 Language models for context-sensitive information retrieval We propose to use statistical language models to model a user’s information need, and develop four specific context-sensitive language models to incorporate context information into a basic retrieval model. 1. Basic retrieval model: We compute the negative KL divergence between the query language model and the document language model, -D(θ_q || θ_d), which serves as the score of the document (rank-equivalent to Σ_w p(w|θ_q) log p(w|θ_d), where θ_d is a smoothed document language model). One advantage of this approach is that we can naturally incorporate the search context as additional evidence to improve our estimate of the query language model.
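As a concrete illustration (not part of the original slides), here is a minimal Python sketch of this scoring rule, assuming a Dirichlet-smoothed document language model; score_document, mu, and the toy models below are illustrative, not values from the paper.

```python
import math
from collections import Counter

def score_document(query_model, doc_tokens, collection_model, mu=2000.0):
    """Rank-equivalent KL-divergence score: sum_w p(w|theta_q) * log p(w|theta_d),
    where theta_d is the document model smoothed with a Dirichlet prior of weight mu."""
    counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for w, p_w_q in query_model.items():            # query model: word -> probability
        p_w_c = collection_model.get(w, 1e-9)       # background (collection) probability
        p_w_d = (counts[w] + mu * p_w_c) / (doc_len + mu)
        score += p_w_q * math.log(p_w_d)
    return score

# Toy example: the query model already favors the "programming" sense of "java".
query_model = {"java": 0.6, "programming": 0.4}
collection_model = {"java": 0.002, "programming": 0.003, "island": 0.001}
print(score_document(query_model, ["java", "programming", "language"], collection_model))
```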

10 Language models for context-sensitive information retrieval Our task is to estimate a context query model, which we denote by θ_k, based on the current query Q_k as well as the query history H_Q and the clickthrough history H_C. We will use c(w, X) to denote the count of word w in text X, which could be a query, a clicked document’s summary, or any other text. We will use |X| to denote the length of text X, i.e., the total number of words in X.

11 Language models for context-sensitive information retrieval 2. Fixed Coefficient Interpolation (FixInt) Our first idea is to summarize the query history with a unigram language model p(w|H_Q) and the clickthrough history with another unigram language model p(w|H_C), and to interpolate them with the current query model using fixed coefficients, as sketched below.
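The interpolation formula itself did not survive in the transcript; the following is a minimal Python sketch, assuming the usual FixInt(α, β) form in which α weights the current query against the history and β weights the clickthrough-history model p(w|H_C) against the query-history model p(w|H_Q). Function names and default values are illustrative.

```python
def fixint_query_model(query_counts, hq_model, hc_model, alpha=0.8, beta=0.5):
    """FixInt(alpha, beta): p(w|theta_k) = alpha * p(w|Q_k)
       + (1 - alpha) * [beta * p(w|H_C) + (1 - beta) * p(w|H_Q)]."""
    q_len = sum(query_counts.values())
    vocab = set(query_counts) | set(hq_model) | set(hc_model)
    theta_k = {}
    for w in vocab:
        p_w_q = query_counts.get(w, 0) / q_len      # MLE of p(w|Q_k)
        p_w_h = beta * hc_model.get(w, 0.0) + (1 - beta) * hq_model.get(w, 0.0)
        theta_k[w] = alpha * p_w_q + (1 - alpha) * p_w_h
    return theta_k

# Example: a query history about "java applet" pulls "java" toward the programming sense.
theta = fixint_query_model({"java": 1}, {"java": 0.5, "applet": 0.5}, {"applet": 1.0})
print(sorted(theta.items()))
```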

12 Language models for context-sensitive information retrieval 3. Bayesian Interpolation (BayesInt) One possible problem with the FixInt approach is that the coefficients, especially α, are fixed across all queries. If the current query is very long, we should trust the current query more, whereas if it has just one word, it may be beneficial to put more weight on the history. To capture this intuition, we treat p(w|H_Q) and p(w|H_C) as Dirichlet priors and the current query Q_k as the observed data, and estimate the context query model θ_k using the Bayesian estimator.

13 Language models for context-sensitive information retrieval The estimated model is given by p(w|θ_k) = ( c(w, Q_k) + μ·p(w|H_Q) + ν·p(w|H_C) ) / ( |Q_k| + μ + ν ), where μ is the prior sample size for p(w|H_Q) and ν is the prior sample size for p(w|H_C).
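A corresponding Python sketch of the BayesInt estimator as reconstructed above; the H_Q and H_C models are assumed to be word-probability dictionaries, and the parameter defaults are arbitrary, not values from the paper.

```python
def bayesint_query_model(query_counts, hq_model, hc_model, mu=1.0, nu=5.0):
    """BayesInt(mu, nu): p(w|theta_k) =
       (c(w, Q_k) + mu * p(w|H_Q) + nu * p(w|H_C)) / (|Q_k| + mu + nu)."""
    q_len = sum(query_counts.values())
    vocab = set(query_counts) | set(hq_model) | set(hc_model)
    denom = q_len + mu + nu
    return {
        w: (query_counts.get(w, 0) + mu * hq_model.get(w, 0.0) + nu * hc_model.get(w, 0.0)) / denom
        for w in vocab
    }
```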

14 Language models for context-sensitive information retrieval 4. Online Bayesian Updating (OnlineUp) 4.1 Bayesian updating Let φ be our current query model and T be a new piece of text evidence observed. To update the query model based on T, we use φ to define a Dirichlet prior parameterized as Dir( μ·p(w_1|φ), ..., μ·p(w_N|φ) ), where μ is the equivalent sample size of the prior and N is the vocabulary size. With such a conjugate prior, the predictive distribution of the updated model φ' is p(w|φ') = ( c(w, T) + μ·p(w|φ) ) / ( |T| + μ ).
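The predictive distribution above is simply the posterior mean of the conjugate Dirichlet update. A one-function Python sketch of this single updating step (the name dirichlet_update and the example tokens are illustrative):

```python
from collections import Counter

def dirichlet_update(prior_model, new_text_tokens, mu):
    """One conjugate update step: p(w|phi') = (c(w, T) + mu * p(w|phi)) / (|T| + mu)."""
    counts = Counter(new_text_tokens)
    text_len = len(new_text_tokens)
    vocab = set(prior_model) | set(counts)
    return {w: (counts[w] + mu * prior_model.get(w, 0.0)) / (text_len + mu)
            for w in vocab}

# Example: a clicked summary about applets shifts the model toward the programming sense.
phi = {"java": 1.0}
print(dirichlet_update(phi, ["java", "applet", "programming"], mu=2.0))
```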

15 Language models for context-sensitive information retrieval 4.2 Sequential query model updating We use such information to define a prior on the query model, which is denoted by φ_0. After we observe the first query Q_1, we can update the query model based on this newly observed data. The updated query model θ_1 can then be used for ranking documents in response to Q_1. As the user views some documents, the displayed summary text for these documents can serve as new data for us to further update the query model to obtain φ_1, which in turn serves as the prior for the next query, and so on.

16 Language models for context-sensitive information retrieval We see two types of updating: (1) updating based on a new query, and (2) updating based on a new clicked summary. Thus we have the following updating equations: p(w|θ_k) = ( c(w, Q_k) + μ·p(w|φ_{k-1}) ) / ( |Q_k| + μ ) and p(w|φ_k) = ( c(w, C_k) + ν·p(w|θ_k) ) / ( |C_k| + ν ), where C_k denotes the concatenated summaries of the documents clicked for query Q_k, μ is the equivalent sample size for the prior when updating the model based on a query, and ν is the equivalent sample size for the prior when updating the model based on a clicked summary.
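A Python sketch of one OnlineUp session under the reconstruction above; queries and clicked_summaries are parallel lists of token lists, and all names and default weights are illustrative.

```python
from collections import Counter

def _update(prior_model, tokens, sample_size):
    counts, total = Counter(tokens), len(tokens)
    vocab = set(prior_model) | set(counts)
    return {w: (counts[w] + sample_size * prior_model.get(w, 0.0)) / (total + sample_size)
            for w in vocab}

def online_up(queries, clicked_summaries, prior_model, mu=2.0, nu=5.0):
    """OnlineUp: interleave query updates (weight mu) and clicked-summary updates (weight nu)."""
    phi = dict(prior_model)
    ranking_models = []
    for q_tokens, c_tokens in zip(queries, clicked_summaries):
        theta_k = _update(phi, q_tokens, mu)   # fold in the new query; used to rank for this query
        ranking_models.append(theta_k)
        phi = _update(theta_k, c_tokens, nu)   # then fold in the summaries clicked for it
    return ranking_models
```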

17 Language models for context-sensitive information retrieval 5. Batch Bayesian Updating (BatchUp) Instead of folding in each clicked summary as it arrives, BatchUp updates the model with each new query but incorporates all of the summaries clicked so far in one batch. The updating equations are as follows: p(w|θ_k) = ( c(w, Q_k) + μ·p(w|θ_{k-1}) ) / ( |Q_k| + μ ) and p(w|φ_k) = ( Σ_{i=1}^{k-1} c(w, C_i) + ν·p(w|θ_k) ) / ( Σ_{i=1}^{k-1} |C_i| + ν ), where μ has the same interpretation as in OnlineUp and ν indicates to what extent we want to trust the clicked summaries.
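For contrast, a sketch of BatchUp under the same reconstruction: the query history is accumulated incrementally, while all summaries clicked before the current query are applied in one batch. Names and defaults are again illustrative.

```python
from collections import Counter

def _dirichlet_update(prior_model, counts, total, sample_size):
    vocab = set(prior_model) | set(counts)
    return {w: (counts[w] + sample_size * prior_model.get(w, 0.0)) / (total + sample_size)
            for w in vocab}

def batch_up(queries, clicked_summaries, prior_model, mu=2.0, nu=5.0):
    """BatchUp: theta_k accumulates the queries; phi_k adds all summaries clicked
    before the current query in a single batch and is used for ranking."""
    theta = dict(prior_model)
    past_clicks = Counter()
    ranking_models = []
    for k, q_tokens in enumerate(queries):
        theta = _dirichlet_update(theta, Counter(q_tokens), len(q_tokens), mu)
        total = sum(past_clicks.values())
        phi = _dirichlet_update(theta, past_clicks, total, nu)   # batch clickthrough update
        ranking_models.append(phi)
        past_clicks.update(clicked_summaries[k])   # clicks observed after ranking query k
    return ranking_models
```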

18-22 Experiments: result tables and figures (not reproduced in this transcript).

23 Conclusions In this paper, we have explored how to exploit implicit feedback information, including query history and clickthrough history within the same search session, to improve information retrieval performance. Experimental results show that using implicit feedback, especially clickthrough history, can substantially improve retrieval performance without requiring any additional user effort.