Active Feedback in Ad Hoc IR Xuehua Shen, ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign

2 Normal Relevance Feedback (RF) [Slide diagram: the user issues a Query to the Retrieval System, which returns the Top K Results (d1, d2, …, dk with retrieval scores) from the Document Collection; the user supplies Feedback Judgments (d1 +, d2 −, …, dk −) that the system learns from.]

3 Document Selection in RF [Slide diagram: the same feedback loop, but now the Retrieval System must decide which k docs to present to the User for judgment.] Can we do better than just presenting the top K? (Consider diversity…)

4 Active Feedback (AF) An IR system actively selects documents for obtaining relevance judgments. If a user is willing to judge K documents, which K documents should we present in order to maximize learning effectiveness?

5 Outline Framework and specific methods; experiment design and results; summary and future work

6 A Framework for Active Feedback Consider active feedback as a decision problem: decide which K documents (D) to present for relevance judgment. Formalize it as an optimization problem: optimize the expected learning benefit (loss) of requesting relevance judgments on D from the user. Consider two cases of the loss function, depending on whether the judged documents are treated independently or as interacting with each other.

7 Formula of the Framework [Equations omitted in transcript; the surviving slide captions read: value of documents for learning, independent judgment, different judgments.]
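A hedged reconstruction of the selection criterion described by the framework slide, written in notation of my own choosing (the exact formula on the original slide may differ): pick the K-document set D whose loss, expected over the user's possible judgments J, is smallest; under the independent-loss assumption the loss decomposes per document.

```latex
% Sketch only: D is a candidate set of K documents, J the user's (unknown)
% judgments on D, and U what is known about the user/query before feedback.
D^{*} \;=\; \arg\min_{|D|=K} \; \mathbb{E}_{J}\!\left[\, L(D, J) \,\right]
       \;=\; \arg\min_{|D|=K} \; \sum_{J} L(D, J)\, P(J \mid D, U)

% Independent-loss case (next slide): the loss decomposes over documents,
% yielding per-document selection rules such as Top-K and Uncertainty Sampling.
L(D, J) \;=\; \sum_{i=1}^{K} l(d_i, j_i)
```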

8 Independent Loss [Equations omitted in transcript: with an independent loss, the overall expected loss reduces to the expected loss of each document.]

9 Independent Loss (cont.) Two instantiations: Top K (relevant docs are more useful than non-relevant docs) and Uncertainty Sampling (the more uncertain a document's relevance, the more useful its judgment).
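A minimal sketch of these two independent-loss selection rules, assuming each candidate document comes with an estimated probability of relevance; the function and variable names are illustrative, not taken from the original system.

```python
def select_top_k(ranked_docs, k):
    """Top-K: present the k highest-ranked documents (normal feedback).

    ranked_docs: list of (doc_id, prob_relevance), sorted by descending score.
    """
    return [doc_id for doc_id, _ in ranked_docs[:k]]


def select_uncertainty_sampling(ranked_docs, k):
    """Uncertainty Sampling: present the k documents whose relevance is most
    uncertain, i.e. whose estimated P(relevant) is closest to 0.5."""
    by_uncertainty = sorted(ranked_docs, key=lambda x: abs(x[1] - 0.5))
    return [doc_id for doc_id, _ in by_uncertainty[:k]]


# Example with hypothetical relevance estimates.
docs = [("d1", 0.95), ("d2", 0.80), ("d3", 0.55), ("d4", 0.40), ("d5", 0.10)]
print(select_top_k(docs, 3))                 # ['d1', 'd2', 'd3']
print(select_uncertainty_sampling(docs, 3))  # ['d3', 'd4', 'd2']
```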

10 Dependent Loss Two heuristics that reward both relevance (more relevant, more useful) and diversity (more diverse, more useful): K Cluster Centroid, which first selects the top N docs of the baseline retrieval, clusters the N docs into K clusters, and presents the K cluster centroids (MMR is a related alternative); and Gapped Top K, which walks down the ranking and picks one doc every G+1 docs.
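A hedged sketch of the two dependent-loss methods; the document representation (term-weight vectors) and the use of scikit-learn k-means here are stand-ins, not necessarily what the original experiments used.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_gapped_top_k(ranked_doc_ids, k, gap):
    """Gapped Top-K: walk down the baseline ranking and keep one document
    every (gap + 1) positions, spreading the selection over the ranking."""
    return ranked_doc_ids[::gap + 1][:k]


def select_k_cluster_centroids(ranked_doc_ids, doc_vectors, k, n=100):
    """K-Cluster Centroid: cluster the top-n baseline documents into k clusters
    and return, for each cluster, the document closest to its centroid.

    doc_vectors: dict mapping doc_id to a numpy term-weight vector."""
    top_ids = ranked_doc_ids[:n]
    matrix = np.vstack([doc_vectors[d] for d in top_ids])
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(matrix)
    selected = []
    for c in range(k):
        members = [i for i, label in enumerate(km.labels_) if label == c]
        dists = [np.linalg.norm(matrix[i] - km.cluster_centers_[c]) for i in members]
        selected.append(top_ids[members[int(np.argmin(dists))]])
    return selected
```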

11 Illustration of Three AF Methods [Slide figure: the baseline ranked list with the documents selected by Top-K (normal feedback), Gapped Top-K, and K-Cluster Centroid marked; the latter two aim at high diversity.]

12 Evaluating Active Feedback [Slide diagram: for each Query, produce the Initial Results with no feedback; Select K Docs using one of the methods (Top-K, Gapped, Clustering); look the K docs up in the Judgment File to obtain the Judged Docs; feed those judgments back and compare the Feedback Results against the initial results.]
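A minimal sketch of this simulated-user evaluation loop; run_retrieval and select_docs are placeholders for the actual retrieval system and selection method, and qrels stands for the TREC judgment file.

```python
def evaluate_active_feedback(queries, qrels, select_docs, k=5):
    """Simulate the user with the judgment file: the system picks k docs,
    their official labels serve as the feedback, and the feedback run is
    compared against the no-feedback baseline."""
    results = {}
    for query in queries:
        initial = run_retrieval(query)            # baseline ranking, no feedback
        presented = select_docs(initial, k)       # Top-K / Gapped / Clustering
        judged = {d: qrels[query.id].get(d, 0) for d in presented}
        feedback_run = run_retrieval(query, feedback=judged)
        rel_set = {d for d, r in qrels[query.id].items() if r > 0}
        results[query.id] = (average_precision(initial, rel_set),
                             average_precision(feedback_run, rel_set))
    return results
```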

13 Retrieval Methods (Lemur toolkit) [Slide diagram: query Q and document D are matched with the KL-divergence retrieval function to produce the Results; the actively selected Feedback Docs F = {d1, …, dn} are fed into mixture model feedback, learning only from the relevant docs.] Default parameter settings unless otherwise stated.
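A hedged sketch of the two pieces named here, the KL-divergence ranking function and simple mixture-model feedback; the smoothing, parameter values, and estimation details in the actual Lemur runs may differ.

```python
import math
from collections import Counter

def mixture_model_feedback(feedback_docs, p_w_collection, noise=0.5, iters=30):
    """Estimate a feedback topic model theta_F from the relevant feedback docs,
    assuming each word occurrence is drawn from a two-component mixture of
    theta_F and the collection background model (simple EM, fixed noise weight).

    feedback_docs: list of token lists; p_w_collection: dict of background probs."""
    counts = Counter(w for doc in feedback_docs for w in doc)
    vocab = list(counts)
    p_f = {w: 1.0 / len(vocab) for w in vocab}        # uniform initialization
    for _ in range(iters):
        # E-step: probability that each occurrence of w came from theta_F
        t = {w: (1 - noise) * p_f[w] /
                ((1 - noise) * p_f[w] + noise * p_w_collection[w]) for w in vocab}
        # M-step: re-estimate theta_F from the topic-attributed counts
        norm = sum(counts[w] * t[w] for w in vocab)
        p_f = {w: counts[w] * t[w] / norm for w in vocab}
    return p_f


def interpolate_query_model(p_w_query, p_f, alpha=0.5):
    """Updated query model: (1 - alpha) * original query model + alpha * theta_F."""
    words = set(p_w_query) | set(p_f)
    return {w: (1 - alpha) * p_w_query.get(w, 0.0) + alpha * p_f.get(w, 0.0)
            for w in words}


def kl_score(p_w_query, p_w_doc):
    """KL-divergence ranking (rank-equivalent cross entropy):
    sum_w p(w|theta_Q) * log p(w|theta_D); the doc model is assumed smoothed."""
    return sum(pq * math.log(p_w_doc[w]) for w, pq in p_w_query.items() if pq > 0)
```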

14 Comparison of Three AF Methods [Table omitted in transcript: MAP and #AFRel (relevant docs among the K judged) per topic for Baseline, Pseudo FB, Top-K, Gapped, and Clustering on HARD 2003 and AP88-89, with and without the judged docs included; the numeric values and the * / ** markers did not survive extraction.] Top-K is the worst! Clustering uses the fewest relevant docs.

15 Appropriate Evaluation of Active Feedback Three test settings: the original DB with the judged docs kept (AP88-89, HARD), where we can't tell whether the ranking of un-judged documents is improved; the original DB with the judged docs removed, where different methods end up with different test documents; and a new DB (feedback on AP88-89, testing on AP90), which shows the learning effect more explicitly but requires the new docs to be similar to the original docs.

16 Retrieval Performance on AP90 Dataset [Table omitted in transcript: MAP on AP90 for Baseline, Pseudo FB, Top K, Gapped Top K, and K Cluster Centroid.] Top-K is consistently the worst!
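For reference, a minimal average-precision implementation matching the placeholder used in the evaluation sketch above; MAP is its mean over all queries.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision of one ranked list: sum of precision@rank at each
    rank where a relevant document appears, divided by the total number of
    relevant documents (0 if there are none)."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0
```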

17 Mixture Model Parameter [Figure omitted in transcript: retrieval performance plotted against the mixture model parameter (factor).]

18 Summary Introduced the active feedback problem; proposed a preliminary framework and three methods (Top-K, Gapped Top-K, Clustering); studied the evaluation strategy. Experiment results show that presenting the top K is not the best strategy, and that clustering can generate fewer but higher quality feedback examples.

19 Future Work Explore other methods for active feedback; develop a general framework; combine pseudo feedback and active feedback.

20 Thank you! The End