Modeling Diversity in Information Retrieval


Modeling Diversity in Information Retrieval
ChengXiang ("Cheng") Zhai
Department of Computer Science, Graduate School of Library & Information Science, Institute for Genomic Biology, and Department of Statistics, University of Illinois, Urbana-Champaign
Joint work with John Lafferty, William Cohen, and Xuehua Shen
ACM SIGIR 2009 Workshop on Redundancy, Diversity, and Interdependent Document Relevance, July 23, 2009, Boston, MA

Different Reasons for Diversification
- Redundancy reduction
- Diverse information needs: a mixture of users, or a single user with an under-specified query (aspect retrieval, overview of results)
- Active relevance feedback
- ...

Outline
- Risk minimization framework
- Capturing different needs for diversification
- Language models for diversification

IR as Sequential Decision Making
The user (with an information need) and the system (with a model of that need) take turns. The user acts (A1: enter a query; A2: view a document; A3: click the "Back" button; ...), deciding which documents to view and whether to view more. The system responds with Ri (R1: results; R2: document content; ...), deciding which documents to present and how to present them, and which part of a document to show and how.

Retrieval Decisions
Given the user U, the document collection C, the current action At, and the interaction history H = {(Ai, Ri)}, i = 1, ..., t-1, choose the best response Rt from r(At), the set of all possible responses to At. For example, if At is entering the query "Jaguar", r(At) is all possible rankings of C and Rt is the best ranking for the query; if At is clicking the "Next" button, r(At) is all possible size-k subsets of unseen documents and Rt is the best k unseen docs.

A Risk Minimization Framework
Observed: the user U, the interaction history H, the current user action At, and the document collection C. Inferred: a user model M = (S, θU, ...), where S is the set of seen documents and θU captures the information need. Given the possible responses r(At) = {r1, ..., rn} and a loss function L(ri, At, M), the optimal response r* minimizes the Bayes risk, i.e., the loss averaged over the posterior P(M | U, H, At, C).

A Simplified Two-Step Decision-Making Procedure
Approximate the Bayes risk by the loss at the mode of the posterior distribution:
- Step 1: compute an updated user model M* based on the currently available information, M* = argmax_M P(M | U, H, At, C).
- Step 2: given M*, choose a response Rt that minimizes the loss function L(r, At, M*).
A minimal code sketch of this procedure follows.
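
The sketch below is an illustrative (not published) rendering of the two-step procedure; the names `posterior`, `loss`, `candidate_models`, and `candidate_responses` are assumptions standing in for whatever the surrounding system provides.

```python
# Minimal sketch of the two-step decision procedure (illustrative only).
# Assumes the caller supplies:
#   posterior(m)        -- P(M = m | U, H, A_t, C) for a candidate model m
#   loss(r, m)          -- L(r, A_t, m)
#   candidate_models    -- a finite set of candidate user models
#   candidate_responses -- r(A_t), the possible responses to the current action

def choose_response(candidate_models, candidate_responses, posterior, loss):
    # Step 1: point estimate of the user model (mode of the posterior),
    # replacing the full Bayes-risk integral over models.
    m_star = max(candidate_models, key=posterior)
    # Step 2: pick the response minimizing the loss under the estimated model.
    return min(candidate_responses, key=lambda r: loss(r, m_star))
```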

Optimal Interactive Retrieval
The IR system and the user U (over collection C) interact in a loop: after each user action At, the system infers the current user model M*t from P(Mt | U, H, At, C), returns the response Rt that minimizes L(r, At, M*t), and waits for the next action At+1.

Refinement of Risk Minimization
- At ∈ {"enter a query", "click on Back button", "click on Next button", ...}
- r(At): the decision space (At-dependent), e.g., all possible subsets of C plus presentation strategies, all possible rankings of docs in C, all possible rankings of unseen docs, ...
- M: the user model. Essential components: θU = user information need, S = seen documents, n = "topic is new to the user".
- L(Rt, At, M): the loss function. It generally measures the utility of Rt for a user modeled as M and often encodes retrieval criteria (e.g., using M to select a ranking of docs).
- P(M | U, H, At, C): user model inference, which often involves estimating a unigram language model θU.

Generative Model of Document & Query [Lafferty & Zhai 01]
Observed: the query q (from user U) and the document d (from source S). Partially observed: the relevance variable R. Inferred: the underlying query and document models.

Risk Minimization with Language Models [Lafferty & Zhai 01, Zhai & Lafferty 06]
[Diagram: the query q from user U and the document set C from source S are modeled with a query language model θq and document language models θ1, ..., θN; the system chooses among possible presentations (D1, π1), (D2, π2), ..., (Dn, πn) so as to minimize the expected loss L.]

Optimal Ranking for Independent Loss
When the decision space is the set of rankings, users browse sequentially, and the loss is independent across documents, the risk decomposes into independent per-document scoring: the "risk ranking principle" [Zhai 02, Zhai & Lafferty 06].

Risk Minimization for Diversification
- Redundancy reduction: the loss function includes a redundancy/novelty measure. Special case: list presentation + MMR [Zhai et al. 03].
- Diverse information needs: the loss function is defined on latent topics. Special case: PLSA/LDA + aspect retrieval [Zhai 02].
- Active relevance feedback: the loss function considers both relevance and the benefit for feedback. Special case: feedback only (hard queries) [Shen & Zhai 05].

Subtopic Retrieval
Query: What are the applications of robotics in the world today? Find as many DIFFERENT applications as possible.
Example subtopics: A1: spot-welding robotics; A2: controlling inventory; A3: pipe-laying robots; A4: talking robot; A5: robots for loading & unloading memory tapes; A6: robot [telephone] operators; A7: robot cranes; ...
Subtopic judgments record, for each document, which subtopics it is relevant to (1 = relevant):
       A1 A2 A3 A4 ... Ak-1 Ak
  d1    1  1  0  0 ...   0   0
  d2    0  1  1  1 ...   0   0
  d3    0  0  0  0 ...   1   0
  ...
  dk    1  0  1  0 ...   0   1
Finding all the different applications requires modeling interdependent document relevance.

Diversify = Remove Redundancy [Zhai et al. 03]
Greedy algorithm for ranking: Maximal Marginal Relevance (MMR). The loss function includes a cost for redundancy that encodes the user's "willingness to tolerate redundancy"; the cost of a redundant relevant document (C2) is set below that of a non-relevant document (C3), i.e., C2 < C3, since a redundant relevant doc is still better than a non-relevant doc. A sketch of the greedy re-ranking appears below.
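
A minimal sketch of MMR-style greedy re-ranking, assuming precomputed relevance scores and a pairwise similarity function; the weight `lam` (a hypothetical parameter name) trades off relevance against novelty, i.e., the willingness to tolerate redundancy.

```python
def mmr_rerank(doc_ids, relevance, similarity, lam=0.5, k=10):
    """Greedy MMR re-ranking (sketch).

    relevance:  dict doc_id -> relevance score for the query
    similarity: function (doc_a, doc_b) -> similarity in [0, 1]
    lam:        trade-off between relevance and novelty
    """
    selected = []
    remaining = set(doc_ids)
    while remaining and len(selected) < k:
        def marginal(d):
            # Novelty penalty: similarity to the most similar already-selected doc.
            redundancy = max((similarity(d, s) for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(remaining, key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Setting `lam=1.0` recovers plain relevance ranking; smaller values push near-duplicates down the list.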

A Mixture Model for Redundancy
Each word of a candidate document is assumed to be drawn either from P(w|Old), a language model estimated from the reference (already seen) document, or from P(w|Background), a model estimated from the collection, with an unknown mixing weight λ. The weight λ, estimated with EM, gives p(New|d), the probability that the document contains "new" content. p(New|d) can also be estimated using KL-divergence.
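
A sketch of the EM estimate of the mixing weight for this two-component word mixture, under the simplifying assumption that both component distributions are fixed and only the weight is learned; reading the background-component weight as p(New|d) follows the slide's convention and is an assumption here, not the paper's exact formulation.

```python
def estimate_p_new(doc_tokens, p_old, p_background, iters=50):
    """EM for the mixing weight of a two-component word mixture (sketch).

    doc_tokens:   list of word tokens in the candidate document
    p_old:        dict word -> P(w | Old), from the reference document
    p_background: dict word -> P(w | Background), from the collection
    Returns the estimated weight of the background component, read here
    as p(New | d).
    """
    lam = 0.5  # initial weight of the background ("new") component
    for _ in range(iters):
        # E-step: posterior that each token came from the background component.
        post = []
        for w in doc_tokens:
            p_new_w = lam * p_background.get(w, 1e-10)
            p_old_w = (1 - lam) * p_old.get(w, 1e-10)
            post.append(p_new_w / (p_new_w + p_old_w))
        # M-step: the new weight is the average posterior over tokens.
        lam = sum(post) / max(len(post), 1)
    return lam
```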

Evaluation metrics
Intuitive goals: documents from many different subtopics should appear early in a ranking (subtopic coverage/recall), and we should not see many different documents that cover the same subtopics (redundancy). How do we quantify these? One problem is that the "intrinsic difficulty" of queries can vary.

Evaluation metrics: a proposal
- Definition: subtopic recall at rank K is the fraction of subtopics a such that one of d1, ..., dK is relevant to a.
- Definition: minRank(S, r) is the smallest rank K such that the ranking produced by IR system S has subtopic recall r at rank K.
- Definition: subtopic precision at recall level r for IR system S is minRank(Sopt, r) / minRank(S, r), where Sopt is the optimal system.
This generalizes ordinary recall-precision metrics, but it does not explicitly penalize redundancy.
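
These definitions translate directly into code. The sketch below assumes subtopic judgments are given as a set of subtopic ids per document, and that minRank for the optimal system is supplied separately (computing it exactly is the hard part, discussed below).

```python
def subtopic_recall(ranking, judgments, n_subtopics, K):
    """Fraction of the n_subtopics covered by the top-K documents."""
    covered = set()
    for d in ranking[:K]:
        covered |= judgments.get(d, set())
    return len(covered) / n_subtopics

def min_rank(ranking, judgments, n_subtopics, r):
    """Smallest K at which the ranking reaches subtopic recall r (None if never)."""
    covered = set()
    for K, d in enumerate(ranking, start=1):
        covered |= judgments.get(d, set())
        if len(covered) / n_subtopics >= r:
            return K
    return None

def s_precision(ranking, judgments, n_subtopics, r, min_rank_opt):
    """Subtopic precision at recall r: minRank(S_opt, r) / minRank(S, r)."""
    k = min_rank(ranking, judgments, n_subtopics, r)
    return min_rank_opt / k if k else 0.0
```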

Evaluation metrics: rationale
[Figure: precision-recall style plot of subtopic precision, built from minRank(S, r) and minRank(Sopt, r).] Unlike ordinary precision-recall, the minRank(Sopt, r) curve is neither predictable nor linear, which is why performance is measured relative to the optimal system.

Evaluating redundancy
- Definition: the cost of a ranking d1, ..., dK is cost(d1, ..., dK) = Σ_{i=1..K} (b + a · |subtopics(di)|), where b is the cost of seeing a document and a is the cost of seeing a subtopic inside a document (in the previous metric, effectively a = 0).
- Definition: minCost(S, r) is the minimal cost at which recall r is obtained.
- Definition: weighted subtopic precision at r is minCost(Sopt, r) / minCost(S, r).
The experiments use a = b = 1.

Evaluation Metrics Summary
- Measure performance (size of ranking, minRank; cost of ranking, minCost) relative to the optimal system.
- Generalizes ordinary precision/recall.
- Possible problem: computing minRank and minCost exactly is NP-hard! A greedy approximation (sketched below) seems to work well for our data set.
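
Finding the optimal ranking is a set-cover problem, which is why exact minRank/minCost are NP-hard. A minimal sketch of the greedy approximation mentioned above: at each step, pick the document covering the most not-yet-covered subtopics, with a = b = 1 costs as in the experiments.

```python
def greedy_opt_ranking(judgments, n_subtopics, r):
    """Greedy approximation of the optimal ranking reaching subtopic recall r.

    judgments: dict doc_id -> set of subtopic ids the doc is relevant to
    Returns (ranking, estimated minRank, estimated minCost with a = b = 1).
    """
    covered, ranking, cost = set(), [], 0
    target = r * n_subtopics
    remaining = dict(judgments)
    while len(covered) < target and remaining:
        # Pick the document adding the most new subtopics.
        best = max(remaining, key=lambda d: len(remaining[d] - covered))
        if not (remaining[best] - covered):
            break  # nothing adds new coverage; recall r is unreachable
        ranking.append(best)
        cost += 1 + len(remaining[best])  # b=1 per doc, a=1 per subtopic in it
        covered |= remaining.pop(best)
    return ranking, len(ranking), cost
```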

Experiment Design
- Dataset: TREC "interactive track" data; London Financial Times: 210k docs, 500MB; 20 queries from TREC 6-8; subtopics: average 20, min 7, max 56; judged docs: average 40, min 5, max 100; non-judged docs are assumed not relevant to any subtopic.
- Baseline: relevance-based ranking (using language models).
- Two experiments: ranking only relevant documents, and ranking all documents.

S-Precision: re-ranking relevant docs

WS-precision: re-ranking relevant docs

Results for ranking all documents “Upper bound”: use subtopic names to build an explicit subtopic model.

Summary: Remove Redundancy
- The mixture model is effective for identifying novelty in relevant documents.
- Trading off novelty and relevance is hard.
- Relevance seems to be the dominant factor in the TREC interactive-track data.

Diversity = Satisfy Diverse Info. Need [Zhai 02]
Reducing redundancy doesn't ensure complete coverage of diverse aspects; we need to directly model the latent aspects and then optimize results based on aspect/topic matching.

Aspect Generative Model of Document & Query
The user generates the query q and the source S generates the document d, both through a set of latent aspects θ = (θ1, ..., θk). The aspect model can be instantiated with PLSI (where p(w|d) = Σ_j p(w|θj) p(θj|d)) or with LDA (which additionally places a Dirichlet prior on the aspect mixing proportions).

Aspect Loss Function
[Equation slide defining the aspect loss in terms of the query model θU (from user U and query q), the source S, and the document d; see the illustration on the next slide.]

Aspect Loss Function: Illustration
The query model defines the desired aspect coverage p(a|θQ). The first k-1 selected documents already cover p(a|θ1), ..., p(a|θk-1); a new candidate contributes p(a|θk), and it is judged by how the combined coverage matches the desired coverage: the candidate may be non-relevant (covers undesired aspects), redundant (covers aspects already covered), or perfect (fills exactly the remaining gap). A hypothetical code sketch of this greedy selection follows.
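
A hypothetical sketch of the idea in the illustration: greedily pick the next document whose aspect distribution, combined with what is already covered, comes closest (in KL divergence) to the desired coverage p(a|θQ). Averaging the per-document aspect distributions is a simplifying assumption, not the exact loss from the paper.

```python
import math

def kl(p, q, eps=1e-10):
    """KL divergence D(p || q) over a shared aspect vocabulary."""
    return sum(pi * math.log((pi + eps) / (q.get(a, 0.0) + eps))
               for a, pi in p.items() if pi > 0)

def pick_next(candidates, covered_docs, desired, doc_aspects):
    """Pick the candidate whose addition best matches the desired aspect coverage.

    desired:     dict aspect -> p(a | query model)
    doc_aspects: dict doc_id -> dict aspect -> p(a | doc model)
    """
    def combined_loss(d):
        docs = covered_docs + [d]
        # Simplifying assumption: combined coverage = average of per-doc coverage.
        combined = {}
        for doc in docs:
            for a, p in doc_aspects[doc].items():
                combined[a] = combined.get(a, 0.0) + p / len(docs)
        return kl(desired, combined)
    return min(candidates, key=combined_loss)
```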

Evaluation Measures
- Aspect Coverage (AC): measures per-doc coverage, #distinct-aspects / #docs.
- Aspect Uniqueness (AU): measures redundancy, #distinct-aspects / #aspects.
Example (cumulative counts after each ranked document):
                      d1        d2        d3
  #docs                1         2         3
  #aspects             2         5         8
  #distinct aspects    2         4         5
  AC                 2/1=2.0   4/2=2.0   5/3=1.67
  AU                 2/2=1.0   4/5=0.8   5/8=0.625
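
A small sketch of the two measures, assuming per-document aspect judgments; fed the counts from the example above, it reproduces the AC and AU values shown.

```python
def aspect_measures(ranking, judgments, K):
    """Aspect Coverage (AC) and Aspect Uniqueness (AU) at rank K (sketch).

    judgments: dict doc_id -> collection of aspects the doc covers
    AC = #distinct aspects / #docs
    AU = #distinct aspects / #aspect occurrences (with repeats)
    """
    docs = ranking[:K]
    occurrences, distinct = 0, set()
    for d in docs:
        aspects = judgments.get(d, set())
        occurrences += len(aspects)
        distinct |= set(aspects)
    ac = len(distinct) / len(docs) if docs else 0.0
    au = len(distinct) / occurrences if occurrences else 0.0
    return ac, au
```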

Effectiveness of Aspect Loss Function (PLSI)

Effectiveness of Aspect Loss Function (LDA)

Comparison of 4 MMR Methods
- CC: Cost-based Combination
- QB: Query Background Model
- MQM: Query Marginal Model
- MDM: Document Marginal Model

Summary: Diverse Information Need
- The mixture model is effective for capturing latent topics.
- Direct modeling of latent aspects/topics is more effective than indirect modeling through MMR for improving aspect coverage, but MMR is better for improving aspect uniqueness.
- With direct topic modeling and matching, aspect coverage can be improved at the price of lower relevance-based precision.

Diversify = Active Feedback [Shen & Zhai 05]
Decision problem: decide which subset of documents to present for relevance judgment.

Independent Loss

Independent Loss (cont.)
Special cases: Top K and Uncertainty Sampling.

Dependent Loss
Heuristics: consider relevance first, then diversity. Select the top N documents, then diversify:
- MMR
- Gapped Top K
- K Cluster Centroid (cluster the N docs into K clusters and take the centroids)

Illustration of Three AF Methods
[Figure: from a ranked list 1, 2, 3, ..., 16, ..., Top-K (normal feedback) takes the first K documents; Gapped Top-K takes documents separated by a fixed gap in the ranking; the K-cluster centroid method takes one representative per cluster, aiming at high diversity.] A sketch of the three selection strategies follows.
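
A sketch of the three selection strategies under simple assumptions: Gapped Top-K takes every (gap+1)-th document from the top of the ranking, and the clustering variant runs k-means over document vectors and returns the document closest to each centroid. Parameter names and the use of scikit-learn are illustrative choices, not the paper's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def top_k(ranking, k):
    return ranking[:k]

def gapped_top_k(ranking, k, gap=1):
    # Take every (gap+1)-th document from the top of the ranking.
    return ranking[::gap + 1][:k]

def k_cluster_centroid(ranking, doc_vectors, k, n=100):
    """Cluster the top-n docs into k clusters; return the doc nearest each centroid."""
    docs = ranking[:n]
    X = np.array([doc_vectors[d] for d in docs])
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    selected = []
    for c in km.cluster_centers_:
        idx = int(np.argmin(np.linalg.norm(X - c, axis=1)))
        selected.append(docs[idx])
    return selected
```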

Evaluating Active Feedback
[Flow: a query produces initial results; K documents are selected for judgment (Top-K, Gapped, or Clustering) and looked up in the judgment file (+/-); the judged documents drive feedback, and the feedback results are compared against the no-feedback baseline.]

Retrieval Methods (Lemur toolkit)
Documents D are scored against the query Q with Kullback-Leibler divergence scoring. Active feedback selects the feedback docs F = {d1, ..., dn}; mixture-model feedback then learns only from the relevant judged docs. Default parameter settings are used unless otherwise stated.
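A sketch of KL-divergence scoring with Dirichlet-smoothed document models: up to a document-independent constant, ranking by -D(θQ || θD) is equivalent to ranking by Σ_w p(w|θQ) log p(w|θD), which is what the code computes. Parameter names are illustrative, and Lemur's actual implementation differs in details.

```python
import math

def kl_score(query_model, doc_tf, doc_len, collection_lm, mu=2000):
    """Score a document by sum_w p(w | theta_Q) * log p(w | theta_D) (sketch).

    query_model:   dict word -> p(w | theta_Q), e.g. a (feedback) query model
    doc_tf:        dict word -> term frequency in the document
    collection_lm: dict word -> p(w | collection), used for Dirichlet smoothing
    mu:            Dirichlet smoothing parameter
    """
    score = 0.0
    for w, pq in query_model.items():
        p_wd = (doc_tf.get(w, 0) + mu * collection_lm.get(w, 1e-10)) / (doc_len + mu)
        score += pq * math.log(p_wd)
    return score
```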

Comparison of Three AF Methods (including judged docs; in the original slide, bold font = worst and * = best)

  Collection  Method      #Rel   MAP     Pr@10doc
  HARD        Top-K       146    0.325   0.527
              Gapped      150    0.330   0.548
              Clustering  105    0.332   0.565
  AP88-89     Top-K       198    0.228   0.351
              Gapped      180    0.234*  0.389*
              Clustering  118    0.237   0.393

Top-K is the worst! Clustering uses the fewest relevant docs.

Appropriate Evaluation of Active Feedback
Three test settings:
- Original DB with judged docs (AP88-89, HARD): can't tell whether the ranking of un-judged documents is improved.
- Original DB without judged docs: different methods end up with different test documents.
- New DB (AP88-89, AP90): shows the learning effect more explicitly, but the new docs must be similar to the original docs.

Comparison of Different Test Data

  Test Data                        Method      #Rel   MAP    Pr@10doc
  AP88-89 (including judged docs)  Top-K       198    0.228  0.351
                                   Gapped      180    0.234  0.389
                                   Clustering  118    0.237  0.393
  AP90                             Top-K              0.220  0.321
                                   Gapped             0.222  0.326
                                   Clustering         0.223  0.325

Top-K is consistently the worst! Clustering generates fewer, but higher quality, examples.

Summary: Active Feedback
- Presenting the top-k is not the best strategy.
- Clustering can generate fewer, higher quality feedback examples.

Conclusions
- There are many reasons for diversifying search results: redundancy, diverse information needs, and active feedback.
- The risk minimization framework can model all of these cases of diversification.
- Different scenarios may need different techniques and different evaluation measures.

References
Risk Minimization
- [Lafferty & Zhai 01] John Lafferty and ChengXiang Zhai. Document language models, query models, and risk minimization for information retrieval. In Proceedings of ACM SIGIR 2001, pages 111-119.
- [Zhai & Lafferty 06] ChengXiang Zhai and John Lafferty. A risk minimization framework for information retrieval. Information Processing and Management, 42(1), Jan. 2006, pages 31-55.
Subtopic Retrieval
- [Zhai et al. 03] ChengXiang Zhai, William Cohen, and John Lafferty. Beyond independent relevance: Methods and evaluation metrics for subtopic retrieval. In Proceedings of ACM SIGIR 2003.
- [Zhai 02] ChengXiang Zhai. Language Modeling and Risk Minimization in Text Retrieval. Ph.D. thesis, Carnegie Mellon University, 2002.
Active Feedback
- [Shen & Zhai 05] Xuehua Shen and ChengXiang Zhai. Active feedback in ad hoc information retrieval. In Proceedings of ACM SIGIR 2005, pages 59-66.

Thank You!