CS 4501: Information Retrieval


Recap: From RankNet to LambdaRank to LambdaMART: An Overview (Christopher J.C. Burges, 2010)
Does minimizing mis-ordered pairs maximize IR metrics? Not necessarily: one ranking has 6 mis-ordered pairs with AP = 5/8 and DCG = 1.333, while another has only 4 mis-ordered pairs but AP = 5/12 and DCG = 0.931. Position is crucial!

Recap: From RankNet to LambdaRank to LambdaMART: An Overview (Christopher J.C. Burges, 2010)
Weight the mis-ordered pairs? Some pairs are more important to be placed in the right order.
- Inject into the objective function: $\sum_{y_i > y_j} \Omega(d_i, d_j) \exp\left(-w\, \Delta\Phi(q_n, d_i, d_j)\right)$, where $\Omega(d_i, d_j)$ depends on the ranking of documents $i$, $j$ in the whole list.
- Inject into the gradient: $\lambda_{ij} = \frac{\partial O_{approx}}{\partial w} \cdot \Delta O$, where $\frac{\partial O_{approx}}{\partial w}$ is the gradient with respect to the approximated objective (i.e., exponential loss on mis-ordered pairs), and $\Delta O$ is the change in the original objective (e.g., NDCG) if we swap documents $i$ and $j$, leaving the other documents unchanged.

Relevance Feedback
Hongning Wang, CS@UVa

What we have learned so far
(Figure: overall search engine architecture — a crawler builds the indexed corpus; a doc analyzer and indexer produce the doc representation and index; the user's query becomes a query representation; the ranker matches the two to produce results. Research attention now: the ranking procedure, feedback, and evaluation.)

An IR system is an interactive system
There is a gap between the user's information need and the query the system sees: from the query, the system can only work with an inferred information need. User feedback on the ranked documents helps narrow this gap.

Relevance feedback
(Figure: the relevance feedback loop — the user issues a query, the retrieval engine scores the document collection and returns results (d1 3.5, d2 2.4, …, dk 0.5, …); the user judges them (d1 +, d2 −, d3 +, …, dk −, …), and these judgments are fed back to produce an updated query.)

Basic idea in feedback: query expansion
- Feedback documents can help discover related query terms.
- E.g., for the query "information retrieval", relevant docs likely share closely related words such as "search", "search engine", "ranking", "query".
- Expanding the original query with such words will increase recall, and sometimes also precision.

Basic idea in feedback: learning-based retrieval
- Feedback documents can be treated as supervision for updating the ranking model.
- Covered in the "learning-to-rank" lecture.

Relevance feedback in real systems
Google used to provide such functions. Guess why?
(Screenshot: search results with "Relevant" / "Nonrelevant" feedback controls.)

Relevance feedback in real systems
Widely used in image search systems.

Pseudo relevance feedback
What if users are reluctant to provide any feedback?
(Figure: the same loop, but without user judgment — the top-k results from the retrieval engine are simply assumed relevant (d1 +, d2 +, d3 +, …, dk −, …) and fed back to produce the updated query.)
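In code terms, the loop above might look like the minimal sketch below; `search` and `expand_query` are hypothetical stand-ins for the retrieval engine and the query-update method (e.g., Rocchio), not names from these slides.

```python
# A minimal sketch of the pseudo relevance feedback loop.
# `search` and `expand_query` are hypothetical helpers standing in for
# a real retrieval engine and a query-expansion method.

def pseudo_relevance_feedback(query, search, expand_query, k=10):
    """Assume the top-k results are relevant and re-run the search once."""
    initial_results = search(query)          # ranked list of documents
    assumed_relevant = initial_results[:k]   # no user judgment needed
    updated_query = expand_query(query, assumed_relevant)
    return search(updated_query)
```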

Feedback techniques
- Feedback as query expansion
  - Step 1: term selection
  - Step 2: query expansion
  - Step 3: query term re-weighting
- Feedback as training signal (covered in learning-to-rank)

Relevance feedback in vector space models
General idea: query modification
- Adding new (weighted) terms
- Adjusting weights of old terms
The most well-known and effective approach is Rocchio [Rocchio 1971].

Illustration of Rocchio feedback
(Figure: in the vector space, the original query q is moved toward the cluster of relevant (+) documents and away from the non-relevant (−) documents, yielding the modified query.)

Formula for Rocchio feedback
Standard operation in vector space. Modified query:
$$\vec{q}_m = \alpha\, \vec{q} + \frac{\beta}{|D_r|} \sum_{\forall \vec{d}_i \in D_r} \vec{d}_i - \frac{\gamma}{|D_n|} \sum_{\forall \vec{d}_j \in D_n} \vec{d}_j$$
where $\vec{q}$ is the original query, $D_r$ is the set of relevant docs, $D_n$ is the set of non-relevant docs, and $\alpha$, $\beta$, $\gamma$ are the parameters.
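A minimal NumPy sketch of this update (the default values for $\alpha$, $\beta$, $\gamma$ are common choices in the literature, not taken from the slide):

```python
import numpy as np

def rocchio(q, rel_docs, nonrel_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: move the query vector toward the centroid of the
    relevant documents and away from the centroid of the non-relevant ones.
    All inputs are term-weight vectors (e.g., TF-IDF) of equal dimension."""
    q_m = alpha * np.asarray(q, dtype=float)
    if len(rel_docs) > 0:
        q_m += beta * np.mean(rel_docs, axis=0)   # + (beta/|Dr|) * sum of rel docs
    if len(nonrel_docs) > 0:
        q_m -= gamma * np.mean(nonrel_docs, axis=0)  # - (gamma/|Dn|) * sum of non-rel
    return np.maximum(q_m, 0.0)  # negative weights are usually clipped to zero
```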

Rocchio in practice
- Negative (non-relevant) examples are not very important (why?)
- Efficiency concern: restrict the vector to a lower dimension (i.e., only consider highly weighted words in the centroid vector)
- Avoid "training bias": keep a relatively high weight on the original query
- Can be used for both relevance feedback and pseudo feedback
- Usually robust and effective

Feedback in probabilistic models
- Classic probabilistic model: estimate the relevant-doc model $P(D|Q,R=1)$ and the non-relevant-doc model $P(D|Q,R=0)$ from judgments on one query, e.g., (q1,d1,1), (q1,d2,1), (q1,d3,1), (q1,d4,0), (q1,d5,0).
- Language model: estimate the "relevant query" model $P(Q|D,R=1)$ from judgments on one document, e.g., (q3,d1,1), (q4,d1,1), (q5,d1,1), (q6,d2,1), (q6,d3,0).
Feedback:
- $P(D|Q,R=1)$ can be improved for the current query and future docs.
- $P(Q|D,R=1)$ can be improved for the current doc and future queries.

Robertson-Sparck Jones model (Robertson & Sparck Jones 76) (RSJ model)
Two parameters for each term $A_i$:
- $p_i = P(A_i=1|Q,R=1)$: probability that term $A_i$ occurs in a relevant doc
- $u_i = P(A_i=1|Q,R=0)$: probability that term $A_i$ occurs in a non-relevant doc
How do we estimate these parameters? Suppose we have relevance judgments: count term occurrences in the judged relevant and non-relevant docs, e.g., $\hat{p}_i = \frac{r_i + 0.5}{R + 1}$, where $r_i$ is the number of relevant docs containing $A_i$ and $R$ is the number of relevant docs. The "+0.5" and "+1" can be justified by Bayesian estimation as priors.
This is how $P(D|Q,R=1)$ can be improved for the current query and future docs. Note: per-query estimation!
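A minimal sketch of this estimator, combining the two parameters into the usual RSJ log-odds term weight (the function and variable names are illustrative, following the standard presentation of the model):

```python
import math

def rsj_weight(r_i, n_i, R, N):
    """Robertson-Sparck Jones term weight from relevance judgments.
    r_i: relevant docs containing the term; n_i: all docs containing it;
    R: number of relevant docs; N: collection size.
    The +0.5 / +1 smoothing matches the Bayesian priors mentioned above."""
    p_i = (r_i + 0.5) / (R + 1)            # estimate of P(A_i=1 | Q, R=1)
    u_i = (n_i - r_i + 0.5) / (N - R + 1)  # estimate of P(A_i=1 | Q, R=0)
    return math.log((p_i * (1 - u_i)) / (u_i * (1 - p_i)))
```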

Feedback in language models
Recap of the language model approach: rank documents based on query likelihood under the document language model.
Difficulty: the documents are given, i.e., the document language model $p(w|d)$ is fixed, so there is no obvious place to inject feedback.

Feedback in language models
Approach: introduce a probabilistic query model.
- Ranking: measure the distance between the query model and the document model.
- Feedback: update the query model.
Q: Back to the vector space model? A: Kind of, but from a different perspective.

Kullback-Leibler (KL) divergence based retrieval model
Probabilistic similarity measure:
$$sim(q;d) \propto -KL(\theta_q \| \theta_d) = -\sum_w p(w|\theta_q) \log p(w|\theta_q) + \sum_w p(w|\theta_q) \log p(w|\theta_d)$$
The first term is a query-specific quantity, ignored for ranking. $\theta_q$ is the query language model, which needs to be estimated; $\theta_d$ is the document language model, which we know how to estimate.

Background knowledge: Kullback-Leibler divergence
A non-symmetric measure of the difference between two probability distributions P and Q:
$$KL(P \| Q) = \int P(x) \log \frac{P(x)}{Q(x)}\, dx$$
It measures the expected number of extra bits required to code samples from P when using a code based on Q. P usually refers to the "true" data distribution, Q to the "approximated" distribution.
Properties:
- Non-negative
- $KL(P \| Q) = 0$ iff $P = Q$ almost everywhere
These properties explain why $sim(q;d) \propto -KL(\theta_q \| \theta_d)$.
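For concreteness, a minimal sketch of KL divergence over discrete word distributions (the dict-based representation is an assumption for illustration):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions given as word -> probability dicts.
    Words with p(w) = 0 contribute nothing; q must assign nonzero mass
    wherever p does (in retrieval this is guaranteed by smoothing)."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)
```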

Kullback-Leibler (KL) divergence based retrieval model
Retrieval ≈ estimation of $\theta_q$ and $\theta_d$:
$$Rel(q;d) \propto \sum_{w \in d,\, p(w|\theta_q) > 0} p(w|\theta_q) \log \frac{p(w|d)}{\alpha_d\, p(w|C)} + \log \alpha_d$$
This is a generalized version of the query-likelihood language model, with the same smoothing strategy; $p(w|\theta_q)$ is simply the empirical distribution of words in the query.
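A sketch of this scoring formula, assuming Jelinek-Mercer smoothing so that $\alpha_d = \lambda$ (the smoothing choice, the default $\lambda$, and the dict-based representation are illustrative assumptions, not prescribed by the slide):

```python
import math

def kl_score(query_model, doc_tf, doc_len, collection_model, lam=0.1):
    """Rank-equivalent KL-divergence score with Jelinek-Mercer smoothing:
    p_s(w|d) = (1 - lam) * c(w,d)/|d| + lam * p(w|C), hence alpha_d = lam.
    Only words seen in the document contribute, matching the formula above."""
    score = math.log(lam)  # the log(alpha_d) term
    for w, p_wq in query_model.items():
        if p_wq > 0 and w in doc_tf:
            p_s = (1 - lam) * doc_tf[w] / doc_len + lam * collection_model[w]
            score += p_wq * math.log(p_s / (lam * collection_model[w]))
    return score
```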

Feedback as model interpolation
Key: estimate the feedback model $\theta_F$ from the feedback docs $F = \{d_1, d_2, \dots, d_n\}$ via a generative model, then interpolate it with the query model:
$$\theta_{q'} = (1-\alpha)\,\theta_q + \alpha\,\theta_F$$
- $\alpha = 0$: no feedback
- $\alpha = 1$: full feedback
Q: Is this Rocchio feedback in the vector space model? A: Very similar, but with different interpretations.
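A minimal sketch of the interpolation step, with language models represented as word → probability dicts (the representation and default $\alpha$ are assumptions for illustration):

```python
def interpolate(query_model, feedback_model, alpha=0.5):
    """Updated query model: (1 - alpha) * theta_q + alpha * theta_F.
    alpha = 0 recovers the original query; alpha = 1 trusts feedback fully."""
    vocab = set(query_model) | set(feedback_model)
    return {w: (1 - alpha) * query_model.get(w, 0.0)
               + alpha * feedback_model.get(w, 0.0)
            for w in vocab}
```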

Feedback in language models
Example: feedback documents for the running query "airport security" contain topical words such as "protect", "passengers", "accidental/malicious harm", "crime", "rules".

Generative mixture model of feedback
Assume each word in the feedback documents $F = \{d_1, \dots, d_n\}$ is generated either from the topic model $p(w|\theta_F)$ with probability $1-\lambda$, or from the background model $p(w|C)$ with probability $\lambda$:
$$\log p(F|\theta_F) = \sum_{d,w} c(w,d) \log\left[(1-\lambda)\, p(w|\theta_F) + \lambda\, p(w|C)\right]$$
$\lambda$ = noise ratio in the feedback documents. Maximum likelihood estimate:
$$\theta_F = \arg\max_\theta \log p(F|\theta)$$

How to estimate θF?
- Known background model $p(w|C)$, weight $\lambda = 0.7$: the 0.2, a 0.1, we 0.01, to 0.02, …, flight 0.0001, company 0.00005, …
- Unknown query topic model $p(w|\theta_F)$, weight $1-\lambda = 0.3$, for "airport security": accident = ?, regulation = ?, passenger = ?, rules = ?
Use a maximum-likelihood estimator on the feedback doc(s). If we knew the identity (background vs. topic) of each word, estimation would be easy; but we don't…

Appeal to the Expectation-Maximization (EM) algorithm
Identity ("hidden") variable for each word occurrence: $z_i \in \{1 \text{ (background)}, 0 \text{ (topic)}\}$, e.g., "the ($z=1$) paper presents a text mining algorithm …".
Suppose the parameters are all known — what is a reasonable guess of $z_i$?
- It depends on $\lambda$ (why?)
- It depends on $p(w|C)$ and $p(w|\theta_F)$ (how?)
Iterate between an E-step (guess the hidden variables given the current parameters) and an M-step (re-estimate the parameters given those guesses).
Why in Rocchio did we not distinguish a word's identity?
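A compact EM sketch for this mixture, with the standard E-step and M-step updates written out in comments (the dict-based representation, uniform initialization, and fixed iteration count are illustrative assumptions; $\lambda$ is held fixed, as on the slides):

```python
def estimate_feedback_model(counts, p_bg, lam=0.5, iters=50):
    """EM for log p(F) = sum_w c(w,F) log[(1-lam) p(w|theta_F) + lam p(w|C)].
    counts: word -> count over all feedback documents;
    p_bg: word -> p(w|C), assumed to cover every word in counts."""
    # Initialize theta_F uniformly over the feedback vocabulary.
    theta = {w: 1.0 / len(counts) for w in counts}
    for _ in range(iters):
        # E-step: posterior probability that an occurrence of w came from
        # the topic: (1-lam) p(w|theta_F) / [(1-lam) p(w|theta_F) + lam p(w|C)].
        z_topic = {w: (1 - lam) * theta[w] /
                      ((1 - lam) * theta[w] + lam * p_bg[w])
                   for w in counts}
        # M-step: re-estimate theta_F from the expected topic counts.
        total = sum(counts[w] * z_topic[w] for w in counts)
        theta = {w: counts[w] * z_topic[w] / total for w in counts}
    return theta
```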

A toy example of EM computation (assume $\lambda = 0.5$)
- Expectation step: augment the data by guessing the hidden variables.
- Maximization step: with the "augmented data", estimate the parameters using maximum likelihood.

Example of feedback query model
Query: "airport security"; pseudo feedback with the top 10 documents, comparing $\lambda = 0.7$ and $\lambda = 0.9$ (a higher noise ratio pushes common words out of the feedback model).
Open question: how do we handle negative feedback?

What you should know
- Purpose of relevance feedback
- Rocchio relevance feedback for vector space models
- Query-model-based feedback for language models

Today's reading
Chapter 9: Relevance feedback and query expansion
- 9.1 Relevance feedback and pseudo relevance feedback
- 9.2 Global methods for query reformulation