
Term Necessity Prediction P(t | Rq)
Le Zhao and Jamie Callan
Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Oct 27, CIKM

Main Points
– Necessity is as important as idf (theory)
– Explains behavior of IR models (practice)
– Can be predicted
– Performance gain

Definition of Necessity P(t | Rq)
– Directly calculated given relevance judgements for q: the fraction of the relevant documents of q that contain t.
– [Venn diagram residue: the collection, the relevant set Relevant(q), and the docs that contain t; in the example, P(t | Rq) = 0.4]
– Necessity == 1 − mismatch == term recall
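A minimal sketch (Python; not from the talk) of this computation, with documents represented as term sets purely for illustration:

    def term_necessity(term, relevant_docs):
        # P(t | R_q): fraction of judged-relevant documents containing the term.
        if not relevant_docs:
            return 0.0
        hits = sum(1 for doc in relevant_docs if term in doc)
        return hits / len(relevant_docs)

    # Toy example: the term appears in 2 of 5 relevant docs -> P(t | R_q) = 0.4
    docs = [{"party", "third"}, {"party"}, {"viability"}, {"third"}, {"congress"}]
    print(term_necessity("party", docs))  # 0.4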

Why Necessity? Roots in Probabilistic Models
Binary Independence Model [Robertson and Spärck Jones 1976]
– There it is called "Relevance Weight" or "Term Relevance"
– P(t | R) is effectively the only part of the model that is about relevance.
The BIM term weight decomposes into a necessity part and an idf part:

    w(t) = log [ P(t|R) / (1 − P(t|R)) ] + log [ (1 − P(t|¬R)) / P(t|¬R) ]
         =        necessity odds        +        idf (sufficiency)
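A sketch of that weight in code, using the usual +0.5 smoothing convention for BIM (the smoothing is a standard assumption, not taken from this talk):

    import math

    def rsj_weight(r, R, n, N):
        # r: relevant docs containing t    R: number of relevant docs
        # n: docs containing t             N: docs in the collection
        p = (r + 0.5) / (R + 1.0)          # P(t | R), term necessity
        q = (n - r + 0.5) / (N - R + 1.0)  # P(t | not R)
        necessity_odds = math.log(p / (1.0 - p))
        idf_part = math.log((1.0 - q) / q)  # ~ idf, the "sufficiency" part
        return necessity_odds + idf_part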

Without Necessity
The emphasis problem for idf-only term weighting:
– It emphasizes the high-idf terms in the query "prognosis/viability of a political third party in U.S." (Topic 206)

Ground Truth (TREC 4, topic 206)
[Table residue: true P(t | R) vs. idf and the resulting emphasis for the terms party, political, third, viability, prognosis; the numeric values were lost in transcription.]

Indri Top Results
1. (ZF) Recession concerns lead to a discouraging prognosis for …
2. (AP) Politics … party … Robertson's viability as a candidate
3. (WSJ) political parties …
4. (AP) there is no viable opposition …
5. (WSJ) A third of the votes
6. (WSJ) politics, party, two thirds
7. (AP) third ranking political movement …
8. (AP) political parties
9. (AP) prognosis for the Sunday school
10. (ZF) third party provider
(Google and Bing still have false positives in their top 10. Emphasis is also a problem for large search engines!)

Without Necessity
The emphasis problem for idf-only term weighting:
– Emphasizes high-idf terms in the query "prognosis/viability of a political third party in U.S." (Topic 206)
– False positives throughout the rank list, especially detrimental at top ranks
– Ignoring term recall hurts precision at all recall levels
– (This is true for BIM, and also for BM25 and language models, which use tf.)
How significant is the emphasis problem?

Failure Analysis of 44 Topics from TREC
RIA workshop 2003 (7 top research IR systems, >56 expert-weeks)
[Chart residue: the failure categories map onto fixes (necessity term weighting, necessity-guided expansion, plus bigrams and term restriction using doc fields), whose common basis is term necessity prediction.]

Given True Necessity
– +100% over BIM (in precision at all recall levels) [Robertson and Spärck Jones 1976]
– This work: +…% over Language Model and BM25 (in MAP); figure lost in transcription
For a new query without relevance judgements, necessity must be predicted.
– Predictions don't need to be very accurate to show a performance gain.

How Necessary are Words?
(Examples from TREC 3 topics. Table residue: each row pairs a query term with its true P(t | R); the values were lost in transcription.)
– Oil Spills
– Term limitations for US Congress members
– Insurance Coverage which pays for Long Term Care
– School Choice Voucher System and its effects on the US educational program
– Vitamin the cure or cause of human ailments

Mismatch Statistics
– Mismatch variation across terms [histogram residue: TREC 3 title queries and TREC 9 desc queries]
– Not constant, so prediction is needed

Mismatch Statistics (2)
– Mismatch variation for the same term in different queries [figure residue: TREC 3 recurring words]
– Query-dependent features are needed (1/3 of term occurrences have necessity variation > 0.1)

Prior Prediction Approaches
– Croft/Harper combination match (1979): treats P(t | R) as a tuned constant; when > 0.5, rewards docs that match more query terms
– Greiff's (1998) exploratory data analysis: used idf to predict the overall term weight; improved over BIM
– Metzler's (2008) generalized idf: used idf to predict P(t | R); improved over BIM
Years of the simple idf feature, with limited success.
– Missing piece: P(t | R) = term necessity = term recall

Factors that Affect Necessity
What causes a query term to not appear in relevant documents?
– Topic centrality (concept necessity): e.g., "Laser research related or potentially related to US defense", "Welfare laws propounded as reforms"
– Synonyms: e.g., movie == film == …
– Abstractness: e.g., "ailments" in the vitamin query, "Dog Maulings", "Christian Fundamentalism"
– The worst case is a rare & abstract term, e.g. "prognosis"

Features
We need to identify synonyms/searchonyms of a query term, in a query-dependent way.
Use thesauri?
– Biased (not collection dependent)
– Static (not query dependent)
– Not promising, not easy
Instead: term-term similarity in concept space!
– Local LSI (Latent Semantic Indexing): run LSI on the (e.g. 200) top-ranked documents and keep (e.g. 150) dimensions (see the sketch after the next slide)

Features
– Topic centrality: length of the term vector after dimension reduction (local LSI)
– Synonymy (concept necessity): average similarity score of the top 5 most similar terms
– Replaceability: adjusts the synonymy measure by how many new documents the synonyms match
– Abstractness: users modify abstract terms with concrete ones, e.g. "effects on the US educational program", "prognosis of a political third party"
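Below is a minimal sketch (Python/NumPy; not the authors' code) of the local-LSI machinery behind the centrality and synonymy features. The term-document matrix construction, the 150-dimension cut-off, and all function names are illustrative assumptions.

    import numpy as np

    def local_lsi_term_vectors(X, k=150):
        # X: term-document count matrix over the (e.g.) 200 top-ranked
        # documents for the query; one row per vocabulary term.
        U, s, _ = np.linalg.svd(X, full_matrices=False)
        k = min(k, len(s))
        return U[:, :k] * s[:k]   # one concept-space vector per term

    def cosine(a, b):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom else 0.0

    def centrality(term_vecs, i):
        # Topic centrality: length of term i's vector after dimension reduction.
        return float(np.linalg.norm(term_vecs[i]))

    def synonymy(term_vecs, i, top=5):
        # Average similarity of the top-5 terms most similar to term i.
        sims = sorted((cosine(term_vecs[i], term_vecs[j])
                       for j in range(len(term_vecs)) if j != i), reverse=True)
        return float(np.mean(sims[:top]))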

Experiments
Necessity prediction error
– A regression problem. Model: RBF kernel regression, mapping features → P(t | R)
Necessity for term weighting
– End-to-end retrieval performance
– How to weight terms by their necessity: in BM25, via the Binary Independence Model weight; in language models, via the relevance model Pm(t | R), a multinomial (Lavrenko and Croft 2001)
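A minimal training sketch, assuming scikit-learn and using kernel ridge regression with an RBF kernel as a stand-in for the slide's RBF kernel regression; the feature values and hyperparameters are placeholders, not tuned values from the paper.

    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    # One row of features per (query, term) pair, e.g.
    # [centrality, synonymy, replaceability, abstractness].
    X_train = np.array([[0.80, 0.30, 0.20, 0.10],
                        [0.40, 0.60, 0.50, 0.90]])
    y_train = np.array([0.92, 0.35])   # true P(t | R) from relevance judgements

    model = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5)
    model.fit(X_train, y_train)

    X_test = np.array([[0.60, 0.40, 0.30, 0.50]])
    pred = np.clip(model.predict(X_test), 0.0, 1.0)  # keep predictions in [0, 1]
    print(pred)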

Necessity Prediction Example (trained on TREC 3, tested on TREC 4)
[Table residue: true P(t | R) vs. predicted P(t | R) and the resulting emphasis for the terms party, political, third, viability, prognosis; values lost in transcription.]

Necessity Prediction Error
[Chart residue: L1 loss of necessity prediction across methods; the lower the better. Values lost in transcription.]

Predicted Necessity Weighting
[Table residue: LM desc baseline vs. LM desc with necessity weighting, across TREC train and test/cross-validation sets; the baseline and necessity MAP values were lost in transcription. Improvement row: 26.38%, 23.52%, 20.33%, 21.32%.]
– 20%+ MAP gain from necessity weighting
– 10–20% gain in top precision

Predicted Necessity Weighting (ctd.)
[Table residue: LM desc baseline vs. LM desc with necessity weighting on further TREC train and test/cross-validation sets; MAP values lost in transcription. Improvement row: 11.43%, 11.25%, 149.8%, 24.82%.]

vs. Relevance Model
[Table residue: Relevance Model desc vs. RM reweight-only desc vs. RM reweight-trained desc; values lost in transcription.]
– Weighting only ≈ expansion
– Supervised > unsupervised (by 5–10%)
Relevance Model reweighting as an Indri query:

    #weight( (1−λ) #combine( t1 t2 )
             λ #weight( w1 t1 w2 t2 w3 t3 … ) )

where w1 ~ P(t1 | R), w2 ~ P(t2 | R), …
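As a usage illustration (a hypothetical Python helper, not from the talk), predicted necessities can be rendered into that Indri query form:

    def necessity_query(terms, weights, lam=0.5):
        # terms: query terms; weights: predicted P(t | R) per term.
        original = "#combine( " + " ".join(terms) + " )"
        reweighted = "#weight( " + " ".join(
            f"{w:.2f} {t}" for t, w in zip(terms, weights)) + " )"
        return f"#weight( {1 - lam:.2f} {original} {lam:.2f} {reweighted} )"

    print(necessity_query(["third", "party", "viability"], [0.90, 0.85, 0.30]))
    # #weight( 0.50 #combine( third party viability )
    #          0.50 #weight( 0.90 third 0.85 party 0.30 viability ) )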

Take Home Messages
– Necessity is as important as idf (theory)
– Explains behavior of IR models (practice)
– Effective features can predict necessity
– Performance gain

Acknowledgements
– Reviewers from multiple venues
– Ni Lao, Frank Lin, Yiming Yang, Stephen Robertson, Bruce Croft, Matthew Lease: discussions & references
– David Fisher, Mark Hoy: maintaining the Lemur toolkit
– Andrea Bastoni and Lorenzo Clemente: maintaining the LSI code for the Lemur toolkit
– SVM-light, Stanford parser
– TREC: all the data
– NSF Grant IIS and IIS
Feedback: Le Zhao