Expanding Query Terms in Context
Chris Staff and Robert Muscat
Department of Computer Science & AI, University of Malta
CSAW’04, University of Malta
© 2004 Chris Staff

Aims of this presentation
Background – the Vocabulary Problem in IR
Scenario – using retrieved documents to determine how to expand the query
Approach
Evaluation

The Vocabulary Problem
Furnas et al. (1987) find that any two people describe the same concept/object using the same term with a probability of less than 0.2.
This is a huge problem for IR:
– High probability of finding some documents about your term (but watch ambiguous terms!)
– Low probability of finding all documents about your concept (so low ‘coverage’)

What’s Query Expansion?
Adding terms to a query to improve recall while keeping precision high.
Recall is 1 when all relevant documents are retrieved.
Precision is 1 when all retrieved documents are relevant.
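A minimal sketch of the two measures defined on this slide, assuming we have the set of retrieved document ids and a set of relevance judgements for the query; the function name and document ids are illustrative.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for a single query.

    retrieved -- set of document ids returned by the system
    relevant  -- set of document ids judged relevant to the query
    """
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# 3 of the 4 retrieved documents are relevant; 3 of the 5 relevant documents were found.
print(precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d3", "d7", "d9"}))  # (0.75, 0.6)
```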

What’s Query Expansion?
Attempts to improve recall (by adding synonyms) usually involve a constructed thesaurus (Qiu et al., 1995; Mandala et al., 1999; Voorhees, 1994).
Attempts to improve precision (by adding restricting terms) are now based around automatic relevance feedback (e.g., Mitra et al., 1998).
Indiscriminate query expansion can lead to a loss of precision (Voorhees, 1994) or hurt recall.

Scenario
Two users search for information related to the same concept C.
User queries Q1 and Q2 have no terms in common.
R1 and R2 are the result sets of Q1 and Q2 respectively.
R_common = R1 ∩ R2

Scenario
We assume that R_common is small and non-empty (Furnas, 1985; Furnas et al., 1987).
If R_common were large, then Q1 and Q2 would both retrieve the same set of documents.
We can determine (using WordNet) whether any term in Q1 is a synonym of a term in Q2.
– Some document Dk in R_common probably includes both terms (because of the way Web IR works)!
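To make the scenario concrete, here is a rough sketch using NLTK's WordNet interface as a stand-in for whatever WordNet access the system actually uses: R_common is the intersection of the two result sets, and a Q1/Q2 term pair is treated as synonymous if the two terms share a synset. The function names are illustrative.

```python
# Requires: pip install nltk, then nltk.download('wordnet')
from nltk.corpus import wordnet as wn

def common_results(r1, r2):
    """R_common: documents retrieved by both queries."""
    return set(r1) & set(r2)

def synonym_pairs(q1_terms, q2_terms):
    """(t1, t2, synset) triples where a term from Q1 and a term from Q2
    share at least one WordNet synset."""
    pairs = []
    for t1 in q1_terms:
        for t2 in q2_terms:
            for synset in set(wn.synsets(t1)) & set(wn.synsets(t2)):
                pairs.append((t1, t2, synset))
    return pairs

# 'car' and 'automobile' share the synset car.n.01, so that pair is reported.
print(synonym_pairs(["car", "insurance"], ["automobile", "cover"]))
```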

Scenario
If t1 in Q1 and t2 in Q2 are synonyms:
– We can expand either term in future queries containing t1 or t2,
– as long as document Dk appears in the results set (the context).

Approach
‘Learning’ synonyms in context
Query Expansion

‘Learning’ Synonyms in Context
A document is associated with a “bag of words” containing every term ever used to retrieve it.
A (term, document) pair is associated with a synset for the term in the context of the document.
– The word sense from WordNet is also recorded, to reduce ambiguity.
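A possible in-memory representation of the two associations described on this slide; the dictionary names and the (word category, sense number) encoding of a synset are assumptions for illustration, not taken from the paper.

```python
from collections import defaultdict

bag_of_words = defaultdict(set)  # doc id -> every query term ever used to retrieve the document
term_doc_synset = {}             # (term, doc id) -> (word category, sense number), i.e. a WordNet synset

def record_retrieval(query_terms, doc_id, sense_for_term):
    """Update both associations after a document is retrieved by a query.

    sense_for_term maps each query term to the WordNet (category, sense)
    chosen for it in the context of this document.
    """
    for term in query_terms:
        bag_of_words[doc_id].add(term)
        if term in sense_for_term:
            term_doc_synset[(term, doc_id)] = sense_for_term[term]

record_retrieval(["car", "insurance"], "doc42", {"car": ("noun", 1)})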

Query Expansion in Context
Submit the unexpanded original user query Q to obtain the results set R.
For each document Dk in R (k is the rank), retrieve the synsets stored for the terms in Q.
The same query term in the context of different documents in R may yield inconsistent synsets.
– This is countered using Inverse Document Relevance.
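A hedged sketch of this lookup step: given the ranked results of the unexpanded query, collect the stored synset of each query term in the context of each ranked document. The store is the term_doc_synset mapping sketched earlier; none of the names here are prescribed by the paper.

```python
def synsets_in_context(query_terms, ranked_results, term_doc_synset):
    """For each document D_k in R (k is the rank), look up the synset recorded
    for every query term in the context of that document."""
    per_doc = {}
    for k, doc_id in enumerate(ranked_results, start=1):
        per_doc[(k, doc_id)] = {
            term: term_doc_synset.get((term, doc_id))  # None if never seen in this context
            for term in query_terms
        }
    return per_doc
```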

Inverse Document Relevance
IDR is the relative frequency with which document d is retrieved when term q occurs in the query.
IDR_{q,d} = W_{q,d} / W_{d}
(where W_{d} is the number of times d has been retrieved, and W_{q,d} is the number of times d has been retrieved when q occurs in the query)
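The same formula as plain Python, assuming the counts W_d and W_{q,d} are kept as dictionaries by the learning step; the argument names are illustrative.

```python
def idr(q, d, times_retrieved, times_retrieved_with_term):
    """IDR_{q,d} = W_{q,d} / W_{d}."""
    w_d = times_retrieved.get(d, 0)                  # W_d: how often d has been retrieved
    w_qd = times_retrieved_with_term.get((q, d), 0)  # W_{q,d}: how often d was retrieved with q in the query
    return w_qd / w_d if w_d else 0.0
```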

Term Document Relevance
We then re-rank the documents in R based on their TDR.
TDR_{q,d,k} = IDR_{q,d} × W_{q,d,k} / W_{d,k}
The synsets of the top-10 re-ranked documents are merged according to word category and sense.
The synset of the most frequently occurring (word category, word sense) pair is used to expand q in the query.
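A sketch of the re-ranking and merging step under the same assumptions: W_{q,d,k} and W_{d,k} are per-rank counts kept alongside the totals used for IDR, and the winning (word category, word sense) pair selects the synset used to expand q. The helper names are illustrative, not from the paper.

```python
from collections import Counter

def tdr(idr_value, w_qdk, w_dk):
    """TDR_{q,d,k} = IDR_{q,d} * W_{q,d,k} / W_{d,k}."""
    return idr_value * (w_qdk / w_dk) if w_dk else 0.0

def expansion_synset(q, reranked_docs, term_doc_synset, top_n=10):
    """Merge the synsets of q over the top-N re-ranked documents and return
    the most frequently occurring (word category, word sense) pair."""
    votes = Counter(
        term_doc_synset[(q, d)]
        for d in reranked_docs[:top_n]
        if (q, d) in term_doc_synset
    )
    return votes.most_common(1)[0][0] if votes else None
```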

Evaluation
Ideally, we need a huge query log with relevance judgements for the queries.
We have the TREC QA collections, but we’ll need to index them before running the test queries through them (using, e.g., SMART).
– Disadvantage: there might not be enough queries.
User studies

Thank you!