Retrieval Utilities Relevance feedback Clustering

Presentation transcript:

Retrieval Utilities: relevance feedback, clustering, passage-based retrieval, parsing, n-grams, thesauri, semantic networks, regression analysis.

Relevance Feedback: Retrieval is done in multiple steps. The user refines the query at each step with respect to the results of the previous queries, telling the IR system which documents are relevant. New terms are added to the query based on this feedback, and the weights of existing terms may be updated.

Relevance Feedback: The user can be bypassed by assuming that the top-k results in the ranked list are relevant (often called pseudo-relevance feedback). The original query is then modified in the same way as with explicit user feedback.
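A minimal sketch of this idea, assuming documents are lists of tokens and the query is a mapping from term to weight. The function name, the `n_new_terms` cutoff, and the weighting of new terms by their mean frequency in the top-k documents are illustrative assumptions, not something the slides specify.

```python
from collections import Counter


def pseudo_relevance_feedback(query, ranked_docs, k=2, n_new_terms=1):
    """Expand `query` (term -> weight) using the top-k ranked documents.

    The top-k documents are treated as relevant without asking the user.
    Each document is a list of tokens.
    """
    # Count term occurrences across the assumed-relevant documents.
    counts = Counter()
    for doc in ranked_docs[:k]:
        counts.update(doc)

    expanded = dict(query)
    # Candidate expansion terms: most frequent terms not already in the query.
    candidates = [(t, c) for t, c in counts.most_common() if t not in query]
    for term, count in candidates[:n_new_terms]:
        # Assumed weighting: mean frequency of the term in the top-k documents.
        expanded[term] = count / k
    return expanded


ranked = [
    ["kennedy", "oswald", "dallas"],
    ["oswald", "conspiracy", "kennedy"],
    ["weather", "report"],
]
query = {"kennedy": 1.0, "conspiracy": 1.0}
expanded = pseudo_relevance_feedback(query, ranked, k=2, n_new_terms=1)
```

Only the first `k` documents contribute, so terms from lower-ranked documents never enter the query.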

Relevance Feedback Example: Take the query “find information surrounding the various conspiracy theories about the assassination of John F. Kennedy” (an example from the textbook). If a highly ranked document contains the term “Oswald”, that term should be added to the initial query. If the term “assassination” appears in the top-ranked document, its weight in the query should be increased.
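The two rules in this example can be sketched as follows. This is an illustration only: the function name, the `boost` increment, and the `new_weight` given to added terms are assumed values, and a real system would choose which terms to add far more selectively.

```python
def apply_feedback(query, top_doc_terms, boost=0.5, new_weight=1.0):
    """Update a query (term -> weight) from the terms of a top-ranked document.

    Terms already in the query get their weight increased by `boost`;
    terms not yet in the query are added with `new_weight`.
    """
    updated = dict(query)
    # Use a set so a term repeated in the document is only applied once.
    for term in sorted(set(top_doc_terms)):
        if term in updated:
            updated[term] += boost
        else:
            updated[term] = new_weight
    return updated


query = {"conspiracy": 1.0, "assassination": 1.0, "kennedy": 1.0}
updated = apply_feedback(query, ["oswald", "assassination"])
```

Here "oswald" enters the query as a new term, while "assassination" keeps its place with a higher weight.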

Relevance Feedback in Vector Space Model: Q is the original query, R is the set of relevant documents, and S is the set of irrelevant documents selected by the user, with |R| = n1 and |S| = n2.
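Under these definitions, a common way to write the modified query is to add the relevant document vectors to the original query and subtract the irrelevant ones. The form below is an assumption based on the standard vector-space formulation, since the transcript does not spell the formula out. The symbols Q' (the modified query) and R_i, S_i (the i-th document vectors in R and S) are notation introduced here for illustration.

```latex
Q' = Q + \sum_{i=1}^{n_1} R_i - \sum_{i=1}^{n_2} S_i
```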

Relevance Feedback in Vector Space Model: In general, the original query, the relevant documents, and the irrelevant documents are each given a weight in the modified query. These weights are referred to as Rocchio weights.
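A minimal sketch of a weighted (Rocchio-style) update, assuming the usual form alpha*Q + (beta/n1)*sum(R) - (gamma/n2)*sum(S) with vectors stored as term-to-weight dictionaries. The parameter names `alpha`, `beta`, `gamma`, their default values, and the clipping of negative weights to zero are assumptions for illustration, not values taken from the slides.

```python
def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return alpha*Q + (beta/n1)*sum(R) - (gamma/n2)*sum(S).

    `query` and every document are dicts mapping term -> weight.
    `relevant` and `irrelevant` are lists of such dicts (the sets R and S).
    """
    new_query = {term: alpha * w for term, w in query.items()}
    for docs, coeff in ((relevant, beta), (irrelevant, -gamma)):
        if not docs:
            continue  # an empty set contributes nothing
        for doc in docs:
            for term, w in doc.items():
                # Each set is averaged over its size (n1 or n2).
                new_query[term] = new_query.get(term, 0.0) + coeff * w / len(docs)
    # Assumed practical choice: negative weights are clipped to zero.
    return {term: max(w, 0.0) for term, w in new_query.items()}


q = {"a": 1.0}
R = [{"a": 1.0, "b": 2.0}, {"b": 2.0}]
S = [{"c": 1.0}]
result = rocchio(q, R, S, alpha=1.0, beta=1.0, gamma=1.0)
```

Setting `gamma=0` uses positive feedback only, which is a common simplification when irrelevant judgments are unreliable.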

Relevance Feedback in Vector Space Model: What if the original query retrieves only documents that the user judges non-relevant? In that case there are no relevant documents to draw new terms from, so the weight of the most frequently occurring term in the document collection is increased instead.
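A small sketch of this fallback, assuming the collection is a list of tokenized documents and that "most frequent" means the highest total count across the collection. The function name and the size of `boost` are hypothetical, since the slides do not say how much weight to add.

```python
from collections import Counter


def boost_most_frequent_term(query, collection, boost=1.0):
    """Increase the query weight of the collection's most frequent term.

    `query` maps term -> weight; `collection` is a list of token lists.
    """
    counts = Counter(term for doc in collection for term in doc)
    if not counts:
        return dict(query)  # empty collection: nothing to boost
    term, _ = counts.most_common(1)[0]
    updated = dict(query)
    updated[term] = updated.get(term, 0.0) + boost
    return updated


collection = [["a", "b"], ["b", "c"], ["b"]]
updated = boost_most_frequent_term({"a": 1.0}, collection)
```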

Relevance Feedback in Vector Space Model: Result-set clustering can be used as a utility for relevance feedback. Hierarchical clustering can be used for this purpose, with the distance between documents defined by their cosine similarity.
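One way to sketch result-set clustering is bottom-up (agglomerative) hierarchical clustering that repeatedly merges the two most similar clusters. The average-link merge criterion, the stopping rule based on `num_clusters`, and the function names are assumptions made for this illustration. The slides only state that hierarchical clustering with cosine similarity can be used.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors (term -> weight dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_results(docs, num_clusters):
    """Average-link agglomerative clustering of a result set.

    Returns a list of clusters, each a list of indices into `docs`.
    """
    clusters = [[i] for i in range(len(docs))]

    def avg_similarity(a, b):
        total = sum(cosine_similarity(docs[i], docs[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > max(num_clusters, 1):
        # Find the pair of clusters with the highest average similarity.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = avg_similarity(clusters[x], clusters[y])
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


results = [
    {"kennedy": 1.0, "oswald": 1.0},
    {"kennedy": 1.0, "oswald": 2.0},
    {"weather": 1.0},
    {"weather": 2.0, "rain": 1.0},
]
groups = cluster_results(results, num_clusters=2)
```

The user could then mark a whole cluster as relevant or irrelevant, and the documents in it would feed the query update in place of individually judged documents.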