SIMS 202: Information Organization and Retrieval
Prof. Marti Hearst and Prof. Ray Larson
UC Berkeley SIMS, Tues/Thurs 9:30-11:00am, Fall 2000

Today
- Relevance Feedback (RF)
  - Review the Rocchio formula
  - Evaluation of an RF interface
  - (this is an intro to user interfaces in IR)
  - The Guides System

[Slide diagram: the standard IR pipeline. An information need is expressed as a text query and parsed; the document collections are pre-processed and indexed; the ranking step matches the query against the index.]

[Slide diagram: the same IR pipeline, extended with a query modification step that feeds a revised query back into ranking.]

Problem: how to reformulate the query?
- Thesaurus expansion: suggest terms similar to query terms
- Relevance feedback: suggest terms (and documents) similar to retrieved documents that have been judged to be relevant

Relevance Feedback
Also known as:
- query modification
- "more like this"

Relevance Feedback
- Main idea: modify the existing query based on relevance judgements
  - Extract terms from relevant documents and add them to the query
  - and/or re-weight the terms already in the query
- Two main approaches:
  - Automatic (pseudo-relevance feedback)
  - Users select relevant documents
- Users/system select terms from an automatically-generated list

Relevance Feedback
- Usually do both:
  - expand the query with new terms
  - re-weight terms in the query
- There are many variations
  - usually positive weights for terms from relevant docs
  - sometimes negative weights for terms from non-relevant docs

Rocchio Method
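(The formula on this slide appears to have been an image and did not survive the transcript. For reference, the standard textbook form of the Rocchio update is sketched below, with alpha, beta, gamma as tuning weights, D_r the judged-relevant documents, and D_nr the judged non-relevant ones; the exact notation on the original slide may have differed.)

```latex
\vec{q}_{\text{new}} \;=\; \alpha\,\vec{q}_{\text{orig}}
  \;+\; \frac{\beta}{|D_r|}\sum_{\vec{d}_j \in D_r} \vec{d}_j
  \;-\; \frac{\gamma}{|D_{nr}|}\sum_{\vec{d}_j \in D_{nr}} \vec{d}_j
```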

- Rocchio automatically:
  - re-weights terms
  - adds in new terms (from relevant docs)
    - have to be careful when using negative terms
    - Rocchio is not a machine learning algorithm
- Most methods perform similarly
  - results heavily dependent on test collection
- Machine learning methods are proving to work better than standard IR approaches like Rocchio
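To make the re-weighting and term-addition concrete, here is a minimal Python sketch of a Rocchio-style update over sparse term-weight dictionaries. The parameter values and the dropping of negative weights are illustrative choices, not something specified on these slides.

```python
from collections import defaultdict

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """New query = alpha*q + beta*centroid(relevant) - gamma*centroid(nonrelevant).

    The query and each document are dicts mapping term -> weight.
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)
    for doc in nonrelevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(nonrelevant)
    # Negative weights are where you "have to be careful"; a common choice is to drop them.
    return {t: w for t, w in new_query.items() if w > 0}

# Expand a one-term query using a single judged-relevant document.
print(rocchio({"recall": 1.0},
              relevant=[{"recall": 0.8, "automobile": 0.6, "defect": 0.5}],
              nonrelevant=[]))
```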

Using Relevance Feedback
- Known to improve results
  - in TREC-like conditions (no user involved)
- What about with a user in the loop?
  - How might you measure this?
  - Let's examine a user study of relevance feedback by Koenemann & Belkin, 1996.

Questions Being Investigated (Koenemann & Belkin 96)
- How well do users work with statistical ranking on full text?
- Does relevance feedback improve results?
- Is user control over the operation of relevance feedback helpful?
- How do different levels of user control affect results?

How much of the guts should the user see?
- Opaque (black box)
  - (like web search engines)
- Transparent
  - (see available terms after the r.f.)
- Penetrable
  - (see suggested terms before the r.f.)
- Which do you think worked best?

[Screenshot: terms available for relevance feedback made visible (from Koenemann & Belkin)]

Details on User Study (Koenemann & Belkin 96)
- Subjects have a tutorial session to learn the system
- Their goal is to keep modifying the query until they've developed one that gets high precision
- This is an example of a routing query (as opposed to ad hoc)
- Reweighting:
  - They did not reweight query terms
  - Instead, only term expansion:
    - pool all terms in the relevant docs
    - take the top n terms, where n = 3 + (number of marked-relevant docs * 2)
    - (the more marked docs, the more terms added to the query)
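A quick sketch of the term-expansion rule just described, n = 3 + 2 * (number of marked-relevant documents). The function and the list-of-terms document representation below are illustrative, not taken from the study itself.

```python
from collections import Counter

def expansion_terms(marked_relevant_docs):
    """Pool all terms from the marked-relevant documents and keep the top n,
    where n = 3 + 2 * (number of marked documents)."""
    n = 3 + 2 * len(marked_relevant_docs)
    pooled = Counter()
    for doc_terms in marked_relevant_docs:  # each document is a list of terms
        pooled.update(doc_terms)
    return [term for term, _ in pooled.most_common(n)]

# Marking 2 documents relevant allows up to 3 + 2*2 = 7 expansion terms.
print(expansion_terms([["tobacco", "ads", "teens"],
                       ["tobacco", "camel", "teens"]]))
```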

Details on User Study (Koenemann & Belkin 96)
- 64 novice searchers
  - 43 female, 21 male, native English speakers
- TREC test bed
  - Wall Street Journal subset
- Two search topics
  - Automobile Recalls
  - Tobacco Advertising and the Young
- Relevance judgements from TREC and the experimenter
- System was INQUERY (vector space with some bells and whistles)

[Slide: sample TREC query]

Evaluation
- Precision at 30 documents
- Baseline (Trial 1)
  - How well does the initial search go?
  - One topic has more relevant docs than the other
- Experimental condition (Trial 2)
  - Subjects get a tutorial on relevance feedback
  - Modify the query in one of four modes:
    - no r.f., opaque, transparent, penetrable
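Precision at 30 is simply the fraction of the top 30 ranked documents that are judged relevant. A minimal sketch (the function name and arguments are mine, not from the study):

```python
def precision_at_k(ranked_doc_ids, relevant_ids, k=30):
    """Fraction of the top-k retrieved documents that are judged relevant."""
    top_k = ranked_doc_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k

# 12 relevant documents among the top 30 retrieved gives P@30 = 0.4.
print(precision_at_k(list(range(30)), relevant_ids=set(range(12))))
```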

[Chart: precision vs. RF condition (from Koenemann & Belkin 96)]

Effectiveness Results
- Subjects with R.F. achieved 17-34% better performance than those without R.F.
- Subjects in the penetrable case did 15% better as a group than those in the opaque and transparent cases.

[Chart: number of iterations in formulating queries (from Koenemann & Belkin 96)]

[Chart: number of terms in created queries (from Koenemann & Belkin 96)]

Behavior Results
- Search times approximately equal
- Precision increased in the first few iterations
- The penetrable case required fewer iterations to make a good query than the transparent and opaque cases
- R.F. queries were much longer
  - but had fewer terms in the penetrable case: users were more selective about which terms were added

The Guides System
- An innovative search interface
  - Character-oriented relevance feedback
  - Note the use of the vector space model:
    - Each character gets a vector
    - The user's activity defines the user's vector
    - The character whose vector is closest at any time becomes the active one
  - (Show video)
  - Developed at Apple in the 1980s by Abbe Don, Tim Oren, and Brenda Laurel
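A minimal sketch of the "closest vector becomes the active character" idea, using cosine similarity over sparse term vectors. The guide names and vectors here are invented for illustration; they are not from the Guides system itself.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def active_guide(user_vector, guide_vectors):
    """Return the guide character whose vector is closest to the user's activity vector."""
    return max(guide_vectors, key=lambda name: cosine(user_vector, guide_vectors[name]))

guides = {
    "settler": {"land": 0.9, "farming": 0.7},
    "scout": {"trail": 0.8, "river": 0.6},
}
print(active_guide({"river": 1.0, "trail": 0.4}, guides))  # -> "scout"
```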

Relevance Feedback Summary
- Iterative query modification can improve precision and recall for a standing query
- In at least one study, users were able to make good choices by seeing which terms were suggested for R.F. and selecting among them
- So … "more like this" can be useful!