Focused Relevance Feedback
Timothy Chappell (t.chappell@connect.qut.edu.au)
Supervisor: Shlomo Geva
Introduction
Query-based Information Retrieval
The standard search engine paradigm
Based on a model of users searching for documents by predicting patterns of text
Requires users to have a fairly good idea of what they are looking for
Relevance Feedback
Starts with a normal search query
The user examines the results returned and provides the search engine with feedback
The search engine uses the feedback to learn more about what the user is looking for
The search engine modifies the user's query to return more relevant results
Relevance Feedback: document-level approach
1/6 – User enters a query; the search engine produces a ranked list of results that match the query
2/6 – Search engine presents the top-ranking results to the user
3/6 – User looks at the results and marks them as relevant/not relevant
4/6 – Search engine utilises the feedback, finding other documents that are similar to the relevant results
5/6 – Search engine reranks the remaining documents with the relevance information
6/6 – The new top-ranking results are presented to the user
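As a rough illustration only, the Java sketch below mirrors the six steps above; the SearchEngine and Result types and the judge callback are hypothetical placeholders, not part of the evaluation platform described later.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical SearchEngine/Result types standing in for a real engine.
interface SearchEngine {
    List<Result> search(String query, int topK);                       // steps 1-2
    List<Result> rerank(String query, List<Result> relevant,
                        List<Result> notRelevant);                     // steps 4-6
}

record Result(String documentId, double score) {}

class FeedbackLoop {
    // One round of document-level feedback. The 'judge' callback plays the
    // role of the user marking results as relevant/not relevant (step 3).
    static List<Result> oneRound(SearchEngine engine, String query,
                                 Predicate<Result> judge) {
        List<Result> top = engine.search(query, 10);
        List<Result> relevant = top.stream().filter(judge).toList();
        List<Result> notRelevant = top.stream().filter(r -> !judge.test(r)).toList();
        return engine.rerank(query, relevant, notRelevant);
    }
}
```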
Focused Relevance Feedback
Research aimed at determining whether relevance feedback approaches are effective when applied at a higher resolution
Users provide feedback in terms of relevant passages, not documents
Methodology
Evaluation
The standard method for evaluating the effectiveness of ranking systems is average precision over recall
Recall: the percentage of relevant documents that are returned
Precision: the percentage of returned documents that are relevant
Ruthven and Lalmas, 2003 [5]
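As a concrete illustration of the metric, this minimal sketch computes average precision from a ranked list and a set of relevance judgements; the document ids are invented for the example.

```java
import java.util.List;
import java.util.Set;

// Precision at rank k is the fraction of the top k results that are relevant;
// recall is the fraction of all relevant documents retrieved so far. Average
// precision averages the precision values at the ranks where relevant
// documents appear.
class AveragePrecision {
    static double averagePrecision(List<String> ranking, Set<String> relevant) {
        double sum = 0.0;
        int found = 0;
        for (int i = 0; i < ranking.size(); i++) {
            if (relevant.contains(ranking.get(i))) {
                found++;
                sum += (double) found / (i + 1);   // precision at this rank
            }
        }
        return relevant.isEmpty() ? 0.0 : sum / relevant.size();
    }

    public static void main(String[] args) {
        // Relevant documents appear at ranks 1 and 3, so AP = (1/1 + 2/3) / 3 ≈ 0.556
        System.out.println(averagePrecision(
                List.of("d3", "d7", "d1", "d9"), Set.of("d3", "d1", "d5")));
    }
}
```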
Test Collection
A test collection that approximates 'real-world' data is needed
The Wikipedia XML Corpus, a collection of documents from Wikipedia converted to XML, was used
Denoyer and Gallinari, 2006 [1]
Relevance Assessments
Assessment data from the INEX 2008 Ad Hoc track is used to provide relevance information
The data consists of segments of text that users have identified as relevant to particular topics
Used both for simulating user feedback for the focused relevance feedback algorithms and for evaluating their performance
Kamps, Geva, Trotman, Woodley and Koolen, 2009 [3]
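One way such assessments could be represented when simulating passage-level feedback is sketched below; the record fields and the overlap rule are assumptions for illustration, not the actual INEX assessment format.

```java
// Hypothetical representation of one assessed relevant passage; the real INEX
// assessment format differs, this is only for illustration.
record PassageAssessment(String topicId, String documentId, int begin, int end) {
    // A returned passage counts as (simulated) relevant feedback if it
    // overlaps an assessed relevant passage in the same document.
    boolean overlaps(String docId, int passageBegin, int passageEnd) {
        return documentId.equals(docId) && passageBegin < end && passageEnd > begin;
    }
}
```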
Evaluation Platform
A consistent platform is needed for evaluating different relevance feedback algorithms
Written in C and Java
Relevance feedback algorithms are developed as plugins for the evaluation platform (as dynamic libraries)
Will be used as the basis for a new track in INEX 2010
Evaluation Platform
[Diagram: document collection, assessments and relevance feedback algorithm connected to the evaluation platform]
The evaluation platform provides a set of documents and a topic/query to the relevance feedback algorithm
Simulates a user interacting with the system
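The sketch below shows what the contract between the platform and a plugin could look like; the interface and method names are illustrative assumptions, since the real plugins are loaded as dynamic libraries with their own API.

```java
import java.util.List;

// Illustrative plugin contract; these method names are assumptions, not the
// platform's real API.
interface RelevanceFeedbackPlugin {
    // Called once per topic with the query and the candidate documents.
    void beginTopic(String topicId, String query, List<String> documentIds);

    // The plugin proposes its next result; the platform then answers with the
    // simulated user's relevance judgement for that document.
    String nextResult();
    void feedback(String documentId, boolean relevant);
}
```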
Evaluation Platform
The plugins are evaluated side by side over a run of 20 different topics
For each result returned by the relevance feedback algorithm, a line of a TREC or INEX run is output
trec_eval, inex_eval or internal metrics can be used to evaluate algorithm performance
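For reference, a standard TREC run line has the form "topic Q0 document-id rank score run-tag"; the snippet below shows one way such a line could be emitted (the topic and document ids are invented for the example).

```java
// Emits one line in the standard TREC run format:
//   <topic> Q0 <document-id> <rank> <score> <run-tag>
class RunWriter {
    static String trecLine(String topicId, String documentId,
                           int rank, double score, String runTag) {
        return String.format("%s Q0 %s %d %.4f %s",
                             topicId, documentId, rank, score, runTag);
    }

    public static void main(String[] args) {
        // Invented topic and document ids, for illustration only.
        System.out.println(trecLine("544", "WP1234567", 1, 12.3456, "rocchio-feedback"));
    }
}
```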
Relevance Feedback algorithms
Written in Java using the Apache Lucene search engine as a base
The most effective algorithm tested was based on the Rocchio relevance feedback approach
Tested against the University of Waterloo's Okapi BM25 run from INEX 2008, BICER, which was the best-performing in-context engine
Jakarta, 2004 [2]; Rocchio, 1971 [4]
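The Rocchio approach modifies the query vector by adding a weighted average of the relevant documents and subtracting a weighted average of the non-relevant ones: q' = αq + (β/|R|)Σd_r − (γ/|N|)Σd_n. The sketch below applies this update to term-weight maps; the weight parameters and the pruning of negative weights are illustrative choices, not the settings used in the experiments reported here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Rocchio query modification over term-weight vectors:
//   q' = alpha*q + (beta/|R|) * sum(relevant) - (gamma/|N|) * sum(nonRelevant)
class Rocchio {
    static Map<String, Double> modify(Map<String, Double> query,
                                      List<Map<String, Double>> relevant,
                                      List<Map<String, Double>> nonRelevant,
                                      double alpha, double beta, double gamma) {
        Map<String, Double> modified = new HashMap<>();
        query.forEach((term, w) -> modified.merge(term, alpha * w, Double::sum));
        for (Map<String, Double> doc : relevant)
            doc.forEach((term, w) ->
                    modified.merge(term, beta * w / relevant.size(), Double::sum));
        for (Map<String, Double> doc : nonRelevant)
            doc.forEach((term, w) ->
                    modified.merge(term, -gamma * w / nonRelevant.size(), Double::sum));
        modified.values().removeIf(w -> w <= 0);   // negative weights are usually dropped
        return modified;
    }
}
```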
Relevance Feedback algorithms

Recall               Lucene    Rocchio   BICER
0%                   68.32%    75.62%    95.15%
10%                  39.48%    70.42%    62.65%
20%                  35.95%    70.42%    51.06%
30%                  34.29%    69.81%    45.38%
40%                  32.52%    61.55%    41.84%
50%                  32.13%    54.73%    39.63%
60%                  31.72%    51.13%    38.27%
70%                  31.24%    47.14%    35.97%
80%                  30.38%    41.91%    30.60%
90%                  26.98%    35.28%    22.43%
100%                 21.43%    21.78%    2.70%

Average precision    30.34%    51.22%    38.96%
R-Precision          24.32%    49.89%    38.82%
Relevance Feedback algorithms
References
[1] L. Denoyer and P. Gallinari. The Wikipedia XML Corpus. In SIGIR Forum, 2006.
[2] Apache Jakarta Project. Apache Lucene: a high-performance, full-featured text search engine library, 2004.
[3] J. Kamps, S. Geva, A. Trotman, A. Woodley and M. Koolen. Overview of the INEX 2008 Ad Hoc Track. In Advances in Focused Retrieval, pages 1-28. Springer, 2009.
[4] J. J. Rocchio. Relevance feedback in information retrieval. In The SMART Retrieval System: Experiments in Automatic Document Processing, pages 313-323. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[5] I. Ruthven and M. Lalmas. A survey on the use of relevance feedback for information access systems. Knowledge Engineering Review, 18(2):95-145, 2003.