Skövde, Jan 19, 2004. Information Access: Leif Grönqvist. Systematic Evaluation of Swedish IR Systems using a Relevance Judged Document Collection.

Presentation transcript:

Skövde, Jan 19, 2004. Information Access: Leif Grönqvist. Slide 1
Systematic Evaluation of Swedish IR Systems using a Relevance Judged Document Collection
Leif Grönqvist, GSLT
For the GSLT course: Information Access, 2003

Slide 2: Overview
Introduction
Design of testbed
– Documents
– Topics
– Relevance judgments
Experiment setup
Evaluation
Conclusion

Slide 3: Introduction
The classical IR task: find an ordered list of documents relevant to a user query
Difficult to evaluate:
– Relevance is subjective
– Relevance differs depending on context
– But evaluation is very important!
Test collections for Swedish are not very common: CLEF, ...?

Slide 4: Why Swedish?
Very different from English:
– Compounds written without spaces
– "New" letters (åäö)
– Complex morphology
– Different tokenization
– Other stop words
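Two of these points, the extra letters and the space-free compounds, are easy to illustrate with a toy tokenizer. This is a minimal sketch, not taken from the slides; the example words are my own:

```python
import re

# A naive ASCII-only tokenizer (fine for English) mangles Swedish words:
text = "flygplansolycka i Göteborg"          # "airplane accident in Gothenburg"
print(re.findall(r"[a-z]+", text.lower()))    # ['flygplansolycka', 'i', 'g', 'teborg']

# Adding åäö to the token pattern keeps the words intact:
tokens = re.findall(r"[a-zåäö]+", text.lower())
print(tokens)                                 # ['flygplansolycka', 'i', 'göteborg']

# Compounds remain a problem: a query for "olycka" (accident) will not match
# the compound "flygplansolycka" without some kind of compound splitting.
print("olycka" in tokens)                     # False
```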

Slide 5: The test collection
– Documents
– Topics
– Relevance judgments

Slide 6: Document collection
Newspaper articles from GP and HD, 40 MTokens
Good to have more than one newspaper:
– Same content, different author (not always)
10% of my newspaper article collection
Copyright is a problem

Slide 7: Topics
Borrowed from CLEF: 52/90, but not the most difficult
Examples (the topics are in Swedish; English translations given here):
– "Filmer av bröderna Kaurismäki" (Films by the Kaurismäki brothers). Description: Search for information about films directed by either of the two brothers Aki and Mika Kaurismäki. Narrative: Relevant documents name one or more titles of films directed by Aki or Mika Kaurismäki.
– "Finlands första EU-kommissionär" (Finland's first EU Commissioner). Description: Who was appointed Finland's first Commissioner in the European Union? Narrative: Give the name of Finland's first EU Commissioner. Relevant documents may also mention the subject areas of the new Commissioner's portfolio.

Slide 8: Relevance judgments
Only a subset is judged for each topic:
– Selected by earlier experiments
– Similar approach to TREC and CLEF
100 documents for each of 5 strategies:
– 100 ≤ N ≤ 500
– Important to include both relevant and irrelevant documents
A scale of relevance proposed by Sormunen:
– Irrelevant (0) < Marginally relevant (1) < Fairly relevant (2) < Highly relevant (3)
Manually annotated
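This is essentially the pooling idea used in TREC-style evaluations: merge the top-ranked documents from each strategy and judge only that union. A minimal sketch of the idea, with made-up strategy names and document ids rather than the actual runs:

```python
def build_pool(rankings, depth=100):
    """Merge the top `depth` documents from each strategy's ranked list into
    one set; only documents in this pool are judged manually."""
    pool = set()
    for ranked_docs in rankings.values():
        pool.update(ranked_docs[:depth])
    return pool

# Hypothetical ranked lists (document ids) from five retrieval strategies:
rankings = {
    "baseline": ["d12", "d7", "d3", "d99"],
    "stemmed":  ["d7", "d12", "d45", "d3"],
    "stoplist": ["d3", "d12", "d7", "d61"],
    "tagged":   ["d12", "d61", "d7", "d45"],
    "combined": ["d7", "d3", "d12", "d99"],
}

pool = build_pool(rankings, depth=100)
# Each pooled document gets a manual judgment on the 0-3 scale. With five
# strategies and depth 100, the pool size N lies between 100 (identical
# lists) and 500 (disjoint lists), matching 100 <= N <= 500 on the slide.
print(len(pool))  # 6 in this toy example
```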

Slide 9: Statistics
Some difficult topics got very few relevant documents

Slide 10: Statistics per relevance category

Slide 11: The InQuery system
– Handles big document collections
– Performs all the indexing
– Batch runs for many setups
– Standard-format output to fit the evaluation and presentation software
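The slides do not say which output format is meant; a common choice for "standard-format output" in batch IR experiments is the TREC run format read by tools such as trec_eval. A sketch under that assumption, with hypothetical topic ids and document numbers:

```python
def write_run_file(path, run_tag, results):
    """Write ranked results in the TREC run format read by trec_eval:
    one line per retrieved document: "qid Q0 docno rank score run_tag"."""
    with open(path, "w") as out:
        for qid, ranked in results.items():
            for rank, (docno, score) in enumerate(ranked, start=1):
                out.write(f"{qid} Q0 {docno} {rank} {score:.4f} {run_tag}\n")

# Hypothetical topic ids and document numbers for one batch run:
results = {
    "C041": [("GP941203-0012", 12.3), ("HD950117-0034", 11.8)],
    "C042": [("HD950201-0007", 9.4)],
}
write_run_file("baseline.run", "baseline_run", results)
```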

Slide 12: Our system setups
We want to test an ordinary IR system using some common term weighting as the baseline
Compared to combined systems using one or more of:
– The MALT tagger from Växjö: fast and therefore suitable for large IR systems
– A stemmer, maybe Carlberger et al.
– A stop list: our own
Tagging is not trivial to use
Maybe more features will be added later
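The slides leave the baseline weighting unspecified; a typical choice is tf-idf with cosine similarity, roughly as in the sketch below. This is a simplified illustration with made-up documents, not the actual InQuery weighting:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Simple tf-idf weighting: tf * log(N / df), one sparse vector (dict) per document."""
    n = len(docs)
    term_counts = [Counter(doc.split()) for doc in docs]
    df = Counter()
    for counts in term_counts:
        df.update(counts.keys())
    return [{t: tf * math.log(n / df[t]) for t, tf in counts.items()}
            for counts in term_counts]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

# Made-up miniature collection and query:
docs = ["aki kaurismäki regisserade filmen",
        "finlands första eu kommissionär utsågs",
        "filmen visades i göteborg"]
doc_vectors = tfidf_vectors(docs)
query_vector = dict(Counter("filmen kaurismäki".split()))  # raw term counts for the query

# Rank documents by similarity to the query; the baseline returns this ordered list.
ranking = sorted(range(len(docs)), key=lambda i: cosine(query_vector, doc_vectors[i]), reverse=True)
print(ranking)  # [0, 2, 1] for this toy data
```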

Slide 13: Evaluation metrics
Recall and precision are problematic:
– Ranked lists: how much better is position 1 than position 5 or position 10?
– How long should the lists be?
– Relevance scale: how much better is "highly relevant" than "fairly relevant"?
– What about the unknown documents that were not judged?
Too many unknowns lead to more manual judgments...
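One standard way to handle both the rank-position question and the graded relevance scale (not something proposed on the slides themselves) is a gain-based measure in the style of Järvelin and Kekäläinen's cumulated gain. A sketch with made-up judgment lists:

```python
import math

def dcg(grades, k=None):
    """Discounted cumulated gain over a ranked list of graded judgments
    (0-3 as on the Sormunen scale); lower-ranked positions count less."""
    grades = grades[:k] if k is not None else grades
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(grades))

# Graded judgments for the top five documents of two hypothetical runs
# over the same topic (same documents, different ordering):
run_a = [3, 2, 0, 1, 0]   # highly relevant document ranked first
run_b = [1, 0, 2, 3, 0]   # relevant documents pushed down the list
print(round(dcg(run_a), 2), round(dcg(run_b), 2))  # 4.69 > 3.29: the better ranking scores higher
```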

Slide 14: Conclusion
A testbed for IR, but still under construction
1890/9848 documents are relevant to a topic
We will test whether stop lists, stemming, and/or tagging improve document search
– Not just a binary relevance measure
– Swedish
Precision and recall are problematic

Slide 15: Thank you!
Questions? And probably suggestions.