Robust Task - Result Overview and Lessons Learned from Robustness Evaluation
Thomas Mandl, Information Science, Universität Hildesheim
8th Workshop of the Cross-Language Evaluation Forum (CLEF), Budapest, 19 September 2007
Robust?
Robustness, Metaphorically
A robust tool works under a variety of conditions.
Robustness?
"Robust … means … capable of functioning correctly (or, at the very minimum, not failing catastrophically) under a great many conditions."
Robust IR means the capability of an IR system to work well (and to reach at least a minimal performance) under a variety of conditions (topics, difficulty, collections, users, languages …).
Variety of Conditions …
Variance between topics
System Variance
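To make the contrast on these two slides concrete, here is a minimal sketch (with entirely hypothetical AP values) comparing how much per-topic means and per-system means spread; in typical evaluations the variance between topics dwarfs the variance between systems.

```python
from statistics import mean, pvariance

# Hypothetical average-precision (AP) matrix: rows = systems, columns = topics.
ap = [
    [0.40, 0.05, 0.60, 0.30],  # system A
    [0.35, 0.10, 0.55, 0.25],  # system B
    [0.45, 0.02, 0.65, 0.20],  # system C
]

topic_means = [mean(col) for col in zip(*ap)]   # how hard each topic is
system_means = [mean(row) for row in ap]        # how good each system is

print("variance across topics: ", pvariance(topic_means))   # ~0.040
print("variance across systems:", pvariance(system_means))  # ~0.0001
```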
History of Robust IR Evaluation
– TREC: mono-lingual retrieval
– CLEF: mono-, bi- and multilingual retrieval
  – 2006: six languages
  – 2007: three languages
Robust Task 2007
Again …
– use topics and relevance assessments from previous CLEF campaigns
– take a different perspective and use a robust evaluation measure (GMAP)
– emphasize the difficult (= low-performing) topics
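GMAP, the geometric mean of per-topic average precision, is the robust measure meant here: because a geometric mean is pulled down sharply by near-zero scores, it rewards improvements on difficult topics much more than MAP does. A minimal sketch, assuming per-topic AP scores are already available and clamping zeros to a small epsilon (a common convention, e.g. in trec_eval):

```python
import math

def mean_average_precision(ap_scores):
    """MAP: arithmetic mean of per-topic average precision."""
    return sum(ap_scores) / len(ap_scores)

def geometric_map(ap_scores, epsilon=1e-5):
    """GMAP: geometric mean of per-topic average precision.

    Zero scores are clamped to a small epsilon so that a single
    failed topic does not drive the whole product to zero.
    """
    logs = [math.log(max(ap, epsilon)) for ap in ap_scores]
    return math.exp(sum(logs) / len(logs))

# One near-failing topic barely moves MAP but drags GMAP down sharply.
aps = [0.45, 0.38, 0.52, 0.01]
print(mean_average_precision(aps))  # ~0.34
print(geometric_map(aps))           # ~0.17
```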
Training and Test
– CLEF 2001, 2002 and 2003 for training
– CLEF 2004, 2005 and 2006 for testing
Collections

Language     Target Collection                  Training Topics   Test Topics
English      Los Angeles Times                  …                 …
French       Le Monde 1994; Swiss News Agency   …                 …
Portuguese   Público                            …                 …
Robust Task 2007
– 3 languages (collections and topics)
– 3 mono-lingual tasks
– 1 bi-lingual task (English to French)
– some 300,000 documents, about 1 gigabyte of text
Participation
– 63 runs submitted by 7 groups
– 2006: 133 runs by 8 groups
Results
Results: Mono-lingual English
Results: Mono-lingual Portuguese
Results
Results: Mono-lingual French
Results: Bi-lingual X -> French
Approaches
Adoption of traditional and "advanced" CLIR methods:
– BM25 (Miracle)
– n-gram translation (CoLesIR)
– … (Uni NE)
Adoption of "robust" heuristics:
– expansion with an external resource (SINAI)
Percentage of Bad Topics
[Table: percentage of topics with an MAP below 0.1, for the best system and the average, per task: Mono PT, Mono EN, Mono FR, Bi -> FR]
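The statistic behind this slide is easy to reproduce from a run's per-topic scores; a minimal sketch, with hypothetical topic IDs and AP values:

```python
def bad_topic_rate(ap_by_topic, threshold=0.1):
    """Fraction of topics whose average precision falls below the threshold."""
    bad = sum(1 for ap in ap_by_topic.values() if ap < threshold)
    return bad / len(ap_by_topic)

# Hypothetical per-topic AP scores for one run.
run = {"C041": 0.32, "C042": 0.05, "C043": 0.00, "C044": 0.61}
print(f"{bad_topic_rate(run):.0%} of topics below 0.1 AP")  # 50%
```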
Topics
– Large improvements are still possible.
– Difficult topics can be solved better.
[Table: Task | Topic | Average | Best System | System No. 1, with rows for Mono PT, Mono EN, Mono FR, Bi -> FR]
Correlation between Measures?
– IR measures often correlate highly.
– For a larger topic set, as used in the Robust Task, the correlation might be even higher: more topics make a test more reliable.
– If the correlation is high, it makes no sense to use alternative measures.
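One way to check this empirically is to rank the submitted runs by each measure and compute a rank correlation; a minimal sketch of Kendall's tau (no tie handling), over hypothetical MAP and GMAP scores:

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall's tau between two score lists over the same systems."""
    concordant = discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        s = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if s > 0:
            concordant += 1   # pair ranked the same way by both measures
        elif s < 0:
            discordant += 1   # pair ranked in opposite order
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical per-system MAP and GMAP values.
map_scores = [0.41, 0.38, 0.35, 0.29, 0.22]
gmap_scores = [0.19, 0.21, 0.14, 0.12, 0.08]
print(kendall_tau(map_scores, gmap_scores))  # 0.8
```

A tau close to 1 would suggest GMAP adds little information beyond MAP; the reduced-topic-set analyses on the following slides probe exactly this question.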
Analysis with Reduced Topic Sets: Mono-lingual English
Analysis with Reduced Topic Sets: Bi-lingual -> FR
Analysis with Reduced Topic Sets: Mono-lingual Portuguese
Analysis with Reduced Topic Sets: Mono-lingual French
Analysis with Reduced Topic Sets: Multi-lingual 2006
Robust 2006: MAP vs. GMAP
Thanks for your Attention