Published by Lucas Scott. Modified over 8 years ago.
1
Thomas Mandl: Robust CLEF 2007 - Overview
Cross-Language Evaluation Forum (CLEF)
Thomas Mandl, Information Science, Universität Hildesheim, mandl@uni-hildesheim.de
8th Workshop of the Cross-Language Evaluation Forum (CLEF), Budapest, 19 Sept. 2007
Robust Task - Result Overview and Lessons Learned from Robustness Evaluation
2
Robust?
3
Robustness Metaphorically
A robust tool works under a variety of conditions
4
Robustness?
Robust … means … capable of functioning correctly, (or at the very minimum, not failing catastrophically) under a great many conditions. (http://www.reference.com/)
Robust IR means the capability of an IR system to work well (and reach at least a minimal performance) under a variety of conditions (topics, difficulty, collections, users, languages …)
5
Variety of conditions …
Variance between topics
6
System Variance
7
History of Robust IR Evaluation
TREC
– Mono-lingual Retrieval
– 2003 - 2005
CLEF
– Mono-, bi- and Multilingual Retrieval
– 2006 six languages
– 2007 three languages
8
Robust Task 2007
Again …
– Use topics and relevance assessments from previous CLEF campaigns
– Take a different perspective and use a robust evaluation measure (GMAP)
– Emphasize the difficult (= low-performing) topics
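To illustrate why GMAP emphasizes low-performing topics, here is a minimal sketch comparing the arithmetic mean (MAP) with the geometric mean (GMAP) of per-topic average precision. The per-topic scores are made up, and the function names and the small floor `epsilon` (used so that a score of 0 does not make the logarithm undefined) are assumptions for illustration, not the task's official implementation.

```python
import math

def mean_average_precision(ap_scores):
    """Arithmetic mean of per-topic average precision (MAP)."""
    return sum(ap_scores) / len(ap_scores)

def geometric_mean_average_precision(ap_scores, epsilon=1e-5):
    """Geometric mean of per-topic average precision (GMAP).

    epsilon is an assumed floor that keeps log() defined for AP = 0.
    """
    log_sum = sum(math.log(max(ap, epsilon)) for ap in ap_scores)
    return math.exp(log_sum / len(ap_scores))

# Hypothetical per-topic AP values for two systems (not CLEF data).
system_a = [0.80, 0.70, 0.02, 0.01]  # strong on easy topics, fails on hard ones
system_b = [0.45, 0.40, 0.30, 0.25]  # moderate on every topic

map_a = mean_average_precision(system_a)
map_b = mean_average_precision(system_b)
gmap_a = geometric_mean_average_precision(system_a)
gmap_b = geometric_mean_average_precision(system_b)

print(f"MAP:  A={map_a:.4f}  B={map_b:.4f}")
print(f"GMAP: A={gmap_a:.4f}  B={gmap_b:.4f}")
```

In this toy case system A ranks first by MAP while system B ranks first by GMAP, because the two near-zero topics pull A's geometric mean down sharply.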
9
Training and Test
CLEF 2001, 2002 and 2003 for training
CLEF 2004, 2005 and 2006 for testing
10
Collections
Language | Target Collection | Training Topics | Test Topics
English | Los Angeles Times 1994 | 41-200 | 251-350
French | Le Monde 1994, Swiss News Agency 94 | 41-140 | 251-350
Portuguese | Público 1995 | - | 201-350
11
Robust Task 2007
3 languages (collections and topics)
3 mono-lingual tasks
1 bi-lingual task (English to French)
some 300,000 documents
about 1 gigabyte of text
12
Participation
63 runs submitted by 7 groups
2006: 133 runs by 8 groups
13
Results
14
Results Mono English
15
Results Mono Portuguese
16
Results
17
Results Mono French
18
Results Bi-lingual X -> French
19
Approaches
Adoption of traditional and “advanced” CLIR methods
– BM25 (Miracle)
– N-gram translation (CoLesIR)
– (Uni NE)
Adoption of “robust” heuristics
– Expansion with an external resource (SINAI)
20
Percentage of Bad Topics
System | Mono PT | Mono EN | Mono FR | Bi -> FR
Best System | 26 | 17 | 18 | 23
Average | 32 | 27 | 20 | 25
Percentage of topics which received a MAP below 0.1
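The "bad topic" percentage can be sketched as a simple count of topics whose score falls below the 0.1 threshold named on the slide. The function name and the sample scores below are hypothetical and serve only to show the arithmetic.

```python
def percentage_of_bad_topics(ap_scores, threshold=0.1):
    """Share (in percent) of topics whose average precision is below threshold."""
    bad = sum(1 for ap in ap_scores if ap < threshold)
    return 100.0 * bad / len(ap_scores)

# Hypothetical per-topic scores for one run (not CLEF data).
run_scores = [0.45, 0.08, 0.30, 0.02, 0.15]

# Two of the five topics (0.08 and 0.02) fall below 0.1.
bad_share = percentage_of_bad_topics(run_scores)
print(f"{bad_share:.0f}% of topics are below 0.1")
```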
21
Topics
Large improvements are still possible
Difficult topics can be solved better
Task | Topic | Average | Best System | System Nr. 1
Mono PT | 222 | 0.0108 | 0.0478 | 0.0183
Mono EN | 266 | 0.0217 | 0.1120 | 0.0357
Mono FR | 192 | 0.0157 | 0.0247 | 0.0160
Bi -> FR | 282 | 0.0342 | 0.1588 |
22
Correlation between Measures?
Often IR measures correlate highly
For a larger topic set, as used in the robust task, the correlation might be even higher
– More topics make a test more reliable
If correlation is high, it makes no sense to use alternative measures
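One common way to quantify how strongly two measures agree is a rank correlation between the system orderings they produce. The slides do not say which coefficient was used, so the sketch below assumes Kendall's tau (the tau-a variant, which does not adjust for ties), with hypothetical scores for four systems.

```python
from itertools import combinations

def kendall_tau(scores_x, scores_y):
    """Kendall's tau-a between two score lists for the same systems."""
    concordant = discordant = 0
    for i, j in combinations(range(len(scores_x)), 2):
        product = (scores_x[i] - scores_x[j]) * (scores_y[i] - scores_y[j])
        if product > 0:
            concordant += 1      # both measures order the pair the same way
        elif product < 0:
            discordant += 1      # the measures disagree on this pair
    n_pairs = len(scores_x) * (len(scores_x) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical scores for four systems (not CLEF data).
map_scores = [0.40, 0.35, 0.30, 0.20]
gmap_scores = [0.15, 0.20, 0.10, 0.05]

tau = kendall_tau(map_scores, gmap_scores)
print(f"Kendall tau between MAP and GMAP rankings: {tau:.3f}")
```

A value close to 1 would mean both measures rank the systems almost identically. Here the two measures disagree only on the first two systems.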
23
Analysis with Reduced Topic Sets
Mono-lingual English
24
Analysis with Reduced Topic Sets
Bi-lingual -> FR
25
Analysis with Reduced Topic Sets
Mono-lingual Portuguese
26
Analysis with Reduced Topic Sets
Mono-lingual French
27
Analysis with Reduced Topic Sets
Multi-lingual 2006
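The slides above only name the analysis, so the following is one plausible sketch of it rather than the procedure actually used: draw random subsets of topics, score every system by MAP and GMAP on each subset, and average the rank correlation between the two orderings. The choice of Kendall's tau-a, the number of trials, the floor `epsilon`, and the synthetic scores are all assumptions.

```python
import math
import random
from itertools import combinations

def mean_ap(aps):
    return sum(aps) / len(aps)

def gmap(aps, epsilon=1e-5):
    # epsilon is an assumed floor that keeps log() defined for AP = 0.
    return math.exp(sum(math.log(max(ap, epsilon)) for ap in aps) / len(aps))

def kendall_tau(xs, ys):
    """Kendall's tau-a between two score lists for the same systems."""
    balance = 0
    for i, j in combinations(range(len(xs)), 2):
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        balance += (product > 0) - (product < 0)
    return balance / (len(xs) * (len(xs) - 1) / 2)

def correlation_on_reduced_sets(ap_by_system, subset_size, trials=200, seed=0):
    """Average MAP/GMAP rank correlation over random topic subsets."""
    rng = random.Random(seed)
    runs = list(ap_by_system.values())
    n_topics = len(runs[0])
    taus = []
    for _ in range(trials):
        topics = rng.sample(range(n_topics), subset_size)
        maps = [mean_ap([aps[t] for t in topics]) for aps in runs]
        gmaps = [gmap([aps[t] for t in topics]) for aps in runs]
        taus.append(kendall_tau(maps, gmaps))
    return sum(taus) / len(taus)

# Synthetic per-topic scores for five hypothetical systems over 100 topics.
data_rng = random.Random(42)
ap_by_system = {
    f"system_{k}": [data_rng.random() ** 2 for _ in range(100)]
    for k in range(5)
}

for size in (10, 25, 50, 100):
    avg_tau = correlation_on_reduced_sets(ap_by_system, size)
    print(f"{size:3d} topics: average tau = {avg_tau:.3f}")
```

Comparing the averages across subset sizes shows whether the two measures agree more as the topic set grows, which is the question the preceding slide raises.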
28
Robust 2006: MAP vs. GMAP
29
Thanks for your Attention