Thomas Mandl: Robust CLEF 2007 - Overview


1 Cross-Language Evaluation Forum (CLEF)
Robust Task - Result Overview and Lessons Learned from Robustness Evaluation
Thomas Mandl, Information Science, Universität Hildesheim, mandl@uni-hildesheim.de
8th Workshop of the Cross-Language Evaluation Forum (CLEF), Budapest, 19 Sept. 2007

2 Robust?

3 Robustness Metaphorically
A robust tool works under a variety of conditions.

4 Robustness?
Robust … means … capable of functioning correctly (or at the very minimum, not failing catastrophically) under a great many conditions. (http://www.reference.com/)
Robust IR means the capability of an IR system to work well (and reach at least a minimal performance) under a variety of conditions (topics, difficulty, collections, users, languages …).

5 Variety of conditions … Variance between topics

6 System Variance

7 History of Robust IR Evaluation
TREC
– Mono-lingual retrieval
– 2003 - 2005
CLEF
– Mono-, bi- and multilingual retrieval
– 2006: six languages
– 2007: three languages

8 Robust Task 2007
Again …
– Use topics and relevance assessments from previous CLEF campaigns
– Take a different perspective and use a robust evaluation measure (GMAP)
– Emphasize the difficult (= low performing) topics
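How GMAP emphasizes low-performing topics can be illustrated with a minimal sketch. It assumes GMAP is computed as the geometric mean of per-topic average precision (AP). The floor value that avoids log(0), the function names, and the two hypothetical systems with their AP scores are illustrative assumptions, not data from the task.

```python
import math


def mean_average_precision(ap_scores):
    """Arithmetic mean of per-topic average precision (MAP)."""
    return sum(ap_scores) / len(ap_scores)


def geometric_map(ap_scores, floor=1e-5):
    """Geometric mean of per-topic average precision (GMAP).

    The floor is an assumed small constant that keeps log() defined
    for topics with an AP of zero.
    """
    logs = [math.log(max(ap, floor)) for ap in ap_scores]
    return math.exp(sum(logs) / len(logs))


# Hypothetical per-topic AP values for two systems (not task results).
system_a = [0.60, 0.60, 0.01]  # strong on two topics, fails on one
system_b = [0.45, 0.45, 0.20]  # weaker on average, but no failure

for name, scores in (("A", system_a), ("B", system_b)):
    print(name, mean_average_precision(scores), geometric_map(scores))
```

In this made-up example system A has the higher MAP, while system B has the higher GMAP, because a single very low topic score pulls the geometric mean down sharply.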

9 Training and Test
CLEF 2001, 2002 and 2003 for training
CLEF 2004, 2005 and 2006 for testing

10 Collections
Language    Target Collection                     Training Topics  Test Topics
English     Los Angeles Times 1994                41-200           251-350
French      Le Monde 1994, Swiss News Agency 94   41-140           251-350
Portuguese  Público 1995                          -                201-350

11 Robust Task 2007
– 3 languages (collections and topics)
– 3 mono-lingual tasks
– 1 bi-lingual task (English to French)
– some 300,000 documents
– about 1 gigabyte of text

12 Participation
63 runs submitted by 7 groups
2006: 133 runs by 8 groups

13 Results

14 Results Mono English

15 Results Mono Portuguese

16 Results

17 Results Mono French

18 Results Bi-lingual X -> French

19 Approaches
Adoption of traditional and “advanced” CLIR methods
– BM 25 (Miracle)
– N-gram translation (CoLesIR)
– (Uni NE)
Adoption of “robust” heuristics
– Expansion with an external resource (SINAI)

20 Percentage of Bad Topics
             Mono PT  Mono EN  Mono FR  Bi -> FR
Best System  26       17       18       23
Average      32       27       20       25
Percentage of topics which received a MAP below 0.1
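The share of bad topics can be computed along the following lines. This is a minimal sketch that assumes one average precision value per topic for a given run. The function name, topic identifiers and scores are hypothetical; only the 0.1 threshold comes from the slide.

```python
def percentage_bad_topics(ap_by_topic, threshold=0.1):
    """Percentage of topics whose average precision is below the threshold."""
    bad = sum(1 for ap in ap_by_topic.values() if ap < threshold)
    return 100.0 * bad / len(ap_by_topic)


# Hypothetical per-topic scores for one run (not task results).
example_run = {"t1": 0.05, "t2": 0.30, "t3": 0.09, "t4": 0.50}
print(percentage_bad_topics(example_run))
```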

21 Topics
Large improvements are still possible
Difficult topics can be solved better
Task      Topic  Average  Best System  System Nr. 1
Mono PT   222    0.0108   0.0478       0.0183
Mono EN   266    0.0217   0.1120       0.0357
Mono FR   192    0.0157   0.0247       0.0160
Bi -> FR  282    0.0342   0.1588

22 Correlation between Measures?
Often IR measures correlate highly
For a larger topic set – as used in the robust task – the correlation might be even higher
– More topics make a test more reliable
If correlation is high, it makes no sense to use alternative measures
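One way to check how closely two measures agree is a rank correlation over the system scores. The slide does not name a correlation coefficient, so the sketch below assumes Kendall's tau (the tau-a variant, with no tie correction). The function name and all scores are hypothetical.

```python
from itertools import combinations


def kendall_tau(scores_x, scores_y):
    """Kendall's tau-a between two score lists over the same systems."""
    concordant = discordant = 0
    for i, j in combinations(range(len(scores_x)), 2):
        product = (scores_x[i] - scores_x[j]) * (scores_y[i] - scores_y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    pairs = len(scores_x) * (len(scores_x) - 1) / 2
    return (concordant - discordant) / pairs


# Hypothetical MAP and GMAP values for four systems (not task results).
map_scores = [0.40, 0.35, 0.30, 0.20]
gmap_scores = [0.15, 0.25, 0.10, 0.05]
print(kendall_tau(map_scores, gmap_scores))
```

A value close to 1 means both measures rank the systems almost identically, which is the situation in which an alternative measure adds little.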

23 Analysis with Reduced Topic Sets: Mono-lingual English

24 Analysis with Reduced Topic Sets: Bi-lingual -> FR

25 Analysis with Reduced Topic Sets: Mono-lingual Portuguese

26 Analysis with Reduced Topic Sets: Mono-lingual French

27 Analysis with Reduced Topic Sets: Multi-lingual 2006

28 Robust 2006 MAP GMAP

29 Thanks for your Attention

