CLEF Workshop at ECDL 2003, Trondheim, 21-22 August 2003. Michael Kluck: Introduction to the Monolingual and Domain-Specific Tasks of the Cross-Language Evaluation Forum 2003.

Presentation transcript:

Slide 1: Introduction to the Monolingual and Domain-Specific Tasks of the Cross-Language Evaluation Forum 2003. Michael Kluck (Informationszentrum Sozialwissenschaften – IZ, Bonn/Berlin; Humboldt-University Berlin)

Slide 2: Monolingual Task
Languages:
- Dutch, Finnish, French, German, Italian, Spanish, Swedish
- New: Russian (with a reduced topic set, because of the time span of the data)
- Exclusion of English (widely used in TREC etc., overflow of runs; English only for newcomers)
Aim:
- Build a starting point for CLIR
- Enlarge and balance the pool
- Use recently introduced or new languages in the CLEF campaign

Slide 3: Monolingual runs by 22 participants

Lang.  Delivered runs  Judged runs %
DE
EN
ES
FI
FR
IT
NL
RU
SV     18              100

Slide 4: Domain-Specific Task
Amaryllis:
- could not be continued because of lack of funding in France
- attempts to get social science data from INIST failed
GIRT:
- new, bigger corpus GIRT4 in German, from social science literature and current research information
- parallel corpus in English, although with a smaller amount of text than the German part

Slide 5: Features of GIRT4
Bigger than GIRT3, now 302,638 documents:
- 151,319 original German
- 151,319 translated into English
Pseudo-parallel corpus:
- Title, Controlled-Term and Classification-Text available in German and English for all documents
- Abstract available for 96% of documents in German, but only for 15% in English -> reduced amount of text in the English part
- Translated texts (abstracts) are sometimes the result of machine translation by SYSTRAN (EU)
- Documents renumbered

Slide 6: Field Availability in GIRT4
Equal distribution for the German and English parts:
- Title: 1 per doc
On average:
- Controlled-Terms: per doc
- Classification-Text: 2.02 per doc
Different distribution for the German and English parts, on average:
- Method-Term: DE 2.35 per doc, EN 1.93 per doc
- Abstract: DE 0.96 per doc, EN 0.15 per doc
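The per-field averages above are simply occurrence counts divided by the number of documents in the respective language part. A minimal sketch of that computation, using made-up miniature records (field names and counts are illustrative, not taken from GIRT4):

```python
from collections import Counter

# Hypothetical GIRT4-style records: each document maps a field name to
# how many times that field occurs in the document.
docs_de = [
    {"TITLE": 1, "CONTROLLED-TERM": 10, "CLASSIFICATION-TEXT": 2, "ABSTRACT": 1},
    {"TITLE": 1, "CONTROLLED-TERM": 8, "CLASSIFICATION-TEXT": 2, "ABSTRACT": 1},
    {"TITLE": 1, "CONTROLLED-TERM": 9, "CLASSIFICATION-TEXT": 2},  # no abstract
]

def field_averages(docs):
    """Average number of occurrences of each field per document."""
    totals = Counter()
    for doc in docs:
        totals.update(doc)
    return {field: count / len(docs) for field, count in totals.items()}

print(field_averages(docs_de))
```

With three documents of which one lacks an abstract, the ABSTRACT average comes out below 1 per doc, which is exactly the kind of figure the slide reports (e.g. 0.96 for German, 0.15 for English).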

Slide 7: GIRT4 Tasks
Monolingual:
- DE topics -> DE data
- EN topics -> EN data
Bilingual:
- EN or RU topics -> DE data
- DE or RU topics -> EN data
Additional instruments:
- German-English thesaurus
- German-Russian translation table (not fully up to date)
- Concordance list of document numbers (will be available by end of August 2003)
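A thesaurus or translation table like the ones listed above is typically used for simple dictionary-based query translation in the bilingual tasks. A minimal sketch under that assumption, with a tiny hand-made German-English lexicon (the entries are invented for illustration):

```python
# Tiny illustrative German-English lexicon; a real thesaurus would map
# controlled vocabulary terms and may give several translations per term.
thesaurus_de_en = {
    "arbeitslosigkeit": ["unemployment"],
    "jugend": ["youth", "adolescence"],
}

def translate_query(terms, lexicon):
    """Replace each source-language term with its translations.

    Terms without a lexicon entry are kept untranslated, so the query
    never silently loses words.
    """
    translated = []
    for term in terms:
        translated.extend(lexicon.get(term.lower(), [term]))
    return translated

print(translate_query(["Arbeitslosigkeit", "Jugend"], thesaurus_de_en))
```

Keeping untranslated terms as-is is a common fallback when the translation table is incomplete or, as the slide notes for the German-Russian table, not fully up to date.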

Slide 8: Assessment of GIRT4
17,031 docs assessed (+65%)
- Started with the German part
- Then identified the identical English documents (if they had been indicated as relevant hits)
- Continued with those hits in the English part that had been marked relevant but had no counterparts in the German part
- During assessment it became apparent that the search results in the two language parts were not fully congruent: for a given topic, the hits in the English part were not identical with those in the German part (without knowing which hit belonged to which run)
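The assessment procedure above can be sketched with plain set operations: a concordance of document numbers (slide 7) maps each German document to its English counterpart, relevant German hits are carried over, and the remaining English pool members still need their own judgments. Identifiers and data shapes here are invented for illustration:

```python
# Hypothetical concordance: German doc number -> English counterpart.
concordance = {
    "GIRT-DE-001": "GIRT-EN-001",
    "GIRT-DE-002": "GIRT-EN-002",
}

# Judged relevant in the German part for one topic (illustrative).
relevant_de = {"GIRT-DE-001"}

# Pooled English hits for the same topic; note the pools need not be
# congruent with the German ones, as the slide observes.
pooled_en = {"GIRT-EN-001", "GIRT-EN-003"}

# Step 1: English counterparts of the relevant German documents.
relevant_en_via_de = {concordance[d] for d in relevant_de if d in concordance}

# Step 2: English hits without a relevant German counterpart must be
# assessed in their own right.
needs_assessment = pooled_en - relevant_en_via_de

print(sorted(relevant_en_via_de))
print(sorted(needs_assessment))
```

The non-empty `needs_assessment` set corresponds to the third bullet: English-part hits that were pooled although no relevant German counterpart existed.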

Slide 9: GIRT4 runs by 4 participants

Task         Data      Topic lang.  Judged runs
Monolingual  GIRT4 DE  DE           13
             GIRT4 EN  EN            4   (monolingual total: 17)
Bilingual    GIRT4 DE  EN            1
             GIRT4 DE  RU            2
             GIRT4 EN  DE            1
             GIRT4 EN  RU            1   (bilingual total: 5)