Presentation transcript:

Ad-hoc - This track tests mono- and cross-language text retrieval. Tasks in 2009 will test both CL and IR aspects in a multilingual context.
- One task will evaluate retrieval algorithms on multilingual collections of catalog records; as in 2008, the collections are derived from the English, French and German archives of The European Library. A query expansion pilot task is also offered (see the sketch below).
- Robust-WSD aims at assessing whether word sense disambiguated data has an impact on system performance; mono- and bilingual tasks will be offered on an English WSD collection.
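The query expansion pilot is not specified in detail here; purely as an illustration, the following is a minimal pseudo-relevance-feedback sketch in Java (class and method names are hypothetical, not part of any CLEF infrastructure): the top-ranked catalog records from an initial run are mined for frequent terms, which are appended to the original query.

```java
import java.util.*;
import java.util.stream.*;

// Minimal pseudo-relevance-feedback sketch (hypothetical): mine the most frequent
// terms of the top-ranked catalog records and append them to the original query.
public class QueryExpansionSketch {

    static final Set<String> STOPWORDS = Set.of("the", "and", "in", "of", "a");

    static List<String> expand(List<String> queryTerms, List<String> topRecords, int newTerms) {
        Set<String> original = new HashSet<>(queryTerms);
        Map<String, Integer> freq = new HashMap<>();
        for (String record : topRecords) {
            for (String term : record.toLowerCase().split("\\W+")) {
                if (term.isEmpty() || original.contains(term) || STOPWORDS.contains(term)) continue;
                freq.merge(term, 1, Integer::sum);
            }
        }
        List<String> expansionTerms = freq.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(newTerms)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        List<String> expanded = new ArrayList<>(queryTerms);
        expanded.addAll(expansionTerms);
        return expanded;
    }

    public static void main(String[] args) {
        List<String> query = List.of("baroque", "music");
        List<String> topRecords = List.of(
                "Baroque music in Vienna : concertos and sonatas",
                "Handel and the baroque concerto",
                "Vienna concert life in the eighteenth century");
        System.out.println(expand(query, topRecords, 3));   // e.g. [baroque, music, vienna, ...]
    }
}
```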

Interactive Cross-Language Retrieval (iCLEF): Interactive retrieval of images using the Flickr database will again be studied. Flickr is a dynamic image database with labels provided by creators and viewers in a self-organizing ontology of tags. This labeling activity is naturally multilingual, reactive, and cooperative. The focus is on measuring relevance, user confidence/satisfaction and user behaviour on a large scale. To serve this purpose, a single multilingual interface to Flickr will be used by all participants. Coordinators are UNED (ES), SICS (SE) & U. Sheffield (UK). See the track website for details.

Multiple Language Question Answering:
- ResPubliQA: Given a pool of 500 independent natural language questions, systems must return the passage (not the exact answer) that answers each question from the JRC-Acquis collection of EU documentation. Both questions and documents are translated and aligned for a subset of languages (at least Bulgarian, Dutch, English, French, German, Italian, Portuguese, Romanian and Spanish). A minimal passage-selection sketch is given below.
- QAST: The aim of the third QAST exercise is to evaluate QA technology in a real multilingual speech scenario in which written and oral questions (factual and definitional) in different languages are formulated against a set of audio recordings related to speech events in those languages. The proposed scenario is the European Parliament sessions in English, Spanish and French.
- GikiCLEF: Following the GikiP pilot at GeoCLEF 2008, the task focuses on open list questions over Wikipedia that require geographic reasoning, complex information extraction, and cross-lingual processing, at least for Dutch, English, German, Norwegian, Portuguese and Romanian.
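As an illustration of what a baseline ResPubliQA system has to do, here is a minimal, hypothetical passage-selection sketch in Java: each candidate passage is scored by lexical overlap with the question, and the best-scoring passage is returned. Real participants would use proper retrieval models and multilingual resources; all names below are invented for the example.

```java
import java.util.*;

// Hypothetical baseline for passage selection: rank candidate passages by the
// number of question terms they contain and return the best one.
public class PassageSelectionSketch {

    static String bestPassage(String question, List<String> passages) {
        Set<String> qTerms = new HashSet<>(Arrays.asList(question.toLowerCase().split("\\W+")));
        String best = passages.get(0);
        int bestScore = -1;
        for (String passage : passages) {
            int score = 0;
            for (String term : passage.toLowerCase().split("\\W+")) {
                if (qTerms.contains(term)) score++;
            }
            if (score > bestScore) {
                bestScore = score;
                best = passage;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> passages = List.of(
                "The directive entered into force on 1 January 2004.",
                "Member States shall adopt the measures necessary to comply with this directive.",
                "The committee shall be composed of representatives of the Member States.");
        String question = "When did the directive enter into force?";
        System.out.println(bestPassage(question, passages));   // prints the first passage
    }
}
```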

Cross-Language Image Retrieval (ImageCLEF): This track evaluates retrieval from visual collections; both text and visual retrieval techniques are exploitable. Five challenging tasks are foreseen, among them:
- multilingual ad-hoc retrieval from a photo collection concentrating on diversity in the results;
- retrieval from a large-scale, heterogeneous collection of Wikipedia images with user-generated textual metadata, and queries in several languages;
- detection of semantic categories from robotic images (non-annotated collection, concepts to be detected).
Results of a visual and a text retrieval system will be made available to participants (a sketch of combining the two follows below). Track coordinators are U. Sheffield (UK), U. Applied Sciences Western Switzerland (CH), Oregon Health and Science U. (US), RWTH Aachen (DE), U. Geneva (CH), CWI (NL) and IDIAP (CH).
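Since both text and visual evidence are exploitable, a common baseline is late fusion of the two result lists. The following Java sketch is hypothetical (not an official ImageCLEF baseline): it linearly combines min-max-normalized text and visual scores per image and re-ranks.

```java
import java.util.*;

// Hypothetical late-fusion sketch: combine a text-retrieval score and a
// visual-similarity score for each image with a weighted sum and re-rank.
public class LateFusionSketch {

    static List<String> fuse(Map<String, Double> textScores,
                             Map<String, Double> visualScores,
                             double textWeight) {
        Map<String, Double> combined = new HashMap<>();
        Set<String> images = new HashSet<>(textScores.keySet());
        images.addAll(visualScores.keySet());
        for (String img : images) {
            double t = normalize(textScores, img);
            double v = normalize(visualScores, img);
            combined.put(img, textWeight * t + (1.0 - textWeight) * v);
        }
        List<String> ranked = new ArrayList<>(images);
        ranked.sort(Comparator.comparingDouble((String img) -> combined.get(img)).reversed());
        return ranked;
    }

    // Min-max normalization so that scores from different systems are comparable.
    static double normalize(Map<String, Double> scores, String img) {
        if (!scores.containsKey(img)) return 0.0;
        double min = Collections.min(scores.values());
        double max = Collections.max(scores.values());
        return max == min ? 1.0 : (scores.get(img) - min) / (max - min);
    }

    public static void main(String[] args) {
        Map<String, Double> text = Map.of("img1", 12.3, "img2", 7.1, "img3", 3.4);
        Map<String, Double> visual = Map.of("img2", 0.9, "img3", 0.8, "img4", 0.4);
        System.out.println(fuse(text, visual, 0.6));
    }
}
```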

Multilingual Information Filtering (INFILE): INFILE (INformation FILtering Evaluation) extends the TREC 2002 filtering track as follows: it uses a corpus of 100,000 comparable Agence France Presse newswires in Arabic, English and French. Evaluation is performed by automatically querying the test systems with simulated user feedback; each system can use the feedback at any time to improve its performance. Test systems must provide a Boolean decision for each document and filter profile (see the sketch of such a filtering loop below). A curve of the evolution of efficiency will be computed, along with the more classical measures used in TREC. INFILE is also open to monolingual participation. Coordinators are CEA (FR), U. Lille (FR) and ELDA (FR).
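To make the setup concrete, here is a minimal, hypothetical adaptive filtering loop in Java: each incoming newswire is scored against a profile, a Boolean accept/reject decision is emitted, and the acceptance threshold is nudged by the simulated feedback. This only illustrates the protocol; it is not the INFILE client interface, and all names and numbers are invented.

```java
import java.util.*;

// Hypothetical adaptive filtering loop: score each document against a profile,
// emit a Boolean decision, and adapt the threshold from (simulated) feedback.
public class AdaptiveFilterSketch {

    private final Set<String> profileTerms;
    private double threshold;

    AdaptiveFilterSketch(Set<String> profileTerms, double initialThreshold) {
        this.profileTerms = profileTerms;
        this.threshold = initialThreshold;
    }

    // Fraction of profile terms that occur in the document.
    double score(String document) {
        Set<String> docTerms = new HashSet<>(Arrays.asList(document.toLowerCase().split("\\W+")));
        long hits = profileTerms.stream().filter(docTerms::contains).count();
        return (double) hits / profileTerms.size();
    }

    boolean decide(String document) {
        return score(document) >= threshold;
    }

    // Simulated relevance feedback: lower the threshold after a miss,
    // raise it after a false alarm.
    void feedback(boolean accepted, boolean relevant) {
        if (!accepted && relevant) threshold = Math.max(0.0, threshold - 0.05);
        if (accepted && !relevant) threshold = Math.min(1.0, threshold + 0.05);
    }

    public static void main(String[] args) {
        AdaptiveFilterSketch filter = new AdaptiveFilterSketch(
                Set.of("wind", "energy", "turbine"), 0.5);
        String[] stream = {
                "New wind turbine park announced in Brittany",
                "Football results from the weekend",
                "Energy prices rise as wind production falls"};
        boolean[] relevance = {true, false, true};
        for (int i = 0; i < stream.length; i++) {
            boolean accepted = filter.decide(stream[i]);
            System.out.println(accepted + " : " + stream[i]);
            filter.feedback(accepted, relevance[i]);   // simulated user feedback
        }
    }
}
```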

Cross-Language Video Retrieval (VideoCLEF): VideoCLEF offers classification and retrieval tasks on a video collection containing episodes of dual-language television programming. The collection will extend the Dutch/English corpus used for the 2008 VideoCLEF pilot track. Task participants will be provided with speech recognition transcripts, metadata and shot-level keyframes for the video data. Two classification tasks will be evaluated: "Subject Classification", which involves automatically tagging videos with subject labels (a sketch follows below), and "Affect and Appeal", which involves classifying videos according to characteristics beyond their semantic content. A semantic keyframe extraction task and an exercise on identifying related English-language resources to support viewer comprehension of Dutch-language video are also planned. The track is coordinated by Dublin City University (IE) and Delft University of Technology (NL).
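The Subject Classification task assigns subject labels to videos from their ASR transcripts. As a hypothetical illustration (not the track baseline), the sketch below tags a transcript with every subject whose keyword list it matches often enough; keywords and labels are invented.

```java
import java.util.*;

// Hypothetical subject tagging from an ASR transcript: assign every subject
// label whose keyword list overlaps sufficiently with the transcript.
public class SubjectTaggingSketch {

    static List<String> tag(String transcript, Map<String, Set<String>> subjectKeywords, int minHits) {
        Set<String> terms = new HashSet<>(Arrays.asList(transcript.toLowerCase().split("\\W+")));
        List<String> labels = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : subjectKeywords.entrySet()) {
            long hits = entry.getValue().stream().filter(terms::contains).count();
            if (hits >= minHits) labels.add(entry.getKey());
        }
        return labels;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> subjects = Map.of(
                "music", Set.of("orchestra", "concert", "composer", "song"),
                "politics", Set.of("parliament", "minister", "election", "policy"));
        String transcript = "the orchestra performed a new song by the young composer";
        System.out.println(tag(transcript, subjects, 2));   // expected: [music]
    }
}
```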

Intellectual Property (CLEF-IP) – New this Year: The CLEF-IP track in 2009 will use a collection of more than 1M patent documents mainly derived from EPO sources; the collection will cover English, French and German, with at least 100,000 documents in each language. Queries and relevance judgements will be produced by two methods. The first uses queries produced by intellectual property experts and reviewed by them in a fairly conventional way. The second is an automatic method using patent citations from seed patents (a sketch follows below). Search results will be reviewed to ensure that the majority of test and training queries produce results in more than one language. We will primarily report results for retrieval across all three languages. In 2009 we will stick to the Cranfield evaluation model; in subsequent years we expect to offer refined retrieval process models and assessment tools. The track is coordinated by the Information Retrieval Facility & Matrixware (AT). See projects/clef-ip09-track/
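The automatic method derives relevance judgements from patent citations: documents cited by a seed patent are treated as relevant to the query built from that patent. The hypothetical Java sketch below illustrates the idea with an invented in-memory citation map and made-up patent numbers.

```java
import java.util.*;

// Hypothetical construction of relevance judgements from patent citations:
// for a seed patent, the patents it cites are assumed relevant; optionally
// follow citations one more hop to enlarge the judged pool.
public class CitationQrelsSketch {

    static Set<String> relevantFor(String seed, Map<String, List<String>> citations, boolean secondHop) {
        Set<String> relevant = new LinkedHashSet<>(citations.getOrDefault(seed, List.of()));
        if (secondHop) {
            for (String cited : new ArrayList<>(relevant)) {
                relevant.addAll(citations.getOrDefault(cited, List.of()));
            }
        }
        relevant.remove(seed);   // a patent is never a judgement for itself
        return relevant;
    }

    public static void main(String[] args) {
        Map<String, List<String>> citations = Map.of(
                "EP-1000001", List.of("EP-0900014", "EP-0850377"),
                "EP-0900014", List.of("EP-0700123"));
        System.out.println(relevantFor("EP-1000001", citations, true));
        // expected: [EP-0900014, EP-0850377, EP-0700123]
    }
}
```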

Log File Analysis (LogCLEF) – New this Year: LogCLEF deals with the analysis of queries as an expression of user behavior. The goal is the analysis and classification of queries in order to improve search systems. LogCLEF has two tasks:
- Log Analysis and Geographic Query Identification (LAGI): The recognition of the geographic component within a query stream is a key problem for geographic information retrieval (GIR). Geographic queries require specific treatment and often a geographically oriented output (e.g. a map). The task is to (1) classify geographic queries and (2) identify their geographic and non-geographic elements. A real search engine log file and logs from The European Library (TEL) will be used. A sketch of this kind of query analysis follows below.
- Log Analysis for Digital Societies (LADS): This task uses logs from The European Library (TEL) and intends to analyze user behavior with a focus on multilingual search. The task is open to different approaches; potential targets are query reformulation, multilingual search behavior and community identification.
The coordinators are U. Hildesheim (DE), U. Padua (IT) and Mitre Corp. (US).
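As a toy illustration of the LAGI task (a tiny hand-made gazetteer stands in for a real one; this is not the track's methodology), the sketch below splits each query into geographic and non-geographic elements and classifies the query as geographic if any element matches the gazetteer.

```java
import java.util.*;

// Hypothetical geographic query identification: look up each query token in a
// small gazetteer, classify the query, and separate geo from non-geo elements.
public class GeoQuerySketch {

    static final Set<String> GAZETTEER = Set.of("vienna", "danube", "tuscany", "alps", "budapest");

    static void analyze(String query) {
        List<String> geo = new ArrayList<>();
        List<String> nonGeo = new ArrayList<>();
        for (String token : query.toLowerCase().split("\\W+")) {
            (GAZETTEER.contains(token) ? geo : nonGeo).add(token);
        }
        boolean geographic = !geo.isEmpty();
        System.out.printf("%-30s geographic=%-5b geo=%s non-geo=%s%n",
                query, geographic, geo, nonGeo);
    }

    public static void main(String[] args) {
        analyze("castles near the Danube");
        analyze("old maps of Tuscany");
        analyze("impressionist paintings");
    }
}
```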

Grid Experiments: Multilingual information access (MLIA) is increasingly part of many complex systems, such as digital libraries, enterprise portals and Web search engines. But do we really know how MLIA components (stemmers, IR models, relevance feedback, translation techniques, etc.) behave with respect to languages? This track launches a cooperative effort in which a series of large-scale, systematic grid experiments will aim at improving our comprehension of MLIA systems and at gaining an exhaustive picture of their behaviour with respect to languages. Participants will be asked to take part in a series of experiments that have been carefully designed to ensure that the tested MLIA components are really comparable across groups, so that differences come only from the languages and tasks at hand. A sketch of such a component/language grid is given below. Task coordinators are U. Padua (IT) and NIST (US). Details are available on the track website.
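A grid experiment in this sense crosses every choice of component with every language, so that each factor varies in isolation. The hypothetical Java sketch below merely enumerates such a grid of runs; the component names are examples and evaluate() is an invented placeholder for a real indexing-and-evaluation run.

```java
import java.util.*;

// Hypothetical enumeration of a grid of MLIA experiments: every combination of
// stemmer, IR model and language is run once, so effects can be isolated.
public class GridExperimentSketch {

    // Placeholder for a real evaluation run; returns a deterministic fake MAP score.
    static double evaluate(String stemmer, String model, String language) {
        return new Random((stemmer + model + language).hashCode()).nextDouble();
    }

    public static void main(String[] args) {
        List<String> stemmers = List.of("none", "light", "aggressive");
        List<String> models = List.of("BM25", "LM-Dirichlet");
        List<String> languages = List.of("en", "fr", "de");

        for (String stemmer : stemmers) {
            for (String model : models) {
                for (String language : languages) {
                    double map = evaluate(stemmer, model, language);
                    System.out.printf("%-10s %-12s %-3s MAP=%.3f%n", stemmer, model, language, map);
                }
            }
        }
    }
}
```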

- 16 tracks – 13 groups
- Exchanges of people between groups are possible
- Java
- Web architecture