Thomas Mandl: GeoCLEF Track Overview 2007. Cross-Language Evaluation Forum (CLEF), 8th Workshop. Thomas Mandl (U. Hildesheim).


GeoCLEF Track Overview: Query Classification and Geographic Search
Thomas Mandl (U. Hildesheim)
8th Workshop of the Cross-Language Evaluation Forum (CLEF), Budapest, 20 Sept 2007

GeoCLEF Administration
Joint effort of
– Fredric Gey (U. California at Berkeley)
– Diana Santos (Linguateca, SINTEF ICT, Norway)
– Mark Sanderson (U. Sheffield)
– Nicola Ferro, Giorgio Di Nunzio (U. Padua)
– Thomas Mandl, Christa Womser-Hacker (U. Hildesheim)
– and many others...

Geographic Information Systems

Initial Aim of GeoCLEF
Aim: to evaluate retrieval of multilingual documents with an emphasis on geographic search (GIR)
– "find me news stories about riots near Dublin" (Fred Gey, CLEF Workshop 2005)

Participation
Number of participants and submitted experiments per CLEF year (table)
New: Query Classification Task with 6 participants

Search Task 2007
– Three languages
– 600,000+ documents
– 25 topics (75 in three years now)
– Intention behind topics: geographically challenging

Reliability?
25 topics are sufficient under most circumstances to reliably order systems (Sanderson & Zobel 2005)

Partial Swap Rate Analysis
Average correlation between system rankings of the full and partial topic sets for the German monolingual task 2006 (chart)
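The correlation in such an analysis can be reproduced in outline: score each system per topic, rank the systems by mean score on the full topic set and on a random partial set, then correlate the two orderings. A minimal sketch with Kendall's tau (ties ignored); the systems and per-topic scores below are made up for illustration, not track data.

```python
import random
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall rank correlation between two equal-length score lists
    (tau-a: pairs tied in either list are ignored)."""
    assert len(xs) == len(ys)
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Illustrative per-topic average precision for 4 systems over 10 topics.
random.seed(0)
systems = {name: [random.random() for _ in range(10)] for name in "ABCD"}

full = [sum(aps) / len(aps) for aps in systems.values()]
subset_idx = random.sample(range(10), 5)  # a partial topic set
partial = [sum(aps[i] for i in subset_idx) / len(subset_idx)
           for aps in systems.values()]

print(kendall_tau(full, partial))  # correlation of the two system rankings
```

A correlation near 1.0 means the partial topic set orders systems almost like the full set, which is the stability question the swap-rate analysis asks.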

Search Task
How much geo knowledge and reasoning is necessary, and of which kind? Each year, keyword-based systems do well on the task.

Query Classification Task
Goal: find geo queries in a log of real queries
New in 2007
Organized by Xing Xie (Microsoft Research Asia, Beijing, China)

Data
Query log from the MSN search engine
– in English
– queries collected in August 2006
– 500 queries were labelled and used for evaluation: 100 queries for training, 400 for testing

Task
Find queries with a geographic scope
– Extract the where component
– Extract the geo-relation type
– Extract the what component
– Classify the what type {information, yellow page, map}
Example: "Lottery in Florida" → local: YES, what: lottery (information), geo-relation: in, where: Florida, US

Geo-relation types
27 classes. Examples:
– in
– on
– near
– along
– distance
– north_of
– north_west_of
– north_to
– …
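A rule-based baseline for this what / geo-relation / where decomposition can be sketched as follows. The gazetteer and relation list here are tiny illustrative stand-ins for a real gazetteer and the 27 relation classes; the function name and dictionaries are assumptions, not the track's reference implementation.

```python
# Minimal rule-based geo-query parser: split a query into
# what / geo-relation / where, in the spirit of the task above.
GAZETTEER = {"florida": "Florida, US", "dublin": "Dublin, IE"}
# Multi-word relations listed first so they match before their suffixes.
RELATIONS = ["north west of", "north of", "near", "along", "in", "on"]

def parse_geo_query(query):
    tokens = query.lower().split()
    for i in range(len(tokens)):
        for rel in RELATIONS:
            rel_toks = rel.split()
            j = i + len(rel_toks)
            # Relation found and the remainder is a known location?
            if tokens[i:j] == rel_toks and " ".join(tokens[j:]) in GAZETTEER:
                return {"local": True,
                        "what": " ".join(tokens[:i]),
                        "geo_relation": rel,
                        "where": GAZETTEER[" ".join(tokens[j:])]}
    return {"local": False}

print(parse_geo_query("lottery in florida"))
# {'local': True, 'what': 'lottery', 'geo_relation': 'in', 'where': 'Florida, US'}
```

Such rule-plus-gazetteer pipelines mirror the approaches the participating groups reported, though real systems must also handle implicit locations and ambiguous place names.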

Evaluation Set
1. Choose 800 queries randomly from the query set.
2. Manually remove the typos and the ambiguous queries from these 800.
3. Manually select the queries with special geo-relations from the remaining queries in the query set and add them to the evaluation set.
4. Select 500 queries for the final evaluation set.
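The four steps can be sketched as a small pipeline; the two filter callbacks stand in for the manual typo/ambiguity checks and the special-geo-relation judgment, which were done by hand in the track.

```python
import random

def build_evaluation_set(query_log, is_clean, has_special_georelation,
                         sample_size=800, final_size=500, seed=42):
    """Sketch of the 4-step construction: random sample, cleaning,
    enrichment with special geo-relations, final selection."""
    rng = random.Random(seed)
    sample = rng.sample(query_log, min(sample_size, len(query_log)))  # step 1
    kept = [q for q in sample if is_clean(q)]                          # step 2
    rest = [q for q in query_log if q not in sample]
    kept += [q for q in rest if has_special_georelation(q)]            # step 3
    rng.shuffle(kept)
    return kept[:final_size]                                           # step 4

# Toy demonstration with a synthetic log.
log = [f"q{i}" for i in range(1000)]
subset = build_evaluation_set(log,
                              is_clean=lambda q: True,
                              has_special_georelation=lambda q: q.endswith("7"),
                              sample_size=80, final_size=50)
print(len(subset))  # 50
```

The enrichment step (3) is what gives rare relation types like north_of enough coverage in the 500-query set to be measurable at all.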

Query Types in Evaluation Set (chart)

Evaluation Metrics
Three assessors
– individually assessed all system answers
– reached an agreement
Only fully correctly classified query instances count
Recall, precision, and the combined F1 score
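Since only fully correct classifications count, precision, recall, and F1 follow the usual definitions over those counts. A small sketch with illustrative numbers (not track results):

```python
def precision_recall_f1(correct, returned, relevant):
    """correct: fully correctly classified queries returned by the system;
    returned: all geo queries the system returned;
    relevant: geo queries in the gold standard."""
    p = correct / returned if returned else 0.0
    r = correct / relevant if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

p, r, f1 = precision_recall_f1(correct=120, returned=200, relevant=300)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")  # P=0.60 R=0.40 F1=0.48
```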

Results
Precision, recall, and F1 per team: Ask, Csusm, Linguit, Miracle, Talp, Xldb (table)

Approaches
Gazetteers for location identification
– Large base of geo names
Pre-defined rules
Issues
– Low performance
– Few training instances for many geo-relation types

More on GeoCLEF
Parallel session on Friday, 9:00 – 10:30
Please come to the breakout session on Friday, 11:00 – 12:00, and help us to promote GeoCLEF

Overview
More on the geographic search task …
– Mark Sanderson: Topic Development and Relevance Assessment
– Giorgio Di Nunzio: Results
– Diana Santos: Approaches and Interpretation