The XLDB Group at GeoCLEF 2005 Nuno Cardoso, Bruno Martins, Marcírio Chaves, Leonardo Andrade, Mário J. Silva



XLDB Group at GeoCLEF
Tumba! Portuguese Web search engine
  - Public service since 2002
  - See it in action at
GREASE – Geographic Reasoning for Search Engines
The result: GeoTumba (under development)

Definitions, Assumptions and Approach
Geo-scope = footprint = focus = …
Documents have geo-scopes
  - One sense per discourse assumption
Queries have geo-scopes
GeoIR: similarity considering undifferentiated terms + geo-scopes

Geo-IR architecture
  - Data loading
  - Indexing and Mining
  - Geo-Retrieval

GKB – Geographic Knowledge Base
Information sources used in the instance built for GeoCLEF:
  - Wikipedia – names of countries and capitals in four languages
  - World Gazetteer – cities and agglomerations with population > 100,000
  - Statistics: 12,293 features, 14,759 relationships
TGN planned, but licensing was completed only after the runs were submitted
[Figure: feature types & relationships]
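
A knowledge base of features and relationships can be pictured as a small typed graph. The following is only a minimal sketch under stated assumptions: the class names, the feature type labels, and the "part-of" relation are hypothetical illustrations, not the actual GKB schema, which the slide does not spell out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Feature:
    """A named geographic feature; the type labels are hypothetical."""
    feature_id: int
    name: str
    feature_type: str  # e.g. "country", "capital", "city"


@dataclass
class GeoKnowledgeBase:
    """Toy in-memory store of features and typed relationships."""
    features: dict = field(default_factory=dict)
    # (source_id, relation, target_id) triples, e.g. (2, "part-of", 1)
    relationships: set = field(default_factory=set)

    def add_feature(self, feature):
        self.features[feature.feature_id] = feature

    def relate(self, source_id, relation, target_id):
        self.relationships.add((source_id, relation, target_id))

    def ancestors(self, feature_id, relation="part-of"):
        """Follow `relation` edges upward; return the names of features reached."""
        found, frontier = [], [feature_id]
        seen = {feature_id}
        while frontier:
            current = frontier.pop()
            for src, rel, dst in sorted(self.relationships):
                if src == current and rel == relation and dst not in seen:
                    seen.add(dst)
                    found.append(self.features[dst].name)
                    frontier.append(dst)
        return found


gkb = GeoKnowledgeBase()
gkb.add_feature(Feature(1, "Portugal", "country"))
gkb.add_feature(Feature(2, "Lisbon", "capital"))
gkb.relate(2, "part-of", 1)  # assumed relation type
print(gkb.ancestors(2))
```

Storing relationships as triples keeps the sketch agnostic about which relationship types exist; a real instance with thousands of features would index them rather than scan the whole set.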

Text Mining (CAGE)
  1. Identification of Geo-references
  2. Scope Assignment
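
The two steps above can be sketched as a tiny pipeline. This is an illustration under assumptions, not CAGE itself: step 1 is approximated by a plain gazetteer lookup over single-word names, step 2 by picking the most frequently mentioned place, and `GAZETTEER`, `find_geo_references` and `assign_scope` are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical gazetteer: lower-cased place names known to the system.
GAZETTEER = {"lisbon", "porto", "portugal", "madrid"}


def find_geo_references(text, gazetteer=GAZETTEER):
    """Step 1: return every word token that matches a gazetteer name."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [token for token in tokens if token in gazetteer]


def assign_scope(geo_references):
    """Step 2: choose one scope per document, here the most frequent name.

    Returns None when the document mentions no known place.
    """
    if not geo_references:
        return None
    counts = Counter(geo_references)
    best_count = max(counts.values())
    # Break ties alphabetically so the result is deterministic.
    return min(name for name, count in counts.items() if count == best_count)


doc = "Lisbon hosted the summit. Delegates from Madrid praised Lisbon."
refs = find_geo_references(doc)
print(refs, assign_scope(refs))
```

Assigning a single scope per document mirrors the one-sense-per-discourse assumption; multi-word names and ambiguous names would need more than this token lookup.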

GeoTumba Software Configuration used at GeoCLEF 2005
QuerCol
  - Automatic Query Expansion component developed for the ad hoc task
  - Transforms CLEF topics into queries
2 Scope assignment algorithms
  - Most frequent geographic term
  - Graph Rank
Ranking by
  - Primary: geographic similarity
  - Secondary: simplified BM25 similarity on non-differentiated terms
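
The primary/secondary ranking can be sketched as a sort on a two-part key. Everything here is an assumption made for illustration: `geo_similarity` is a made-up measure (1.0 for the same scope, 0.5 for a scope directly contained in the query scope, 0.0 otherwise), and `bm25` is the textbook Okapi formula, not necessarily the simplified variant the slide refers to.

```python
import math
from collections import Counter


def bm25(query_terms, doc_terms, doc_freq, n_docs, avg_len, k1=1.2, b=0.75):
    """Textbook Okapi BM25 score of one document for the query terms."""
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def geo_similarity(query_scope, doc_scope, part_of):
    """Hypothetical measure: 1.0 same scope, 0.5 if contained, else 0.0."""
    if doc_scope is None or query_scope is None:
        return 0.0
    if doc_scope == query_scope:
        return 1.0
    return 0.5 if part_of.get(doc_scope) == query_scope else 0.0


def rank(docs, query_terms, query_scope, part_of):
    """Sort by geographic similarity first, BM25 second (both descending)."""
    n_docs = len(docs)
    avg_len = sum(len(d["terms"]) for d in docs) / n_docs
    doc_freq = Counter(t for d in docs for t in set(d["terms"]))

    def key(d):
        return (geo_similarity(query_scope, d["scope"], part_of),
                bm25(query_terms, d["terms"], doc_freq, n_docs, avg_len))

    return [d["id"] for d in sorted(docs, key=key, reverse=True)]


docs = [
    {"id": "a", "terms": ["wine", "wine", "festival"], "scope": "madrid"},
    {"id": "b", "terms": ["wine", "harvest"], "scope": "porto"},
    {"id": "c", "terms": ["wine", "wine", "export"], "scope": "portugal"},
]
print(rank(docs, ["wine"], "portugal", {"porto": "portugal"}))
```

Because the geographic score is the first element of the sort key, the text score only breaks ties among documents with equal geographic similarity. A ranking of this shape is therefore sensitive to scopes that are missing from the knowledge base: a document with no assigned scope falls to the bottom however well its text matches.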

Evaluation Goals
  - Scope Ranking: compare ranking with scopes vs. use of geographic terms as non-differentiated query terms
  - Scope Assigning: compare most frequent geographic term vs. graph algorithm
  - Location Terms Expansion: evaluate geographic terms expansion
  - Topic Translation: evaluate EBMT technique


Results
Scope Ranking: geo-scopes not as successful
  - We blame the assembled ontology
Scope Assigning: graph algorithm always better
  - Scopes as expected, given the ontology
Location Terms Expansion: manually generated queries performed better
Topic Translation: monolingual runs better than bilingual

Final Observations and Future Work
Amount and quality of geographic knowledge have a strong influence on GeoIR performance
Meaning of “geographic relevance” needs clarification
Future
  - Better ranking algorithm, less sensitive to the absence of geographic terms in the ontology.
  - Evaluate the effect of varying geographic knowledge density on retrieval performance.
  - Study the semantics of geographic relationships in queries.