6 ~ GIR.


Motivation
The former GIR system concentrated on capturing and handling geonames and their associated features. It ignored other terms with important geographic connotations: spatial relationships (in, near, on the shore of, etc.) and feature types (cities, mountains, airports, etc.). After disambiguating the geonames, it used a graph-ranking algorithm to analyse the captured features and assign one single feature as the scope of each document. Other partial geographic contexts of the document were ignored, and incorrectly assigned scopes often led to poor results.

Problem Definition
The query processing module was rebuilt so that all geographic information present in a query is captured, giving special attention to feature types and spatial relationships as guides for the geographic query expansion. Text mining methods are used to capture, extract and disambiguate geonames from text, so that a geographic scope can be inferred for each document.

Objective
Generate geographic signatures for both queries (QSig) and documents (DSig). A DSig is generated for each document by a text mining module. A QSig is generated by a geographic query expansion module, which focuses on features, feature types and spatial relationships. The overall aim is to improve geographic ranking.

New Architecture of Geographic IR
Topic titles are used as the query string.

Geographic Ontology
The system uses GKB 2.0 (Geographic Knowledge Base), and all modules rely on the geographic ontology. GKB 2.0 offers support for relationships between features and feature types, a better property assignment for features and feature types, and better control of information sources. The physical domain was enriched with new feature types (airports, circuits and mountains), along with their instances.

Statistics of the Geographic Ontology

Query Processing (1)
A geographic query parsing module, helped by the geographic ontology and manually crafted context rules, splits each query into <what, spatial relation, where> and recognizes features and feature types.
Features (ISO-19109): a feature is an unambiguous location that can be described by one or more placenames. Example: Paris.
Feature types (ISO-19109): classes of features. Examples: island, mountain, lake (physical); city, continent, NUTS-3 (administrative). A feature has only one feature type.
Relations: links joining features or feature types, such as part-of, adjacent and capital-of. Examples: [Oslo] part-of [Norway], [city] part-of [country].
Example query: Ship traffic in Portuguese island

Query Processing (2)
Two expansions are performed. Term expansion expands the thematic part (what) using Blind Relevance Feedback. Geographic expansion expands the geographic part (where); it depends on the query type and is driven by the spatial relationship, the features and the feature types.
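
Blind relevance feedback on the thematic part can be sketched roughly as follows. This is a minimal illustration, not the system's implementation: the slides do not say how expansion terms are weighted, so raw term frequency in the top-ranked documents is an assumption, and the function name, stopword list and parameters are hypothetical.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "in", "of", "a", "and", "to"}

def blind_relevance_feedback(query_terms, ranked_docs, top_docs=2, new_terms=2):
    """Expand a query with frequent terms from the top-ranked documents.

    Assumption: terms are scored by raw frequency; the slides do not
    describe the actual weighting scheme.
    """
    query = [t.lower() for t in query_terms]
    counts = Counter()
    # Treat the first `top_docs` results of the initial run as relevant.
    for doc in ranked_docs[:top_docs]:
        for token in doc.lower().split():
            if token not in query and token not in STOPWORDS:
                counts[token] += 1
    expansion = [term for term, _ in counts.most_common(new_terms)]
    return query + expansion

initial_ranking = [
    "ship traffic harbour ferry harbour",
    "ferry harbour ship cargo",
    "mountain hiking",
]
expanded = blind_relevance_feedback(["Ship", "traffic"], initial_ranking)
print(expanded)
```

Only the top-ranked documents feed the expansion, so the third document in the toy ranking contributes nothing.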

Query Processing (3)
Example: CLEF topic #74, "Ship traffic in Portuguese island"
Ship traffic: thematic part (what)
in: spatial relationship
Portuguese: feature (grounded geoname)
island: feature type
The recognized geographic terms are mapped into the corresponding ontological concepts.
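
The split of topic #74 into <what, spatial relation, where> might look like the sketch below. The vocabularies are hypothetical stand-ins: in the system described here they come from the geographic ontology and the hand-crafted context rules, which the slides do not list.

```python
import re

# Hypothetical toy vocabularies (assumptions, not taken from the ontology).
SPATIAL_RELATIONS = ["on the shore of", "near", "in"]
FEATURE_TYPES = {"island", "city", "mountain", "airport"}
FEATURES = {"portuguese": "Portugal", "portugal": "Portugal", "lisbon": "Lisbon"}

def parse_geo_query(query):
    """Split a query into thematic part, spatial relation and geographic part."""
    lowered = query.lower()
    # Try longer relations first so "on the shore of" wins over "in".
    for relation in sorted(SPATIAL_RELATIONS, key=len, reverse=True):
        match = re.search(r"\b" + re.escape(relation) + r"\b", lowered)
        if match:
            what = query[:match.start()].strip()
            where = query[match.end():].strip()
            break
    else:
        # No spatial relation found: the whole query is thematic.
        return {"what": query.strip(), "relation": None,
                "features": [], "feature_types": []}
    tokens = where.lower().split()
    return {
        "what": what,
        "relation": relation,
        # Ground geonames to ontology concepts and pick out feature types.
        "features": [FEATURES[t] for t in tokens if t in FEATURES],
        "feature_types": [t for t in tokens if t in FEATURE_TYPES],
    }

parsed = parse_geo_query("Ship traffic in Portuguese island")
print(parsed)
```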

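Geographic QE
Three example queries: 1. Ship traffic in Portugal; 2. Ship traffic in island; 3. Ship traffic in Portuguese island.
[Figure: ontology tree with Europe at the top, UK and Portugal below it, and London, Lisbon, Isle of Wight, Isle of Man, São Miguel Island and Madeira Island at the bottom; the labels 1, 2 and 3 mark the three example queries on the tree.]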

Geographic QE (2)
Scope of interest: all geographic concepts of type island that are part of Portugal.
QSig: São Miguel, Madeira, Santa Maria, Formigas, Terceira, Graciosa, São Jorge, Pico, Faial, Flores, Corvo, Porto Santo, Desertas and Selvagens
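
The expansion of a <feature, feature type> pair into a query signature can be illustrated with a toy ontology. The sketch below is only an assumption about the mechanics: the miniature ontology copies the few concepts named in the slides, and the function names are hypothetical.

```python
# Hypothetical miniature ontology: concept -> (feature type, part-of parent).
# The hierarchy follows the example tree in the slides, not GKB 2.0 itself.
ONTOLOGY = {
    "Europe": ("continent", None),
    "UK": ("country", "Europe"),
    "Portugal": ("country", "Europe"),
    "London": ("city", "UK"),
    "Lisbon": ("city", "Portugal"),
    "Isle of Wight": ("island", "UK"),
    "Isle of Man": ("island", "UK"),
    "São Miguel": ("island", "Portugal"),
    "Madeira": ("island", "Portugal"),
}

def is_part_of(concept, ancestor):
    """Follow part-of links upward until the ancestor or the root is reached."""
    parent = ONTOLOGY[concept][1]
    while parent is not None:
        if parent == ancestor:
            return True
        parent = ONTOLOGY[parent][1]
    return False

def query_signature(feature, feature_type):
    """Return all concepts of `feature_type` that are part of `feature`."""
    return [
        concept
        for concept, (ftype, _) in ONTOLOGY.items()
        if ftype == feature_type and is_part_of(concept, feature)
    ]

qsig = query_signature("Portugal", "island")
print(qsig)
```

With the full ontology, the same lookup would return the fourteen islands listed in the QSig above.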

Term Expansion (1) ~ Blind Relevance Feedback ~ Before Relevance Feedback

Term Expansion (2) ~ Blind Relevance Feedback ~ After Relevance Feedback

Text Mining (1)
Text mining relies on a gazetteer of text patterns generated from the geographic ontology. The gazetteer contains all concepts, represented by their feature name and respective feature type, in two pattern forms: [<feature type> <feature name>] and [<feature type> $ <feature name>]. Documents are parsed for geonames using these patterns, generating the DSig.
Examples: Lisbon Airport, Airport of Lisbon

Text Mining (2)
Gazetteer (left side: text pattern; right side: identifier of the geographic concept in the ontology):
city $ Lisbon: 1
Lisbon city: 1
district $ Lisbon: 2
Lisbon district: 2
Street $ Lisbon: 3
Lisbon Street: 3
(...)
Lisbon: 1,2,3,(...)
Document signatures (each entry is ID[ConfMeas], with ConfMeas normalized into [0,1]):
LA072694-0011: 5668[1.00]; 2230[0.33]; 4555[0.33]; 4556[0.33]; 4557[0.33]
LA072694-0012: 5388[1.00]; 5389[1.00]; 5390[1.00]; 12097[1.00]; 6653[0.67]
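
Matching gazetteer patterns against a document to build a DSig could be sketched as below. Two points are assumptions rather than facts from the slides: the "$" placeholder is treated as the connector word "of" (as in "Airport of Lisbon"), and the confidence measure is obtained by dividing each concept's match count by the highest count in the document. The function names are hypothetical.

```python
import re
from collections import Counter

# Toy gazetteer copied from the slide: text pattern -> ontology concept IDs.
GAZETTEER = {
    "city $ Lisbon": [1],
    "Lisbon city": [1],
    "district $ Lisbon": [2],
    "Lisbon district": [2],
    "Street $ Lisbon": [3],
    "Lisbon Street": [3],
    "Lisbon": [1, 2, 3],
}

def pattern_to_regex(pattern):
    # Assumption: "$" stands for the connector word "of".
    parts = ["of" if tok == "$" else re.escape(tok) for tok in pattern.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def document_signature(text):
    """Return {concept ID: confidence in [0, 1]} for one document."""
    counts = Counter()
    remaining = text
    # Longest patterns first, so "city of Lisbon" is not also counted
    # as a bare, ambiguous "Lisbon".
    for pattern in sorted(GAZETTEER, key=len, reverse=True):
        regex = pattern_to_regex(pattern)
        hits = regex.findall(remaining)
        if hits:
            for concept_id in GAZETTEER[pattern]:
                counts[concept_id] += len(hits)
            remaining = regex.sub(" ", remaining)
    if not counts:
        return {}
    top = max(counts.values())
    # Assumed normalization: divide by the highest count in the document.
    return {cid: round(n / top, 2) for cid, n in counts.items()}

dsig = document_signature(
    "The city of Lisbon is large. Lisbon district covers more. "
    "The city of Lisbon grew."
)
print(dsig)
```

A bare "Lisbon" stays ambiguous in this sketch and counts towards all three concepts, mirroring the "Lisbon: 1,2,3" gazetteer entry.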

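Sidra
Sidra5 is a text indexing and ranking module with geographic capabilities, based on MG4J. It generates a geographic index and a term index. Ranking of the document results is based on the expanded query terms and the query signature on the query side, and on the geographic index and the term index on the document side.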

Flow chart of searches in Sidra5

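GeoScore
Geographic Similarity, GeoSim(s1,s2), combines four measures:
Ontology Similarity, OntSim(s1,s2): 50%
Spatial Distance Similarity, DistSim(s1,s2)
Spatial Adjacency Similarity, AdjSim(s1,s2)
Population Similarity, PopSim(s1,s2)
The weights shown for the remaining three measures are 20%, 20% and 10%. The diagram leads from GeoSim(s1,s2) to the Geographic Score, GeoScore(s1,s2).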
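
A weighted combination of the four similarity measures might be computed as in this sketch. The slide places 50% beside Ontology Similarity; assigning 20% each to distance and adjacency and 10% to population is an assumption, since the flattened diagram does not show which weight belongs to which measure. The function name and the input values are hypothetical.

```python
# Assumed weights: only the 50% for ontology similarity is clear from the
# slide; the split of 20%, 20% and 10% across the others is a guess.
WEIGHTS = {"ont": 0.5, "dist": 0.2, "adj": 0.2, "pop": 0.1}

def geo_sim(ont_sim, dist_sim, adj_sim, pop_sim):
    """Combine four similarities in [0, 1] into one geographic similarity."""
    values = {"ont": ont_sim, "dist": dist_sim, "adj": adj_sim, "pop": pop_sim}
    for name, value in values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} similarity must lie in [0, 1]")
    return sum(WEIGHTS[name] * value for name, value in values.items())

# Two scopes that are close in the ontology but not adjacent.
score = geo_sim(ont_sim=1.0, dist_sim=0.5, adj_sim=0.0, pop_sim=0.0)
print(round(score, 2))
```

Because the weights sum to 1, the combined value stays in [0, 1] whenever the inputs do.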

Geographic Score ~ GeoScore(s1,s2)

Example Computing of GeoScore

Document Scoring Textual Scoring

Experiment Type

Experiment Result
[Table: MAP results for the IR, GIR, and IR / GIR runs]

Conclusion
The best experiment setup is to generate an initial run with classic text retrieval and then use the full geographic ranking modules to generate the final run. The GIR system is very dependent on the quality of the geographic ontology and has some limitations in the text mining step.

Thank you