1
6 ~ GIR
2
Motivation
Former GIR:
Captured and handled only geonames and their associated features.
Ignored other terms with important geographic connotations: spatial relationships (in, near, on the shore of, etc.) and feature types (cities, mountains, airports, etc.).
After geoname disambiguation, a graph-ranking algorithm analysed the captured features and assigned one single feature as the scope of each document.
Other partial geographic contexts of the document were ignored.
Incorrectly assigned scopes often led to poor results.
3
Problem Definition
Rebuilt the query processing module:
All geographic information present in a query is captured, giving special attention to feature types and spatial relationships as guides for geographic query expansion.
Text mining methods capture and disambiguate geonames extracted from text, so that a geographic scope can be inferred for each document.
4
Objective
Generation of geographic signatures for both queries (QSig) and documents (DSig).
DSig is generated for each document by a text mining module.
QSig is generated by a geographic query expansion module.
Geographic query expansion focuses on features, feature types and spatial relationships.
Improvement of geographic ranking.
5
New Architecture of Geographic IR
Topic titles as query string
6
Geographic Ontology
Uses GKB 2.0 (Geographic Knowledge Base).
All modules rely on the geographic ontology, which provides:
support for relationships between features and feature types
better property assignment for features and feature types
better control of information sources
enrichment of the physical domain, with the addition of new feature types (airports, circuits and mountains) along with their instances
7
Statistics of the Geographic Ontology
8
Query Processing (1)
Geographic query parsing module, with the help of the geographic ontology and manually crafted context rules:
Splits the query into <what, spatial relation, where>.
Recognizes features and feature types.
Features (ISO-19109): unambiguous locations. A feature can be described by one or more placenames. Example: Paris.
Feature types (ISO-19109): classes of features. Examples: island, mountain, lake (physical); city, continent, NUTS-3 (administrative). A feature has only one feature type.
Relations: links joining features or feature types, such as part-of, adjacent, capital-of. Examples: [Oslo] part-of [Norway], [city] part-of [country].
Example: Ship traffic in Portuguese island
9
Query Processing (2)
Performs two expansions:
Term expansion: expands the thematic part ~ what ~ through blind relevance feedback.
Geographic expansion: expands the geographic part ~ where ~ based on the query type, driven by spatial relationship, feature and feature type.
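The slide names blind relevance feedback without giving details, so the following is only a minimal sketch of the general technique: take the top-ranked documents of an initial run, assume they are relevant, and add their most frequent new terms to the query. The function name, the parameters `top_k` and `n_terms`, and the toy documents are all assumptions made for illustration; they do not come from the presentation.

```python
from collections import Counter


def blind_relevance_feedback(query_terms, ranked_docs, top_k=2, n_terms=2):
    """Return the query extended with frequent terms from the top_k documents.

    ranked_docs is a list of token lists, best-ranked first (hypothetical input).
    """
    query = [t.lower() for t in query_terms]
    counts = Counter()
    # Count only terms that are not already part of the query.
    for doc in ranked_docs[:top_k]:
        counts.update(t.lower() for t in doc if t.lower() not in query)
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return query + expansion


# Toy initial run: the third document is ignored because top_k is 2.
ranked_docs = [
    ["ship", "ferry", "port", "traffic", "ferry"],
    ["ferry", "port", "harbour"],
    ["football", "match"],
]
expanded = blind_relevance_feedback(["ship", "traffic"], ranked_docs)
print(expanded)
```

A real system would also remove stopwords and weight terms by more than raw frequency; this sketch leaves both out.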
10
Query Processing (3)
Example: CLEF topic #74, "Ship traffic in Portuguese island"
Ship traffic: thematic part ~ what
in: spatial relationship
Portuguese: feature ~ grounded geoname
island: feature type
Each part is mapped into the corresponding ontological concept.
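The split into <what, spatial relation, where> can be sketched as follows. This is an illustrative toy, not the system's parser: the lookup tables `SPATIAL_RELATIONS`, `FEATURES` and `FEATURE_TYPES` are hypothetical stand-ins for the geographic ontology and the hand-crafted context rules mentioned on the slides.

```python
# Hypothetical lookup tables standing in for the ontology and context rules.
SPATIAL_RELATIONS = ["on the shore of", "near", "in"]
FEATURES = {"portuguese": "Portugal", "portugal": "Portugal", "lisbon": "Lisbon"}
FEATURE_TYPES = {"island", "city", "airport", "mountain"}


def parse_query(query):
    """Split a query into thematic part, spatial relation, features and feature types."""
    tokens = query.lower().split()
    for relation in SPATIAL_RELATIONS:
        rel_tokens = relation.split()
        n = len(rel_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == rel_tokens:
                where = tokens[i + n:]
                return {
                    "what": " ".join(tokens[:i]),
                    "relation": relation,
                    "features": [FEATURES[t] for t in where if t in FEATURES],
                    "feature_types": [t for t in where if t in FEATURE_TYPES],
                }
    # No spatial relation found: treat the whole query as thematic.
    return {"what": " ".join(tokens), "relation": None,
            "features": [], "feature_types": []}


parsed = parse_query("Ship traffic in Portuguese island")
print(parsed)
```

For the example topic, the sketch yields "ship traffic" as the thematic part, "in" as the relation, Portugal as the grounded feature and island as the feature type, matching the breakdown on the slide.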
11
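Geographic QE
1. Ship Traffic in Portugal
2. Ship Traffic in island
3. Ship Traffic in Portuguese island
[Figure: ontology fragment with the nodes Europe, UK, Portugal, London, Lisbon, Isle of Wight, Isle of Man, São Miguel Isl. and Madeira Isl.; regions labelled 1, 2 and 3 correspond to the three queries above.]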
12
Geographic QE (2)
Scope of interest: all geographic concepts of type island that are part of Portugal.
QSig: São Miguel, Madeira, Santa Maria, Formigas, Terceira, Graciosa, São Jorge, Pico, Faial, Flores, Corvo, Porto Santo, Desertas and Selvagens
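The expansion step can be illustrated with a toy ontology. The sketch below assumes a simple part-of hierarchy built from the place names that appear on the Geographic QE slides; the dictionary layout, the hierarchy itself and the function names are assumptions for illustration, not the structure of GKB 2.0.

```python
# Toy ontology fragment; names come from the slides, the structure is assumed.
ONTOLOGY = {
    "Europe": {"type": "continent", "part_of": None},
    "UK": {"type": "country", "part_of": "Europe"},
    "Portugal": {"type": "country", "part_of": "Europe"},
    "London": {"type": "city", "part_of": "UK"},
    "Lisbon": {"type": "city", "part_of": "Portugal"},
    "Isle of Wight": {"type": "island", "part_of": "UK"},
    "Isle of Man": {"type": "island", "part_of": "UK"},
    "São Miguel": {"type": "island", "part_of": "Portugal"},
    "Madeira": {"type": "island", "part_of": "Portugal"},
}


def is_part_of(name, ancestor):
    """Follow part-of links upwards and report whether ancestor is reached."""
    parent = ONTOLOGY[name]["part_of"]
    while parent is not None:
        if parent == ancestor:
            return True
        parent = ONTOLOGY[parent]["part_of"]
    return False


def expand(feature=None, feature_type=None):
    """Return the concepts matching the feature type and lying inside the feature."""
    return [
        name for name, concept in ONTOLOGY.items()
        if (feature_type is None or concept["type"] == feature_type)
        and (feature is None or is_part_of(name, feature))
    ]


qsig = expand(feature="Portugal", feature_type="island")
print(qsig)
```

Calling `expand(feature="Portugal")` or `expand(feature_type="island")` gives the broader scopes of the first two example queries, while combining both constraints narrows the signature to the Portuguese islands in the toy data.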
13
Term Expansion (1) ~ Blind Relevance Feedback ~
Before Relevance Feedback
14
Term Expansion (2) ~ Blind Relevance Feedback ~
After Relevance Feedback
15
Text Mining (1)
Relies on a gazetteer of text patterns generated from the geographic ontology.
The gazetteer contains all concepts, represented by their feature name and respective feature type: [<feature type> <feature name>] and [<feature type> $ <feature name>].
Documents are parsed for geonames to generate the DSig.
Example: Lisbon Airport, Airport of Lisbon
16
Text Mining (2)
Gazetteer:
city $ Lisbon: 1
Lisbon city: 1
district $ Lisbon: 2
Lisbon district: 2
Street $ Lisbon: 3
Lisbon Street: 3
(...)
Lisbon: 1,2,3,(...)
Left side: text pattern. Right side: identifier of the geographic concept in the ontology.
DSig entries, written as ID[ConfMeas] with ConfMeas normalized into [0,1]:
LA : 5668[1.00]; 2230[0.33]; 4555[0.33]; 4556[0.33]; 4557[0.33]
LA : 5388[1.00]; 5389[1.00]; 5390[1.00]; 12097[1.00]; 6653[0.67]
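A minimal sketch of how such a gazetteer could be generated and matched against text is given below. It assumes that `$` stands for a single connecting word (as in "Airport of Lisbon") and that the most specific matching pattern should win; both are assumptions, as are the function names. The confidence measure is not computed here because the slides do not say how it is derived.

```python
import re

# (concept id, feature name, feature type), following the Lisbon example.
CONCEPTS = [(1, "Lisbon", "city"), (2, "Lisbon", "district"), (3, "Lisbon", "street")]


def build_gazetteer(concepts):
    """Map each text pattern to the ontology ids it may refer to."""
    gazetteer = {}
    for concept_id, name, feature_type in concepts:
        patterns = (f"{feature_type} $ {name}", f"{name} {feature_type}", name)
        for pattern in patterns:
            gazetteer.setdefault(pattern.lower(), []).append(concept_id)
    return gazetteer


def pattern_to_regex(pattern):
    """Turn a pattern into a regex; '$' is assumed to match one arbitrary word."""
    parts = [r"\w+" if tok == "$" else re.escape(tok) for tok in pattern.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b")


def resolve(text, gazetteer):
    """Return the ids of the longest pattern found in the text, or an empty list."""
    text = text.lower()
    for pattern in sorted(gazetteer, key=lambda p: -len(p.split())):
        if pattern_to_regex(pattern).search(text):
            return gazetteer[pattern]
    return []


gazetteer = build_gazetteer(CONCEPTS)
print(resolve("The city of Lisbon", gazetteer))
print(resolve("Flights to Lisbon", gazetteer))
```

When a feature type accompanies the name, the geoname resolves to a single concept; a bare "Lisbon" stays ambiguous and returns every candidate id, which is where a later disambiguation step would be needed.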
17
Sidra
Sidra5: text indexing and ranking module with geographic capabilities, based on MG4J.
Generates a geo index and a term index.
Ranks document results from the QE terms and the query signature, using the term index and the geo index.
18
Flow chart of searches in Sidra5
19
GeoScore
Geographic Similarity ~ GeoSim(s1,s2) ~ combines four similarities:
Ontology Similarity ~ OntSim(s1,s2)
Spatial Distance Similarity ~ DistSim(s1,s2)
Spatial Adjacency Similarity ~ AdjSim(s1,s2)
Population Similarity ~ PopSim(s1,s2)
[Figure: the four similarities feed GeoSim with weights of 10%, 20%, 20% and 50%; GeoSim in turn feeds the Geographic Score ~ GeoScore(s1,s2).]
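The percentages on this slide suggest that GeoSim is a weighted combination of the four similarities. The sketch below shows such a combination under two loudly stated assumptions: that the combination is a plain weighted sum, and that the weights map to the components as written in `ASSUMED_WEIGHTS`. The extracted slide text does not preserve which percentage belongs to which similarity, so this mapping is a guess for illustration only.

```python
# ASSUMPTION: this weight-to-component mapping is not confirmed by the slide text.
ASSUMED_WEIGHTS = {"ont": 0.5, "dist": 0.2, "pop": 0.2, "adj": 0.1}


def geo_sim(ont_sim, dist_sim, pop_sim, adj_sim, weights):
    """Weighted sum of four similarities, each expected to lie in [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    components = {"ont": ont_sim, "dist": dist_sim, "pop": pop_sim, "adj": adj_sim}
    return sum(weights[key] * value for key, value in components.items())


# Made-up similarity values for two scopes s1 and s2.
score = geo_sim(ont_sim=1.0, dist_sim=0.5, pop_sim=0.5, adj_sim=0.0,
                weights=ASSUMED_WEIGHTS)
print(round(score, 2))
```

Passing the weights as an argument keeps the sketch usable with whatever assignment the original figure specifies.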
20
Geographic Score ~ GeoScore(s1,s2)
21
Example Computing of GeoScore
22
Document Scoring Textual Scoring
23
Experiment Type
24
Experiment Result
[Table: MAP results for the IR, GIR and IR / GIR runs]
25
Conclusion
The best experimental setup is to generate an initial run with classic text retrieval and use the full geographic ranking modules to generate the final run.
The GIR system is very dependent on the quality of the geographic ontology and has some limitations in the text mining step.
26
Thank you