Evaluating semantic similarity using GML in Geographic Information Systems Fernando Ferri 1, Anna Formica 2, Patrizia Grifoni 1, and Maurizio Rafanelli.

Presentation transcript:

Evaluating semantic similarity using GML in Geographic Information Systems Fernando Ferri 1, Anna Formica 2, Patrizia Grifoni 1, and Maurizio Rafanelli 2 1 IRPPS-CNR, via Nizza 128, Roma, Italy 2 IASI-CNR, viale Manzoni 30, Roma, Italy

Summary: Motivation; Related works; Coding a Part-of Hierarchy using GML; Similarity evaluation; Conclusion

Motivation (1) In Geographic Information Systems (GISs) semantic similarity plays an important role, as it supports the identification of objects that are conceptually close, but not identical. GML (Geography Markup Language) is emerging as the dominant standard for exchanging geographic data across the Internet. A semantic similarity model facilitates comparison of entities and allows information retrieval and integration to handle semantically similar concepts. The goal of a similarity model is to obtain flexible and better matches between user-expected and system-retrieved information.

Motivation (2) Given the relevance of the Is-in relationship in the geographic context, we focus on GML elements organized according to Part-of (meronymic) hierarchies. The semantics essentially concerns parts which are similar to and inseparable from the whole.

Related works (1) Similarity of hierarchically related concepts has been widely investigated in the literature [Resnik] [Rodriguez, Egenhofer]. Among the various proposals, we follow the probabilistic approach of Lin, which is based on the notion of information content and overcomes the drawbacks of the traditional edge-counting approach.

Related works (2) Resnik proposes algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguities. Lin starts from Resnik's work and also addresses the information content of the concepts being compared.

Coding a Part-of Hierarchy with GML (1) The real world in the geographic domain can be represented as a set of features, and AbstractFeatureType codifies a geographic feature in GML. An important property of a feature is its geometry type, which is given in the reference coordinate system and describes the extent, position or relative location of the represented concept.

Coding a Part-of Hierarchy with GML (2) The geometric types defined in GML provide the framework for modelling all the geographical concepts. By means of this framework it is possible to model, for example, the concepts composing a network of communication ways, such as roads, rivers, canals and other communication infrastructures.

Coding a Part-of Hierarchy with GML (3) [Figure: type hierarchy rooted at AbstractFeatureType, with the GML geometric types MultiLineStringType, MultiPolygonType, ... and the types ComWayType, RoadType, RiverType, CanalType, NavSegmentType and NNavSegmentType.] This figure shows an example of a type hierarchy that introduces concepts concerning communication infrastructures starting from the GML geometric types.

Coding a Part-of Hierarchy with GML (4) As mentioned in the motivation, due to the relevance of the Is-in relationship in the geographic context, the paper focuses on GML elements organized according to Part-of (meronymic) hierarchies. For instance, in our example a Part-of relationship exists between communication ways (ComWay) and roads, rivers and canals.

Coding a Part-of Hierarchy with GML (5) In the literature, Part-of hierarchies are usually modelled in XML using "sequences of elements", and a similar approach could be followed in GML. [Figure: element hierarchy containing ComWay, Road, River, Canal, NavRiver, NNavRiver, NavCanal, NNavCanal, Country and Kind.] However, this approach does not make it possible to distinguish between elements of the Part-of hierarchy and other elements that may be defined outside it, such as Kind and Country.

Coding a Part-of Hierarchy with GML (6) In order to make meronymic relationships explicit within the GML element hierarchy, a Part-of hierarchy could be modelled by introducing some special geographic types such as PartOfWayType, PartOfRivType and PartOfCanType. [Figure: element hierarchy containing ComWay, Country, Kind, PartOfWay, Road, River, Canal, PartOfRiv, PartOfCan, NavRiver, NNavRiver, NavCanal and NNavCanal.] Each special type is introduced to model a Part-of relationship between a geographic concept and its component concepts.

Coding a Part-of Hierarchy with GML (7) [GML code listing] This GML code shows how to make a meronymic relationship explicit within the GML element hierarchy by introducing a special geographic type such as PartOfWayType.

Evaluating similarity (1) To evaluate concept similarity, this paper combines and revisits the information content approach [Lin98] and a proposal inspired by the maximum weighted matching problem in bipartite graphs [FM02].

Evaluating similarity (2) The starting assumption is that associating probabilities with the Part-of taxonomy allows us to introduce the notion of a weighted element hierarchy. In particular, in our example the probabilities have been estimated in line with WordNet 2.0. For instance, the concepts Road and River are defined below, with the related frequencies (the numbers in parentheses).
(95) Road: an open way (generally public) for travel and transportation
(55) River: a large natural stream of water (larger than a creek)

Evaluating similarity (3) The probability of a concept. The probability of a concept c is defined as: p(c) = freq(c)/N, where freq(c) is the frequency of the concept c in the taxonomy, and N is the total number of concepts. In the example, probabilities have been assigned according to WordNet.
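
The definition p(c) = freq(c)/N can be sketched in a few lines of Python. The frequencies 95 (Road) and 55 (River) come from the slides; the value N = 1000 is purely hypothetical, since the slides do not state it, and the function name is chosen here for illustration.

```python
def probability(freq_c, n):
    """Return p(c) = freq(c) / N for a concept with frequency freq_c."""
    if n <= 0 or freq_c < 0:
        raise ValueError("n must be positive and freq_c non-negative")
    return freq_c / n


# Frequencies taken from the slides; N is an assumed value, not from the source.
N_HYPOTHETICAL = 1000
p_road = probability(95, N_HYPOTHETICAL)
p_river = probability(55, N_HYPOTHETICAL)
```

Under this assumed N, Road is more probable than River, so it will turn out to be the less informative of the two concepts.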

Evaluating similarity (4) Example: Weighted Concept Hierarchy

Evaluating similarity (5) Following the standard approach of information theory [Ross76], the information content of a concept c can be quantified as -log p(c); that is, as the probability increases, the informativeness decreases.
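
As a minimal sketch, the information content -log p(c) can be computed as follows. The natural logarithm is an assumption (the slides do not fix the base), and the function name is illustrative.

```python
import math


def information_content(p):
    """Return -log p for a probability p in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return -math.log(p)
```

A concept with probability 1 carries no information, and lowering the probability raises the information content, which matches the statement on the slide.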

Evaluating similarity (6) The information content similarity (ics) of two concepts such as River and Canal is defined as: ics(River, Canal) = 2 log p(ComWay) / (log p(River) + log p(Canal)) = 0.72, where ComWay is the concept representing the maximum information content shared by River and Canal. According to Lin's approach, the more information two concepts share, the more similar they are.
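
The ics formula can be illustrated with a small Python sketch. The probabilities and the parent map below are hypothetical placeholders (the weighted hierarchy of the slides is not reproduced in the transcript), so the result is not expected to equal the 0.72 reported above.

```python
import math

# Hypothetical probabilities and Part-of links, for illustration only.
PROB = {"ComWay": 0.10, "Road": 0.05, "River": 0.03, "Canal": 0.02}
PARENT = {"ComWay": None, "Road": "ComWay", "River": "ComWay", "Canal": "ComWay"}


def ancestors(concept):
    """Return the concept followed by all of its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def ics(a, b):
    """Lin-style similarity: 2 log p(shared) / (log p(a) + log p(b))."""
    if a == b:
        return 1.0
    common = set(ancestors(a)) & set(ancestors(b))
    if not common:
        return 0.0
    # The shared concept with maximum information content has minimum probability.
    shared = min(common, key=PROB.__getitem__)
    denominator = math.log(PROB[a]) + math.log(PROB[b])
    if denominator == 0:
        return 1.0
    return 2 * math.log(PROB[shared]) / denominator
```

With these placeholder values, ics("River", "Canal") is strictly between 0 and 1, and identical concepts score 1.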

Evaluating similarity (7) Structural similarity (asim). Inspired by the maximum weighted matching problem in bipartite graphs, we have to identify the set of pairs of typed attributes that maximizes the sum of the products of the information content similarity of the attributes and that of the related types.

Evaluating similarity (8) Example
RiverType: label:string, length:integer, flow:integer, deepness:integer
CanalType: label:string, profundity:integer, capacity:integer, length:integer

Evaluating similarity (9) In the previous example, the set of pairs of attributes that maximizes the sum of the related information content similarities is the following: {(label, label), (length, length), (flow, capacity), (deepness, profundity)}

Evaluating similarity (10) In fact, by assuming that deepness and profundity are synonyms, we have: ics(label, label) = ics(length, length) = ics(deepness, profundity) = 1 and ics(flow, capacity) = 0.

Evaluating similarity (11) The similarity of the sets of attributes of complexTypes (asim) is therefore defined as the above maximum sum divided by the larger of the cardinalities of the two sets of attributes being compared. In the case of RiverType and CanalType we have: asim(RiverType, CanalType) = 3/4 = 0.75
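
The asim computation for RiverType and CanalType can be sketched in Python as below. This is a brute-force search over all pairings, used here in place of a proper maximum weighted bipartite matching algorithm because the attribute sets are tiny. The name and type similarities are simplifying assumptions: names score 1 when identical or listed as synonyms and 0 otherwise, and types score 1 only when identical.

```python
from itertools import permutations

# Attribute sets from the slides (name -> type).
RIVER_TYPE = {"label": "string", "length": "integer",
              "flow": "integer", "deepness": "integer"}
CANAL_TYPE = {"label": "string", "profundity": "integer",
              "capacity": "integer", "length": "integer"}

# Assumed synonym table, following the slides' remark on deepness/profundity.
SYNONYMS = {frozenset({"deepness", "profundity"})}


def name_ics(a, b):
    """Simplified attribute-name similarity: 1 for equal names or synonyms."""
    return 1.0 if a == b or frozenset({a, b}) in SYNONYMS else 0.0


def type_ics(t1, t2):
    """Simplified type similarity: 1 only for identical types (assumption)."""
    return 1.0 if t1 == t2 else 0.0


def asim(attrs_a, attrs_b):
    """Best pairing score divided by the larger attribute-set cardinality."""
    small, large = sorted((list(attrs_a.items()), list(attrs_b.items())), key=len)
    if not large:
        return 0.0  # both sets empty; returning 0 is a choice made here
    best = max(
        sum(name_ics(n1, n2) * type_ics(t1, t2)
            for (n1, t1), (n2, t2) in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return best / len(large)
```

For the two types above, three attribute pairs score 1 and the pair (flow, capacity) scores 0, so asim(RIVER_TYPE, CANAL_TYPE) evaluates to 0.75, in agreement with the slide. For larger attribute sets the factorial search would need to be replaced by a polynomial-time matching algorithm.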

Evaluating similarity (12) Concept similarity (Gsim). The similarity (Gsim) of the concepts River and Canal is defined as: Gsim(River, Canal) = (ics(River, Canal) * w + asim(River, Canal) * (1 - w)) * t(RiverType, CanalType), where ics(River, Canal) is the information content similarity, asim(River, Canal) is the structural similarity, w is a weight such that 0 <= w <= 1, and t is a Boolean function that, given two complexTypes, returns 0 if their least upper bound in the type hierarchy is AbstractFeatureType and 1 otherwise.

Evaluating similarity (13) In particular, if we assume w = 0.5, then with ics(River, Canal) = 0.72, asim(River, Canal) = 0.75 and t(RiverType, CanalType) = 1 we obtain: Gsim(River, Canal) = 0.5 * (0.72 + 0.75) * 1 = 0.735, i.e. about 0.74.
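
The weighted combination used for Gsim can be written as a short Python function. The function and parameter names are illustrative; the inputs 0.72 and 0.75 are the ics and asim values reported on the earlier slides, and the Boolean factor is passed in directly rather than derived from a type hierarchy.

```python
def gsim(ics_value, asim_value, w=0.5, t_value=1):
    """Combine ics and asim with weight w, gated by the Boolean factor t_value."""
    if not 0 <= w <= 1:
        raise ValueError("w must lie in [0, 1]")
    if t_value not in (0, 1):
        raise ValueError("t_value must be 0 or 1")
    return (ics_value * w + asim_value * (1 - w)) * t_value


# Values reported on the slides for River and Canal.
gsim_river_canal = gsim(0.72, 0.75, w=0.5, t_value=1)
```

With w = 0.5 the two components contribute equally; setting t_value to 0 forces the similarity to 0 for types whose least upper bound is AbstractFeatureType.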

Conclusion Thank you