Design and Evaluation of Semantic Similarity Measures for Concepts Stemming from the Same or Different Ontologies
WMS'06, Chania, Crete, June 19-21, 2006

Presentation transcript:

Design and Evaluation of Semantic Similarity Measures for Concepts Stemming from the Same or Different Ontologies
Euripides G.M. Petrakis, Giannis Varelas, Angelos Hliaoutakis, Paraskevi Raftopoulou

Semantic Similarity
• Relates to computing the conceptual similarity between terms that are not necessarily lexically similar, e.g., “car”-“automobile”-“vehicle”, “drug”-“medicine”
• Tool for making knowledge commonly understandable in applications such as IR and information communication in general

Methodology
• Terms from different communicating sources are represented by ontologies
• Map two terms to an ontology and compute their relationship in that ontology
• Terms from different ontologies: discover linguistic relationships or affinities between terms in different ontologies

Contributions
• We investigate several semantic similarity methods and evaluate their performance
• We propose a novel semantic similarity measure for comparing concepts from different ontologies

Ontologies
• Tools of information representation on a subject
• Hierarchical categorization of terms from general to most specific terms
  - object → artifact → construction → stadium
• Domain ontologies representing knowledge of a domain
  - e.g., MeSH medical ontology
• General ontologies representing common sense knowledge about the world
  - e.g., WordNet

WordNet
• A vocabulary and a thesaurus offering a hierarchical categorization of natural language terms
  - More than 100,000 terms
• Nouns, verbs, adjectives and adverbs are grouped into synonym sets (synsets)
• Synsets represent terms or concepts with similar meaning
  - stadium, bowl, arena, sports stadium -- (a large structure for open-air sports or entertainments)

WordNet Hierarchies
• The synsets are also organized into senses
  - Senses: different meanings of the same term
• The synsets are related to other synsets higher or lower in the hierarchy by different types of relationships, e.g.
  - Hyponym/Hypernym (Is-A relationships)
  - Meronym/Holonym (Part-Of relationships)
• Nine noun and several verb Is-A hierarchies

[Figure: A fragment of the WordNet Is-A hierarchy]

MeSH
• MeSH: ontology for medical and biological terms by the N.L.M.
• Organized in Is-A hierarchies
  - More than 15 taxonomies, more than 22,000 terms
• No Part-Of relationships
• The terms are organized into synsets called “entry terms”

[Figure: A fragment of the MeSH Is-A hierarchy]

Semantic Similarity Methods
• Map terms to an ontology and compute their relationship in that ontology
• Four main categories of methods:
  - Edge counting: path length between terms
  - Information content: as a function of their probability of occurrence in a corpus
  - Feature based: similarity between their properties (e.g., definitions) or based on their relationships to other similar terms
  - Hybrid: combine the above ideas
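The edge counting idea can be sketched as a shortest-path search over Is-A links. The sketch below is only an illustration: the `IS_A` dictionary is a hypothetical toy fragment (the chain object → artifact → construction → stadium follows the Ontologies slide, while the "instrumentality" parent is an assumption made for this example), and `path_length` is a name introduced here, not a function from the authors' system.

```python
from collections import deque

# Hypothetical toy Is-A fragment (child -> parent). The "instrumentality"
# node is an assumption for illustration, not taken from the slides.
IS_A = {
    "artifact": "object",
    "instrumentality": "artifact",
    "conveyance": "instrumentality",
    "ceramic": "instrumentality",
    "construction": "artifact",
    "stadium": "construction",
}


def build_undirected(is_a):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, parent in is_a.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def path_length(is_a, a, b):
    """Number of Is-A edges on the shortest path between a and b (BFS)."""
    graph = build_undirected(is_a)
    if a not in graph or b not in graph:
        return None
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # terms live in disconnected hierarchies


print(path_length(IS_A, "conveyance", "ceramic"))  # 2
print(path_length(IS_A, "conveyance", "stadium"))  # 4
```

A shorter path means more similar terms; the individual edge counting methods differ in how they turn this length (and often the depth of the terms) into a similarity score.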

Example
• The edge counting distance between “conveyance” and “ceramic” is 2
• An information content method would associate the two terms with their common subsumer and with their probabilities of occurrence in a corpus
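As a rough illustration of the information content idea, the sketch below scores two terms by the information content, -log p, of their most specific common subsumer, in the style usually attributed to Resnik. The taxonomy in `PARENT` and the corpus counts in `OWN_COUNT` are invented for this example; they are assumptions, not data from the presentation.

```python
import math

# Hypothetical child -> parent links (assumption, illustration only).
PARENT = {
    "conveyance": "instrumentality",
    "ceramic": "instrumentality",
    "instrumentality": "artifact",
    "artifact": "object",
}

# Hypothetical number of corpus occurrences of each term itself.
OWN_COUNT = {
    "object": 10,
    "artifact": 20,
    "instrumentality": 30,
    "conveyance": 25,
    "ceramic": 15,
}


def ancestors(term):
    """Return the chain from term up to the root, term included."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def concept_probabilities():
    """An occurrence of a term also counts for every term that subsumes it."""
    total = sum(OWN_COUNT.values())
    cumulative = {term: 0 for term in OWN_COUNT}
    for term, count in OWN_COUNT.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    return {term: c / total for term, c in cumulative.items()}


def common_subsumer(a, b):
    """Most specific term that is an ancestor of both a and b."""
    anc_b = set(ancestors(b))
    for anc in ancestors(a):
        if anc in anc_b:
            return anc
    return None


def resnik_style_similarity(a, b):
    """Information content (-log p) of the most specific common subsumer."""
    lcs = common_subsumer(a, b)
    if lcs is None:
        return 0.0
    return -math.log(concept_probabilities()[lcs])


print(common_subsumer("conveyance", "ceramic"))  # instrumentality
print(round(resnik_style_similarity("conveyance", "ceramic"), 3))  # 0.357
```

With these made-up counts the common subsumer covers 70 of 100 occurrences, so the score is -ln(0.7). A rarer (more specific) subsumer would give a higher score.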

X-Similarity
• Relies on matching between synsets and term description sets
• A, B: synsets or term description sets
• Do the same with all Is-A, Part-Of relationships and take their maximum
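The matching formula itself did not survive in this transcript, so the following is only a sketch under a stated assumption: that two sets A and B are matched by the ratio of shared to combined elements, |A ∩ B| / |A ∪ B|. The function names, the dictionary layout, and the toy terms are all hypothetical and are not the authors' implementation.

```python
def set_overlap(a, b):
    """Assumed matching function: |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def x_similarity_sketch(term_a, term_b):
    """Match synsets, description sets, and the neighbors reached through
    each relationship type, then keep the maximum of all matches."""
    scores = [
        set_overlap(term_a["synset"], term_b["synset"]),
        set_overlap(term_a["description"], term_b["description"]),
    ]
    shared_relations = set(term_a["neighbors"]) & set(term_b["neighbors"])
    for relation in shared_relations:
        scores.append(
            set_overlap(term_a["neighbors"][relation],
                        term_b["neighbors"][relation])
        )
    return max(scores)


# Hypothetical terms from two different ontologies (illustration only).
a = {
    "synset": {"car", "auto"},
    "description": {"motor", "vehicle", "four", "wheels"},
    "neighbors": {"is_a": {"motor vehicle"}, "part_of": set()},
}
b = {
    "synset": {"automobile"},
    "description": {"motor", "vehicle", "engine"},
    "neighbors": {"is_a": {"motor vehicle", "vehicle"}, "part_of": set()},
}

print(x_similarity_sketch(a, b))  # 0.5
```

In this toy case the synsets share nothing, the descriptions overlap at 2/5, and the Is-A neighbors overlap at 1/2, so the maximum is 0.5.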

Example
WordNet term: “Hypothyroidism”
  - Description: An underactive thyroid gland; a glandular disorder resulting from insufficient production of thyroid hormones.
  - Related terms: glandular disease, disorder, condition, state; myxedema, cretinism
MeSH term: “Hyperthyroidism”
  - Description: Hypersecretion of Thyroid Hormones from Thyroid Gland. Elevated levels of thyroid hormones increase Basal Metabolic Rate.
  - Related terms: disease, thyroid, Endocrine System Diseases, diseases; thyrotoxicosis, thyrotoxicoses
• S(Hypothyroidism, Hyperthyroidism) = 0.387

Evaluation
• The most popular methods are evaluated
• All methods are applied to a set of 38 term pairs
• Their similarity values are correlated with scores obtained by humans
• The higher the correlation of a method, the better the method is
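The slides do not say which correlation coefficient was used. Assuming the common choice of Pearson's coefficient, the evaluation step could be sketched as follows; the score lists are invented placeholders, not the 38 term pairs of the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five term pairs (illustration only).
human_scores = [3.9, 3.5, 2.1, 0.8, 0.2]
method_scores = [0.95, 0.80, 0.55, 0.30, 0.10]

print(pearson(human_scores, method_scores) > 0.9)  # True
```

Each method in the evaluation would contribute one such coefficient, computed over the same term pairs against the same human scores.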

Evaluation on WordNet

Method            Type            Correlation
Rada 1989         Edge Counting   0.59
Wu 1994           Edge Counting   0.74
Li 2003           Edge Counting   0.82
Leacock 1998      Edge Counting   0.82
Richardson 1994   Edge Counting   0.63
Resnik 1999       Info. Content   0.79
Lin 1993          Info. Content   0.82
Lord 2003         Info. Content   0.79
Jiang 1998        Info. Content   0.83
Tversky 1977      Feature Based   0.73
X-Similarity      Feature Based   0.74
Rodriguez 2003    Hybrid          0.71

Evaluation on MeSH

Method            Type            Correlation
Rada 1989         Edge Counting   0.50
Wu 1994           Edge Counting   0.67
Li 2003           Edge Counting   0.70
Leacock 1998      Edge Counting   0.74
Richardson 1994   Edge Counting   0.64
Resnik 1999       Info. Content   0.71
Lin 1993          Info. Content   0.72
Lord 2003         Info. Content   0.70
Jiang 1998        Info. Content   0.71
Tversky 1977      Feature Based   0.67
X-Similarity      Feature Based   0.71
Rodriguez 2003    Hybrid          0.71

Cross Ontology Measures
• We used 40 MeSH term pairs
• One of the terms in each pair is also a WordNet term
• We measured correlation with scores obtained by experts

Method         Type            Correlation
X-Similarity   Feature Based   0.70
Rodriguez      Hybrid          0.55

Comments
• Edge counting and information content methods work by exploiting structure information
• Good methods take the position of the terms into account
  - Higher similarity for terms which are close together but lower in the hierarchy, e.g., [Li et al. 2003]
• X-Similarity performs at least as well as other feature-based methods
• It outperforms other cross-ontology methods
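To make the remark about term position concrete, here is a sketch of the measure usually attributed to Li et al. (2003), which multiplies a decay in path length by a saturating function of the depth of the common subsumer. The formula and the default parameters (alpha = 0.2, beta = 0.6) are recalled from that paper rather than taken from these slides, so treat both as assumptions.

```python
import math


def li_style_similarity(path_length, subsumer_depth, alpha=0.2, beta=0.6):
    """exp(-alpha * l) * tanh(beta * h), where l is the shortest path length
    between the terms and h is the depth of their common subsumer.
    The formula and defaults are assumptions, not taken from the slides."""
    return math.exp(-alpha * path_length) * math.tanh(beta * subsumer_depth)


# Same path length, but the second pair sits lower in the hierarchy.
shallow = li_style_similarity(path_length=2, subsumer_depth=1)
deep = li_style_similarity(path_length=2, subsumer_depth=5)
print(deep > shallow)  # True

# Same depth, but the second pair is farther apart.
near = li_style_similarity(path_length=2, subsumer_depth=3)
far = li_style_similarity(path_length=4, subsumer_depth=3)
print(near > far)  # True
```

Two terms that are equally far apart thus score higher when their common subsumer is deeper, which is the behavior the comment above describes.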

Conclusions
• Semantic similarity methods approximate the human notion of similarity, reaching a correlation of up to 0.83
• Cross-ontology similarity is a difficult problem that requires further investigation
• Work towards integrating semantic similarity within the IntelliSearch information retrieval system for Web documents

Try our system on the Web
Implementation: Giannis Varelas, Spyros Argyropoulos
