WORD SENSE DISAMBIGUATION STUDY ON WORD NET ONTOLOGY Akilan Velmurugan Computer Networks – CS 790G.



Overview
- What is WSD?
- How WordNet is analyzed as a complex network
- What are the results
- Project methodology
- Area of study
- Key findings/results
- New approaches
- Improvement techniques
- Conclusion

Project Description
- Objective
  - Study of WSD
  - Effects of WSD in a word sense ontology
  - Characteristics of WordNet
- Results
  - How words are matched with other words
  - Parameters taken for the study of word sense
  - Improve them by making the necessary changes
  - Study network characteristics

WordNet - overview
- Machine-readable semantic dictionary interlinked by semantic relations
- Developed at Princeton University as a large lexical database for the English language
- Most widely used linguistic resource
- Freely available to the public (distributed under Princeton's WordNet license)
- Forms a scale-free network with a small average shortest path, with words as nodes and concepts as links
- Easily navigable

WordNet (Structure)
- Shows relations in the form of
  - Noun, verb, adjective, adverb
  - Synonym
  - Hypernym (is a kind of ...)
  - Hyponym (... is a kind of)
  - Troponym (particular ways to ...)
  - Meronym (parts of ...)
  - ... about 25 relations in total
- Also available for online navigation

WordNet online - by Princeton University

WordNet Browser

WordNet (working)
- WSD
  - Corpus-based approaches
    - Set of samples that enables the system
  - Knowledge-based approaches
    - Machine-readable dictionary with relations
- WordNet research
  - Open source
  - Ranking of synsets derived from word frequencies in the British National Corpus (top 1000)
- Content manipulation of text
  - Dataset I: controlled and calibrated study
  - Dataset II: collected using Mechanical Turk, using pairs

Word Sense Disambiguation (WSD)
- The task of determining the meaning of an ambiguous word in a given context
- Example: "bank" can mean the edge of a river or a financial institution that accepts money
- Refers to the resolution of lexical semantic ambiguity; its goal is to attribute the correct senses to words (an AI-complete problem)
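As an illustration of the "bank" example, here is a minimal sketch of a gloss-overlap disambiguator in the spirit of the simplified Lesk algorithm. The slides do not name this technique; it is swapped in here only as a simple knowledge-based example. The two glosses, the stopword list, and the sense labels are hypothetical and hand-written, not taken from WordNet.

```python
import re

# Hypothetical, hand-written glosses for two senses of "bank" (not WordNet data).
SENSES = {
    "bank/river": "sloping land at the edge of a river or lake",
    "bank/finance": "financial institution that accepts deposits of money and lends it",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "at", "or", "and", "that", "it",
             "to", "in", "on", "we", "i", "my"}


def tokens(text):
    """Return the set of lowercase words in text, with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(context, senses=SENSES):
    """Pick the sense whose gloss shares the most words with the context.

    Ties are resolved in favor of the sense listed first.
    """
    ctx = tokens(context)
    return max(senses, key=lambda s: len(ctx & tokens(senses[s])))


print(disambiguate("I deposited money at the bank"))
print(disambiguate("We sat on the bank of the river"))
```

With real data, the glosses would come from a machine-readable dictionary such as WordNet rather than from a hand-written table.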

WSD: Area of Research
- Assigning the correct sense to words, with an electronic dictionary as the source of word definitions
- Open research field in Natural Language Processing (NLP)
- A hard problem and a popular area for research
- Used in speech synthesis, by identifying the correct sense of the word

JavaScript Visual WordNet

Visual Thesaurus

WordNet – Theoretical aspects
- WordNet: a word sense ontology
- Symbols are words
- Synset: a list of words and the semantic relations between them
- Word sense disambiguation
  - WordNet structure using latent semantics
  - Variable lexical notation for a concept
  - Citibase – Thesaurus
  - Semantic relatedness
  - And a few others...

WSD: using latent semantics
- Measures the semantic distance between concepts
- Relatedness and betweenness are calculated
- A matrix form of the WordNet data structure is used
- Can be integrated with other applications
- Uses the Singular Value Decomposition (SVD) algorithm
- Example: multiple synsets for "car"
  - {car, gondola}
  - {car, railway car}
  - {car, automobile}
  - {Motor vehicle}, {Coupe}, {Sedan}, {Taxi}
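For reference, the SVD step named above can be written out as follows. This is a generic sketch of truncated SVD as used in latent semantic analysis, not the specific formulation of the cited work: the symbols A (the matrix form of the WordNet data), k (the number of retained dimensions), and the cosine similarity between reduced vectors are assumptions introduced here for illustration.

```latex
% Full decomposition of the (assumed) concept matrix A
A = U \Sigma V^{\mathsf{T}}

% Rank-k truncation keeps only the k largest singular values
A_k = U_k \Sigma_k V_k^{\mathsf{T}}, \qquad k \ll \operatorname{rank}(A)

% One common choice of relatedness between concepts i and j (assumption):
% cosine of their reduced vectors x_i, x_j (rows of U_k \Sigma_k)
\operatorname{sim}(i, j) = \frac{\langle x_i, x_j \rangle}{\lVert x_i \rVert \, \lVert x_j \rVert}
```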

MDS example
[Figure: pipeline Geodesic Distance Matrix -> MDS -> k-means. Node groups shown: (…, 2, 3, 4, 10, 12) and (5, 6, 7, 8, 9, 11, 13).]
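The first stage of the pipeline in this example, the geodesic distance matrix, can be sketched with a breadth-first search from every node. The toy graph below is hypothetical and is not the graph from the slide; the MDS and k-means stages are not shown.

```python
from collections import deque


def geodesic_distances(adj):
    """Return the all-pairs shortest-path matrix (in hops) of an unweighted graph.

    adj maps each node to a list of its neighbors. Rows and columns follow
    the sorted node order; unreachable pairs get float("inf").
    """
    nodes = sorted(adj)
    dist = {}
    for src in nodes:
        seen = {src: 0}          # node -> hop count from src
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[src] = seen
    return [[dist[a].get(b, float("inf")) for b in nodes] for a in nodes]


# Hypothetical toy graph: a path 1-2-3 with an extra branch 2-4.
TOY = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
MATRIX = geodesic_distances(TOY)
for row in MATRIX:
    print(row)
```

A matrix like this is the usual input to multidimensional scaling, whose low-dimensional coordinates can then be clustered.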

WSD: using latent semantics

WSD: variable lexical notations for a concept GGeneric concept notation: D = I ∪ J ∪ K ∴ J = D − (I ∪ K) = (D − I ) ∩ (D − K) = D ∩ (I ∪ K) J = D ∩ ( I ∩ K) since, B = D ∪ E ∪ F D = B − (E ∪ F) =(B − E) ∩ (B − F) = B ∩ (E ∪ F) D =B ∩ (E ∩ F) Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications ¯¯¯¯ ¯ ¯ ¯ ¯

WSD: variable lexical notations for a concept
$J = D \cap (\overline{I} \cap \overline{K}) = (B \cap (\overline{E} \cap \overline{F})) \cap (\overline{I} \cap \overline{K})$
$J = B \cap ((\overline{E} \cap \overline{F}) \cap (\overline{I} \cap \overline{K}))$
where J = fly, D = fish lure, I = spinner, K = troll.
Introducing Boolean operators: AND for $\cap$, OR for $\cup$, NOT for the complement (overline).
Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications

WSD: variable lexical notations for a concept
- ("fly") becomes: ("fisherman's lure" OR "fish lure") AND ((NOT "spinner") AND (NOT "troll"))
- Then, with B = lure, E = ground bait, F = stool pigeon:
- ("fly") becomes: ("bait" OR "decoy" OR "lure") AND (((NOT "ground bait") AND (NOT "stool pigeon")) AND ((NOT "spinner") AND (NOT "troll")))
Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications
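The derivation on these slides can be checked on a small example. The sketch below first verifies the set identity on toy sets, then builds the first Boolean query for "fly". The helper names `any_of`, `none_of`, and `concept_query` are hypothetical, and the toy sets treat each concept as a single element, which is a simplification.

```python
# Toy check of J = D - (I ∪ K) = (D - I) ∩ (D - K), assuming I, J, K are disjoint.
D = {"fly", "spinner", "troll"}   # "fish lure" and its sub-concepts
I, K = {"spinner"}, {"troll"}
J = D - (I | K)


def any_of(labels):
    """OR together the alternative labels of the parent concept."""
    return "(" + " OR ".join(f'"{t}"' for t in labels) + ")"


def none_of(labels):
    """AND together the negations of the sibling concepts."""
    return "(" + " AND ".join(f'(NOT "{t}")' for t in labels) + ")"


def concept_query(parent_labels, sibling_labels):
    """Express a concept as: parent AND (NOT sibling_1) AND (NOT sibling_2) ..."""
    return f"{any_of(parent_labels)} AND {none_of(sibling_labels)}"


QUERY = concept_query(["fisherman's lure", "fish lure"], ["spinner", "troll"])
print(J)
print(QUERY)
```

The second, longer query on the slide follows the same pattern one level up: the parent becomes "lure" with its synonyms, and the negated siblings of both levels are ANDed together.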

Thesaurus as a complex network
- As a directed graph
  - Sink: the 73,046 terms with k_out = 0
  - Source (root words): the 30,260 terms with at least one outgoing link (k_out > 0)
    - Absolute source: without incoming links (k_in = 0)
    - Normal source: k_out > 0 and k_in > 0
    - Bridge source: without outgoing links to root words (k_out(source) = 0)
- Figure legend: 1 = normal source, 2 = bridge source, 3 = absolute source, 4 = sink
Source: arXiv:cond-mat/ v1 2003
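The four node types can be computed directly from in-degree and out-degree. The sketch below is one possible reading of the definitions above: the slide does not say how overlapping cases are resolved, so checking "bridge source" before "absolute source" and "normal source" is an assumption, and the edge list is a hypothetical toy graph.

```python
def classify(edges):
    """Label each node of a directed graph given as (source, target) pairs."""
    out_links, k_in = {}, {}
    for u, v in edges:
        out_links.setdefault(u, set()).add(v)
        out_links.setdefault(v, set())
        k_in[v] = k_in.get(v, 0) + 1
        k_in.setdefault(u, 0)

    # Root words: every node with at least one outgoing link.
    sources = {n for n, targets in out_links.items() if targets}

    labels = {}
    for node, targets in out_links.items():
        if not targets:
            labels[node] = "sink"              # k_out = 0
        elif not (targets & sources):
            labels[node] = "bridge source"     # links only to sinks (checked first: assumption)
        elif k_in[node] == 0:
            labels[node] = "absolute source"   # k_in = 0
        else:
            labels[node] = "normal source"     # k_out > 0 and k_in > 0
    return labels


# Hypothetical toy graph.
EDGES = [("a", "b"), ("b", "c"), ("c", "b"), ("b", "x"), ("d", "x")]
LABELS = classify(EDGES)
print(LABELS)
```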

WSD: Semantic relatedness and word sense disambiguation
- Concepts that occur more frequently and closer to each other are "more related" to each other than concepts that appear less frequently and farther apart
Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications

WordNet Relationship
- Semantic relatedness
  - Involves relationships among words
    - car - wheel (meronym)
    - hot - cold (antonym)
    - pencil - paper (functional)
    - penguin - Antarctica (association)
    - bank - trust company (synonym)
- Probability and distance calculation
- Frequency of synsets or words
- Performance in NLP applications

WordNet Relationship Browser

WordNet Connect
- Program to find all possible connections between two words in WordNet
- Used in computing semantic opposition in the word sense ontology
- The WordNet lexical database is used to read the semantic relations
- Capabilities such as the number of paths, the shortest path, and the overall network structure are studied
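The core idea, finding every connection between two words and then reporting the number of paths and the shortest one, can be sketched as a depth-first enumeration of simple paths. This is not the WordNet Connect program itself; the function name `all_paths` and the relation graph are hypothetical, with edges standing in for WordNet relations.

```python
def all_paths(graph, start, goal, path=None):
    """Yield every simple path (no repeated node) from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in graph.get(start, ()):
        if nxt not in path:          # never revisit a node on the current path
            yield from all_paths(graph, nxt, goal, path)


# Hypothetical relation graph (not real WordNet data).
TOY_GRAPH = {
    "car": ["motor vehicle", "wheel"],
    "motor vehicle": ["car", "truck", "vehicle"],
    "wheel": ["car", "truck"],
    "truck": ["motor vehicle", "wheel"],
    "vehicle": ["motor vehicle"],
}

PATHS = list(all_paths(TOY_GRAPH, "car", "vehicle"))
SHORTEST = min(PATHS, key=len)
print(len(PATHS), "paths; shortest:", " -> ".join(SHORTEST))
```

Enumerating all simple paths grows quickly with graph size, so on the full WordNet graph a depth limit or a restriction to selected relation types would likely be needed.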

WordNet Connect

Future work
- WordNet structure in terms of a complex network
- Key assumptions
  - WordNet lexical dictionary analyzed in terms of a source node and a target node, with an additional reference node
  - Achieve a cost-effective path that is conditionally related to the mean reference node
  - Control the path traversal with a relation of focus
  - Include the Common File Number to make it more efficient

Conclusion
- A single visualization cannot reveal the entire structure of WordNet
- There are different ways of analyzing the effectiveness of the overall system
- A new method to evaluate the usefulness of the WordNet network structure

Questions and Comments