WordNet and Extended WordNet. Sriram Rajaraman. 23-November-09.

WordNet and eXtended WordNet
23-November-09, University of Texas at Dallas, Erik Jonsson School of Engineering and Computer Science
Objective
Introduce the idea of a semantic lexicon ontology, especially WordNet and eXtended WordNet.

Focus
- Introduction
- WordNet
- eXtended WordNet
- Summary

Reference
1. WordNet:
2. eXtended WordNet:
3. Christiane Fellbaum, "WordNet: An Electronic Lexical Database", MIT Press, 1999.
4. George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller, "Introduction to WordNet: An On-line Lexical Database", core working paper.
5. Rada Mihalcea, Dan I. Moldovan, "eXtended WordNet: progress report", Proceedings of NAACL Workshop on WordNet and Other Lexical Resources.
6. Sanda M. Harabagiu, George A. Miller, Dan I. Moldovan, "WordNet 2 - A Morphologically and Semantically Enhanced Resource", SIGLEX 1999.

Introduction
Traditional Dictionary. What is available:
- spelling
- pronunciation
- inflected and derivative forms
- etymology
- part of speech
- definitions
- illustrative uses of alternative senses
- synonyms and antonyms
- special usage notes

Tree
Ref:
Main Entry: tree
Pronunciation: \ˈtrē\
Function: noun
Etymology: Middle English, from Old English trēow; akin to Old Norse trē tree, Greek drys, Sanskrit dāru wood
Date: before 12th century
- a woody perennial plant having a single usually elongate main stem generally with few or no branches on its lower part

Drawback of traditional dictionary
What is missing:
- It does not say, for example, that trees have roots, or that they consist of cells having cellulose walls, or even that they are living organisms
- The "sense" of the superordinate term, aka hypernym (living plant or industrial plant)
- Coordinate terms (bushes, shrubs, ...)
- Hyponyms: types of trees (pine, tropical, deciduous, ...)
- Information assumed to be known to everyone (trees have bark and leaves, they grow from seeds, they make their own food by photosynthesis; probably information for an encyclopedia!)

How can we improve?
The missing information is structural: every word points upward to its superordinate (hypernym), but not sideward to its coordinates or downward to its hyponyms.
This restriction comes from alphabetical ordering and from budget and size constraints, all of which can be overcome in an electronic lexical database.
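
The structural point above can be sketched in a few lines. This is a toy illustration, not WordNet data: the `HYPERNYM` table and the helper names `hyponyms` and `coordinates` are assumptions made up for this example. It stores only the upward link, as a dictionary does, and shows that the sideward and downward links can then be derived once the data is electronic.

```python
# Toy taxonomy: each word points upward to its hypernym (child -> parent).
# The entries are illustrative only and are not taken from WordNet.
HYPERNYM = {
    "tree": "woody plant",
    "shrub": "woody plant",
    "pine": "tree",
    "oak": "tree",
}


def hyponyms(word):
    """Downward links: every entry whose hypernym is `word`."""
    return sorted(w for w, parent in HYPERNYM.items() if parent == word)


def coordinates(word):
    """Sideward links: other words that share the same hypernym."""
    parent = HYPERNYM.get(word)
    if parent is None:
        return []
    return [w for w in hyponyms(parent) if w != word]


print(HYPERNYM["tree"])     # upward:   woody plant
print(coordinates("tree"))  # sideward: ['shrub']
print(hyponyms("tree"))     # downward: ['oak', 'pine']
```

A printed dictionary would need three separate cross-references per entry to offer the same navigation; here one stored relation is enough.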

What is WordNet?
WordNet is a lexical database for the English language.
WordNet 3.0 has [1]:
- 117,097 nouns (average noun has 1.23 senses)
- 11,488 verbs (average verb has 2.16 senses)
- 22,141 adjectives
- 4,601 adverbs
Created and maintained at the Cognitive Science Laboratory of Princeton University
Accessible online (also downloadable)
Interfaces available in C, .NET, Java, Perl, PHP, Python, SQL, etc. (JWNL, WordNet.Net, RTiA wordNet, pywordne..)

WordNet Structure
Words are organized as synsets in WordNet.
There are four disjoint kinds of synsets, containing either:
- Nouns
- Verbs
- Adjectives
- Adverbs

What is a synset?
- Basic unit of WordNet
- A group of synonymous words that refer to a common semantic concept
- Words may belong to more than one synset; the first sense is the most frequent sense
- Words also include collocations ("eye contact", "mix up")
- Example

Synset example
"car" as in
- {car, auto, automobile, machine, motorcar}
- {car, railcar, railway car, railroad car}
"Chocolate" as in-
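
The two "car" synsets can be modeled directly. The following is a minimal sketch and not the WordNet API: the `SYNSETS` list and the helpers `senses` and `synonyms` are hypothetical names, and list order stands in for sense frequency.

```python
# The two synsets for "car", copied from the slide.
# List order stands in for sense order (first = most frequent sense).
SYNSETS = [
    ("car", "auto", "automobile", "machine", "motorcar"),
    ("car", "railcar", "railway car", "railroad car"),
]


def senses(word):
    """Return every synset that contains `word`, in stored order."""
    return [synset for synset in SYNSETS if word in synset]


def synonyms(word, sense=0):
    """Return the other members of the synset for the chosen sense."""
    return [w for w in senses(word)[sense] if w != word]


print(len(senses("car")))   # 2
print(synonyms("car"))      # ['auto', 'automobile', 'machine', 'motorcar']
print(synonyms("car", 1))   # ['railcar', 'railway car', 'railroad car']
```

The key design point is that the synset, not the word, is the unit of storage: "car" appears twice because it names two concepts.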

How are synsets related?
Each synset has an associated list of pointers that express its relationships to other synsets.
WordNet defines 17 relations:
- 10 between synsets
- 5 between word senses
- "gloss" (between a synset and a sentence, i.e. a textual definition for each synset)
- "frame" (between a synset and a verb construction pattern)

WordNet relations

Applications of WordNet
- Information Extraction
- Information Retrieval
- Question Answering
- Word Sense Disambiguation
- Text Inference
- Coreference, coherence and metonymy
- Knowledge acquisition
- Internet search engines

Limitations of WordNet
- Designed as a semantic lexicon, not a knowledge base
- Limited connections between topically related words
- Lack of morphological relationships (a special algorithm handles that)
- Lack of selectional restrictions
- And more... [6]

eXtended WordNet [2]
- A project at the Human Language Technology Research Institute at The University of Texas at Dallas
- Provides several important enhancements (over WordNet 2.0) intended to remedy the present limitations of WordNet
- Current version: eXtended WordNet 2.0 (xwn )

Objective of eXtended WordNet
- Exploit the rich information available in synset glosses (a gloss is a sentence, i.e. a textual definition for each synset)
- Add semantic and logical enhancements to WordNet
- Increase the connectivity among the synsets by at least one order of magnitude
- Enable access to a broader context for each concept

What does eXtended WordNet do? [5]
- Preprocessing and Parsing: separation of glosses into definition and examples, tokenization, and identification of compound words
- Word Sense Disambiguation: all words in a gloss are tagged with appropriate senses and linked to the corresponding synsets
- Logical Form Transformation: gloss -> logical forms
- Topical Relations: connections are established between words, based on the context/topic
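
The preprocessing step listed above can be illustrated with a rough sketch. This is not the eXtended WordNet implementation. It assumes that example sentences in a gloss are the double-quoted segments and that the definition is everything before the first quote; the function names `split_gloss` and `tokenize`, and the example sentence in the usage lines, are made up for illustration.

```python
import re


def split_gloss(gloss):
    """Split a gloss into (definition, examples).

    Assumption: examples are the double-quoted segments, and the
    definition is everything that precedes the first quote.
    """
    examples = re.findall(r'"([^"]*)"', gloss)
    definition = gloss.split('"', 1)[0].strip().rstrip(";").strip()
    return definition, examples


def tokenize(text, compounds=()):
    """Lowercase and tokenize, joining known compounds with underscores."""
    text = text.lower()
    for compound in compounds:
        text = text.replace(compound, compound.replace(" ", "_"))
    return re.findall(r"[a-z_]+", text)


# The definition is from the slides; the quoted example is invented.
gloss = 'a court on which tennis is played; "they met on the tennis court"'
definition, examples = split_gloss(gloss)
print(definition)   # a court on which tennis is played
print(examples)     # ['they met on the tennis court']
print(tokenize(examples[0], compounds=("tennis court",)))
```

Word sense disambiguation and logical form transformation would then operate on the token list; both are beyond what a short sketch can show.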

Extended WordNet
Example: tennis court
"Tennis court: A court on which tennis is played."
Diagram labels: nodes "play", "court", "tennis"; links "object", "location-of", "def"; synset {"tennis", "lawn tennis"}

eXtended WordNet format
Consists of four XML files, one for each part of speech:
- Noun
- Verb
- Adjective
- Adverb
The XML tags contain attributes that specify the relationships.

eXtended WordNet: Applications
Core knowledge base for applications:
- Question Answering
- Information Retrieval
- Information Extraction
- Summarization
- Natural Language Generation
- Inferences
- Other knowledge-intensive applications

Further Reading
- W3C: RDF/OWL Representation of WordNet
- eXtended WordNet: format/algorithm
- Current research at Princeton
- Related projects (APIs, web interface, extensions)

Back up

WordNet Statistics
Ref:
Columns: Monosemous Words and Senses | Polysemous Words | Polysemous Senses
Rows: Noun, Verb, Adjective, Adverb

WordNet relations-0
Relations between synsets:
- Synonymy
- Hypernymy (superordination)
- Hyponymy (subordination)
- Holonymy (whole to part relation)
- Meronymy (part to whole relation)
- Antonymy
- Troponymy (particular way to do something)

WordNet relations-1
Antonymy relation: (sweet)
Definition: having a pleasant taste (as of sugar)
Has the antonym: (sour)
Definition: having a sharp biting taste
Troponymy relation: (dream)
Definition: experience while sleeping
Has the troponym: (fantasize)
Definition: have fantasies

WordNet relations-2
Synonymy relation: (motor vehicle, automotive vehicle)
Definition: a self-propelled wheeled vehicle that does not run on rails
Hypernymy relation: (vehicle)
Definition: a conveyance that transports people or objects
Hyponymy relation: (ambulance)
Definition: a vehicle that takes people to and from hospitals

WordNet relations-3
Holonymy relation: (bicycle wheel)
Definition: the wheel of a bicycle
Has the holonym: (bicycle, bike, wheel)
Definition: has two wheels; moved by foot pedals
Meronymy relation: (bicycle wheel)
Definition: the wheel of a bicycle
Has the meronym: (spoke, radius)
Definition: a radial member of a wheel joining the hub to the rim
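
The relation examples on these slides come in inverse pairs (hypernym/hyponym, holonym/meronym), so only one direction needs to be stored. The sketch below is a toy model built from the slide examples, not the WordNet database format; `HYPERNYM_OF`, `HOLONYM_OF`, `inverse`, and `hypernym_chain` are hypothetical names, and the ambulance link follows the slide's presentation.

```python
# Facts from the slide examples, stored in one direction only.
HYPERNYM_OF = {              # synset -> its hypernym
    "motor vehicle": "vehicle",
    "ambulance": "motor vehicle",
}
HOLONYM_OF = {               # part -> the whole it belongs to
    "bicycle wheel": "bicycle",
    "spoke": "bicycle wheel",
}


def inverse(relation, target):
    """Derive the inverse relation: all keys that point at `target`."""
    return sorted(k for k, v in relation.items() if v == target)


def hypernym_chain(synset):
    """Follow hypernym pointers upward until the top is reached."""
    chain = []
    while synset in HYPERNYM_OF:
        synset = HYPERNYM_OF[synset]
        chain.append(synset)
    return chain


print(hypernym_chain("ambulance"))           # ['motor vehicle', 'vehicle']
print(inverse(HYPERNYM_OF, "motor vehicle")) # hyponyms: ['ambulance']
print(HOLONYM_OF["bicycle wheel"])           # holonym:  bicycle
print(inverse(HOLONYM_OF, "bicycle wheel"))  # meronyms: ['spoke']
```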

WordNet relations

Example: "limb"

WordNet Task Force
- Aims to support the deployment of WordNet in RDF/OWL
- Proposes inclusion of RDF or OWL versions of wordnets and lexical ontologies in the official distributions
- Integrates existing data models in order to provide a unified OWL vocabulary for RDF versions of wordnets
- Distills the most agreed-upon practices for developing ontologies out of wordnets and includes them in a set of recommendations

Conversion (contd)
g(Synset_ID,Gloss).
The g operator specifies the gloss for a synset. Gloss is a string.
Maps to: wn:gloss(Synset_ID, Gloss)
hyp(Synset_ID_A,Synset_ID_B).
The hyp operator specifies that the second synset is a hypernym of the first synset. This relation holds for nouns and verbs. The reflexive operator, hyponym, implies that the first synset is a hyponym of the second synset.
Maps to: wn:hyponymOf(Synset_ID_A, Synset_ID_B)
More details at -
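
The two mappings above are mechanical, so they can be sketched as a small rewriter. This is an illustration of the mapping as stated on the slide, not the Task Force's conversion tool: it emits the slide's `wn:property(a, b)` notation rather than real RDF, the `FACT` pattern and `convert` function are hypothetical, and the synset IDs in the usage lines are placeholders.

```python
import re

# One Prolog fact per line, e.g. hyp(100000002,100000001).
FACT = re.compile(r"^(\w+)\((\d+),(.*)\)\.$")

# Prolog operator -> property name, as given on the slide.
PROPERTY = {"g": "wn:gloss", "hyp": "wn:hyponymOf"}


def convert(fact):
    """Rewrite one Prolog fact in the slide's wn:property(a, b) notation."""
    match = FACT.match(fact.strip())
    if match is None or match.group(1) not in PROPERTY:
        raise ValueError(f"unsupported fact: {fact!r}")
    operator, first, second = match.groups()
    return f"{PROPERTY[operator]}({first}, {second.strip()})"


# Placeholder IDs and gloss text, for illustration only.
print(convert("hyp(100000002,100000001)."))
print(convert("g(100000001,'an example gloss')."))
```

Only the two operators covered on the slide are handled; any other operator raises `ValueError` rather than being guessed at.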