WordNet and Extended WordNet. Sriram Rajaraman. 23-November-09.

1 WordNet and Extended WordNet. Sriram Rajaraman. 23-November-09.

2 WordNet and eXtended WordNet. 23-November-09. University of Texas at Dallas, Erik Jonsson School of Engineering and Computer Science.
Objective: Introduce the idea of a semantic lexicon ontology, especially WordNet and eXtended WordNet.

3 Focus: Introduction, WordNet, eXtended WordNet, Summary

4 Reference
1. WordNet: http://wordnet.princeton.edu/
2. eXtended WordNet: http://xwn.hlt.utdallas.edu/
3. Christiane Fellbaum, “WordNet: An Electronic Lexical Database”, MIT Press, 1999, c1998.
4. George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller, “Introduction to WordNet: An On-line Lexical Database”, core working paper.
5. Rada Mihalcea, Dan I. Moldovan, “eXtended WordNet: progress report”, Proceedings of NAACL Workshop on WordNet and Other Lexical Resources, 2001.
6. Sanda M. Harabagiu, George A. Miller, Dan I. Moldovan, “WordNet 2 - A Morphologically and Semantically Enhanced Resource”, SIGLEX 1999.

5 Focus: Introduction, WordNet, eXtended WordNet, Summary

6 Introduction: Traditional Dictionary
What is available:
• spelling
• pronunciation
• inflected and derivative forms
• etymology
• part of speech
• definitions
• illustrative uses of alternative senses
• synonyms and antonyms
• special usage notes

7 Tree
Ref: http://www.merriam-webster.com/dictionary/Tree
Main Entry: tree
Pronunciation: \ˈtrē\
Function: noun
Etymology: Middle English, from Old English trēow; akin to Old Norse trē tree, Greek drys, Sanskrit dāru wood
Date: before 12th century
Definition: a woody perennial plant having a single usually elongate main stem generally with few or no branches on its lower part

8 Drawback of traditional dictionary
What is missing:
• It does not say, for example, that trees have roots, or that they consist of cells having cellulose walls, or even that they are living organisms
• The “sense” of the superordinate term, also known as the hypernym (living plant or industrial plant)
• Coordinate terms (bushes, shrubs, …)
• Hyponyms: types of trees (pine, tropical, deciduous, ...)
• Information assumed to be known to everyone (trees have bark and leaves, they grow from seeds, they make their own food by photosynthesis), which is probably information for an encyclopedia

9 How can we improve?
The missing information is structural: in a traditional dictionary, every word points upward to its superordinate (hypernym), but not sideways to its coordinate terms or downward to its hyponyms.
This restriction comes from alphabetical ordering and from budget and size constraints, all of which can be overcome in an electronic lexical database.
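A minimal sketch of this idea, assuming a toy table of upward (hypernym) pointers built around the tree example. The words and the `HYPERNYM` table are illustrative only and are not taken from WordNet. Once the upward pointers are stored electronically, the sideways and downward links can be derived by inverting them:

```python
from collections import defaultdict

# Toy upward pointers (word -> superordinate); illustrative, not WordNet data.
HYPERNYM = {
    "tree": "plant",
    "bush": "plant",
    "shrub": "plant",
    "pine": "tree",
}

def build_hyponyms(hypernym):
    """Invert the upward pointers to obtain downward (hyponym) links."""
    hyponyms = defaultdict(list)
    for word, parent in hypernym.items():
        hyponyms[parent].append(word)
    return hyponyms

def coordinates(word, hypernym, hyponyms):
    """Coordinate terms are the other words sharing this word's superordinate."""
    parent = hypernym.get(word)
    if parent is None:
        return []
    return [w for w in hyponyms[parent] if w != word]

HYPONYMS = build_hyponyms(HYPERNYM)
print(HYPONYMS["tree"])                         # downward links of "tree"
print(coordinates("tree", HYPERNYM, HYPONYMS))  # sideways links of "tree"
```

A printed dictionary cannot afford to store these derived links for every entry; a database can compute them on demand.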

10 Focus: Introduction, WordNet, eXtended WordNet, Summary

11 What is WordNet?
WordNet is a lexical database for the English language.
WordNet 3.0 has [1]:
• 117,097 nouns (the average noun has 1.23 senses)
• 11,488 verbs (the average verb has 2.16 senses)
• 22,141 adjectives
• 4,601 adverbs
Created and maintained at the Cognitive Science Laboratory of Princeton University.
Accessible online at http://wordnetweb.princeton.edu/perl/webwn (also downloadable).
Interfaces are available in C, .NET, Java, Perl, PHP, Python, SQL, etc. (JWNL, WordNet.Net, RTiA wordNet, pywordne..)

12 WordNet Structure
Words are organized as synsets in WordNet.
There are four disjoint kinds of synsets, containing either:
• nouns
• verbs
• adjectives
• adverbs

13 What is a synset?
• The basic unit of WordNet
• A group of synonymous words that refer to a common semantic concept
• Words may belong to more than one synset; the first sense listed is the most frequent sense
• Words also include collocations (“eye contact”, “mix up”)

14 Synset example
“car” as in:
• {car, auto, automobile, machine, motorcar}
• {car, railcar, railway car, railroad car}
“Chocolate” as in: [figure not included in transcript]
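The synset idea can be sketched as a small data structure. This is a toy model, not the WordNet API: it stores only the two “car” synsets quoted above, and the names `SYNSETS`, `build_index`, `senses`, and `synonyms` are made up for the illustration.

```python
from collections import defaultdict

# The two "car" synsets from the slide, listed in sense order.
SYNSETS = [
    ("car", "auto", "automobile", "machine", "motorcar"),
    ("car", "railcar", "railway car", "railroad car"),
]

def build_index(synsets):
    """Map each word to the synsets that contain it, preserving sense order."""
    index = defaultdict(list)
    for synset in synsets:
        for word in synset:
            index[word].append(synset)
    return index

INDEX = build_index(SYNSETS)

def senses(word):
    """Return every synset the word belongs to (an empty list if unknown)."""
    return INDEX.get(word, [])

def synonyms(word, sense=0):
    """Return the other members of the synset for the given sense of the word."""
    synset = senses(word)[sense]
    return [w for w in synset if w != word]

print(len(senses("car")))   # "car" belongs to more than one synset
print(synonyms("car", 1))   # synonyms of the second sense
```

A word is polysemous in this model exactly when `senses(word)` returns more than one synset.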

15 How are synsets related?
A list of pointers is associated with each synset to express its relationships to other synsets.
WordNet defines 17 relations:
• 10 between synsets
• 5 between word senses
• “gloss” (between a synset and a sentence, i.e. a textual definition for each synset)
• “frame” (between a synset and a verb construction pattern)

16 WordNet relations [figure not included in transcript]

18 Applications of WordNet
• Information Extraction
• Information Retrieval
• Question Answering
• Word Sense Disambiguation
• Text Inference
• Coreference, coherence and metonymy
• Knowledge acquisition
• Internet search engines

19 Limitations of WordNet
• Designed as a semantic lexicon, not a knowledge base
• Limited connections between topically related words
• Lack of morphological relations (a separate algorithm handles morphology)
• Lack of selectional restrictions
• And more [6]

20 Focus: Introduction, WordNet, eXtended WordNet, Summary

21 eXtended WordNet [2]
A project at the Human Language Technology Research Institute at The University of Texas at Dallas (http://xwn.hlt.utdallas.edu).
Provides several important enhancements (over WordNet 2.0) intended to remedy the present limitations of WordNet.
Current version: eXtended WordNet 2.0 (xwn 2.0-1.1)

22 Objective of eXtended WordNet
• Exploit the rich information available in synset glosses (a gloss is a sentence, i.e. a textual definition for each synset)
• Add semantic and logical enhancements to WordNet
• Increase the connectivity among the synsets by at least one order of magnitude
• Enable access to a broader context for each concept

23 What does eXtended WordNet do? [5]
Preprocessing and Parsing
• Separation of glosses into definitions and examples, tokenization, and identification of compound words
Word Sense Disambiguation
• All words in a gloss are tagged with their appropriate senses and linked to the corresponding synsets
Logical Form Transformation
• Gloss → logical forms
Topical Relations
• Connections are established between words based on context/topic
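The first preprocessing step can be sketched as follows. This is not the eXtended WordNet implementation. It assumes a gloss consists of a definition followed by example sentences in double quotes, separated by semicolons. The example sentence in the sample gloss and the function names `split_gloss` and `tokenize` are made up for illustration. Compound-word identification is not covered.

```python
import re

def split_gloss(gloss):
    """Split a gloss into its definition text and its quoted example sentences."""
    examples = re.findall(r'"([^"]*)"', gloss)
    # Drop the quoted examples, then trim the separators left at the ends.
    definition = re.sub(r'"[^"]*"', "", gloss).strip(" ;")
    return definition, examples

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical gloss: the example sentence is invented for this sketch.
sample = 'a court on which tennis is played; "they met at the tennis court"'
definition, examples = split_gloss(sample)
print(definition)
print(examples)
print(tokenize(definition))
```

Only separators at the ends are trimmed, so a definition that itself contains a semicolon (such as “has two wheels; moved by foot pedals”) stays in one piece.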

24 Extended WordNet: tennis court
“Tennis court: A court on which tennis is played.”
[Figure: graph connecting “tennis court” to the gloss words play, court, and tennis, with edges labeled object, location-of, and def; tennis is linked to the synset {“tennis”, “lawn tennis”}]

25 eXtended WordNet format
Consists of four XML files, one for each part of speech:
• Noun
• Verb
• Adjective
• Adverb
The XML tags contain attributes that specify the relationships.
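As a rough illustration of reading such a file, the sketch below parses a small XML fragment with Python's standard `xml.etree.ElementTree`. The element names, attribute names, and sense numbers in `SAMPLE` are invented for this sketch and are not the real eXtended WordNet schema. Consult the project's format documentation for the actual tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: tag names, attribute names, and sense numbers are invented.
SAMPLE = """
<glosses pos="NOUN">
  <gloss synset="tennis_court">
    <text>a court on which tennis is played</text>
    <word lemma="court" sense="1"/>
    <word lemma="tennis" sense="1"/>
    <word lemma="play" sense="1"/>
  </gloss>
</glosses>
"""

def read_glosses(xml_text):
    """Return {synset id: gloss record} from the hypothetical markup above."""
    root = ET.fromstring(xml_text)
    records = {}
    for gloss in root.iter("gloss"):
        # Each tagged word carries its lemma and sense number as attributes.
        tagged = [(w.get("lemma"), int(w.get("sense"))) for w in gloss.iter("word")]
        records[gloss.get("synset")] = {
            "pos": root.get("pos"),
            "text": gloss.findtext("text"),
            "tagged_words": tagged,
        }
    return records

GLOSSES = read_glosses(SAMPLE)
print(GLOSSES["tennis_court"]["tagged_words"])
```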

26 eXtended WordNet: Applications
Core knowledge base for applications:
• Question Answering
• Information Retrieval
• Information Extraction
• Summarization
• Natural Language Generation
• Inferences
• Other knowledge-intensive applications

27 Focus: Introduction, WordNet, eXtended WordNet, Summary

28 Further Reading
• W3C RDF/OWL Representation of WordNet: http://www.w3.org/TR/wordnet-rdf/
• eXtended WordNet format/algorithm: http://xwn.hlt.utdallas.edu/wsd.html
• Current research at Princeton: http://wordnet.cs.princeton.edu/projects.html
• Related projects (APIs, web interface, extensions): http://wordnet.princeton.edu/wordnet/related-projects/

29 Backup

30 WordNet Statistics
Ref: http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html

POS         Monosemous Words and Senses   Polysemous Words   Polysemous Senses
Noun        101321                        15776              43783
Verb        6261                          5227               18629
Adjective   16889                         5252               14413
Adverb      3850                          751                1870
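The average number of senses per word can be worked out from these counts. The short sketch below uses only the figures in the table. The name `average_senses` is made up for the illustration. Adding the monosemous and polysemous word counts gives the per-category totals quoted on the “What is WordNet?” slide (for example, 101321 + 15776 = 117097 nouns).

```python
# (monosemous words and senses, polysemous words, polysemous senses)
STATS = {
    "Noun": (101321, 15776, 43783),
    "Verb": (6261, 5227, 18629),
    "Adjective": (16889, 5252, 14413),
    "Adverb": (3850, 751, 1870),
}

def average_senses(pos):
    """Average number of senses per word for one part of speech."""
    mono, poly_words, poly_senses = STATS[pos]
    total_words = mono + poly_words
    # Each monosemous word contributes exactly one sense.
    total_senses = mono + poly_senses
    return total_senses / total_words

for pos in STATS:
    print(f"{pos}: {average_senses(pos):.3f} senses per word")
```

For nouns this gives 145104 / 117097, roughly 1.239, and for verbs 24890 / 11488, roughly 2.167. Truncated to two decimals, these match the 1.23 and 2.16 quoted earlier in the deck.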

31 WordNet relations-0
Relations between synsets:
• Synonymy
• Hypernymy (superordination)
• Hyponymy (subordination)
• Holonymy (the whole that something is a part of)
• Meronymy (a part of something)
• Antonymy
• Troponymy (a particular way of doing something)

32 WordNet relations-1
Antonymy relation: (sweet)
Definition: having a pleasant taste (as of sugar)
Has the antonym: (sour)
Definition: having a sharp biting taste
Troponymy relation: (dream)
Definition: experience while sleeping
Has the troponym: (fantasize)
Definition: have fantasies

33 WordNet relations-2
Synonymy relation: (motor vehicle, automotive vehicle)
Definition: a self-propelled wheeled vehicle that does not run on rails
Hypernymy relation: (vehicle)
Definition: a conveyance that transports people or objects
Hyponymy relation: (ambulance)
Definition: a vehicle that takes people to and from hospitals
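Because hypernymy chains upward, an “is a kind of” test reduces to walking the pointers. The sketch below uses a toy table containing only the three synsets from this example. Real WordNet may place intermediate synsets between them. The names `HYPERNYM`, `hypernym_chain`, and `is_a` are made up for the illustration.

```python
# Toy hypernym pointers based on the slide's example; illustrative only.
HYPERNYM = {
    "ambulance": "motor vehicle",
    "motor vehicle": "vehicle",
}

def hypernym_chain(synset, hypernym=HYPERNYM):
    """Follow upward pointers from a synset and return every ancestor in order."""
    chain = []
    seen = {synset}
    while synset in hypernym:
        synset = hypernym[synset]
        if synset in seen:  # guard against cycles in malformed data
            break
        seen.add(synset)
        chain.append(synset)
    return chain

def is_a(specific, general, hypernym=HYPERNYM):
    """True if `general` is reachable from `specific` via hypernym pointers."""
    return general in hypernym_chain(specific, hypernym)

print(hypernym_chain("ambulance"))
print(is_a("ambulance", "vehicle"))
```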

34 WordNet relations-3
Holonymy relation: (bicycle wheel)
Definition: the wheel of a bicycle
Has the holonym: (bicycle, bike, wheel)
Definition: has two wheels; moved by foot pedals
Meronymy relation: (bicycle wheel)
Definition: the wheel of a bicycle
Has the meronym: (spoke, radius)
Definition: a radial member of a wheel joining the hub to the rim
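Holonymy and meronymy are inverses of each other, so one set of pointers is enough to derive the other. The sketch below stores part-to-whole pointers for the bicycle example and inverts them. It is a toy model in which each part has a single whole, which real WordNet does not guarantee. The names `HOLONYM`, `invert`, and `all_parts` are made up for the illustration.

```python
from collections import defaultdict

# Part -> whole (holonym) pointers taken from the slide's example.
HOLONYM = {
    "spoke": "bicycle wheel",
    "bicycle wheel": "bicycle",
}

def invert(holonym):
    """Derive whole -> parts (meronym) links from part -> whole pointers."""
    meronyms = defaultdict(list)
    for part, whole in holonym.items():
        meronyms[whole].append(part)
    return meronyms

def all_parts(whole, meronyms):
    """Collect direct and indirect parts of a whole, depth-first."""
    parts = []
    for part in meronyms.get(whole, []):
        parts.append(part)
        parts.extend(all_parts(part, meronyms))
    return parts

MERONYMS = invert(HOLONYM)
print(all_parts("bicycle", MERONYMS))
```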

35 WordNet relations [figure not included in transcript]

36 Example: “limb” [figure not included in transcript]

37 WordNet Task Force
• Aims to support the deployment of WordNet in RDF/OWL
• Proposes inclusion of RDF or OWL versions of wordnets and lexical ontologies in the official distributions
• Integrates existing data models to provide a unified OWL vocabulary for RDF versions of wordnets
• Distills the most agreed-upon practices for developing ontologies out of wordnets and includes them in a set of recommendations

38 Conversion (contd)
g(Synset_ID,Gloss).
The g operator specifies the gloss for a synset. Gloss is a string.
Maps to: wn:gloss(Synset_ID, Gloss)
hyp(Synset_ID_A,Synset_ID_B).
The hyp operator specifies that the second synset is a hypernym of the first synset. This relation holds for nouns and verbs. The inverse relation, hyponym, implies that the first synset is a hyponym of the second synset.
Maps to: wn:hyponymOf(Synset_ID_A, Synset_ID_B)
More details at http://www.w3.org/TR/wordnet-rdf/#details
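The mapping for these two operators can be sketched as a small converter from Prolog-style facts to (property, subject, object) tuples. This is an illustration of the mapping described above, not the W3C conversion tool. The synset IDs in the sample facts are made up, and a real converter would emit proper RDF rather than plain tuples.

```python
import re

# Matches facts of the form operator(first_id,rest).
FACT = re.compile(r"^(\w+)\((\d+),(.*)\)\.$")

def convert(line):
    """Map one g/hyp fact to a (property, subject, object) tuple, else None."""
    match = FACT.match(line.strip())
    if match is None:
        return None
    operator, first, rest = match.groups()
    if operator == "g":
        # The gloss is a quoted string; drop the surrounding single quotes.
        return ("wn:gloss", first, rest.strip().strip("'"))
    if operator == "hyp":
        # The first synset is a hyponym of the second.
        return ("wn:hyponymOf", first, rest.strip())
    return None  # other operators are outside this sketch

# Made-up synset IDs, for illustration only.
facts = [
    "g(100000001,'a court on which tennis is played').",
    "hyp(100000001,100000002).",
]
for fact in facts:
    print(convert(fact))
```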

