WordNet Ontology Robert Lehman
WordNet What is WordNet? Why create WordNet? Word Relationships Limitation Applications and Related Projects RDF/OWL Demo
What is WordNet? A large lexical database of the English language. Contains over 150,000 words. It is a knowledge base that models the way the mind stores and uses language. Developed under the direction of George A. Miller, professor of psychology at Princeton University.
Why WordNet? People look up information about words in dictionaries. Dictionaries are built to be used by humans, not machines. WordNet combines traditional lexicographic information with modern computing. WordNet groups English words into synonym sets (synsets for short) and links them through semantic relations.
Why WordNet? The goal of WordNet was to develop a system consistent with the knowledge acquired over the years about how human beings process language. For example, a human can quickly verify that canaries can sing because a canary is a songbird (one level). With more time, we can determine that canaries can fly because a canary is a bird and birds can fly (two levels). With even more levels, humans can verify that canaries have skin by the same kind of reasoning.
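The canary example can be sketched as a lookup that climbs "is a" links until it finds the property. This is only an illustration: the dictionaries below are toy data rather than WordNet content, `levels_to_verify` is a hypothetical helper, and placing "animal" directly above "bird" is an assumption made to keep the example short.

```python
# Toy data, not taken from WordNet: each word points to its parent category.
IS_A = {"canary": "songbird", "songbird": "bird", "bird": "animal"}

# Each property is stored once, at the most general category it applies to.
PROPERTIES = {
    "songbird": {"can sing"},
    "bird": {"can fly"},
    "animal": {"has skin"},
}


def levels_to_verify(word, prop):
    """Return how many is-a links are followed before prop is found, else None."""
    levels = 0
    current = word
    while current is not None:
        if prop in PROPERTIES.get(current, set()):
            return levels
        current = IS_A.get(current)  # climb one level; None at the top
        levels += 1
    return None


print(levels_to_verify("canary", "can sing"))
print(levels_to_verify("canary", "can fly"))
print(levels_to_verify("canary", "has skin"))
```

The more general the property, the more links have to be followed, which matches the idea that verification takes longer as the number of levels grows.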
Word Relationships WordNet separates words into four types because each follows different grammatical rules: nouns, verbs, adjectives, and adverbs. Many different semantic relationships between words could have been included in WordNet, but the following were chosen because they apply broadly throughout English and are familiar to a wide variety of users.
Relationships The core concept of WordNet is the synset. A synset is a group of words that share a meaning. For example, the word 'car' appears in a synset such as {car, auto, automobile} and also in {car, railcar, railroad car}. Both sets contain 'car', but they have two different meanings.
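One way to picture synsets as a data structure is an index from each word to every synset that contains it. The sketch below uses only the two 'car' synsets from this slide; `build_index` and `are_synonyms` are hypothetical names, and WordNet's own storage format is not shown here.

```python
from collections import defaultdict

# The two synsets from the slide, each an unordered group of words.
SYNSETS = [
    frozenset({"car", "auto", "automobile"}),
    frozenset({"car", "railcar", "railroad car"}),
]


def build_index(synsets):
    """Map every word to the list of synsets it belongs to."""
    index = defaultdict(list)
    for synset in synsets:
        for word in synset:
            index[word].append(synset)
    return index


INDEX = build_index(SYNSETS)


def are_synonyms(a, b):
    """Two words are synonyms in some sense if they share at least one synset."""
    return any(b in synset for synset in INDEX.get(a, []))


print(len(INDEX["car"]))                   # number of senses of 'car' in the toy data
print(are_synonyms("auto", "automobile"))
print(are_synonyms("auto", "railcar"))
```

A word with several meanings simply shows up in several synsets, so 'auto' and 'railcar' are both linked to 'car' without being synonyms of each other.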
Types of Relationships Synonym – similar word, e.g. pipe & tube, sad & unhappy, rise & ascend. Antonym – opposite word, e.g. wet & dry, powerful & powerless, slowly & rapidly. Both can be applied to all four word types.
Noun Relationships Cont. Hyponym – Y is a hyponym of X if every Y is a (kind of) X, e.g. dog & canine, willow & tree, tree & plant. Hypernym – the reverse of hyponym: X is a hypernym of Y if every Y is a (kind of) X. Coordinate term – Y is a coordinate term of X if X and Y share a hypernym, e.g. wolf is a coordinate term of dog, and dog is a coordinate term of wolf, because both are canines.
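The three noun relationships on this slide can all be derived from a single hypernym table. The sketch below is a simplification: it uses the examples from the slides as toy data, gives each word only one hypernym, and uses hypothetical function names.

```python
# Toy hypernym table built from the slide examples: word -> its hypernym.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "carnivore",
    "willow": "tree",
    "tree": "plant",
}


def hyponyms(word):
    """Hyponyms are found by reading the hypernym table in reverse."""
    return sorted(w for w, parent in HYPERNYM.items() if parent == word)


def coordinate_terms(word):
    """Coordinate terms are the other words that share this word's hypernym."""
    parent = HYPERNYM.get(word)
    if parent is None:
        return []
    return [w for w in hyponyms(parent) if w != word]


print(hyponyms("canine"))
print(coordinate_terms("dog"))
```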
Noun Relationships Cont. Holonym – contains, e.g. building & windows, mouth & teeth. Meronym – is part of, e.g. window & building, paper & book, evaporation & the water cycle. The five relationships above (hyponym, hypernym, coordinate term, holonym, meronym) apply to nouns.
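Holonym and meronym are two directions of the same part-whole link, so only one direction needs to be stored. A minimal sketch, assuming toy data from the slide examples and hypothetical function names:

```python
# Toy part-whole table from the slide examples: whole -> set of its parts.
PARTS = {
    "building": {"window"},
    "mouth": {"teeth"},
    "book": {"paper"},
}


def meronyms(whole):
    """Parts of the given whole."""
    return sorted(PARTS.get(whole, set()))


def holonyms(part):
    """Wholes that contain the given part, found by inverting the table."""
    return sorted(whole for whole, parts in PARTS.items() if part in parts)


print(meronyms("building"))
print(holonyms("window"))
```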
Verb Relationships Hypernym – verb Y is a hypernym of verb X if activity X is a (kind of) Y, e.g. to perceive is a hypernym of to listen. Troponym – verb Y is a troponym of verb X if activity Y is doing X in some manner, e.g. to lisp is a troponym of to talk. Entailment – verb Y is entailed by verb X if by doing X you must be doing Y, e.g. to sleep is entailed by to snore. Coordinate terms – verbs sharing a common hypernym, e.g. to lisp and to yell, because both are ways of talking.
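The verb relationships can be pictured as labeled links between verbs. The sketch below stores the slide's examples as records; the `Relation` record, its field names, and the helper functions are assumptions for illustration, not WordNet's own format.

```python
from collections import namedtuple

# Read each record as: `target` is the <kind> of `verb`.
Relation = namedtuple("Relation", ["kind", "verb", "target"])

VERB_RELATIONS = [
    Relation("hypernym", "listen", "perceive"),  # to listen is a kind of perceiving
    Relation("hypernym", "lisp", "talk"),
    Relation("hypernym", "yell", "talk"),
    Relation("troponym", "talk", "lisp"),        # to lisp is talking in some manner
    Relation("entailment", "snore", "sleep"),    # snoring means you are sleeping
]


def related(kind, verb):
    """Return the verbs linked to `verb` by the given kind of relationship."""
    return [r.target for r in VERB_RELATIONS if r.kind == kind and r.verb == verb]


def are_coordinate(a, b):
    """Two different verbs are coordinate terms if they share a hypernym."""
    shared = set(related("hypernym", a)) & set(related("hypernym", b))
    return a != b and bool(shared)


print(related("entailment", "snore"))
print(are_coordinate("lisp", "yell"))
```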
Adjective Relationships Related nouns, e.g. curious (adj) & curiosity (n). Similar to, e.g. quick & fast. Participle of verb, e.g. daring & to dare.
Adverb Relationship Root adjective, e.g. greatly (adverb) & great (adj).
Relationships Words can be connected through chains of relationships. For example, the hypernym relationship is organized as a hierarchy, so you can conclude that a dog is a mammal by following the links between words: a dog is a canine, a canine is a carnivore, a carnivore is a placental mammal, and a placental mammal is a mammal.
Example of Inherited Hypernyms
S: (n) dog, domestic dog, Canis familiaris (a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds) "the dog barked all night"
S: (n) canine, canid (any of various fissiped mammals with nonretractile claws and typically long muzzles)
S: (n) carnivore (a terrestrial or aquatic flesh-eating mammal) "terrestrial carnivores have four or five clawed digits on each limb"
S: (n) placental, placental mammal, eutherian, eutherian mammal (mammals having a placenta; all mammals except monotremes and marsupials)
S: (n) mammal, mammalian (any warm-blooded vertebrate having the skin more or less covered with hair; young are born alive except for the small subclass of monotremes and nourished with milk)
S: (n) vertebrate, craniate (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
S: (n) chordate (any animal of the phylum Chordata having a notochord or spinal column)
S: (n) animal, animate being, beast, brute, creature, fauna (a living organism characterized by voluntary movement)
S: (n) organism, being (a living thing that has (or can develop) the ability to act or function independently)
S: (n) living thing, animate thing (a living (or once living) entity)
S: (n) whole, unit (an assemblage of parts that is regarded as a single entity) "how big is that part compared to the whole?"; "the team is a unit"
S: (n) object, physical object (a tangible and visible entity; an entity that can cast a shadow) "it was full of rackets, balls and other objects"
S: (n) physical entity (an entity that has physical existence)
S: (n) entity (that which is perceived or known or inferred to have its own distinct existence (living or nonliving))
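The inherited hypernym listing for "dog" can be reproduced by following one hypernym link at a time until the top of the hierarchy is reached. The sketch below hard-codes the chain from this slide, using the first word of each synset as its label; `inherited_hypernyms` and `is_a` are hypothetical helpers, and real WordNet synsets may have more than one hypernym.

```python
# The chain from the slide, keyed by the first word of each synset.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "placental",
    "placental": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "whole",
    "whole": "object",
    "object": "physical entity",
    "physical entity": "entity",
}


def inherited_hypernyms(word):
    """Follow hypernym links upward and return every category passed on the way."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain


def is_a(word, category):
    """True if category appears anywhere above word in the hierarchy."""
    return category in inherited_hypernyms(word)


print(" -> ".join(["dog"] + inherited_hypernyms("dog")))
print(is_a("dog", "mammal"))
```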
Limitations of WordNet WordNet does not include etymology, pronunciation, or the forms of irregular verbs, and it contains only limited information about word usage. WordNet is a database of common words, so it does not cover specialized domain vocabulary.
Applications Used in various projects dealing with word sense disambiguation, information retrieval, automatic text classification, automatic text summarization, and automatic crossword puzzle generation. The Simpli Internet search engine used WordNet to disambiguate and expand keywords to help retrieve information online.
Interfaces WordNet has been interfaced with many programming languages. For a full list, check out
Related Projects There are many projects to build WordNets for other languages. EuroWordNet – produced WordNets for several European languages and linked them together. Not freely available. BalkaNet – produced WordNets for six European languages (Bulgarian, Czech, Greek, Romanian, Turkish, and Serbian). Also produced a free WordNet creator and editor called VisDic. WordVenture – a graphical WordNet editor that allows exploration, addition, and modification in a Java applet.
RDF/OWL Representation Three main classes: Synset, WordSense, and Word. The first two have subclasses for the lexical groups present in WordNet. Each instance of these classes has its own URI. The RDF/OWL representation follows the semantic relationships between words in WordNet.
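To make the three classes concrete, a synset can be written out as subject-predicate-object triples in which every Synset, WordSense, and Word gets its own URI. Everything specific below is an assumption for illustration: the `http://example.org/wordnet/` namespace, the URI naming pattern, and the property names `containsWordSense` and `word` are placeholders, not the identifiers of the actual RDF/OWL representation.

```python
BASE = "http://example.org/wordnet/"  # hypothetical namespace, for illustration only


def uri(kind, name):
    """Build a URI for one instance; spaces are replaced to keep the URI valid."""
    return f"{BASE}{kind}-{name.replace(' ', '_')}"


def triples_for_synset(synset_id, words):
    """Return (subject, predicate, object) triples describing one synset."""
    synset = uri("synset", synset_id)
    triples = [(synset, "rdf:type", "Synset")]
    for word in words:
        sense = uri("wordsense", f"{word}-{synset_id}")
        word_uri = uri("word", word)
        triples += [
            (sense, "rdf:type", "WordSense"),
            (word_uri, "rdf:type", "Word"),
            (synset, "containsWordSense", sense),  # placeholder property name
            (sense, "word", word_uri),             # placeholder property name
        ]
    return triples


for triple in triples_for_synset("car-1", ["car", "auto", "automobile"]):
    print(triple)
```

The WordSense layer sits between Synset and Word because the same Word (such as 'car') can take part in several synsets, each through a different sense.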
Demo ml ml
References miller.pdf