Lexical Nets Miriam Butt December 2002. WordNet Main Researchers: George Miller (Princeton), Christiane Fellbaum (Princeton) WordNet is free and runs.

Slides:



Advertisements
Similar presentations
Building Wordnets Piek Vossen, Irion Technologies.
Advertisements

Extraction and Visualisation of Emotion from News Articles Eva Hanser, Paul Mc Kevitt School of Computing & Intelligent Systems Faculty of Computing &
C SC 620 Advanced Topics in Natural Language Processing Lecture 4 1/27/04.
Statistical NLP: Lecture 3
Lexical Semantics and Word Senses Hongning Wang
Creating a Similarity Graph from WordNet
Introduction to Computational Linguisitics The Lexicon.
Ewa Rudnicka, Wojciech Witkowski, Maciej Piasecki G4.19 Research Group Institute of Informatics, Wrocław University of Technology nlp.pwr.wroc.pl plwordnet.pwr.wroc.pl.
Section 4: Language and Intelligence Overview Instructor: Sandiway Fong Department of Linguistics Department of Computer Science.
1 Words and the Lexicon September 10th 2009 Lecture #3.
Stemming, tagging and chunking Text analysis short of parsing.
1 Collocation and translation MA Literary Translation- Lesson 2 prof. Hugo Bowles February
A STUDY ON THE KNOWLEDGE SOURCES OF TURKISH EFL LEARNERS IN LEXICAL INFERENCING İlknur İSTİFÇİ Anadolu University Eskişehir, TURKEY Eskişehir, TURKEY.
Structured lexicons and Lexical semantics Especially WordNet ® See D Jurafsky & JH Martin: Speech and Language Processing, Upper Saddle River NJ (2000):
Using resources WordNet and the BNC. WordNet: History 1985: a group of psychologists and linguists start to develop a “lexical database” –Princeton University.
C SC 620 Advanced Topics in Natural Language Processing Lecture Notes 2 1/20/04.
Article by: Feiyu Xu, Daniela Kurz, Jakub Piskorski, Sven Schmeier Article Summary by Mark Vickers.
The Study of Meaning in Language
VERB PHRASE. What are verbs? Verbs provide the focal point of the clause. The main verb in a clause determines the other clause elements that can occur.
1 Indo WordNet A WordNet for Hindi Centre for Technology Development for Indian Languages Computer Science and Engineering Department, IIT Bombay.
Antonym Creation Tool Presented By Thapar University WordNet Development Team.
Important lexical items nouns, verbs, adjectives, adverbs, prepositions, and different lexical fields (e.g. colors, days of the week, action verbs, etc.)
1 Natural Language Processing (2a) Zhao Hai 赵海 Department of Computer Science and Engineering Shanghai Jiao Tong University
Linguistic modeling of professional terminology Olga Klevtsova, Tyumen State University, Russia.
1 Statistical NLP: Lecture 10 Lexical Acquisition.
COMP423.  Query expansion  Two approaches ◦ Relevance feedback ◦ Thesaurus-based  Most Slides copied from ◦
WordNet ® and its Java API ♦ Introduction to WordNet ♦ WordNet API for Java Name: Hao Li Uni: hl2489.
1 Define a model 2 Populate the lexicon. Core Model.
Jennie Ning Zheng Linda Melchor Ferhat Omur. Contents Introduction WordNet Application – WordNet Data Structure - WordNet FrameNet Application – FrameNet.
1 Query Operations Relevance Feedback & Query Expansion.
WORD SENSE DISAMBIGUATION STUDY ON WORD NET ONTOLOGY Akilan Velmurugan Computer Networks – CS 790G.
Improving Subcategorization Acquisition using Word Sense Disambiguation Anna Korhonen and Judith Preiss University of Cambridge, Computer Laboratory 15.
Vocabulary Acquisition Learning and teaching vocabulary.
WORDNET. THE WORDNET SYSTEM  Lexicographer files  Code: Lexico files  database  Search Routines and Interfaces.
Lexical Semantics Chapter 16
WordNet: Connecting words and concepts Christiane Fellbaum Cognitive Science Laboratory Princeton University.
Integrating Semantic Dictionaries for English, French and Bulgarian into the NooJ System for the Purposes of Information Retrieval Svetla Koeva, Max Silbetztein.
WordNet: Connecting words and concepts Peng.Huang.
Semantics The study of meaning in language. Semantics is…  The study of meaning in language.  It deals with the meaning of words (Lexical semantics)
인공지능 연구실 황명진 FSNLP Introduction. 2 The beginning Linguistic science 의 4 부분 –Cognitive side of how human acquire, produce, and understand.
Semantics 1: Lexical Semantics Ling400. What is semantics? Semantics is the study of the linguistic meaning of morphemes, words, phrases, sentences.Semantics.
Wordnet - A lexical database for the English Language.
Semantic distance & WordNet Serge B. Potemkin Moscow State University Philological faculty.
WordNet Enhancements: Toward Version 2.0 WordNet Connectivity Derivational Connections Disambiguated Definitions Topical Connections.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Word sense disambiguation of WordNet glosses Presenter: Chun-Ping Wu Author: Dan Moldovan, Adrian Novischi.
Ontology Engineering: from Cognitive Science to the Semantic Web Maria Teresa Pazienza University of Roma Tor Vergata, Italy 1.
1 Masters Thesis Presentation By Debotosh Dey AUTOMATIC CONSTRUCTION OF HASHTAGS HIERARCHIES UNIVERSITAT ROVIRA I VIRGILI Tarragona, June 2015 Supervised.
Commonsense Reasoning in and over Natural Language Hugo Liu, Push Singh Media Laboratory of MIT The 8 th International Conference on Knowledge- Based Intelligent.
Lecture 19 Word Meanings II Topics Description Logic III Overview of MeaningReadings: Text Chapter 189NLTK book Chapter 10 March 27, 2013 CSCE 771 Natural.
Annotation Framework & ImageCLEF 2014 JAN BOTOREK, PETRA BUDÍKOVÁ
Knowledge Structure Vijay Meena ( ) Gaurav Meena ( )
Lexical Semantics and Word Senses Hongning Wang
ENGLISH 5050: English Syntax and Morphology All quotations, unless otherwise noted, are from Chapter 2 of The Grammar Book, 2nd edition. Robert F. van.
Introduction to Computational Linguisitics The Lexicon.
SPAG What we need to know….
Parts of Speech Noun, Verb, Pronoun, Adjective, Adverb, Preposition, Conjunction, Interjection 1.
Statistical NLP: Lecture 3
Entity-Relationship Model
Multimedia Content-Based Retrieval
Ontology Engineering: from Cognitive Science to the Semantic Web
About WordNet 瞿裕忠
Lecture 19 Word Meanings II
CSC 594 Topics in AI – Applied Natural Language Processing
WordNet: A Lexical Database for English
Bulgarian WordNet Svetla Koeva Institute for Bulgarian Language
WordNet WordNet, WSD.
The Study of Meaning in Language
Weak Slot-and-Filler Structures
Linguistic Essentials
Lecture 19 Word Meanings II
Presentation transcript:

Lexical Nets Miriam Butt December 2002

WordNet Main Researchers: George Miller (Princeton), Christiane Fellbaum (Princeton) WordNet is free and runs on Unix, Windows and Macintosh Aim: to create a lexical thesaurus (not a dictionary) which models the lexical organization used by humans. It is designed to be compatible with a range of applications (i.e, one module among many).

WordNet To Date: different word forms ( simple words, collocations) Each of these word classes is organized by different principles. Types of Words: nouns, verbs, adjectives, adverbs (no function words)

Organization by Likeness Fillenbaum and Jones (1965): If you give a person a/n noun, 79% of the time you get a noun adjective, 65% of the time you get an adjective verb, 43% of the time you get a verb Synsets: words are arranged in clusters of synonym sets to help identify the meaning and differentiate it from other meanings. { board, plank}

Organization by Likeness Semantic Relations: The overall organizing principle of WordNet is in terms of semantic relations (rather than orthography). So, one can ask WordNet about all these. Synonymy (dog, canine) Antonymy (rich, poor) Hyponomy (tree, maple) --- ISA Relation Meronymy (tree, limb) --- HASA Relation Entailments (snore, sleep), for verbs

Nouns Basic Organizing Principle: Lexical Inheritance System To Date: word forms organized in synsets. Some collocations, but no proper  is equivalent to ISA relation    organism ~  inverse ISA relation

Nouns ISA Top-Down organizaton: 25 unique “beginners” (Table 1) Additional Organization within Beginners: this was found to be useful (Figure 1) Planned: add details which distinguish concepts. E.g., for canary: (1) Attributes: small, yellow (adjectives) (2) Parts: beak, wings (nouns) (3) Functions: sing, fly (verbs)

Nouns Parts and Meronymy: can definite part-whole relations in a number of ways and can ask WordNet about them (p is most frequent). Component Parts: W m #p  W h, W m is component of W h (door, car) Member: W m #m  W h, W m is member of W h (tree, forest) Stuff: W m #s  W h, W m is the stuff W h is made of (aluminum, airplane)

Nouns The organization of nouns also includes information about antonyms (man, woman), all this makes for a relatively complex network (Figure 2).

Adjectives Basic Organizing Principle: Related Senses (similarity) To Date: word forms organized into synsets Descriptive, relational, color adjectives. !  expresses binary opposition heavy !  light and also light !  heavy

Adjectives Not all adjectives have direct antonyms. But one can find indirect antonyms via the similarity pointer. moist &  wet !  dry &  similarity pointer See Figure 1 for an example network. One possible organizing principle could be gradation (Table 1), but this was found not to be useful (only 2% of adjectives work this way).

Adjectives Relational Adjectives mean something like “of, relating/pertaining to, or associated with” and are generally derived from nouns. WordNet provides a pointer to the relevant noun (chemical, chemistry). See attached handout for some sample adjective codings. Syntactic Restrictions: some adjectives can only appear in attributive position (the alert child), some only in predicate position (the child is astir).

Verbs Basic Organizing Principle: Lexical Entailments To Date: word forms ( unique) organized into synsets (generally, languages seem to have less verbs than nouns). Simple Verbs, but also particle verbs (look up)

Verbs Verb Organization: Verbs are sorted into 15 files, based on semantic criteria (cf. also Levin (1993) on English verb classes). 1)Bodily Function and Care: 275 synsets, sweat, shiver, faint, freeze 2) Change: 750 synsets, change, alter, vary, modify 3) Communication: 710 synsets, beg, order, lisp, neigh 4) Competition: 200 synsets, duel, face-off, fight, referee 5) Consumption: 130 synsets, drink, eat

Verbs 6) Contact: 820 synsets, fasten, attach, rub, paw, box 7) Cognition: deduce, induce, infer, guess 8) Creation: 250 synsets, create, invent, sew, bake 9) Motion: 500 synsets, move, travel, run, gallop, fly 10) Emotion/Psych: fear, miss, love, amuse, anger

Verbs 14) Social Interaction: 400 synsets,impeach, petition, quarrel 15) Weather (smallest file): 66 synsets, rain, thunder 11) Stative (“other” category): 200 synsets, equal, differ, be 12) Perception: 200 synsets): watch, spy, gaze, ache, hurt 13) Possession: 300 synsets): have, give, take, rob

Verbs Relational: Relate the differing verbs in clusters using the factors identified by Decompositional Semantics as crucial: Cause, Become (=Change), Manner. WordNet takes a Relational rather than a Decompositional Approach Decompositional: bake = Cause(x, Become (y,BAKED)) TRAVEL Example: swim is a type/manner of TRAVEL through water

Verbs Lexical Entailments: if you snore, that entails you are sleeping. As this seems to be a psychologically useful way to model things, take advantage of it. Further Factors: 1) temporal inclusion/co-extensiveness 2) Causal Relation/Backward Presupposition Example: see Figure 3

Familiarity Index It would be nice to know how “familiar” or “common” a word is. Frequency has often been implicated in finding this out, but it is hard to calculate frequence for all of English. Another idea is to take Polysemy as an indicator of familiarity. WordNet calculates this and you can ask it about the Familiaity Index of a given word. Example: horse is common as a noun, polysemy count = 6

Morphy WordNet has the same problem as many other applications: one needs to figure out the lemma for any given word form. Solution: home grown stemmer called Morphy, which relies on a lexicon and a crude knowledge of English morphology (Table 4).

FrameNet Main Researchers: Charles Fillmore (Berkeley) FrameNet is not yet accessible, though partner projects ar contributing to the effort (e.g., IMS Stuttgart). Aim: to create a computational lexicon which describes the “semantic frames” and valencies of verbs, nouns and adjectives.

FrameNet A Typical Frame: frame(CommericalTransaction) frame-elements{ BUYER, SELLER, PAYMENT, GOODS } scenes( BUYER gets GOODS, SELLER gets PAYMENT ) Frames can be inherited: frame(RealEstateTransaction) inherits(CommercialTransaction) link( BORROWER=BUYER, LOAN=PAYMENT ) frame-elements{ BORROWER, LOAN, LENDER } scenes( LOAN (from LENDER) creates PAYMENT, BUYER gets LOAN )

FrameNet Plan: One would like to automatically tag texts with semantic frames. See Figures 1 and 2 (from Coling 1998). First Experiment: Gildea and Jurafsky (2001) See Tables 8 and 9