Published by Jonah Richards. Modified over 8 years ago.
1
Chapter 19 Lexical Semantics
2
Lexical Ambiguity
Most words in natural languages have multiple possible meanings.
–“pen” (noun)
  The dog is in the pen.
  The ink is in the pen.
–“take” (verb)
  Take one pill every morning.
  Take the first right past the stoplight.
Syntax helps distinguish meanings for different parts of speech of an ambiguous word.
–“conduct” (noun or verb)
  John’s conduct in class is unacceptable.
  John will conduct the orchestra on Thursday.
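The role of syntax can be sketched with a toy lexicon keyed by word and part of speech. The table `TOY_SENSES`, its glosses, and the helper `candidate_senses` are illustrative assumptions, not a real sense inventory; the part-of-speech tag is assumed to come from a tagger.

```python
# Toy sense inventory keyed by (word, part of speech).
# The glosses are illustrative assumptions, not entries from a real dictionary.
TOY_SENSES = {
    ("conduct", "noun"): ["behavior"],
    ("conduct", "verb"): ["lead a musical group"],
    ("pen", "noun"): ["enclosure for animals", "writing instrument"],
}


def candidate_senses(word, pos):
    """Return the senses that remain possible once the part of speech is known."""
    return TOY_SENSES.get((word, pos), [])


# Knowing the part of speech resolves "conduct" in this toy lexicon ...
print(candidate_senses("conduct", "verb"))
# ... but "pen" stays ambiguous, because both of its senses are nouns.
print(candidate_senses("pen", "noun"))
```

Part of speech narrows the candidates, but ambiguity within one part of speech still needs other evidence.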
3
Motivation for Word Sense Disambiguation (WSD)
Many tasks in natural language processing require disambiguation of ambiguous words.
–Question Answering
–Information Retrieval
–Machine Translation
–Text Mining
–Phone Help Systems
4
Sense Inventory
What is a “sense” of a word?
–Homonyms (disconnected meanings)
  bank: financial institution
  bank: sloping land next to a river
–Polysemes (related meanings with joint etymology)
  bank: financial institution as corporation
  bank: a building housing such an institution
Sources of sense inventories
–Dictionaries
–Lexical databases
5
WordNet
A detailed database of semantic relationships between English words.
Developed by the cognitive psychologist George Miller and a team at Princeton University.
About 144,000 English words.
Nouns, adjectives, verbs, and adverbs are grouped into about 109,000 synonym sets called synsets.
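The idea of a synset can be sketched as a set of words that share one meaning, with a word belonging to one synset per sense. The identifiers and members below are made-up toy data and do not reproduce actual WordNet entries or its API.

```python
from collections import defaultdict

# Hypothetical miniature synsets: each one is a set of words sharing a meaning.
TOY_SYNSETS = {
    "bank-1": {"bank", "financial institution"},
    "bank-2": {"bank", "riverside"},
    "car-1": {"car", "auto", "automobile"},
}

# Invert the mapping so that a word leads to every synset it belongs to.
word_to_synsets = defaultdict(set)
for synset_id, members in TOY_SYNSETS.items():
    for member in members:
        word_to_synsets[member].add(synset_id)


def synonyms(word):
    """Collect the other members of every synset that contains the word."""
    found = set()
    for synset_id in word_to_synsets.get(word, set()):
        found |= TOY_SYNSETS[synset_id]
    found.discard(word)
    return found


print(sorted(word_to_synsets["bank"]))  # an ambiguous word sits in several synsets
print(sorted(synonyms("auto")))
```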
6
WordNet Synset Relationships
See Figures 19.2, 19.3, and WordNet (in lecture)
7
EuroWordNet
WordNets for
–Dutch
–Italian
–Spanish
–German
–French
–Czech
–Estonian
8
WordNet Senses
WordNet senses (like many dictionary senses) tend to be very fine-grained.
“play” as a verb has 35 senses, including
–play a role or part: “Gielgud played Hamlet”
–pretend to have certain qualities or state of mind: “John played dead.”
Difficult to disambiguate to this level for people and computers.
Not clear such fine-grained senses are useful for NLP.
Several proposals for grouping senses into coarser, easier-to-identify senses (e.g. homonyms only).
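One way to picture sense grouping is a mapping from fine-grained labels to coarser groups, with scoring done at either level. The labels and the grouping below are hypothetical and are not taken from WordNet or from any published proposal.

```python
# Hypothetical fine-grained labels for the verb "play", mapped to coarse groups.
FINE_TO_COARSE = {
    "play_a_role": "act",
    "play_pretend": "act",
    "play_instrument": "music",
    "play_game": "game",
}


def accuracy(gold, predicted, key=lambda sense: sense):
    """Fraction of items on which gold and predicted agree after applying key."""
    hits = sum(key(g) == key(p) for g, p in zip(gold, predicted))
    return hits / len(gold)


gold = ["play_a_role", "play_game"]
predicted = ["play_pretend", "play_game"]  # first item confuses two close senses

fine_score = accuracy(gold, predicted)
coarse_score = accuracy(gold, predicted, key=FINE_TO_COARSE.get)
print(fine_score, coarse_score)
```

Under this toy grouping, confusing two closely related fine senses costs nothing at the coarse level.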
9
Senses Based on Needs of Translation
Only distinguish senses that translate to different words in some other language.
http://spanish.about.com/cs/vocabulary/a/moreverbpairs.htm
–play: tocar vs. jugar
–know: conocer vs. saber
–be: ser vs. estar
–leave: salir vs. dejar
–take: llevar vs. tomar vs. sacar
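Translation-based sense distinctions can be sketched by grouping usages of an English word by the Spanish word chosen for them. The verb pairs come from the list above; the English usage phrases and the function name are illustrative assumptions.

```python
from collections import defaultdict

# (English word, Spanish translation, example usage).
# The usage phrases are illustrative assumptions.
OBSERVED = [
    ("play", "tocar", "play the guitar"),
    ("play", "jugar", "play soccer"),
    ("play", "jugar", "play chess"),
    ("know", "conocer", "know a person"),
    ("know", "saber", "know a fact"),
]


def senses_by_translation(observations):
    """Group usages so that each distinct translation counts as one sense."""
    senses = defaultdict(lambda: defaultdict(list))
    for english, spanish, usage in observations:
        senses[english][spanish].append(usage)
    return senses


inventory = senses_by_translation(OBSERVED)
for word, by_translation in inventory.items():
    print(word, dict(by_translation))
```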
10
Cross-Lingual Sense Distinctions: A specific idea
See Section 3.4: Resnik & Yarowsky (2000), Natural Language Engineering 5(3) (on the schedule)
Preliminaries:
“Parallel” corpora are texts that are translations of each other.
To exploit a parallel corpus, the sentences must be aligned; that is, the sentences that are translations of each other must be matched up.
If they are not already aligned after the translation process, automatic methods are used to align the sentences.
–A search for “alignment” in the ACL Anthology, which contains all ACL publications, gives 4,440 results. ACL publications are the main publications in Natural Language Processing (NLP).
Several aligned parallel corpora are available and used by the NLP research community.
One large repository of NLP data, tools and standards: Linguistic Data Consortium. Click on catalog; by type and source; search for “parallel”: 34 hits.
11
Cross-Lingual Sense Distinctions: A specific idea
Preliminaries (continued):
Table 4 is the result of combining information from electronic bilingual dictionaries, with mappings to English WordNet senses.
Resources to create a resource such as Table 4 are made available by the NLP research community.
12
Cross-Lingual Sense Distinctions: A specific idea
This specific case study involves creating word-sense training data.
Suppose our target word is “interest”.
The training data is a corpus of sentences which contain “interest” + sense tags.
Where will these sentences come from? From the English sides of the parallel corpora.
13
Cross-Lingual Sense Distinctions: A specific idea
See Section 3.4: Resnik & Yarowsky (2000), Natural Language Engineering 5(3) (on the schedule)
–Restrict sense inventory to distinctions that are lexicalized cross-lingually
–Middle ground between homographs within a single language (too coarse-grained) and all the fine-grained distinctions found in monolingual dictionaries.
Table 4: linking existing lexical resources such as WordNet with cross-lingual information from bilingual dictionaries.
–“This table suggests how multiple parallel bilingual corpora can be used to yield sets of training data [i.e., training data for English word senses without requiring human manual annotation] covering different subsets of the English sense inventory, that in aggregate may yield tagged data for all given sense distinctions when any one language alone may not be adequate”
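A minimal sketch of how an aligned corpus could yield sense-tagged English sentences without manual annotation, assuming an English-German pair. The lookup table `GERMAN_TO_SENSE`, its sense labels, and the three sentence pairs are invented for illustration and are not taken from Table 4 or from the paper.

```python
TARGET = "interest"

# Hypothetical lookup from a German translation to an English sense label.
GERMAN_TO_SENSE = {
    "zinsen": "financial_charge",
    "interesse": "curiosity",
}

# Tiny hand-made stand-in for a sentence-aligned English-German corpus.
ALIGNED = [
    ("The bank lowered the interest.", "Die Bank senkte die Zinsen."),
    ("She showed interest in music.", "Sie zeigte Interesse an Musik."),
    ("The weather was good.", "Das Wetter war gut."),
]


def tokens(sentence):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,;:!?").lower() for t in sentence.split()]


def tag_english_side(pairs):
    """Tag English sentences containing TARGET using the aligned translation."""
    tagged = []
    for english, german in pairs:
        if TARGET not in tokens(english):
            continue
        senses = {GERMAN_TO_SENSE[t] for t in tokens(german) if t in GERMAN_TO_SENSE}
        if len(senses) == 1:  # keep only sentences with unambiguous evidence
            tagged.append((english, senses.pop()))
    return tagged


training_data = tag_english_side(ALIGNED)
for sentence, sense in training_data:
    print(sense, "|", sentence)
```

A language that uses one word for both senses would contribute nothing for this distinction, which is why the quoted passage suggests combining several language pairs.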
14
Cross-Lingual Sense Distinctions: A specific idea
A lexicon such as Table 4 can help us refine the English sense inventory for the purpose of translation between English and these languages:
Senses may be dropped, split, or eliminated (which in Table 4?)
15
Semantic Roles
Also called Thematic Roles
We recently looked at PropBank again
Revisit FrameNet
16
Semantic Roles: Example of Usefulness
A shallow meaning representation that supports simple inferences.
Can generalize over different surface realizations of predicate arguments.
For example, the subject can fill different roles:
–John (agent) broke the window (theme)
–John (agent) broke the window (theme) with a rock (instrument)
–The rock (instrument) broke the window (theme)
–The window (theme) broke
–The window (theme) was broken by John (agent)
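The five example sentences can be written as role-labelled records to show the generalization. The record layout and the helper names are assumptions made for illustration; the role labels follow the examples above.

```python
# Role-labelled versions of the example sentences. The grammatical subject
# is stored separately from the semantic roles.
ANALYSES = [
    {"text": "John broke the window", "subject": "John",
     "roles": {"agent": "John", "theme": "the window"}},
    {"text": "John broke the window with a rock", "subject": "John",
     "roles": {"agent": "John", "theme": "the window", "instrument": "a rock"}},
    {"text": "The rock broke the window", "subject": "The rock",
     "roles": {"instrument": "the rock", "theme": "the window"}},
    {"text": "The window broke", "subject": "The window",
     "roles": {"theme": "the window"}},
    {"text": "The window was broken by John", "subject": "The window",
     "roles": {"theme": "the window", "agent": "John"}},
]


def what_was_broken(analysis):
    """Simple inference: the theme is the broken thing, whatever the syntax."""
    return analysis["roles"]["theme"]


def subject_role(analysis):
    """Find which semantic role the grammatical subject fills."""
    for role, filler in analysis["roles"].items():
        if filler.lower() == analysis["subject"].lower():
            return role
    return None


for analysis in ANALYSES:
    print(subject_role(analysis), "|", what_was_broken(analysis))
```

The subject's role varies from sentence to sentence, while the answer to "what was broken?" stays the same.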
17
Selectional Restrictions
Restrictions placed by a verb on the fillers of its semantic roles
“I want to eat X” (X is the theme)
Selectional restriction? Theme is edible.
“I want to eat someplace that’s close to ICSI” (see Lecture)
18
Selectional Restrictions
Associated with senses, not words:
–“They served apple pie” (THEME: edible)
–“Which airlines serve Denver?” (THEME: city)
Selectional restrictions vary widely.
–“hate”: EXPERIENCER: person; THEME: could be almost anything
–“multiply” (sense 1): THEME: numbers
We will use selectional restrictions later to help us with Word Sense Disambiguation.
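A minimal sketch of selectional restrictions used to choose a sense of “serve”, based on the two examples above. The sense labels, the noun type table, and the function name are hypothetical; a real system would draw semantic types from a lexical resource.

```python
# Hypothetical semantic types for a few nouns.
NOUN_TYPES = {"apple pie": "edible", "soup": "edible", "Denver": "city"}

# Two senses of "serve" and the type each requires of its THEME.
SERVE_SENSES = {
    "serve_food": "edible",
    "serve_destination": "city",
}


def senses_allowed(theme):
    """Keep only the senses whose THEME restriction the filler satisfies."""
    theme_type = NOUN_TYPES.get(theme)
    return [sense for sense, required in SERVE_SENSES.items()
            if required == theme_type]


print(senses_allowed("apple pie"))
print(senses_allowed("Denver"))
```

Because the restriction is attached to each sense rather than to the word, the type of the filler is enough to pick the sense in these two cases.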
19
Selectional Restrictions
Can sometimes identify metaphor
–My computer died
–The computer ate my disk
–Thoughts danced in my head
–The death of democracy was imminent
–John choked on the exam
–He was drowning in self pity
Metaphor’s relationship to word sense?
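A violated selectional restriction can be treated as a hint that a usage may be metaphorical. The sketch below assumes that the verbs shown require an animate subject and uses a made-up noun type table; a violation is only a signal, not proof of metaphor.

```python
# Hypothetical restriction each verb places on its subject.
SUBJECT_RESTRICTION = {"die": "animate", "eat": "animate", "dance": "animate"}

# Hypothetical semantic types for a few nouns.
NOUN_TYPE = {"computer": "artifact", "thoughts": "abstract", "dog": "animate"}


def maybe_metaphor(subject, verb):
    """Return True when the subject violates the verb's known restriction."""
    required = SUBJECT_RESTRICTION.get(verb)
    actual = NOUN_TYPE.get(subject)
    if required is None or actual is None:
        return False  # no information, so nothing is flagged
    return actual != required


for subject, verb in [("computer", "die"), ("thoughts", "dance"), ("dog", "eat")]:
    print(subject, verb, maybe_metaphor(subject, verb))
```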