Lexical Ambiguity Resolution / Sense Disambiguation


Outline:
- Supervised methods
- Non-supervised methods
  - Class-based models
  - Seed models
- Vector models
- EM iteration
- Unsupervised clustering
- Sense induction
- Anaphora resolution
-- CS466 Lecture XVIII --

For sense disambiguation:
- Ambiguous verbs (e.g., "to fire") depend heavily on words in the local context, in particular their objects.
- Ambiguous nouns (e.g., "plant") depend on wider context. For example, seeing greenhouse, nursery, or cultivation within a window of +/- 10 words is very indicative of the sense.
-- CS466 Lecture XVI --
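The narrow-vs-wide window distinction can be sketched as a simple feature extractor; the sentence and window sizes below are illustrative assumptions, not from the lecture:

```python
def window_features(tokens, i, k):
    """Words within +/- k positions of tokens[i], excluding the target itself."""
    lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
    return [tokens[j] for j in range(lo, hi) if j != i]

sent = ("the greenhouse staff moved every plant to the "
        "nursery before cultivation season").split()
target = sent.index("plant")
narrow = window_features(sent, target, 2)   # local context, as used for verbs
wide = window_features(sent, target, 10)    # wide context, as used for nouns
```

With the wide window, topical cues like "greenhouse" and "cultivation" become visible; the narrow window sees only immediate neighbors.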

Deficiency of the "bag-of-words" approach: context is treated as an unordered bag of words, as in the vector model (and also the earlier neural network models, etc.), so order and distance information is lost.

Observations: Words tend to exhibit only one sense in a given collocation or word association.

Two-word collocations (word to the left or word to the right), e.g. "oxygen tank", "Panzer tank", "empty tank":

Collocation       P(container)  P(vehicle)
oxygen tank       .99           .01
Panzer tank       .04           .96

Collocation       P(Person)  P(Place)
in Madison        .01        .99
with Dr. Madison  .95        .05
Madison Airport   .02        .98
Mayor Madison     .96        .04

Formally, P(sense | collocation) is a low-entropy distribution.
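The low-entropy claim can be checked directly; a minimal sketch using the probabilities from the table above:

```python
from math import log2

def entropy(dist):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * log2(p) for p in dist if p > 0)

h_oxygen = entropy([0.99, 0.01])   # "oxygen tank": almost certainly container
h_uniform = entropy([0.5, 0.5])    # a maximally ambiguous collocation
```

A good collocational cue sits far below the 1-bit entropy of a coin flip.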

Collocation means (originally) "in the same location", i.e. co-occurring in some defined relationship:
- Adjacent (bigram collocations)
- Verb/object collocations
- Co-occurrence within +/- k words
Examples: "fire her", "fire the long rifles", "made of lead, iron, silver, ..."

Other interpretation: an idiomatic (non-compositional, high-frequency) association, e.g. "soap opera", "Hong Kong".

Order and sequence matter:
- plant pesticide -> living plant
- pesticide plant -> manufacturing plant
- a solid lead -> advantage or head start
- a solid wall of lead -> metal
- a hotel in Madison -> place
- I saw Madison in a hotel bar -> person

Observation: distance matters. Adjacent words are more salient than those 20 words away, yet the bag-of-words model gives all positions the same weight.
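One way to act on this observation is to weight each context word by its inverse distance from the target rather than uniformly; the 1/d scheme below is an illustrative assumption, not the lecture's:

```python
def weighted_context(tokens, i, k):
    """Weight each context word by 1/distance so adjacent words dominate."""
    weights = {}
    for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
        if j != i:
            d = abs(j - i)
            weights[tokens[j]] = weights.get(tokens[j], 0.0) + 1.0 / d
    return weights

# target word "lead" at index 4
w = weighted_context("a solid wall of lead pipe".split(), 4, 3)
```

Here "of" (distance 1) outweighs "wall" (distance 2), which outweighs "solid" (distance 3).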

Observations: Words tend to exhibit only one sense in a given discourse or document (= word form).
- Very unlikely to have living plants and manufacturing plants referenced in the same document; there is a tendency to use a synonym like "factory" to minimize ambiguity -- communicative efficiency (Grice).
- Unlikely to have Mr. Madison and Madison the city in the same document.
- Unlikely to have Turkey (both the country and the bird) in the same document.
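The one-sense-per-discourse observation suggests a simple post-processing step: relabel every occurrence of the target word in a document with the document's majority sense. A hedged sketch (the helper name is invented):

```python
from collections import Counter

def one_sense_per_discourse(token_senses):
    """Relabel all occurrences in one document with the majority sense."""
    if not token_senses:
        return []
    majority, _ = Counter(token_senses).most_common(1)[0]
    return [majority] * len(token_senses)

# three confident labels outvote one likely error
fixed = one_sense_per_discourse(["living", "living", "manufacturing", "living"])
```

This can correct isolated per-occurrence errors at the cost of the rare document that genuinely mixes senses.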

Vector Models for Word Sense
[Figure: vector-space plot showing sense 1 and sense 2 context vectors clustered around their respective centroids.]

Computing sense centroids and assigning senses:

For each training vector of a sense:
    for each term in vecs[docn]:
        Sum[term] += vec[docn][term]    # centroid = sum of that sense's vectors

For a new context vector Xi:
    S1 = Sim(centroid 1, Xi)
    S2 = Sim(centroid 2, Xi)
    if S1 > S2: assign sense 1, else assign sense 2
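A runnable version of this centroid procedure, using cosine similarity; the toy training contexts for "plant" are invented for illustration:

```python
from collections import Counter
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v.get(t, 0) for t in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(contexts):
    """Sum the labeled context vectors of one sense."""
    total = Counter()
    for c in contexts:
        total.update(c)
    return total

# hypothetical hand-tagged contexts for the two senses of "plant"
sense1 = [Counter("green leaf species garden".split()),
          Counter("living species flower".split())]
sense2 = [Counter("assembly factory workers".split()),
          Counter("factory closure equipment".split())]
c1, c2 = centroid(sense1), centroid(sense2)

def assign(context):
    """Assign the sense whose centroid is closer in cosine similarity."""
    return 1 if cosine(context, c1) > cosine(context, c2) else 2
```

A new context like "species garden flower" lands near centroid 1; "factory workers" lands near centroid 2.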

Vector Models for Person / Place
[Figure: vector-space plot of person and place context vectors, with the person centroid and place centroid marked.]

Vector Models for Lexical Ambiguity Resolution / Lexical Classification

Treat labeled contexts as vectors:

Class     W-3   W-2   W-1    W0       W1         W2       W3
PLACE     long  way   from   Madison  to         Chicago
COMPANY               When   Madison  investors  issued   a

Convert each context to a traditional vector (V328, V329, ...), just like a short query.

Training Space (Vector Model)
[Figure: training space with Person, Place, Company, and Event example vectors scattered around their four class centroids; a new example is classified by the nearest centroid.]

Problem with supervised methods: tagged training data is expensive (time, resources). Solution: class discriminators can serve as effective word-sense discriminators, and are much less costly to train if we can tolerate some noise in the models.

Pseudo-Class Discriminators

What if class lists (like Roget's) are not available? Create small classes optimized for the target ambiguity:
- Class(Crane 1) = heron, stork, eagle, condor, ...
- Class(Crane 2) = derrick, forklift, bulldozer, ...
- Class(Tank 1) = jeep, vehicle, Humvee, Bradley, Abrams, ...
- Class(Tank 2) = vessel, container, flask, pool

Include synonyms, hypernyms (parent in tree), hyponyms (child in tree), and topically related words. These classes are smaller and potentially more specific, but less robust.
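A minimal sketch of using such pseudo-classes as discriminators: count how many context words fall in each class list and pick the larger overlap. The class lists come from the slide; the example sentence is invented:

```python
CRANE_BIRD = {"heron", "stork", "eagle", "condor"}
CRANE_MACHINE = {"derrick", "forklift", "bulldozer"}

def class_score(context_words, class_words):
    """Count context words that belong to the pseudo-class list."""
    return sum(1 for w in context_words if w in class_words)

ctx = "the crane lifted beams beside a derrick and a forklift".split()
bird = class_score(ctx, CRANE_BIRD)
machine = class_score(ctx, CRANE_MACHINE)
```

Two machine-class hits and zero bird-class hits point to the machine sense of "crane".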

Goal: Iterative Refinement / Sorting

Initial state:
- Output of (poor) class models
- Contexts that match reliable seed words
- Small sets of hand-tagged data

Seed words: reliable collocations of the target word; they need not be synonyms.

Plant 1 (living)                 Plant 2 (manufacturing)
plant life                       assembly plant
plant species                    plant closure
microscopic plant                "employee" within +/- 10 words
living plant
"animal" within +/- 10 words

-- CS466 Lecture XIX --
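The seed stage can be sketched as a pattern match over the target's context; occurrences labeled this way would feed the next refinement iteration as training data. The seed collocations are taken from the slide; the matching code is an illustrative assumption:

```python
# seed collocations for "plant", mapped to sense 1 (living) or 2 (manufacturing)
SEEDS = {
    "plant life": 1, "plant species": 1,
    "microscopic plant": 1, "living plant": 1,
    "assembly plant": 2, "plant closure": 2,
}

def seed_label(context_tokens):
    """Return a sense if a seed collocation occurs in the context, else None."""
    text = " ".join(context_tokens).lower()
    for colloc, sense in SEEDS.items():
        if colloc in text:
            return sense
    return None
```

Unmatched occurrences stay unlabeled; later iterations would add newly reliable collocations to the seed set and relabel.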