Natural Language Processing Lecture 20: Some Problems in Semantic Disambiguation.


Semantics Road Map
1. Lexical semantics
2. Disambiguating words
   – Word sense disambiguation
   – Coreference resolution
3. Semantic role labeling
4. Meaning representation languages
5. Discourse and pragmatics
6. Compositional semantics, semantic parsing

On Banks
bank₁ = “sloping land”
bank₂ = “financial institution”
bank₃ = “biological repository”
bank₄ = “building where a bank₂ does its business”

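Zeugma Test
How can we tell whether two word senses are distinct?
- The farmers in the valley grew potatoes.
- The farmers in the valley grew bored.
- *The farmers in the valley grew potatoes and bored.
The conjoined sentence is odd (marked *), which suggests that the two uses of “grew” are distinct senses.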

Word Sense Disambiguation
Input: a word in context
Output: its sense (usually from a fixed, predefined set)
How? (The options we’ll discuss)
- Simplified Lesk Algorithm
- Decision List
- Supervised Learning

Simplified Lesk
The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable rate mortgage securities.
bank.n.01: sloping land (especially the slope beside a body of water) "they pulled the canoe up on the bank"; "he sat on the bank of the river and watched the currents"
bank.n.02: depository financial institution, bank, banking concern, banking company (a financial institution that accepts deposits and channels the money into lending activities) "he cashed a check at the bank"; "that bank holds the mortgage on my home"

Simplified Lesk
Compute the overlap between words in the target word’s context and the “signatures” of each potential sense (i.e., the words in its definition/examples). The sense with the maximum overlap is the predicted sense.
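A minimal Python sketch of the overlap computation, applied to the bank example. The stopword list, the tokenizer, and the function names are assumptions made for illustration; ties go to the first sense listed.

```python
import re

# Assumed stopword list; a real system would use a larger one.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "at", "to", "and", "it", "he",
             "they", "that", "my", "up", "into", "will", "can", "because"}

def tokens(text):
    """Return the set of lowercase word tokens, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def simplified_lesk(target, context, signatures):
    """Return the sense whose signature shares the most words with the context."""
    context_words = tokens(context) - {target}
    best_sense, best_overlap = None, -1
    for sense, signature in signatures.items():
        overlap = len(context_words & (tokens(signature) - {target}))
        if overlap > best_overlap:  # strict '>' keeps the first sense on ties
            best_sense, best_overlap = sense, overlap
    return best_sense

# Signatures: definition plus examples for each sense, as on the slide.
signatures = {
    "bank.n.01": "sloping land (especially the slope beside a body of water) "
                 "they pulled the canoe up on the bank "
                 "he sat on the bank of the river and watched the currents",
    "bank.n.02": "depository financial institution, bank, banking concern, "
                 "banking company (a financial institution that accepts deposits "
                 "and channels the money into lending activities) "
                 "he cashed a check at the bank "
                 "that bank holds the mortgage on my home",
}
context = ("The bank can guarantee deposits will eventually cover future tuition "
           "costs because it invests in adjustable rate mortgage securities.")

print(simplified_lesk("bank", context, signatures))  # bank.n.02
```

Here the context shares "deposits" and "mortgage" with bank.n.02 and nothing with bank.n.01, so the financial sense is predicted.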

Modifications to Simplified Lesk
Expand the signatures for each word sense
– Include hypernyms and/or hyponyms
– Include their definitions/examples
Corpus Lesk: add context words from a sense-labeled corpus to each sense’s signature
– Use inverse document frequency to weight words
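A rough Python sketch of the Corpus Lesk idea: extend each signature with words from sense-labeled sentences, then score the overlap by IDF instead of by raw count. The toy signatures, the tiny labeled corpus, and the choice to treat each extended signature as one "document" when computing IDF are all assumptions made for illustration.

```python
import math
from collections import Counter

def extend_signatures(signatures, labeled_corpus):
    """Add the words of each sense-labeled sentence to that sense's signature."""
    extended = {sense: list(words) for sense, words in signatures.items()}
    for sense, words in labeled_corpus:
        extended[sense].extend(words)
    return extended

def idf_weights(documents):
    """idf(w) = log(N / df(w)), where df(w) counts the documents containing w."""
    n = len(documents)
    df = Counter(w for doc in documents for w in set(doc))
    return {w: math.log(n / df[w]) for w in df}

def weighted_overlap(context, signature, idf):
    """Sum the IDF weights of the words shared by context and signature."""
    return sum(idf.get(w, 0.0) for w in set(context) & set(signature))

def corpus_lesk(context, signatures, idf):
    """Return the sense with the highest IDF-weighted overlap."""
    return max(signatures,
               key=lambda s: weighted_overlap(context, signatures[s], idf))

# Toy data (hypothetical): tokenized signatures and a two-sentence labeled corpus.
signatures = {
    "bank.n.01": ["the", "slope", "beside", "water"],
    "bank.n.02": ["the", "institution", "accepts", "deposits"],
}
labeled_corpus = [
    ("bank.n.01", ["the", "river", "flooded", "the", "bank"]),
    ("bank.n.02", ["the", "bank", "raised", "mortgage", "rates"]),
]
extended = extend_signatures(signatures, labeled_corpus)
idf = idf_weights(list(extended.values()))

context = ["the", "bank", "invests", "in", "mortgage", "securities"]
print(corpus_lesk(context, extended, idf))  # bank.n.02
```

Words that appear in every signature ("the", "bank") get weight zero, so only "mortgage", which entered the bank.n.02 signature from the labeled corpus, contributes to the score.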

Decision List (bass)
Rule                      Sense
fish within window        1
striped bass              1
guitar within window      2
bass player               2
piano within window       2
tenor within window       2
sea bass                  1
play/V bass               2
river within window       1
violin within window      2
salmon within window      1
on bass                   2
bass are                  1
Yarowsky (1997)
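A decision list applies its rules in order and returns the sense of the first rule that matches. The Python sketch below encodes the rules from the slide under a few assumptions: the window is taken to be five tokens on each side, the sense labels are written "bass1" and "bass2", and the "play/V bass" rule is left out because it needs a part-of-speech tagger.

```python
def window_rule(word, sense, k=5):
    """Match if `word` occurs within k tokens of the target (k is an assumption)."""
    def test(tokens, i):
        lo, hi = max(0, i - k), i + k + 1
        return word in tokens[lo:i] + tokens[i + 1:hi]
    return test, sense

def collocation_rule(left, right, sense):
    """Match a fixed word immediately before (left) or after (right) the target."""
    def test(tokens, i):
        if left is not None and (i == 0 or tokens[i - 1] != left):
            return False
        if right is not None and (i + 1 >= len(tokens) or tokens[i + 1] != right):
            return False
        return True
    return test, sense

# Rules in the order given on the slide ("play/V bass" omitted: needs POS tags).
RULES = [
    window_rule("fish", "bass1"),
    collocation_rule("striped", None, "bass1"),
    window_rule("guitar", "bass2"),
    collocation_rule(None, "player", "bass2"),
    window_rule("piano", "bass2"),
    window_rule("tenor", "bass2"),
    collocation_rule("sea", None, "bass1"),
    window_rule("river", "bass1"),
    window_rule("violin", "bass2"),
    window_rule("salmon", "bass1"),
    collocation_rule("on", None, "bass2"),
    collocation_rule(None, "are", "bass1"),
]

def classify(tokens, i, rules=RULES, default=None):
    """Return the sense of the first matching rule for the target at index i."""
    for test, sense in rules:
        if test(tokens, i):
            return sense
    return default

fish_sentence = "he caught a striped bass in the river".split()
music_sentence = "she plays bass and guitar in a band".split()
print(classify(fish_sentence, 4))   # bass1
print(classify(music_sentence, 2))  # bass2
```

Because the first match wins, the order of the rules matters: in the first sentence "striped bass" fires before the later "river within window" rule is ever consulted.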

Supervised Learning
We need labeled data. How can we get it?

Bootstrapping for WSD
1. Produce seeds (dictionary definitions, single defining collocate, or label common collocates)
2. Train supervised classifier on labeled examples
3. Label all examples, and keep labels for high-confidence instances
   – Optional: exploit one sense per discourse
4. Go to 2
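The loop can be sketched in Python as below. The classifier is a deliberately simple stand-in (it counts how often each context word has been seen with each sense), and the seed collocates, confidence threshold, and toy examples are assumptions made for illustration. The optional one-sense-per-discourse step is not included.

```python
from collections import Counter, defaultdict

def train(labeled):
    """Step 2: count how often each context word co-occurs with each sense."""
    counts = defaultdict(Counter)
    for words, sense in labeled:
        counts[sense].update(set(words))
    return counts

def predict(counts, words, senses):
    """Return (best sense, confidence); confidence is the normalized score margin."""
    scores = {s: sum(counts[s][w] for w in set(words)) for s in senses}
    ranked = sorted(senses, key=lambda s: scores[s], reverse=True)
    best, runner_up = scores[ranked[0]], scores[ranked[1]]
    total = best + runner_up
    confidence = (best - runner_up) / total if total else 0.0
    return ranked[0], confidence

def bootstrap(examples, seeds, senses, threshold=0.6, max_rounds=10):
    # Step 1: seed labels from single defining collocates.
    labels = {}
    for i, words in enumerate(examples):
        for collocate, sense in seeds.items():
            if collocate in words:
                labels[i] = sense
    for _ in range(max_rounds):
        counts = train([(examples[i], s) for i, s in labels.items()])
        # Step 3: label everything, keep only the high-confidence labels.
        new_labels = dict(labels)
        for i, words in enumerate(examples):
            sense, confidence = predict(counts, words, senses)
            if confidence >= threshold:
                new_labels[i] = sense
        if new_labels == labels:  # nothing changed, so stop
            break
        labels = new_labels       # Step 4: go to step 2
    return labels

# Toy contexts for "bass" (hypothetical data).
examples = [
    ["fish", "river", "caught"],
    ["play", "guitar", "band"],
    ["river", "caught", "salmon"],
    ["guitar", "band", "tenor"],
    ["salmon", "lake"],
    ["tenor", "piano"],
    ["loud"],
]
seeds = {"fish": "bass1", "play": "bass2"}
labels = bootstrap(examples, seeds, ["bass1", "bass2"])
print(labels)
```

On this toy data only two examples match a seed at first; each round then pulls in examples that share words with already-labeled ones. The last example shares no words with anything labeled, so it stays unlabeled.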

Coreference Resolution

Mary picked up the ball. She threw it to me.

Entity Linking Mary picked up the ball. She threw it to me.

President Park Geun-hye of South Korea ordered the country’s military on Monday to deliver a strong and immediate response to any North Korean provocation, the latest turn in a war of words that has become a test of resolve for the relatively unproven leaders in both the North and South. “I consider the current North Korean threats very serious,” Ms. Park told the South’s generals. “If the North attempts any provocation against our people and country, you must respond strongly at the first contact with them without any political consideration. “As top commander of the military, I trust your judgment in the face of North Korea’s unexpected surprise provocation,” she added. Since Kim Jong-un took power after the death of his father, Kim Jong-il, in late 2011, the North has taken a series of provocative steps and amplified threats against Washington and Seoul to much louder and more menacing levels. The North has launched a three-stage rocket, tested a nuclear device and threatened to hit major American cities with nuclear-armed ballistic missiles. And Mr. Kim has declared that the Korean Peninsula has reverted to a “state of war.”

Different Kinds of Noun Phrases Can Co-Refer
Indefinite NPs: a smart programmer, some cheesecake
Definite NPs: the store around the corner, the friend I was telling you about
Pronouns: she
Demonstratives: that one, this, those students
Names: IBM, Carnegie Mellon

High-Level Recipe for Coreference Resolution
1. Parse the text and identify NPs; then
2. For every pair of NPs, carry out binary classification: coreferential or not?
3. Collect the results into coreferential chains
What do we need?
- A choice of classifier
- Lots of labeled data
- Features
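Steps 2 and 3 of the recipe can be sketched in Python as follows. The pairwise classifier is passed in as a function; here it is a hypothetical lookup table standing in for a trained model. Merging positive pairs with union-find is one simple way to collect chains, not necessarily the one used in the lecture.

```python
from itertools import combinations

def build_chains(mentions, is_coreferent):
    """Classify every pair of mentions, then merge positive pairs into chains."""
    parent = list(range(len(mentions)))

    def find(i):
        # Follow parent links to the chain representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 2: binary decision for every pair of NPs.
    for i, j in combinations(range(len(mentions)), 2):
        if is_coreferent(mentions[i], mentions[j]):
            parent[find(j)] = find(i)

    # Step 3: collect mentions that share a representative into one chain.
    chains = {}
    for i, mention in enumerate(mentions):
        chains.setdefault(find(i), []).append(mention)
    return list(chains.values())

# NPs from "Mary picked up the ball. She threw it to me."
mentions = ["Mary", "the ball", "She", "it", "me"]

# Hypothetical stand-in for a trained pairwise classifier.
positive_pairs = {("Mary", "She"), ("the ball", "it")}

def toy_classifier(a, b):
    return (a, b) in positive_pairs

chains = build_chains(mentions, toy_classifier)
print(chains)  # [['Mary', 'She'], ['the ball', 'it'], ['me']]
```

Merging is transitive, so a single wrong positive decision can join two otherwise separate chains.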

Features?
- Edit distance between the two NPs
- Are the two NPs the same NER type?
- Appositive syntax
  – “Alan Shepard, the first American astronaut…”
- Proper/definite/indefinite/pronoun
- Gender
- Number
- Distance in sentences
- Number of NPs between
- Grammatical role
- etc.
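A Python sketch of how some of these features might be computed for one pair of NPs. The `Mention` record and its fields are hypothetical; in practice they would be filled in by a parser and an NER tagger. Appositive syntax and grammatical role are left out because they need a syntactic parse.

```python
from dataclasses import dataclass

@dataclass
class Mention:
    """Hypothetical record for one NP; the field values come from upstream tools."""
    text: str
    sentence: int   # index of the sentence containing the NP
    position: int   # index of the NP among all NPs in the document
    ner_type: str   # e.g. "PERSON"; "" if the NP is not a named entity
    np_type: str    # "proper", "definite", "indefinite", or "pronoun"
    gender: str
    number: str

def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]

def pair_features(m1, m2):
    """Feature dictionary for a candidate coreferent pair."""
    return {
        "edit_distance": edit_distance(m1.text.lower(), m2.text.lower()),
        "same_ner_type": m1.ner_type != "" and m1.ner_type == m2.ner_type,
        "np_types": (m1.np_type, m2.np_type),
        "gender_match": m1.gender == m2.gender,
        "number_match": m1.number == m2.number,
        "sentence_distance": abs(m2.sentence - m1.sentence),
        "nps_between": abs(m2.position - m1.position) - 1,
    }

# "Mary picked up the ball. She threw it to me."
mary = Mention("Mary", 0, 0, "PERSON", "proper", "female", "singular")
she = Mention("She", 1, 2, "", "pronoun", "female", "singular")
features = pair_features(mary, she)
print(features)
```

For the pair (Mary, She) the string features are uninformative, while gender match, number match, and the short distance all point toward coreference.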

Evaluation
Evaluation is more complicated than true-false.
One approach: B-Cubed (Bagga and Baldwin, 1998)
Input: hypothesis chains and reference chains
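As commonly formulated, B-Cubed scores each mention separately: precision is the fraction of the mention's hypothesis chain that is also in its reference chain, recall is the fraction of its reference chain that is also in its hypothesis chain, and both are averaged over all mentions. The Python sketch below assumes that the hypothesis and reference cover exactly the same mentions and that mention identifiers are unique.

```python
def b_cubed(hypothesis, reference):
    """Return (precision, recall, F1) averaged over mentions."""
    # Map every mention to the chain that contains it.
    hyp_of = {m: set(chain) for chain in hypothesis for m in chain}
    ref_of = {m: set(chain) for chain in reference for m in chain}
    mentions = list(ref_of)

    precision = sum(len(hyp_of[m] & ref_of[m]) / len(hyp_of[m])
                    for m in mentions) / len(mentions)
    recall = sum(len(hyp_of[m] & ref_of[m]) / len(ref_of[m])
                 for m in mentions) / len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy chains over five mentions (hypothetical identifiers).
reference = [["a", "b", "c"], ["d", "e"]]
hypothesis = [["a", "b"], ["c", "d", "e"]]

precision, recall, f1 = b_cubed(hypothesis, reference)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.733 0.733 0.733
```

In this example the system put mention "c" in the wrong chain. That lowers recall for "a" and "b", which lost a chain member, and precision for "d" and "e", which gained a wrong one; both averages come to 11/15.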

Coreference Resolution Demo /Coref