
Shallow Semantics

Semantics and Pragmatics: High-level Linguistics (the good stuff!)
Semantics: the study of meaning that can be determined from a sentence, phrase, or word.
Pragmatics: the study of meaning as it depends on context (speaker, situation, dialogue history).

Language to (Simplistic) Logic
John went to the book store. → go(John, store1)
John bought a book. → buy(John, book1)
John gave the book to Mary. → give(John, book1, Mary)
Mary put the book on the table. → put(Mary, book1, on table1)
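To make the target representation concrete, here is a minimal Python sketch (my addition, not from the slides) of how these predicate-argument forms could be stored and printed:

```python
# A minimal sketch (not from the slides) of how these simplistic logical
# forms might be stored as predicate-argument tuples in Python.

from typing import NamedTuple

class LogicalForm(NamedTuple):
    predicate: str
    args: tuple  # arguments in order

forms = [
    LogicalForm("go", ("John", "store1")),
    LogicalForm("buy", ("John", "book1")),
    LogicalForm("give", ("John", "book1", "Mary")),
    LogicalForm("put", ("Mary", "book1", "on table1")),
]

for f in forms:
    print(f"{f.predicate}({', '.join(f.args)})")
```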

What’s missing?
– Word sense disambiguation
– Quantification
– Coreference
– Interpreting within a phrase
– Many, many more issues …
But it’s still more than you get from parsing!

Some problems in shallow semantics
1. Identifying entities
   – noun-phrase chunking
   – named-entity recognition
   – coreference resolution (involves discourse/pragmatics too)
2. Identifying relationship names
   – verb-phrase chunking
   – predicate identification (step 0 of semantic role labeling)
   – synonym resolution (e.g., get = receive)
3. Identifying arguments to predicates
   – information extraction
   – argument identification (step 1 of semantic role labeling)
4. Assigning semantic roles (step 2 of semantic role labeling)
5. Sentiment classification
   – That is, does the relationship express an opinion?
   – If so, is the opinion positive or negative?

1. Identifying Entities
Named Entity Tagging: identify all the proper names in a text.
  Sally went to see Up in the Air at the local theater.  [Sally = Person; Up in the Air = Film]
Noun Phrase Chunking: find all base noun phrases (that is, noun phrases that don’t have smaller noun phrases nested inside them).
  Sally went to see Up in the Air at the local theater on Elm Street.
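A minimal sketch (my own illustration, not from the slides) of how base noun-phrase chunks are usually represented with BIO tags, plus a helper that pulls the chunk spans back out. The particular tagging of the example sentence is mine:

```python
# Base NP chunks as (token, BIO tag) pairs, and a helper to extract chunks.

tagged = [
    ("Sally", "B-NP"), ("went", "O"), ("to", "O"), ("see", "O"),
    ("Up", "B-NP"), ("in", "I-NP"), ("the", "I-NP"), ("Air", "I-NP"),
    ("at", "O"), ("the", "B-NP"), ("local", "I-NP"), ("theater", "I-NP"),
    ("on", "O"), ("Elm", "B-NP"), ("Street", "I-NP"), (".", "O"),
]

def bio_chunks(tokens):
    """Group B-/I- tagged tokens into chunk strings."""
    chunks, current = [], []
    for word, tag in tokens:
        if tag.startswith("B-"):
            if current:
                chunks.append(" ".join(current))
            current = [word]
        elif tag.startswith("I-") and current:
            current.append(word)
        else:
            if current:
                chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

print(bio_chunks(tagged))
# ['Sally', 'Up in the Air', 'the local theater', 'Elm Street']
```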

1. Identifying Entities (2)
Parsing: identify all phrase constituents, which will of course include all noun phrases.
[Slide shows a parse tree for "Sally saw Up in the Air at the theater on Elm St."]

1. Identifying Entities (3)
Coreference Resolution: identify all references (aka ‘mentions’) of people, places and things in text, and determine which mentions are ‘co-referential’.
  John stuck his foot in his mouth.

2. Identifying relationship names
Verb phrase chunking: the commonest approach.
Some issues:
1. Often, prepositions/particles “belong” with the relation name: You’re ticking me off.
2. Many relationships are expressed without a verb: Jack Welch, CEO of GE, …
3. Some verbs don’t really express a meaningful relationship by themselves: Jim is the father of 12 boys.
4. Verb sense disambiguation
5. Synonymy: ticking off = bothering

2. Identifying relationship names (2)
Synonym Resolution: Discovery of Inference Rules from Text (DIRT) (Lin and Pantel, 2001)
1. They collect millions of examples of Subject-Verb-Object triples by parsing a Web corpus.
2. For a pair of verbs, v1 and v2, they compute mutual information scores between
   – the vector space model (VSM) for subjects of v1 and the VSM for subjects of v2
   – the VSM for objects of v1 and the VSM for objects of v2
3. They cluster verbs with high MI scores between them.
[Slide shows a table of example subject/object fillers shared by "give" and "donate": gift, blood, money, car, life, energy, …]
See (Yates and Etzioni, JAIR 2009) for a more recent approach using probabilistic models.
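A toy sketch of the DIRT idea (my illustration, not Lin and Pantel's actual implementation): compare two verbs by the overlap of their subject fillers, weighting each filler with a simple PMI-style score. The tiny counts below are made up:

```python
import math
from collections import Counter

def pmi_vector(filler_counts, total_fillers, filler_totals):
    """Weight each filler of a verb by (positive) pointwise mutual information."""
    verb_total = sum(filler_counts.values())
    vec = {}
    for w, c in filler_counts.items():
        p_joint = c / total_fillers
        p_verb = verb_total / total_fillers
        p_word = filler_totals[w] / total_fillers
        vec[w] = max(0.0, math.log(p_joint / (p_verb * p_word)))
    return vec

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy "subject-of" counts harvested from a hypothetical parsed corpus.
subj = {
    "give":   Counter({"donor": 8, "member": 5, "you": 20, "company": 3}),
    "donate": Counter({"donor": 9, "member": 6, "you": 15, "patient": 2}),
    "break":  Counter({"hammer": 7, "window": 2, "storm": 4}),
}

filler_totals = Counter()
for counts in subj.values():
    filler_totals.update(counts)
total = sum(filler_totals.values())

vecs = {v: pmi_vector(c, total, filler_totals) for v, c in subj.items()}
print("give ~ donate:", round(cosine(vecs["give"], vecs["donate"]), 3))
print("give ~ break: ", round(cosine(vecs["give"], vecs["break"]), 3))
```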

5. Sentiment Classification
Given a review (about a movie, hotel, Amazon product, etc.), a sentiment classification system tries to determine what opinions are expressed in the review.
Coarse-level objective: is the review positive, negative, or neutral overall?
Fine-grained objective: what are the positive aspects (according to the reviewer), and what are the negative aspects?
Question: what technique(s) would you use to solve these two problems?
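One standard answer to the coarse-level question is a bag-of-words classifier. A minimal scikit-learn sketch (my illustration, not from the slides); the tiny training set is made up:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_reviews = [
    "great movie, loved the acting",
    "terrible plot and boring characters",
    "wonderful hotel, very clean rooms",
    "awful service, would not return",
]
train_labels = ["pos", "neg", "pos", "neg"]

# Bag-of-words counts feeding a Naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train_reviews, train_labels)

print(clf.predict(["boring and awful", "clean rooms and great service"]))
# -> ['neg' 'pos'] on this toy data
```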

Semantic Role Labeling (a.k.a. Shallow Semantic Parsing)

Semantic Role Labeling
Semantic role labeling is the computational task of assigning semantic roles to phrases. It’s usually divided into three subtasks:
1. Predicate identification
2. Argument identification
3. Argument classification (assigning semantic roles)
Example: John broke the window with a hammer.
  broke = Pred; John = Agent; the window = Patient; with a hammer = Means (or instrument)
  (argument spans are marked with B-Arg / I-Arg tags on the slide)
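A minimal sketch (mine, not from the slides) of the three subtasks as a pipeline over the example sentence; the helper functions are hypothetical stand-ins for real classifiers:

```python
tokens = ["John", "broke", "the", "window", "with", "a", "hammer"]

def identify_predicates(tokens):
    # Step 1: predicate identification (hard-coded for the example).
    return [1]  # index of "broke"

def identify_arguments(tokens, pred_idx):
    # Step 2: argument identification -- find candidate spans (start, end).
    return [(0, 1), (2, 4), (4, 7)]  # "John", "the window", "with a hammer"

def classify_arguments(tokens, pred_idx, spans):
    # Step 3: argument classification -- assign a semantic role to each span.
    roles = ["Agent", "Patient", "Means"]
    return list(zip(spans, roles))

for pred in identify_predicates(tokens):
    spans = identify_arguments(tokens, pred)
    for (start, end), role in classify_arguments(tokens, pred, spans):
        print(f"{role}: {' '.join(tokens[start:end])}")
```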

Same event - different sentences
John broke the window with a hammer.
John broke the window with the crack.
The hammer broke the window.
The window broke.

Same event - different syntactic frames
John broke the window with a hammer.  (SUBJ VERB OBJ MODIFIER)
John broke the window with the crack.  (SUBJ VERB OBJ MODIFIER)
The hammer broke the window.  (SUBJ VERB OBJ)
The window broke.  (SUBJ VERB)

Semantic role example
break(AGENT, INSTRUMENT, PATIENT)
John broke the window with a hammer.  (John = AGENT, the window = PATIENT, a hammer = INSTRUMENT)
The hammer broke the window.  (the hammer = INSTRUMENT, the window = PATIENT)
The window broke.  (the window = PATIENT)
Fillmore 1968 - The Case for Case

Semantic roles and syntactic functions together:
John broke the window with a hammer.  (John = AGENT/SUBJ, the window = PATIENT/OBJ, a hammer = INSTRUMENT/MODIFIER)
The hammer broke the window.  (the hammer = INSTRUMENT/SUBJ, the window = PATIENT/OBJ)
The window broke.  (the window = PATIENT/SUBJ)

Semantic roles
Semantic roles (or just roles) are slots, belonging to a predicate, which arguments can fill.
- There are different naming conventions, but one common set of names for semantic roles is agent, patient, means/instrument, ….
Some constraints:
1. Only certain kinds of phrases can fill certain kinds of semantic roles: “with a crack” will never be an agent. But many are ambiguous: “hammer” → patient or instrument?
2. Syntax provides a clue, but it is not the full answer: Subject → Agent? Patient? Instrument?

Slot Filling (Argument Classification)
Sentence: John broke the window with a hammer
Phrases → Slots: John → Agent; broke → Pred; the window → Patient; with a hammer → Means (or instrument)

Slot Filling (Argument Classification)
Sentence: The hammer broke the window
Phrases → Slots: the hammer → Means (or instrument); broke → Pred; the window → Patient; the Agent slot is unfilled

Slot Filling (Argument Classification)
Sentence: The window broke
Phrases → Slots: broke → Pred; the window → Patient; the Agent and Means (or instrument) slots are unfilled

Slot Filling and Shallow Semantics
Sentence: John broke the window with a hammer
Phrases → Slots: John → Agent; broke → Pred; the window → Patient; with a hammer → Means (or instrument)
Shallow semantics: broke(John, the window, with a hammer), i.e., Pred(Agent, Patient, Means)

Slot Filling and Shallow Semantics
Sentence: The window broke
Phrases → Slots: broke → Pred; the window → Patient; the Agent and Means (or instrument) slots are unfilled
Shallow semantics: broke(?x, the window, ?y), i.e., Pred(Agent, Patient, Means), with unfilled slots left as variables
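A minimal sketch (mine, not from the slides) of turning filled slots into the "shallow semantics" predicate string, with variables for empty slots:

```python
def shallow_semantics(pred, slots, order=("Agent", "Patient", "Means")):
    """Render pred(Agent, Patient, Means), using ?x1, ?x2, ... for unfilled slots."""
    args, var_i = [], 0
    for role in order:
        if role in slots:
            args.append(slots[role])
        else:
            var_i += 1
            args.append(f"?x{var_i}")
    return f"{pred}({', '.join(args)})"

print(shallow_semantics("broke",
                        {"Agent": "John", "Patient": "the window",
                         "Means": "with a hammer"}))
# broke(John, the window, with a hammer)

print(shallow_semantics("broke", {"Patient": "the window"}))
# broke(?x1, the window, ?x2)
```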

Semantic Role Labeling Techniques

We’ll cover 3 approaches to SRL
1. Basic (Gildea and Jurafsky, Comp. Ling. 2003)
2. Joint inference for argument structure (Toutanova et al., Comp. Ling. 2008)
3. Open-domain (Huang and Yates, ACL 2010)

1. Gildea and Jurafsky
Main idea: start with a parse tree, and try to identify constituents that are arguments.

G&J (1)
Build a (probabilistic) classifier for predicting, for each constituent, which role it is.
- Essentially a maximum-entropy classifier, although it’s not described that way.
Features for Argument Classification:
1. Phrase type of constituent
2. Governing category of NPs – S or VP (differentiates between subjects and objects)
3. Position w.r.t. predicate (before or after)
4. Voice of predicate (active or passive verb)
5. Head word of constituent
6. Parse tree path between predicate and constituent
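A rough sketch (my reconstruction, not G&J's code) of the kind of feature dictionary a max-ent argument classifier might see for one constituent; the input dicts are hypothetical outputs of a parser and some helper code:

```python
def argument_features(constituent, predicate):
    """Build the six G&J-style features for one candidate constituent."""
    return {
        "phrase_type": constituent["label"],            # e.g. "NP"
        "governing_category": constituent.get("gov"),   # "S" or "VP", for NPs only
        "position": "before" if constituent["start"] < predicate["index"] else "after",
        "voice": predicate["voice"],                    # "active" or "passive"
        "head_word": constituent["head"],               # e.g. "hammer"
        "path": constituent["path"],                    # e.g. "VBD↑VP↓PP"
    }

# For "with a hammer" in "John broke the window with a hammer":
print(argument_features(
    {"label": "PP", "gov": None, "start": 4, "head": "hammer", "path": "VBD↑VP↓PP"},
    {"index": 1, "voice": "active"},
))
```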

G&J (2) – Parse Tree Path Feature
Parse tree path (or just path) feature: determines the syntactic relationship between the predicate and the current constituent.
[Slide shows a parse tree; in the example, the path feature is: VB ↑ VP ↑ S ↓ NP]
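A sketch (mine, not G&J's code) of computing the path feature with an nltk.Tree: walk up from the predicate's POS node to the lowest common ancestor, then down to the constituent:

```python
from nltk import Tree

def tree_path(tree, pred_leaf_idx, constituent_pos):
    """Labels from the predicate's POS node up (↑) to the lowest common
    ancestor and down (↓) to the constituent at tree position `constituent_pos`."""
    node = tree.leaf_treeposition(pred_leaf_idx)[:-1]  # POS node above the leaf
    up = []
    # Climb until `node` is an ancestor of (or equal to) the constituent.
    while constituent_pos[:len(node)] != node:
        up.append(tree[node].label())
        node = node[:-1]
    # Descend from that ancestor to the constituent.
    down = [tree[constituent_pos[:i]].label()
            for i in range(len(node), len(constituent_pos) + 1)]
    return "↑".join(up + [down[0]]) + "".join("↓" + label for label in down[1:])

t = Tree.fromstring(
    "(S (NP (NNP John)) (VP (VBD broke) (NP (DT the) (NN window))"
    " (PP (IN with) (NP (DT a) (NN hammer)))))")

# Path from the predicate "broke" (leaf index 1) to the subject NP (position (0,)):
print(tree_path(t, 1, (0,)))   # VBD↑VP↑S↓NP
```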

G&J (3) 4086 possible values of the Path feature in training data. A sparse feature!

G&J (4)
Build a (probabilistic) classifier for predicting, for each constituent, whether it is an argument of the predicate.
- Essentially a maximum-entropy classifier, although it’s not described that way.
Features for Argument Identification:
1. Predicate word
2. Head word of constituent
3. Parse tree path between predicate and constituent

G&J (5): Results
Argument Identification (only):   92% precision, 86% recall, .89 F1
Argument Classification (only):   78.5% assigned the correct role

2. Toutanova, Haghighi, and Manning
A Global Joint Model for SRL (Comp. Ling., 2008)
Main idea(s):
- Include features that depend on multiple arguments
- Use multiple parsers as input, for robustness

THM (1): Motivation
1. “The day that the ogre cooked the children is still remembered.”
2. “The meal that the ogre cooked the children is still remembered.”
Both sentences have identical syntax; they differ in only one word (day vs. meal).
If we classify arguments one at a time, “the children” will be labeled the same thing in both cases.
But in (1), “the children” is the Patient (the thing being cooked), and in (2), “the children” is the Beneficiary (the people for whom the cooking is done).
Intuitively, we can’t classify these arguments independently.

THM (2): Features
1. Whole label sequence
   a. [voice:active, Arg1, pred, Arg4, ArgM-TMP]
   b. [voice:active, lemma:accelerated, Arg1, pred, Arg4, ArgM-TMP]
   c. [voice:active, lemma:accelerated, Arg1, pred, Arg4] (no adjuncts)
   d. [voice:active, lemma:accelerated, Arg, pred, Arg] (no adjuncts, no #s)
2. Syntax and semantics in the label sequence
   a. [voice:active, NP-Arg1, pred, PP-Arg4]
   b. [voice:active, lemma:accelerated, NP-Arg1, pred, PP-Arg4]
3. Repetition features: whether Arg1 (for example) appears multiple times
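A small sketch (my illustration, not THM's code) of how whole-label-sequence features might be generated from one candidate joint labeling:

```python
def label_sequence_features(voice, lemma, labeled_args):
    """labeled_args: list of (syntactic category, role) pairs in sentence order."""
    roles = [role for _, role in labeled_args]
    core = [r for r in roles if not r.startswith("ArgM")]          # drop adjuncts
    collapsed = ["Arg" if r.startswith("Arg") else r for r in core]  # drop arg numbers
    syn_sem = [f"{cat}-{role}" if role != "pred" else "pred"
               for cat, role in labeled_args if not role.startswith("ArgM")]
    return [
        f"[voice:{voice}, " + ", ".join(roles) + "]",
        f"[voice:{voice}, lemma:{lemma}, " + ", ".join(roles) + "]",
        f"[voice:{voice}, lemma:{lemma}, " + ", ".join(core) + "]",
        f"[voice:{voice}, lemma:{lemma}, " + ", ".join(collapsed) + "]",
        f"[voice:{voice}, " + ", ".join(syn_sem) + "]",
    ]

args = [("NP", "Arg1"), ("VP", "pred"), ("PP", "Arg4"), ("PP", "ArgM-TMP")]
for f in label_sequence_features("active", "accelerated", args):
    print(f)
```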

THM (3): Classifier
- First, for each sentence, obtain the top-10 most likely parse tree / semantic role label outputs from G&J.
- Build a max-ent classifier to select from these 10, using the features above.
- Also, include the top-10 parses from the Charniak parser.
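A bare-bones reranking sketch (my illustration, not THM's system): score each candidate joint labeling with a linear model over its features and keep the argmax. The weights and candidates below are made up:

```python
def score(features, weights):
    return sum(weights.get(f, 0.0) for f in features)

def rerank(candidates, weights):
    """candidates: list of (labeling, feature list); return the best labeling."""
    return max(candidates, key=lambda c: score(c[1], weights))[0]

weights = {"[voice:active, Arg0, pred, Arg1]": 2.0,
           "[voice:active, Arg1, pred, Arg1]": -1.5}
candidates = [
    ("Arg0=John, Arg1=the window", ["[voice:active, Arg0, pred, Arg1]"]),
    ("Arg1=John, Arg1=the window", ["[voice:active, Arg1, pred, Arg1]"]),
]
print(rerank(candidates, weights))   # Arg0=John, Arg1=the window
```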

THM (4): Results
These are on a different data set from G&J, so the results are not directly comparable. But the local model is similar to G&J, so think of the local model as the point of comparison.
[Table of F1 scores on WSJ (ID & CLS) and Brown (ID & CLS) for three models: Local, Joint (1 parse), Joint (top 5 parses).]
Results show F1 scores for IDentification and CLaSsification of arguments together. WSJ is the Wall Street Journal test set, a collection of approximately 4,000 news sentences. Brown is a smaller collection of fiction stories. The system is trained on a separate set of WSJ sentences.

3. Huang and Yates
Open-Domain SRL by Modeling Word Spans (ACL 2010)
Main idea: one of the biggest problems for SRL systems is that they need lexical features to classify arguments, but lexical features are sparse. We build a simple SRL system that outperforms the previous state of the art on out-of-domain data by learning new lexical representations.

Simple, open-domain SRL – Baseline Features
Token / POS tag / Chunk tag:
Chris – Proper Noun – B-NP
broke – Verb – B-VP
the – Det. – B-NP
window – Noun – I-NP
with – Prep. – B-PP
a – Det. – B-NP
hammer – Noun – I-NP
Also used: distance from the predicate.
SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)

Simple, open-domain SRL – Baseline + HMM features
Same features as above, plus an HMM label for each token: the latent state assigned to the word by an HMM trained on unlabeled text, used as a learned, less sparse stand-in for the word itself.
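A sketch (mine, not Huang & Yates' code) of the HMM-label feature: run Viterbi decoding with a small HMM whose parameters would normally be learned from unlabeled text (the tiny parameters below are made up), and attach each word's latent state as an extra feature:

```python
import numpy as np

vocab = {"chris": 0, "broke": 1, "the": 2, "window": 3,
         "with": 4, "a": 5, "hammer": 6}
start = np.array([0.6, 0.2, 0.2])
trans = np.array([[0.2, 0.6, 0.2],
                  [0.3, 0.2, 0.5],
                  [0.5, 0.3, 0.2]])
emit = np.array([[0.50, 0.02, 0.10, 0.10, 0.03, 0.10, 0.15],   # "nouny" state
                 [0.05, 0.60, 0.05, 0.05, 0.20, 0.03, 0.02],   # "verby/prep" state
                 [0.05, 0.03, 0.45, 0.05, 0.02, 0.35, 0.05]])  # "determiner" state

def viterbi(obs):
    """Most likely hidden-state sequence for a list of word indices."""
    delta = np.log(start) + np.log(emit[:, obs[0]])
    back = []
    for o in obs[1:]:
        scores = delta[:, None] + np.log(trans) + np.log(emit[:, o])
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0)
    states = [int(delta.argmax())]
    for bp in reversed(back):
        states.append(int(bp[states[-1]]))
    return list(reversed(states))

sentence = "chris broke the window with a hammer".split()
hmm_labels = viterbi([vocab[w] for w in sentence])
print(list(zip(sentence, hmm_labels)))  # each word now carries an HMM-state feature
```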

The importance of paths
Chris [predicate broke] [thing-broken a hammer].
Chris [predicate broke] a window with [means a hammer].
Chris [predicate broke] the desk, so she fetched [not-an-arg a hammer] and nails.

Simple, open-domain SRL – Baseline + HMM + Path features
The path features describe the material between the predicate and the current token (shown here for the tokens after "broke" in "Chris broke the window with a hammer"):
Word path:  the → None; window → the; with → the-window; a → the-window-with; hammer → the-window-with-a
POS path:   the → None; window → Det; with → Det-Noun; a → Det-Noun-Prep; hammer → Det-Noun-Prep-Det
HMM path:   the same sequences, with each intervening word replaced by its HMM label
SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)
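A small sketch (my illustration, not the paper's code) of the word-path and POS-path features: the words or POS tags strictly between the predicate and the current token, joined with hyphens:

```python
def path_feature(items, pred_idx, tok_idx):
    """Items strictly between the predicate and the token, or 'None' if empty."""
    lo, hi = sorted((pred_idx, tok_idx))
    between = items[lo + 1:hi]
    return "-".join(between) if between else "None"

tokens = ["Chris", "broke", "the", "window", "with", "a", "hammer"]
pos    = ["PropN", "Verb", "Det", "Noun", "Prep", "Det", "Noun"]
pred   = 1  # index of "broke"

for i, tok in enumerate(tokens):
    print(f"{tok:7s} word path: {path_feature(tokens, pred, i):25s} "
          f"POS path: {path_feature(pos, pred, i)}")
```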

Experimental results – F1
All systems were trained on newswire text from the Wall Street Journal (WSJ), and tested on WSJ and fiction texts from the Brown corpus (Brown).
[Two slides of F1 bar charts comparing the systems on WSJ and Brown.]

Span-HMMs

Span-HMM features
[A sequence of slides illustrates the span-HMM feature on "Chris broke the window with a hammer", showing the span-HMM for "hammer" and for "a", with the resulting span-HMM feature attached to each token (None where there is no span); SRL labels: Breaker, Pred, Thing Broken, Means.]

Experimental results – SRL F1
All systems were trained on newswire text from the Wall Street Journal (WSJ), and tested on WSJ and fiction texts from the Brown corpus (Brown).
[Slide shows an F1 bar chart.]

Experimental results – feature sparsity

Benefit grows with distance from predicate