Outline
Linguistic theories of semantic representation:
- Case Frames – Fillmore – FrameNet
- Lexical Conceptual Structure – Jackendoff – LCS
- Proto-Roles – Dowty – PropBank
- English verb classes (diathesis alternations) – Levin – VerbNet
Manual semantic annotation
Automatic semantic annotation
Parallel PropBanks and event relations

Prague, Dec 2006. Thematic Proto-Roles and Argument Selection, David Dowty, Language 67:547–619, 1991. Thanks to Michael Mulyar.

Context: Thematic Roles
Thematic relations (Gruber 1965, Jackendoff 1972). Traditional thematic role types include Agent, Patient, Goal, Source, Theme, Experiencer, and Instrument (p. 548).
"Argument-Indexing View": thematic roles are objects at the syntax-semantics interface, determining a syntactic derivation or the linking relations.
Θ-Criterion (GB theory): each NP argument of a predicate in the lexicon is assigned a unique θ-role (Chomsky 1981).

Problems with Thematic Role Types
- Thematic role types are used in many syntactic generalizations, e.g. empirical thematic role hierarchies. Are thematic roles syntactic universals, or e.g. constructionally defined?
- The relevance of role types to syntactic description needs motivation, e.g. in describing transitivity.
- Thematic roles lack independent semantic motivation.
- Apparent counter-examples to the θ-criterion (Jackendoff 1987).
- Encoding semantic features (Cruse 1973) may not be relevant to syntax.

Problems with Thematic Role Types
- Fragmentation: Cruse (1973) subdivides Agent into four types.
- Ambiguity: Andrews (1985) – is Extent an adjunct or a core argument?
- Symmetric stative predicates, e.g. "This is similar to that": distinct roles or not?
Searching for a generalization: what is a thematic role?

Proto-Roles
- Event-dependent proto-roles are introduced.
- Prototypes are based on shared entailments.
- Grammatical relations such as subject are related to an observed (empirical) classification of participants.
- Typology of grammatical relations: Proto-Agent, Proto-Patient.

Proto-Agent Properties
- Volitional involvement in the event or state
- Sentience (and/or perception)
- Causing an event or change of state in another participant
- Movement (relative to the position of another participant)
- (Exists independently of the event named)*
*may be discourse pragmatic

Proto-Patient Properties
- Undergoes change of state
- Incremental theme
- Causally affected by another participant
- Stationary relative to the movement of another participant
- (Does not exist independently of the event, or at all)*
*may be discourse pragmatic

Argument Selection Principle
- For 2- or 3-place predicates.
- Based on an empirical count (total of entailments for each role).
- The argument with the greatest number of Proto-Agent entailments → Subject; the argument with the greatest number of Proto-Patient entailments → Direct Object.
- Alternation is predicted if the number of entailments for each role is similar (nondiscreteness).
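Read as a procedure, the principle is just a count over entailment sets. The sketch below is a toy illustration, not Dowty's formalism; the property names are hypothetical labels for the proto-properties listed above.

```python
# Minimal sketch of the Argument Selection Principle as a counting procedure.
PROTO_AGENT = {"volitional", "sentient", "causer", "moving", "independent"}
PROTO_PATIENT = {"change_of_state", "incremental_theme", "causally_affected",
                 "stationary", "dependent_on_event"}

def select_arguments(entailments):
    """entailments: dict mapping argument name -> set of entailed properties.
    Returns (subject, direct_object) chosen by proto-role entailment counts."""
    def agent_score(arg):
        return len(entailments[arg] & PROTO_AGENT)
    def patient_score(arg):
        return len(entailments[arg] & PROTO_PATIENT)
    args = list(entailments)
    subject = max(args, key=agent_score)
    remaining = [a for a in args if a != subject]
    direct_object = max(remaining, key=patient_score)
    return subject, direct_object

# "x breaks y": x is a volitional, sentient causer; y undergoes a change of state.
breaking = {
    "x": {"volitional", "sentient", "causer", "independent"},
    "y": {"change_of_state", "causally_affected"},
}
print(select_arguments(breaking))   # ('x', 'y')
```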

Worked Example: Psychological Predicates
Experiencer subject: x likes y; x fears y.
Stimulus subject: y pleases x; y frightens x.
Each pair describes "almost the same" relation.
- Experiencer: sentient (P-Agent)
- Stimulus: causes an emotional reaction (P-Agent)
The number of proto-entailments is the same; but for stimulus-subject verbs, the experiencer also undergoes a change of state (P-Patient) and is therefore lexicalized as the patient.

Symmetric Stative Predicates Examples: This one and that one rhyme / intersect / are similar. This rhymes with / intersects with / is similar to that. (cf. The drunk embraced the lamppost. / *The drunk and the lamppost embraced.)

Symmetric Predicates: Generalizing via Proto-Roles
- The subject of the conjoined predicate has Proto-Agent entailments which the two-place predicate relation lacks (i.e. for the object of the two-place predicate).
- The generalization is entirely reducible to proto-roles.
- Strong cognitive evidence for proto-roles: the pattern would be difficult to deduce lexically, but easy via knowledge of proto-roles.

Diathesis Alternations
Alternating: spray/load, hit/break.
Non-alternating: swat/dash, fill/cover.

Spray/Load Alternation
Examples:
- Mary loaded the hay onto the truck. / Mary loaded the truck with hay.
- Mary sprayed the paint onto the wall. / Mary sprayed the wall with paint.
Analyzed via proto-roles, not e.g. as a theme/location alternation. The direct object is analyzed as an Incremental Theme, i.e. either of the two non-subject arguments qualifies as incremental theme. This accounts for the alternating behavior.

Hit/Break Alternation
- John hit the fence with a stick. / John hit the stick against a fence.
- John broke the fence with a stick. / John broke the stick against the fence.
A radical change in meaning is associated with break but not with hit. Explained via proto-roles (change of state for the direct object with the break class).

Swat doesn't alternate:
- swat the boy with a stick
- *swat the stick at / against the boy

Fill/Cover
Fill and cover are non-alternating:
- Bill filled the tank (with water). / *Bill filled water (into the tank).
- Bill covered the ground (with a tarpaulin). / *Bill covered a tarpaulin (over the ground).
Only the goal lexicalizes as incremental theme (direct object).

Conclusion Dowty argues for Proto-Roles based on linguistic and cognitive observations. Objections: Are P-roles empirical (extending arguments about hit class)?

Proposition Bank: From Sentences to Propositions
Powell met Zhu Rongji. / Powell met with Zhu Rongji. / Powell and Zhu Rongji met. / Powell and Zhu Rongji had a meeting. ...
Proposition: meet(Powell, Zhu Rongji)
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
→ meet(Powell, Zhu), discuss([Powell, Zhu], return(X, plane))
Related predicates (debate, consult, join, wrestle, battle): meet(Somebody1, Somebody2)

A TreeBanked phrase
[Penn Treebank parse tree for "a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company", with node labels NP, SBAR, WHNP-1, S, NP-SBJ, VP, PP-LOC, and the trace *T*-1 as the subject of "give".]

The same phrase, PropBanked
[Diagram: for "give", Arg0 = a GM-Jaguar pact (via the trace *T*-1), Arg2 = the US car maker, Arg1 = an eventual 30% stake in the British company.]
give(GM-J pact, US car maker, 30% stake)

The full sentence, PropBanked
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
[Diagram: for "expect", Arg0 = Analysts, Arg1 = a GM-Jaguar pact that would give ...; for "give", Arg0 = a GM-Jaguar pact (via *T*-1), Arg2 = the US car maker, Arg1 = an eventual 30% stake in the British company.]
expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
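For concreteness, here is a minimal sketch of the two predicate-argument structures above as plain Python data; real PropBank annotation points at Treebank tree nodes rather than text spans, so this layout is only illustrative.

```python
# Simplified view of the two PropBanked predicates in the sentence above.
annotations = [
    {
        "rel": "expect",
        "args": {
            "Arg0": "Analysts",
            "Arg1": "a GM-Jaguar pact that would give the U.S. car maker "
                    "an eventual 30% stake in the British company",
        },
    },
    {
        "rel": "give",
        "args": {
            "Arg0": "a GM-Jaguar pact",   # linked through the trace *T*-1
            "Arg2": "the U.S. car maker",
            "Arg1": "an eventual 30% stake in the British company",
        },
    },
]

for ann in annotations:
    print(ann["rel"], ann["args"])
```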

Frames File Example: expect
Roles:
- Arg0: expecter
- Arg1: thing expected
Example (transitive, active): Portfolio managers expect further declines in interest rates.
- Arg0: Portfolio managers
- REL: expect
- Arg1: further declines in interest rates

Frames File Example: give
Roles:
- Arg0: giver
- Arg1: thing given
- Arg2: entity given to
Example (double object): The executives gave the chefs a standing ovation.
- Arg0: The executives
- REL: gave
- Arg2: the chefs
- Arg1: a standing ovation

Word Senses in PropBank
Ignoring word sense turned out not to be feasible for 700+ verbs:
- Mary left the room
- Mary left her daughter-in-law her pearls in her will
Frameset leave.01 "move away from": Arg0: entity leaving; Arg1: place left
Frameset leave.02 "give": Arg0: giver; Arg1: thing given; Arg2: beneficiary
How do these relate to traditional word senses in VerbNet and WordNet?

Annotation Procedure
- PTB II: extraction of all sentences with a given verb.
- Create a Frame File for that verb (Paul Kingsbury): 3100+ lemmas, 4400 framesets, 118K predicates; over 300 created automatically via VerbNet.
- First pass: automatic tagging (Joseph Rosenzweig).
- Second pass: double-blind hand correction (Paul Kingsbury); the tagging tool highlights discrepancies (Scott Cotton).
- Third pass: "Solomonization" (adjudication) – Betsy Klipple, Olga Babko-Malaya.

Semantic Role Labels
Jan broke the LCD projector.
break(agent(Jan), patient(LCD-projector))
cause(agent(Jan), change-of-state(LCD-projector)) (broken(LCD-projector))
agent(A) -> intentional(A), sentient(A), causer(A), affector(A)
patient(P) -> affected(P), change(P), …
Fillmore 68, Jackendoff 72, Dowty 91

Trends in Argument Numbering
- Arg0 = agent
- Arg1 = direct object / theme / patient
- Arg2 = indirect object / benefactive / instrument / attribute / end state
- Arg3 = start point / benefactive / instrument / attribute
- Arg4 = end point
Per word vs. frame level – more general?

Additional Tags (arguments or adjuncts?)
Variety of ArgMs (Arg# > 4):
- TMP – when?
- LOC – where at?
- DIR – where to?
- MNR – how?
- PRP – why?
- REC – himself, themselves, each other
- PRD – this argument refers to or modifies another
- ADV – others

Inflection
Verbs also marked for tense/aspect:
- Passive/Active
- Perfect/Progressive
- Third singular (is, has, does, was)
- Present/Past/Future
- Infinitives/Participles/Gerunds/Finites
Modals and negations marked as ArgMs.

Frames: Multiple Framesets
- Framesets are not necessarily consistent between different senses of the same verb.
- Framesets are consistent between different verbs that share similar argument structures (like FrameNet).
Out of the 787 most frequent verbs:
- 1 frameset – 521
- 2 framesets – 169
- 3+ framesets – 97 (includes light verbs)

Ergative/Unaccusative Verbs
Roles (no Arg0 for unaccusative verbs):
- Arg1 = logical subject, patient, thing rising
- Arg2 = EXT, amount risen
- Arg3* = start point
- Arg4 = end point
Sales rose 4% to $3.28 billion from $3.16 billion.
The Nasdaq composite index added 1.01 to on paltry volume.

PropBank/FrameNet
buy: Arg0 = buyer, Arg1 = goods, Arg2 = seller, Arg3 = rate, Arg4 = payment
sell: Arg0 = seller, Arg1 = goods, Arg2 = buyer, Arg3 = rate, Arg4 = payment
More generic, more neutral – maps readily to VN, TR. (Rambow et al., PMLB03)

Annotator accuracy – ITA 84%

Limitations of PropBank
- Args 2-4 are seriously overloaded, with poor performance.
  - VerbNet and FrameNet both provide more fine-grained role labels.
- WSJ is too domain-specific and too financial; broader-coverage genres are needed for more general annotation.
  - Additional Brown corpus annotation, also GALE data.
  - FrameNet has selected instances from the BNC.

Prague, Dec 2006. Levin – English Verb Classes and Alternations: A Preliminary Investigation, 1993.

Levin Classes (Levin, 1993)
Verb class hierarchy: 3100 verbs, 47 top-level classes, 193 second- and third-level classes.
Each class has a syntactic signature based on alternations:
- John broke the jar. / The jar broke. / Jars break easily. → change-of-state
- John cut the bread. / *The bread cut. / Bread cuts easily. → change-of-state, recognizable action, sharp instrument
- John hit the wall. / *The wall hit. / *Walls hit easily. → contact, exertion of force
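One way to picture a "syntactic signature" is as a pattern of which alternations a verb allows; verbs with the same pattern are candidates for the same class. The encoding below is a toy sketch, not Levin's notation, and the alternation names and verb entries are illustrative only.

```python
# Toy encoding of syntactic signatures as alternation grammaticality patterns.
SIGNATURES = {
    "break": {"causative/inchoative": True,  "middle": True},   # The jar broke. / Jars break easily.
    "cut":   {"causative/inchoative": False, "middle": True},   # *The bread cut. / Bread cuts easily.
    "hit":   {"causative/inchoative": False, "middle": False},  # *The wall hit. / *Walls hit easily.
}

def same_class(verb_a, verb_b):
    """Verbs with identical alternation patterns are candidates for the same class."""
    return SIGNATURES[verb_a] == SIGNATURES[verb_b]

print(same_class("break", "cut"))  # False: they differ on the causative/inchoative alternation
```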

Limitations of Levin Classes
- Coverage of only half of the verb types in the Penn Treebank (1M words, WSJ).
- Usually only one or two basic senses are covered for each verb.
- Confusing sets of alternations:
  - Different classes have almost identical "syntactic signatures",
  - or worse, contradictory signatures.
(Dang, Kipper & Palmer, ACL98)

Multiple Class Listings
- Homonymy or polysemy? draw a picture vs. draw water from the well.
- Conflicting alternations? Carry verbs disallow the Conative (*she carried at the ball), but the class includes {push, pull, shove, kick, yank, tug}, which are also in the Push/Pull class, which does take the Conative (she kicked at the ball).

Intersective Levin Classes
[Diagram: "at" → ¬CH-LOC (no change of location); "across the room" → CH-LOC (change of location); "apart" → CH-STATE (change of state).] (Dang, Kipper & Palmer, ACL98)

Intersective Levin Classes
More syntactically and semantically coherent:
- sets of syntactic patterns
- explicit semantic components
- relations between senses
VerbNet: verbs.colorado.edu/~mpalmer/verbnet
(Dang, Kipper & Palmer, IJCAI00, Coling00)

VerbNet – Karin Kipper
Class entries:
- Capture generalizations about verb behavior
- Organized hierarchically
- Members have common semantic elements, semantic roles and syntactic frames
Verb entries:
- Refer to a set of classes (different senses)
- Each class member is linked to WN synset(s) (not all WN senses are covered)

Hand-Built Resources vs. Real Data
VerbNet is based on linguistic theory – how useful is it? How well does it correspond to the syntactic variations found in naturally occurring text? → PropBank

Mapping from PropBank to VerbNet
Frameset id = leave.02; sense = "give"; VerbNet class = future-having 13.3
- Arg0: Giver → Agent
- Arg1: Thing given → Theme
- Arg2: Benefactive → Recipient
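A minimal sketch of how such a mapping can be applied, assuming a hypothetical in-memory layout rather than the actual SemLink file format:

```python
# Hypothetical in-memory version of a PropBank-to-VerbNet (SemLink-style) mapping.
PB_TO_VN = {
    "leave.02": {           # "give" sense, VerbNet class future-having 13.3
        "vn_class": "future_having-13.3",
        "roles": {"Arg0": "Agent", "Arg1": "Theme", "Arg2": "Recipient"},
    },
}

def vn_role(frameset_id, pb_arg):
    """Map a PropBank numbered argument to a VerbNet thematic role, if known."""
    entry = PB_TO_VN.get(frameset_id)
    return entry["roles"].get(pb_arg) if entry else None

print(vn_role("leave.02", "Arg2"))   # Recipient
```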

Mapping from PB to VerbNet

Mapping from PropBank to VerbNet
Overlap with PropBank framesets:
- 50,000 PropBank instances
- 85% VN classes
Results:
- MATCH % (80.90% relaxed)
- (VerbNet isn't just linguistic theory!)
Benefits:
- Thematic role labels and semantic predicates
- Can extend PropBank coverage with VerbNet classes
- WordNet sense tags
(Kingsbury & Kipper, NAACL03, Text Meaning Workshop)

Mapping PropBank/VerbNet
- Extended VerbNet now covers 80% of PropBank tokens. Kipper et al., LREC-04, LREC-06 (added Korhonen and Briscoe classes).
- Semi-automatic mapping of PropBank instances to VerbNet classes and thematic roles, hand-corrected (final cleanup stage).
- VerbNet class tagging as automatic WSD: run SRL, map Args to VerbNet roles.

Can SemLink Improve Generalization?
Overloaded Arg2-Arg5:
- PB: verb-by-verb
- VerbNet: same thematic roles across verbs
Examples:
- Rudolph Agnew, …, was named [ARG2 {Predicate} a nonexecutive director of this British industrial conglomerate.]
- …the latest results appear in today's New England Journal of Medicine, a forum likely to bring new attention [ARG2 {Destination} to the problem.]
Use VerbNet as a bridge to merge PB and FN and expand the size and variety of the training data.

Automatic Labelling of Semantic Relations – Gold Standard, 77%
Given a constituent to be labelled, a stochastic model uses the features:
- Predicate (verb)
- Phrase type (NP or S-BAR)
- Parse tree path
- Position (before/after the predicate)
- Voice (active/passive)
- Head word of the constituent
(Gildea & Jurafsky, CL02; Gildea & Palmer, ACL02)
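The sketch below illustrates these features on a toy parse using nltk.Tree; the head-word and voice heuristics are deliberate simplifications, not the original Gildea & Jurafsky implementation.

```python
# Illustrative feature extraction for one candidate constituent.
from nltk import Tree

parse = Tree.fromstring(
    "(S (NP (NNS Analysts)) (VP (VBP expect) (NP (JJ further) (NNS declines))))")

def path_feature(tree, pred_leaf, const_pos):
    """Category path from the predicate's POS node to the constituent, e.g. VBP↑VP↑S↓NP."""
    pred_pos = tree.leaf_treeposition(pred_leaf)[:-1]
    i = 0
    while i < min(len(pred_pos), len(const_pos)) and pred_pos[i] == const_pos[i]:
        i += 1
    up = [tree[pred_pos[:j]].label() for j in range(len(pred_pos), i - 1, -1)]
    down = [tree[const_pos[:j]].label() for j in range(i + 1, len(const_pos) + 1)]
    return "↑".join(up) + ("↓" + "↓".join(down) if down else "")

def features(tree, pred_leaf, const_pos):
    constituent = tree[const_pos]
    return {
        "predicate": tree.leaves()[pred_leaf],
        "phrase_type": constituent.label(),
        "path": path_feature(tree, pred_leaf, const_pos),
        "position": "before" if const_pos < tree.leaf_treeposition(pred_leaf) else "after",
        "voice": "active",                      # a real system inspects passive morphology
        "head_word": constituent.leaves()[-1],  # crude stand-in for head-finding rules
    }

# The NP "Analysts" relative to the predicate "expect" (leaf index 1):
print(features(parse, 1, (0,)))
# {'predicate': 'expect', 'phrase_type': 'NP', 'path': 'VBP↑VP↑S↓NP', ...}
```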

Additional Automatic Role Labelers
Performance improved from 77% to 88% (automatic parses: 81% F; Brown corpus: 68%).
- Same features plus: Named Entity tags, head word POS; for unseen verbs, backoff to automatic verb clusters.
- SVMs: role or not role; then, for each likely role and each Arg#, Arg# or not; no overlapping role labels allowed.
(Pradhan et al., ICDM03; Surdeanu et al., ACL03; Chen & Rambow, EMNLP03; Gildea & Hockenmaier, EMNLP03; Yi & Palmer, ICON04; CoNLL-04, 05 Shared Task)
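A schematic sketch of the two-stage SVM setup described above (identification, then labeling), trained on synthetic toy features rather than any of the cited systems:

```python
# Toy two-stage role labeler: stage 1 filters candidate constituents
# (role vs. NONE), stage 2 assigns an Arg label to the survivors.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

train = [
    ({"path": "VBP↑VP↑S↓NP", "phrase_type": "NP", "position": "before"}, "Arg0"),
    ({"path": "VBP↑VP↓NP",   "phrase_type": "NP", "position": "after"},  "Arg1"),
    ({"path": "VBP↑VP↓PP",   "phrase_type": "PP", "position": "after"},  "NONE"),
]
X = [f for f, _ in train]
y = [label for _, label in train]

identifier = make_pipeline(DictVectorizer(), LinearSVC())   # role vs. not role
identifier.fit(X, ["ROLE" if l != "NONE" else "NONE" for l in y])

labeler = make_pipeline(DictVectorizer(), LinearSVC())      # which Arg#
role_data = [(f, l) for f, l in train if l != "NONE"]
labeler.fit([f for f, _ in role_data], [l for _, l in role_data])

candidate = {"path": "VBP↑VP↓NP", "phrase_type": "NP", "position": "after"}
if identifier.predict([candidate])[0] == "ROLE":
    print(labeler.predict([candidate])[0])   # expected: Arg1
```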

Arg1 Groupings (total count 59,710)
- Group1 (53.11%): Theme; Theme1; Theme2; Predicate; Stimulus; Attribute
- Group2 (23.04%): Topic
- Group3 (16%): Patient; Product; Patient1; Patient2
- Group4 (4.67%): Agent; Actor2; Cause; Experiencer
- Group5 (0.20%): Asset

Arg2 Groupings (total count 11,068)
- Group1 (43.93%): Recipient; Destination; Location; Source; Material; Beneficiary
- Group2 (14.74%): Extent; Asset
- Group3 (32.13%): Predicate; Attribute; Theme; Theme2; Theme1; Topic
- Group4 (6.81%): Patient2; Product
- Group5 (2.39%): Instrument; Actor2; Cause; Experiencer

Process
Retrain the SRL tagger:
- Original label set: Arg[0-5, A, M]
- Arg1 grouping (similar for Arg2): Arg[0, 2-5, A, M] plus Arg1-Group[1-6]
Evaluation on both WSJ and Brown.
More coarse-grained or fine-grained?
- More specific: data more coherent, but more sparse.
- More general: consistency across verbs, even for new domains?
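As a sketch of the retraining setup, the relabeling step might look like the following; the group membership is copied from the Arg1 table above, but the function itself is hypothetical.

```python
# Relabel Arg1 instances into the coarser Arg1-Group labels via the
# VerbNet role assigned through the SemLink mapping.
ARG1_GROUPS = {
    "Group1": {"Theme", "Theme1", "Theme2", "Predicate", "Stimulus", "Attribute"},
    "Group2": {"Topic"},
    "Group3": {"Patient", "Product", "Patient1", "Patient2"},
    "Group4": {"Agent", "Actor2", "Cause", "Experiencer"},
    "Group5": {"Asset"},
}

def regroup(label, vn_role):
    """Replace Arg1 with its group label; leave all other labels unchanged."""
    if label != "Arg1":
        return label
    for group, roles in ARG1_GROUPS.items():
        if vn_role in roles:
            return f"Arg1-{group}"
    return label   # no VerbNet mapping available: keep the original label

print(regroup("Arg1", "Topic"))      # Arg1-Group2
print(regroup("Arg2", "Recipient"))  # Arg2 (unchanged)
```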

SRL Performance (WSJ/Brown)
[Table: System / Precision / Recall / F-1 for Arg1-Original, Arg1-Mapped, Arg2-Original, Arg2-Mapped on both WSJ and Brown; the numeric cells were lost in the transcript.]
(Loper, Yi, Palmer, SIGSEM07)