Proposition Bank: a resource of predicate-argument relations
Martha Palmer, Dan Gildea, Paul Kingsbury
University of Pennsylvania
February 26, 2002
ACE PI Meeting, Fairfield Inn, MD

Outline
- Overview
- Status report
- Outstanding issues
- Automatic tagging (Dan Gildea)
- Details (Paul Kingsbury)
  - Frames files
  - Annotator issues
  - Demo

Proposition Bank: Generalizing from Sentences to Propositions

Powell met Zhu Rongji
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had a meeting
...
Proposition: meet(Powell, Zhu Rongji)
meet(Somebody1, Somebody2)
(similarly for debate, consult, join, wrestle, battle)

When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))
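As an illustration only (the Prop class below is a hypothetical sketch, not PropBank's representation; the corpus itself is delivered as annotations over Treebank parses), the target of this generalization can be written as a small recursive structure in which an argument may itself be a proposition:

from dataclasses import dataclass

@dataclass
class Prop:
    """A predicate with a tuple of arguments; an argument may itself be a Prop."""
    predicate: str
    args: tuple

# All of the surface variants above normalize to the same proposition:
meet = Prop("meet", ("Powell", "Zhu Rongji"))

# The longer sentence yields two propositions, one embedded in the other:
discuss = Prop("discuss", (("Powell", "Zhu Rongji"),
                           Prop("return", ("X", "plane"))))
print(meet)
print(discuss)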

Penn English Treebank
- 1.3 million words
- Wall Street Journal and other sources
- Tagged with part-of-speech
- Syntactically parsed
- Widely used in the NLP community
- Available from the Linguistic Data Consortium

A TreeBanked Sentence

Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))
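The bracketing above is standard Penn Treebank notation; as a small sketch (assuming the NLTK library is available; it is not part of the PropBank tooling), such a string can be loaded and inspected programmatically. The parse is abbreviated here to keep the example short.

from nltk import Tree

parse = Tree.fromstring(
    "(S (NP-SBJ Analysts)"
    " (VP have (VP been (VP expecting (NP a GM-Jaguar pact)))))"
)
print(parse.label())      # S
print(parse.leaves())     # ['Analysts', 'have', 'been', 'expecting', 'a', 'GM-Jaguar', 'pact']
# Tree positions pick out constituents, which is how arguments can later be
# anchored to nodes of the parse:
print(parse[1, 1, 1, 1])  # (NP a GM-Jaguar pact)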

The same sentence, PropBanked

expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)

(S Arg0 (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               Arg1 (NP (NP a GM-Jaguar pact)
                        (SBAR (WHNP-1 that)
                              (S Arg0 (NP-SBJ *T*-1)
                                 (VP would
                                     (VP give
                                         Arg2 (NP the U.S. car maker)
                                         Arg1 (NP (NP an eventual (ADJP 30 %) stake)
                                                  (PP-LOC in (NP the British company))))))))))))
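A rough sketch of what this layer adds: each predicate instance carries a mapping from numbered argument labels to constituents of the Treebank parse. The class below is a hypothetical stand-in (not the corpus API), and the argument values are shown as surface strings purely for readability; in the released data they are pointers to tree nodes.

from dataclasses import dataclass, field

@dataclass
class PropBankAnnotation:
    """One predicate instance layered over a Treebank parse."""
    predicate: str
    arguments: dict = field(default_factory=dict)  # label -> constituent

expecting = PropBankAnnotation("expect", {
    "Arg0": "Analysts",
    "Arg1": "a GM-Jaguar pact that would give ... the British company",
})
give = PropBankAnnotation("give", {
    "Arg0": "*T*-1 (trace of 'that', coindexed with the pact)",
    "Arg2": "the U.S. car maker",
    "Arg1": "an eventual 30% stake in the British company",
})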

English PropBank
- 1M words of Treebank over 2 years, May ’
- New semantic augmentations
  - Predicate-argument relations for verbs; label arguments: Arg0, Arg1, Arg2, …
  - First subtask: 300K-word financial subcorpus (12K sentences, 29K+ predicates)
- Spin-off: guidelines (necessary for annotators)
- English lexical resource: FRAMES FILES
  - 3500+ verbs with labeled examples, rich semantics

English PropBank - Current Status
- Frames files
  - 742 verb lemmas (932 including phrasal variants)
  - 363/899 VerbNet semi-automatic expansions (subtask / PB)
- First subtask: 300K financial subcorpus
  - 22,595 unique predicates annotated out of 29K (80%)
  - 6K+ remaining (7 weeks, first pass)
  - 1005 verb lemmas out of (59%); 700 remaining (3.5 months)
- PropBank (including some of Brown?)
  - 34,437 predicates annotated out of 118K (29%)
  - 1904 verb lemmas out of 3500 (54%)

Projected delivery dates
- Financial subcorpus
  - alpha release: December 2001
  - beta release: June 2002
  - adjudicated release: December 2002
- PropBank
  - alpha release: December 2002
  - beta release: Spring 2003

English PropBank - Status
- Sense tagging
  - 200+ verbs with multiple rolesets
  - sense-tag this summer with undergraduates, using NSF funds
- Still need to address:
  - 3 usages of "have": imperative, possessive, auxiliary
  - be, become: predicate adjectives, predicate nominals

Automatic Labeling of Semantic Relations
Features:
- Predicate
- Phrase type
- Parse tree path
- Position (before/after predicate)
- Voice (active/passive)
- Head word
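A minimal sketch (assuming NLTK; this is not Gildea's actual system) of how these features can be read off a Treebank parse. The path feature concatenates node labels from the constituent up to the lowest common ancestor and back down to the predicate, with '^' marking upward and 'v' marking downward steps; that encoding is one common convention, not necessarily the exact one used here.

from nltk import Tree

def features(tree, const_pos, pred_pos):
    """Features for one candidate constituent.

    const_pos / pred_pos are nltk tree positions: the candidate constituent
    node and the predicate's preterminal (POS) node, respectively.
    """
    const = tree[const_pos]
    feats = {
        "predicate": tree[pred_pos].leaves()[0].lower(),
        "phrase_type": const.label(),
        "head_word": const.leaves()[-1].lower(),   # crude head: last word
        "position": "before" if const_pos < pred_pos else "after",
        # voice would come from an auxiliary check (a passive heuristic is
        # sketched later, under Morphology); omitted here
    }
    # Parse-tree path: up to the lowest common ancestor, then down to the predicate
    i = 0
    while i < min(len(const_pos), len(pred_pos)) and const_pos[i] == pred_pos[i]:
        i += 1
    up = [tree[const_pos[:j]].label() for j in range(len(const_pos), i - 1, -1)]
    down = [tree[pred_pos[:j]].label() for j in range(i + 1, len(pred_pos) + 1)]
    feats["path"] = "^".join(up) + "v" + "v".join(down)
    return feats

tree = Tree.fromstring(
    "(S (NP-SBJ (NNS Analysts)) (VP (VBP have) (VP (VBN been)"
    " (VP (VBG expecting) (NP (DT a) (JJ GM-Jaguar) (NN pact))))))"
)
# Candidate constituent (NP-SBJ Analysts); predicate preterminal (VBG expecting)
print(features(tree, (0,), (1, 1, 1, 0)))
# {'predicate': 'expecting', 'phrase_type': 'NP-SBJ', 'head_word': 'analysts',
#  'position': 'before', 'path': 'NP-SBJ^SvVPvVPvVPvVBG'}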

Example with Features

Labelling Accuracy - Known Boundaries
[Table of results omitted: accuracy with automatic vs. gold-standard parses, for PropBank, PropBank verbs with > 10 instances, and FrameNet.]
Accuracy of semantic role prediction for known boundaries: the system is given the constituents to classify. FrameNet examples (training/test) are hand-picked to be unambiguous.

Labelling Accuracy - Unknown Boundaries
[Table of results omitted: precision and recall with automatic vs. gold-standard parses, for PropBank and FrameNet.]
Accuracy of semantic role prediction for unknown boundaries: the system must identify the constituents that are arguments and give them the correct roles.
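A minimal sketch (not the scorer actually used for these figures) of how precision and recall are typically computed in the unknown-boundary setting: a predicted argument counts as correct only if both its constituent span and its role label match the gold annotation for the same predicate.

def precision_recall(gold, predicted):
    """gold, predicted: sets of (predicate_id, span, role) triples."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Example: one predicted argument has the right span but the wrong label
gold = {("give.01", (7, 11), "Arg2"), ("give.01", (12, 18), "Arg1")}
pred = {("give.01", (7, 11), "Arg2"), ("give.01", (12, 18), "Arg0")}
print(precision_recall(gold, pred))   # (0.5, 0.5)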

Complete Sentence
Analysts have been expecting a GM-Jaguar pact that *T*-1 would give the U.S. car maker an eventual 30% stake in the British company and create joint ventures that *T*-2 would produce an executive-model range of cars.

expect(analysts, pact)
give(pact, car_maker, stake)
create(pact, joint_ventures)
produce(joint_ventures, range_of_cars)

Guidelines: Frames Files
- Created manually by Paul Kingsbury
  - new framer: Olga Babko-Malaya (Ph.D., Rutgers, Linguistics)
- Refer to VerbNet, WordNet and FrameNet
- Currently in place for 787/986 verbs
- Use "semantic role glosses" unique to each verb (mapped to the Arg0, Arg1 labels appropriate to the class)

Frames Example: expect
Roles:
  Arg0: expecter
  Arg1: thing expected
Example (transitive, active):
  Portfolio managers expect further declines in interest rates.
  Arg0: Portfolio managers
  REL: expect
  Arg1: further declines in interest rates

Frames Example: give
Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to
Example (double object):
  The executives gave the chefs a standing ovation.
  Arg0: The executives
  REL: gave
  Arg2: the chefs
  Arg1: a standing ovation
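As an illustration only (the real frames files are per-lemma files containing rolesets and annotated examples, not Python dictionaries), the two frames above amount to verb-specific glosses keyed by the numbered labels. The roleset identifiers 'expect.01' and 'give.01' follow PropBank's lemma-plus-number convention.

FRAMES = {
    "expect.01": {
        "Arg0": "expecter",
        "Arg1": "thing expected",
    },
    "give.01": {
        "Arg0": "giver",
        "Arg1": "thing given",
        "Arg2": "entity given to",
    },
}

def gloss(roleset, arg):
    """Map a numbered argument label to its verb-specific gloss."""
    return FRAMES[roleset][arg]

print(gloss("give.01", "Arg2"))   # entity given to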

How are arguments numbered?
- Examination of example sentences
- Determination of required / highly preferred elements
- Sequential numbering; Arg0 is the typical first argument, except:
  - ergative/unaccusative verbs (shake example)
  - arguments mapped for "synonymous" verbs

Additional tags (arguments or adjuncts?)
- Variety of ArgMs (Arg# > 4):
  - TMP: when?
  - LOC: where at?
  - DIR: where to?
  - MNR: how?
  - PRP: why?
  - REC: himself, themselves, each other
  - PRD: this argument refers to or modifies another
  - ADV: others

Ergative/Unaccusative Verbs: rise
Roles:
  Arg1 = logical subject, patient, thing rising
  Arg2 = EXT, amount risen
  Arg3* = start point
  Arg4 = end point
Sales rose 4% to $3.28 billion from $3.16 billion.
*Note: have to mention the preposition explicitly, Arg3-from, Arg4-to; or could have used ArgM-Source, ArgM-Goal. Arbitrary distinction.

Synonymous Verbs: add (in the sense of rise)
Roles:
  Arg1 = logical subject, patient, thing rising/gaining/being added to
  Arg2 = EXT, amount risen
  Arg4 = end point
The Nasdaq composite index added 1.01 to on paltry volume.

Phrasal Verbs
- put together
- put in
- put off
- put on
- put out
- put up
- ...
Accounts for an additional 200 "verbs"

Frames: Multiple Rolesets
- Rolesets are not necessarily consistent between different senses of the same verb
  - a verb with multiple senses can have multiple rolesets, but not necessarily
- Roles and their mappings onto argument labels are consistent between different verbs that share similar argument structures (similar to FrameNet)
  - Levin / VerbNet classes
- Out of the 787 most frequent verbs:
  - 1 roleset:
  - 2 rolesets:
  - 3+ rolesets: 97 (includes light verbs)

Semi-automatic expansion of Frames
- Experimenting with semi-automatic expansion
- Find unframed members of a Levin class in VerbNet; "inherit" frames from another member
- 787 verbs manually framed
  - can expand to using VerbNet
  - will need hand correction
- First experiment: automatic expansion provided 90% coverage of the data
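A minimal sketch of the expansion idea (the class listing below is an abbreviated stand-in for the one on the next slide, and the code is illustrative, not the tool actually used): every unframed verb in a VerbNet class inherits the frame of an already-framed class member as a first draft, which is exactly why hand correction is still needed.

VERBNET_CLASSES = {
    "destroy-44": ["annihilate", "blitz", "decimate", "demolish", "destroy",
                   "devastate", "obliterate", "raze", "ruin", "waste", "wreck"],
}
FRAMED = {
    "destroy": {"Arg0": "destroyer", "Arg1": "thing destroyed",
                "Arg2": "instrument of destruction"},
}

def expand_frames(classes, framed):
    """Copy an existing frame to every unframed member of the same class."""
    drafts = {}
    for members in classes.values():
        donors = [v for v in members if v in framed]
        if not donors:
            continue
        for verb in members:
            if verb not in framed:
                drafts[verb] = dict(framed[donors[0]])
    return drafts

# 'waste' gets destroy's frame: plausible for the destruction sense, but see
# the "What a Waste" slide for where the inherited frame breaks down.
print(expand_frames(VERBNET_CLASSES, FRAMED)["waste"])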

More on Automatic Expansion
Destroy:
  Arg0: destroyer
  Arg1: thing destroyed
  Arg2: instrument of destruction
VerbNet class destroy-44: annihilate, blitz, decimate, demolish, destroy, devastate, exterminate, extirpate, obliterate, ravage, raze, ruin, waste, wreck

What a Waste
Waste:
  Arg0: destroyer
  Arg1: thing destroyed
  Arg2: instrument of destruction
He didn't waste any time distancing himself from his former boss.
  Arg0: He
  Arg1: any time
  Arg2 = ? distancing himself ...

Trends in Argument Numbering
- Arg0 = agent
- Arg1 = direct object / theme / patient
- Arg2 = indirect object / benefactive / instrument / attribute / end state
- Arg3 = start point / benefactive / instrument / attribute
- Arg4 = end point

Morphology
- Verbs also marked for tense/aspect/voice:
  - passive/active
  - perfect/progressive
  - third singular (is, has, does, was)
  - present/past/future
  - infinitives/participles/gerunds/finites
- Modals and negation marked as ArgMs
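One rough sketch (an assumption, not the project's actual tool) of how a voice flag can be derived from Treebank part-of-speech tags: a verb tagged VBN whose nearest non-adverb to the left is a form of 'be' is treated as passive.

BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}

def is_passive(tagged, i):
    """tagged: list of (word, POS) pairs; i: index of the verb of interest."""
    word, pos = tagged[i]
    if pos != "VBN":                    # passives are past participles
        return False
    for w, p in reversed(tagged[:i]):   # look leftward, skipping adverbs
        if p.startswith("RB"):
            continue
        return w.lower() in BE_FORMS
    return False

tagged = [("The", "DT"), ("chefs", "NNS"), ("were", "VBD"),
          ("given", "VBN"), ("an", "DT"), ("ovation", "NN")]
print(is_passive(tagged, 3))   # True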

Annotation procedure
- Extraction of all sentences with a given verb
- First pass: automatic tagging (Joseph Rosenzweig)
- Second pass: double-blind hand correction
  - annotators with a variety of backgrounds
  - less syntactic training than for treebanking
- Tagging tool highlights discrepancies
- Third pass: Solomonization (adjudication)
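A minimal sketch of the discrepancy check behind the double-blind pass (illustrative code, not the actual tagging tool): two annotators' labelings of the same predicate are compared, and any argument whose span or label differs is flagged for adjudication. The spans in the example are made up; they mirror the 'told' disagreement shown on the Solomonization slides below.

def discrepancies(ann_a, ann_b):
    """ann_a, ann_b: dicts mapping an argument label to its constituent span."""
    flags = []
    for label in set(ann_a) | set(ann_b):
        if ann_a.get(label) != ann_b.get(label):
            flags.append((label, ann_a.get(label), ann_b.get(label)))
    return flags

# The annotators disagree on whether Arg1 includes the complementizer 'that'
kate  = {"Arg0": (0, 1), "Arg2": (2, 3), "Arg1": (4, 16)}
erwin = {"Arg0": (0, 1), "Arg2": (2, 3), "Arg1": (3, 16)}
print(discrepancies(kate, erwin))   # [('Arg1', (4, 16), (3, 16))]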

Inter-Annotator Agreement

Annotator vs. Gold Standard

Financial Subcorpus Status
- 1005 verbs framed (700+ to go)
  - ( VerbNet siblings)
- 535 verbs first-passed
  - 22,595 unique tokens
  - does not include ~3000 tokens tagged for Senseval
- 89 verbs second-passed
  - tokens
- 42 verbs solomonized
  - 2890 tokens

Throughput
- Framing: approximately 25 verbs/week
  - Olga will also start framing; jointly up to 50 verbs/week
- Annotation: approximately 50 predicates/hour
  - 20 hours of annotation a week, 1000 predicates/week
- Solomonization: approximately 1 hour per verb, but will speed up with lower-frequency verbs
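As a back-of-the-envelope check against the status slide above: the 700+ verbs still to be framed, at the joint rate of about 50 verbs/week, come to roughly 700 / 50 = 14 weeks, i.e. about the 3.5 months quoted earlier; and the 6K+ first-pass predicates remaining, at roughly 1000 predicates/week, come to 6-7 weeks, matching the 7-week estimate.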

Summary
- Predicate-argument structure labels are arbitrary to a certain degree, but still consistent, and generic enough to be mappable to particular theoretical frameworks
- Automatic tagging as a first pass makes the task feasible
- Agreement and accuracy figures are reassuring
- Financial subcorpus is 80% complete; beta release in June

Solomonization
Source tree: Intel told analysts that the company will resume shipments of the chips within two to three weeks.

Kate said:
  arg0: Intel
  arg1: the company will resume shipments of the chips within two to three weeks
  arg2: analysts

Erwin said:
  arg0: Intel
  arg1: that the company will resume shipments of the chips within two to three weeks
  arg2: analysts

Solomonization
Such loans to Argentina also remain classified as non-accruing, *TRACE*-1 costing the bank $ 10 million *TRACE*-*U* of interest income in the third period.

Kate said:
  arg1: *TRACE*-1
  arg2: $ 10 million *TRACE*-*U* of interest income
  arg3: the bank
  argM-TMP: in the third period

Erwin said:
  arg1: *TRACE*-1 -> Such loans to Argentina
  arg2: $ 10 million *TRACE*-*U* of interest income
  arg3: the bank
  argM-TMP: in the third period

Solomonization
Also, substantially lower Dutch corporate tax rates helped the company keep its tax outlay flat relative to earnings growth.

Kate said:
  arg0: the company
  arg1: its tax outlay
  arg3-PRD: flat
  argM-MNR: relative to earnings growth

Katherine said:
  arg0: the company
  arg1: its tax outlay
  arg3-PRD: flat
  argM-ADV: relative to earnings growth