Slide 1: Proposition Bank: A Resource of Predicate-Argument Relations
Martha Palmer, Dan Gildea, Paul Kingsbury
University of Pennsylvania
February 26, 2002
ACE PI Meeting, Fairfield Inn, MD
Slide 2: Outline
- Overview
- Status report
- Outstanding issues
- Automatic tagging (Dan Gildea)
- Details (Paul Kingsbury): frames files, annotator issues, demo
Slide 3: Proposition Bank: Generalizing from Sentences to Propositions
All of these variants express the same underlying proposition, meet(Powell, Zhu Rongji):
- Powell met Zhu Rongji
- Powell met with Zhu Rongji
- Powell and Zhu Rongji met
- Powell and Zhu Rongji had a meeting
- ...
More generally: meet(Somebody1, Somebody2); compare verbs such as debate, consult, join, wrestle, battle.
A single sentence can yield several propositions:
"When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane."
- meet(Powell, Zhu)
- discuss([Powell, Zhu], return(X, plane))
(A representation sketch follows.)
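As a minimal sketch of the target representation (the Python encoding and the Arg0/Arg1 labels for "meet" are illustrative assumptions, not the project's file format), all of the variants above collapse into a single predicate-argument structure:

    from dataclasses import dataclass

    @dataclass
    class Proposition:
        """A predicate with its numbered arguments, abstracted away from syntax."""
        predicate: str
        args: dict  # argument label -> filler

    variants = [
        "Powell met Zhu Rongji",
        "Powell met with Zhu Rongji",
        "Powell and Zhu Rongji met",
        "Powell and Zhu Rongji had a meeting",
    ]
    # Every surface variant maps to the same proposition:
    meet = Proposition("meet", {"Arg0": "Powell", "Arg1": "Zhu Rongji"})
    print(meet)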
Slide 4: Penn English Treebank
- 1.3 million words
- Wall Street Journal and other sources
- Tagged with part-of-speech
- Syntactically parsed
- Widely used in the NLP community
- Available from the Linguistic Data Consortium
Slide 5: A TreeBanked Sentence
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))
Slide 6: The Same Sentence, PropBanked
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

(S Arg0 (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               Arg1 (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S Arg0 (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    Arg2 (NP the U.S. car maker)
                                    Arg1 (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))

Resulting propositions:
- expect(Analysts, GM-Jaguar pact)
- give(GM-Jaguar pact, US car maker, 30% stake)
(A sketch of reading these labels off the tree follows.)
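As a hedged sketch (nltk is an assumption here, not part of the project's toolchain), the PropBank labels can be read as pointers at constituents of the Treebank parse:

    from nltk.tree import Tree

    parse = Tree.fromstring("""
    (S (NP-SBJ Analysts)
       (VP have
           (VP been
               (VP expecting
                   (NP (NP a GM-Jaguar pact)
                       (SBAR (WHNP-1 that)
                             (S (NP-SBJ *T*-1)
                                (VP would
                                    (VP give
                                        (NP the U.S. car maker)
                                        (NP (NP an eventual (ADJP 30 %) stake)
                                            (PP-LOC in (NP the British company))))))))))))
    """)

    # The annotation for "expecting" points at two tree nodes:
    args_of_expecting = {
        "Arg0": parse[0],           # (NP-SBJ Analysts)
        "Arg1": parse[1][1][1][1],  # the NP "a GM-Jaguar pact that ..."
    }
    for label, node in args_of_expecting.items():
        print(label, "->", " ".join(node.leaves()))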
Slide 7: English PropBank
- 1M words of Treebank over two years, May '01 through May '03
- New semantic augmentations: predicate-argument relations for verbs
- Labeled arguments: Arg0, Arg1, Arg2, ...
- First subtask: 300K-word financial subcorpus (12K sentences, 29K+ predicates)
- Spin-off: annotation guidelines (necessary for annotators)
- English lexical resource, the frames files: 3500+ verbs with labeled examples and rich semantics
- http://www.cis.upenn.edu/~ace/
Slide 8: English PropBank - Current Status
Frames files:
- 742 verb lemmas (932 including phrasal variants)
- 363/899 VerbNet semi-automatic expansions (subtask/PropBank)
First subtask (300K financial subcorpus):
- 22,595 unique predicates annotated out of 29K (80%); 6K+ remaining (7 weeks at 1000/week, first pass)
- 1005 verb lemmas out of 1700+ (59%); 700 remaining (3.5 months at 200/month)
Full PropBank (possibly including some of Brown):
- 34,437 predicates annotated out of 118K (29%)
- 1904 (1005 + 899) verb lemmas out of 3500 (54%)
Slide 9: Projected Delivery Dates
Financial subcorpus:
- alpha release: December 2001
- beta release: June 2002
- adjudicated release: December 2002
PropBank:
- alpha release: December 2002
- beta release: Spring 2003
Slide 10: English PropBank - Status
Sense tagging:
- 200+ verbs with multiple rolesets
- To be sense-tagged this summer by undergraduates, using NSF funds
Still need to address:
- Three usages of "have": imperative, possessive, auxiliary
- "be" and "become": predicate adjectives, predicate nominals
Slide 11: Automatic Labeling of Semantic Relations
Features:
- Predicate
- Phrase type
- Parse tree path (sketched below)
- Position (before/after the predicate)
- Voice (active/passive)
- Head word
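The parse tree path is the least obvious of these features. A sketch in the style of Gildea and Jurafsky's path feature (not the authors' own code; nltk is assumed): the path is the chain of labels from the predicate up to the lowest common ancestor and back down to the candidate constituent.

    from nltk.tree import Tree

    def tree_path(tree, pred_pos, arg_pos):
        """Path such as 'VBD↑VP↑S↓NP' between two tree addresses."""
        i = 0  # length of the common prefix of the two addresses
        while i < len(pred_pos) and i < len(arg_pos) and pred_pos[i] == arg_pos[i]:
            i += 1
        ups = [tree[pred_pos[:d]].label() for d in range(len(pred_pos), i - 1, -1)]
        downs = [tree[arg_pos[:d]].label() for d in range(i + 1, len(arg_pos) + 1)]
        return "↑".join(ups) + "↓" + "↓".join(downs)

    t = Tree.fromstring("(S (NP (NNS Analysts)) (VP (VBD expected) (NP (DT a) (NN pact))))")
    pred = t.leaf_treeposition(1)[:-1]  # the VBD node over "expected"
    print(tree_path(t, pred, (0,)))     # subject NP -> VBD↑VP↑S↓NP
    print(tree_path(t, pred, (1, 1)))   # object NP  -> VBD↑VP↓NP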
Slide 12: Example with Features
(The worked example shown on this slide was not preserved in the transcription.)
Slide 13: Labelling Accuracy - Known Boundaries
Accuracy of semantic role prediction for known boundaries: the system is given the constituents to classify. FrameNet examples (training and test) are handpicked to be unambiguous.

Parses         | PropBank (>10 instances) | PropBank | FrameNet
Automatic      | 79.6                     | 73.6     | 82.0
Gold Standard  | 83.1                     | 77.0     | --
Slide 14: Labelling Accuracy - Unknown Boundaries
Accuracy of semantic role prediction for unknown boundaries: the system must identify the constituents that are arguments and give them the correct roles (precision and recall are defined in the sketch below).

Parses         | PropBank Precision | PropBank Recall | FrameNet Precision | FrameNet Recall
Automatic      | 57.7               | 50.0            | 64.6               | 61.2
Gold Standard  | 71.1               | 64.4            | --                 | --
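For reference, precision and recall here are the standard definitions; a quick sketch (the counts are illustrative, chosen only to reproduce the FrameNet automatic-parse row):

    def precision_recall(correct, predicted, gold):
        # precision: fraction of proposed arguments that are right
        # recall: fraction of true arguments that are found
        return correct / predicted, correct / gold

    p, r = precision_recall(correct=646, predicted=1000, gold=1056)
    print(f"precision = {p:.1%}, recall = {r:.1%}")  # 64.6%, 61.2%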
Slide 15: Complete Sentence
Analysts have been expecting a GM-Jaguar pact that *T*-1 would give the U.S. car maker an eventual 30% stake in the British company and create joint ventures that *T*-2 would produce an executive-model range of cars.
- expect(analysts, pact)
- give(pact, car_maker, stake)
- create(pact, joint_ventures)
- produce(joint_ventures, range_of_cars)
Slide 16: Guidelines: Frames Files
- Created manually by Paul Kingsbury
- New framer: Olga Babko-Malaya (Ph.D. in Linguistics, Rutgers)
- Refer to VerbNet, WordNet, and FrameNet
- Currently in place for 787/986 verbs
- Use "semantic role glosses" unique to each verb (mapped to Arg0, Arg1, ... labels appropriate to the class)
Slide 17: Frames Example: expect
Roles:
- Arg0: expecter
- Arg1: thing expected
Example (transitive, active): Portfolio managers expect further declines in interest rates.
- Arg0: Portfolio managers
- REL: expect
- Arg1: further declines in interest rates
Slide 18: Frames Example: give
Roles:
- Arg0: giver
- Arg1: thing given
- Arg2: entity given to
Example (double object): The executives gave the chefs a standing ovation.
- Arg0: The executives
- REL: gave
- Arg2: the chefs
- Arg1: a standing ovation
Slide 19: How Are Arguments Numbered?
- Examination of example sentences
- Determination of required or highly preferred elements
- Sequential numbering; Arg0 is the typical first argument, except for:
  - ergative/unaccusative verbs (shake example)
  - arguments mapped across "synonymous" verbs
Slide 20: Additional Tags (Arguments or Adjuncts?)
A variety of ArgMs (Arg# > 4), listed below with a legend sketch to follow:
- TMP: when?
- LOC: where at?
- DIR: where to?
- MNR: how?
- PRP: why?
- REC: himself, themselves, each other
- PRD: this argument refers to or modifies another
- ADV: others
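A small legend for these function tags, as one might keep when rendering annotations (the dict itself is illustrative; the glosses are from this slide):

    ARGM_GLOSS = {
        "TMP": "when?",
        "LOC": "where at?",
        "DIR": "where to?",
        "MNR": "how?",
        "PRP": "why?",
        "REC": "himself, themselves, each other",
        "PRD": "refers to or modifies another argument",
        "ADV": "others",
    }
    print("ArgM-TMP =", ARGM_GLOSS["TMP"])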
Slide 21: Ergative/Unaccusative Verbs: rise
Roles:
- Arg1: logical subject, patient, thing rising
- Arg2: EXT, amount risen
- Arg3: start point*
- Arg4: end point
Example: Sales rose 4% to $3.28 billion from $3.16 billion.
*Note: the preposition has to be mentioned explicitly (Arg3-from, Arg4-to); ArgM-Source and ArgM-Goal could have been used instead. The distinction is arbitrary.
Slide 22: Synonymous Verbs: add in the Sense of rise
Roles:
- Arg1: logical subject, patient, thing rising/gaining/being added to
- Arg2: EXT, amount risen
- Arg4: end point
Example: The Nasdaq composite index added 1.01 to 456.6 on paltry volume.
Slide 23: Phrasal Verbs
put together, put in, put off, put on, put out, put up, ...
Accounts for an additional 200 "verbs".
Slide 24: Frames: Multiple Rolesets
- Rolesets are not necessarily consistent between different senses of the same verb: a verb with multiple senses can have multiple frames, but need not.
- Roles and their mappings onto argument labels are consistent between different verbs that share similar argument structures, similar to FrameNet (Levin / VerbNet classes; http://www.cis.upenn.edu/~dgildea/Verbs/).
- Of the 787 most frequent verbs (a storage sketch follows):
  - 1 roleset: 521
  - 2 rolesets: 169
  - 3+ rolesets: 97 (includes light verbs)
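A sketch of how one lemma with two rolesets might be stored (the structure, ids, and the glosses for the first sense are illustrative assumptions; the second sense follows the "add in the sense of rise" slide above):

    FRAMES = {
        "add": [
            {"roleset": "add.01", "gloss": "basic sense",
             "roles": {"Arg0": "adder", "Arg1": "thing added",
                       "Arg2": "thing added to"}},
            {"roleset": "add.02", "gloss": "in the sense of rise",
             "roles": {"Arg1": "thing being added to",
                       "Arg2": "EXT, amount risen",
                       "Arg4": "end point"}},
        ],
    }
    for rs in FRAMES["add"]:
        print(rs["roleset"], "-", rs["gloss"], "-", rs["roles"])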
Slide 25: Semi-Automatic Expansion of Frames
- Experimenting with semi-automatic expansion: find unframed members of a Levin class in VerbNet and "inherit" frames from a framed member (sketched below)
- 787 verbs manually framed; can expand to 1200+ using VerbNet
- Will need hand correction
- In a first experiment, automatic expansion provided 90% coverage of the data
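A minimal sketch of the expansion step, assuming frames are plain dicts keyed by verb (the class listing is from the next slide; everything else is illustrative):

    destroy_frame = {"Arg0": "destroyer", "Arg1": "thing destroyed",
                     "Arg2": "instrument of destruction"}

    verbnet_destroy_44 = {
        "annihilate", "blitz", "decimate", "demolish", "destroy", "devastate",
        "exterminate", "extirpate", "obliterate", "ravage", "raze", "ruin",
        "waste", "wreck",
    }
    frames = {"destroy": destroy_frame}  # manually framed so far

    # Unframed class members inherit the framed member's roles,
    # flagged for later hand correction.
    for verb in sorted(verbnet_destroy_44 - frames.keys()):
        frames[verb] = dict(destroy_frame)
    print(len(frames), "verbs framed after expansion")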
Slide 26: More on Automatic Expansion
destroy:
- Arg0: destroyer
- Arg1: thing destroyed
- Arg2: instrument of destruction
VerbNet class Destroy-44: annihilate, blitz, decimate, demolish, destroy, devastate, exterminate, extirpate, obliterate, ravage, raze, ruin, waste, wreck
Slide 27: What a Waste
waste (frame inherited from destroy):
- Arg0: destroyer
- Arg1: thing destroyed
- Arg2: instrument of destruction
"He didn't waste any time distancing himself from his former boss."
- Arg0: He
- Arg1: any time
- Arg2 = ? distancing himself ...
The inherited frame does not fit this sense of "waste", which is why hand correction is needed.
Slide 28: Trends in Argument Numbering
- Arg0: agent
- Arg1: direct object / theme / patient
- Arg2: indirect object / benefactive / instrument / attribute / end state
- Arg3: start point / benefactive / instrument / attribute
- Arg4: end point
Slide 29: Morphology
Verbs are also marked for tense/aspect/voice:
- passive/active
- perfect/progressive
- third singular (is, has, does, was)
- present/past/future
- infinitives/participles/gerunds/finites
Modals and negation are marked as ArgMs.
Slide 30: Annotation Procedure
- Extraction of all sentences with a given verb
- First pass: automatic tagging (Joseph Rosenzweig), http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon
- Second pass: double-blind hand correction
  - Annotators have a variety of backgrounds, with less syntactic training than treebanking requires
  - The tagging tool highlights discrepancies (sketched below)
- Third pass: "Solomonization" (adjudication)
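A sketch of the discrepancy check in the double-blind pass (the data structures are illustrative; the example anticipates the solomonization slides below):

    def discrepancies(ann_a, ann_b):
        """Return the argument labels on which two annotators disagree."""
        flagged = {}
        for label in set(ann_a) | set(ann_b):
            if ann_a.get(label) != ann_b.get(label):
                flagged[label] = (ann_a.get(label), ann_b.get(label))
        return flagged

    kate = {"Arg0": "Intel", "Arg2": "analysts",
            "Arg1": "the company will resume shipments ..."}
    erwin = {"Arg0": "Intel", "Arg2": "analysts",
             "Arg1": "that the company will resume shipments ..."}
    print(discrepancies(kate, erwin))  # only Arg1 differs -> adjudicate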
Slide 31: Inter-Annotator Agreement
(Agreement chart not preserved in this transcription; an illustrative computation follows.)
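As a hedged illustration of how such agreement figures can be computed (standard percent agreement and Cohen's kappa; not the project's evaluation code):

    from collections import Counter

    def kappa(labels_a, labels_b):
        n = len(labels_a)
        observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        ca, cb = Counter(labels_a), Counter(labels_b)
        expected = sum(ca[l] * cb[l] for l in ca) / (n * n)
        return (observed - expected) / (1 - expected)

    a = ["Arg0", "Arg1", "Arg1", "ArgM-TMP", "Arg2"]
    b = ["Arg0", "Arg1", "Arg2", "ArgM-TMP", "Arg2"]
    print(f"kappa = {kappa(a, b):.2f}")  # 0.74 on this toy sample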
Slide 32: Annotator vs. Gold Standard
(Comparison chart not preserved in this transcription.)
Slide 33: Financial Subcorpus Status
- 1005 verbs framed (742 + 363 VerbNet siblings); 700+ to go
- 535 verbs first-passed: 22,595 unique tokens (not including ~3000 tokens tagged for Senseval)
- 89 verbs second-passed: 7600+ tokens
- 42 verbs solomonized: 2890 tokens
Slide 34: Throughput
- Framing: approximately 25 verbs/week; Olga will also start framing, for a joint rate of up to 50 verbs/week
- Annotation: approximately 50 predicates/hour; at 20 hours of annotation a week, 1000 predicates/week
- Solomonization: approximately 1 hour per verb, but will speed up with lower-frequency verbs
Slide 35: Summary
- Predicate-argument structure labels are arbitrary to a certain degree, but still consistent, and generic enough to be mappable to particular theoretical frameworks
- Automatic tagging as a first pass makes the task feasible
- Agreement and accuracy figures are reassuring
- The financial subcorpus is 80% complete; beta release in June
Slide 36: Solomonization
Source tree: Intel told analysts that the company will resume shipments of the chips within two to three weeks.
Kate said:
- Arg0: Intel
- Arg1: the company will resume shipments of the chips within two to three weeks
- Arg2: analysts
Erwin said:
- Arg0: Intel
- Arg1: that the company will resume shipments of the chips within two to three weeks
- Arg2: analysts
Slide 37: Solomonization
Source tree: Such loans to Argentina also remain classified as non-accruing, *TRACE*-1 costing the bank $10 million *TRACE*-*U* of interest income in the third period.
Kate said:
- Arg1: *TRACE*-1
- Arg2: $10 million *TRACE*-*U* of interest income
- Arg3: the bank
- ArgM-TMP: in the third period
Erwin said:
- Arg1: *TRACE*-1 -> Such loans to Argentina
- Arg2: $10 million *TRACE*-*U* of interest income
- Arg3: the bank
- ArgM-TMP: in the third period
Slide 38: Solomonization
Source tree: Also, substantially lower Dutch corporate tax rates helped the company keep its tax outlay flat relative to earnings growth.
Kate said:
- Arg0: the company
- Arg1: its tax outlay
- Arg3-PRD: flat
- ArgM-MNR: relative to earnings growth
Katherine said:
- Arg0: the company
- Arg1: its tax outlay
- Arg3-PRD: flat
- ArgM-ADV: relative to earnings growth