

Slide 1: Putting Meaning Into Your Trees
Martha Palmer, Paul Kingsbury, Olga Babko-Malaya, Scott Cotton, Nianwen Xue, Shijong Ryu, Ben Snyder
PropBanks I and II site visit, University of Pennsylvania, October 30, 2003

Slide 2: Proposition Bank: From Sentences to Propositions

Powell met Zhu Rongji
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had a meeting
...all yield the same proposition: meet(Powell, Zhu Rongji), an instance of meet(Somebody1, Somebody2). Verbs with the same pattern: debate, consult, join, wrestle, battle.

When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))
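The normalization the slide illustrates can be sketched in Python. This is a hypothetical representation for illustration only, not PropBank's actual storage format; the `Proposition` class and `render` helper are invented names.

```python
from dataclasses import dataclass

# Hypothetical representation of a proposition; not PropBank's real format.
@dataclass(frozen=True)
class Proposition:
    predicate: str
    args: tuple

def render(p: Proposition) -> str:
    # Flat rendering like meet(Powell, Zhu Rongji) on the slide.
    return f"{p.predicate}({', '.join(p.args)})"

# All four surface variants above normalize to one proposition:
meet = Proposition("meet", ("Powell", "Zhu Rongji"))
print(render(meet))  # meet(Powell, Zhu Rongji)
```

The point of the sketch: four syntactically distinct sentences share a single predicate-argument structure, which is what the Proposition Bank annotates.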

Slide 3: Capturing semantic roles*

JK broke [ARG1 the LCD projector].
[ARG1 The windows] were broken by the hurricane.
[ARG1 The vase] broke into pieces when it toppled over.
(The syntactic SUBJ differs in each sentence, but the thing broken is always Arg1.)

*See also FrameNet, http://www.icsi.berkeley.edu/~framenet/

Slide 4: Outline

- Introduction
- Proposition Bank
  - Starting with Treebanks
  - Frames files
  - Annotation process and status
- PropBank II
- Automatic labelling of semantic roles
- Chinese Proposition Bank

Slide 5: A TreeBanked Sentence

Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))

Slide 6: The same sentence, PropBanked

(S Arg0 (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               Arg1 (NP (NP a GM-Jaguar pact)
                        (SBAR (WHNP-1 that)
                              (S Arg0 (NP-SBJ *T*-1)
                                 (VP would
                                     (VP give
                                         Arg2 (NP the U.S. car maker)
                                         Arg1 (NP (NP an eventual (ADJP 30 %) stake)
                                                  (PP-LOC in (NP the British company))))))))))))

expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
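The PropBanked parse can be read as a set of labeled argument spans per predicate; a minimal sketch (hypothetical dict layout, invented helper name) of turning those spans into flat propositions:

```python
# Hypothetical layout: each predicate maps its numbered args to text spans.
annotations = {
    "expect": {"Arg0": "Analysts", "Arg1": "GM-J pact"},
    "give": {"Arg0": "GM-J pact",      # recovered via the trace *T*-1
             "Arg1": "30% stake",
             "Arg2": "US car maker"},
}

def to_proposition(pred: str, args: dict) -> str:
    # Order spans by role label, so Arg0, Arg1, Arg2 come out in sequence.
    ordered = [span for _, span in sorted(args.items())]
    return f"{pred}({', '.join(ordered)})"

for pred, args in annotations.items():
    print(to_proposition(pred, args))
```

Note that the trace *T*-1 is what lets the annotation assign "a GM-Jaguar pact" as the Arg0 of give even though it appears only once in the surface string; the slide prints give's recipient before the theme, whereas this sketch simply orders by argument number.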

Slide 7: Frames File Example: expect

Roles:
  Arg0: expecter
  Arg1: thing expected

Example (transitive, active):
  Portfolio managers expect further declines in interest rates.
  Arg0: Portfolio managers
  REL:  expect
  Arg1: further declines in interest rates

Slide 8: Frames File Example: give

Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to

Example (double object):
  The executives gave the chefs a standing ovation.
  Arg0: The executives
  REL:  gave
  Arg2: the chefs
  Arg1: a standing ovation
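The two frames-file entries can be modeled as role dictionaries and a labeled instance checked against them. This is a sketch with hypothetical structures; actual PropBank frames files use their own file format.

```python
# Hypothetical dict rendering of the two frames-file entries above.
frames = {
    "expect": {"Arg0": "expecter", "Arg1": "thing expected"},
    "give":   {"Arg0": "giver", "Arg1": "thing given", "Arg2": "entity given to"},
}

# The slide's double-object example, labeled against the give frame:
instance = {
    "REL":  "gave",
    "Arg0": "The executives",
    "Arg2": "the chefs",           # entity given to
    "Arg1": "a standing ovation",  # thing given
}

def well_formed(frame: dict, labeled: dict) -> bool:
    # Every numbered argument must name a role the frame defines.
    return all(role in frame for role in labeled if role != "REL")

print(well_formed(frames["give"], instance))  # True
```

A design point worth noticing: in the double-object construction the recipient ("the chefs") precedes the theme ("a standing ovation") in the surface string, but the role labels stay constant (Arg2 and Arg1), which is exactly what makes the numbered-argument scheme syntax-independent.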

Slide 9: Trends in Argument Numbering

- Arg0 = agent
- Arg1 = direct object / theme / patient
- Arg2 = indirect object / benefactive / instrument / attribute / end state
- Arg3 = start point / benefactive / instrument / attribute
- Arg4 = end point

Slide 10: Ergative/Unaccusative Verbs

Roles (no Arg0 for unaccusative verbs):
- Arg1 = logical subject, patient, thing rising
- Arg2 = EXT, amount risen
- Arg3 = start point
- Arg4 = end point

Examples:
  Sales rose 4% to $3.28 billion from $3.16 billion.
  The Nasdaq composite index added 1.01 to 456.6 on paltry volume.
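The first example sentence decomposes under this numbering as follows (a hypothetical dict rendering of the slide's labels, for illustration only):

```python
# "Sales rose 4% to $3.28 billion from $3.16 billion."
rise = {
    "REL":  "rose",
    "Arg1": "Sales",                 # logical subject, thing rising
    "Arg2": "4%",                    # EXT, amount risen
    "Arg4": "to $3.28 billion",      # end point
    "Arg3": "from $3.16 billion",    # start point
}

# Unaccusative: no Arg0 (no agent) is assigned at all.
assert "Arg0" not in rise
```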

Slide 11: Function tags for English/Chinese (arguments or adjuncts?)

A variety of ArgMs, beyond the numbered arguments:
- TMP: when?
- LOC: where at?
- DIR: where to?
- MNR: how?
- PRP: why?
- TPC: topic
- PRD: this argument refers to or modifies another
- ADV: others
- CND: conditional
- DGR: degree
- FRQ: frequency

Slide 12: Inflection

Verbs are also marked for tense/aspect:
- Passive/Active
- Perfect/Progressive
- Third singular (is, has, does, was)
- Present/Past/Future
- Infinitives/Participles/Gerunds/Finites
- Modals and negation are marked as ArgMs

Slide 13: Word Senses in PropBank

Orders to ignore word sense: not feasible for 700+ verbs.
  Mary left the room.
  Mary left her daughter-in-law her pearls in her will.

Frameset leave.01, "move away from":
  Arg0: entity leaving
  Arg1: place left

Frameset leave.02, "give":
  Arg0: giver
  Arg1: thing given
  Arg2: beneficiary

How do these relate to traditional word senses, as in WordNet?
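Frameset disambiguation for leave can be sketched as a compatibility check (hypothetical dicts and helper name; the real frameset tagging is done by human annotators, not this rule):

```python
# The two framesets for "leave" from the slide, as hypothetical dicts.
framesets = {
    "leave.01": {"gloss": "move away from",
                 "roles": {"Arg0": "entity leaving", "Arg1": "place left"}},
    "leave.02": {"gloss": "give",
                 "roles": {"Arg0": "giver", "Arg1": "thing given",
                           "Arg2": "beneficiary"}},
}

def compatible(fs_id: str, roles_used: list) -> bool:
    # A usage fits a frameset only if every role it uses is defined there.
    return set(roles_used) <= set(framesets[fs_id]["roles"])

# "Mary left the room" uses Arg0/Arg1; both framesets allow that, so the
# annotator must choose by sense (here leave.01, "move away from").
# The will example uses Arg0/Arg1/Arg2, which only leave.02 defines:
print(compatible("leave.01", ["Arg0", "Arg1", "Arg2"]))  # False
print(compatible("leave.02", ["Arg0", "Arg1", "Arg2"]))  # True
```

This also shows why framesets are coarser than WordNet senses: the split is driven by differing argument structure, not by every shade of meaning.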

Slide 14: Overlap between Groups and Framesets – 95%

[Diagram: WordNet senses WN1–WN20 of "develop", grouped into Frameset1 and Frameset2.]

Palmer, Dang & Fellbaum, NLE 2004

Slide 15: Annotator accuracy – ITA 84%

Slide 16: English PropBank Status (w/ Paul Kingsbury & Scott Cotton)

- Create a Frame File for each verb: DONE
  - 3282 lemmas, 4400+ framesets
- First pass: automatic tagging (Joseph Rosenzweig)
- Second pass: double-blind hand correction
  - 118K predicates; all but 300 done
- Third pass: Solomonization (adjudication)
  - Betsy Klipple, Olga Babko-Malaya; 400 left
- Frameset tags
  - 700+, double blind, almost adjudicated, 92% ITA
- Quality control and general cleanup

Slide 17: Quality Control and General Cleanup

- Frame File consistency checking
- Coordination with NYU
  - Ensuring compatibility of frames and format
- Leftover tasks
  - have, be, become
  - Adjectival usages
- General cleanup
  - Tense tagging
  - Finalizing treatment of split arguments (e.g. say) and symmetric arguments (e.g. match)
  - Supplementing sparse data w/ Brown for selected verbs

Slide 18: Summary of English PropBank (Paul Kingsbury, Olga Babko-Malaya, Scott Cotton)

Genre                                               | Words | Frames Files | Frameset Tags | Released
Wall Street Journal* (financial subcorpus)          | 300K  | < 2000       | 400           | July 02
Wall Street Journal* (Penn TreeBank II)             | 1000K | < 4000       | 700           | Dec 03? (March 03)
English Translation of Chinese TreeBank             | 100K  | < 1500       |               | July 04
Sinorama, English corpus (NSF-ITR funding)          | 150K  | < 2000       |               | July 05
English half of DLI Military Corpus (ARL funding)   | 50K   | < 1000       |               | July 05

* ITIC funding

Slide 19: PropBank II

- Nominalizations (NYU)
  - Lexical frames: DONE
- Event variables (including temporals and locatives)
- More fine-grained sense tagging
  - Tagging nominalizations w/ WordNet sense
  - Selected verbs and nouns
- Nominal coreference
  - not names
- Clausal discourse connectives: selected subset

Slide 20: PropBank I

Also, [Arg0 substantially lower Dutch corporate tax rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].

REL: help
  Arg0: tax rates
  Arg1: the company keep its tax outlay flat
REL: keep
  Arg0: the company
  Arg1: its tax outlay
  Arg3-PRD: flat
  ArgM-ADV: relative to earnings growth

PropBank II additions: event variables (ID# h23, k16); nominal reference; sense tags (help2,5; tax rate1; keep1; company1); discourse connectives.

Slide 21: Summary of Multilingual TreeBanks, PropBanks

Parallel Corpora | Text                        | Treebank                    | PropBank I                   | PropBank II
Chinese Treebank | Chinese 500K / English 400K | Chinese 500K / English 100K | Chinese 500K / English 350K* | Ch 100K / En 100K
Arabic Treebank  | Arabic 500K / English 500K  | Arabic 500K / English 100K  |                              |
Korean Treebank  | Korean 180K / English 50K   | Korean 180K / English 50K   | Korean 100K + English 50K    |

* Also 1M-word English monolingual PropBank

Slide 22: Agenda

- PropBank I, 10:30 – 10:50
  - Automatic labeling of semantic roles
  - Chinese Proposition Bank
- Proposition Bank II, 10:50 – 11:30
  - Event variables – Olga Babko-Malaya
  - Sense tagging – Hoa Dang
  - Nominal coreference – Edward Loper
  - Discourse tagging – Aravind Joshi
- Research Areas, 11:30 – 12:00
  - Moving forward – Mitch Marcus
  - Alignment improvement via dependency structures – Yuan Ding
  - Employing syntactic features in MT – Libin Shen
- Lunch, 12:00 – 1:30, White Dog
- Research Area, 1:30 – 1:45
  - Clustering – Paul Kingsbury
- DOD Program presentation, 1:45 – 2:15
- Discussion, 2:15 – 3:00

