Survey of Annotation Work
Joint session, Thursday afternoon, April 14
Chair: Eduard Hovy, ISI
Phenomena (from OntoBank)

Level | Who | Phenomenon
----- | --- | ----------
L1 | Penn Treebank | bracketing/grouping of predications
L1 | Propbank | verb sense creation and annotation (including copula)
L1 | Propbank, Framenet, Verbnet, LCS, ILIT | verb sense frames & predicate structure (what labels?)
L1 | Propbank+Omega, IAMTC+Omega, ILIT, Scone | semantic term repository: conversion of senses to concepts(/clusters), axiom creation, insertion into ontology
L1,L2 | NomBank, ACE | noun senses, NP structure, propositions, (genitives, …)
L1 | Gazetteers | repository of instances (people, places, events…)
L1 | BBN, (ACE) | co-reference links (including events)
L2 | | pronoun (and empty trace?) classification (ref, bound, event, generic, other) (proposition vs. event?)
L2 | Propbank II, ILIT | event identification
Level | Who | Phenomenon
----- | --- | ----------
L1 | | direct quotation and reported speech
L1 | | simple quantifier phrases and numerical exprs
L1,L2 | TimeBank, TIMEX, ISI (Hobbs), ILIT | inter-predicate relations: temporal, spatial, manner, etc. (incl. effects from discourse and aspect)
L2+ | WordNetPlus, Pantel, CYC | entailments
L2+ | | comparatives
L2 | | coordination
L2/L3 | Penn Discourse Treebank, RST Treebank, ILIT | discourse structure
L2/L3 | U Pitt, ISI | opinions
L3 | | identifying propositions and simple modality
L3/L4 | | other adverbials (epistemic modals, evidentials)
L3/L4 | | polarity (more advanced than plain “neg” in L1)
L3+ | Steedman, Hajicova, Sgall | information structure (theme/rheme), focus
L4 | ILIT | pragmatics/speech acts, style
L4 | | presuppositions
? | CYC, Scone | axioms and reasoning
? | Framenet | metaphor
Notional goal

phenomenon | annot speed | annot reliability | functionality | funder need
---------- | ----------- | ----------------- | ------------- | -----------
noun senses | 25 wph | 86/90% | IE,MT,QA... | high
verb senses | 70 wph | ~87% | MT,QA,WSD | high
verb frames | 80 w/week | 87% | MT,QA,IE… | high
time exprs | 18 wpm | 96% | QA,IR,Summ | med-hi
discourse | 100K in 400h | ~90/80% | Summ,QA | med
gazetteers | ? | ~95/90% | QA,IE | high
opinions | 100K in 400h | ~76% | QA,Summ | med-hi
number exprs | ? | ? | IE,QA,Summ | med
hypotheticals | ? | ? | QA,Summ | low?
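The annotation speeds above are quoted in mixed units, which makes them hard to compare at a glance. The following is a minimal sketch of putting some of them on a common per-hour footing. It rests on assumptions the slide does not confirm: that "wph" means words per hour, "wpm" means words per minute, and "100K in 400h" means 100,000 words annotated in 400 hours. The helper names `per_hour` and `total_over_hours` are hypothetical, and the "80 w/week" row is left out because converting it would need an assumed number of working hours per week.

```python
# Assumed readings of the slide's abbreviations (not confirmed by the source):
#   "wph" = words per hour, "wpm" = words per minute,
#   "100K in 400h" = 100,000 words annotated in 400 hours.

def per_hour(amount: float, unit: str) -> float:
    """Convert a speed given in an assumed unit to items per hour."""
    if unit == "wph":
        return float(amount)
    if unit == "wpm":
        return float(amount) * 60
    raise ValueError(f"unknown unit: {unit}")


def total_over_hours(total_items: float, hours: float) -> float:
    """Average rate for a total amount annotated over a number of hours."""
    return total_items / hours


# Rows taken from the table; rows with "?" or "w/week" speeds are omitted.
speeds = {
    "noun senses": per_hour(25, "wph"),
    "verb senses": per_hour(70, "wph"),
    "time exprs": per_hour(18, "wpm"),
    "discourse": total_over_hours(100_000, 400),
    "opinions": total_over_hours(100_000, 400),
}

# Print the rows from slowest to fastest under the assumed units.
for phenomenon, rate in sorted(speeds.items(), key=lambda kv: kv[1]):
    print(f"{phenomenon:12s} {rate:7.1f} per hour")
```

If the abbreviations mean something else (for example, "w" counting word types rather than tokens), only the `per_hour` branches need to change.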
Agenda I

Predicate/verb level:
– PropBank I and II: Martha Palmer, UPenn
– OntoBank corefs: Lance Ramshaw, BBN
– IAMTC consortium: Steve Helmreich, NMSU
– FrameNet: Charles Fillmore, UC Berkeley
– Extended LCS: Bonnie Dorr, U Maryland

Nominal level:
– NomBank: Adam Meyers, NYU
– ACE: Ralph Grishman, NYU

Terminology banks:
– WordNet: Christiane Fellbaum, Princeton
– Omega: Eduard Hovy, USC/ISI
Agenda II

Discourse level:
– RST treebank: Lynn Carlson, DoD
– Penn discourse treebank: Aravind Joshi, UPenn

Specific semantic phenomena:
– TIMEX: Lisa Ferro, MITRE & Beth Sundheim, SPAWAR
– ILIT: Sergei Nirenburg, UMBC
– Opinions: Jan Wiebe, U Pitt
– Gazetteers: Beth Sundheim, SPAWAR

Inference and reasoning:
– WN Entailments: Christiane Fellbaum, Princeton
– CYC: Dave Schneider
– Scone: Scott Fahlman
Summary of annot work