

1
David Farwell, Stephen Helmreich, Computing Research Laboratory/New Mexico State University
Lori Levin, Teruko Mitamura, Language Technologies Institute/Carnegie Mellon University
Bonnie Dorr, Rebecca Green, Institute for Advanced Computer Studies/University of Md.
Eduard Hovy, Information Sciences Institute/University of S. California
Keith Miller, Florence Reeder, MITRE Corporation
Owen Rambow, Nizar Habash, Columbia University
Columbia, CRL/NMSU, ISI/USC, LTI/CMU, MITRE, UMIACS/UMD

2
What we annotate
  multiple comparable bilingual text corpora
  parallel text corpora
  multiple translations of texts
  Genre - newspaper texts / DARPA corpus
Goals
  common representation (interlingua)
  common methodology and tools
  observe and catalogue different surface realizations of the same meaning across and within languages


5
Annotation Process
  Text is syntactically parsed (Connexor / IL0)
  Reviewed and corrected (TrEd)
  Annotation to IL1 (Tiamat)
  Content words annotated for sense (Omega)
  Arguments annotated for thematic role (LCS)
  2 English translations of 6 articles
  Arabic, French, Hindi, Japanese, Korean, Spanish
  12 annotators, 2 at each site
  Total: 144 annotated texts to IL1 level

6
Results: Agreement & Time
  Tools (Tiamat)
  Manuals (IL0 for 7 languages, IL1)
  Inter-annotator agreement: kappa = .83 (mK), .66 (wn), .59 (theta-roles)
  Annotation time: 4 hours/annotator/text, 250 words/text, 2 annotators/text = approx. 2 person years for 100K at IL1
  Next step: merge IL1 representations and develop transformation algorithms to produce IL2
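
The annotation-time estimate on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that "100K" means 100,000 words and that one person-year is roughly 1,600 working hours. Neither figure is stated on the slide, and the function name `person_years` is hypothetical.

```python
def person_years(corpus_words: int,
                 words_per_text: int = 250,
                 hours_per_annotator_per_text: float = 4.0,
                 annotators_per_text: int = 2,
                 hours_per_person_year: float = 1600.0) -> float:
    """Estimate annotation effort in person-years.

    The first three defaults come from the slide. hours_per_person_year
    is an assumption made for this sketch, not a figure from the slide.
    """
    # Number of texts needed to cover the corpus at the given text length.
    texts = corpus_words / words_per_text
    # Every text is annotated independently by each annotator.
    total_hours = texts * hours_per_annotator_per_text * annotators_per_text
    return total_hours / hours_per_person_year


# Assumption: "100K" on the slide refers to 100,000 words.
estimate = person_years(100_000)
print(f"{estimate:.1f} person-years")  # prints "2.0 person-years"
```

With a 2,000-hour working year the same call gives about 1.6 person-years, so the slide's "approx. 2 person years" holds under either assumption.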

