Training and Evaluation
CSCI-GA.2591, NYU
Ralph Grishman
Training Data
- For POS: WSJ in CoNLL format
- For NE: Reuters in CoNLL format
- For Coref: OntoNotes in CoNLL format; ACE 2005 data (? format)
- For ACE tasks (Entities, Relations, Events): ACE 2005 data in APF
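As a concrete illustration, a minimal reader for CoNLL-style NE data might look like the following. This is a sketch assuming the CoNLL-2003 column layout: one token per line, whitespace-separated columns with the NE tag last, and blank lines between sentences.

```python
# Minimal reader for CoNLL-style NE data (a sketch; assumes the
# CoNLL-2003 layout: one token per line, whitespace-separated
# columns, NE tag in the last column, blank lines between sentences).
def read_conll(lines):
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:                        # blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
        else:
            cols = line.split()
            current.append((cols[0], cols[-1]))   # (token, NE tag)
    if current:
        sentences.append(current)
    return sentences

sample = """West NNP B-NP B-MISC
Indian NNP I-NP I-MISC
all-rounder NN I-NP O

Phil NNP B-NP B-PER
Simmons NNP I-NP I-PER
""".splitlines()
print(read_conll(sample))
```

The same skeleton works for the POS and coref files by picking different columns; only the column indices change.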
APF Format
- One APF document for each source document
- XML format
- Stand-off annotation: character offsets do not count XML tags
- Entities, relations, and events supported through AceJet classes (Ace*)
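The stand-off convention can be illustrated with a small sketch: strip the tags from the source SGML/XML, then index into the tag-free text with the annotation's inclusive start/end offsets. The START/END-on-`charseq` style follows APF, but treat the details here as illustrative.

```python
import re

# Sketch of APF-style stand-off annotation: offsets index into the
# source text with the XML/SGML tags removed, so strip tags first,
# then slice.  APF charseq offsets are inclusive, hence end + 1.
def strip_tags(sgml):
    return re.sub(r"<[^>]+>", "", sgml)

source = "<DOC><TEXT>Smith visited Paris.</TEXT></DOC>"
text = strip_tags(source)          # "Smith visited Paris."

# A mention annotated with inclusive start/end offsets:
start, end = 14, 18                # the span "Paris"
mention = text[start:end + 1]
print(mention)                     # → Paris
```

Forgetting to strip the tags first is a classic source of off-by-many errors when reading APF.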
Evaluation Starting Point
We begin by evaluating each component separately. With what input?
- Use the output of the previous stage of the pipeline?
  - Likely to have lots of errors
  - Adversely affects the component in question
  - Alignment problem: RE producing correct relations on incorrect entities
Evaluation Starting Point
Start with perfect input:
- Perfect entities for relation and event extraction
- Perfect NE for entity extraction and coref
- Not always possible; may need to supplement with pipeline output
  - e.g., the set of all mentions as input to entity extraction
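Whatever input is chosen, component output is typically scored against the gold standard with precision, recall, and F1 over sets of items (e.g., mentions or relations keyed by offsets). The scorer below is a minimal sketch, not the official ACE scorer.

```python
# Minimal precision/recall/F1 over sets of predicted and gold items,
# here keyed as (type, start offset, end offset) tuples.  A sketch
# only; the official ACE scorer uses weighted, value-based scoring.
def prf(predicted, gold):
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

gold = {("PER", 0, 4), ("GPE", 14, 18)}
pred = {("PER", 0, 4), ("ORG", 14, 18)}   # second mention mistyped
print(prf(pred, gold))                    # → (0.5, 0.5, 0.5)
```

With perfect input the keys match the gold keys exactly; with pipeline input they may not, which is exactly the alignment problem below.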
Full System Evaluations
Alignment problem
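One common way to sketch the alignment step is to map each system entity mention to the gold mention with the greatest character-span overlap before scoring relations on the aligned IDs. The greedy matcher below is illustrative only; the real ACE scorer uses a more elaborate mapping.

```python
# Greedy sketch of entity-mention alignment: map each system mention
# to the gold mention whose character span (inclusive offsets)
# overlaps it most, so relations over system entities can be scored
# against relations over gold entities.
def overlap(a, b):
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def align(system_spans, gold_spans):
    mapping = {}
    for s_id, s_span in system_spans.items():
        best = max(gold_spans,
                   key=lambda g: overlap(s_span, gold_spans[g]),
                   default=None)
        if best is not None and overlap(s_span, gold_spans[best]) > 0:
            mapping[s_id] = best
    return mapping

gold = {"g1": (0, 4), "g2": (14, 18)}
system = {"s1": (0, 4), "s2": (13, 18)}   # s2 slightly misaligned
print(align(system, gold))                # → {'s1': 'g1', 's2': 'g2'}
```

A system relation is then counted correct only if its aligned arguments and its type match a gold relation.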
MaxEnt: Things to Try
- More iterations
- Smoothing (raise the feature-count cutoff)
- Two-stage classifier
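Raising the cutoff can be sketched as a pre-filter that discards features seen fewer than `cutoff` times before the events reach the MaxEnt trainer. The trainer itself (e.g., extra GIS iterations) is not shown, and the feature names below are hypothetical.

```python
from collections import Counter

# Sketch of a feature-count cutoff: drop features seen fewer than
# `cutoff` times in training, a crude form of smoothing that keeps
# the MaxEnt model from fitting weights to rare, unreliable features.
def apply_cutoff(events, cutoff=2):
    counts = Counter(f for feats, _label in events for f in feats)
    return [([f for f in feats if counts[f] >= cutoff], label)
            for feats, label in events]

# Hypothetical (features, label) training events:
events = [(["w=Paris", "cap=True"], "GPE"),
          (["w=Lyon", "cap=True"], "GPE"),
          (["w=ran", "cap=False"], "O")]
print(apply_cutoff(events, cutoff=2))
```

With `cutoff=2`, only `cap=True` survives here; the singleton word features are dropped before training.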
For Next Week
Short report (1 page):
- Any initial results
- Plan for the following 3 weeks