Training and Evaluation
CSCI-GA.2591, NYU
Ralph Grishman
Training Data
- For POS: WSJ in CoNLL format
- For NE: Reuters in CoNLL format
- For Coref: OntoNotes in CoNLL format; ACE 2005 data (? format)
- For ACE tasks (Entities, Relations, Events): ACE 2005 data in APF
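As a concrete illustration, a minimal reader for CoNLL-style NE data might look like the following. This is a sketch assuming the CoNLL-2003 column layout: one token per line, whitespace-separated columns with the NE tag last, and blank lines between sentences.

```python
# Minimal reader for CoNLL-style NE data (a sketch; assumes the
# CoNLL-2003 layout: one token per line, whitespace-separated
# columns, NE tag in the last column, blank lines between sentences).
def read_conll(lines):
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:                        # blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
        else:
            cols = line.split()
            current.append((cols[0], cols[-1]))   # (token, NE tag)
    if current:
        sentences.append(current)
    return sentences

sample = """West NNP B-NP B-MISC
Indian NNP I-NP I-MISC
all-rounder NN I-NP O

Phil NNP B-NP B-PER
Simmons NNP I-NP I-PER
""".splitlines()
print(read_conll(sample))
```

The same skeleton works for the POS and coref files by picking different columns; only the column indices change.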
APF Format
- One APF document for each source document
- XML format
- Stand-off annotation: character offsets do not count XML tags
- Entities, relations, and events supported through AceJet classes (Ace*)
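The stand-off convention can be illustrated with a small sketch: strip the tags from the source SGML/XML, then index into the tag-free text with the annotation's inclusive start/end offsets. The START/END-on-`charseq` style follows APF, but treat the details here as illustrative.

```python
import re

# Sketch of APF-style stand-off annotation: offsets index into the
# source text with the XML/SGML tags removed, so strip tags first,
# then slice.  APF charseq offsets are inclusive, hence end + 1.
def strip_tags(sgml):
    return re.sub(r"<[^>]+>", "", sgml)

source = "<DOC><TEXT>Smith visited Paris.</TEXT></DOC>"
text = strip_tags(source)          # "Smith visited Paris."

# A mention annotated with inclusive start/end offsets:
start, end = 14, 18                # the span "Paris"
mention = text[start:end + 1]
print(mention)                     # → Paris
```

Forgetting to strip the tags first is a classic source of off-by-many errors when reading APF.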
Evaluation Starting Point
We begin by evaluating each component separately. With what input?
- Use the output of the previous stage of the pipeline?
  - Likely to have lots of errors
  - Adversely affects the component in question
  - Alignment problem: RE producing correct relations on incorrect entities
Evaluation Starting Point
Start with perfect input:
- Perfect entities for relation and event extraction
- Perfect NE for entity extraction and coref
- Not always possible; may need to supplement with pipeline output
  - e.g., the set of all mentions as input to entity extraction
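Whatever input is chosen, component output is typically scored against the gold standard with precision, recall, and F1 over sets of items (e.g., mentions or relations keyed by offsets). The scorer below is a minimal sketch, not the official ACE scorer.

```python
# Minimal precision/recall/F1 over sets of predicted and gold items,
# here keyed as (type, start offset, end offset) tuples.  A sketch
# only; the official ACE scorer uses weighted, value-based scoring.
def prf(predicted, gold):
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

gold = {("PER", 0, 4), ("GPE", 14, 18)}
pred = {("PER", 0, 4), ("ORG", 14, 18)}   # second mention mistyped
print(prf(pred, gold))                    # → (0.5, 0.5, 0.5)
```

With perfect input the keys match the gold keys exactly; with pipeline input they may not, which is exactly the alignment problem below.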
Full System Evaluations
Alignment problem
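One common way to sketch the alignment step is to map each system entity mention to the gold mention with the greatest character-span overlap before scoring relations on the aligned IDs. The greedy matcher below is illustrative only; the real ACE scorer uses a more elaborate mapping.

```python
# Greedy sketch of entity-mention alignment: map each system mention
# to the gold mention whose character span (inclusive offsets)
# overlaps it most, so relations over system entities can be scored
# against relations over gold entities.
def overlap(a, b):
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def align(system_spans, gold_spans):
    mapping = {}
    for s_id, s_span in system_spans.items():
        best = max(gold_spans,
                   key=lambda g: overlap(s_span, gold_spans[g]),
                   default=None)
        if best is not None and overlap(s_span, gold_spans[best]) > 0:
            mapping[s_id] = best
    return mapping

gold = {"g1": (0, 4), "g2": (14, 18)}
system = {"s1": (0, 4), "s2": (13, 18)}   # s2 slightly misaligned
print(align(system, gold))                # → {'s1': 'g1', 's2': 'g2'}
```

A system relation is then counted correct only if its aligned arguments and its type match a gold relation.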
MaxEnt: Things to Try
- More iterations
- Smoothing (raise the feature-count cutoff)
- Two-stage classifier
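Raising the cutoff can be sketched as a pre-filter that discards features seen fewer than `cutoff` times before the events reach the MaxEnt trainer. The trainer itself (e.g., extra GIS iterations) is not shown, and the feature names below are hypothetical.

```python
from collections import Counter

# Sketch of a feature-count cutoff: drop features seen fewer than
# `cutoff` times in training, a crude form of smoothing that keeps
# the MaxEnt model from fitting weights to rare, unreliable features.
def apply_cutoff(events, cutoff=2):
    counts = Counter(f for feats, _label in events for f in feats)
    return [([f for f in feats if counts[f] >= cutoff], label)
            for feats, label in events]

# Hypothetical (features, label) training events:
events = [(["w=Paris", "cap=True"], "GPE"),
          (["w=Lyon", "cap=True"], "GPE"),
          (["w=ran", "cap=False"], "O")]
print(apply_cutoff(events, cutoff=2))
```

With `cutoff=2`, only `cap=True` survives here; the singleton word features are dropped before training.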
For Next Week
Short report (1 page):
- Any initial results
- Plan for the following 3 weeks