A transformation-based approach to argument labeling
Derrick Higgins
Educational Testing Service
General approach
Word-by-word SRL
Modified IOB scheme for indicating role boundaries
Start from simplistic baseline labeling
TBL rules re-label words based on contextual features
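The approach on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's actual implementation: the `Rule` class, the feature names (`chunk`, `side`, `is_target`), the baseline that tags only the target verb, and the three toy rules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical TBL rule: change from_tag to to_tag when all conditions hold."""
    from_tag: str
    to_tag: str
    conditions: tuple  # ((relative offset, feature name, required value), ...)

    def matches(self, words, tags, i):
        if tags[i] != self.from_tag:
            return False
        for offset, name, value in self.conditions:
            j = i + offset
            if not 0 <= j < len(words):
                return False
            # The pseudo-feature "tag" reads the current role label at position j.
            observed = tags[j] if name == "tag" else words[j].get(name)
            if observed != value:
                return False
        return True


def apply_rule(rule, words, tags):
    # Triggers are evaluated against the labeling as it stood before this rule
    # fired (an assumption; the slides do not say which convention was used).
    hits = [i for i in range(len(words)) if rule.matches(words, tags, i)]
    new_tags = list(tags)
    for i in hits:
        new_tags[i] = rule.to_tag
    return new_tags


def label(words, rules):
    # Assumed simplistic baseline: tag the target verb, everything else is O.
    tags = ["B-V" if w.get("is_target") else "O" for w in words]
    for rule in rules:  # rules are applied in the order they were learned
        tags = apply_rule(rule, words, tags)
    return tags


example_words = [
    {"word": "They", "chunk": "B-NP", "side": "L"},
    {"word": "left", "chunk": "B-VP", "side": "T", "is_target": True},
    {"word": "their", "chunk": "B-NP", "side": "R"},
    {"word": "jobs", "chunk": "I-NP", "side": "R"},
]
example_rules = [
    Rule("O", "B-A0", ((0, "chunk", "B-NP"), (0, "side", "L"))),
    Rule("O", "B-A1", ((0, "chunk", "B-NP"), (0, "side", "R"))),
    Rule("O", "I", ((0, "chunk", "I-NP"), (-1, "tag", "B-A1"))),
]
print(label(example_words, example_rules))
```

Because each rule sees the output of the rules before it, the third toy rule can condition on a label that the second rule has just assigned.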
Data representation
Modified IOB

          Argument labeling
Words     Input format   IOB2        Modified IOB
They      (A0*A0)        B-A0        B-A0
left      (V*V)          B-V         B-V
their     (A1*           B-A1        B-A1
jobs      *A1)           I-A1        I
on        (AM-TMP*       B-AM-TMP    B-AM-TMP
Friday    *AM-TMP)       I-AM-TMP    I
.         *              O           O
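The mapping between the three representations can be sketched as a small converter. It reads the example as follows, which is an interpretation rather than something the slide states outright: the modified scheme keeps the role on the first word of an argument (`B-A1`) and uses a bare `I` for the remaining words. The function name `props_to_iob` is hypothetical, and the sketch handles only flat, non-nested arguments as in the example sentence.

```python
import re


def props_to_iob(bracket_tags):
    """Convert bracketed argument tags for one verb into IOB2 and modified IOB."""
    iob2, modified = [], []
    open_role = None  # role of the argument we are currently inside, if any
    for tag in bracket_tags:
        start = re.match(r"\(([^*()]+)\*", tag)    # e.g. "(A1*"
        end = re.search(r"\*([^*()]+)\)$", tag)    # e.g. "*A1)"
        if start:
            open_role = start.group(1)
            iob2.append("B-" + open_role)
            modified.append("B-" + open_role)
        elif open_role is not None:
            iob2.append("I-" + open_role)
            modified.append("I")                   # inside tag carries no role
        else:
            iob2.append("O")
            modified.append("O")
        if end:
            open_role = None
    return iob2, modified


example_tags = ["(A0*A0)", "(V*V)", "(A1*", "*A1)", "(AM-TMP*", "*AM-TMP)", "*"]
example_iob2, example_modified = props_to_iob(example_tags)
print(example_iob2)
print(example_modified)
```

Under this reading, relabeling a whole argument only requires changing the tag on its first word, since the inside tags do not repeat the role.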
Features
Fairly standard set; role label of word depends on:
  – Target verb
  – Target verb POS
  – Target verb passive
  – Word
  – POS
  – Chunk tag
  – NE tag
  – L/R of target word
  – Clause embedding
  – PP feature
  – PP head
  – NP head
  – Path
Values for current word and surrounding words
No use made of PB frames
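To illustrate what "values for current word and surrounding words" might look like in practice, here is a small sketch that collects a few of the per-word features over a window. The window width of one word on each side, the key format `name[offset]`, and the helper names are assumptions made for the example; the slides do not specify how wide the context was.

```python
def side_of_target(i, target_index):
    """L/R feature: is word i left of, right of, or at the target verb (T)?"""
    if i < target_index:
        return "L"
    if i > target_index:
        return "R"
    return "T"


def window_features(rows, i, names=("word", "pos", "chunk"), width=1):
    """Collect the named features for word i and its neighbours.

    Positions that fall outside the sentence get the value None.
    """
    feats = {}
    for offset in range(-width, width + 1):
        j = i + offset
        for name in names:
            inside = 0 <= j < len(rows)
            feats[f"{name}[{offset:+d}]"] = rows[j][name] if inside else None
    return feats


example_rows = [
    {"word": "They", "pos": "PRP", "chunk": "B-NP"},
    {"word": "left", "pos": "VBD", "chunk": "B-VP"},
    {"word": "their", "pos": "PRP$", "chunk": "B-NP"},
]
example_feats = window_features(example_rows, 0)
example_feats["side"] = side_of_target(0, target_index=1)
print(example_feats)
```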
Transformational rules
130 total
Minimum number of applications = 3
(Mostly) local rules
  – Local syntactic features + [path, target V, NP head, etc]
Rules using context
  – Smoothing rules
  – Long-distance rules
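How a rule list of this kind is typically learned can be sketched as a greedy loop: at each step, keep the candidate rule with the largest net gain in correct tags, and stop when no candidate helps. The sketch reads "minimum number of applications" as a cutoff on how often a rule must fire in training before it can be kept, which is an assumption, as is everything else here: the 4-tuple rule format, the single-feature trigger on the current word only, and the fixed candidate list (a real TBL learner instantiates candidates from templates).

```python
def applies(rule, feats, tag):
    """Rule is (from_tag, to_tag, feature_name, value); it looks at one word only."""
    from_tag, _to_tag, name, value = rule
    return tag == from_tag and feats.get(name) == value


def evaluate(rule, corpus, current):
    """Return (number of applications, net gain in correct tags) on the corpus."""
    applications = gain = 0
    to_tag = rule[1]
    for (rows, gold), tags in zip(corpus, current):
        for feats, tag, gold_tag in zip(rows, tags, gold):
            if applies(rule, feats, tag):
                applications += 1
                gain += (to_tag == gold_tag) - (tag == gold_tag)
    return applications, gain


def learn(corpus, baseline, candidates, min_applications=3):
    """Greedy TBL sketch: corpus is a list of (rows, gold_tags) pairs."""
    current = [list(baseline(rows)) for rows, _ in corpus]
    learned = []
    while True:
        best, best_gain = None, 0
        for rule in candidates:
            n, gain = evaluate(rule, corpus, current)
            if n >= min_applications and gain > best_gain:
                best, best_gain = rule, gain
        if best is None:  # no remaining rule improves the labeling
            return learned
        learned.append(best)
        current = [
            [best[1] if applies(best, f, t) else t for f, t in zip(rows, tags)]
            for (rows, _), tags in zip(corpus, current)
        ]


def all_outside(rows):
    return ["O"] * len(rows)


toy_sentence = ([{"side": "L"}, {"side": "R"}], ["B-A0", "B-A1"])
toy_candidates = [("O", "B-A0", "side", "L"), ("O", "B-A1", "side", "R")]
toy_rules = learn([toy_sentence] * 3, all_outside, toy_candidates)
print(toy_rules)
```

With only two copies of the toy sentence, neither candidate reaches the cutoff of three applications, so no rules are learned. The loop always terminates because every accepted rule strictly increases the number of correct tags.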
Results
Overtraining is an issue
Core arguments easier than modifiers
Error analysis
Pros/cons of TBL
  – Pro: easy conditioning on many factors
  – Con: Little control over trade-off between rule frequency and rule type in selecting rules
  – Con: Predictive features which are correlated with one another may not be used jointly
  – Con: No real probabilistic framework
Problems with low-freq. roles
Error analysis
Dependency on length
Potential improvements
Phrase-by-phrase labeling
Using ‘official’ baseline
Rules in ordered sets?
Global optimization
Additional features