Published by Terence Reed. Modified over 9 years ago.
Generating Confusion Sets for Context-Sensitive Error Correction
Alla Rozovskaya and Dan Roth
{rozovska, danr}@illinois.edu

This work is supported by a grant from the US Department of Education.

Preposition Usage Errors by English as a Second Language (ESL) Learners

“They ate by* their hands.” The writer used by instead of with.

Preposition Errors and ESL

Preposition errors are very common among ESL learners.
Preposition errors are influenced by the writer's first language (L1).
Not all preposition confusions are equally likely (Han et al. ’10, Rozovskaya & Roth ’10a).

The Annotated ESL Corpus

63,000 words of ESL writing, annotated for article and preposition errors and for other grammar and lexical errors (Rozovskaya & Roth ’10a). Data from speakers of 9 first languages. 4,185 prepositions, of which 352 (8.4%) are erroneous.

Preposition errors in the ESL data:

Source language   Total preps.   Incorrect preps.   Error rate
Chinese           953            144                15.1%
Czech             627            28                 4.5%
Italian           687            43                 6.3%
Russian           1210           85                 7.0%
Spanish           708            52                 7.3%
All               4185           352                8.4%

Preposition Error Correction as a Multi-class Classification Problem

A multi-class classifier is trained; each class corresponds to a distinct preposition. Usually, one considers the top 9 to 34 English prepositions.

Confusion Sets for Preposition Error Correction

A confusion set (candidate set) for a preposition p_i is the set of prepositions considered as corrections for p_i. With standard confusion sets, every participating preposition is viewed as a valid correction for p_i (De Felice & Pulman ’08, Tetreault & Chodorow ’08, Gamon et al. ’08). To narrow down the candidates, we need knowledge about which prepositions can serve as valid corrections. We narrow down the candidates by L1 (the writer's first language).

Example: “They ate by* their hands.”
(1) Standard confusion sets: 10 correction candidates for by: {about, on, p3, …p10}.
(2) L1-dependent confusion sets: exclude candidates not seen as corrections for by in the ESL texts: Russian {by, with, of}; Chinese {by, with, in}.
(3) L1-dependent weighted confusion sets: enhanced with a probability for each candidate.
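
The difference between a standard and an L1-dependent confusion set for “They ate by* their hands.” can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name pick_correction, the dictionary layout, and the classifier scores are all assumptions invented for the example; only the Russian and Chinese candidate sets for by come from the text above.

```python
from typing import Mapping, Optional

# L1-dependent confusion sets for the source preposition "by", as listed above.
L1_CONFUSION_SETS = {
    "Russian": {"by": {"by", "with", "of"}},
    "Chinese": {"by": {"by", "with", "in"}},
}


def pick_correction(source: str,
                    scores: Mapping[str, float],
                    l1: Optional[str] = None) -> str:
    """Return the highest-scoring candidate allowed for this source preposition.

    Hypothetical helper: `scores` stands in for the per-class scores of a
    multi-class classifier applied to the context around `source`.
    """
    # Standard confusion set: every preposition the model knows is a candidate.
    candidates = set(scores)
    if l1 is not None:
        allowed = L1_CONFUSION_SETS.get(l1, {}).get(source)
        if allowed:
            # L1-dependent set: drop candidates never seen as corrections.
            # Fall back to the full set if the intersection is empty.
            candidates = (candidates & allowed) or candidates
    # Sort first so that ties are broken alphabetically (deterministic result).
    return max(sorted(candidates), key=lambda p: scores[p])


# Made-up scores for the blank in "They ate ___ their hands."
scores = {"about": 0.1, "on": 0.3, "by": 0.9, "with": 1.2, "of": 0.2, "in": 1.5}

print(pick_correction("by", scores))             # standard confusion set
print(pick_correction("by", scores, "Russian"))  # restricted to {by, with, of}
```

With these invented scores, the unrestricted call picks in, while the Russian confusion set rules that candidate out and the call returns with. The restriction here is applied only at decision time; the trained model itself is unchanged.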
Experiments

Models are trained on the top 10 prepositions on native English data using the Averaged Perceptron algorithm with LBJ (Rizzolo & Roth ’07).
(1) Standard confusion sets.
(2) L1-dependent confusion sets: bad candidates are excluded at decision time.
(3) L1-dependent weighted confusion sets: bad candidates are excluded in training. Artificial preposition errors are added in training, using the error distributions of speakers of the given L1 (Rozovskaya & Roth ’10b).

Experimental Results

L1-dependent confusion sets are superior to the standard confusion sets: at the same recall points, the models with restricted confusion sets obtain consistently better precision. Using knowledge about the likelihood of each preposition confusion (weighted confusion sets) is even more effective (statistically significant at p < 0.001, using McNemar’s test).

Contributions

Problem: multi-class classification with a very large number of classes.
Our approach: narrow down the candidates.
We propose to narrow down candidates instead of considering all possible classes. We propose methods to narrow down candidates at decision time and in training. We narrow down preposition correction candidates using knowledge about typical errors observed in writers with a given first language (L1).

Selected References

M. Gamon, J. Gao, C. Brockett, A. Klementiev, W. Dolan, D. Belenko, and L. Vanderwende. 2008. Using contextual speller techniques and language modeling for ESL error correction. IJCNLP.
N. Han, J. Tetreault, S. Lee, and J. Ha. 2010. Using an error-annotated learner corpus to develop an ESL/EFL error correction system. LREC.
A. Rozovskaya and D. Roth. 2010a. Annotating ESL errors: Challenges and rewards. NAACL-BEA workshop.
A. Rozovskaya and D. Roth. 2010b. Training paradigms for correcting errors in grammar and usage. NAACL.
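
As a supplementary illustration of the training setup described under Experiments, the step of adding artificial preposition errors to native training data can be sketched as follows. This is one possible reading under stated assumptions, not the procedure from Rozovskaya & Roth ’10b: the function inject_errors, the table CONFUSION_PROBS, and every probability in it are hypothetical and are not taken from the annotated corpus.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical distribution for one L1: for each correct preposition, the
# probability of each preposition a learner might write in its place.
# The numbers are invented for illustration only.
CONFUSION_PROBS: Dict[str, Dict[str, float]] = {
    "with": {"with": 0.90, "by": 0.07, "of": 0.03},
    "in": {"in": 0.92, "on": 0.05, "at": 0.03},
}


def inject_errors(examples: List[Tuple[str, str]],
                  probs: Dict[str, Dict[str, float]],
                  rng: random.Random) -> List[Tuple[str, str, str]]:
    """Turn (context, correct_prep) pairs into (context, source_prep, correct_prep).

    The source preposition is sampled from the L1 distribution, so the
    training data contains confusions at roughly the rates given in `probs`.
    The correct preposition stays the label.
    """
    noisy = []
    for context, gold in examples:
        # Prepositions without an entry are left unchanged.
        dist = probs.get(gold, {gold: 1.0})
        preps = list(dist)
        weights = [dist[p] for p in preps]
        source = rng.choices(preps, weights=weights, k=1)[0]
        noisy.append((context, source, gold))
    return noisy


rng = random.Random(0)  # fixed seed so repeated runs give the same sample
train = [
    ("They ate ___ their hands", "with"),
    ("She lives ___ Paris", "in"),
    ("He went ___ school", "to"),
]
noisy = inject_errors(train, CONFUSION_PROBS, rng)
for row in noisy:
    print(row)
```

Because a preposition that never appears in the distribution for a given label is never sampled as its source, unlikely candidates are effectively excluded during training, and the remaining ones occur in proportion to their weights.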