Presentation transcript:

Page 1 NAACL-HLT 2010, Los Angeles, CA Training Paradigms for Correcting Errors in Grammar and Usage Alla Rozovskaya and Dan Roth University of Illinois at Urbana-Champaign

Page 2 Error correction tasks
Context-sensitive spelling mistakes:
- I would like a peace*/piece of cake.
English as a Second Language (ESL) mistakes:
- Mistakes involving prepositions: To*/in my mind, this is a serious problem.
- Mistakes involving articles: Nearly species of plants are under the*/a serious threat of disappearing. Laziness is the engine of the*/NONE progress (here the correct choice is no article).

Page 3 The standard training paradigm for error correction
Example: Correcting article mistakes [Izumi et al., '03; Han et al., '06; De Felice and Pulman, '08; Gamon et al., '08]
- Cast the problem as a classification task
- Provide a set of candidates: {a, the, NONE}
- Task: select the appropriate candidate in context
- Define features based on the surrounding context and train a classifier on correct (native) data
Laziness is the engine of ___ progress [the]
Features: w1B=of, w1A=progress, w2Bw1B=engine-of, ...

Page 4 The standard training paradigm for error correction
- Correcting article mistakes [Izumi et al., '03; Han et al., '06; De Felice and Pulman, '08; Gamon et al., '08]
- Correcting preposition mistakes [Eeg-Olofsson and Knutsson, '03; Gamon et al., '08; Tetreault and Chodorow, '08; others]
- Context-sensitive spelling correction [Golding and Roth, '96, '99; Carlson et al., '01; others]

Page 5 But this is a paradigm for a selection task!
Selection task (e.g., WSD):
- We have a set of candidates
- Task: select the correct candidate from the set
The selection paradigm is appropriate for WSD, because there is no proposed candidate in the context.

Page 6 The typical error correction training paradigm is the paradigm of a selection task! Why?
- Easy to obtain training data: correct text can be used
- No need for annotation

Page 7 Outline
- The error correction task: Problem statement
- The typical training paradigm: does selection rather than correction
- Selection versus correction
  - What is the appropriate training paradigm for the correction task?
- The ESL corpus
- Training paradigms for the error correction task
  - Key idea
  - Methods of error generation
- Experiments
- Conclusions

Page 8 Selection tasks versus error correction tasks
Article selection task: Nearly species of plants are under ___ serious threat of disappearing.
Article correction task: Nearly species of plants are under the serious threat of disappearing. (the writer's article, the, is the source)
Set of candidates: {a, the, NONE}

Page 9 Correction versus selection
Article selection classifier:
- Accuracy on native English data: 87-90%
- Baseline for the article selection task: 60-70% (always choose the most common article)
On non-native data, the writer's own selections are already >90% accurate (error rate = 10%): if we simply keep the writer's choices, the results are very good already!
Conclusion: we need to use the proposed candidate, or we will make more mistakes than there are in the data. With a selection model, the proposed candidate can only be used as a confidence threshold at decision time. Can we do better if we use the proposed candidate in training?
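To make the threshold idea on this slide concrete, here is a minimal sketch (my illustration, not code from the paper) of using a selection classifier for correction: keep the writer's article unless another candidate beats it by a margin. The function name, score format, and margin value are all assumptions.

```python
# Hypothetical decision rule: trust the writer's article unless an
# alternative candidate wins by a comfortable margin.
def correct_article(scores, source, margin=0.2):
    """scores: classifier confidence for each candidate in {a, the, NONE};
    source: the article the writer used; margin: assumed threshold."""
    best = max(scores, key=scores.get)
    if best != source and scores[best] - scores[source] >= margin:
        return best    # propose a correction
    return source      # keep the writer's choice

# The classifier prefers NONE strongly enough to override the writer's "the".
print(correct_article({"a": 0.15, "the": 0.25, "NONE": 0.60}, source="the"))
```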

Page 10 The proposed article is a useful resource
We want to use the proposed article in training:
- 90% of articles are used correctly
- Article mistakes are not random
Selection paradigm: can we use the proposed candidate in training? No: in native data, the proposed article always corresponds to the label.

Page 11 How can we use the proposed article in training?
Using annotated data for training: Laziness is the engine of the*/NONE progress. (the writer's article is the source; the annotated correction is the label)
But annotating data for training is expensive.
- We need a method to generate training data for the error correction task without expensive annotation.

Page 12 Contributions of this work
- We propose a method to generate training data for the error correction task, avoiding expensive data annotation
- We use the generated data to train classifiers in the correction paradigm, with the proposed candidate available in training
- We show that the error correction training paradigms are superior to the selection training paradigm

Page 13 Outline
- The error correction task: Problem statement
- The typical training paradigm: does selection rather than correction
- Selection versus correction
  - What is the appropriate training paradigm for correction?
- The ESL corpus
- Training paradigms for the error correction task
  - Key idea
  - Methods of error generation
- Experiments
- Conclusions

Page 14 The annotated ESL corpus
- Annotated a corpus of ESL sentences (60K words)
- Extracted from two corpora of ESL essays: ICLE [Granger et al., '02] and CLEC [Gui and Yang, '03]
- Sentences written by ESL students with 9 different first languages
- Each sentence is fully corrected and error-tagged
- Annotated by native English speakers
- Experiments: Chinese, Czech, Russian

Page 15 The annotated ESL corpus
[Screenshot: annotating ESL sentences with the annotation tool; a sentence displayed for annotation]

Page 16 The annotated ESL corpus
Each sentence is fully corrected and error-tagged; for details about the annotation, see [Rozovskaya and Roth, '10, NAACL-BEA5]
- Before annotation: "This time asks for looking at things with our eyes opened."
- With annotation comments: "This age, asks $us$ for looking *look* at things with our eyes opened."
- After annotation: "This period asks us to look at things with our eyes opened."

Page 17 Outline
- The error correction task: Problem statement
- The typical training paradigm: does selection rather than correction
- Selection versus correction
  - What is the appropriate training paradigm for correction?
- The ESL data used in the evaluation
- Training paradigms for the error correction task
  - Key idea
  - Methods of error generation
- Experiments
- Conclusions

Page 18 Training paradigms for the error correction task
Key idea: we want to be able to use the proposed candidate in training
- Generate artificial article errors in native training data
- The source article can then be used in training as a feature
Constraint: we want the training data to be similar to non-native text
- Other works that use artificial errors do not take the error patterns of non-native data into account [Sjöbergh and Knutsson, '05; Brockett et al., '06; Foster and Andersen, '09]

Page 19 Training paradigms for the error correction task
We examine article errors in the annotated data and add errors selectively, mimicking:
- the article distribution
- the error rate
- the error patterns of the non-native text

Page 20 Error rates in article usage
Article mistakes are very common among non-native speakers of English:
- TOEFL essays by Russian, Chinese, and Japanese speakers: 13% of noun phrases have article mistakes [Han et al., '06]
- Essays by advanced Chinese, Czech, and Russian learners of ESL: 10% of noun phrases have article mistakes

Page 21 Distribution of articles in the annotated ESL data
[Table: for Chinese, Czech, and Russian ESL data and for English Wikipedia: total examples, error rate, total errors, and the distribution over the classes a/the/NONE; the numeric values did not survive transcription]
This error rate sets the baseline for the task at around 90%.

Page 22 Distribution of article errors in the annotated ESL text
- Not all confusions are equally likely
- Errors depend on the first language of the writer

Page 23 Characteristics of the non-native data: Summary
- Article distribution
- Error rates
- Error patterns
We use this knowledge to generate errors for the error correction training paradigms.

Page 24 Error correction training paradigm 1: General
Add errors uniformly at random with error rate conf, where conf ∈ {5%, 10%, 12%, 14%, 16%, 18%}
Example: let the error rate be 10%; each article is then replaced by each of the two alternatives with probability 0.05:
- replace(the, a, 0.05), replace(the, NONE, 0.05)
- replace(a, the, 0.05), replace(a, NONE, 0.05)
- replace(NONE, a, 0.05), replace(NONE, the, 0.05)
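A minimal sketch of this corruption step, reconstructed directly from the replace(...) operations above (the function and variable names are mine, not the authors'):

```python
import random

CANDIDATES = ["a", "the", "NONE"]

def corrupt_general(label, conf=0.10, rng=random):
    """Given the correct article `label` from native text, return an
    artificial source article: unchanged with probability 1 - conf,
    otherwise one of the two alternatives uniformly (conf/2 each)."""
    if rng.random() < conf:
        return rng.choice([c for c in CANDIDATES if c != label])
    return label

# Each (source, label) pair becomes a training example in which the
# source article can be used as a feature.
pairs = [(corrupt_general(lbl), lbl) for lbl in ["the", "NONE", "a", "the"]]
print(pairs)
```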

Page 25 Error correction training paradigm 2: ArticleDistr
Mimic the distribution of the ESL source articles in the training data
[Table: distribution over a/the/NONE in Czech ESL data versus English Wikipedia; the numeric values did not survive transcription]
Example: for the, apply replace(the, a, p1) and replace(the, NONE, p2)
Constraints:
- (1) ProbTrain(the) = ProbCzech(the)
- (2) p1, p2 ≥ minConf, where minConf ∈ {0.02, 0.03, 0.04, 0.05}
A linear program is set up to find p1 and p2.
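Below is one way such a linear program could be set up, sketched with scipy; the formulation, the objective (minimizing total corruption as a tiebreak among feasible solutions), and all numeric distributions are my assumptions, since the slide's actual numbers did not survive transcription. The unknowns are the six confusion probabilities, the equality constraints force the corrupted training data's article distribution to match the ESL source-article distribution, and the bounds enforce minConf.

```python
import numpy as np
from scipy.optimize import linprog

# Unknowns, in order: p[0]=P(the->a), p[1]=P(the->NONE),
#                     p[2]=P(a->the), p[3]=P(a->NONE),
#                     p[4]=P(NONE->the), p[5]=P(NONE->a)
native = np.array([0.29, 0.10, 0.61])  # P(the), P(a), P(NONE) in native text (invented)
target = np.array([0.22, 0.09, 0.69])  # ESL source-article distribution (invented)
minConf = 0.02

# Marginal after corruption, e.g. for "the":
#   P'(the) = P(the)(1 - p0 - p1) + P(a) p2 + P(NONE) p4 == target(the)
A_eq = np.array([
    [-native[0], -native[0],  native[1],        0.0,  native[2],        0.0],  # the
    [ native[0],        0.0, -native[1], -native[1],        0.0,  native[2]],  # a
])
b_eq = target[:2] - native[:2]  # the third constraint is redundant (sums to 1)

# Feasibility LP: minimize total corruption subject to the constraints.
res = linprog(c=np.ones(6), A_eq=A_eq, b_eq=b_eq,
              bounds=[(minConf, 0.5)] * 6)
print(res.x if res.success else "infeasible for this minConf")
```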

Page 26 Error correction training paradigm 3: ErrorDistr
Add article mistakes to mimic the error rate and the confusion patterns observed in the ESL data.
Example: Chinese, error rate 9.2%. [Table: article confusions by error type; the numeric values did not survive transcription]
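A sketch of the ErrorDistr corruption step (my illustration; the confusion probabilities below are placeholders, since the slide's Chinese confusion table was lost in transcription): errors are sampled per correct article from an empirical confusion distribution estimated on the annotated ESL data.

```python
import random

# Placeholder confusion matrix: CONFUSIONS[label][source] is the probability
# that an ESL writer produces `source` where `label` is correct. In practice
# it would be estimated from the annotated data; these numbers are invented.
CONFUSIONS = {
    "the":  {"the": 0.92, "a": 0.03, "NONE": 0.05},
    "a":    {"a": 0.94, "the": 0.04, "NONE": 0.02},
    "NONE": {"NONE": 0.95, "the": 0.03, "a": 0.02},
}

def corrupt_errordistr(label, rng=random):
    """Sample an artificial source article for a correct `label`."""
    outcomes = list(CONFUSIONS[label])
    weights = [CONFUSIONS[label][o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

print([(corrupt_errordistr(lbl), lbl) for lbl in ["the", "a", "NONE"]])
```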

Page 27 Error correction training paradigms: Summary
Key idea: generate artificial errors in native training data
- We can use the source article in training as a feature
- Important constraint: the errors mimic the error patterns of the ESL text, both the error rate and the distribution of the different article confusions

Page 28 Error correction training paradigms: Costs
The 3 error generation methods use different knowledge and therefore have different costs:
- Paradigm 1, General (error rate in the data)
- Paradigm 2, ArticleDistr (distribution of articles in the ESL data): no annotation required
- Paradigm 3, ErrorDistr (error rate and article confusions): requires annotated data (the most costly method)

Page 29 Outline
- The error correction task: Problem statement
- The typical training paradigm: does selection rather than correction
- Selection versus correction
  - What is the appropriate training paradigm for correction?
- The ESL data used in the evaluation
- Training paradigms for the error correction task
  - Key idea
  - Methods of error generation
- Experiments
- Conclusions

Page 30 Experimental setup
- A TrainClean classifier is trained on clean native text, using the selection paradigm
- 3 classifiers are Trained With artificial Errors (the TWE classifiers), one per error-generation paradigm
- All classifiers use online learning with the Averaged Perceptron algorithm
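For reference, here is a compact sketch of multiclass averaged perceptron training, the algorithm named on this slide; this is the generic textbook version with the standard lazy-averaging trick, not the authors' actual implementation.

```python
from collections import defaultdict

def train_averaged_perceptron(examples, classes, epochs=5):
    """examples: list of (feature dict, gold label) pairs."""
    w = {c: defaultdict(float) for c in classes}  # current weights
    u = {c: defaultdict(float) for c in classes}  # timestamp-weighted updates
    t = 1
    for _ in range(epochs):
        for feats, gold in examples:
            pred = max(classes,
                       key=lambda c: sum(w[c][f] * v for f, v in feats.items()))
            if pred != gold:  # mistake-driven update
                for f, v in feats.items():
                    w[gold][f] += v;  u[gold][f] += t * v
                    w[pred][f] -= v;  u[pred][f] -= t * v
            t += 1
    # Averaged weights: w_avg = w - u / t (lazy-averaging trick).
    return {c: {f: w[c][f] - u[c][f] / t for f in w[c]} for c in classes}

# Usage: examples = [({"w1B=of": 1.0, "w1A=progress": 1.0}, "NONE"), ...]
# weights = train_averaged_perceptron(examples, ["a", "the", "NONE"])
```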

Page 31 Features
Features are based on the 3-word window around the target: If we take [a] brief look back
POS-tagged: if-IN we-PRP take-VBP [a] brief-JJ look-NN back-RB
- Word features: headWord=look, w3B=if, w2B=we, w1B=take, w1A=brief, etc.
- Tag features: p3B=IN, p2B=PRP, etc.
- Composite features: w2Bw1B=we-take, w1Bw1A=take-brief, etc.
- Source feature: the writer's article (TWE systems only)
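A small sketch of this feature extraction, reconstructing the slide's naming scheme (wNB = N-th word before the target, wNA = N-th word after, pNB = the corresponding POS tag); the helper function and padding token are my own.

```python
def article_features(words, tags, i, source=None):
    """Features for an article site at position i, following the slide's
    naming. `source` is the writer's article, used by the TWE systems only.
    (The real system also uses the NP head word, e.g. headWord=look;
    finding the head requires a chunker and is omitted here.)"""
    def w(j):  # safe word lookup with sentence-boundary padding
        return words[j] if 0 <= j < len(words) else "<PAD>"
    def p(j):  # safe tag lookup
        return tags[j] if 0 <= j < len(tags) else "<PAD>"
    feats = {
        "w3B": w(i - 3), "w2B": w(i - 2), "w1B": w(i - 1), "w1A": w(i + 1),
        "p3B": p(i - 3), "p2B": p(i - 2),
        "w2Bw1B": w(i - 2) + "-" + w(i - 1),  # composite: we-take
        "w1Bw1A": w(i - 1) + "-" + w(i + 1),  # composite: take-brief
    }
    if source is not None:
        feats["source"] = source              # TWE systems only
    return feats

words = "if we take [a] brief look back".split()
tags = ["IN", "PRP", "VBP", "DT", "JJ", "NN", "RB"]
print(article_features(words, tags, i=3, source="a"))
```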

Page 32 Performance on the data by Russian speakers

Training paradigm    | Accuracy | Error reduction
---------------------|----------|----------------
Baseline             | 90.03%   |
TrainClean           | 90.62%   | 5.92%
TWE (General)        | 91.25%   | 12.24%
TWE (ArticleDistr)   | 91.52%   | 14.94%
TWE (ErrorDistr)     | 91.63%   | 16.05%

- All TWE classifiers outperform the selection-paradigm TrainClean, for all languages
- On average, TWE (ErrorDistr) provides the best improvement
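A note on the numbers (my reading, confirmed by recomputing every entry): error reduction is the fraction of the reference system's residual errors that is removed, reduction = (acc - ref) / (100% - ref). In the table above the reference is the baseline; in the per-language table on the next slide it is TrainClean. A quick check:

```python
def error_reduction(acc, ref):
    """Fraction of the reference's residual errors removed (percentages)."""
    return 100 * (acc - ref) / (100 - ref)

# Page 32: TWE (ErrorDistr) vs. the Russian baseline.
print(round(error_reduction(91.63, 90.03), 2))  # 16.05
# Page 33: Chinese TWE vs. TrainClean.
print(round(error_reduction(92.67, 91.85), 2))  # 10.06
```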

Page 33 Improvement due to training with errors

Source language | Baseline | TrainClean | TWE    | Error reduction
----------------|----------|------------|--------|----------------
Chinese         | 92.03%   | 91.85%     | 92.67% | 10.06%
Czech           | 90.88%   | 91.82%     | 92.22% | 4.89%
Russian         | 90.03%   | 90.62%     | 91.63% | 10.77%

Page 34 Conclusions
- We argued that the error correction task should be studied in the error correction paradigm rather than in the current selection paradigm: the baseline for the error correction task is high, and mistakes are not random
- We proposed a method to generate training data for error correction tasks using artificial errors: the artificial errors mimic the error rates and error patterns of the non-native text, and the method allows us to train with the proposed candidate, in the paradigm of error correction
- The error correction training paradigms are superior to the typical selection training paradigm

Page 35 Thank you! Questions?