The University of Illinois System in the CoNLL-2013 Shared Task
Alla Rozovskaya, Kai-Wei Chang, Mark Sammons, Dan Roth
Cognitive Computation Group, University of Illinois at Urbana-Champaign, Urbana, IL
CoNLL Shared Task 2013

Introduction
The task of correcting grammar and usage mistakes made by English as a Second Language (ESL) writers is difficult for several reasons:
1. Many of these errors are context-sensitive mistakes that confuse valid English words.
2. The relative frequency of mistakes is quite low.
3. An ESL writer may make multiple mistakes in a single sentence.

Task Description
The CoNLL-2013 shared task focuses on correcting five types of mistakes that are commonly made by non-native speakers of English.
The training data released by the task organizers comes from the NUCLE corpus (Dahlmeier et al., 2013).
The test data for the task consists of an additional set of 50 student essays.

Task Description (Cont’d)
[Slide image not transcribed]


System Components (1) – Determiners (1/4)
There are three types of determiner error:
1. Omitting a determiner
2. Choosing an incorrect determiner
3. Adding a spurious determiner
The system first extracts from the data all articles, together with all positions at the beginning of a noun phrase where an article is likely to have been omitted. Then we train a multiclass classifier over these candidates, using features drawn from the surrounding context (a sketch of this extraction step follows).
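A minimal sketch of the candidate-extraction step described above. The function names and feature templates are illustrative assumptions, not taken from the paper; in particular, `np_starts` stands in for whatever shallow-parser output marks noun-phrase boundaries.

```python
# Illustrative sketch of determiner-candidate extraction (names are
# hypothetical). For every article and every NP-initial position
# lacking one, emit a candidate whose label is the correct choice:
# "a", "the", or "" (no article).

ARTICLES = {"a", "an", "the"}

def determiner_candidates(tokens, np_starts):
    """tokens: list of words; np_starts: indices where a noun phrase
    begins, as produced by a shallow parser (assumed available)."""
    candidates = []
    for i, tok in enumerate(tokens):
        if tok.lower() in ARTICLES:
            candidates.append((i, tok.lower()))   # existing article
    for i in np_starts:
        if tokens[i].lower() not in ARTICLES:
            candidates.append((i, ""))            # possible omission site
    return candidates

def context_features(tokens, i, source):
    """Word features around the candidate plus the source determiner."""
    left = tokens[max(0, i - 2):i]
    right = tokens[i + 1:i + 3] if source else tokens[i:i + 2]
    feats = {f"w-{k}={w}" for k, w in enumerate(reversed(left), 1)}
    feats |= {f"w+{k}={w}" for k, w in enumerate(right, 1)}
    feats.add(f"source={source or 'NONE'}")
    return feats
```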

System Components (1) – Determiners (2/4)
[Slide image not transcribed]

System Components (1) – Determiners (3/4)
We adopt the error inflation method proposed in Rozovskaya et al. (2012).
–This method prevents the source feature from dominating the context features, and improves the recall of the system.
We experimented with two types of classifiers:
1. Averaged Perceptron (AP)
2. L1-regularized logistic regression classifier (LR)
We also experimented with adding a language model (LM) feature to the LR learner.
–The LM was trained on the Google corpus.
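A sketch of the error-inflation idea: when training examples are generated from (mostly correct) text, the writer's own choice becomes the "source" feature, and with a small probability that source is corrupted so the classifier cannot simply learn to trust it. The inflation rate below is a made-up placeholder, not the value from the paper.

```python
import random

CONFUSIONS = ["a", "the", ""]   # "" = no article
INFLATION_RATE = 0.05           # hypothetical; in practice tuned on dev data

def inflate_source(gold_label):
    """Return the source-feature value for a training example whose
    correct label is gold_label, corrupting it at a fixed rate so the
    source feature cannot dominate the context features."""
    if random.random() < INFLATION_RATE:
        alternatives = [c for c in CONFUSIONS if c != gold_label]
        return random.choice(alternatives)
    return gold_label
```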

System Components (1) – Determiners (4/4)
[Slide image not transcribed]

System Components (2) – Prepositions
The most common preposition errors are replacements, where the writer correctly recognized the need for a preposition but chose the wrong one.
All features used in the preposition module are lexical:
–Word n-grams in the 4-word window around the target preposition
The NB-priors classifier, which is part of our model, can only make use of the word n-gram features.
–It uses n-gram features of lengths 3, 4, and 5.
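A minimal sketch of how such lexical features might be enumerated; the exact feature templates in the paper may differ.

```python
def preposition_ngrams(tokens, i, min_n=3, max_n=5, window=4):
    """tokens[i] is the target preposition. Yield all word n-grams of
    length min_n..max_n that contain the target, restricted to the
    `window` words on each side of it."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    feats = []
    for n in range(min_n, max_n + 1):
        for start in range(lo, hi - n + 1):
            if start <= i < start + n:          # n-gram must span the target
                gram = " ".join(tokens[start:start + n])
                feats.append((n, start - i, gram))   # length, offset, text
    return feats
```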

System Components (2) – Prepositions (Cont’d)
The NB-priors model does not target preposition omissions and insertions.
–It corrects only preposition replacements that involve the 12 most common English prepositions.
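A sketch of an NB-style scorer over the n-gram features. The real NB-priors model sets class priors from the writer's original choice (the source preposition); everything else here, including the counts, smoothing, and the particular 12 prepositions listed, is an illustrative placeholder.

```python
import math

PREPOSITIONS = ["of", "in", "to", "for", "on", "with",
                "at", "by", "from", "about", "as", "over"]  # illustrative

def nb_score(prep, feats, feat_counts, class_counts, prior, vocab_size):
    """log P(prep) + sum over features of log P(feat | prep),
    with add-one smoothing over a vocab_size feature vocabulary."""
    score = math.log(prior[prep])
    for f in feats:
        num = feat_counts.get((prep, f), 0) + 1
        den = class_counts.get(prep, 0) + vocab_size
        score += math.log(num / den)
    return score

def best_preposition(feats, feat_counts, class_counts, prior, vocab_size):
    return max(PREPOSITIONS,
               key=lambda p: nb_score(p, feats, feat_counts,
                                      class_counts, prior, vocab_size))
```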

System Components (3) – Nouns and Verbs (1/3)
The three remaining types of errors are corrected using separate NB models, also trained on the Google corpus.
–We focus here on the selection of candidates for correction, as this strongly affects performance.
This stage selects the set of words that are presented as input to the classifier.
–This is a crucial step because it limits the performance of any system.
–It is also a challenging step because the classes of verbs and nouns are open.

System Components (3) – Nouns and Verbs (2/3)
We use the POS tags and the shallow parser output to identify the set of candidates that are input to the classifiers.
–For nouns, we collect all words tagged as NN or NNS.
–For verbs, we compared several candidate selection methods:
1. Extract all verbs heading a verb phrase, as identified by the shallow parser.
2. Expand this set to words tagged with one of the verb POS tags {VB, VBN, VBG, VBD, VBP, VBZ}.
3. Further add words that appear in a lemma list of common English verbs compiled using the Gigaword corpus. This is the method we use (sketched below).
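A sketch of the third candidate-selection method as the slide describes it: the union of shallow-parse VP heads, verb-POS-tagged words, and words found in the precompiled common-verb lemma list. The file name for the lemma list is hypothetical.

```python
VERB_TAGS = {"VB", "VBN", "VBG", "VBD", "VBP", "VBZ"}

def load_verb_lemmas(path="gigaword_verb_lemmas.txt"):  # hypothetical file
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

def verb_candidates(tokens, pos_tags, vp_head_indices, verb_lemmas):
    """Union of the three selection methods from the slide."""
    candidates = set(vp_head_indices)                     # method 1: VP heads
    candidates |= {i for i, t in enumerate(pos_tags)
                   if t in VERB_TAGS}                     # method 2: POS tags
    candidates |= {i for i, w in enumerate(tokens)
                   if w.lower() in verb_lemmas}           # method 3: lemma list
    return sorted(candidates)
```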

System Components (3) – Nouns and Verbs (3/3)
[Slide image not transcribed]

Combined Model
The combined model includes all of these modules, each of which is applied to the input independently.
The combined system also includes a postprocessing step where we remove certain corrections of noun and verb forms.
–We found that some changes occur quite often but are never correct.
–Such a change is added to a list of corrections that the system ignores.
Combined results on the training data are computed by 5-fold CV, adding one component at a time.
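A sketch of the combined system with the ignore-list postprocessing step. The module interface and the example ignore-list entry are hypothetical; each module is assumed to expose a `propose(sentence)` method returning (position, original word, proposed word) triples.

```python
# A frequent-but-never-correct change, learned from the training data;
# this particular entry is a hypothetical illustration.
IGNORE_LIST = {("advances", "advance")}

def combined_corrections(sentence, modules):
    """Run every module independently and drop ignored changes."""
    corrections = []
    for module in modules:        # determiner, preposition, noun, verb, ...
        for pos, original, proposed in module.propose(sentence):
            if (original.lower(), proposed.lower()) in IGNORE_LIST:
                continue          # postprocessing: skip known-bad change
            corrections.append((pos, original, proposed))
    return corrections
```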

Combined Model (Cont’d)
[Slide image not transcribed]

Test Results
The performance improves as each component is added to the final system.
–The precision is much higher, while the recall is only slightly lower.
Results are reported against the original official annotations announced by the organizers.
Another set of annotations is also offered, based on the combination of revised official annotations and accepted alternative annotations proposed by participants.

Test Results (Cont’d)
[Slide image not transcribed]

Error Analysis (1/2)
1. Incorrect verb form correction
–This correction would be needed in tandem with the insertion of the auxiliary “have” to create a passive construction, to maintain the same word order.
2. Incorrect determiner insertion
–This example requires a model of discourse, at the level of recognizing when a specific disease is a focus of the text rather than disease in general.

Error Analysis (2/2)
3. Incorrect verb number correction
–This appears to be the result of the system heuristics intended to mitigate POS tagging errors on ESL text, where the word “need” is considered a candidate verb rather than a noun.
4. Incorrect determiner deletion
–In this example, local context may suggest a list structure, but the wider context indicates that the comma represents an appositive structure.

Discussion (1/2)
The presence of multiple errors can have very negative effects on preprocessing.
–This limits the potential of rule-based approaches.
Machine learning approaches require sufficient examples of each error type to allow robust statistical modeling of contextual features.
–This motivates our use of just POS and shallow parse analysis.
Language modeling approaches can use counts derived from very large native corpora to provide robust inputs for machine learning algorithms.

Discussion (2/2)
The interaction between errors suggests that constraints could be used to improve results, for example by ensuring number agreement between a noun and its verb. This is more difficult than it may first appear, for two reasons:
1. The noun that is the subject of the verb under consideration may be relatively distant in the sentence, due to the presence of intervening relative clauses.
2. The constraint only limits the possible correction options: the correct number for the noun in focus may depend on the form used in the preceding sentences.
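One way such an agreement constraint might be encoded, as a sketch only: filter joint (noun, verb) correction choices so their number features match. Locating the subject across intervening relative clauses, and choosing which agreeing pair is right, are left open; the slide's point is precisely that the constraint alone does not decide this.

```python
def filter_by_agreement(noun_options, verb_options):
    """Each option is (surface_form, number) with number in {"sg", "pl"}.
    Keep only (noun, verb) correction pairs that agree in number."""
    return [(n, v)
            for n, n_num in noun_options
            for v, v_num in verb_options
            if n_num == v_num]

# Example: ("cat", "sg") pairs with ("runs", "sg") but not with ("run", "pl").
```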

Conclusion
We have described our system that participated in the shared task on grammatical error correction and ranked first out of 17 participating teams.
We built specialized models for the five types of mistakes that are the focus of the competition.
We have also presented an error analysis of the system output and discussed possible directions for future work.

THANKS
Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task, pages 13–19, Sofia, Bulgaria, August 2013. © 2013 Association for Computational Linguistics