The CoNLL-2014 Shared Task on Grammatical Error Correction

Presentation transcript:

The CoNLL-2014 Shared Task on Grammatical Error Correction
Rule-based/statistical parts of different teams
David Ling, 20-11-2017

Contents
- Rule-based/statistical parts of different teams
  - Team CAMB
  - Team POST
  - Team UFC
- Suggestions
  - Different approaches for different errors (RB/LM/MT)
  - Annotated scripts of HK students in XML
  - Problematic patterns from non-annotated scripts using language models

Grammar checking approaches
- Rule-based: usually refers to handcrafted rules
  - Low recall but high accuracy
- Statistical: corpus-driven, usually n-gram or parser features (a parser may give incorrect tags if the sentence is problematic)
- Phrase-based statistical machine translation
  - Higher recall, but lower accuracy
- Deep learning: translation
Recent common approach: different methods for different error types

CoNLL-2014 approaches overview
- Many hybrid systems with rule-based (RB) parts
  - LM: language model
  - MT: machine translation
- Simple hand-crafted rules are used as a preliminary step
- Reason: some error types are more regular than others, e.g. subject-verb agreement (SVA)
- Some are hard, e.g. wrong collocation (Wci): "thick rain" → "heavy rain"

Team CAMB (University of Cambridge) (1st)
- Pipeline: rule-based (RB) → phrase-based statistical machine translation (SMT)
- N-gram corpus-driven rules:
  - Derived from the Cambridge Learner Corpus (2003): 6M words with errors and corrections annotated; non-public
  - Extract unigrams, bigrams, and trigrams as errors which (1) are marked as incorrect > 90% of the times they occur and (2) appear more than 5 times (see the extraction sketch below)
- Many mistakes are still not detected, but precision has been found to be more important for the learning effect
- We do not have such a huge annotated corpus
- The SMT part is skipped in this presentation
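A minimal sketch of this extraction step, assuming the annotated corpus has already been reduced to (n-gram, is_error) observations; the data format, function name, and toy numbers are illustrative, not CAMB's actual code.

```python
from collections import Counter

def extract_error_ngrams(observations, min_count=5, error_ratio=0.9):
    """Keep n-grams annotated as incorrect in > error_ratio of their
    occurrences and seen more than min_count times in the corpus."""
    total, errors = Counter(), Counter()
    for ngram, is_error in observations:
        total[ngram] += 1
        if is_error:
            errors[ngram] += 1
    return {ng for ng, n in total.items()
            if n > min_count and errors[ng] / n > error_ratio}

# Toy usage: an n-gram almost always marked wrong, seen 11 times.
obs = [(("a", "informations"), True)] * 10 + [(("a", "informations"), False)]
print(extract_error_ngrams(obs))  # {('a', 'informations')}
```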

Team POST (Pohang University of Science and Technology) (4th)
- Language model + hand-crafted rules (without SMT); quite similar to LanguageTool
- Pipeline: hand-crafted rules → n-gram frequency for replacement
- Language model: noun number errors (cat → cats)
- Hand-crafted rules: insertion (only articles: a, an, the); subject-verb agreement; prepositions
- N-gram frequency: deletion, replacement (raise → rise)

Team POST (4th) - the hand-crafted rules
- Only a few simple hand-crafted rules, using word and POS-tag patterns as the trigger condition (a sketch follows this list)
- VBZ (verb, 3rd-person singular present): eats, jumps, believes, is, has
- NN (singular noun)
- VBG (verb, gerund/present participle): eating, jumping, believing
- VBP (verb, non-3rd-person singular present): eat, jump, believe, am
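The rule below only illustrates this word/POS-pattern style (a generic SVA trigger), not one of POST's published rules; it assumes NLTK is installed together with its tokenizer and perceptron-tagger data.

```python
import nltk  # needs the 'punkt' and 'averaged_perceptron_tagger' data

def sva_flags(sentence):
    """Flag adjacent (plural noun NNS, 3rd-person singular verb VBZ)
    pairs such as 'students eats' as SVA error candidates."""
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    return [(w1, w2) for (w1, t1), (w2, t2) in zip(tagged, tagged[1:])
            if t1 == "NNS" and t2 == "VBZ"]

print(sva_flags("The students eats lunch together."))
# expected: [('students', 'eats')]
```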

Team POST (4th) - n-gram frequency for deletion and replacement
- A list of candidate pairs (only 215 pairs), extracted from NUCLE (about 1,000 essays, from the conference)
- Different candidate pairs have different window sizes, e.g. (1;1), (2;1), (1;3)
- Use Google N-gram corpus frequencies to decide whether to replace
- Schematic example for (too → to) with window (2;1), sentence "I go too school by bus":
  - "I go too school": 0 times vs. "I go to school": 5 times → replace "too" with "to"
- The window size of a pair is the one with the highest accuracy in training (max: 5-gram)
- However, I think the neural-network approach from LanguageTool is better (faster, with less memory); a sketch of the frequency check follows
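A sketch of the window-based frequency comparison; the freq() lookup is stubbed with the slide's example numbers, standing in for real Google N-gram counts.

```python
# Stub counts standing in for the Google N-gram corpus.
COUNTS = {("I", "go", "too", "school"): 0,
          ("I", "go", "to", "school"): 5}

def freq(ngram):
    return COUNTS.get(tuple(ngram), 0)

def should_replace(tokens, i, candidate, left, right):
    """Compare corpus frequencies of the (left;right) window around
    position i with the original word vs. the candidate word."""
    before, after = tokens[i - left:i], tokens[i + 1:i + 1 + right]
    return freq(before + [candidate] + after) > freq(before + [tokens[i]] + after)

sent = "I go too school by bus".split()
if should_replace(sent, 2, "to", left=2, right=1):
    sent[2] = "to"
print(" ".join(sent))  # I go to school by bus
```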

Team UFC (11th) - Université de Franche-Comté
- Title: Grammatical Error Detection and Correction Using Tagger Disagreement
- Pipeline: statistical parsers → handcrafted rules
- Statistical parsers:
  - Stanford Parser: statistical probabilistic parser
  - TreeTagger: decision-tree based
- Deals with 3 types of errors only:
  1. Subject-verb agreement (by Stanford Parser)
  2. Verb form (by Stanford Parser)
  3. Verb form or word form (by disagreement between Stanford Parser and TreeTagger)

Team UFC (11th) - Université de Franche-Comté
(1) Subject-verb agreement (Stanford Parser)
- The parser provides POS tags and word dependencies
- Example: "Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas." → nsubjpass(submitted, Bills)
- Rule: a singular verb (VBZ) → plural form, if it is the root of the dependency tree and is referred to by a plural noun (NNS): "They submits" → "They submit" (see the sketch below)
(2) Verb form (Stanford Parser, hand-crafted rules)
- Use the POS tag of the preceding token
- Modal verb (MD) + singular verb (VBZ) → change the VBZ to the bare infinitive: "David can eats." → "He can eat."
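A sketch of rule (1) using spaCy as a stand-in for the Stanford Parser (assumption: the en_core_web_sm model is installed; dependency labels follow spaCy's scheme, where the passive subject is nsubjpass).

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def sva_errors(text):
    """Flag a root VBZ verb whose (passive) subject is a plural
    noun (NNS), mirroring the 'VBZ -> plural form' rule."""
    hits = []
    for tok in nlp(text):
        if tok.tag_ == "VBZ" and tok.dep_ == "ROOT":
            subjects = [c for c in tok.children
                        if c.dep_ in ("nsubj", "nsubjpass")]
            if any(s.tag_ == "NNS" for s in subjects):
                hits.append((subjects[0].text, tok.text))
    return hits

print(sva_errors("The bills passes the committee."))
# expected: [('bills', 'passes')]
```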

Team UFC (11th) - Université de Franche-Comté
(3) By disagreement between the Stanford Parser and TreeTagger (see the sketch below)
- Vform error: JJ (Stanford) & VB (TreeTagger) → past participle: "a develop country" → "a developed country"
- Wform error: JJ (Stanford) & adverb (TreeTagger) → remove 'ly': "lead to a completely failure" → "lead to a complete failure"
(JJ: adjective; VB: verb, base form)
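The disagreement logic itself is simple to express; below, the two taggers are hypothetical callables (stubs stand in for the Stanford Parser and TreeTagger), and the corrections are crude string edits for illustration.

```python
def disagreements(tokens, tag_a, tag_b):
    """Yield (token, tag_from_a, tag_from_b) wherever the taggers differ."""
    for tok, ta, tb in zip(tokens, tag_a(tokens), tag_b(tokens)):
        if ta != tb:
            yield tok, ta, tb

def correct(tok, ta, tb):
    if ta == "JJ" and tb == "VB":             # Vform: -> past participle
        return tok + "ed"                      # develop -> developed (crude)
    if ta == "JJ" and tb == "RB" and tok.endswith("ly"):
        return tok[:-2]                        # Wform: completely -> complete
    return tok

toks = ["a", "develop", "country"]
stanford = lambda ts: ["DT", "JJ", "NN"]    # stub parser output
treetagger = lambda ts: ["DT", "VB", "NN"]  # stub tagger output
for t, ta, tb in disagreements(toks, stanford, treetagger):
    print(t, "->", correct(t, ta, tb))       # develop -> developed
```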

Conclusions: different methods for different error types
- Errors with a regular pattern → simple handcrafted rules:
  - SVA (subject-verb agreement), Vform (verb form), Mec (spelling, punctuation), ArtOrDet (article or determiner), Nn (noun number)
- Errors depending on nearby words → n-gram frequency / LanguageTool neural network:
  - Prep (preposition), Vform (verb form), Wform (word form), Wci (wrong collocation)
- Errors depending on context → machine translation:
  - Um (unclear meaning), Rloc- (redundancy)

Suggested next steps
We need:
1) An annotated corpus (HK students) for performance evaluation
- Suggest following the XML format of NUCLE (28 error types)
- A program for teachers to mark scripts and output them in the XML format
- The statistical findings would also add value to the literature
2) Extracting error patterns from non-annotated scripts via low n-gram probability
- An alternative, since we don't have a large corpus
- For a 5-gram, possible language models are (a count-based sketch follows):
  - Bigram model: p(x1,x2,x3,x4,x5) = p(x1) p(x2|x1) p(x3|x2) ... p(x5|x4)
  - Neural network: p(x1,x2,x3,x4,x5) = f(x1,x2,x3,x4,x5), where the xi are word vectors trained with the neural network f
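A count-based sketch of the bigram model on a toy corpus, with add-one smoothing for unseen bigrams (the corpus and the smoothing choice are illustrative assumptions, not part of the proposal).

```python
from collections import Counter

corpus = "i go to school by bus . i go to work by train .".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)  # vocabulary size for add-one smoothing

def p_cond(w, prev):
    """Add-one smoothed p(w | prev)."""
    return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)

def p_sentence(tokens):
    """p(x1) * p(x2|x1) * ... * p(xn|xn-1)."""
    p = (unigrams[tokens[0]] + 1) / (len(corpus) + V)
    for prev, w in zip(tokens, tokens[1:]):
        p *= p_cond(w, prev)
    return p

print(p_sentence("i go to school by".split()))
```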

Suggested next steps - extract error patterns
- Example sentence from a HSMC student: "Hence , he explain lots of people upload a video to YouTube ."
- N-grams with low frequencies are (1) problematic, (2) bad style (for a second-language learner), or (3) correct but uncommon usage
- 3-gram frequencies in a Wiki corpus:
  "Hence , he": 61; ", he explain": 1; "he explain lots": 0; "explain lots of": 1; "lots of people": 2383; "of people upload": 1; "people upload a": 0; "upload a video": 49; "a video to": 138; "video to YouTube": 401; "to YouTube .": 105
- 2-gram frequencies in a Wiki corpus:
  "Hence ,": 3044; ", he": 365657; "he explain": 10; "explain lots": 0; "lots of": 3700; "of people": 21564; "people upload": 1; "upload a": 33; "a video": 6769; "video to": 744; "to YouTube": 55; "YouTube .": 338
- We may try to group the extracted n-grams and make rules; example: a noun clause is preferred after the word "explain" (a flagging sketch follows)
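A sketch of the flagging step, with the slide's trigram counts hard-coded as a stand-in for a real Wiki n-gram table; anything below the threshold is surfaced for rule-making.

```python
# Trigram counts from the slide, standing in for a Wiki n-gram table.
FREQ = {
    ("Hence", ",", "he"): 61, (",", "he", "explain"): 1,
    ("he", "explain", "lots"): 0, ("explain", "lots", "of"): 1,
    ("lots", "of", "people"): 2383, ("of", "people", "upload"): 1,
    ("people", "upload", "a"): 0, ("upload", "a", "video"): 49,
    ("a", "video", "to"): 138, ("video", "to", "YouTube"): 401,
    ("to", "YouTube", "."): 105,
}

def flag_ngrams(tokens, n=3, threshold=2):
    """Return n-grams whose corpus frequency is below the threshold."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
            if FREQ.get(tuple(tokens[i:i + n]), 0) < threshold]

sent = "Hence , he explain lots of people upload a video to YouTube .".split()
for ng in flag_ngrams(sent):
    print(ng)  # the five low-frequency trigrams around 'explain'/'upload'
```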

Suggested next steps
3) The rule-based part
- Similar to LanguageTool: allow English teachers to write handcrafted rules in XML
- A function to search for false alarms when designing handcrafted rules (sketched below):
  - Assuming texts in Wikipedia are correct, search the corpus for any patterns matching the designed rule
  - This helps a lot in finding exceptions (actually, LanguageTool has this feature online)
4) The language model and machine translation parts
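A sketch of the false-alarm search, using a regex as a stand-in for a LanguageTool XML pattern: run the draft rule over text assumed correct, and every match is a candidate exception.

```python
import re

# Draft rule (hypothetical): flag "a" followed by an -ly word,
# on the guess that the -ly word is a misused adverb.
rule = re.compile(r"\ba \w+ly\b")

# Text assumed correct (in practice, a Wikipedia dump).
reference_text = "He gave a friendly wave and made a lovely gesture."

for m in rule.finditer(reference_text):
    # Matches in correct text are false alarms: 'friendly' and
    # 'lovely' are adjectives, so the draft rule needs exceptions.
    print("possible false alarm:", m.group())
```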

End
References:
- The CoNLL-2014 Shared Task on Grammatical Error Correction
- Google Books N-gram Corpus used as a Grammar Checker (EACL 2012)
- N-gram based Statistical Grammar Checker for Bangla and English
- Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction, EMNLP 2016, pages 1546–1556

Google Books N-gram Corpus used as a Grammar Checker (EACL 2012)
- Google Books N-gram Corpus (up to 5-grams; an n-gram is recorded only if it occurs > 40 times)
- Uses n-gram frequency for 4 types of tasks:
  - Error detection: flag an error if a 2-gram is not found in the database
  - Other tasks: word multiple choice, inflection, fill in the blanks
- N-gram error detection for French (non-native speakers)
- N-gram error detection performs well on lexical selection and prepositions
(tp: true positive; fn: false negative)