Correcting Misuse of Verb Forms John Lee, Stephanie Seneff Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge ACL 2008.

Slides:



Advertisements
Similar presentations
Spelling Correction for Search Engine Queries Bruno Martins, Mario J. Silva In Proceedings of EsTAL-04, España for Natural Language Processing Presenter:
Advertisements

Large-Scale Entity-Based Online Social Network Profile Linkage.
Acoustic Model Adaptation Based On Pronunciation Variability Analysis For Non-Native Speech Recognition Yoo Rhee Oh, Jae Sam Yoon, and Hong Kook Kim Dept.
MINING FEATURE-OPINION PAIRS AND THEIR RELIABILITY SCORES FROM WEB OPINION SOURCES Presented by Sole A. Kamal, M. Abulaish, and T. Anwar International.
A method for unsupervised broad-coverage lexical error detection and correction 4th Workshop on Innovative Uses of NLP for Building Educational Applications.
Linear Model Incorporating Feature Ranking for Chinese Documents Readability Gang Sun, Zhiwei Jiang, Qing Gu and Daoxu Chen State Key Laboratory for Novel.
HOO 2012: A Report on the Preposition and Determiner Error Correction Shared Task Robert Dale, Ilya Anisimoff and George Narroway Centre for Language Technology.
1 A Comparative Evaluation of Deep and Shallow Approaches to the Automatic Detection of Common Grammatical Errors Joachim Wagner, Jennifer Foster, and.
1 Developing Statistic-based and Rule-based Grammar Checkers for Chinese ESL Learners Howard Chen Department of English National Taiwan Normal University.
Automatic Metaphor Interpretation as a Paraphrasing Task Ekaterina Shutova Computer Lab, University of Cambridge NAACL 2010.
Using Web Queries for Learner Error Detection Michael Gamon, Microsoft Research Claudia Leacock, Butler-Hill Group.
Predicting Text Quality for Scientific Articles Annie Louis University of Pennsylvania Advisor: Ani Nenkova.
Simple Features for Chinese Word Sense Disambiguation Hoa Trang Dang, Ching-yi Chia, Martha Palmer, Fu- Dong Chiou Computer and Information Science University.
SPOKEN LANGUAGE SYSTEMS MIT Computer Science and Artificial Intelligence Laboratory Mitchell Peabody, Chao Wang, and Stephanie Seneff June 19, 2004 Lexical.
Inducing Information Extraction Systems for New Languages via Cross-Language Projection Ellen Riloff University of Utah Charles Schafer, David Yarowksy.
ASR Evaluation Julia Hirschberg CS Outline Intrinsic Methods –Transcription Accuracy Word Error Rate Automatic methods, toolkits Limitations –Concept.
Using Maximal Embedded Subtrees for Textual Entailment Recognition Sophia Katrenko & Pieter Adriaans Adaptive Information Disclosure project Human Computer.
Minimum Error Rate Training in Statistical Machine Translation By: Franz Och, 2003 Presented By: Anna Tinnemore, 2006.
Na-Rae Han (University of Pittsburgh), Joel Tetreault (ETS), Soo-Hwa Lee (Chungdahm Learning, Inc.), Jin-Young Ha (Kangwon University) May , LREC.
The Montclair Electronic Language Learner Database (MELD) Eileen Fitzpatrick & Steve Seegmiller Montclair State.
Andreea Bodnari, 1 Peter Szolovits, 1 Ozlem Uzuner 2 1 MIT, CSAIL, Cambridge, MA, USA 2 Department of Information Studies, University at Albany SUNY, Albany,
The Ups and Downs of Preposition Error Detection in ESL Writing Joel Tetreault[Educational Testing Service] Martin Chodorow[Hunter College of CUNY]
Evaluation in NLP Zdeněk Žabokrtský. Intro The goal of NLP evaluation is to measure one or more qualities of an algorithm or a system Definition of proper.
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
A Survey of NLP Toolkits Jing Jiang Mar 8, /08/20072 Outline WordNet Statistics-based phrases POS taggers Parsers Chunkers (syntax-based phrases)
Authors: Ting Wang, Yaoyong Li, Kalina Bontcheva, Hamish Cunningham, Ji Wang Presented by: Khalifeh Al-Jadda Automatic Extraction of Hierarchical Relations.
Thinking about agreement. Part of Dick Hudson's web tutorial on Word Grammarweb tutorial.
Using Contextual Speller Techniques and Language Modeling for ESL Error Correction Michael Gamon, Jianfeng Gao, Chris Brockett, Alexandre Klementiev, William.
The CoNLL-2013 Shared Task on Grammatical Error Correction Hwee Tou Ng, Yuanbin Wu, and Christian Hadiwinoto 1 Siew.
Researcher affiliation extraction from homepages I. Nagy, R. Farkas, M. Jelasity University of Szeged, Hungary.
Automatic Detection of Plagiarized Spoken Responses Copyright © 2014 by Educational Testing Service. All rights reserved. Keelan Evanini and Xinhao Wang.
Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop Nizar Habash and Owen Rambow Center for Computational Learning.
A Language Independent Method for Question Classification COLING 2004.
Phrase Reordering for Statistical Machine Translation Based on Predicate-Argument Structure Mamoru Komachi, Yuji Matsumoto Nara Institute of Science and.
1 Sentence-extractive automatic speech summarization and evaluation techniques Makoto Hirohata, Yosuke Shinnaka, Koji Iwano, Sadaoki Furui Presented by.
Efficiently Computed Lexical Chains As an Intermediate Representation for Automatic Text Summarization H.G. Silber and K.F. McCoy University of Delaware.
Some Tips about Writing Technical Papers Michael R. Lyu September 9, 2003.
A Systematic Exploration of the Feature Space for Relation Extraction Jing Jiang & ChengXiang Zhai Department of Computer Science University of Illinois,
1 Sentence Extraction-based Presentation Summarization Techniques and Evaluation Metrics Makoto Hirohata, Yousuke Shinnaka, Koji Iwano and Sadaoki Furui.
Predicting Student Emotions in Computer-Human Tutoring Dialogues Diane J. Litman&Kate Forbes-Riley University of Pittsburgh Department of Computer Science.
Copyright © 2013 by Educational Testing Service. All rights reserved. 14-June-2013 Detecting Missing Hyphens in Learner Text Aoife Cahill *, Martin Chodorow.
Corpus-based generation of suggestions for correcting student errors Paper presented at AsiaLex August 2009 Richard Watson Todd KMUTT ©2009 Richard Watson.
An Iterative Approach to Extract Dictionaries from Wikipedia for Under-resourced Languages G. Rohit Bharadwaj Niket Tandon Vasudeva Varma Search and Information.
Improving Named Entity Translation Combining Phonetic and Semantic Similarities Fei Huang, Stephan Vogel, Alex Waibel Language Technologies Institute School.
Multilingual Opinion Holder Identification Using Author and Authority Viewpoints Yohei Seki, Noriko Kando,Masaki Aono Toyohashi University of Technology.
2015/12/121 Extracting Key Terms From Noisy and Multi-theme Documents Maria Grineva, Maxim Grinev and Dmitry Lizorkin Proceeding of the 18th International.
Page 1 NAACL-HLT 2010 Los Angeles, CA Training Paradigms for Correcting Errors in Grammar and Usage Alla Rozovskaya and Dan Roth University of Illinois.
A Critique and Improvement of an Evaluation Metric for Text Segmentation A Paper by Lev Pevzner (Harvard University) Marti A. Hearst (UC, Berkeley) Presented.
Intelligent Key Prediction by N-grams and Error-correction Rules Kanokwut Thanadkran, Virach Sornlertlamvanich and Tanapong Potipiti Information Research.
Headline Generation Based on Statistical Translation Michele Banko Computer Science Department Johns Hopkins University Vibhu O.Mittal Just Research 報告人.
1 Evaluating High Accuracy Retrieval Techniques Chirag Shah,W. Bruce Croft Center for Intelligent Information Retrieval Department of Computer Science.
Detecting Missing Hyphens in Learner Text Aoife Cahill, SusanneWolff, Nitin Madnani Educational Testing Service ACL 2013 Martin Chodorow Hunter College.
On using context for automatic correction of non-word misspellings in student essays Michael Flor Yoko Futagi Educational Testing Service 2012 ACL.
Discovering Relations among Named Entities from Large Corpora Takaaki Hasegawa *, Satoshi Sekine 1, Ralph Grishman 1 ACL 2004 * Cyberspace Laboratories.
1 Measuring the Semantic Similarity of Texts Author : Courtney Corley and Rada Mihalcea Source : ACL-2005 Reporter : Yong-Xiang Chen.
Using Wikipedia for Hierarchical Finer Categorization of Named Entities Aasish Pappu Language Technologies Institute Carnegie Mellon University PACLIC.
 Can facial expressions tell if someone is lying or not?
Phone-Level Pronunciation Scoring and Assessment for Interactive Language Learning Speech Communication, 2000 Authors: S. M. Witt, S. J. Young Presenter:
Learning Event Durations from Event Descriptions Feng Pan, Rutu Mulkar, Jerry R. Hobbs University of Southern California ACL ’ 06.
Review Day. ERROR ANALYSIS- Students have completed the following problems incorrectly. Write a sentence for each on a lined sheet of paper explaining.
A classifier-based approach to preposition and determiner error correction in L2 English Rachele De Felice, Stephen G. Pulman Oxford University Computing.
ACL/EMNLP 2012 review (eNLP version) Mamoru Komachi 2012/07/17 Educational NLP research group Computational Linguistics Lab Nara Institute of Science and.
This research is supported by the U.S. Department of Education and DARPA. Focuses on mistakes in determiner and preposition usage made by non-native speakers.
The University of Illinois System in the CoNLL-2013 Shared Task Alla RozovskayaKai-Wei ChangMark SammonsDan Roth Cognitive Computation Group University.
GCSE English Language: Exam Paper Outline
Statistical n-gram David ling.
NAACL-HLT 2010 June 5, 2010 Jee Eun Kim (HUFS) & Kong Joo Lee (CNU)
University of Illinois System in HOO Text Correction Shared Task
A Classification-based Approach to Question Routing in Community Question Answering Tom Chao Zhou 22, Feb, 2010 Department of Computer.
Extracting Why Text Segment from Web Based on Grammar-gram
Presentation transcript:

Correcting Misuse of Verb Forms John Lee, Stephanie Seneff Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge ACL 2008

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

Introduction

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

Background The goal is to correct confusions among the five forms, as well as the infinitive caused by semantic and syntactic errors. Semantic Errors Suppose one wants to say “I am prepared for the exam”, but writes “I am preparing for the exam”.

Background Syntactic Errors Subject-Verb Agreement He *have been living there since June. Auxiliary Agreement He has been *live there since June. Complementation He wants *live there.

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

System Step1 Automatic Parsing “My father is *work in the laboratory.”

System Step2 Replacing the verb forms

System

Step3 N-gram counts as a filter Using WEB 1T N-GRAM corpus. Prepared by Google Inc.

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

Baselines majority baseline No correction. verb-only baseline (Only used in Auxiliary Agreement & Complementation) It attempts corrections only when the word in question is actually tagged as a verb.

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

Data Development Data AQUAINT Corpus (English News Text) Evaluation Data JLE (Japanese Learners of English corpus) For 167 of the transcribed interviews, totalling 15,637 sentences. Test Set 477 sentences (3.1%) contain subject-verb agreement errors, and 238 (1.5%) contain auxiliary agreement and complementation errors

Data Evaluation Data HKUST (Hong Kong University of Science and Technology) It contains a total of 2556 sentences.

Data Evaluation Metric Accuracy (true neg + true pos) / total number of sentences Recall true pos / (true pos + false neg + inv pos) Detection Precision (true pos + inv pos) / (true pos + inv pos + false pos) Correction Precision true pos / (true pos + false pos + inv pos)

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

Evaluation JLE Results for Subject-Verb Agreement CorpusMethodAccuracyPrecision (correction) Precision (detection) Recall JLEall majority 98.93% 96.95% 81.61%83.93%80.92% Results for Auxiliary Agreement & Complementation CorpusMethodAccuracyPrecision (correction) Precision (detection) Recall JLEall verb-only majority 98.94% 98.85% 98.47% 68.00% 71.43% 80.67% 84.75% 42.86% 31.51%

Evaluation HKUST Results for Auxiliary Agreement & Complementation Two native speakers of English were given the edited sentences, as well as the original input. For each pair, they were asked to select one of four statements: one of the two is better, or both are equally correct, or both are equally incorrect. CorpusMethodAccuracyPrecision (correction) Precision (detection) Recall HKUSTallNot available71.71%not available Kappa: 0.76

Evaluation

Outline  Introduction  Background  System  Baselines  Data  Evaluation  Conclusions

Conclusions  This paper proposes a method to correct English verb form errors made by non-native speakers.  Investigation of the ways the ways in which verb form errors affect parse trees.