ACL/EMNLP 2012 review (eNLP version) Mamoru Komachi 2012/07/17 Educational NLP research group Computational Linguistics Lab Nara Institute of Science and.

Presentation transcript:

ACL/EMNLP 2012 review (eNLP version) Mamoru Komachi 2012/07/17 Educational NLP research group Computational Linguistics Lab Nara Institute of Science and Technology, Japan

Today’s agenda
- Introduce several papers presented at the ACL/EMNLP conferences
- This is not a complete list, so please take a look at the accepted papers yourself!
- There are more papers in related areas such as spelling correction and text normalization (especially for microblogs like Twitter)
- Disclaimer: I haven’t read any of the papers yet. I will talk about my impressions from the presentations (oral, poster, demo) of the work. Please refer to the paper itself if you are interested

ACL
- Native Language Detection with Tree Substitution Grammars
- A Corpus of Textual Revisions in Second Language Writing
- A Meta Learning Approach to Grammatical Error Correction
- FLOW: A First-Language-Oriented Writing Assistant System
- Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation

Native Language Detection with Tree Substitution Grammars (Short Paper)
Ben Swanson and Eugene Charniak (Brown University, USA)
- Problem: Though syntactic features are known to be useful for native language detection, CFG rules cannot capture long range dependencies
- Idea: Use Tree Substitution Grammar to extract tree fragments for native language identification
- Use tree fragments as features for MaxEnt classifier
- Tested on ICLE and outperformed baselines (CFG and frequent-based tree mining)

A Corpus of Textual Revisions in Second Language Writing
John Lee and Jonathan Webster (City University of Hong Kong)
- Problem: There is no ESL corpus containing sentence-aligned revision logs
- Idea: Collected a corpus with (possibly multiple) revision logs of ESL learners
- Errors are identified by language teachers (not necessarily the same person for each revision)
- Email the authors to get a copy for research purposes

A Meta Learning Approach to Grammatical Error Correction (Short Paper)
Hongsuck Seo, Jonghoon Lee, Seokhwan Kim, Kyusong Lee, Sechun Kang, and Gary Geunbae Lee (PosTech, Korea)
- Problem: There are many ESL corpora, and they have different characteristics
- Idea: Train several classifiers using different corpora, and combine them with a meta-classifier
- Base classifiers use ASO (Ando and Zhang, 2005) to train a model from both a native corpus and an error-tagged corpus
- The meta-learner improves precision and F1 on the article error correction task

FLOW: A First-Language-Oriented Writing Assistant System
Mei-Hua Chen, Shih-Ting Huang, Hung-Ting Hsieh, Ting-Hui Kao, and Jason S. Chang (National Tsing Hua University, Taiwan)
- Problem: Previous ESL assistance tools do not take context and native language into account
- Idea: Developed a browser-based ESL writing assistance system for Chinese speakers
- Can accept Chinese input given English context, and show predictive text by N-gram
- Paraphrase suggestion by translation from En->Ch->En
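The "predictive text by N-gram" bullet can be illustrated with a minimal sketch. It assumes a plain bigram count model over a tiny English corpus; the function names `train_bigrams` and `suggest` and the example sentences are hypothetical, and FLOW's handling of Chinese input is not covered here.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count which word follows which in a whitespace-tokenized corpus."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following


def suggest(following, context, k=3):
    """Return up to k most frequent continuations of the last context word."""
    tokens = context.lower().split()
    if not tokens:
        return []
    return [word for word, _ in following[tokens[-1]].most_common(k)]


# Toy corpus (made up for illustration only).
corpus = [
    "i want to play the piano",
    "i want to play tennis",
    "i want to go home",
]
model = train_bigrams(corpus)
suggestions = suggest(model, "I want to")  # continuations of "to"
```

A real system would probably use longer N-grams with smoothing and a much larger corpus; the bigram version only shows the lookup step.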

Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation (Short Paper)
Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, and Hitoshi Nishikawa (NTT)
- Problem: Error-tagged corpora of language learners are hard to obtain
- Idea: Automatically generate error-tagged corpora using a confusion set (derived from a manually tagged corpus)
- Applied "frustratingly easy" domain adaptation
- Domain adaptation gives stable improvement
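The two ideas on this slide can be sketched as follows. The first function injects pseudo-errors from a confusion set; the second applies the feature augmentation used in "frustratingly easy" domain adaptation, where every feature gets a shared copy and a domain-specific copy. The function names, the confusion set contents, and the choice of "pseudo" as a domain label are assumptions made for illustration; the slide does not say which corpora are treated as the two domains.

```python
import random


def make_pseudo_error(tokens, confusion_set, error_rate, rng):
    """Replace confusable words with an alternative at the given rate.

    Returns the corrupted tokens plus the original tokens, which serve as
    the correction labels for training.
    """
    corrupted = []
    for tok in tokens:
        alternatives = confusion_set.get(tok)
        if alternatives and rng.random() < error_rate:
            corrupted.append(rng.choice(alternatives))
        else:
            corrupted.append(tok)
    return corrupted, list(tokens)


def augment(features, domain):
    """Feature augmentation: one shared copy and one domain-specific copy."""
    out = {}
    for name, value in features.items():
        out["general:" + name] = value
        out[domain + ":" + name] = value
    return out


# Hypothetical confusion set: the correct word maps to typical learner errors.
confusion = {"to": ["at"]}
rng = random.Random(0)
noisy, gold = make_pseudo_error(["I", "go", "to", "school"], confusion, 1.0, rng)
augmented = augment({"prev=go": 1.0}, "pseudo")
```

With the augmented features, a single classifier can learn weights shared by both domains through the `general:` copies while still fitting domain-specific behavior through the prefixed copies.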

EMNLP
- A Beam-Search Decoder for Grammatical Error Correction
- Assessment of ESL Learners’ Syntactic Competence Based on Similarity Measures
- Exploring Adaptor Grammars for Native Language Identification

A Beam-Search Decoder for Grammatical Error Correction
Daniel Dahlmeier and Hwee Tou Ng (NUS, Singapore)
- Problem: The traditional approach uses multi-class pointwise prediction, which does not correct a sentence as a whole
- Idea: Build a beam search decoder that combines the classification approach and the SMT pipeline
- Proposers generate candidates and experts rank the generated candidates
- Tested on spelling, article, preposition, punctuation insertion and noun number tasks and achieved state-of-the-art results
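The proposer/expert loop can be sketched roughly as below. This is a toy illustration, not the authors' decoder: `propose_article_edits` and `toy_expert` are hypothetical stand-ins (the latter just counts bigrams from a hand-written list), and the stopping rule is an assumption.

```python
ARTICLES = ("a", "an", "the")


def propose_article_edits(tokens):
    """Hypothetical proposer: swap each article for another one, or drop it."""
    for i, tok in enumerate(tokens):
        if tok in ARTICLES:
            for alt in ARTICLES:
                if alt != tok:
                    yield tokens[:i] + (alt,) + tokens[i + 1:]
            yield tokens[:i] + tokens[i + 1:]


def toy_expert(tokens):
    """Hypothetical expert: count bigrams found in a tiny 'good usage' list."""
    good = {("ate", "an"), ("an", "apple"), ("in", "the"), ("the", "park")}
    return sum(1 for bigram in zip(tokens, tokens[1:]) if bigram in good)


def beam_search(sentence, proposers, experts, beam_width=3, max_iters=5):
    """Edit the sentence one step at a time, keeping the best hypotheses."""

    def score(hyp):
        return sum(expert(hyp) for expert in experts)

    start = tuple(sentence.split())
    beam, best, seen = [start], start, {start}
    for _ in range(max_iters):
        candidates = []
        for hyp in beam:
            for proposer in proposers:
                for cand in proposer(hyp):
                    if cand not in seen:
                        seen.add(cand)
                        candidates.append(cand)
        if not candidates:
            break
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
        if score(beam[0]) <= score(best):
            break  # no hypothesis improved on the best so far
        best = beam[0]
    return " ".join(best)


corrected = beam_search(
    "she ate a apple in a park", [propose_article_edits], [toy_expert]
)
```

Because each hypothesis is a whole sentence, two interacting corrections (here, two article edits) are scored together rather than predicted independently, which is the point the slide makes against pointwise prediction.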

Assessment of ESL Learners’ Syntactic Competence Based on Similarity Measures
Su-Youn Yoon and Suma Bhat (UIUC, USA)
- Problem: Previous studies focus on the length of the output, such as the mean length of clauses
- Idea: Focus on morpho-syntactic features for measuring English proficiency
- Constructed a POS-based vector space model for each proficiency level
- POS tag sequences are robust and correlate highly with human evaluation
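A POS-based vector space model of this kind might look like the sketch below. It assumes raw POS bigram counts and cosine similarity, and it simply returns the closest level; the weighting scheme, the n-gram order, and how similarity scores feed into the final assessment are not stated on the slide. The level names and tag sequences are made up.

```python
import math
from collections import Counter


def pos_ngrams(tags, n=2):
    """Count POS tag n-grams in one tagged response."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_level_vectors(tagged_responses):
    """Sum the POS n-gram counts of all responses at each proficiency level."""
    vectors = {}
    for level, responses in tagged_responses.items():
        total = Counter()
        for tags in responses:
            total.update(pos_ngrams(tags))
        vectors[level] = total
    return vectors


def most_similar_level(tags, level_vectors):
    """Return the level whose vector is closest to the response."""
    vec = pos_ngrams(tags)
    return max(level_vectors, key=lambda lvl: cosine(vec, level_vectors[lvl]))


# Made-up training data: POS tag sequences grouped by proficiency level.
training = {
    "low": [["PRP", "VB", "NN"], ["NN", "VB", "NN"]],
    "high": [["PRP", "VBD", "DT", "JJ", "NN"],
             ["DT", "NN", "VBD", "IN", "DT", "NN"]],
}
levels = build_level_vectors(training)
predicted = most_similar_level(["PRP", "VBD", "DT", "NN"], levels)
```

Working on POS tags rather than words keeps the vectors small and less sensitive to vocabulary, which fits the slide's remark that POS tag sequences are robust.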

Exploring Adaptor Grammars for Native Language Identification
Sze-Meng Jojo Wong, Mark Dras, and Mark Johnson (Macquarie University, Australia)
- Problem: {word, character, POS} N-gram features for native language identification do not consider long range contextual information
- Idea: Use Adaptor Grammars (a non-parametric extension to PCFGs) to capture long n-grams
- Built a MaxEnt classifier to combine a syntactic language model and n-gram collocations
- Experimental results are not stable, but show better accuracy overall

Summary
- Introduced eNLP-related papers presented at ACL/EMNLP
- For M2/D students: eNLP can exploit sophisticated methods explored in sequence labeling, parsing and SMT (e.g. string-to-tree, tree substitution grammar, etc.)
- For M1 students: Find a good problem and think hard to solve it!