Automated Suggestions for Miscollocations. The Fourth Workshop on Innovative Use of NLP for Building Educational Applications. Authors: Anne Li-E Liu, David Wible, Nai-Lung Tsao. Reporter: Yeh, Chi-Shan

2 Overview Abstract Introduction Methodology Experimental Results Conclusion

3 Abstract (1/2) Among the most common and persistent error types in second language writing are collocation errors, such as learn knowledge instead of gain or acquire knowledge, or make damage rather than cause damage. In this work-in-progress report, we propose a probabilistic model for suggesting corrections to lexical collocation errors.

4 Abstract (2/2) The probabilistic model incorporates three features: word association strength (MI), semantic similarity (via WordNet) and the notion of shared collocations (or intercollocability). The results suggest that the combination of all three features outperforms any single feature or any combination of two features.

5 Introduction (1/3) The importance and difficulty of collocations for second language users has been widely acknowledged. Liu's [1] study of a 4-million-word learner corpus reveals that verb-noun (VN) miscollocations make up the bulk of the lexical collocation errors in learners' essays. Our study focuses mainly on VN miscollocation correction. [1] Anne Li-E Liu. 2002. A Corpus-based Lexical Semantic Investigation of VN Miscollocations in Taiwan Learners' English. Master Thesis, Tamkang University, Taiwan.

6 Introduction (2/3) Error detection and correction have been two major issues in NLP research in the past decade. Studies that focus on providing automatic correction, however, mainly deal with errors that derive from closed-class words, such as articles [2] and prepositions [3]. One goal of this work-in-progress is to address the less studied issue of open-class lexical errors, specifically lexical collocation errors. [2] Na-Rae Han, Martin Chodorow and Claudia Leacock. 2004. Detecting Errors in English Article Usage with a Maximum Entropy Classifier Trained on a Large, Diverse Corpus. Proceedings of the 4th International Conference on Language Resources and Evaluation, Lisbon, Portugal. [3] Martin Chodorow, Joel R. Tetreault and Na-Rae Han. 2007. Detection of Grammatical Errors Involving Prepositions. Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Special Interest Group on Semantics, Workshop on Prepositions.

7 Introduction (3/3) We focus on providing correct collocation suggestions for lexical miscollocations. Three features are employed to identify the correct collocation substitute for a miscollocation: word association measurement, semantic similarity between the correction candidate and the misused word to be replaced, and intercollocability. While we are working on both error detection and correction, here we report specifically on our work on lexical miscollocation correction.

8 Method (1/2) The 84 VN miscollocations from Liu's (2002) study were employed as training and testing data, with each set comprising 42 randomly chosen miscollocations. Two experienced English teachers manually went through the 84 miscollocations and provided a list of correction suggestions. A system output is counted as correct only when it matches one of the suggestions offered by the two annotators.

9 Method (2/2) The two main knowledge resources we incorporated are the British National Corpus (BNC) and WordNet. The BNC was used to measure word association strength and to extract shared collocates, while WordNet was used to determine semantic similarity. Note that all 84 VN miscollocations are combinations of an incorrect verb and a focal noun; our approach therefore aims to find the correct verb replacement.

10 Three features adopted Word Association Measurement Semantic Similarity Shared Collocates in Collocation Clusters

11 Word Association Measurement Mutual Information (Church et al. 1991) serves two purposes: 1. All suggested correct collocations have to be identified as collocations. 2. The higher the word association strength, the more likely the candidate is to be a correct substitute for the wrong collocate.
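For reference (not shown on the slide), the standard pointwise mutual information of Church et al., which this word association measure presumably follows, is:

```latex
\mathrm{MI}(v, n) = \log_2 \frac{P(v, n)}{P(v)\,P(n)}
```

where P(v, n) is the probability of verb v and noun n co-occurring in the corpus, and P(v) and P(n) are their individual probabilities.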

12 Example Training data: – Correct collocations: cause damage (MI=3), spend time (MI=5), take medicine (MI=2), ... – Miscollocations: make damage (MI=-10), pay time (MI=0.2), eat medicine (MI=0.5), ... For testing, we then need the following probability: – P(MI | the collocation is correct)

13 Example In this simple example, we divide MI into just two ranges, 0~2 and 2~5 (in the paper we use 5 ranges). We then get the probability for each range: P(MI=0~2 | the collocation is correct) = 1/3, P(MI=2~5 | the collocation is correct) = 2/3. Given a test item such as reach dream, we find all verbs that can be followed by "dream"; suppose we find two candidates, "fulfill" and "make". We can then read off the conditional probability for each: – P(MI(fulfill, dream)=1.5 | the collocation is correct) = 1/3. – P(MI(make, dream)=2.5 | the collocation is correct) = 2/3.
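A minimal sketch of this binning step in Python; the bin boundaries, training pairs, and MI values are the toy numbers from the slide, not real BNC statistics:

```python
from collections import Counter

# Toy training data from the slide: correct collocations with their MI values.
correct = [("cause damage", 3.0), ("spend time", 5.0), ("take medicine", 2.0)]

def mi_level(mi):
    """Two toy MI ranges: (0, 2] and (2, 5]."""
    return "0~2" if mi <= 2 else "2~5"

# Estimate P(MI level | collocation is correct) from the training data.
counts = Counter(mi_level(mi) for _, mi in correct)
total = sum(counts.values())
likelihood = {lvl: c / total for lvl, c in counts.items()}

# Score the test candidates for "reach dream" by the likelihood of their MI level.
for verb, mi in [("fulfill", 1.5), ("make", 2.5)]:
    print(verb, likelihood[mi_level(mi)])   # fulfill -> 1/3, make -> 2/3
```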

14 Three features adopted Word Association Measurement Semantic Similarity Shared Collocates in Collocation Clusters

15 Semantic Similarity (1/3) Both Gitsaki et al. (2000) and Liu (2002) suggest that a semantic relation holds between a miscollocate and its correct counterpart. Following this, we assume that in the 84 miscollocations, the miscollocates stand in a more or less close semantic relation to their corrections. To measure similarity, we take the synsets of WordNet to be nodes in a graph.

16 Semantic Similarity (2/3) We quantify the semantic similarity of the incorrect verb in a miscollocation with other possible substitute verbs by measuring the graph-theoretic distance between the synset containing the miscollocate verb and the synset containing a candidate substitute. In cases of polysemy, we take the closest synsets for the distance measure. If the miscollocate and the candidate substitute occur in the same synset, the distance between them is zero.

17 Semantic Similarity (3/3) The similarity measurement function is as follows: [formula shown as an image on the slide; not reproduced in the transcript]
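A form consistent with the description above (zero distance within a synset, similarity decreasing with graph distance) would be an inverse-distance measure such as the following; this is a reconstruction under that assumption, not the paper's exact definition:

```latex
\mathrm{sim}(w_1, w_2) = \frac{1}{1 + \min_{s_1 \ni w_1,\; s_2 \ni w_2} d(s_1, s_2)}
```

where d(s_1, s_2) is the shortest-path length between synsets s_1 and s_2 in the WordNet graph, minimized over the synsets containing each word.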

18 Example Training data: – Correct collocations: cause damage, spend time, take medicine, ... – Miscollocations: make damage, pay time, eat medicine, ... From WordNet we can then get the following similarities (only verbs sharing the same noun need to be compared): – cause (correct) - make: 0.7; do (mis) - make: 0.1; spend (correct) - pay: 0.8; take (correct) - eat: 0.3

19 Example Using these data, we can get the following conditional probabilities: – P(sim=0~0.5 | this verb is correct) = 1/3, P(sim=0.5~1 | this verb is correct) = 2/3. Given the test item reach dream, we again find all verbs that can be followed by "dream"; suppose the two candidates are "fulfill" and "make". We then compute the similarity of "fulfill" and "make" to "reach": – fulfill - reach: 0.7, make - reach: 0.4. This gives the conditional probability for each candidate: – P(sim(fulfill, reach) | the collocation is correct) = 2/3, P(sim(make, reach) | the collocation is correct) = 1/3
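As an illustration, closest-synset verb similarities of this kind can be computed with NLTK's WordNet interface. Note that path_similarity, which returns 1/(1 + shortest path length), is an assumption standing in for the paper's own measure, and the 0.7/0.4 figures on the slide are illustrative, so real outputs will differ:

```python
from nltk.corpus import wordnet as wn

def verb_similarity(v1, v2):
    """Closest-synset similarity between two verbs, as in the slides:
    polysemous verbs are compared via their most similar synset pair."""
    best = 0.0
    for s1 in wn.synsets(v1, pos=wn.VERB):
        for s2 in wn.synsets(v2, pos=wn.VERB):
            sim = s1.path_similarity(s2)
            if sim is not None and sim > best:
                best = sim
    return best

# Candidates for the noun "dream", compared with the miscollocate "reach".
for cand in ["fulfill", "make"]:
    print(cand, verb_similarity(cand, "reach"))
```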

20 Three features adopted Word Association Measurement Semantic Similarity Shared Collocates in Collocation Clusters

21 Shared Collocates in Collocation Clusters [Figure: collocation cluster of "bringing something into actuality"]

22 Example Training data: – Correct collocations: cause damage, spend time, take medicine, ... – Miscollocations: make damage, pay time, eat medicine, ... Using "cause damage" and "make damage" as an example, we extract N1 = Noun(cause) and N2 = Noun(make) from the BNC (Noun(v) denotes the set of nouns collocating with verb v; only nouns with high association strength are included). If the intersection of N1 and N2 contains 60 nouns and N2 contains 100 (we normalize by N2 because it belongs to the miscollocation), the shared-collocate score is 60/100 = 0.6.

23 Example Applying this step, we get the following scores: – cause - make: 0.6, do - make: 0.4, spend - pay: 0.7, take - eat: 0.3. Using these data, we can get the following conditional probabilities (again with two ranges in this example): – P(0~0.5 | this verb is correct) = 1/3, P(0.5~1 | this verb is correct) = 2/3. Again, take "reach dream" as the test item and find all verbs that can be followed by "dream"; suppose the two candidates are "fulfill" and "make".

24 Example We then compute the shared-collocate scores of "fulfill" and "make" against "reach": – fulfill - reach: 0.7, make - reach: 0.4. This gives the conditional probability for each candidate: – P(shared(fulfill, reach) | the collocation is correct) = 2/3, P(shared(make, reach) | the collocation is correct) = 1/3
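A minimal sketch of the shared-collocate score; the noun sets below are synthetic stand-ins for the high-association BNC collocates:

```python
def shared_collocate_score(cand_verb_nouns, miscollocate_nouns):
    """|N1 ∩ N2| / |N2|, normalized by the miscollocate's noun set N2."""
    n1, n2 = set(cand_verb_nouns), set(miscollocate_nouns)
    return len(n1 & n2) / len(n2) if n2 else 0.0

# Toy example mirroring the slide: 60 shared nouns out of 100 -> 0.6
n_make = {f"noun{i}" for i in range(100)}        # N2 = Noun(make)
n_cause = {f"noun{i}" for i in range(40, 140)}   # N1 = Noun(cause)
print(shared_collocate_score(n_cause, n_make))   # 0.6
```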

25 Probabilistic Model (1/2) The three features described above are integrated into a probabilistic model. Each feature on its own can be used to look up the correct collocation suggestion for a miscollocation. For instance, cause damage, one of the possible suggestions for the miscollocation make damage, is ranked the 5th correction candidate by word association measurement alone, the 2nd by semantic similarity, and the 14th by shared collocates. If we combine the three features, however, cause damage is ranked first.

26 Probabilistic Model (2/2) The conditional probability we need is P(S_c | MI, SS, SC), the probability that a candidate substitute is correct given the three feature values. By Bayes' theorem, together with the naive Bayes assumption that the three features are independent, it can be computed as: P(S_c | MI, SS, SC) = P(MI | S_c) P(SS | S_c) P(SC | S_c) P(S_c) / (P(MI) P(SS) P(SC))

27 Training Probability distribution of word association strength: MI value binned into 5 levels. P(MI_level), P(MI_level | S_c)

28 Training Probability distribution of semantic similarity: similarity score binned into 5 levels (0.0~0.2, 0.2~0.4, 0.4~0.6, 0.6~0.8 and 0.8~1.0). P(SS_level), P(SS_level | S_c)

29 Training Probability distribution of intercollocability: normalized shared-collocate number binned into 5 levels (0.0~0.2, 0.2~0.4, 0.4~0.6, 0.6~0.8 and 0.8~1.0). P(SC_level), P(SC_level | S_c)
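Putting the pieces together, here is a sketch of the ranking step implied by slides 26 through 29. The level boundaries and probability tables are placeholders to be estimated from the training half of the 84 miscollocations, and all names are illustrative:

```python
import numpy as np

EDGES = np.array([0.2, 0.4, 0.6, 0.8])  # 5 levels; assumes feature scores normalized to [0, 1]

def level(x):
    """Map a feature score to one of 5 discrete levels (0..4)."""
    return int(np.digitize(x, EDGES))

def nb_score(cand, p_level, p_level_given_correct, p_correct):
    """Naive Bayes score for a candidate substitute:
    P(Sc) * prod_f P(level_f | Sc) / P(level_f), proportional to
    P(Sc | MI, SS, SC) under the feature-independence assumption."""
    s = p_correct
    for feat in ("MI", "SS", "SC"):
        lv = level(cand[feat])
        s *= p_level_given_correct[feat][lv] / p_level[feat][lv]
    return s

def k_best(cands, k, p_level, p_level_given_correct, p_correct):
    """Rank candidates (dicts with 'verb', 'MI', 'SS', 'SC') and return the k best verbs."""
    ranked = sorted(cands,
                    key=lambda c: nb_score(c, p_level, p_level_given_correct, p_correct),
                    reverse=True)
    return [c["verb"] for c in ranked[:k]]
```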

30 Experimental Results (1/5) Different combinations of the three features.

31 Experimental Results (2/5) [Chart: K-best precision of models M1 through M7, where M2 = SS alone, M6 = SS+SC, and M7 = MI+SS+SC]

32 Experimental Results (3/5) The K-best suggestions for *get knowledge.
K-Best  M2        M6        M7
1       aim       obtain    acquire
2       generate  share
3       draw      develop   obtain
4       –         generate  develop
5       –         acquire   gain

33 Experimental Results (4/5) The K-best suggestions for *reach purpose.
K-Best  M2       M6        M7
1       achieve
2       teach    account
3       explain  trade
4       account  treat     fulfill
5       trade    allocate  serve

34 Experimental Results (5/5) The K-best suggestions for *pay time.
K-Best  M2      M6      M7
1       devote  spend
2       –       invest  waste
3       expend  devote
4       spare   date    invest
5       –       waste   date

35 Conclusion (1/2) A probabilistic model to integrate the three features. Applying such mechanisms to other types of miscollocations. Miscollocation detection will be one of the main points of this research. A larger set of miscollocations should be included in order to verify our approach.

36 Conclusion (2/2) Further, a larger set of miscollocations should be included in order to verify our approach and to address the slight drop of the full hybrid M7 at k=1.