NAACL-HLT 2010 June 5, 2010 Jee Eun Kim (HUFS) & Kong Joo Lee (CNU)

Presentation transcript:

A Human-Computer Collaboration Approach to Improve Accuracy of an Automated English Scoring System

Outline
- Overview of the system
- Issue: redundant errors
- Solution: introducing a method to determine redundant errors
- Evaluation
- Conclusion

Procedure of Automated Scoring System
Teacher (question database):
  Question: 그녀는 방과 후에 축구를 했다. (Korean: “She played soccer after school.”)
  Correct answers:
    She played soccer after school.
    She played soccer after school is over.
Student:
  Input: She play footboll.
Automated scoring system (scoring result and feedback):
  Score: 3 points out of 6
  1. Error in number agreement (play → plays|played)
  2. Misspelling (footboll → football)
  3. Tense mismatching (play → played)
  4. Missing elements: “after school”

Automated English Scoring System
- Scores a single sentence, not an essay
- Target users: junior high school students learning English as a second language
- Calculates a score based on the number of errors and the types of errors

System Overview (diagram)
Inputs: a student’s answer; a set of correct answers
Intra-sentential error detection module: morphological analyzer (word errors), syntactic analyzer (syntactic errors)
Inter-sentential error detection module: comparing sentences and calculating similarity over the dependency structures of the student’s answer and of the correct answers (mapping errors)
Resources: lexicon (lexical information), syntactic rules, synonyms lexicon
Output: a scoring result and diagnostic feedback

Errors
76 error types to be detected by the system:
- 16 word errors → morphological analyzer
- 46 syntactic errors → syntactic analyzer
- 14 mapping errors → comparing sentences
Error Reporting
Format: ERROR_ID | ERROR_POSITION | ERROR_CORRECTION_INFO
Example sentence: She is too week to carry the bag.
e.g., CONFUSABLE_WORD_ERROR | 4 | weak
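
The three-field report format above is simple enough to parse mechanically. The following is a minimal sketch, not the authors' implementation: the `ErrorReport` class and `parse_error` function are hypothetical names, and the handling of spans ("7-9"), lists ("1,3"), and parenthesized positions ("(7)") is inferred from the examples on the slides rather than from a documented grammar.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ErrorReport:
    error_id: str
    positions: Tuple[int, ...]   # word positions covered by the error
    parenthesized: bool          # True when the position was written as "(7)"
    correction: Optional[str]    # None when the third field is empty


def parse_error(line: str) -> ErrorReport:
    """Parse one 'ERROR_ID | ERROR_POSITION | ERROR_CORRECTION_INFO' record."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    error_id, pos, correction = parts

    # Assumption: parentheses only flag the position; their meaning is not
    # spelled out on the slides, so the flag is stored without interpretation.
    parenthesized = pos.startswith("(") and pos.endswith(")")
    if parenthesized:
        pos = pos[1:-1]

    if "-" in pos:                       # a span such as "7-9"
        start, end = (int(x) for x in pos.split("-"))
        positions = tuple(range(start, end + 1))
    else:                                # "4" or a list such as "1,3"
        positions = tuple(int(x) for x in pos.split(","))

    return ErrorReport(error_id, positions, parenthesized, correction or None)


print(parse_error("CONFUSABLE_WORD_ERROR | 4 | weak"))
```

Expanding a span into individual positions makes it straightforward to test later whether two errors touch the same word.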

Issue
Correct answer: She is too weak to carry the bag.
Student answer: She is too weak to carry the her bag.
→ Teacher’s assessment: ‘her’ has to be omitted. A single error has been detected.
Error detection result produced by the system:
  EXTRA_DET_ERROR | 7-9 | → syntactic processing phase
  UNNECESSARY_NODE_ERROR | 8 | (her) → mapping processing phase
System’s assessment: treated them as two distinct errors.

Error Example
Correct answer: She is a teacher who came to our school last week.
Student answer: She is a teacher who come school last weak.
Error reporting, by phase:
Word error:
  CONFUSABLE_WORD_ERROR | 9 | week
Syntactic errors:
  SUBJ_VERB_AGR_ERROR | 3-7 |
  VERB_SUBCAT_ERROR | 6-7 |
Mapping errors:
  TENSE_UNMATCHED_ERROR | 6 | came
  OPTIONAL_NODE_MISSING_ERROR | (7) | to
  OPTIONAL_NODE_MISSING_ERROR | (8) | our
→ One of the errors has to be removed!

Redundant Errors
A pair of errors is determined to be redundant if it satisfies all three of the following conditions:
- COND1: Sharing an error position
- COND2: Detected in different processing phases
- COND3: Dealing with the same linguistic phenomenon
Objectives
- To remove one of the redundant errors
- To improve the accuracy of the system
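
COND1 and COND2 can be checked mechanically, whereas COND3 (same linguistic phenomenon) needs linguistic judgment. The sketch below therefore only flags candidate pairs. The `Err` type, the phase labels, and the function names are hypothetical, and "sharing a position" is assumed to mean that the two position sets overlap.

```python
from typing import FrozenSet, NamedTuple


class Err(NamedTuple):
    error_id: str
    positions: FrozenSet[int]   # word positions covered by the error
    phase: str                  # assumed labels: "word", "syntactic", "mapping"


def shares_position(a: Err, b: Err) -> bool:
    """COND1: the two errors cover at least one common position."""
    return bool(a.positions & b.positions)


def different_phase(a: Err, b: Err) -> bool:
    """COND2: the two errors come from different processing phases."""
    return a.phase != b.phase


def is_candidate_pair(a: Err, b: Err) -> bool:
    """COND1 and COND2 only; COND3 is left to later steps."""
    return shares_position(a, b) and different_phase(a, b)


# The pair from the "Issue" slide: a span 7-9 and a single position 8.
extra_det = Err("EXTRA_DET_ERROR", frozenset({7, 8, 9}), "syntactic")
unnecessary = Err("UNNECESSARY_NODE_ERROR", frozenset({8}), "mapping")
print(is_candidate_pair(extra_det, unnecessary))
```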

Deciding Redundant Errors
14,892 sentences with errors detected by the system
→ Filtering by COND #1 & #2: 150,419 pairs of errors; 657 pairs of error ID
→ Filtering by PMI & RFC: 29,588 pairs of errors; 111 pairs of error ID
→ Filtering by human experts:
    20 pairs of error ID: redundant
    47 pairs of error ID: non-redundant
    44 pairs of error ID: redundant or non-redundant → Deciding by Decision Tree

Deciding Redundant Errors (1)
Filtering by COND #1 & #2
- COND1: Sharing an error position
- COND2: Detected in different processing phases
Input:
- 14,892 test-takers’ sentences scored by the system
- All the possible pairs of errors which could occur in a sentence
Output:
- 150,419 pairs of errors passed the filter
- 657 pairs of error ID
Error format: ERROR_ID | ERROR_POSITION | ERROR_CORRECTION
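
This step distinguishes pair occurrences (150,419) from distinct pairs of error ID (657). A minimal sketch of that bookkeeping follows, assuming each sentence's errors are available as `(error_id, positions, phase)` tuples; the function name and data layout are illustrative, not taken from the system.

```python
from collections import Counter
from itertools import combinations


def candidate_id_pairs(sentences):
    """Count, per pair of error IDs, how often the pair passes COND1 and COND2.

    sentences: iterable of lists of (error_id, positions, phase) tuples,
    one list per scored sentence (a hypothetical layout).
    """
    pair_counts = Counter()
    for errors in sentences:
        for (id1, pos1, ph1), (id2, pos2, ph2) in combinations(errors, 2):
            overlaps = bool(set(pos1) & set(pos2))        # COND1
            if overlaps and ph1 != ph2:                   # COND2
                pair_counts[tuple(sorted((id1, id2)))] += 1
    return pair_counts


sentences = [
    [("EXTRA_DET_ERROR", range(7, 10), "syntactic"),
     ("UNNECESSARY_NODE_ERROR", [8], "mapping")],
    [("SUBJ_VERB_AGR_ERROR", range(3, 8), "syntactic"),
     ("TENSE_UNMATCHED_ERROR", [6], "mapping"),
     ("CONFUSABLE_WORD_ERROR", [9], "word")],
]
counts = candidate_id_pairs(sentences)
print(sum(counts.values()), "pair occurrences;", len(counts), "distinct ID pairs")
```

In this layout, the sum of the counts corresponds to the number of pairs of errors and the number of keys to the number of pairs of error ID.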

Deciding Redundant Errors (2)
Filtering using thresholds on PMI & RFC [Su et al., 1994]
Input: 657 pairs of error ID from the previous step
Filtering:
- Pointwise Mutual Information (PMI): simultaneous error occurrences
- Relative Frequency Count (RFC): frequency of averaged occurrence
Output: 111 pairs of error ID
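
The slide does not give the formulas or the thresholds, so the following sketch rests on stated assumptions: PMI is taken in its standard form, log2 of P(a, b) / (P(a) P(b)), with probabilities estimated as counts over a total number of observations, and RFC is read as a pair's count divided by the average count over all pairs. The threshold parameters `pmi_min` and `rfc_min` are hypothetical.

```python
import math


def pmi(pair_count, count_a, count_b, total):
    """Pointwise mutual information of two error IDs co-occurring."""
    p_ab = pair_count / total
    p_a = count_a / total
    p_b = count_b / total
    return math.log2(p_ab / (p_a * p_b))


def rfc(pair_counts):
    """Relative frequency count: each count divided by the mean count."""
    mean = sum(pair_counts.values()) / len(pair_counts)
    return {pair: c / mean for pair, c in pair_counts.items()}


def filter_pairs(pair_counts, id_counts, total, pmi_min, rfc_min):
    """Keep the ID pairs whose PMI and RFC both reach the given thresholds."""
    rfc_scores = rfc(pair_counts)
    kept = []
    for (a, b), c in pair_counts.items():
        strong = pmi(c, id_counts[a], id_counts[b], total) >= pmi_min
        frequent = rfc_scores[(a, b)] >= rfc_min
        if strong and frequent:
            kept.append((a, b))
    return kept


# Toy counts, invented purely for illustration.
pair_counts = {("A", "B"): 10, ("A", "C"): 2}
id_counts = {"A": 20, "B": 20, "C": 50}
print(filter_pairs(pair_counts, id_counts, total=100, pmi_min=0.0, rfc_min=1.0))
```

Combining the two scores removes pairs that co-occur only by chance (low PMI) as well as pairs that are too rare to judge (low RFC).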

Deciding Redundant Errors (3)
Filtering by human experts
Background of the experts:
- Junior high school English teachers
- With linguistics knowledge
- With 10 or more years of teaching experience
Input: 111 pairs of error ID
Output: errors categorized into 3 classes

Deciding Redundant Errors (4)
3 error classes (class, pairs of errors, action)

“redundant” (20 pairs of error ID). Action: remove one of the errors.
  (DET_NOUN_CV_ERR, DET_UNMATCHED_ERR)
  (EXTRA_DET_ERR, DET_UNMATCHED_ERR)
  (MODIFIER_COMP_ERR, FORM_UNMATCHED_ERR)
  (MISSPELLING_ERR, LEXICAL_ERR)
  …

“non-redundant” (47 pairs of error ID). Action: keep both errors.
  (SUBJ_VERB_AGR_ERR, TENSE_UNMATCHED_ERR)
  (AUX_MISSING_ERR, UNNECESSARY_NODE_ERR)
  (CONJ_MISSING_ERR, DET_UNMATCHED_ERR)

“yet to be decided” (44 pairs of error ID). Action: none; additional information is needed to decide.
  (VERB_FORM_ERR, ASPECT_UNMATCHED_ERR)
  (VERB_ING_FORM_ERR, TENSE_UNMATCHED_ERR)
  (EXTRA_PREP_ERR, UNNECESSARY_NODE_ERR)
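
The expert classification amounts to a lookup from a pair of error IDs to an action. The sketch below includes only the pairs named on the slide (the full lists of 20, 47, and 44 pairs are not reproduced there); the `action_for` function and the use of unordered pairs are illustrative choices.

```python
# Only the example pairs shown on the slide; the complete lists are not given.
REDUNDANT = {
    frozenset({"DET_NOUN_CV_ERR", "DET_UNMATCHED_ERR"}),
    frozenset({"EXTRA_DET_ERR", "DET_UNMATCHED_ERR"}),
    frozenset({"MODIFIER_COMP_ERR", "FORM_UNMATCHED_ERR"}),
    frozenset({"MISSPELLING_ERR", "LEXICAL_ERR"}),
}
NON_REDUNDANT = {
    frozenset({"SUBJ_VERB_AGR_ERR", "TENSE_UNMATCHED_ERR"}),
    frozenset({"AUX_MISSING_ERR", "UNNECESSARY_NODE_ERR"}),
    frozenset({"CONJ_MISSING_ERR", "DET_UNMATCHED_ERR"}),
}
YET_TO_BE_DECIDED = {
    frozenset({"VERB_FORM_ERR", "ASPECT_UNMATCHED_ERR"}),
    frozenset({"VERB_ING_FORM_ERR", "TENSE_UNMATCHED_ERR"}),
    frozenset({"EXTRA_PREP_ERR", "UNNECESSARY_NODE_ERR"}),
}


def action_for(id1: str, id2: str) -> str:
    """Return the action for a pair of error IDs, ignoring their order."""
    pair = frozenset({id1, id2})
    if pair in REDUNDANT:
        return "remove one of the errors"
    if pair in NON_REDUNDANT:
        return "keep both errors"
    if pair in YET_TO_BE_DECIDED:
        return "needs additional information"
    return "not listed"


print(action_for("LEXICAL_ERR", "MISSPELLING_ERR"))
```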

Deciding Redundant Errors (5)
For the 44 “yet to be decided” pairs:
- Additional information is needed to determine whether they are redundant
- Using a decision tree
- Extracting decision rules

Deciding Redundant Errors (6)
Features for decision tree learning, for a pair of errors (E1, E2)
Feature: Description
- Shared_length: length of shared words in E1 & E2 / total words in a shorter sentence
- Non_shared_length: length of non-shared words in E1 & E2 / total words in a shorter sentence
- E1 Correction_Info: error correction information of E1
- E2 Correction_Info: error correction information of E2
- Edit_distance: edit distance between Correction_Info strings of E1 & E2
- E1 pos: error position of error E1
- E2 pos: error position of error E2
- Diff_error_pos: difference of error positions of E1 and E2
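
Some of these features can be computed directly from two error reports. The sketch below covers the correction-info, position, and edit-distance features only; Shared_length and Non_shared_length are left out because their exact definition is not clear from the slide. It assumes Levenshtein distance for Edit_distance and an absolute difference for Diff_error_pos, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def pair_features(e1_pos: int, e1_info: str, e2_pos: int, e2_info: str) -> dict:
    """Build a feature dictionary for a pair of errors (E1, E2)."""
    return {
        "E1_Correction_Info": e1_info,
        "E2_Correction_Info": e2_info,
        "Edit_distance": edit_distance(e1_info, e2_info),
        "E1_pos": e1_pos,
        "E2_pos": e2_pos,
        # Assumption: the difference is taken as an absolute value.
        "Diff_error_pos": abs(e1_pos - e2_pos),
    }


# Two reports on the same position with different correction information.
print(pair_features(6, "went[3S]", 6, "went[past]"))
```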

Examples of Decision Rules
E1=CONJ_MISSING_ERR, E2=OPTIONAL_NODE_MISSING_ERR
  If E2.Correction_Info=‘conj’ and E2.pos=1 then redundant error
E1=EXTRA_PREP_ERR, E2=UNNECESSARY_NODE_ERR
  If E2.Correction_Info=‘prep’ and E2.pos=1
E1=VERB_SUBCAT_ERR
  If diff_error_pos <=3 and E2.Correction_Info={‘prep’, ‘adv’} then redundant error
E1=VERB_ING_FORM_ERR, E2=TENSE_UNMATCHED_ERR
  If E2.Correction_Info=‘verb-ing’
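
Rules of this kind translate naturally into predicates keyed by the pair of error IDs. The following sketch encodes three of the four examples. Two points are assumptions rather than facts from the slide: the transcript shows no conclusion for the EXTRA_PREP_ERR and VERB_ING_FORM_ERR rules, so they are assumed here to conclude "redundant" like the others, and the VERB_SUBCAT_ERR rule is omitted because its E2 is not named. The `Err` type and `is_redundant` function are hypothetical.

```python
from typing import NamedTuple


class Err(NamedTuple):
    error_id: str
    pos: int
    correction_info: str   # assumed to hold a category such as "conj" or "prep"


# Each rule maps an (E1 id, E2 id) pair to a condition on the two errors.
RULES = {
    ("CONJ_MISSING_ERR", "OPTIONAL_NODE_MISSING_ERR"):
        lambda e1, e2: e2.correction_info == "conj" and e2.pos == 1,
    # Assumed to conclude "redundant"; the transcript omits the conclusion.
    ("EXTRA_PREP_ERR", "UNNECESSARY_NODE_ERR"):
        lambda e1, e2: e2.correction_info == "prep" and e2.pos == 1,
    # Assumed to conclude "redundant"; the transcript omits the conclusion.
    ("VERB_ING_FORM_ERR", "TENSE_UNMATCHED_ERR"):
        lambda e1, e2: e2.correction_info == "verb-ing",
}


def is_redundant(e1: Err, e2: Err) -> bool:
    """Apply the rule for this ID pair, if any; unknown pairs are not redundant."""
    rule = RULES.get((e1.error_id, e2.error_id))
    return bool(rule and rule(e1, e2))


e1 = Err("CONJ_MISSING_ERR", 1, "")
e2 = Err("OPTIONAL_NODE_MISSING_ERR", 1, "conj")
print(is_redundant(e1, e2))
```

Keeping the rules in a table makes it easy to add or retract a rule as the decision tree is retrained.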

Evaluation
- Scoring 200 unseen student sentences by the system
- Overall system performance improved by 2.6%
- Reducing the gap between human scoring and machine scoring
Accuracy by class:
- Class of “redundant” (20 pairs of error ID): 94.1%
- Class of “non-redundant” (47 pairs of error ID): 98.0%
- Class of “yet to be decided” (44 pairs of error ID, decided by decision tree): 82.3%

Conclusion
- Improvement was achieved by collaborating with human experts
- Overall accuracy of the system has been improved

Thank you!

Cannot be decided yet
(Ex4)
Correct answer: I don’t know why she went there.
Student answer: I don’t know why she go to their.
Word error:
  Err1: CONFUSABLE_WORD_ERR | 8 | there
Syntactic errors:
  Err2: SUBJ_VERB_AGR_ERR | 6 | went[3S]
  Err3: EXTRA_PREP_ERR | 6-8 |
Mapping errors:
  Err4: UNNECESSARY_NODE_ERR | 7 | (to)
  Err5: TENSE_UNMATCHED_ERR | 6 | went[past]

Cannot be decided yet (cont’d)
(Ex5)
Correct answer: Would you like to come?
Student answer: you go to home?
Word error:
  Err1: FIRST_WORD_CASE_ERR | 1 |
Syntactic error:
  Err2: EXTRA_PREP_ERR | 3-4 |
Mapping errors:
  Err3: OBLIGATORY_NODE_MISSING_ERR | (1,3) | Would _ like
  Err4: UNNECESSARY_NODE_ERR | 4 | (home)
  Err5: LEXICAL_ERR | 2 | come