NAACL-HLT 2010 June 5, 2010 Jee Eun Kim (HUFS) & Kong Joo Lee (CNU)


1 A Human-Computer Collaboration Approach to Improve Accuracy of an Automated English Scoring System
NAACL-HLT 2010, June 5, 2010
Jee Eun Kim (HUFS) & Kong Joo Lee (CNU)

2 Outline
Overview of the system
Issue: redundant errors
Solution: introducing a method to determine redundant errors
Evaluation
Conclusion

3 Procedure of Automated Scoring System
Teacher (question database):
  Question: 그녀는 방과 후에 축구를 했다. (Korean for "She played soccer after school.")
  Correct answers:
    She played soccer after school.
    She played soccer after school is over.
Student:
  Input: She play footboll.
Automated scoring system (scoring result and feedback):
  Score: 3 points out of 6
  (1) error in number agreement (play → plays|played)
  (2) misspelling (footboll → football)
  (3) tense mismatching (play → played)
  (4) missing elements "after school"

4 Automated English Scoring System
Scores a single sentence, not an essay
Target users: junior high school students learning English as a second language
Calculates a score based on:
  the number of errors
  the types of errors

5 System Overview
[Figure: system architecture]
Inputs: a student's answer; a set of correct answers
Intra-sentential error detection module: morphological analyzer (word errors) and syntactic analyzer (syntactic errors), backed by a lexicon (lexical information, syntactic rules, synonyms); produces dependency structures
Inter-sentential error detection module: comparing sentences & calculating similarity (mapping errors)
Output: a scoring result & diagnostic feedback

6 Errors
76 error types to be detected by the system:
  16 word errors → morphological analyzer
  46 syntactic errors → syntactic analyzer
  14 mapping errors → comparing sentences
Error Reporting
  Format: ERROR_ID | ERROR_POSITION | ERROR_CORRECTION_INFO
  Example sentence: She is too week to carry the bag.
  e.g., CONFUSABLE_WORD_EROR | 4 | weak
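The three-field report format above is simple enough to parse mechanically. The following is a minimal sketch, not the authors' implementation: the names `ErrorReport`, `parse_positions`, and `parse_report` are hypothetical, and it assumes that a position field is a single index (`4`), a range (`7-9`), or a parenthesized list (`(7)`, `(1,3)`) as seen on the slides. The meaning of the parentheses is not stated in the slides, so the sketch simply drops them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorReport:
    """One line of the form ERROR_ID | ERROR_POSITION | ERROR_CORRECTION_INFO."""
    error_id: str
    positions: frozenset  # word positions covered by the error
    correction: str       # may be empty when no correction is reported


def parse_positions(field: str) -> frozenset:
    """Parse '4', '7-9', '(7)' or '(1,3)' into a set of word positions.

    Assumption: parentheses are treated as decoration and removed.
    """
    text = field.strip().strip("()")
    positions = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            positions.update(range(start, end + 1))  # inclusive range
        else:
            positions.add(int(part))
    return frozenset(positions)


def parse_report(line: str) -> ErrorReport:
    """Split a report line on '|' and build an ErrorReport."""
    error_id, position, correction = (f.strip() for f in line.split("|"))
    return ErrorReport(error_id, parse_positions(position), correction)


report = parse_report("CONFUSABLE_WORD_EROR | 4 | weak")
print(report.error_id, sorted(report.positions), report.correction)
```

Storing positions as a set makes it easy to test later whether two errors overlap, whatever the original notation was.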

7 Issue
Correct Answer: She is too weak to carry the bag.
Student Answer: She is too weak to carry the her bag.
Teacher's assessment: 'her' has to be omitted → a single error has been detected
Error detection result produced by the system:
  EXTRA_DET_ERROR | 7-9 | → syntactic processing phase
  UNNECESSARY_NODE_ERROR | 8 | (her) → mapping processing phase
System's assessment: treated them as two distinct errors

8 Error Example
Correct Answer: She is a teacher who came to our school last week.
Student Answer: She is a teacher who come school last weak.
Error Reporting, by phase:
  Word error:
    CONFUSABLE_WORD_EROR | 9 | week
  Syntactic errors:
    SUBJ_VERB_AGR_ERROR | 3-7 |
    VERB_SUBCAT_ERROR | 6-7 |
  Mapping errors:
    TENSE_UNMATCHED_ERROR | 6 | came
    OPTIONAL_NODE_MISSING_ERROR | (7) | to
    OPTIONAL_NODE_MISSING_ERROR | (8) | our
→ One of the errors has to be removed!

9 Redundant Errors
A pair of errors is determined to be redundant if it satisfies all 3 of the following conditions:
  COND1: Sharing an error position
  COND2: Detected from different process phases
  COND3: Dealing with the same linguistic phenomenon
Objectives
  To remove one of the redundant errors
  To improve the accuracy of the system

10 Deciding Redundant Errors
[Figure: filtering pipeline]
14,892 sentences with errors detected by the system
→ Filtering by Cond #1 & #2: 150,419 pairs of errors; 657 pairs of error ID
→ Filtering by PMI & RFC: 29,588 pairs of errors; 111 pairs of error ID
→ Filtering by human experts:
    redundant: 20 pairs of error ID
    non-redundant: 47 pairs of error ID
    redundant or non-redundant: 44 pairs of error ID → Deciding by Decision Tree

11 Deciding Redundant Errors (1)
Filtering by COND #1 & #2
  COND1: Sharing an error position
  COND2: Detected from different process phases
Input
  14,892 test-takers' sentences scored by the system
  All the possible pairs of errors which could occur in a sentence
Output
  150,419 pairs of errors were filtered
  657 pairs of error ID
Error format: ERROR_ID | ERROR_POSITION | ERROR_CORRECTION
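The COND1 and COND2 filter can be sketched as a pass over all error pairs in each sentence. This is an illustration under stated assumptions rather than the system's actual code: `DetectedError`, `candidate_pairs`, and `count_id_pairs` are hypothetical names, each error is assumed to carry the phase that detected it ("word", "syntactic", or "mapping"), and "sharing an error position" is read as the two position sets overlapping. Counting pairs under an alphabetically sorted ID pair is also a choice made here, not something the slides specify.

```python
from collections import Counter
from itertools import combinations
from typing import NamedTuple


class DetectedError(NamedTuple):
    error_id: str
    phase: str            # "word", "syntactic" or "mapping" (assumed labels)
    positions: frozenset  # word positions covered by the error


def candidate_pairs(errors):
    """Yield every pair of errors in one sentence that meets COND1 and COND2."""
    for e1, e2 in combinations(errors, 2):
        shares_position = bool(e1.positions & e2.positions)  # COND1
        different_phase = e1.phase != e2.phase               # COND2
        if shares_position and different_phase:
            yield e1, e2


def count_id_pairs(sentences):
    """Count candidate pairs per pair of error IDs over all sentences."""
    counts = Counter()
    for errors in sentences:
        for e1, e2 in candidate_pairs(errors):
            counts[tuple(sorted((e1.error_id, e2.error_id)))] += 1
    return counts


# The "Issue" example: the two reports overlap at position 8.
issue_example = [
    DetectedError("EXTRA_DET_ERROR", "syntactic", frozenset({7, 8, 9})),
    DetectedError("UNNECESSARY_NODE_ERROR", "mapping", frozenset({8})),
]
print(count_id_pairs([issue_example]))
```

The number of individual pairs corresponds to the sum of the counts, and the number of pairs of error ID corresponds to the number of distinct keys.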

12 Deciding Redundant Errors (2)
Filtering using thresholds of PMI & RFC [Su et al., 1994]
Input
  657 pairs of error ID from the previous step
Filtering
  Pointwise Mutual Information (PMI): simultaneous error occurrences
  Relative Frequency Count (RFC): frequency of averaged occurrence
Output
  111 pairs of error ID
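The slides do not give the formulas or thresholds, so the following is only a sketch under assumptions. PMI uses its standard definition, log2(P(a,b) / (P(a) P(b))), with probabilities estimated from counts. RFC is assumed here to be a pair's count divided by the average count over all pairs, which is one reading of "frequency of averaged occurrence". The function names, the threshold parameters `pmi_min` and `rfc_min`, and the rule that a pair is kept only when both scores reach their thresholds are all hypothetical.

```python
import math


def pmi(pair_count: int, count_a: int, count_b: int, total: int) -> float:
    """Standard PMI in bits: log2( P(a,b) / (P(a) * P(b)) )."""
    p_ab = pair_count / total
    p_a = count_a / total
    p_b = count_b / total
    return math.log2(p_ab / (p_a * p_b))


def rfc(pair_counts: dict) -> dict:
    """Assumed RFC: each pair's count divided by the mean count of all pairs."""
    mean = sum(pair_counts.values()) / len(pair_counts)
    return {pair: count / mean for pair, count in pair_counts.items()}


def filter_pairs(pair_counts, id_counts, total, pmi_min, rfc_min):
    """Keep the ID pairs whose PMI and RFC both reach the given thresholds."""
    relative = rfc(pair_counts)
    kept = []
    for (a, b), count in pair_counts.items():
        score = pmi(count, id_counts[a], id_counts[b], total)
        if score >= pmi_min and relative[(a, b)] >= rfc_min:
            kept.append((a, b))
    return kept
```

PMI rewards pairs that co-occur more often than chance would predict, while the relative count guards against pairs that score highly only because both errors are rare.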

13 Deciding Redundant Errors (3)
Filtering by human experts
Background of the experts
  Junior high school English teachers
  With knowledge of linguistics
  With 10 or more years of teaching experience
Input
  111 pairs of error ID
Output
  Pairs of errors categorized into 3 classes

14 Deciding Redundant Errors (4)
3 error classes

Class: "redundant" (20 pairs of error ID)
  Pairs of errors:
    (DET_NOUN_CV_ERR, DET_UNMATCHED_ERR)
    (EXTRA_DET_ERR, DET_UNMATCHED_ERR)
    (MODIFIER_COMP_ERR, FORM_UNMATCHED_ERR)
    (MISSPELLING_ERR, LEXICAL_ERR)
  Action: Remove one of the errors

Class: "non-redundant" (47 pairs of error ID)
  Pairs of errors:
    (SUBJ_VERB_AGR_ERR, TENSE_UNMATCHED_ERR)
    (AUX_MISSING_ERR, UNNECESSARY_NODE_ERR)
    (CONJ_MISSING_ERR, DET_UNMATCHED_ERR)
  Action: Keep both errors

Class: "yet to be decided" (44 pairs of error ID)
  Pairs of errors:
    (VERB_FORM_ERR, ASPECT_UNMATCHED_ERR)
    (VERB_ING_FORM_ERR, TENSE_UNMATCHED_ERR)
    (EXTRA_PREP_ERR, UNNECESSARY_NODE_ERR)
  Action: None; need additional information to decide

15 Deciding Redundant Errors (5)
For the 44 "yet to be decided" pairs
  Need additional information to determine whether they are redundant or not
  Using a decision tree
  Extracting decision rules

16 Deciding Redundant Errors (6)
Features for decision tree learning
For a pair of errors (E1, E2):
  Feature: Description
  Shared_length: length of shared words in E1 & E2 / total words in a shorter sentence
  Non_shared_length: length of non-shared words in E1 & E2 / total words in a shorter sentence
  E1 Correction_Info: Error correction information of E1
  E2 Correction_Info: Error correction information of E2
  Edit_distance: Edit distance between Correction_Info strings of E1 & E2
  E1 pos: Error position of error E1
  E2 pos: Error position of error E2
  Diff_error_pos: Difference of error positions of E1 and E2
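Several of these features can be computed directly from two error reports. The sketch below is illustrative only: `edit_distance` is a standard Levenshtein distance, which the slides do not confirm as the exact variant used, and `pair_features` is a hypothetical helper that assumes each error has a single integer position and that Diff_error_pos is an absolute difference. Shared_length and Non_shared_length are left out because the slides do not define "shared words" precisely enough to reproduce.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def pair_features(e1: dict, e2: dict) -> dict:
    """Build part of the feature vector for a pair of errors (E1, E2).

    Each error is a dict with keys "pos" (int) and "correction" (str);
    this record shape is an assumption made for the example.
    """
    return {
        "E1_Correction_Info": e1["correction"],
        "E2_Correction_Info": e2["correction"],
        "Edit_distance": edit_distance(e1["correction"], e2["correction"]),
        "E1_pos": e1["pos"],
        "E2_pos": e2["pos"],
        "Diff_error_pos": abs(e1["pos"] - e2["pos"]),
    }


features = pair_features({"pos": 3, "correction": ""},
                         {"pos": 6, "correction": "came"})
print(features["Edit_distance"], features["Diff_error_pos"])
```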

17 Examples of Decision Rules
E1=CONJ_MISSING_ERR, E2=OPTIONAL_NODE_MISSING_ERR
  If E2.Correction_Info='conj' and E2.pos=1 then redundant error
E1=EXTRA_PREP_ERR, E2=UNNECESSARY_NODE_ERR
  If E2.Correction_Info='prep' and E2.pos=1
E1=VERB_SUBCAT_ERR
  If diff_error_pos <=3 and E2.Correction_Info={'prep', 'adv'} then redundant error
E1=VERB_ING_FORM_ERR, E2=TENSE_UNMATCHED_ERR
  If E2.Correction_Info='verb-ing'
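Rules of this kind translate naturally into a small predicate. The sketch below encodes only the two rules whose conclusion ("then redundant error") is explicit in the transcript. It is a hedged illustration: `is_redundant` and the dict record shape are hypothetical, Diff_error_pos is assumed to be an absolute difference, and because the transcript does not name E2 for the VERB_SUBCAT_ERR rule, that rule does not check E2's error ID.

```python
def is_redundant(e1: dict, e2: dict) -> bool:
    """Return True if one of the encoded rules marks (E1, E2) as redundant.

    Each error is a dict with keys "id", "pos" (int) and "correction" (str).
    """
    diff_error_pos = abs(e1["pos"] - e2["pos"])

    if (e1["id"] == "CONJ_MISSING_ERR"
            and e2["id"] == "OPTIONAL_NODE_MISSING_ERR"):
        return e2["correction"] == "conj" and e2["pos"] == 1

    if e1["id"] == "VERB_SUBCAT_ERR":
        # E2's error ID is not given in the transcript, so it is not checked.
        return diff_error_pos <= 3 and e2["correction"] in {"prep", "adv"}

    return False  # no encoded rule applies; keep both errors


e1 = {"id": "VERB_SUBCAT_ERR", "pos": 6, "correction": ""}
e2 = {"id": "OPTIONAL_NODE_MISSING_ERR", "pos": 7, "correction": "prep"}
print(is_redundant(e1, e2))
```

Returning False by default matches the conservative action for non-redundant pairs, which is to keep both errors.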

18 Evaluation
Scoring 200 unseen student sentences by the system
Overall system performance improved by 2.6%, reducing the gap between human scoring and machine scoring
Accuracy by class:
  Class of "redundant" (20 pairs of error ID): 94.1%
  Class of "non-redundant" (47 pairs of error ID): 98.0%
  Class of "yet to be decided" (44 pairs of error ID, deciding by decision tree): 82.3%

19 Conclusion
Improvement was achieved by collaborating with human experts
Overall accuracy of the system has been improved

20 Thank you!

21 Cannot be decided yet
(Ex4)
Correct answer: I don't know why she went there.
Student answer: I don't know why she go to their.
Word error:
  Err1: CONFUSABLE_WORD_ERR|8|there
Syntactic errors:
  Err2: SUBJ_VERB_AGR_ERR|6|went[3S]
  Err3: EXTRA_PREP_ERR|6-8|
Mapping errors:
  Err4: UNNECESSARY_NODE_ERR|7|(to)
  Err5: TENSE_UNMATCHED_ERR|6|went[past]

22 Cannot be decided yet (cont'd)
(Ex5)
Correct answer: Would you like to come?
Student answer: you go to home?
Word error:
  Err1: FIRST_WORD_CASE_ERR|1|
Syntactic error:
  Err2: EXTRA_PREP_ERR|3-4|
Mapping errors:
  Err3: OBLIGATORY_NODE_MISSING_ERR|(1,3)|Would _ like
  Err4: UNNECESSARY_NODE_ERR|4|(home)
  Err5: LEXICAL_ERR|2|come

