Hong Kong English in Students’ Writing



2 Hong Kong English in Students’ Writing
Holly Chung (ENG), David Ling (DLC)

3 Contents
Background: The team; The problem
System approaches: Statistical methods; Traditional rule-based methods; Deep learning methods
Preliminary results
Summary

4 Background – The team
Project in charge: Dr. Anora Wong, Dr. Holly Chung, Dr. Amy Kong
Parties/collaborators involved: HSMC English Department; HSMC Deep Learning Center (RGC funded); eClass
Project website:

5 Are you suffering from these common headaches as an English-language teacher?
A few classes’ writing flooding in on the same date?
Need to grade writing assignments from different forms?
One pile of writing JUST done, but then another pile coming right after?
Students submitting extra writing to you?

6 And we are only humans…

7 Messy to the eyes! Frustrating to the mind!

8 Unfamiliar word choice/ unnatural collocations
“Ignorance of the ingredients in pearl milk tea can kill consumers in a taciturn way.” Taciturn (adj): Speaking very little

9 Even MS Word may not be able to detect them…
“People may select digital wallets as the main payment manner.”
“The appearance of digital wallet may alter the lending payment method in Hong Kong.”

10 And what about some good practices?

11 Background – the problem
Grammatical errors and semantic errors (HK-style English) often appear in students’ writing.
“They engages the audience by different tactics.” => engages should be “engage”
“Both of them done a good job.” => done should be “did”

12 Background – the problem
Grammatical errors and semantic errors (HK-style English) often appear in students’ writing.
“I had a causal chat with Tim yesterday.” => causal should be “casual”
“Math lessons use English.” => use should be “are conducted in”
“He can say Chinese.” => say should be “speak”

13 Background – the problem
A system that facilitates teachers’ work by highlighting both grammatical and semantic errors and suggesting corrections.
Input (student sentences): “I have two Persian cat.” “Math lessons use English.”
Input => system => Output

14 I. Statistical methods – Comparing the writing with corpora
Online data allow machines to learn English.
Digital corpus: books, online newspaper articles, Wikipedia articles, online books, essays, papers, magazines, …

15 I. Statistical methods – Comparing the writing with corpora
Detection by tri-gram frequency
Tri-gram: a contiguous sequence of 3 words/tokens
Examples: “Peter goes to school every day.” “He can say Chinese.” (flagged: “can say Chinese”)

TRI-GRAM              COUNTS
_start_ peter goes    1955
peter goes to         1443
goes to school        67559
school every day      35998
_start_ he can
he can say            99710
can say chinese
say chinese .         57

COUNTS: the frequency in corpora
Low frequency: uncommon (bad style) or problematic
Corpora: Wikipedia 2007 articles (10GB), Google Books N-gram (20GB)
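The tri-gram check described on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the toy corpus, the function names, and the threshold of 2 are all assumptions, and a real system would look counts up in a large corpus such as those named on the slide.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase, split off punctuation, and prepend a sentence-start marker."""
    return ["_start_"] + re.findall(r"[a-z']+|[.,!?;]", sentence.lower())


def trigrams(tokens):
    """Return every contiguous sequence of 3 tokens."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def count_trigrams(sentences):
    """Count tri-gram frequencies over a list of corpus sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(trigrams(tokenize(sentence)))
    return counts


def flag_rare_trigrams(sentence, counts, threshold=2):
    """Return the tri-grams whose corpus count is below the (assumed) threshold."""
    return [t for t in trigrams(tokenize(sentence)) if counts[t] < threshold]


# Hypothetical toy corpus standing in for the large corpora on the slide.
TOY_CORPUS = ["He can speak Chinese."] * 3 + ["He can say that."] * 3
TOY_COUNTS = count_trigrams(TOY_CORPUS)

flagged = flag_rare_trigrams("He can say Chinese.", TOY_COUNTS)
print(flagged)
```

With this toy corpus, "he can say" is frequent enough to pass, while the tri-grams containing "say chinese" are flagged, which mirrors the low-count rows in the table above.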

16 I. Statistical methods – Comparing the writing with corpora
Detection by dependency relation frequency
Dependency relation: the relation between two words in a sentence
Example: “Those special viral video can gain the attentions of the audiences.”

TOKEN PAIR       RELATION             COUNTS
video  Those     determiner
       special   adjective modifier   135
       viral                          770
       gain      nominal subj         1
can              auxiliary verb       1732
attentions       direct object        5
.                punctuation          4820
the                                   762

Can “see” beyond tri-gram
Extracts and counts the relations in corpora
Corpora: Wikipedia 2018 articles, BNC, ANC, BBC news, Reuters
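A rough sketch of counting dependency relations and flagging rare ones is given below. It assumes the sentences have already been parsed into (head, dependent, relation) triples by some dependency parser; the slides do not name the parser, so the triples, the relation labels (nsubj, dobj, amod), and the threshold here are hand-written, hypothetical values.

```python
from collections import Counter


def count_relations(parsed_sentences):
    """Count (head, dependent, relation) triples over a parsed corpus."""
    counts = Counter()
    for triples in parsed_sentences:
        counts.update((h.lower(), d.lower(), r) for h, d, r in triples)
    return counts


def flag_rare_relations(triples, counts, threshold=2):
    """Return the triples seen fewer than `threshold` times in the corpus."""
    return [
        (h, d, r)
        for h, d, r in triples
        if counts[(h.lower(), d.lower(), r)] < threshold
    ]


# Hypothetical, hand-parsed toy corpus (a real system would parse large corpora).
PARSED_CORPUS = [
    [("attract", "video", "nsubj"), ("attract", "attention", "dobj"),
     ("video", "viral", "amod")],
    [("gain", "companies", "nsubj"), ("gain", "attention", "dobj"),
     ("video", "viral", "amod")],
    [("gain", "they", "nsubj"), ("gain", "attention", "dobj")],
]
RELATION_COUNTS = count_relations(PARSED_CORPUS)

# Hand-written triples for part of the student sentence on the slide.
student_triples = [
    ("video", "viral", "amod"),
    ("gain", "video", "nsubj"),
    ("gain", "attentions", "dobj"),
]
rare = flag_rare_relations(student_triples, RELATION_COUNTS)
print(rare)
```

Because the pair is counted regardless of how many words separate the two tokens in the sentence, this kind of check can catch an unusual subject-verb or verb-object pairing that a tri-gram window would miss.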

17 II. Rule-based methods Detection by matching error patterns
Implemented with LanguageTool (open-source software)
Examples:
Agreement errors (e.g., a plural noun followed by a singular verb)
Wrong prepositions (e.g., happened + to her/him)
Screenshot: an agreement error
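To illustrate the idea of matching error patterns, here is a small regex-based sketch. It does not use LanguageTool or its rule format; the two rules, their identifiers, and their messages are hypothetical and far cruder than real hand-crafted rules.

```python
import re

# Hypothetical rules for illustration only; not LanguageTool's rule set.
RULES = [
    (
        "AGREEMENT",
        re.compile(r"\b(?:they|we)\s+\w+s\b", re.IGNORECASE),
        "Plural subject followed by a word that looks like a singular verb",
    ),
    (
        "PREPOSITION",
        re.compile(r"\bhappened\s+(?:on|in|at|with)\s+(?:her|him)\b", re.IGNORECASE),
        "Use 'happened to her/him'",
    ),
]


def check(text):
    """Return (rule_id, matched_text, message) for every rule match in text."""
    findings = []
    for rule_id, pattern, message in RULES:
        for match in pattern.finditer(text):
            findings.append((rule_id, match.group(0), message))
    return findings


for finding in check("They engages the audience by different tactics."):
    print(finding)
```

A pattern this simple produces false positives (for example "they always"), which is one reason rule-based detection needs many exceptions and is labour intensive to maintain.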

18 III. Deep learning methods (In progress)
Previous methods: detection only; very labour intensive; many exceptions
Deep learning methods: understand the sentence meaning; rewrite the sentence or provide corrections

19 III. Deep learning methods (In progress)
Possible approaches
Neural network classifier: to resolve words that are easily confused, e.g., “causal” and “casual”
Neural machine translation: an unedited sentence is “translated” into a corrected sentence, e.g., “The models words.” => “The models work.”
A lot of data is needed for training and validation
Try to capture the sentence’s meaning in a vector, and attempt to generate more likely text
Reference: Schmaltz et al., “Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction”, Harvard University, 2016
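The confusable-word classifier can be illustrated with a very small stand-in. The slides do not specify an architecture, so the sketch below uses logistic regression (in effect a single-layer network) over bag-of-words context features; the training sentences, function names, learning rate, and step count are all hypothetical.

```python
import math
import re

# Hypothetical toy training data; "___" marks the slot of the confusable word.
TRAIN = [
    ("I had a ___ chat with Tim", "casual"),
    ("She wore ___ clothes to the party", "casual"),
    ("We had a ___ conversation at lunch", "casual"),
    ("Friday has a ___ dress code", "casual"),
    ("There is a ___ link between smoking and cancer", "causal"),
    ("The study found a ___ relationship between the variables", "causal"),
    ("The drug has a ___ effect on recovery", "causal"),
    ("Scientists explained the ___ mechanism behind the disease", "causal"),
]


def features(sentence):
    """Bag of lowercase context words (the '___' slot itself is ignored)."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def train(examples, positive="casual", lr=0.1, steps=300):
    """Fit logistic-regression weights with full-batch gradient ascent."""
    data = [(features(s), 1.0 if y == positive else 0.0) for s, y in examples]
    weights, bias = {}, 0.0
    for _ in range(steps):
        grad_w, grad_b = {}, 0.0
        for feats, label in data:
            z = bias + sum(weights.get(f, 0.0) for f in feats)
            err = label - 1.0 / (1.0 + math.exp(-z))
            grad_b += err
            for f in feats:
                grad_w[f] = grad_w.get(f, 0.0) + err
        bias += lr * grad_b
        for f, g in grad_w.items():
            weights[f] = weights.get(f, 0.0) + lr * g
    return weights, bias


def predict(sentence, weights, bias, positive="casual", negative="causal"):
    """Pick the word whose class the context words score higher for."""
    z = bias + sum(weights.get(f, 0.0) for f in features(sentence))
    return positive if z > 0 else negative


WEIGHTS, BIAS = train(TRAIN)
suggestion = predict("I had a ___ chat with Tim yesterday", WEIGHTS, BIAS)
print(suggestion)
```

If the student wrote “causal” in a slot where the classifier prefers “casual”, the system could highlight the word and offer the alternative. A practical classifier would need far more training data, as the slide notes.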

20 Preliminary results – An essay from HSMC students
Blue: by hand-crafted rules
Green: by tri-gram detection
Purple: by dependency relation
Results marked by the system: “an assumptions”; “tastemarkers”; “statistics”; “Those … videos”; “… than in the past.”
The system considers (video, gain) an uncommon pair.
“the unexpectedness one” is rare.

21 Preliminary results: Compared with the teacher’s marking
Human marked script

22 Preliminary results: Compared with the teacher’s marking
By system: By teacher:
The system captures some mistakes overlooked by the teacher.

23 Summary and the future
Summary: Correct errors and highlight good practices; provide data and statistics for teachers
On-going: Gather more students’ scripts (both annotated and raw scripts); attempt different deep-learning methods and analyze gathered data
Up-coming: Invite teachers and schools to participate; apply for the Quality Education Fund
For more:

24 The End Thank you very much!

