1
Grammar correction – Data collection interface
David Ling
2
Contents
Background
Data collection plan
Data collection system (progress report)
3
Background
Project in charge: Holly Chung, Amy Kwok, Anora Wong (ENG)
Target: English error correction for HK students; highlight good practices (not well defined yet)
Method: Chollampatt, 2018; Fairseq (Facebook) + language model (KenLM)
Example: "Math lessons use English." → "Math lessons are conducted in English."
Better than traditional rule-based checkers (e.g. Word, Grammarly)
4
Grammar correction data set
LANG-8 (Japanese social website)
NUCLE (National University of Singapore)
5
Grammar correction data set
Building a data set for Hong Kong students
Improvement on the checker: different sentence style, different error types
Literature value: statistical analysis of HK students' English
6
Grammar correction data set
Daniel Dahlmeier, Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English, ACL 2013
Course assignments
A wide range of topics, such as technology innovation or health care
7
Grammar correction data set
NUCLE data sample
28 error types: Verb tense, Subject-verb agreement, Article or Determiner, Noun Number, …
10 English instructors
7 months
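To make the idea of a tagged error concrete, here is a minimal sketch of one way a span-based annotation could be represented and applied. The `Annotation` class, its field names, the `apply_annotations` helper and the sample sentence are all hypothetical illustrations; this is not the NUCLE file format or the project's actual data structure. Only the two error type names are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One tagged error (hypothetical structure, not the NUCLE format)."""
    start: int        # character offset where the error span begins
    end: int          # character offset just past the error span
    error_type: str   # tag name, e.g. one of the error types listed above
    correction: str   # replacement text for the span


def apply_annotations(text, annotations):
    """Return text with every annotated span replaced by its correction.

    Assumes the spans do not overlap.
    """
    corrected = text
    # Work from right to left so the offsets of earlier spans stay valid.
    for ann in sorted(annotations, key=lambda a: a.start, reverse=True):
        corrected = corrected[:ann.start] + ann.correction + corrected[ann.end:]
    return corrected


# Invented sample sentence with two errors.
sentence = "He have two book."
annotations = [
    Annotation(3, 7, "Subject-verb agreement", "has"),
    Annotation(12, 16, "Noun Number", "books"),
]
print(apply_annotations(sentence, annotations))
```

Storing offsets plus a tag, rather than only the corrected sentence, keeps the error type available for later statistics on which errors occur most often.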
8
Data collection plan
Timeline proposed by Amy
Review and modify the tag sets (adding HK-style tags)
Hire teachers for tagging; proposed budget: HK$100 per essay × 2,000 essays = HK$200,000
Data set: internal use / open to public / commercial?
9
Marking tool
Marking tool = database (SQLite) + interface (PHP + JavaScript)
Database: essays + annotations
Currently implemented features: login, listing database essays, annotate, save and remove annotation
DEMO: Teacher
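As a rough illustration of the "essays + annotations" database, the sketch below sets up two SQLite tables and the list, save and remove operations mentioned above. It is written in Python with the standard `sqlite3` module purely for illustration; the actual tool uses PHP and JavaScript, and every table name, column name and function name here is an assumption rather than the project's real schema. Login is left out.

```python
import sqlite3

# Hypothetical schema: one row per essay, one row per tagged error span.
SCHEMA = """
CREATE TABLE IF NOT EXISTS essays (
    id   INTEGER PRIMARY KEY,
    body TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS annotations (
    id           INTEGER PRIMARY KEY,
    essay_id     INTEGER NOT NULL REFERENCES essays(id),
    teacher      TEXT NOT NULL,
    start_offset INTEGER NOT NULL,
    end_offset   INTEGER NOT NULL,
    error_type   TEXT NOT NULL,
    correction   TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_essay(conn, body):
    cur = conn.execute("INSERT INTO essays (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid


def list_essays(conn):
    return conn.execute("SELECT id, body FROM essays ORDER BY id").fetchall()


def save_annotation(conn, essay_id, teacher, start, end, error_type, correction):
    cur = conn.execute(
        "INSERT INTO annotations "
        "(essay_id, teacher, start_offset, end_offset, error_type, correction) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (essay_id, teacher, start, end, error_type, correction),
    )
    conn.commit()
    return cur.lastrowid


def remove_annotation(conn, annotation_id):
    conn.execute("DELETE FROM annotations WHERE id = ?", (annotation_id,))
    conn.commit()


def annotations_for(conn, essay_id):
    return conn.execute(
        "SELECT id, start_offset, end_offset, error_type, correction "
        "FROM annotations WHERE essay_id = ? ORDER BY start_offset",
        (essay_id,),
    ).fetchall()


# Example session using an in-memory database.
conn = open_db()
essay_id = add_essay(conn, "Math lessons use English.")
ann_id = save_annotation(
    conn, essay_id, "teacher1", 13, 16, "Word choice", "are conducted in"
)  # "Word choice" is a made-up tag name
print(annotations_for(conn, essay_id))
remove_annotation(conn, ann_id)
print(annotations_for(conn, essay_id))
```

Keeping annotations in their own table, keyed by essay and teacher, would let several teachers tag the same essay without overwriting each other.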