Published by Angelina Simmons. Modified over 9 years ago.
Measuring Language Development in Children: A Case Study of Grammar Checking in Child Language Transcripts
Khairun-nisa Hassanali and Yang Liu
{nisa, yangl}@hlt.utdallas.edu
The University of Texas at Dallas
The 6th Workshop on Innovative Use of NLP for Building Educational Applications

1. Summary
- Grammatical errors are analyzed in child language transcripts.
- Focus on automatic detection of 6 types of grammatical errors using rule based and statistical systems.
- Statistical system outperforms rule based systems in most of the cases.

2. The Larger Problem
- Measuring language development in children.
- Measures such as the Index of Productive Syntax measure language competence but don’t take into account a child’s grammar deficiencies.
- Automatic grammar checking will allow clinicians to analyze a child’s grammar deficiencies in addition to competence.
- Given a child language transcript, answer the following question: Does the child make more grammatical mistakes than an average Typically Developing (TD) child?

3. Grammatical Errors
- Used the Paradise Data Set: 677 transcripts (623 TD children, 54 Language Impaired (LI) children); 108,711 utterances and 394,290 words, with a mean length of utterance of 3.64.
- Annotated transcripts for 10 types of grammatical errors.
- Found that more LI children than TD children made the grammatical mistake at least once.

Error | Example | % (Count) | % of LI children / % of TD children making error
Missing auxiliary | You talking to me? | 8.43 (641) | 75
Missing copulae | She lovely. | 36.67 (2788) | 77.78 / 45
Subject-auxiliary agreement | You is talking to me. | 6.31 (480) | 40.74 / 35
Incorrect auxiliary verb used | She does dead girl. | 0.71 (54) | 11.473
Missing verb | She her a book. | 5 (380) | 29.63 / 10
Wrong verb usage | He love dogs. | 14.59 (1109) | 68.550
Missing preposition | The book is the table. | 5 (380) | 7.45
Missing article | She ate apple. | 3.97 (302) | 29.63 / 35
Missing subject | I know loves me. | 7.69 (585) | 3.75
Missing infinitive marker “To” | I give it her. | 1.58 (120) | 7.511.67
Other errors | The put. | 10.05 (764) | 56.723.2

4. Automatic Grammar Checking
- Created rule based systems and statistical systems using 2 sets of features to detect the following 6 types of errors: misuse of -ing participle, missing copulae, subject-auxiliary agreement, missing verb, wrong verb usage and missing infinitive marker “To”.
- Focused on verb related errors, since LI children have more problems with verb usage than TD children.
- Constructed one rule based classifier, one alternating decision tree classifier and one naïve Bayes classifier for each error category.
- Rule based classifiers were constructed using regular expressions based on parse tree structure.
- Alternating decision tree classifiers used rules as features.
- Naïve Bayes classifiers used a variety of other features, such as bigrams, skip bigrams and other syntactic features, depending on the error category.
- Serially applied all the classifiers to detect grammatical errors.

5. Experimental Results
- Performed 10 fold cross validation using the naïve Bayes and alternating decision tree classifiers from WEKA.
- Used the alternating decision tree classifier from the WEKA toolkit with rules as features.

Error | Rule based (P/R) F1 | Decision tree (P/R) F1 | Naïve Bayes (P/R) F1
Misuse of -ing participle | (0.984/0.978) 0.981 | (0.986/1) 0.993 | (0.736/0.929) 0.821
Missing copulae | (0.885/0.9) 0.892 | (0.912/0.94) 0.926 | (0.82/0.86) 0.84
Missing verb | (0.875/0.932) 0.903 | (0.92/0.89) 0.905 | (0.87/0.91) 0.9
Subject-auxiliary agreement | (0.855/0.932) 0.888 | (0.95/0.84) 0.892 | (0.89/0.934) 0.912
Subject-verb agreement | (0.883/0.945) 0.892 | (0.92/0.877) 0.898 | (0.91/0.914) 0.912
Missing infinitive marker “To” | (0.97/0.954) 0.962 | (0.94/0.84) 0.887 | (0.95/0.88) 0.914
Overall | (0.935/0.923) 0.929 | (0.945/0.965) 0.955 | (0.956/0.978) 0.967

6. Conclusion
- Automatically detected grammatical errors in child language transcripts.
- In all cases, we had a recall higher than 84%.
- Classifiers that used features other than rules performed the best, with an F1-measure of 0.967.
- LI children made more grammatical mistakes than TD children on most error categories.

7. Future Work
- Use the grammatical errors as features for detecting language impairment.
- Enhance the system to detect other grammatical errors, such as missing article.
- Create a language development score that takes into account grammatical errors made by a child.
- Take into account dialect specific errors for grammar checking.
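
The naïve Bayes classifiers are described as using bigrams and skip bigrams among their features. A minimal sketch of how such features could be extracted from an utterance is given here. It assumes whitespace tokenization and a skip distance of one token by default; neither choice is stated on the poster, and the function names are illustrative rather than taken from the authors' system.

```python
from typing import List, Tuple

Pair = Tuple[str, str]


def bigrams(tokens: List[str]) -> List[Pair]:
    """Return pairs of adjacent tokens."""
    return list(zip(tokens, tokens[1:]))


def skip_bigrams(tokens: List[str], max_skip: int = 1) -> List[Pair]:
    """Return ordered token pairs with 1..max_skip tokens skipped between them.

    Assumption: adjacent pairs are excluded here because bigrams() already
    covers them. Other definitions of skip bigrams include them.
    """
    pairs: List[Pair] = []
    for i, left in enumerate(tokens):
        for gap in range(1, max_skip + 1):
            j = i + gap + 1  # index of the token after skipping `gap` tokens
            if j < len(tokens):
                pairs.append((left, tokens[j]))
    return pairs


# "Missing verb" example utterance from the error table.
utterance = "She her a book".split()
print(bigrams(utterance))       # [('She', 'her'), ('her', 'a'), ('a', 'book')]
print(skip_bigrams(utterance))  # [('She', 'a'), ('her', 'book')]
```

Pairs like these would typically be turned into binary or count features per utterance before being passed to a classifier.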
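
The F1 values in the results table are the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes the overall F1 of each system from the reported (P/R) pairs. The function and variable names are illustrative and do not come from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall (precision, recall) pairs as reported in the results table.
overall = {
    "rule based": (0.935, 0.923),
    "decision tree": (0.945, 0.965),
    "naive Bayes": (0.956, 0.978),
}

for system, (p, r) in overall.items():
    print(f"{system}: F1 = {f1_score(p, r):.3f}")
```

Rounded to three decimals, this gives 0.929, 0.955 and 0.967, which matches the Overall row of the table.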