Customized Spell Corrector
Aviad Ashkenazi, Matan Zinger
March 2012
Agenda
- Overview of Dysgraphia
- Short overview of Natural Language Processing (NLP)
- Using NLP to solve Dysgraphia symptoms
- The Dyspeller application
- Demonstration
Overview – Cognitive Writing Process
- Dysgraphia may be caused by “damage” to any of these modules.
Overview – Different Types of Dysgraphia
- Surface Dysgraphia: damage in the lexical flow
  - The sub-lexical flow is used instead
  - Symptoms: replacing homophonic letters, difficulty with irregular words
  - No mistakes appear for univalent words
  - Similar symptoms appear in children without dysgraphia
- Phonological Dysgraphia: damage in the sub-lexical flow
  - Difficulty in writing unfamiliar words (which require translating phonemes into graphemes)
  - No mistakes when the lexical flow is used (e.g. for familiar words)
- Peripheral Dysgraphia: damage in the grapheme buffer
  - Word length is one of the most critical factors
  - Symptoms: reordering internal letters, doubling letters, omitting letters
Overview – Natural Language Processing
- Purpose: machine understanding of human-generated text
- Common terminology:
  - Tokenization
  - Lemmatization / Stemming
  - “Stop Words”
  - Part of Speech Tagging
  - Text Search, TF-IDF
  - Levenshtein Distance
    - Used for spell checking / fuzzy search
    - Suggestions are ranked by their distance
  - Semantic Understanding
- Popular open-source library: Lucene.NET
  - Provides many generic NLP capabilities
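To make the Levenshtein-based ranking concrete, here is a minimal Python sketch. It is not taken from the application or from Lucene.NET; the function names `levenshtein` and `rank_suggestions` and the `max_distance` cutoff are assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Fewest single-character insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def rank_suggestions(word, dictionary, max_distance=2):
    """Return dictionary words within max_distance, nearest first (hypothetical helper)."""
    scored = sorted((levenshtein(word, w), w) for w in dictionary)
    return [w for d, w in scored if d <= max_distance]


print(rank_suggestions("leter", ["letter", "later", "table"]))
```

Note that plain Levenshtein distance charges two edits for swapping two neighbouring letters (`levenshtein("form", "from")` is 2), so a typical reordering error is ranked no better than two unrelated typos.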
NLP to Solve Dysgraphia Symptoms
- Regular spell checker
  - For which cases will it work well?
  - Is it good enough for Dysgraphia?
- Customized spell checker
  - How will it work?
  - What is required?
  - Isn’t it better?
- Symptoms we chose to handle
  - Homophonic replacement of letters (“Dyscravia”)
  - Doubling letters (Grapheme Buffer Dysgraphia)
  - Changing internal order (Grapheme Buffer Dysgraphia)
The Dyspeller Application
- Classification Module
  - Uses a series of tests (presented as a “game”)
  - Determines the user's “Dysgraphia Profile”: the symptoms common for that user
- Personalized Spell Checker
  - For every misspelled word, we look for the nearest correct word
  - The search is done not by Levenshtein distance, but by “Personalized Dysgraphia Distance”
  - The distance between two words is the number of Dysgraphia symptoms, typical for this specific user, that must be fixed in order to generate word A from word B
- Publishing Module
  - The corrected text can be sent via SMS or email to any of the user's contacts
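One possible reading of the “Personalized Dysgraphia Distance” is a breadth-first search that counts symptom fixes. The Python sketch below works under stated assumptions and is not the application's actual code: the voicing pairs are illustrative English ones, each fix changes a single position, and the names `SYMPTOM_FIXES`, `dysgraphia_distance` and `max_fixes` are hypothetical.

```python
from collections import deque

# Illustrative voicing pairs only; the application's real table is not given in the slides.
VOICING_PAIRS = {"b": "p", "p": "b", "d": "t", "t": "d", "g": "k",
                 "k": "g", "v": "f", "f": "v", "z": "s", "s": "z"}


def fix_double_letter(word):
    """Yield words with one copy of an adjacent doubled letter removed."""
    for i in range(len(word) - 1):
        if word[i] == word[i + 1]:
            yield word[:i] + word[i + 1:]


def fix_internal_reorder(word):
    """Yield words with two adjacent internal letters swapped (first and last stay put)."""
    for i in range(1, len(word) - 2):
        if word[i] != word[i + 1]:
            yield word[:i] + word[i + 1] + word[i] + word[i + 2:]


def fix_phonetic_replace(word):
    """Yield words with one letter replaced by its voicing counterpart."""
    for i, ch in enumerate(word):
        if ch in VOICING_PAIRS:
            yield word[:i] + VOICING_PAIRS[ch] + word[i + 1:]


SYMPTOM_FIXES = {
    "double_letter": fix_double_letter,
    "internal_reorder": fix_internal_reorder,
    "phonetic_replace": fix_phonetic_replace,
}


def dysgraphia_distance(misspelled, target, profile, max_fixes=3):
    """Fewest fixes, using only symptoms in `profile`, turning misspelled into target.

    Returns None when the target is not reachable within max_fixes fixes.
    """
    if misspelled == target:
        return 0
    seen = {misspelled}
    queue = deque([(misspelled, 0)])
    while queue:
        word, steps = queue.popleft()
        if steps == max_fixes:
            continue  # do not expand beyond the allowed number of fixes
        for symptom in profile:
            for candidate in SYMPTOM_FIXES[symptom](word):
                if candidate == target:
                    return steps + 1
                if candidate not in seen:
                    seen.add(candidate)
                    queue.append((candidate, steps + 1))
    return None


print(dysgraphia_distance("foorm", "from", ["double_letter", "internal_reorder"]))
```

Because only the symptoms in the user's profile are searched, the same misspelling can be near a word for one user and unreachable for another: with a profile containing only `double_letter`, “form” has no path to “from” and the function returns `None`.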
Dyspeller - Design
Suggestion Processing – Calculating Dysgraphia Distance
- HTTP/GET
- Suggestions by symptoms
  - Double Letter Symptom
  - Internal Reorder Symptom
  - Phonetic Replace Symptom
- Valid Words Data Set
- Response: misspelled word -> suggestions list (JSON format)
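The request/response flow in this design can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the `word` query parameter, the `build_response` function, and the single example generator are hypothetical, and the real service's parameter names and JSON layout are not given in the slides beyond “misspelled word -> suggestions list”.

```python
import json
from urllib.parse import parse_qs


def double_letter_candidates(word):
    """Example symptom generator: drop one copy of each adjacent doubled letter."""
    return {word[:i] + word[i + 1:]
            for i in range(len(word) - 1) if word[i] == word[i + 1]}


# Other symptom generators (internal reorder, phonetic replace) would plug in here.
GENERATORS = {"double_letter": double_letter_candidates}


def build_response(query_string, valid_words, generators):
    """Turn a GET query string into a JSON body mapping the word to its suggestions."""
    params = parse_qs(query_string)
    word = params.get("word", [""])[0].lower()  # "word" is a hypothetical parameter name
    suggestions = []
    if word not in valid_words:  # only misspelled words need suggestions
        for generate in generators.values():
            for candidate in sorted(generate(word)):
                # keep only candidates found in the valid-words data set
                if candidate in valid_words and candidate not in suggestions:
                    suggestions.append(candidate)
    return json.dumps({word: suggestions})


print(build_response("word=lettter", {"letter", "later"}, GENERATORS))
```

Keeping the generators in a dictionary keyed by symptom name means a user's profile could select which ones run for a given request.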
DEMO
Thank You.
References:
- Gvion, Friedmann, Yachini (2008). Dysgraphia.
- Aviah Gvion, Naama Friedmann (2009). Letter position dysgraphia.
- Aviah Gvion, Naama Friedmann (2010). Dyscravia: Voicing substitution dysgraphia.