ELIS-DSSP, Sint-Pietersnieuwstraat 41, B-9000 Gent — SPACE symposium, 6/2/09

Language modelling (word FST)
Operational model for categorizing mispronunciations:
- step 1: decode the visual image (prompted image = 'circus')
- step 2: convert graphemes to phonemes
- step 3: articulate the phonemes
A spoken utterance is correct, a miscue (step 3 fails) or an error (step 1 or 2 fails).
Examples for the target 'circus':
- correct: (circus) (/s i r k y s/)
- step 1 error: (cursus) (/k y r s y s/)
- step 2 error: (circus) (/k i r k y s/)
- step 3 miscues: restart (/k y - k y r s y s/), spelling (/ka - i - Er - k y s/)
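The three-step model above can be sketched as a small classifier; this is a minimal illustration of the slide's taxonomy (the function name and boolean interface are my own, not part of the system):

```python
def categorize(step1_ok, step2_ok, step3_ok):
    """Map the outcome of the three reading steps to the slide's
    categories: failures in steps 1-2 (decoding / grapheme-to-phoneme
    conversion) are errors, failures in step 3 (articulation) are miscues."""
    if step1_ok and step2_ok and step3_ok:
        return "correct"
    if not (step1_ok and step2_ok):
        return "error"   # wrong word decoded, or wrong g2p conversion
    return "miscue"      # right phonemes, faulty articulation

# Example: the child decodes 'circus' as 'cursus' -> step 1 fails
print(categorize(False, True, True))
```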
Language modelling (word FST)
Prevalence of errors of different types (CHOREC data):

mispronunciation category    normal children    children with reading deficiency
Step 1 – all errors               43%                 51%
  – real words                    16%                 26%
Step 2 (g2p errors)               16%                  5%
Step 3 (restart, spell)           29%                 41%

Children with RD tend to guess more often.
It is important to model steps 1 and 3; step 2 is less important.
Creation of the word FST: modelling step 1
- the correct pronunciation
- predictable errors (a prediction model is needed)
[Figure: FST for the target 'start' (/s t a r t/) with weighted error branches such as /s t A r t/ and /s t r a t/, logP = -5.8 and logP = -7.2]
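A step-1 word FST can be approximated as a set of weighted branches; this is a minimal sketch in which the logP values come from the slide but the exact pairing of weights to phoneme variants is my reading of the figure:

```python
# Hypothetical weighted branches for the target 'start': the correct
# pronunciation plus predicted decoding errors with log-probability
# weights, as in the slide's FST figure.
branches = {
    ("s", "t", "a", "r", "t"): 0.0,    # correct pronunciation
    ("s", "t", "A", "r", "t"): -5.8,   # predicted error (logP = -5.8)
    ("s", "t", "r", "a", "t"): -7.2,   # predicted error (logP = -7.2)
}

def classify(observed):
    """Label an observed phoneme sequence against the word FST."""
    if observed not in branches:
        return "unmodelled"
    return "correct" if branches[observed] == 0.0 else "predicted error"

print(classify(("s", "t", "r", "a", "t")))
```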
Creation of the word FST: modelling step 3
Per branch in the previous FST:
- correctly articulated
- restarts (fixed probabilities for now)
- spelling (phonemic) (fixed probabilities for now)
[Figure: branch for /s t r a t/ with a letter-naming spelling path /Es - te - Er - a - te/]
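The per-branch expansion can be sketched as follows; the restart and spelling probabilities are fixed in the slide's model, but the particular values used here are assumptions for illustration:

```python
P_RESTART, P_SPELL = 0.05, 0.02   # assumed fixed probabilities

def expand_branch(phonemes):
    """Expand one FST branch into (path, probability) pairs:
    the fluent reading, restarts (a partial prefix followed by the
    full attempt), and letter-by-letter spelling with pauses ('-')."""
    n = len(phonemes)
    paths = [(tuple(phonemes), 1.0 - P_RESTART - P_SPELL)]
    for i in range(1, n):  # restart after i phonemes, then read fully
        paths.append((tuple(phonemes[:i]) + ("-",) + tuple(phonemes),
                      P_RESTART / (n - 1)))
    spelled = tuple(p for ph in phonemes for p in (ph, "-"))[:-1]
    paths.append((spelled, P_SPELL))
    return paths

paths = expand_branch(["s", "t", "r", "a", "t"])
```

The branch probabilities sum to one, so the expansion stays a proper distribution over realizations of the branch.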
Modelling image decoding errors
Model 1: memory model
- adopted in the LISTEN project
- per target word, create a list of the errors found in the database; keep those with P(list entry = error | TW) > TH
- advantages:
  - very simple strategy
  - can model real-word and non-real-word errors
- disadvantages:
  - cannot model unseen errors
  - probably low precision
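The memory model's selection rule can be sketched directly; the example error list for 'circus' is hypothetical, not CHOREC data:

```python
from collections import Counter

def memory_model(observed_errors, threshold):
    """Keep error forms whose relative frequency given the target word
    exceeds the threshold: P(list entry = error | TW) > TH."""
    counts = Counter(observed_errors)
    total = sum(counts.values())
    return {e for e, c in counts.items() if c / total > threshold}

# Hypothetical database entries for the target word 'circus'
errors = ["cursus", "cursus", "cirkel", "sirkus", "cursus"]
print(memory_model(errors, threshold=0.3))   # -> {'cursus'}
```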
Modelling image decoding errors
Model 2: extrapolation model (idea from..)
- look for existing words that are expected to belong to the vocabulary of the child (= mental lexicon) and bear a good resemblance to the target word
- select lexicon entries from that vocabulary
  - feature based: expose (dis)similarities with the TW
  - features: length differences, alignment agreement, word categories, graphemes in common, …
  - decision tree: P(entry = decoding error | features); keep those with P > TH
- advantage: can model errors not previously seen
- disadvantage: can only model real-word errors
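The feature extraction feeding the decision tree can be illustrated with a toy feature vector; the three features below are a small subset of the slide's list, and their definitions here are my own simplifications:

```python
def features(target, candidate):
    """Illustrative (dis)similarity features between a mental-lexicon
    entry and the target word. In the real system such features feed a
    decision tree estimating P(entry = decoding error | features)."""
    return {
        "length_diff": abs(len(target) - len(candidate)),
        "common_graphemes": len(set(target) & set(candidate)),
        "same_first_letter": target[0] == candidate[0],
    }

print(features("circus", "cursus"))
```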
Modelling image decoding errors
Model 3: rule-based model (under development)
- look for frequently observed transformations at the subword level
  - grapheme deletions, insertions, substitutions (e.g. d → b)
  - grapheme inversions (e.g. leed → deel)
  - combinations
- learn a decision tree per transformation
- advantages:
  - more generic
  - better recall/precision compromise
  - can model real-word and non-real-word errors
- disadvantage: more complex and time-consuming to train
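The transformation types can be sketched as a candidate generator; the tiny alphabet is an assumption for illustration, and full-word reversal stands in for the slide's inversion example:

```python
def edit_candidates(word, alphabet="abdelr"):
    """Generate single-grapheme deletions, insertions and substitutions
    (covering confusions like d -> b), plus the reversal that covers
    inversions such as 'leed' -> 'deel'."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    insertions = {a + c + b for a, b in splits for c in alphabet}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inversions = {word[::-1]}
    return deletions | insertions | substitutions | inversions

candidates = edit_candidates("leed")
print("deel" in candidates, "beed" in candidates)
```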
Modelling results so far
Measures (over target words with an error):
- recall = nr of predicted errors / total nr of errors
- precision = nr of predicted errors / nr of predictions
- F-rate = 2·R·P / (R + P)
- branch = average nr of predictions per word
Data: test set from the CHOREC database

model           recall (%)   precision (%)   F-rate   branch
memory              28           15.8          0.20      1.8
extrapolation       23            7.7          0.10      2.9
combination         35           15.2          0.23      2.0
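The measures above can be checked with a short computation; the absolute counts below are invented to reproduce the memory-model row's percentages (28% recall, 15.8% precision), since the table gives only rates:

```python
def prf(n_predicted_errors, n_errors_total, n_predictions):
    """Recall, precision and F-rate as defined on the slide."""
    recall = n_predicted_errors / n_errors_total
    precision = n_predicted_errors / n_predictions
    f_rate = 2 * recall * precision / (recall + precision)
    return recall, precision, f_rate

# Hypothetical counts matching the memory-model row: 28/100 = 28% recall,
# 28/177 ~= 15.8% precision
r, p, f = prf(28, 100, 177)
print(round(f, 2))
```

The resulting F-rate of about 0.20 matches the table's memory-model entry.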