Language modelling (word FST): an operational model for categorizing mispronunciations
ELIS-DSSP, Sint-Pietersnieuwstraat 41, B-9000 Gent
SPACE symposium, 6/2/09

Slide 1: Language modelling (word FST) - Operational model for categorizing mispronunciations
- Step 1: decode the visual image (prompted image = 'circus')
- Step 2: convert graphemes to phonemes
- Step 3: articulate the phonemes
- The spoken utterance is either correct, a miscue (step 3) or an error (steps 1, 2)
- Examples for the prompted word 'circus':
  - correct: (circus) /s i r k y s/
  - step 1 error (wrong word decoded): (cursus) /k y r s y s/
  - step 2 error (g2p): (circus) /k i r k y s/
  - step 3 miscues (restart, spelling): /k y - k y r s y s/, /ka - i - Er - k y s/
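Read as a decision cascade, the three steps make the categories concrete. The sketch below is purely illustrative and assumes that the decoded word, the intended phonemes and the produced phonemes are already known (in the real system they are latent and must be inferred by the recognizer); the restart example is invented so that it stays within step 3.

```python
def categorize(target_word, target_phones, decoded_word, intended_phones, produced_phones):
    """Assign a reading attempt to one of the model's categories."""
    if decoded_word != target_word:
        return "error (step 1: wrong word decoded)"
    if intended_phones != target_phones:
        return "error (step 2: grapheme-to-phoneme)"
    if produced_phones != intended_phones:
        return "miscue (step 3: restart, spelling, ...)"
    return "correct"


# The 'circus' examples from the slide, in the notation used there.
target_word, target_phones = "circus", "s i r k y s".split()

print(categorize(target_word, target_phones, "cursus",
                 "k y r s y s".split(), "k y r s y s".split()))    # step 1 error
print(categorize(target_word, target_phones, "circus",
                 "k i r k y s".split(), "k i r k y s".split()))    # step 2 error
print(categorize(target_word, target_phones, "circus", target_phones,
                 "s i - s i r k y s".split()))                     # step 3 miscue (invented restart)
```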

Slide 2: Language modelling (word FST) - Prevalence of errors of different types (Chorec data)

  mispronunciation category      normal children    children with reading deficiency
  Step 1 - all errors                 43%                   51%
  Step 1 - real words only            16%                   26%
  Step 2 (g2p errors)                 16%                    5%
  Step 3 (restart, spell)             29%                   41%

- Children with a reading deficiency tend to guess more often
- It is important to model steps 1 and 3; step 2 is not so important

Slide 3: Creation of the word FST - model step 1
- correct pronunciation
- predictable errors (a prediction model is needed)
- [FST figure: branches for the correct pronunciation (s t a r t) and for predicted errors (s t r a t, t A r s t), each weighted with a log probability (logP = -5.8, logP = -7.2)]
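As an illustration of how such a step-1 word FST could be assembled, here is a minimal Python sketch that represents the branches as (phoneme sequence, log probability) pairs rather than using a real FST toolkit; the probability split between the correct pronunciation and the predicted errors is an assumption, not a figure taken from the slide.

```python
import math


def build_step1_branches(target_phones, predicted_errors, p_correct=0.8):
    """Collect the step-1 branches of a word FST as (phoneme tuple, logP) pairs.

    target_phones: phonemes of the correct pronunciation.
    predicted_errors: dict mapping an erroneous phoneme tuple to a relative
        weight produced by some error-prediction model.
    p_correct: probability mass reserved for the correct pronunciation
        (an illustrative figure, not taken from the slide).
    """
    branches = [(tuple(target_phones), math.log(p_correct))]
    total = sum(predicted_errors.values()) or 1.0
    for phones, weight in predicted_errors.items():
        branches.append((phones, math.log((1.0 - p_correct) * weight / total)))
    return branches


# The word 'start' from the slide's figure, with two predicted errors.
for phones, logp in build_step1_branches(
        ["s", "t", "a", "r", "t"],
        {("s", "t", "r", "a", "t"): 0.6, ("t", "A", "r", "s", "t"): 0.4}):
    print(" ".join(phones), f"logP = {logp:.2f}")
```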

Slide 4: Creation of the word FST - model step 3
Per branch in the previous FST:
- correctly articulated
- restarts (fixed probabilities for now)
- spelling (phonemic) (fixed probabilities for now)
- [FST figure: example expansion of the branch s t r a t, including its phonemic spelling Es te Er a te]
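A sketch of how each step-1 branch could be expanded with these step-3 alternatives. The slide only says the probabilities are fixed for now; the actual values and the letter-name table below are illustrative assumptions.

```python
import math

# Fixed step-3 probabilities ("fixed probabilities for now" on the slide);
# the values themselves are illustrative assumptions.
P_CORRECT, P_RESTART, P_SPELL = 0.90, 0.06, 0.04

# Hypothetical phonemic letter-name table for the graphemes in the example.
LETTER_NAMES = {"s": "Es", "t": "te", "r": "Er", "a": "a"}


def expand_step3(phones, branch_logp):
    """Expand one step-1 branch into its step-3 alternatives: correct
    articulation, a restart (first phonemes repeated), and a phonemic spelling."""
    restart = list(phones[:2]) + ["-"] + list(phones)          # e.g. s t - s t r a t
    spelling = [LETTER_NAMES.get(ph, ph) for ph in phones]     # e.g. Es te Er a te
    return [
        (list(phones), branch_logp + math.log(P_CORRECT)),
        (restart,      branch_logp + math.log(P_RESTART)),
        (spelling,     branch_logp + math.log(P_SPELL)),
    ]


for phones, logp in expand_step3(("s", "t", "r", "a", "t"), -5.8):
    print(" ".join(phones), f"logP = {logp:.1f}")
```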

Slide 5: Modelling image decoding errors - Model 1: memory model
- adopted in Project LISTEN
- per target word (TW): create a list of the errors found in the database, and keep those with P(list entry = error | TW) > TH (a threshold)
- advantages:
  - very simple strategy
  - can model real-word as well as non-real-word errors
- disadvantages:
  - cannot model unseen errors
  - probably low precision
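The memory model essentially boils down to counting and thresholding. A hedged sketch, assuming the database can be reduced to (target word, produced error) pairs:

```python
from collections import Counter


def memory_model(observed_errors, threshold=0.05):
    """Build a per-target-word error list from a database of reading attempts.

    observed_errors: iterable of (target_word, produced_form) pairs in which
        the produced form differs from the target, i.e. observed errors.
    Returns {target_word: [forms with P(form = error | target_word) > threshold]}.
    """
    counts = {}
    for target, produced in observed_errors:
        counts.setdefault(target, Counter())[produced] += 1

    model = {}
    for target, errs in counts.items():
        total = sum(errs.values())
        model[target] = [form for form, c in errs.items() if c / total > threshold]
    return model


# Toy example with invented observations (not Chorec data).
db = [("circus", "cursus"), ("circus", "cirkus"), ("circus", "cursus")]
print(memory_model(db, threshold=0.3))   # {'circus': ['cursus', 'cirkus']}
```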

Slide 6: Modelling image decoding errors - Model 2: extrapolation model (idea from ..)
- look for existing words that are expected to belong to the child's vocabulary (= the mental lexicon) and that bear a good resemblance to the target word
- select lexicon entries from that vocabulary, feature based:
  - features expose (dis)similarities with the TW: length differences, alignment agreement, word categories, graphemes in common, ...
  - a decision tree estimates P(entry = decoding error | features); keep the entries with P > TH
- advantage: can model errors that were not previously seen
- disadvantage: can only model real-word errors
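A sketch of the extrapolation idea, assuming scikit-learn's DecisionTreeClassifier as the decision tree and a deliberately simplified feature set (the slide's alignment-agreement and word-category features are omitted); the training pairs and candidate words are invented.

```python
from sklearn.tree import DecisionTreeClassifier


def features(target, candidate):
    """Simplified (dis)similarity features between a target word and a
    candidate entry from the child's mental lexicon."""
    return [
        abs(len(target) - len(candidate)),       # length difference
        len(set(target) & set(candidate)),       # graphemes in common
        int(target[0] == candidate[0]),          # same initial grapheme
    ]


# Tiny invented training set: (target, candidate, was the candidate actually
# produced as a decoding error for that target?).
train = [
    ("circus", "cursus", 1),
    ("circus", "cactus", 0),
    ("start", "straat", 1),
    ("start", "stoel", 0),
]
X = [features(t, c) for t, c, _ in train]
y = [label for _, _, label in train]

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Keep candidates whose estimated P(decoding error | features) exceeds TH.
TH = 0.5
candidates = ["cursus", "cirkel", "cactus"]
kept = [c for c in candidates
        if tree.predict_proba([features("circus", c)])[0][1] > TH]
print(kept)
```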

Slide 7: Modelling image decoding errors - Model 3: rule-based model (under development)
- look for frequently observed transformations at the subword level:
  - grapheme deletions, insertions and substitutions (e.g. d -> b)
  - grapheme inversions (e.g. leed -> deel)
  - combinations of these
- learn a decision tree per transformation
- advantages:
  - more generic, hence a better recall/precision compromise
  - can model real-word as well as non-real-word errors
- disadvantage: more complex and more time-consuming to train
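A sketch of the candidate-generation side of this model; the rule set is reduced to one substitution pair and full reversals, and the per-transformation decision trees that would filter the candidates are left out.

```python
def substitution_candidates(word, pairs=(("d", "b"), ("b", "d"))):
    """Apply single-grapheme substitutions (e.g. d -> b) at every position."""
    out = set()
    for i, ch in enumerate(word):
        for src, dst in pairs:
            if ch == src:
                out.add(word[:i] + dst + word[i + 1:])
    return out


def inversion_candidates(word):
    """Full reversals such as leed -> deel (one simple kind of inversion)."""
    reversed_word = word[::-1]
    return {reversed_word} if reversed_word != word else set()


def rule_based_candidates(word):
    """Combine the transformation rules; in the full model a decision tree per
    transformation would decide which generated candidates to keep."""
    return substitution_candidates(word) | inversion_candidates(word)


print(sorted(rule_based_candidates("leed")))   # ['deel', 'leeb']
print(sorted(rule_based_candidates("bad")))    # ['bab', 'dab', 'dad']
```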

Slide 8: Modelling results so far
Measures (over the target words with an error):
- recall = number of correctly predicted errors / total number of errors
- precision = number of correctly predicted errors / number of predictions
- F-rate = 2RP / (R + P)
- branch = average number of predictions per word

Data: test set from the Chorec database

  model            recall (%)    precision (%)    F-rate    branch
  memory
  extrapolation
  combination
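Assuming the predictions and the observed errors are available as sets per target word, the four measures can be computed as in this sketch (toy data, not Chorec results):

```python
def evaluate(predicted, observed):
    """Compute the slide's measures over the target words with errors.

    predicted: {target_word: set of predicted error forms}
    observed:  {target_word: set of errors actually found for that word}
    """
    n_errors = sum(len(v) for v in observed.values())
    n_predictions = sum(len(v) for v in predicted.values())
    n_hits = sum(len(predicted.get(w, set()) & errs) for w, errs in observed.items())

    recall = n_hits / n_errors if n_errors else 0.0
    precision = n_hits / n_predictions if n_predictions else 0.0
    f_rate = 2 * recall * precision / (recall + precision) if recall + precision else 0.0
    branch = n_predictions / len(predicted) if predicted else 0.0
    return recall, precision, f_rate, branch


# Toy check with invented predictions and observations.
pred = {"circus": {"cursus", "cirkel"}, "start": {"straat"}}
obs = {"circus": {"cursus"}, "start": {"straat", "staart"}}
print(evaluate(pred, obs))   # (0.666..., 0.666..., 0.666..., 1.5)
```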