STARDUST PROJECT – Speech Recognition for People with Severe Dysarthria
Mark Parker, Specialist Speech and Language Therapist
Project Team
- DoH NEAT
- University of Sheffield
- Barnsley District General Hospital
- Prof P Enderby / M Parker – Clinical Speech Therapy
- Prof P Green / Dr Athanassios Hatzis – Computer Sciences
- Prof M Hawley / Dr Simon Brownsall – Medical Physics
What is Dysarthria?
- A neurological motor speech impairment characterised by slow, weak, imprecise and/or uncoordinated movements of the speech musculature.
- May be congenital or acquired.
- Prevalence: 170 per 100,000 (Emerson & Enderby 1995).
Severity Rating
- Typically based on ‘intelligibility’: ‘…the extent a listener understands the speech produced…’ (Yorkston et al, 1999)
- Not a pure measure – an interaction of events
- Mild: 70–90% intelligibility
- Moderate: 40–70%
- Severe: 10–40%
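The severity bands above amount to a simple lookup. A minimal Python sketch, assuming band boundaries are inclusive at their lower edge (the slide's overlapping ranges leave the boundary cases ambiguous):

```python
# Illustrative only: maps an intelligibility score (the percentage of words a
# listener understands) to the severity bands quoted on this slide.
# Assumption: boundaries are inclusive at their lower edge, since the quoted
# ranges (70-90, 40-70, 10-40) overlap at the edges.
def severity_band(intelligibility_pct: float) -> str:
    if intelligibility_pct >= 70:
        return "mild"        # 70-90%
    if intelligibility_pct >= 40:
        return "moderate"    # 40-70%
    if intelligibility_pct >= 10:
        return "severe"      # 10-40%
    return "outside quoted bands"
```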
Aim
- VRS used to access other technology.
- Many people with severe dysarthria will have an associated severe physical disability.
- ECA operated with switching systems is slow and laborious, and depends on positioning.
- VRS to supplement or replace switching.
Background
- Voice recognition systems
  - Commercially available packages: mobile phones, WP packages, Dragon Dictate
  - Continuous vs discrete recognition
- Normal speech: with recognition training, >90% recognition rates can be achieved (Rose and Galdo, 1999)
- Dysarthric speech: mild dysarthria gives 10–15% lower recognition rates (Ferrier, 1992), declining rapidly as speech deteriorates to 30–40% on single words (Thomas-Stonell, 1998) – functionally useless
Intelligibility vs Consistency
- There is a difference between machine recognition and human perception.
- ‘Normal’ speech may be 100% intelligible, with a narrow band of differences across time (consistency).
- ‘Severe’ dysarthria may be completely unintelligible but may show consistency of key elements (or not).
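One crude way to quantify the ‘narrow band of differences across time’ is the average pairwise distance between repeated productions of the same word: the lower the score, the more consistent the speaker. A minimal sketch, assuming each utterance has already been reduced to a fixed-length feature vector (real utterances vary in length and would need time alignment first, e.g. dynamic time warping); the function name and representation are illustrative, not the project's actual measure:

```python
from statistics import mean

def consistency_score(repeats):
    """Mean pairwise Euclidean distance between repeated productions of the
    same word, each reduced to a fixed-length feature vector (a simplifying
    assumption).  Lower = more consistent."""
    dists = []
    for i in range(len(repeats)):
        for j in range(i + 1, len(repeats)):
            d = sum((a - b) ** 2 for a, b in zip(repeats[i], repeats[j])) ** 0.5
            dists.append(d)
    return mean(dists)
```

A perfectly consistent speaker scores 0; the score grows as the repeated productions drift apart, regardless of whether any of them are intelligible.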
Development of the System
- 10–12 volunteers with severe dysarthria and physical disability
- Speech <30% intelligibility rating
- Video/DAT recording and computer sampling
- Assessing the range of phonetic contrasts that can be achieved
Development of the System (2)
- Discrete system: the number of contrasts that can be achieved determines the number of commands the VRS can handle
- Intelligibility is not needed – consistency is
- Determine which word/sound/phonetic contrast will represent which command
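The discrete, consistency-based idea can be illustrated as nearest-template matching: each command has one stored reference production, and a new utterance is assigned to whichever template it lies closest to. This is a hypothetical sketch only, not the STARDUST recogniser (which used neural networks and hidden Markov models); utterances are again simplified to fixed-length feature vectors:

```python
# Hypothetical sketch of a discrete command recogniser: the speaker does not
# need to be intelligible, only consistent enough that productions of each
# command stay closest to that command's own stored template.
def classify(utterance, templates):
    """templates: dict mapping command name -> reference feature vector.
    Returns the command whose template is nearest in Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(templates, key=lambda cmd: dist(utterance, templates[cmd]))
```

With this framing, the number of reliably separable contrasts directly caps the size of `templates`, i.e. the command vocabulary.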
Development of the System (3)
- Train the VRS: neural networks and hidden Markov modelling
- Speech consistency training
- Implement the system
Current Position
- Software development: a sophisticated recording and data-logging facility, to be combined with a ‘consistency’ measure and a spectrography package
- Developing ‘user-friendliness’ and the possibility of ‘remote’ usage
- Identifying and recording EC commands
- ‘Labelling’ the sample
- Attempting to define measures of baseline consistency at an ‘acoustic’ level
- Experimenting with the recognition accuracy of a commercially available product, Sicare
Labelling
- Breaking an utterance into component parts
- To establish the extent of variance over time
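One simple way to turn labelled segments into a ‘variance over time’ figure is to compare segment durations across repeated recordings of the same word. An illustrative sketch only, assuming each recording has been hand-labelled as (label, start, end) tuples; the function name and duration-based measure are assumptions, not the project's actual analysis:

```python
from statistics import pstdev

def segment_duration_variability(labelled_repeats):
    """labelled_repeats: list of recordings, each a list of
    (label, start_seconds, end_seconds) tuples from hand labelling.
    Returns, per label, the population standard deviation of segment
    durations across recordings -- one crude measure of variance over time."""
    durations = {}
    for recording in labelled_repeats:
        for label, start, end in recording:
            durations.setdefault(label, []).append(end - start)
    return {label: pstdev(ds) for label, ds in durations.items()}
```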
Sicare Testing
- Recognition rates compatible with previous research
- Begins to illustrate the points at which a recogniser becomes ‘confused’
- May illustrate the areas where distinctions have to be made
- May start to illustrate some of the key acoustic factors that are crucial in dysarthric speech and voice recognition
- The non-adapted commercial product is functionally useless for this population
Subsidiary Questions
- Is dysarthric speech consistent?
- Does the underlying acoustic/soundwave pattern contain consistent differences in contrasts that are not perceptually distinguishable?
- Can consistency be trained in the absence of intelligibility?
- Does increasing consistency increase intelligibility?
Sample recordings
- Normal speech “alarm” 1 & 2
- Normal speech “alarm” 2
- Normal speech “television”
- Dysarthric speech “television”