2
Modern speech synthesis: communication aid personalisation
Sarah Creer, Stuart Cunningham, Phil Green
Clinical Applications of Speech Technology, University of Sheffield
3
Introduction
Building voices for VIVOCA (communication aids)
Speech synthesis techniques
Future research: personalisation of synthetic voices
4
Current speech synthesis: communication aids
High quality voices available
E.g. Toby Churchill Lightwriter
  – DECtalk™ (Fonix) for American English
  – Acapela for British English
Personalisation limited: age, gender, language
5
Personalisation
Voice = identity
  – Gender
  – Age
  – Geographic background
  – Socio-economic background
  – Ethnic background
  – As that individual
Maintains social relationships
Maintains social closeness
Sets group membership
6
VIVOCA: Voice Input Voice Output Communication Aid
Dysarthric speech input -> Speech Recognition -> Recognised text -> Text-to-Speech Synthesis -> Intelligible synthesised speech output
Retain elements of clients’ identity for synthesised speech output
Result: intelligible and personalised synthesised speech output
7
VIVOCA: personalisation
Sheffield/Barnsley user group
Retain local accent
  – geographic identity
Speaker database
  – Arctic database: 593 + 20 sentences
Professional local speakers
  – Ian McMillan
  – Christa Ackroyd
8
Concatenative synthesis
Input data: Speech recordings -> Unit segmentation -> Unit database
Text input -> Unit selection -> Concatenation + smoothing -> Synthesised speech
Festvox: http://festvox.org/
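The unit selection step in the diagram above can be illustrated with a minimal sketch. It assumes a toy unit database in which each recorded unit has a phone label and a single pitch value, and it picks the sequence that minimises a target cost (distance from the desired pitch) plus a join cost (pitch jump between neighbouring units) by dynamic programming. The names `Unit`, `target_cost`, `join_cost` and `select_units`, and the phone labels used, are hypothetical and are not taken from Festvox or from the slides; real systems use far richer features and cost functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    phone: str    # phone label of the recorded unit (toy labels, assumed)
    pitch: float  # mean pitch in Hz, the only feature in this sketch


def target_cost(unit, desired_pitch):
    # How far the recorded unit is from what the text asks for.
    return abs(unit.pitch - desired_pitch)


def join_cost(prev_unit, next_unit):
    # How audible the seam between two neighbouring units would be.
    return abs(prev_unit.pitch - next_unit.pitch)


def select_units(targets, database):
    """Return the lowest-cost unit sequence for targets = [(phone, pitch), ...]."""
    candidates = [[u for u in database if u.phone == phone] for phone, _ in targets]
    if any(not c for c in candidates):
        raise ValueError("no recorded unit for at least one target phone")

    # best[i][j] = (cumulative cost, index of best predecessor) for candidate j.
    best = [[(target_cost(u, targets[0][1]), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(u, targets[i][1]), back))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]
```

In this sketch, a larger database gives the search more candidates per phone, which is one way to read the slide deck's point that concatenative synthesis needs a lot of recorded data.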
9
Concatenative synthesis
High quality
Natural sounding
Sounds like original speaker
Needs a lot of data (~600 sentences)
Can be inconsistent
Difficult to manipulate prosody
10
HMM synthesis
[Figure: example word "yes"]
11
HMM synthesis procedure
Training: Input data (speech recordings) -> Speaker model
Synthesis: Text input + Speaker model -> Synthesised speech
HTS: http://hts.sp.nitech.ac.jp/
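The split between training and synthesis can be sketched in a heavily simplified form. The example below assumes labelled recordings given as (phone, list of frame values) pairs, and collapses the "speaker model" to one mean feature value and one mean duration per phone. This is an illustrative stand-in, not the HTS procedure: an actual HMM system uses multi-state, context-dependent models of spectral, excitation and duration parameters. The function names `train_speaker_model` and `synthesise` and the phone labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def train_speaker_model(segments):
    """Training: estimate per-phone statistics from labelled recordings.

    segments: iterable of (phone, [frame_value, ...]) pairs.
    Returns {phone: (mean frame value, mean duration in frames)}.
    """
    values = defaultdict(list)
    durations = defaultdict(list)
    for phone, frames in segments:
        values[phone].extend(frames)
        durations[phone].append(len(frames))
    return {p: (mean(values[p]), round(mean(durations[p]))) for p in values}


def synthesise(phones, model):
    """Synthesis: generate a parameter track from the model, not from recordings."""
    track = []
    for phone in phones:
        value, duration = model[phone]
        track.extend([value] * duration)
    return track
```

Because the output is generated from model statistics rather than copied from stored recordings, the result is consistent from one utterance to the next, and changing the model parameters changes the voice.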
12
HMM synthesis
Consistent
Intelligible
Needs relatively little input (~20 mins)
Can be adapted with small amount of data (>5 sentences)
Easier to manipulate
Buzzy quality
Less natural than concatenative
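The slides do not say which adaptation algorithm is used. As an illustration of why a model can be adapted with a small amount of data, the sketch below swaps in maximum a posteriori (MAP) estimation of a single Gaussian mean: the adapted mean is a weighted blend of the prior (average-voice) mean and the new speaker's samples, so a few samples shift the model slightly and more samples shift it further. The function name `map_adapt_mean` and the prior weight `tau` are assumptions made for this example.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean.

    prior_mean: mean of the existing (e.g. average-voice) model.
    samples:    feature values observed from the new speaker.
    tau:        prior weight; larger values trust the existing model more.
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)
```

With no adaptation data the model is unchanged, and as the number of samples grows the estimate approaches the new speaker's own average.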
13
Future research
Further personalisation for individuals with progressive speech disorders
  – Capturing the essence of a voice
Voice banking
  – Before deterioration
Adaptation using HMM synthesis
  – Before or during deterioration
14
Thank you
This work is sponsored by EPSRC Doctoral Training grant
For further details of VIVOCA see: http://www.shef.ac.uk/cast/
Email: S.Creer@dcs.shef.ac.uk