
2 Modern speech synthesis: communication aid personalisation
Sarah Creer, Stuart Cunningham, Phil Green
Clinical Applications of Speech Technology, University of Sheffield

3 Introduction
Building voices for VIVOCA (communication aids)
Speech synthesis techniques
Future research: personalisation of synthetic voices

4 Current speech synthesis: communication aids
High quality voices available
E.g. Toby Churchill Lightwriter
  – DECtalk™ (Fonix) for American English
  – Acapela for British English
Personalisation limited: age, gender, language

5 Personalisation
Voice = identity
  – Gender
  – Age
  – Geographic background
  – Socio-economic background
  – Ethnic background
  – As that individual
Maintains social relationships
Maintains social closeness
Sets group membership

6 VIVOCA
Voice Input Voice Output Communication Aid
Dysarthric speech input -> Speech Recognition -> Recognised text -> Text-to-Speech Synthesis -> Intelligible synthesised speech output
Retain elements of clients’ identity for synthesised speech output
Result: intelligible and personalised synthesised speech output

7 VIVOCA: personalisation
Sheffield/Barnsley user group
Retain local accent
  – Geographic identity
Speaker database
  – Arctic database: 593 + 20 sentences
Professional local speakers
  – Ian McMillan
  – Christa Ackroyd

8 Concatenative synthesis
Input data: Speech recordings -> Unit segmentation -> Unit database
Text input -> Unit selection (from the unit database) -> Concatenation + smoothing -> Synthesised speech
Festvox: http://festvox.org/
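
The unit selection and concatenation + smoothing steps on this slide can be illustrated with a minimal sketch. It is not the Festvox implementation: the `Unit` record, the pitch-based target cost, the sample-difference join cost and the linear crossfade are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical database entry: one recorded segment of a phone."""
    phone: str
    pitch: float      # assumed summary feature, in Hz
    samples: tuple    # assumed waveform samples


def target_cost(unit, wanted_pitch):
    # Assumption: how far the unit is from the requested pitch.
    return abs(unit.pitch - wanted_pitch)


def join_cost(left, right):
    # Assumption: size of the waveform jump at the boundary.
    return abs(left.samples[-1] - right.samples[0])


def select_units(targets, database):
    """Pick one unit per (phone, pitch) target, minimising the sum of
    target and join costs with dynamic programming (Viterbi search)."""
    candidates = [[u for u in database if u.phone == phone]
                  for phone, _ in targets]
    if any(not c for c in candidates):
        raise KeyError("no recorded unit for at least one target phone")
    # best[i][j] = (cheapest total cost ending in candidate j, backpointer)
    best = [[(target_cost(u, targets[0][1]), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, u), k)
                for k, prev in enumerate(candidates[i - 1]))
            row.append((cost + target_cost(u, targets[i][1]), back))
        best.append(row)
    # Trace the cheapest path backwards from the final position.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path


def concatenate(units, overlap=2):
    """Join the selected units, smoothing each boundary with a linear
    crossfade over `overlap` samples."""
    out = list(units[0].samples)
    for unit in units[1:]:
        nxt = list(unit.samples)
        n = min(overlap, len(out), len(nxt))
        for k in range(n):
            w = (k + 1) / (n + 1)  # weight of the incoming unit
            out[-n + k] = (1 - w) * out[-n + k] + w * nxt[k]
        out.extend(nxt[n:])
    return out
```

The search prefers units that are both close to the requested target and smooth to join, which is one reason a large recorded database helps: more candidates per phone means cheaper joins.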

9 Concatenative synthesis
Advantages:
  – High quality
  – Natural sounding
  – Sounds like original speaker
Disadvantages:
  – Need a lot of data (~600 sentences)
  – Can be inconsistent
  – Difficult to manipulate prosody

10 HMM synthesis
[Figure: example word "yes", shown twice; image not available]

11 HMM synthesis procedure
Training: Input data (speech recordings) -> Speaker model
Synthesis: Text input -> Speaker model -> Synthesised speech
HTS: http://hts.sp.nitech.ac.jp/
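
The split between training and synthesis can be sketched with a deliberately simplified stand-in for an HMM. This is not the HTS API: the model here is a single Gaussian (mean and variance) plus an average duration per phone, over one assumed acoustic feature, with no states, dynamic features or context clustering. The names `train` and `synthesise` are hypothetical.

```python
from statistics import mean, pvariance


def train(segments):
    """Training step: build a speaker model from labelled recordings.

    segments: list of (phone, [feature value per frame]) pairs,
    assumed to be already aligned.
    """
    by_phone = {}
    for phone, frames in segments:
        by_phone.setdefault(phone, []).append(frames)
    model = {}
    for phone, examples in by_phone.items():
        values = [v for frames in examples for v in frames]
        model[phone] = {
            "mean": mean(values),
            "var": pvariance(values),
            # Average duration in frames, at least one frame.
            "frames": max(1, round(mean(len(f) for f in examples))),
        }
    return model


def synthesise(phones, model):
    """Synthesis step: generate a feature track for a phone sequence
    by emitting each phone's mean for its average duration."""
    track = []
    for phone in phones:
        stats = model[phone]
        track.extend([stats["mean"]] * stats["frames"])
    return track


if __name__ == "__main__":
    recordings = [("e", [1.0, 3.0]), ("t", [5.0, 5.0, 5.0]), ("e", [2.0, 2.0])]
    speaker_model = train(recordings)
    print(synthesise(["t", "e"], speaker_model))
```

Because speech is generated from averaged statistics rather than from stored recordings, the output is consistent but smoother than the original speech.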

12 HMM synthesis
Advantages:
  – Consistent
  – Intelligible
  – Needs relatively little input (~20 mins)
  – Can be adapted with small amount of data (>5 sentences)
  – Easier to manipulate
Disadvantages:
  – Buzzy quality
  – Less natural than concatenative
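
Why a model can be adapted with only a small amount of data can be shown with a worked example. The sketch below uses MAP interpolation of a Gaussian mean, a simple technique swapped in for illustration; the slides do not say which adaptation method is used. The function name and the prior weight `tau` are assumptions.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP update of a Gaussian mean.

    prior_mean: mean from the existing (e.g. average-voice) model
    samples:    feature values observed from the new speaker
    tau:        assumed prior weight; larger values trust the prior more
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)


if __name__ == "__main__":
    # With few samples the estimate stays near the prior;
    # with more data it moves towards the new speaker.
    print(map_adapt_mean(100.0, [120.0] * 5))
    print(map_adapt_mean(100.0, [120.0] * 500))
```

With no adaptation data the prior mean is returned unchanged, so the model degrades gracefully when very little speech is available.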

13 Future research
Further personalisation for individuals with progressive speech disorders
  – Capturing the essence of a voice
Voice banking
  – Before deterioration
Adaptation using HMM synthesis
  – Before or during deterioration

14 Thank you
This work is sponsored by an EPSRC Doctoral Training grant
For further details of VIVOCA see: http://www.shef.ac.uk/cast/
Email: S.Creer@dcs.shef.ac.uk

