Modern speech synthesis: communication aid personalisation
Sarah Creer, Stuart Cunningham, Phil Green
Clinical Applications of Speech Technology, University of Sheffield

Introduction
– Building voices for VIVOCA (communication aids)
– Speech synthesis techniques
– Future research: personalisation of synthetic voices

Current speech synthesis: communication aids
High quality voices available
E.g. Toby Churchill Lightwriter
– DECtalk™ (Fonix) for American English
– Acapela for British English
Personalisation limited: age, gender, language

Personalisation
Voice = identity
– Gender
– Age
– Geographic background
– Socio-economic background
– Ethnic background
– As that individual
Maintains social relationships
Maintains social closeness
Sets group membership

VIVOCA: Voice Input Voice Output Communication Aid
Dysarthric speech input -> speech recognition -> recognised text -> text-to-speech synthesis -> intelligible synthesised speech output
Goal: retain elements of clients' identity in the synthesised speech output, so that the output is both intelligible and personalised

VIVOCA: personalisation
Sheffield/Barnsley user group
Retain local accent – geographic identity
Speaker database
– Arctic database: 593 + 20 sentences
Professional local speakers
– Ian McMillan
– Christa Ackroyd

Concatenative synthesis
Building the voice: input data (speech recordings) -> unit segmentation -> unit database
Synthesis: text input -> unit selection -> concatenation + smoothing -> synthesised speech
Example units: i, a, sh
Festvox: http://festvox.org/
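The unit selection step can be illustrated with a toy search. This is a minimal sketch, not the Festvox implementation: the `Unit` fields, the two cost functions, and their scaling are all hypothetical choices made for illustration. It picks one recorded unit per target phone so that the summed target cost (how well a unit matches the requested pitch) and join cost (how smoothly neighbouring units connect) is smallest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    phone: str         # phone label, e.g. "sh"
    pitch: float       # mean pitch in Hz (toy feature)
    start_edge: float  # hypothetical spectral value at the unit's left edge
    end_edge: float    # hypothetical spectral value at the unit's right edge


def target_cost(unit, target_pitch):
    """Toy mismatch between a candidate unit and the requested pitch."""
    return abs(unit.pitch - target_pitch) / 100.0


def join_cost(left, right):
    """Toy discontinuity between two units that would be concatenated."""
    return abs(left.end_edge - right.start_edge)


def select_units(targets, database):
    """Return the lowest-cost unit sequence for targets [(phone, pitch), ...]."""
    candidates = [[u for u in database if u.phone == phone] for phone, _ in targets]
    if any(not c for c in candidates):
        raise ValueError("database has no unit for at least one target phone")

    # best[i][j] = (cheapest cost of a path ending in candidate j, backpointer)
    best = [[(target_cost(u, targets[0][1]), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, u), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(u, targets[i][1]), back))
        best.append(row)

    # Trace the cheapest path back from the final position.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]
```

Because every candidate must come from the recordings, coverage of the database matters: with few candidates per phone, the search is forced to accept poor joins.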

Concatenative synthesis
Advantages:
– High quality
– Natural sounding
– Sounds like original speaker
Disadvantages:
– Need a lot of data (~600 sentences)
– Can be inconsistent
– Difficult to manipulate prosody

HMM synthesis
[Figure not preserved in the transcript; surviving labels: "yes", "yes"]

HMM synthesis procedure
Training: input data (speech recordings) -> speaker model
Synthesis: text input + speaker model -> synthesised speech
HTS: http://hts.sp.nitech.ac.jp/
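The split between training and synthesis can be sketched with a deliberately reduced statistical model. This is not how HTS works internally: a real system trains context-dependent, multi-state HMMs over spectral and pitch parameters. Here each phone is assumed to have a single state with one mean value and one average duration, and the function names and data layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def train_speaker_model(recordings):
    """Estimate a per-phone mean and duration from labelled recordings.

    recordings: list of utterances, each a list of (phone, [frame values]).
    """
    frames_by_phone = defaultdict(list)
    durations_by_phone = defaultdict(list)
    for utterance in recordings:
        for phone, frames in utterance:
            frames_by_phone[phone].extend(frames)
            durations_by_phone[phone].append(len(frames))
    return {
        phone: {
            "mean": mean(frames_by_phone[phone]),
            "duration": round(mean(durations_by_phone[phone])),
        }
        for phone in frames_by_phone
    }


def synthesise(phones, model):
    """Generate a parameter trajectory from the model alone, no stored audio."""
    trajectory = []
    for phone in phones:
        state = model[phone]
        trajectory.extend([state["mean"]] * state["duration"])
    return trajectory
```

Even in this toy form, the output is generated from averaged statistics rather than from stored recordings, which is one intuition for why statistical synthesis tends to be consistent but smoother than the original speech.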

HMM synthesis
Advantages:
– Consistent
– Intelligible
– Needs relatively little input (~20 mins)
– Can be adapted with small amount of data (>5 sentences)
– Easier to manipulate
Disadvantages:
– Buzzy quality
– Less natural than concatenative
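One way to see why a statistical model can be adapted from a few sentences is the standard MAP (maximum a posteriori) update of a Gaussian mean, sketched below. This is a swapped-in illustration, not the adaptation procedure used in HTS, which the slides do not detail. The function name and the prior weight `tau` are assumptions for the example.

```python
def map_adapt_mean(prior_mean, adaptation_frames, tau=10.0):
    """MAP estimate of a Gaussian mean.

    prior_mean: mean from an existing (e.g. average) voice model.
    adaptation_frames: observed values from the new speaker.
    tau: hypothetical prior weight; larger values trust the prior more.
    """
    n = len(adaptation_frames)
    return (tau * prior_mean + sum(adaptation_frames)) / (tau + n)


# With no data the prior is returned unchanged; as data accumulates
# the estimate moves toward the new speaker's own average.
no_data = map_adapt_mean(100.0, [])
some_data = map_adapt_mean(100.0, [200.0] * 10)
more_data = map_adapt_mean(100.0, [200.0] * 990)
```

The estimate stays usable with very little data because the prior model fills in whatever the new speaker's recordings do not cover.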

Future research
Further personalisation for individuals with progressive speech disorders
– Capturing the essence of a voice
Voice banking
– Before deterioration
Adaptation using HMM synthesis
– Before or during deterioration

Thank you
This work is sponsored by an EPSRC Doctoral Training grant.
For further details of VIVOCA see: http://www.shef.ac.uk/cast/
Email: S.Creer@dcs.shef.ac.uk
