1
Text to Speech using AWS Polly
developed by Stuart James (ESRF), presented by Andy Götz (ESRF)
2
Robots are coming …
R.U.R. is a 1920 science fiction play by the Czech writer Karel Čapek. R.U.R. stands for Rossumovi Univerzální Roboti (Rossum’s Universal Robots).
3
History of Project
ESRF has had a TextTalker (TTS device server) for alarm messages in the control room since the beginning of the ESRF.
It was based on the Microsoft TTS SDK, which required a Windows PC.
The recent Windows TTS SDK was not as good, and we wanted to avoid Windows.
We looked for a long time for a Linux alternative and found many solutions, but most had mediocre voices (even Festival).
Last year we discovered Amazon’s AWS Polly service, a low-cost cloud service for converting text to speech with high-quality voices and many languages.
4
AWS Polly – a cloud web service
Client APIs exist for Python, C++, Java, …
Text conversion takes < 500 ms
$4.00 per 1 million characters of speech (~23 hours)
5 million characters free per month for the first 12 months
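The pricing above can be turned into a rough monthly estimate. The following is a minimal sketch using only the two figures quoted on this slide; the function name and constants are illustrative assumptions, and current AWS pricing may differ.

```cpp
#include <cstdint>

// Figures quoted on the slide; not checked against current AWS pricing.
constexpr double kUsdPerMillionChars = 4.00;
constexpr std::uint64_t kFreeCharsPerMonth = 5000000;  // first 12 months only

// Estimated monthly bill in USD for `chars` characters sent for conversion.
// `free_tier` selects whether the 5 million free characters are deducted.
// (Hypothetical helper, not part of any AWS SDK.)
double estimate_monthly_cost_usd(std::uint64_t chars, bool free_tier) {
    const std::uint64_t free_chars = free_tier ? kFreeCharsPerMonth : 0;
    const std::uint64_t billable = chars > free_chars ? chars - free_chars : 0;
    return static_cast<double>(billable) / 1e6 * kUsdPerMillionChars;
}
```

For example, `estimate_monthly_cost_usd(6000000, true)` bills only the 1 million characters above the free allowance, so alarm messages of a few dozen characters each stay well inside the free tier for a long time.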
5
TextToSpeech architecture
6
TextToSpeech device server
An advanced device server for converting text to speech, based on the AWS Polly service
Written in C++, uses the Pulse API for audio
Caches messages to limit the number of calls to Polly
Keeps statistics and tracks the messages converted to speech
Device classes catalogue entry:
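As an illustration of the message caching and statistics mentioned on this slide, here is a minimal sketch of one way such a cache could work. It is not the device server's actual code: `SpeechCache`, `Audio` and `Synthesizer` are hypothetical names, and the synthesizer callback stands in for the real request to Polly.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Audio = std::vector<unsigned char>;                      // raw audio bytes
using Synthesizer = std::function<Audio(const std::string&)>;  // text -> audio

// Hypothetical cache: each distinct message is synthesized once and reused.
class SpeechCache {
public:
    explicit SpeechCache(Synthesizer synth) : synth_(std::move(synth)) {}

    // Return the audio for `text`, calling the synthesizer only on a miss.
    const Audio& get(const std::string& text) {
        auto it = cache_.find(text);
        if (it != cache_.end()) {
            ++hits_;
            return it->second;
        }
        ++misses_;
        return cache_.emplace(text, synth_(text)).first->second;
    }

    // Simple statistics: how often the cache avoided or needed a remote call.
    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    Synthesizer synth_;
    std::unordered_map<std::string, Audio> cache_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

Alarm messages in a control room tend to repeat, so keying the cache on the message text means a recurring alarm costs one remote call in total. A production cache would presumably also bound its size and persist audio to disk, which this sketch leaves out.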
7
TextToSpeech Flow Control
8
Device server C++ notes
TTS library is 100% unit tested
Uses pkg-config to configure Tango
Code checking with clang
Depends on C++14 (uses futures)
Try it!
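The slide does not show how futures are used, so the following is only a guess at the general pattern: a slow conversion is started with `std::async` and the caller collects the result later through a `std::future`. `synthesize_blocking` and `synthesize_async` are hypothetical names, and the blocking function is a stand-in for the real request.

```cpp
#include <future>
#include <string>
#include <utility>
#include <vector>

using Audio = std::vector<unsigned char>;  // raw audio bytes

// Hypothetical stand-in for a blocking text-to-speech request.
// Here it just copies the text bytes so the example runs without a network.
Audio synthesize_blocking(const std::string& text) {
    return Audio(text.begin(), text.end());
}

// Start the conversion on another thread and return immediately.
// The init-capture `text = std::move(text)` is a C++14 feature.
std::future<Audio> synthesize_async(std::string text) {
    return std::async(std::launch::async,
                      [text = std::move(text)] { return synthesize_blocking(text); });
}
```

A caller can keep serving other requests and call `get()` on the returned future only when the audio is needed; `get()` blocks until the result is ready. On Linux with GCC or Clang, building this may require the `-pthread` flag.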