Text-To-Speech System for English -Vishal Kanjariya
Text-to-Speech Synthesis
A Text-to-Speech (TTS) synthesizer is a mobile-based system that should be able to read any text aloud, whether the text was entered directly on the mobile device by an operator or scanned and submitted to an Optical Character Recognition (OCR) system.
Voice response systems are applications of speech synthesis technology and are broadly classified into two types:
1. Limited vocabulary systems
2. Unlimited vocabulary systems
General functional diagram of a Text-to-Speech system
[Diagram: TEXT-TO-SPEECH SYNTHESIZER. NATURAL LANGUAGE PROCESSING (linguistic formalism, inference engines, logical inferences) -> phonemes, prosody -> DIGITAL SIGNAL PROCESSING (mathematical models, algorithms, computations) -> speech]
Human Speech Production System
Architecture of TTS system
Raw text or tagged text
-> Text Analysis: Document Structure Detection, Text Normalization, Linguistic Analysis
-> (tagged text) Phonetic Analysis: Grapheme-to-Phoneme Conversion
-> (tagged phones) Prosodic Analysis: Pitch & Duration attachment
-> (controls) Speech Synthesis: Voice Rendering
Requirement
Concatenative Synthesis
- Requires neither rules nor manual tuning.
- Stores speech segments.
- Choice of segments, e.g. words, syllables, demi-syllables, diphones, phones.
- Segment concatenation.
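The choice and concatenation of segments can be sketched as follows. This is a minimal illustration, not the method used in the slides: the inventory, its labels, the greedy longest-match strategy and the function names are all assumptions, and letters stand in for real speech units.

```python
# Hypothetical unit inventory: unit label -> stored waveform samples.
# A real system stores recorded audio; short integer lists stand in for it here.
INVENTORY = {
    "hello": [1, 2, 3, 4],   # a whole-word segment
    "he": [1, 2],            # a syllable-sized segment
    "l": [5],
    "o": [6, 7],
    "w": [8],
    "r": [9],
    "d": [10],
}


def select_units(word, inventory):
    """Split `word` into stored units, preferring the longest match (assumed strategy)."""
    units = []
    pos = 0
    while pos < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(word), pos, -1):
            if word[pos:end] in inventory:
                units.append(word[pos:end])
                pos = end
                break
        else:
            raise KeyError(f"no stored unit covers {word[pos:]!r}")
    return units


def concatenate(units, inventory):
    """Join the stored segments of the selected units end to end."""
    samples = []
    for unit in units:
        samples.extend(inventory[unit])
    return samples


if __name__ == "__main__":
    for word in ("hello", "world"):
        chosen = select_units(word, INVENTORY)
        print(word, chosen, concatenate(chosen, INVENTORY))
```

Larger units such as whole words need fewer joins, while smaller units such as phones cover more text with a smaller inventory.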
Text-to-Speech Synthesis System for English Language
1. English Script
2. Design of Synthesizer
   a. Speech Synthesis Model
   b. Structure of Database
   c. Linguistic Rules
3. Implementation of Synthesizer
   a. Database Creation
   b. Algorithm
   c. Applying Rules
Algorithm
Initialize the program
- Initialize the GUI.
- Load all sound files into the buffer array.
- Load the default values of the rules.
On key-typed event (Marathi keyboard help)
- If the typed key does not form text that is displayed in the loaded help table, remove the old help table and load a new one that displays the possible combinations of the typed consonant followed by each vowel.
Synthesize speech
- Read the input text (English format).
- Normalize the input text.
- Parse the text into words.
- Parse the words into phonemes (speech units).
- For each word, process all units as follows:
  * Get the index of the unit.
  * Get the indices of the previous and next units.
  * Calculate the values of length, decay and silence by applying the rules.
  * Apply these values to the indexed speech segment.
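The synthesis steps above can be sketched as a small pipeline. The slides do not give the rule values or the unit inventory, so everything concrete here is an assumption for illustration: the numbers in `unit_parameters`, the one-letter-per-unit stand-in for phoneme parsing, and all function names.

```python
import re

# Hypothetical default rule values; the slides do not specify them.
DEFAULT_RULES = {"length": 1.0, "decay": 0.0, "silence": 0}


def normalize(text):
    """Lowercase the text and replace everything except letters with spaces."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def to_units(word):
    """Stand-in for parsing a word into speech units: one unit per letter."""
    return list(word)


def unit_parameters(prev_unit, unit, next_unit):
    """Illustrative context rule: lengthen, fade and pad the last unit of a word."""
    params = dict(DEFAULT_RULES)
    if next_unit is None:
        params.update(length=1.5, decay=0.5, silence=2)
    return params


def apply_parameters(segment, params):
    """Stretch the segment, fade it linearly, then append silent samples."""
    n = max(1, round(len(segment) * params["length"]))
    # Nearest-neighbour resampling to the new length.
    stretched = [segment[int(i * len(segment) / n)] for i in range(n)]
    faded = [s * (1.0 - params["decay"] * i / n) for i, s in enumerate(stretched)]
    return faded + [0.0] * params["silence"]


def synthesize(text, inventory):
    """Run normalization, word and unit parsing, and rule application in order."""
    output = []
    for word in normalize(text).split():
        units = to_units(word)
        for i, unit in enumerate(units):
            prev_unit = units[i - 1] if i > 0 else None
            next_unit = units[i + 1] if i + 1 < len(units) else None
            params = unit_parameters(prev_unit, unit, next_unit)
            output.extend(apply_parameters(inventory[unit], params))
    return output


if __name__ == "__main__":
    demo_inventory = {"h": [1.0], "i": [1.0, 1.0]}  # hypothetical stored segments
    print(synthesize("Hi!", demo_inventory))
```

Looking up the previous and next unit before computing the parameters mirrors the "index of previous and next unit" step, so a rule can depend on where a unit sits in the word.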
Most frequent words
Most speech and text databases in other languages include spontaneous, everyday utterances in order to give a more natural picture of the tendencies and evolution of the language. We have therefore chosen for our database a collection of newspaper articles and a random selection of sentences.
The statistics presented in Table I are based on the News-RO corpus and give the top 40 most frequent words and their frequencies. As expected, and as in any language, the most frequent words are mainly prepositions. The gap in frequency between the very top words and even their closest followers is large. The corpus contains around 60,000 different words with a total of 1,730,000 occurrences.
For readable text in English format, every word is assigned a different probability. Examples: that, is, am, they, etc.
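Word statistics of this kind can be computed along the following lines. This is a sketch under assumptions: the tokenization rule and the function name are illustrative, and the sample sentence is not taken from the News-RO corpus.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=40):
    """Return (word, count, percent of all occurrences) for the top_n words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return [(w, c, 100.0 * c / total) for w, c in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = "That is that and that is"  # illustrative text only
    for word, count, percent in word_frequencies(sample, top_n=3):
        print(f"{word:<6}{count:>3}{percent:>8.2f}")
```

Dividing each count by the total number of occurrences gives the relative frequency that can serve as the per-word probability.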
Most frequent syllables
The table lists the top 40 most frequent syllables extracted from the text corpus, together with their accent, because the position of the accent can change the meaning of a word. There are 2920 different syllables in the TTS-text corpus, with about 48,000 syllable occurrences in total.
Syllable  Accent  Frequency[%]    Syllable  Accent  Frequency[%]
a         0       3.02            o         1       0.66
te        0       2.36            ta        1       0.61
de        1       2.13            ni        0       0.57
Etc…
Most frequent phonemes and diphones
Phonemes and diphones are important in all text-to-speech systems. They are the building blocks of any word or utterance, so their full coverage and correct use determine the degree of freedom of the resulting system. The TTS phoneset comprises 32 phonemes and 731 diphones. A diphone was counted only if it occurs in at least 10 of the roughly 120,000 words.
Phoneme  Example word  Frequency[%]    Phoneme  Example word  Frequency[%]
e        He            10.64           L        Lol           4.67
a        ram           10.33           S        Sister        4.12
i        nirma         7.09            O        Motor         4.05
r        are           6.78            K        Act           3.74
u        you           6.67            M        mother        3.39
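The "at least 10 words" criterion for diphones can be sketched as below. The lexicon format (word mapped to its phoneme list), the function names and the toy entries are assumptions, and this sketch ignores silence-to-phoneme transitions at word boundaries, which a real diphone inventory may include.

```python
from collections import defaultdict


def diphones(phonemes):
    """Return the adjacent phoneme pairs of one word."""
    return list(zip(phonemes, phonemes[1:]))


def diphone_inventory(lexicon, min_words=10):
    """Keep the diphones that occur in at least `min_words` distinct words.

    `lexicon` maps each word to its list of phonemes (hypothetical format).
    """
    words_with = defaultdict(set)
    for word, phones in lexicon.items():
        for pair in diphones(phones):
            words_with[pair].add(word)
    return {pair: len(ws) for pair, ws in words_with.items() if len(ws) >= min_words}


if __name__ == "__main__":
    toy_lexicon = {  # illustrative entries, not from the corpus
        "he": ["h", "e"],
        "hen": ["h", "e", "n"],
        "ten": ["t", "e", "n"],
    }
    print(diphone_inventory(toy_lexicon, min_words=2))
```

Counting distinct words rather than raw occurrences keeps a diphone that is frequent in a single word from entering the inventory.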
On amplify event: amplify the synthesized speech.
On waveform event: draw the waveform of the synthesized speech.
Applications
1. Talking calculator
2. Computer-generated wiring instructions
3. Aids for the blind
4. Telephone inquiry service
5. Teaching machines
Bibliography
1. Indian TTS convergence, Ministry of India.
2. Romanian language statistics and resources for text-to-speech systems.
4. http://www.phobos.ro/demos/tts/index.html
5. http://www.baum.ro/index.php?language=ro&pagina=tts (online)