Indian Institute of Technology Bombay


1 Indian Institute of Technology Bombay
SPEECH SYNTHESIS
Indian Institute of Technology Bombay
Department of Computer Science and Engineering
Text to Speech Synthesis
Prof Moreshwear R Bhujade, CSE, IIT Bombay

2 Text to Speech for Marathi
Synthesis Methods
1. Articulatory Synthesis: not well developed
2. Formant Synthesis: poor quality
3. Concatenative Synthesis: good quality and the most widely used method
Concatenative Synthesis
It employs pre-stored speech units.
Speech Units:
1. Sentences and phrases: useful in small applications such as appliance responses

3 Text to Speech Synthesis
Speech Units (continued):
2. Words: limited-vocabulary systems, used in railway announcements
3. Diphones: used in unlimited-vocabulary TTS applications
   Quality: intelligible and acceptable, but requires a complete diphone database
4. Phonemes: used in unlimited-vocabulary TTS applications
   Quality: the phoneme is the smallest speech unit of the language, so concatenative distortion is greater, but the database is very small
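To make the difference between the phoneme and diphone units concrete, the sketch below pairs each phoneme with its successor, which is how a diphone (the transition from one phone into the next) is usually labelled. The function name `to_diphones` and the boundary symbol `_` for silence are illustrative assumptions; the slides do not say how the IIT Bombay system names its units.

```python
def to_diphones(phonemes, boundary="_"):
    """Turn a phoneme sequence into diphone labels.

    The sequence is padded with a silence/boundary symbol on both sides,
    then every adjacent pair becomes one diphone unit.
    """
    padded = [boundary, *phonemes, boundary]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


if __name__ == "__main__":
    # Three phonemes need four diphone units (n phonemes -> n + 1 diphones).
    print(to_diphones(["n", "a", "m"]))
```

Every join in a diphone system falls inside a phone rather than at a phone boundary, which is one common explanation for why diphone concatenation tends to sound smoother than phoneme concatenation, at the cost of a larger unit inventory.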

4 Text to Speech Synthesis
Quality becomes progressively lower as a TTS system uses smaller language units. The challenge is to make systems based on units (3) and (4), diphones and phonemes, intelligible and of reasonably good quality.
Experimental systems based on (3) and (4) are under investigation at IIT Bombay.
[Figure: quality versus number of sentences, each on a low-to-high scale, for sentences/phrases, words and phrases, diphone concatenation, and phoneme concatenation]

5 TTS Architecture
1. Text Analysis: text normalisation and linguistic analysis. Output: tagged text
2. Phonetic Analysis: grapheme-to-phoneme conversion. Output: tagged phonemes
3. Prosodic Analysis: pitch and duration controls
4. Speech Synthesis: voice rendering. Output: audio stream
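The four architecture stages can be read as a chain of functions, each consuming the previous stage's output. The following is only a toy sketch of that data flow: every function body is a placeholder (one letter is treated as one phoneme, pitch and duration are constants, and the "voice" is silence), and all names, default values, and the `Unit` type are assumptions made for illustration, not the system described in the slides.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One tagged phoneme with its prosodic controls."""
    phoneme: str
    pitch_hz: float = 0.0
    duration_ms: float = 0.0


def text_analysis(text):
    """Stage 1 (toy): normalise to lowercase and tag each token as a word."""
    return [(token, "word") for token in text.lower().split()]


def phonetic_analysis(tagged_text):
    """Stage 2 (toy): grapheme-to-phoneme, here simply one letter per phoneme."""
    return [Unit(ch) for word, _tag in tagged_text for ch in word if ch.isalpha()]


def prosodic_analysis(units, pitch_hz=120.0, duration_ms=80.0):
    """Stage 3 (toy): attach the same pitch and duration to every unit."""
    for unit in units:
        unit.pitch_hz = pitch_hz
        unit.duration_ms = duration_ms
    return units


def speech_synthesis(units, sample_rate=8000):
    """Stage 4 (placeholder): emit 8-bit silence of the right total length."""
    total = sum(int(u.duration_ms * sample_rate / 1000) for u in units)
    return bytes([128]) * total  # 128 is the midpoint of unsigned 8-bit audio


def tts(text):
    """Run the four stages in order: text -> audio stream."""
    return speech_synthesis(prosodic_analysis(phonetic_analysis(text_analysis(text))))


if __name__ == "__main__":
    audio = tts("Ab c")
    print(len(audio), "samples")
```

Keeping each stage as a separate function mirrors the diagram: the tagged text, tagged phonemes, and prosodic controls are explicit intermediate values, so any one stage can be replaced without touching the others.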

6 Text to Speech Synthesis
The size of the vocabulary depends on the approach used.
At the diphone level, approximately 500 basic utterances must be stored. Each unit requires approximately 6000 samples, for a total of 3,000,000 bytes (3 MB) with 8-bit samples at 8000 samples/sec; with 4 variations this becomes 12 MB.
At the phoneme level, consonants are very short (about 500 samples each), giving approximately 40 × 500 bytes = 20 K, plus 12 vowels at 6000 samples each = 72 K. Approximately 100 K bytes are adequate.
Our basic philosophy is to store only one basic sample per unit and to create variants by processing the speech signal to meet the requirements of pitch, duration, stress, etc.
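The storage estimates on this slide can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the slide (unit counts, samples per unit, 8-bit samples at 8000 samples/sec); the function and variable names are illustrative.

```python
SAMPLE_RATE_HZ = 8000   # samples per second, from the slide
BYTES_PER_SAMPLE = 1    # 8-bit samples


def inventory_bytes(unit_count, samples_per_unit, variations=1):
    """Storage needed for a unit inventory, in bytes."""
    return unit_count * samples_per_unit * BYTES_PER_SAMPLE * variations


# Diphone inventory: about 500 units of about 6000 samples each.
diphone_single = inventory_bytes(500, 6000)               # 3,000,000 bytes (3 MB)
diphone_four = inventory_bytes(500, 6000, variations=4)   # 12,000,000 bytes (12 MB)

# Phoneme inventory: 40 short consonants plus 12 vowels.
consonant_bytes = inventory_bytes(40, 500)                # 20,000 bytes
vowel_bytes = inventory_bytes(12, 6000)                   # 72,000 bytes
phoneme_total = consonant_bytes + vowel_bytes             # 92,000 bytes, under 100 K

# Duration of one 6000-sample unit at 8000 samples/sec.
unit_seconds = 6000 / SAMPLE_RATE_HZ                      # 0.75 s

if __name__ == "__main__":
    print(diphone_single, diphone_four, phoneme_total, unit_seconds)
```

The single-sample diphone inventory is roughly 30 times larger than the phoneme inventory, which is the trade-off the slide sets against the higher concatenative distortion of phoneme units.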

7 Demonstration of the TTS Employing Diphones
The system takes any text input and produces phonetic audio output.
It does some processing of the waveforms while concatenating them, to create better sound effects such as decay.
Tags have been predefined for forming words, so that the duration of individual units is modified.
No sentence-level prosody has been implemented yet.
Future Work
1. Make rules for generating tags. Difficulty: no linguistic research is available on this aspect of Marathi.
2. Remove concatenative distortion by processing the signals; this should be possible to some extent.
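The slide does not say what waveform processing is applied at the joins. One common and simple option is a linear crossfade, in which the tail of one unit decays while the head of the next rises over a short overlap. The sketch below shows that technique as an assumption, not as the method used in the demonstrated system; `crossfade_concat` and the `overlap` parameter are hypothetical names.

```python
def crossfade_concat(units, overlap=4):
    """Concatenate sample lists, blending `overlap` samples at each join.

    Within the overlap the outgoing unit fades out linearly while the
    incoming unit fades in, which softens the discontinuity at the join.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit, rising toward 1
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


if __name__ == "__main__":
    loud = [1.0] * 6
    quiet = [0.0] * 6
    # The join ramps down gradually instead of jumping from 1.0 to 0.0.
    print(crossfade_concat([loud, quiet], overlap=4))
```

Each join shortens the output by `overlap` samples, so a real system would need to account for that when it applies the tag-driven duration changes mentioned on the slide.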

8 THANK YOU

