Chapter 15 Speech Synthesis Principles 15.1 History of Speech Synthesis 15.2 Categories of Speech Synthesis 15.3 Chinese Speech Synthesis 15.4 Speech Generation.


Chapter 15 Speech Synthesis Principles
15.1 History of Speech Synthesis
15.2 Categories of Speech Synthesis
15.3 Chinese Speech Synthesis
15.4 Speech Generation and Synthesizer

15.1 History of Speech Synthesis (1)
The history of speech synthesis can be traced back to the 17th century, and the first synthesizer was invented in the 18th century. A synthesizer is essentially a machine that generates voice-like sound: at first mechanically, later electronically, and now by computer. Major contributions came from Fant, Flanagan and Klatt. The speech communication process.

15.2 Categories of Speech Synthesis (1)
Speech synthesis systems can be classified into four categories:
1. Parameter Based System
2. Rule Based System
3. Waveform Based System
4. Text to Speech System

Categories of Speech Synthesis (1) Parametric Analysis-Synthesis
The synthesis unit is a syllable, semi-syllable or phoneme. The units are first analyzed: parameters are extracted frame by frame, encoded, and stored in a speech database. At output time, the corresponding parameters are retrieved from the database, edited and concatenated, and sent to the synthesizer, where they control the generation of the output signal. The parameters used include amplitude (intensity), fundamental frequency (pitch) and formants (timbre). The data rate is low, but the structure is complex and the quality is poorer than that of waveform synthesis.
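The frame-by-frame analysis step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of any particular system: frames do not overlap, RMS energy stands in for the amplitude parameter, and the fundamental frequency is estimated from the autocorrelation peak. The function name, frame length and pitch search range are hypothetical, and formant extraction is left out.

```python
import math

def frame_parameters(signal, sample_rate, frame_len, f0_min=80.0, f0_max=400.0):
    """Return one (rms_amplitude, f0_hz) pair per non-overlapping frame."""
    lag_min = int(sample_rate / f0_max)   # shortest pitch period searched
    lag_max = int(sample_rate / f0_min)   # longest pitch period searched
    params = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # Amplitude parameter: root-mean-square energy of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Pitch parameter: the lag with the largest autocorrelation.
        best_lag, best_score = 0, 0.0
        for lag in range(lag_min, lag_max + 1):
            score = sum(frame[i] * frame[i + lag] for i in range(frame_len - lag))
            if score > best_score:
                best_lag, best_score = lag, score
        f0 = sample_rate / best_lag if best_lag else 0.0
        params.append((rms, f0))
    return params

# Test input (an assumption): a 200 Hz sine tone sampled at 8 kHz.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]
params = frame_parameters(tone, sample_rate, frame_len=400)
```

Each entry of `params` is what would be encoded and stored in the speech database for one frame; a real system would also store formant data and typically use overlapping frames.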

Categories of Speech Synthesis (2) Synthesis-by-rule
Speech is generated by applying phonetic rules. The stored units are small: the parameters of phonemes, diphones, semi-syllables and syllables, together with rules for composing syllables from phonemes and words or sentences from syllables. The rules must account for effects such as co-articulation. They can be divided into formant frequency rules, duration rules, tone rules and intonation rules. The memory required is much less than for the parametric approach, but the quality is low because the rules are neither accurate nor complete enough.
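As a rough illustration of what a tone rule and a duration rule might look like, the sketch below applies Mandarin third-tone sandhi (a tone 3 directly before another tone 3 is realized as tone 2) and a phrase-final lengthening rule. The base duration and lengthening factor are made-up values, the function names are hypothetical, and longer runs of third tones need more careful treatment than this.

```python
BASE_DURATION_MS = 200.0   # hypothetical default syllable duration
FINAL_LENGTHENING = 1.5    # hypothetical phrase-final lengthening factor

def apply_tone_rule(tones):
    """Tone rule: a tone 3 directly before another tone 3 becomes tone 2."""
    result = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == 3 and tones[i + 1] == 3:
            result[i] = 2
    return result

def apply_duration_rule(count):
    """Duration rule: the last syllable of the phrase is lengthened."""
    return [
        BASE_DURATION_MS * FINAL_LENGTHENING if i == count - 1 else BASE_DURATION_MS
        for i in range(count)
    ]

def synthesize_by_rule(syllables):
    """Map (pinyin, lexical_tone) pairs to (pinyin, surface_tone, duration_ms)."""
    tones = apply_tone_rule([tone for _, tone in syllables])
    durations = apply_duration_rule(len(syllables))
    return [(s, t, d) for (s, _), t, d in zip(syllables, tones, durations)]

targets = synthesize_by_rule([("ni", 3), ("hao", 3)])
```

The resulting targets would then drive a synthesizer; formant frequency rules and intonation rules would be further functions of the same shape.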

Categories of Speech Synthesis (3) Waveform Coding Synthesis
The synthesis unit is a word, phrase or sentence. The original units are recorded and encoded (possibly compressed) to form the speech database. At output time, the corresponding waveform is retrieved and, after some processing, the signal is generated and output. This kind of system is easy to build and low in cost, and its naturalness is good, but it requires much more memory. With advances in integrated circuits (IC), it is becoming a practical approach to synthesis. A simple example is a device that announces bus stops. Many chips can perform similar simple tasks.
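The bus-stop announcer idea can be sketched as a lookup-and-concatenate routine. This is a toy under stated assumptions: the "recordings" are sine tones standing in for stored waveforms, and the unit names, sample rate and pause length are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate in Hz

def fake_recording(freq_hz, duration_ms):
    """Stand-in for a stored recording: a plain sine tone."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

# Speech database: one stored waveform per phrase-sized unit.
SPEECH_BASE = {
    "next stop": fake_recording(300, 500),
    "central station": fake_recording(400, 800),
}

def announce(units, pause_ms=100):
    """Concatenate the stored waveforms, with a short silence between units."""
    pause = [0.0] * (SAMPLE_RATE * pause_ms // 1000)
    samples = []
    for index, unit in enumerate(units):
        if index > 0:
            samples.extend(pause)
        samples.extend(SPEECH_BASE[unit])
    return samples

announcement = announce(["next stop", "central station"])
```

The memory cost is visible here: every phrase is stored as a full waveform, which is why the approach suits small, fixed vocabularies.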

Categories of Speech Synthesis (4) Text-to-Speech Conversion System
The system input is a text string. The system comprises linguistic processing with a semantic dictionary; phonological processing with a phonetic dictionary; phonetic processing (prosodic rules, pronunciation variant rules and so on); and speech waveform generation. A real TTS system is therefore an artificial intelligence system, and it is an active research direction for many people. At present the intelligibility is acceptable, but the naturalness is not yet good. There is still a lot of work to be done in this area.
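The chain of stages in a text-to-speech system might be wired together as in the following sketch. Every stage here is a deliberately trivial stand-in: the dictionary entries, the fixed phoneme duration, the falling pitch contour and the sine-tone "waveform generation" are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate in Hz

# Hypothetical phonetic dictionary (ARPAbet-style symbols, illustration only).
PHONETIC_DICT = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

def linguistic_processing(text):
    """Stage 1: split the input text into words (no real analysis here)."""
    return text.lower().split()

def phonological_processing(words):
    """Stage 2: look each word up in the phonetic dictionary."""
    return [ph for word in words for ph in PHONETIC_DICT[word]]

def phonetic_processing(phonemes, duration_ms=80, f0_start=140.0, f0_end=100.0):
    """Stage 3: toy prosody rule, fixed duration and linearly falling pitch."""
    steps = max(len(phonemes) - 1, 1)
    return [
        (ph, duration_ms, f0_start + (f0_end - f0_start) * i / steps)
        for i, ph in enumerate(phonemes)
    ]

def generate_waveform(targets):
    """Stage 4: placeholder generator, one sine tone per phoneme target."""
    samples = []
    for _phoneme, duration_ms, f0 in targets:
        n = SAMPLE_RATE * duration_ms // 1000
        samples.extend(math.sin(2 * math.pi * f0 * i / SAMPLE_RATE) for i in range(n))
    return samples

targets = phonetic_processing(
    phonological_processing(linguistic_processing("Hello world"))
)
waveform = generate_waveform(targets)
```

The point of the sketch is the data flow (text, words, phonemes, prosodic targets, samples); each stage is where the dictionaries and rules named above would plug in.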

Categories of Speech Synthesis (5)
Some basic terminology:
Synthetic Unit
Synthetic Parameters
Database for Synthesis
Speech Synthesizer
Quality of Synthetic Speech

15.3 Chinese Speech Synthesis (1)
Work started in the 1960s and developed quickly in the late 1970s. All kinds of synthesizers now exist. Some store compressed waveforms in firmware and replay them when needed. PSOLA (Pitch-Synchronous Overlap-Add) is widely applied in waveform-based synthesis. The techniques continue to improve.
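The overlap-add idea behind PSOLA can be sketched for the simplest possible case. The code below assumes a signal whose pitch period is constant and already known, so pitch marks fall on every multiple of the period; real speech needs pitch-mark detection, and a practical implementation would also repeat or drop grains to preserve duration and would normalize the gain. The function name is hypothetical.

```python
import math

def psola_constant_pitch(signal, period, new_period):
    """Re-space two-period Hann-windowed grains to change the pitch period."""
    # Hann window spanning two pitch periods.
    window = [0.5 - 0.5 * math.cos(math.pi * n / period) for n in range(2 * period)]
    grain_count = len(signal) // period - 1
    out = [0.0] * ((grain_count - 1) * new_period + 2 * period)
    for k in range(grain_count):
        src = k * period        # grain start in the input (one grain per pitch mark)
        dst = k * new_period    # grain start in the output (new pitch-mark spacing)
        for n in range(2 * period):
            out[dst + n] += signal[src + n] * window[n]   # overlap-add
    return out

# Test input (an assumption): a sine with a 40-sample period.
period = 40
signal = [math.sin(2 * math.pi * n / period) for n in range(400)]
unchanged = psola_constant_pitch(signal, period, period)   # same spacing
raised = psola_constant_pitch(signal, period, 32)          # closer marks, higher pitch
```

With the spacing left unchanged, the overlapping windows sum to one and the interior of the signal is reproduced, which is a convenient sanity check. Moving the grains closer together raises the pitch and, in this simplified form, also shortens the output.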

Chinese Speech Synthesis (2)
A Chinese text-to-speech system consists of:
1. A text analyzer: a word segmentation program that splits each sentence into a word sequence. The segmentation error rate is about 10%. The main difficulties are overlap ambiguity between adjacent words and out-of-dictionary (new) words. Although many approaches have been proposed, none solves these problems completely.
2. Unit retrieval: the recorded speech is fetched from the database according to the word and phonetic dictionaries.
3. Rules for editing and processing the recorded speech.
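To make the overlap ambiguity concrete, here is a small sketch of dictionary-based segmentation using forward maximum matching. The slides do not say which segmentation method the system uses; forward maximum matching is chosen here only because it is simple, and the tiny dictionary and function name are hypothetical.

```python
def forward_max_match(sentence, dictionary, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary word."""
    words = []
    pos = 0
    while pos < len(sentence):
        longest = min(max_len, len(sentence) - pos)
        for length in range(longest, 0, -1):
            candidate = sentence[pos:pos + length]
            # A single character is always accepted, so unknown text still advances.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                pos += length
                break
    return words

# Hypothetical dictionary: "research", "graduate student", "life", "origin".
DICTIONARY = {"研究", "研究生", "生命", "起源"}

# Intended reading: 研究 / 生命 / 起源 ("study the origin of life").
segments = forward_max_match("研究生命起源", DICTIONARY)
```

Here the greedy matcher picks 研究生 ("graduate student") first because it is the longest match, which leaves 命 stranded: the character 生 belongs to two overlapping dictionary words. Out-of-dictionary words fail in a similar way, falling apart into single characters.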

Chinese Speech Synthesis (3)
After processing, the regenerated signals are sent to the audio card to produce speech. All kinds of changes can be made to the signal during processing.

15.4 Speech Generation and Synthesizer Please see the book on page