Speech Signal Processing I, by Edmilson Morais and Prof. Greg. Dogil. Second Lecture, Stuttgart, October 25, 2001.

The Speech Signal: Non-stationary signal. Voiced: almost periodic (concept of pitch). Unvoiced: random (noise-like). Transitions (bursts, ...). Range of the pitch (male, female).

Sampling Theory: Low-pass filter -> Sample and hold -> Low-pass filter. The signal x(n) has to be band-limited. The sampling frequency has to be higher than 2 times the maximum frequency in x(n).
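
The sampling condition above can be illustrated with a small sketch. The 8 kHz sampling rate, the two test tones, and the function names are assumptions chosen for illustration; they do not come from the lecture. A 7 kHz tone sampled at 8 kHz violates the condition and yields the same samples (up to sign) as a 1 kHz tone.

```python
import math

def sample_sine(freq_hz, fs_hz, n_samples):
    """Return n_samples of sin(2*pi*f*t) taken at t = n / fs."""
    return [math.sin(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]

def alias_frequency(freq_hz, fs_hz):
    """Apparent frequency (0..fs/2) of a real sinusoid after sampling at fs."""
    f = freq_hz % fs_hz
    return min(f, fs_hz - f)

fs = 8000.0                           # assumed sampling rate
ok = sample_sine(1000.0, fs, 16)      # 1 kHz is below fs/2: representable
bad = sample_sine(7000.0, fs, 16)     # 7 kHz is above fs/2: aliases to 1 kHz

# The 7 kHz samples equal the 1 kHz samples with inverted sign,
# so the sum of the two sequences is (numerically) zero.
max_diff = max(abs(a + b) for a, b in zip(ok, bad))
print(alias_frequency(7000.0, fs), max_diff)
```

This is why the low-pass (anti-aliasing) filter comes before the sample-and-hold stage: once the samples are taken, the two tones can no longer be told apart.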

Linear Filters: Finite impulse response (FIR) filters
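
A minimal sketch of an FIR filter, written as the direct convolution sum y[n] = sum_k b[k] x[n-k]. The function name and the 4-tap moving-average coefficients are illustrative assumptions, not taken from the lecture.

```python
def fir_filter(b, x):
    """Direct-form FIR filter: y[n] = sum_k b[k] * x[n-k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y

b = [0.25, 0.25, 0.25, 0.25]                  # assumed example: 4-tap moving average
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
h = fir_filter(b, impulse)                    # impulse response
print(h)
```

Feeding a unit impulse returns the coefficients themselves followed by zeros, which is what "finite impulse response" means.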

Matlab: Graphical visualization - Optimization on a quadratic (hyperparabolic) error surface. [Figure axes: mean squared error E, weight]
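
The idea of searching a quadratic error surface can be sketched for a single weight. This is an assumed toy setup (model y = w * x, plain gradient descent, hand-picked data and step size), written in Python rather than Matlab; it is not the lecture's demonstration.

```python
def mse(w, xs, ds):
    """Mean squared error E(w) for the one-weight model y = w * x."""
    return sum((d - w * x) ** 2 for x, d in zip(xs, ds)) / len(xs)

def mse_gradient(w, xs, ds):
    """dE/dw = -2 * mean(x * (d - w * x))."""
    return -2.0 * sum(x * (d - w * x) for x, d in zip(xs, ds)) / len(xs)

def gradient_descent(xs, ds, w0=0.0, step=0.05, n_iter=200):
    """Walk downhill on E(w); step must be small enough for stability."""
    w = w0
    for _ in range(n_iter):
        w -= step * mse_gradient(w, xs, ds)
    return w

xs = [1.0, 2.0, 3.0, 4.0]     # assumed inputs
ds = [2.0, 4.0, 6.0, 8.0]     # desired outputs, generated with weight 2
w_hat = gradient_descent(xs, ds)
print(w_hat, mse(w_hat, xs, ds))
```

Because E(w) is a parabola in w, there is a single minimum and the iteration converges to it for any sufficiently small step.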

SDSP: Looking through time. Speech signal: analog and digital. Sampling rate, quantization. [Figure axes: time, amplitude]
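
A small sketch of the quantization step that turns sampled amplitudes into a finite set of levels. The uniform quantizer, the 4-bit resolution, and the 200 Hz test tone are assumptions for illustration only.

```python
import math

def quantize(x, n_bits, x_max=1.0):
    """Uniform quantizer on [-x_max, x_max): returns the reconstruction level."""
    levels = 2 ** n_bits
    step = 2.0 * x_max / levels
    index = int(math.floor((x + x_max) / step))
    index = max(0, min(levels - 1, index))       # clip out-of-range input
    return -x_max + (index + 0.5) * step

fs = 8000.0          # assumed sampling rate
n_bits = 4           # assumed resolution
analog = [0.9 * math.sin(2 * math.pi * 200.0 * n / fs) for n in range(40)]
digital = [quantize(v, n_bits) for v in analog]

step = 2.0 / 2 ** n_bits
max_error = max(abs(a - d) for a, d in zip(analog, digital))
print(step, max_error)   # the error never exceeds half a quantization step
```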

SDSP: Transformations and digital filters. Transformations: Z-transform, Fourier transform. Digital filters: FIR, IIR.

SDSP – Frame-based analysis. Hanning window: w. Waveform multiplied by the Hanning window: xw. Magnitude of the spectrum of xw. Frequency response of the LP filter.
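
The first three steps of frame-based analysis (window w, windowed frame xw, magnitude spectrum of xw) can be sketched as follows. The frame length, sampling rate, test tone, and the naive DFT are assumptions for illustration; a real front end would use an FFT.

```python
import cmath
import math

def hann_window(n):
    """Symmetric Hanning window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitude(x):
    """Magnitude of the DFT for bins 0..N/2 (naive O(N^2) implementation)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

fs = 8000.0          # assumed sampling rate
frame_len = 64       # assumed frame length (8 ms at 8 kHz)
frame = [math.sin(2 * math.pi * 1000.0 * t / fs) for t in range(frame_len)]

w = hann_window(frame_len)
xw = [a * b for a, b in zip(frame, w)]        # waveform multiplied by the window
mag = dft_magnitude(xw)

peak_bin = max(range(len(mag)), key=lambda k: mag[k])
peak_hz = peak_bin * fs / frame_len
print(peak_bin, peak_hz)
```

The window tapers the frame edges to zero, which reduces spectral leakage at the cost of a wider main lobe around the peak.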

SDSP - Looking at frequency components through time. [Figure panels: current and previous frames, before smoothing and after smoothing]

SDSP: Vector quantization. Voronoi space: centroid and distortion measure.
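
A minimal sketch of how centroids, Voronoi cells, and a distortion measure fit together: one Lloyd (k-means) iteration with squared Euclidean distance. The choice of distance, the toy data, and the function names are assumptions, not the lecture's own algorithm.

```python
def sq_dist(a, b):
    """Squared Euclidean distance, used here as the distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, v):
    """Index of the codeword whose Voronoi cell contains v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def lloyd_step(codebook, data):
    """Assign vectors to Voronoi cells, then move each codeword to its cell centroid."""
    cells = [[] for _ in codebook]
    for v in data:
        cells[nearest(codebook, v)].append(v)
    new_codebook = []
    for c, cell in zip(codebook, cells):
        if cell:
            new_codebook.append([sum(v[d] for v in cell) / len(cell)
                                 for d in range(len(c))])
        else:
            new_codebook.append(list(c))      # keep an unused codeword as is
    return new_codebook

def mean_distortion(codebook, data):
    return sum(sq_dist(codebook[nearest(codebook, v)], v) for v in data) / len(data)

data = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [5.0, 4.0]]   # toy 2-D vectors
codebook = [[1.0, 1.0], [3.0, 3.0]]                        # initial codewords
before = mean_distortion(codebook, data)
codebook = lloyd_step(codebook, data)
after = mean_distortion(codebook, data)
print(codebook, before, after)
```

Repeating the step until the distortion stops decreasing gives the usual codebook training loop.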

TTS - Waveform generation for TTS. Analysis and resynthesis – coding and decoding.
  [Block diagram] Coding: original speech signal x; LP analysis (A); inverse filter A(z) (residue e); pitch marks; prototype sampling (En); F0; U/UV; storage environment.
  [Block diagram] Decoding: marks, F0, En, U/UV and prosodic information; TFI residue synthesis (e); synthesis filter 1/A(z); synthesized speech signal.
  Parametrization: mapping the waveform into a set of parameters.
  Reconstruction: synthesis of the waveform from the set of parameters.
  Prosody: F0, duration, amplitude.
  Symbols: A – LP coefficients; e – LP residue; En – prototypes; F0 – fundamental frequency; U/UV – voiced / unvoiced transitions.
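
The LP part of this analysis/resynthesis chain can be sketched as: estimate A(z), pass the signal through the inverse filter A(z) to obtain the residue e, then drive the synthesis filter 1/A(z) with e to get the signal back. The autocorrelation method with the Levinson-Durbin recursion is an assumption here (the slide does not name an estimation method), and all function names and the test signal are illustrative.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return A(z) coefficients [1, a1, ..., ap] and the final prediction error."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                        # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def inverse_filter(a, x):
    """Residue e[n] = sum_k a[k] * x[n-k], i.e. x filtered by A(z)."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]

def synthesis_filter(a, e):
    """x[n] = e[n] - sum_{k>=1} a[k] * x[n-k], i.e. e filtered by 1/A(z)."""
    x = []
    for n in range(len(e)):
        x.append(e[n] - sum(a[k] * x[n - k] for k in range(1, len(a)) if n - k >= 0))
    return x

# Toy signal: impulse response of the one-pole filter x[n] = 0.9 * x[n-1] + e[n].
x = [0.9 ** n for n in range(200)]
a, err = levinson_durbin(autocorr(x, 2), 2)
e = inverse_filter(a, x)
x_rec = synthesis_filter(a, e)
max_rec_error = max(abs(u - v) for u, v in zip(x, x_rec))
print(a, e[:3], max_rec_error)
```

For this toy signal the analysis recovers a1 close to -0.9 and the residue collapses to (approximately) the single exciting impulse, which is the reason the residue is cheaper to code than the waveform.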

TTS - Waveform generation for TTS.
  Speech coding: parametric coders, waveform coders, hybrid coders.
  TTS – concatenative approach: time-scale and frequency-scale modifications; spectral smoothing; unit selection.
  [Examples: original and resynthesized; modified; original and TTS]

ASR - Automatic Speech Recognition.
  Front-end signal processing: feature extraction (perceptual domain, articulatory domain).
  Acoustic modeling: HMM (Hidden Markov Model); ANN/HMM (hybrid models combining an Artificial Neural Network and an HMM).
  Statistical language modeling: N-grams, smoothing techniques.
  Search (decoding): Viterbi, stack decoding, ...
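
As a small illustration of the language-modeling item, here is a bigram model with add-one (Laplace) smoothing. Add-one is only one of many smoothing techniques and is chosen here as an assumption for brevity; the two-sentence corpus and the function names are made up.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams, with <s> and </s> as sentence boundaries."""
    histories, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        histories.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return histories, bigrams, vocab

def bigram_prob(w_prev, w, histories, bigrams, vocab):
    """Add-one smoothed estimate of P(w | w_prev)."""
    return (bigrams[(w_prev, w)] + 1) / (histories[w_prev] + len(vocab))

histories, bigrams, vocab = train_bigram(["a b", "a c"])
p_seen = bigram_prob("a", "b", histories, bigrams, vocab)     # seen bigram
p_unseen = bigram_prob("a", "a", histories, bigrams, vocab)   # unseen bigram
total = sum(bigram_prob("a", w, histories, bigrams, vocab) for w in vocab)
print(p_seen, p_unseen, total)
```

Smoothing gives the unseen bigram a small non-zero probability while the distribution over the vocabulary still sums to one.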

ASR – HMM – Topology: Ergodic model, Left-right model
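
The two topologies differ only in which entries of the transition matrix may be non-zero. A minimal sketch, assuming uniform probabilities for the ergodic case and a single "stay or advance" choice for the left-right case (both are illustrative choices, as are the function names):

```python
def ergodic_transitions(n_states):
    """Ergodic model: every state can be reached from every state."""
    return [[1.0 / n_states] * n_states for _ in range(n_states)]

def left_right_transitions(n_states, stay=0.6):
    """Left-right model: a state may only repeat or move to the next state."""
    a = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        a[i][i] = stay
        a[i][i + 1] = 1.0 - stay
    a[-1][-1] = 1.0              # the last state can only repeat
    return a

erg = ergodic_transitions(3)
lr = left_right_transitions(3)
for row in lr:
    print(row)
```

The left-right matrix is upper triangular, which matches the time-ordered nature of speech and is the usual choice for phone or word models.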

ASR – HMM – Basic principle. [Figure: state diagram with transitions labelled a]

ASR – HMM - Viterbi alignment
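
A minimal sketch of Viterbi alignment for a discrete-observation HMM: it finds the single most likely state sequence for an observation sequence. The two-state left-right model and its probabilities are made-up values for illustration, and the plain-probability arithmetic is a simplification (practical systems work with log probabilities).

```python
def viterbi(init, trans, emit, obs):
    """Return (best state path, probability of that path)."""
    n = len(init)
    delta = [init[s] * emit[s][obs[0]] for s in range(n)]
    back = []                                   # back-pointers per time step
    for o in obs[1:]:
        prev = delta
        ptr, delta = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] * trans[r][s])
            ptr.append(best)
            delta.append(prev[best] * trans[best][s] * emit[s][o])
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    best_prob = delta[state]
    path = [state]
    for ptr in reversed(back):                  # trace the best path backwards
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_prob

init = [1.0, 0.0]                               # always start in state 0
trans = [[0.5, 0.5], [0.0, 1.0]]                # left-right topology
emit = [[0.9, 0.1], [0.2, 0.8]]                 # emit[state][symbol]
obs = [0, 0, 1, 1]
path, prob = viterbi(init, trans, emit, obs)
print(path, prob)
```

The returned path assigns each observation frame to a state, which is the alignment used to segment training data.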

ASR – HMM – Forward-Backward
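
A minimal sketch of the forward and backward recursions for a discrete-observation HMM, together with the state posteriors they produce. The model values are made up for illustration and no scaling is applied, so this form is only suitable for short sequences.

```python
def forward(init, trans, emit, obs):
    """alpha[t][s] = P(o_0..o_t, state_t = s)."""
    n = len(init)
    alpha = [[init[s] * emit[s][obs[0]] for s in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[r] * trans[r][s] for r in range(n)) * emit[s][o]
                      for s in range(n)])
    return alpha

def backward(trans, emit, obs):
    """beta[t][s] = P(o_{t+1}..o_{T-1} | state_t = s)."""
    n = len(trans)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(trans[s][r] * emit[r][o] * nxt[r] for r in range(n))
                        for s in range(n)])
    return beta

def state_posteriors(alpha, beta):
    """gamma[t][s] = P(state_t = s | all observations)."""
    likelihood = sum(alpha[-1])
    return [[a * b / likelihood for a, b in zip(at, bt)]
            for at, bt in zip(alpha, beta)]

init = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
emit = [[0.9, 0.1], [0.2, 0.8]]                 # emit[state][symbol]
obs = [0, 0, 1, 1]

alpha = forward(init, trans, emit, obs)
beta = backward(trans, emit, obs)
gamma = state_posteriors(alpha, beta)
lik_forward = sum(alpha[-1])
lik_backward = sum(init[s] * emit[s][obs[0]] * beta[0][s] for s in range(len(init)))
print(lik_forward, lik_backward)
```

Both passes give the same total likelihood, and the posteriors gamma are the quantities accumulated when re-estimating the model parameters.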

ASR – ANN/HMM

Evaluation: Exercises and Simulations.
  List of exercises: SDSP, TTS, ASR.
  Simulations:
    SDSP: vector quantization.
    TTS: waveform interpolation.
    ASR: acoustic modeling using HMM and ANN+HMM; language modeling; decoding.

Evaluation: Report.
  Write the analysis and results of the simulation in the format of a paper: 4 pages, two columns.
  Sections: Abstract; Introduction; Brief theoretical description of the method; Methodology used to perform the experiment; Results; Conclusions and suggestions for further work; Bibliography.

Days of classes.
  Normal semester. 2001: October 18, 25; November 8, 15, 22, 29 (November 1 is a holiday); December 6, 13, 20. 2002: January 10, 17, 24, 31; February 7, 14. Total: 15 days.
  Option one. 2001: October 16, 18, 23, 25, 30; November 6, 8, 13, 15, 20, 22, 27, 29. 2002: February 5, 7, 12, 14. Total: 17 days.
  Option two. 2001: October 18, 25; November 8, 15, 22, 29. 2002: February 7, 14; March: a one-week block seminar, 1.5 hours a day. Total: 13 days.
  Option three. 2002: March: a one-week block seminar, 3 hours a day. Equivalent to 15 days.
