Speech Signal Processing I By Edmilson Morais And Prof. Greg. Dogil Second Lecture Stuttgart, October 25, 2001.

Presentation transcript:

The Speech Signal
- Non-stationary signal
- Voiced: almost periodic (concept of pitch)
- Unvoiced: random (noise-like)
- Transitions (bursts, ...)
- Range of the pitch (male, female)

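Sampling Theory
- Block diagram: low-pass filter, sample, hold, low-pass filter
- The input signal x(t) has to be band-limited
- The sampling frequency has to be at least 2 times the maximum frequency in x(t), and strictly higher if x(t) has a component exactly at that maximum frequency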
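
The sampling condition can be checked numerically. The following is a minimal sketch, not taken from the lecture: the function names and the 8000 Hz sampling rate are assumptions chosen for illustration. It shows that a 7000 Hz cosine sampled at 8000 Hz yields the same samples as a 1000 Hz cosine, which is the aliasing that the low-pass filter before the sampler is meant to prevent.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (between 0 and fs/2) of a real sinusoid of f_hz sampled at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def sample_cosine(f_hz, fs_hz, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * f_hz * n / fs_hz) for n in range(n_samples)]

fs = 8000                             # assumed sampling rate in Hz
low = sample_cosine(1000, fs, 16)     # below fs/2: represented correctly
high = sample_cosine(7000, fs, 16)    # above fs/2: folds back to 1000 Hz
max_diff = max(abs(a - b) for a, b in zip(low, high))

print(alias_frequency(7000, fs), max_diff)
```

The two sample sequences differ only by rounding error, so after sampling the two tones can no longer be told apart.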

Linear Filters
- Finite impulse response (FIR) filters
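
An FIR filter computes each output sample as a weighted sum of the current and past input samples, y[n] = sum over k of b[k] x[n-k]. The sketch below is illustrative only; the function name and the 4-tap moving-average coefficients are assumptions, not material from the slides.

```python
def fir_filter(b, x):
    """Direct-form FIR filter: y[n] = sum_k b[k] * x[n - k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y

b = [0.25, 0.25, 0.25, 0.25]                 # assumed example: 4-tap moving average
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
h = fir_filter(b, impulse)                   # impulse response equals the coefficients
print(h)
```

Feeding a unit impulse returns the coefficients followed by zeros, which is what "finite impulse response" means.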

Matlab : Graphical visualization of optimization on a parabolic (quadratic) surface
- Figure axes: mean squared error E, weight
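
The slide does not say which model or optimizer was plotted. As a sketch under the assumption of a single-weight linear model y = w x trained by gradient descent, the mean squared error E(w) is a quadratic function of the weight, and each step moves w toward the bottom of that bowl. All names, data values and the step size below are hypothetical.

```python
def mse(w, xs, ds):
    """Mean squared error E(w) between desired outputs ds and model outputs w * x."""
    return sum((d - w * x) ** 2 for x, d in zip(xs, ds)) / len(xs)

def mse_gradient(w, xs, ds):
    """Derivative dE/dw of the mean squared error."""
    return sum(-2 * x * (d - w * x) for x, d in zip(xs, ds)) / len(xs)

def gradient_descent(xs, ds, w0=0.0, step=0.05, iterations=200):
    """Plain gradient descent on E(w), starting from w0."""
    w = w0
    for _ in range(iterations):
        w -= step * mse_gradient(w, xs, ds)
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ds = [2.0, 4.0, 6.0, 8.0]     # assumed data, generated with the true weight 2.0
w_hat = gradient_descent(xs, ds)
print(w_hat, mse(w_hat, xs, ds))
```

Evaluating mse over a grid of weights and plotting it would reproduce the kind of error surface the slide refers to.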

SDSP : Looking through time
- Speech signal: analog and digital
- Sampling rate and quantization
- Figure axes: time, amplitude
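
Quantization maps each sample amplitude to one of a finite number of levels. A minimal sketch of a uniform quantizer follows, assuming amplitudes normalized to the range [-1, 1) and a signed integer code. The function names and the 8-bit word length are illustrative choices, not taken from the slides.

```python
import math

def quantize_uniform(x, n_bits):
    """Map an amplitude in [-1, 1) to a signed integer code of n_bits bits (with clipping)."""
    levels = 2 ** (n_bits - 1)
    code = math.floor(x * levels + 0.5)       # round to the nearest level
    return max(-levels, min(levels - 1, code))

def dequantize_uniform(code, n_bits):
    """Map an integer code back to an amplitude."""
    return code / 2 ** (n_bits - 1)

n_bits = 8                                    # assumed word length
x = 0.3
code = quantize_uniform(x, n_bits)
x_hat = dequantize_uniform(code, n_bits)
error = abs(x - x_hat)                        # at most half a quantization step
print(code, x_hat, error)
```

For amplitudes inside the range, the error is bounded by half a step, here 1/256; each additional bit halves that bound.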

SDSP : Transformation and Digital filters
- Transformations: Z-transforms, Fourier transforms
- Digital filters: FIR, IIR

SDSP – Frame-based analysis
- Hanning window: w
- Waveform multiplied by the Hanning window: xw
- Magnitude of the spectrum of xw
- Frequency response of the LP filter
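
The first three items can be sketched as follows. This is an illustration under stated assumptions rather than the lecture's own code: the frame is a synthetic 1000 Hz sine sampled at 8000 Hz, the frame length is 64 samples, and the spectrum is computed with a direct (slow) DFT to keep the example free of external libraries. The names w and xw follow the slide; the other names are hypothetical.

```python
import cmath
import math

def hann_window(length):
    """Symmetric Hanning window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def dft_magnitude(frame):
    """Magnitude of the DFT for bins 0 .. N/2 (direct O(N^2) evaluation)."""
    n_points = len(frame)
    mags = []
    for k in range(n_points // 2 + 1):
        acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                  for n in range(n_points))
        mags.append(abs(acc))
    return mags

fs = 8000                                            # assumed sampling rate in Hz
frame_len = 64                                       # assumed frame length
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(frame_len)]
w = hann_window(frame_len)
xw = [xi * wi for xi, wi in zip(x, w)]               # windowed frame
spectrum = dft_magnitude(xw)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * fs / frame_len
print(peak_bin, peak_hz)
```

The window tapers the frame edges to zero, which reduces spectral leakage at the cost of a wider main lobe around the peak.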

SDSP - Looking at frequency components through time
- Figure panels: current and previous frames, before smoothing and after smoothing

SDSP : Vector quantization
- Voronoi space: centroid and distortion measure
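
The roles of the Voronoi cells, the centroid and the distortion measure can be shown with one iteration of the generalized Lloyd (k-means style) codebook update. This is a sketch under assumptions: squared Euclidean distance is used as the distortion measure, and the training vectors and initial codebook are made-up values. The slide does not state which training algorithm the course uses.

```python
def squared_distance(a, b):
    """Squared Euclidean distance, used here as the distortion measure."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_codeword(vector, codebook):
    """Index of the codeword whose Voronoi cell contains the vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))

def lloyd_iteration(vectors, codebook):
    """Assign vectors to Voronoi cells, then move each codeword to its cell centroid."""
    cells = [[] for _ in codebook]
    for v in vectors:
        cells[nearest_codeword(v, codebook)].append(v)
    new_codebook = []
    for cell, old in zip(cells, codebook):
        if cell:
            dim = len(old)
            new_codebook.append([sum(v[d] for v in cell) / len(cell) for d in range(dim)])
        else:
            new_codebook.append(list(old))   # an empty cell keeps its codeword
    return new_codebook

def average_distortion(vectors, codebook):
    """Mean distortion between each vector and its nearest codeword."""
    return sum(squared_distance(v, codebook[nearest_codeword(v, codebook)])
               for v in vectors) / len(vectors)

vectors = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]   # assumed training data
codebook = [[1.0, 1.0], [3.0, 3.0]]                          # assumed initial codebook
before = average_distortion(vectors, codebook)
codebook = lloyd_iteration(vectors, codebook)
after = average_distortion(vectors, codebook)
print(before, after, codebook)
```

Repeating the iteration until the distortion stops decreasing gives a locally optimal codebook.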

TTS - Waveform generation for TTS
- Analysis and resynthesis (coding and decoding)
- Coding: original speech signal x, LP analysis A(z), inverse filter A(z), residue e, pitch marks, prototype sampling, prototypes En
- Storage environment: A, En, Fo, marks, U/UV
- Decoding: prosodic information (Fo, marks, U/UV), TFI residue synthesis, synthesis filter 1/A(z), synthesized speech signal
- Parametrization: mapping the waveform into a set of parameters
- Reconstruction: synthesis of the waveform from the set of parameters
- Prosody: F0, duration, amplitude
- Symbols: A = LP coefficients; e = LP residue; En = prototypes; Fo = fundamental frequency; U/UV = voiced/unvoiced transitions
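
The LP analysis, inverse filter and synthesis filter stages can be sketched as below. The assumptions are loud ones: the autocorrelation method with the Levinson-Durbin recursion is used for the LP analysis (the slide does not say which method the course uses), the input is a synthetic two-tone signal rather than speech, the prediction order is 4, and the prototype and TFI stages are left out entirely. All function names are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x))) for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return a = [1, a1, ..., ap] with A(z) = 1 + sum_k a_k z^-k, and the error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def inverse_filter(a, x):
    """Residue e[n] = sum_k a[k] x[n-k]: the signal filtered by A(z)."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]

def synthesis_filter(a, e):
    """x[n] = e[n] - sum_{k>=1} a[k] x[n-k]: the residue filtered by 1/A(z)."""
    x = []
    for n in range(len(e)):
        x.append(e[n] - sum(a[k] * x[n - k] for k in range(1, len(a)) if n - k >= 0))
    return x

x = [math.sin(0.3 * n) + 0.5 * math.sin(0.9 * n) for n in range(200)]  # assumed test signal
order = 4                                                               # assumed LP order
a, err = levinson_durbin(autocorrelation(x, order), order)
e = inverse_filter(a, x)            # coding side: residue
x_rec = synthesis_filter(a, e)      # decoding side: resynthesis
max_error = max(abs(u - v) for u, v in zip(x, x_rec))
print(a, max_error)
```

Because the synthesis filter is the exact inverse of the inverse filter, the unmodified residue reconstructs the input up to rounding error. In a coder, the saving comes from storing a compact description of the residue instead of the residue itself.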

TTS - Waveform generation for TTS
- Speech coding: parametric coders, waveform coders, hybrid coders
- TTS, concatenative approach: time-scale and frequency-scale modifications, spectral smoothing, unit selection
- Figure labels: Original, Resynthesized sin(x+ ), Modified: sin(x+ ), Original, TTS

ASR - Automatic Speech Recognition
- Front-end signal processing: feature extraction (perceptual domain, articulatory domain)
- Acoustic modeling: HMM (Hidden Markov Model); ANN/HMM hybrid models (Artificial Neural Network and HMM)
- Statistical language modeling: N-grams, smoothing techniques
- Search (decoding): Viterbi, stack decoding, ...
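
As an illustration of the language modeling item, here is a minimal bigram model with add-one (Laplace) smoothing. This is only one of many smoothing techniques and is chosen for brevity; the slide does not say which ones the course covers. The toy corpus and the function names are assumptions.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams, with sentence boundary markers <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            unigrams[prev] += 1              # how often prev occurs as a history
            bigrams[(prev, word)] += 1
    return unigrams, bigrams, vocab

def bigram_prob(prev, word, unigrams, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

corpus = ["the cat sat", "the dog sat"]      # assumed toy corpus
unigrams, bigrams, vocab = train_bigram(corpus)
p_seen = bigram_prob("the", "cat", unigrams, bigrams, vocab)
p_unseen = bigram_prob("the", "sat", unigrams, bigrams, vocab)
print(p_seen, p_unseen)
```

Smoothing gives the unseen bigram "the sat" a small non-zero probability, so a sentence containing it is not ruled out during decoding.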

ASR – HMM - Topology
- Ergodic model
- Left-right model

ASR – HMM – Basic principle
- Figure: state diagram with transition probabilities a

ASR – HMM - Viterbi alignment
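
Viterbi alignment finds the single most likely state sequence for an observation sequence. The sketch below works in the log domain on a discrete-observation, two-state left-right HMM. The model parameters, the symbols "a" and "b", and all function names are made-up assumptions for illustration, not values from the lecture.

```python
import math

def safe_log(p):
    """Logarithm that maps probability 0 to minus infinity."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state path and its log probability."""
    score = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        prev_score = score[-1]
        col, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: prev_score[p] + log_trans[p][s])
            col[s] = prev_score[best_prev] + log_trans[best_prev][s] + log_emit[s][obs]
            ptr[s] = best_prev
        score.append(col)
        back.append(ptr)
    last = max(states, key=lambda s: score[-1][s])
    path = [last]
    for ptr in reversed(back):               # follow the back-pointers
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[-1][last]

states = ["s1", "s2"]                         # assumed left-right model
start = {"s1": 1.0, "s2": 0.0}
trans = {"s1": {"s1": 0.6, "s2": 0.4}, "s2": {"s1": 0.0, "s2": 1.0}}
emit = {"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}}

log_start = {s: safe_log(start[s]) for s in states}
log_trans = {p: {s: safe_log(trans[p][s]) for s in states} for p in states}
log_emit = {s: {o: safe_log(v) for o, v in emit[s].items()} for s in states}

path, log_prob = viterbi(["a", "a", "b", "b"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

The returned path assigns each observation to a state, which is the alignment used to segment training data when HMM parameters are re-estimated.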

ASR – HMM – Forward-Backward
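
The forward and backward recursions compute the total likelihood of the observations and, combined, the posterior probability of each state at each time. The sketch below uses plain probabilities without scaling, which is adequate only for short sequences. The two-state model, the symbols and the function names are assumptions made for illustration.

```python
def forward(observations, states, start, trans, emit):
    """alpha[t][s] = P(o_1..o_t, state at time t = s)."""
    alpha = [{s: start[s] * emit[s][observations[0]] for s in states}]
    for obs in observations[1:]:
        prev = alpha[-1]
        alpha.append({s: emit[s][obs] * sum(prev[p] * trans[p][s] for p in states)
                      for s in states})
    return alpha

def backward(observations, states, trans, emit):
    """beta[t][s] = P(o_{t+1}..o_T | state at time t = s)."""
    beta = [{s: 1.0 for s in states}]
    for obs in reversed(observations[1:]):
        nxt = beta[0]
        beta.insert(0, {s: sum(trans[s][q] * emit[q][obs] * nxt[q] for q in states)
                        for s in states})
    return beta

def state_posteriors(alpha, beta, states):
    """gamma[t][s] = P(state at time t = s | all observations)."""
    total = sum(alpha[-1][s] for s in states)
    return [{s: a[s] * b[s] / total for s in states} for a, b in zip(alpha, beta)]

states = ["s1", "s2"]                         # assumed left-right model
start = {"s1": 1.0, "s2": 0.0}
trans = {"s1": {"s1": 0.6, "s2": 0.4}, "s2": {"s1": 0.0, "s2": 1.0}}
emit = {"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}}
observations = ["a", "a", "b", "b"]

alpha = forward(observations, states, start, trans, emit)
beta = backward(observations, states, trans, emit)
likelihood = sum(alpha[-1][s] for s in states)
gamma = state_posteriors(alpha, beta, states)
print(likelihood, gamma)
```

The posteriors are the soft state occupancies that Baum-Welch re-estimation accumulates, in contrast to the hard alignment produced by Viterbi.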

ASR – ANN/HMM

Evaluation : Exercises and Simulations
- List of exercises: SDSP, TTS, ASR
- Simulations:
  - SDSP: vector quantization
  - TTS: waveform interpolation
  - ASR: acoustic modeling using HMM and ANN+HMM; language modeling; decoding

Evaluation : Report
- Write the analysis and results of the simulation in the format of a paper: 4 pages, two columns.
- Sections: abstract; introduction; brief theoretical description of the method; methodology used to perform the experiment; results; conclusions and suggestions for further work; bibliography

Days of classes
- Normal semester: 2001 October: 18, 25; November: 8, 15, 22, 29 (November 1 is a holiday); December: 6, 13; January: 10, 17, 24, 31; February: 7, 14. Total: 15 days.
- Option one: 2001 October: 16, 18, 23, 25, 30; November: 6, 8, 13, 15, 20, 22, 27; February: 5, 7, 12, 14. Total: 17 days.
- Option two: 2001 October: 18, 25; November: 8, 15, 22; February: 7, 14; March: a one-week block seminar, 1.5 hours a day. Total: 13 days.
- Option three: 2002 March: a one-week block seminar, 3 hours a day. Equivalent to 15 days.