Speech Signal Processing I


Speech Signal Processing I By Edmilson Morais And Prof. Greg. Dogil Stuttgart, October 18, 2001

Goals of the Course
Our part: basic theoretical concepts of speech signal processing (SDSP); waveform generation for TTS systems (TTS); automatic speech recognition with a statistical approach (ASR); fundamentals of programming in Matlab, the tool used for our simulations.
Your part: describe and justify the important aspects and drawbacks of each algorithm.
Next term: Speech Signal Processing II, going deeper into the theoretical and practical aspects of SSP, TTS and ASR.

Matlab Tutorial
Principles of linear algebra: vectors, matrices, linear systems.
Programming in Matlab: variables, operators, ...; if statements, switch statements, for loops, while loops, continue statements, break statements, ...; I/O operations; graphical visualization; executable files; subroutines.
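As a first taste of the tutorial topics, the sketch below solves a small linear system and uses a for loop with an if statement. The matrix, the vectors, and all variable names are illustrative assumptions, not taken from the course material.

```matlab
% Solve the linear system A*x = b (values chosen only for illustration).
A = [2 1; 1 3];
b = [3; 5];
x = A \ b;                  % backslash solves the system without forming inv(A)

% Check the solution: the residual should be numerically zero.
residual = norm(A*x - b);

% Control flow: count the positive entries of a vector.
v = [-1 4 0 2 -3];
nPositive = 0;
for k = 1:numel(v)
    if v(k) > 0
        nPositive = nPositive + 1;
    end
end
fprintf('x = [%g %g], residual = %g, positives = %d\n', ...
        x(1), x(2), residual, nPositive);
```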

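Matlab: Graphical visualization
[X,Y] = meshgrid(-8:.5:8);       % grid over [-8, 8] in both dimensions
R = sqrt(X.^2 + Y.^2) + eps;     % radius; eps avoids 0/0 at the origin
Z = sin(R)./R;                   % height of the surface
figure; mesh(X,Y,Z,'EdgeColor','black')                      % wireframe plot
figure; surf(X,Y,Z,'FaceColor','red','EdgeColor','none');    % shaded surface
camlight left; lighting phong                                % light the shaded surface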

Matlab: Graphical visualization. Optimization on a hyperbolic (quadratic) surface.
[Figure: mean squared error E plotted against the weights.]
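One common way to optimize over a quadratic mean-squared-error surface is gradient descent. The slide does not say which method the course uses, so the sketch below is an assumption, as are the data, the step size mu, and the iteration count.

```matlab
% Gradient descent on the mean squared error E(w) = mean((d - X*w).^2).
X = [1 0; 0 1; 1 1; 1 -1];     % four input vectors, two weights (made-up data)
wTrue = [0.5; -1.5];           % weights used to generate the targets
d = X * wTrue;                 % noise-free desired outputs
w = [0; 0];                    % initial weights
mu = 0.1;                      % step size (assumed)
N = size(X, 1);
E = zeros(1, 200);
for it = 1:200
    err = d - X*w;                  % current error vector
    E(it) = mean(err.^2);           % height of the error surface at w
    grad = -(2/N) * (X' * err);     % gradient of E with respect to w
    w = w - mu * grad;              % step downhill
end
```

Plotting E against the iteration index shows the error falling toward the minimum of the surface.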

SDSP: Looking through time
Speech signal: analog and digital.
[Figure: amplitude versus time, showing the sampling rate (time axis) and quantization (amplitude axis).]
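Sampling and amplitude quantization can be tried out in a few lines. This is a minimal sketch in which a 200 Hz tone stands in for the analog speech signal; the sampling rate, the bit depth, and the choice of a mid-rise uniform quantizer are all assumptions.

```matlab
fs = 8000;                     % sampling rate in Hz (assumed value)
N  = 80;                       % 10 ms of samples
t  = (0:N-1) / fs;             % sampling instants in seconds
x  = 0.9 * sin(2*pi*200*t);    % stand-in for the analog signal

nBits = 4;                     % bits per sample (assumed)
delta = 2 / 2^nBits;           % quantizer step for the range [-1, 1)
xq = delta * floor(x / delta) + delta/2;   % mid-rise uniform quantizer
qErr = x - xq;                 % quantization error, bounded by delta/2
```

Raising nBits halves delta for every extra bit, which shrinks the error bound accordingly.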

SDSP: Transformations and digital filters
Transformations: Z-transform, Fourier transform.
Digital filters: FIR, IIR.
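The difference between FIR and IIR filters shows up directly in their impulse responses. The sketch below uses Matlab's built-in filter and fft functions; the two filters (a 3-point moving average and a one-pole lowpass) are illustrative choices, not filters from the course.

```matlab
x = [1 zeros(1, 9)];            % unit impulse, 10 samples

% FIR: 3-point moving average, y[n] = (x[n] + x[n-1] + x[n-2]) / 3
bFir = [1 1 1] / 3;
hFir = filter(bFir, 1, x);      % finite impulse response: three nonzero taps

% IIR: one-pole lowpass, y[n] = 0.5*x[n] + 0.5*y[n-1]
bIir = 0.5;
aIir = [1 -0.5];
hIir = filter(bIir, aIir, x);   % response decays geometrically, never exactly zero

% Frequency response of the FIR filter from the DFT of its coefficients
H = fft(bFir, 64);
```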

SDSP: Frame-based analysis
[Figure panels: Hanning window w; waveform multiplied by the Hanning window, xw; magnitude of the spectrum of xw; frequency response of the LP filter.]
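The four quantities named on this slide can be computed for a single frame as follows. This is a sketch under stated assumptions: the frame is a synthetic two-tone signal rather than speech, the window is written out explicitly instead of calling a toolbox function, and the LP coefficients come from the autocorrelation method with order p = 8, which the slide does not specify.

```matlab
fs = 8000;  N = 240;                    % 30 ms frame at 8 kHz (assumed values)
n = (0:N-1)';
x = sin(2*pi*500*n/fs) + 0.5*sin(2*pi*1500*n/fs);   % synthetic stand-in frame

w  = 0.5 * (1 - cos(2*pi*n/(N-1)));     % Hanning window w
xw = x .* w;                            % windowed frame xw

nfft = 512;
Xmag = abs(fft(xw, nfft));              % magnitude of the spectrum of xw

% LP analysis by the autocorrelation method, order p
p = 8;
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(xw(1:N-k) .* xw(1+k:N));    % autocorrelation at lag k
end
R = toeplitz(r(1:p)) + 1e-9*r(1)*eye(p);     % small diagonal load for conditioning
a = R \ r(2:p+1);                            % predictor: x[n] ~ sum_k a(k)*x[n-k]
A = [1; -a];                                 % inverse filter A(z)
Hlp = 1 ./ abs(fft(A, nfft));                % LP filter response, up to a gain

[~, kPeak] = max(Xmag(1:nfft/2));
fPeak = (kPeak - 1) * fs / nfft;             % strongest spectral peak in Hz
```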

SDSP: Looking at frequency components through time
[Figure: current and previous frames, before smoothing and after smoothing.]

SDSP: Vector quantization
Voronoi space: centroid and distortion measure.
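A codebook can be trained by alternating the nearest-neighbour rule, which defines the Voronoi cells, with the centroid rule. The sketch below is a plain k-means style iteration on made-up 2-D data with a squared Euclidean distortion measure; the course may use a different training procedure or distortion.

```matlab
% Training vectors: two well-separated groups in 2-D (made-up data).
X = [0 0; 0 1; 1 0; 1 1; 8 8; 8 9; 9 8; 9 9];
C = X([1 5], :);                 % initial codebook: one vector from each group
K = size(C, 1);
for it = 1:10
    % Nearest-neighbour rule: assign each vector to its closest centroid.
    D = zeros(size(X, 1), K);
    for k = 1:K
        diffs = X - repmat(C(k,:), size(X, 1), 1);
        D(:,k) = sum(diffs.^2, 2);           % squared Euclidean distortion
    end
    [dmin, idx] = min(D, [], 2);
    % Centroid rule: move each codeword to the mean of its cell.
    for k = 1:K
        if any(idx == k)
            C(k,:) = mean(X(idx == k, :), 1);
        end
    end
end
distortion = mean(dmin);         % average distortion measured in the last pass
```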

TTS: Waveform generation. Analysis and resynthesis (coding and decoding)
[Figure: block diagram of the LP analysis and resynthesis chain.]
Parametrization: mapping the waveform into a set of parameters.
Reconstruction: synthesis of the waveform from the set of parameters.
Prosody: F0, duration, amplitude.
Legend: A = LP coefficients; e = LP residue; En = prototypes; F0 = fundamental frequency; U/UV = voiced/unvoiced transitions.
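The analysis and resynthesis idea can be sketched with linear prediction alone: inverse filtering with A(z) yields the residue e, and the synthesis filter 1/A(z) driven by e restores the waveform. The signal below is synthetic, and the LP order and the autocorrelation method are assumptions; prototypes, F0, and voicing decisions are left out.

```matlab
% Stand-in "speech" frame: a decaying resonance driven by an impulse train.
N = 200;
exc = zeros(N, 1);  exc(1:50:N) = 1;         % impulse train, period 50 samples
x = filter(1, [1 -1.3 0.8], exc);            % synthetic signal (made-up filter)

% Analysis: LP coefficients by the autocorrelation method, order p.
p = 2;
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(x(1:N-k) .* x(1+k:N));      % autocorrelation at lag k
end
a = toeplitz(r(1:p)) \ r(2:p+1);
A = [1; -a];                                 % A(z) = 1 - sum_k a(k) z^-k

% Parametrization: inverse filtering gives the LP residue e.
e = filter(A, 1, x);

% Reconstruction: the synthesis filter 1/A(z) driven by e restores x.
xRec = filter(1, A, e);
maxErr = max(abs(x - xRec));                 % should be at rounding-error level
```

In a coder, e would be replaced by a compact description before resynthesis, so the reconstruction would no longer be exact.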

TTS: Waveform generation
Speech coding: parametric coders, waveform coders, hybrid coders.
TTS, concatenative approach: time-scale and frequency-scale modifications; spectral smoothing; unit selection.
[Audio examples: Original, TTS; Original, Resynthesized, Modified: sin(x+)]

ASR: Automatic Speech Recognition
Front-end signal processing: feature extraction in the perceptual domain and the articulatory domain.
Acoustic modeling: HMM (Hidden Markov Model); ANN/HMM (hybrid models combining an Artificial Neural Network and an HMM).
Statistical language modeling: N-grams, smoothing techniques.
Search (decoding): Viterbi, stack decoding, ...
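For the language modeling item, a bigram model with add-one (Laplace) smoothing is about the smallest complete example. The corpus, the vocabulary size, and the choice of add-one smoothing are assumptions made for illustration; the slide does not name a specific smoothing technique.

```matlab
% Toy corpus coded as word indices (vocabulary of V = 3 hypothetical words).
V = 3;
corpus = [1 2 3 1 2 1 3 1 2 3];
counts = zeros(V, V);                 % counts(i,j): word i followed by word j
for t = 1:numel(corpus)-1
    counts(corpus(t), corpus(t+1)) = counts(corpus(t), corpus(t+1)) + 1;
end
% Add-one smoothing: every bigram receives a pseudo-count of 1,
% so unseen bigrams keep a nonzero probability.
P = (counts + 1) ./ repmat(sum(counts, 2) + V, 1, V);   % P(i,j) = P(word j | word i)
```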

ASR – HMM – Topology
Ergodic model; left-right model.

ASR – HMM – Basic principle

ASR – HMM – Viterbi alignment
[Figure: Viterbi alignment, panels (a) to (d).]
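A Viterbi alignment finds the single most likely state sequence for an observation sequence. The sketch below works in the log domain on a hypothetical 3-state left-right HMM with two discrete symbols; every probability and the observation sequence are made up for illustration.

```matlab
A  = [0.6 0.4 0.0;      % A(i,j): probability of moving from state i to state j
      0.0 0.7 0.3;
      0.0 0.0 1.0];
B  = [0.9 0.1;          % B(j,k): probability that state j emits symbol k
      0.2 0.8;
      0.7 0.3];
pi0 = [1 0 0];          % always start in state 1
obs = [1 1 2 2 1];      % observation sequence (symbol indices)

T = numel(obs);  S = size(A, 1);
logA = log(A);  logB = log(B);  logPi = log(pi0);   % log(0) = -Inf marks forbidden moves
delta = -Inf(T, S);     % best log score of any path ending in state j at time t
psi   = zeros(T, S);    % back-pointers
delta(1,:) = logPi + logB(:, obs(1))';
for t = 2:T
    for j = 1:S
        [best, psi(t,j)] = max(delta(t-1,:) + logA(:,j)');
        delta(t,j) = best + logB(j, obs(t));
    end
end

% Backtrack from the best final state.
path = zeros(1, T);
[logProb, path(T)] = max(delta(T,:));
for t = T-1:-1:1
    path(t) = psi(t+1, path(t+1));
end
disp(path)              % state assigned to each observation
```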

ASR – HMM – Forward-Backward
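The forward and backward recursions can be written compactly with matrix products. This sketch uses a hypothetical 3-state left-right HMM with made-up probabilities and omits the scaling that longer sequences need to avoid numerical underflow.

```matlab
A  = [0.6 0.4 0.0; 0.0 0.7 0.3; 0.0 0.0 1.0];   % transition probabilities
B  = [0.9 0.1; 0.2 0.8; 0.7 0.3];               % emission probabilities
pi0 = [1 0 0];                                  % initial state distribution
obs = [1 1 2 2 1];                              % observation sequence
T = numel(obs);  S = size(A, 1);

% Forward pass: alpha(t,j) = P(o_1..o_t, state j at time t)
alpha = zeros(T, S);
alpha(1,:) = pi0 .* B(:, obs(1))';
for t = 2:T
    alpha(t,:) = (alpha(t-1,:) * A) .* B(:, obs(t))';
end

% Backward pass: beta(t,i) = P(o_t+1..o_T | state i at time t)
beta = ones(T, S);
for t = T-1:-1:1
    beta(t,:) = (A * (B(:, obs(t+1)) .* beta(t+1,:)'))';
end

likelihood = sum(alpha(T,:));             % P(obs | model)
gamma = (alpha .* beta) / likelihood;     % gamma(t,j) = P(state j at time t | obs)
```

The state posteriors in gamma are the quantities that re-estimation of the model parameters builds on.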

ASR – ANN/HMM

Evaluation: Exercises and simulations
List of exercises: SDSP, TTS, ASR.
Simulations: SDSP (vector quantization); TTS (waveform interpolation); ASR (acoustic modeling using HMM and ANN+HMM, language modeling, decoding).

Evaluation: Report
Write the analysis and results of the simulation in the format of a paper: 4 pages, two columns.
Sections: abstract; introduction; brief theoretical description of the method; methodology used to perform the experiment; results; conclusions and suggestions for further work; bibliography.

Days of classes
Normal semester. 2001: October 18, 25; November 8, 15, 22, 29 (the 1st is a holiday); December 6, 13, 20. 2002: January 10, 17, 24, 31; February 7, 14. Total: 15 days.
Option one. October 16, 18, 23, 25, 30; November 6, 8, 13, 15, 20, 22, 27, 29; February 5, 7, 12, 14. Total: 17 days.
Option two. October 18, 25; March: a one-week block seminar, 1.5 hours a day. Total: 13 days.
Option three. March: a one-week block seminar, 3 hours a day. Equivalent to 15 days.