Speech Recognition in Noise

Presentation transcript:

Speech Recognition in Noise Esfandiar Zavarehei Department of Electronic and Computer Engineering Brunel University 25 May, 2004

Contents
- The use of formant features in speech recognition
  - Variable-Order LP Formant Tracker with Kalman Filtering
  - Results
- Kalman De-noising
  - Tracking and Filtering the Frequency Trajectories (RASTA)
  - How the Kalman Filter is applied to the de-noising problem
  - Advantages of Kalman

Variable-Order LP Formant Tracker
- Higher order of LP modelling for higher resolution
- Continuity criteria for better classification
- Kalman filtering for smoother tracks
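
The slides do not give the tracker's implementation, so the following is only a minimal sketch of the underlying LP step: it fits an all-pole model with the autocorrelation method and reads candidate formant frequencies and bandwidths from the pole angles and radii. The function names, the Hamming window, and the fixed `order` argument are assumptions made for illustration. The variable-order selection, the continuity criteria, and the Kalman smoothing named on the slide are not shown.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Fit an all-pole LP model with the autocorrelation method (Levinson-Durbin)."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    # Autocorrelation at lags 0..order.
    n = len(windowed)
    r = np.correlate(windowed, windowed, mode="full")[n - 1:n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def formant_candidates(a, fs):
    """Convert LP poles to (frequency, bandwidth) pairs in Hz, sorted by frequency."""
    roots = np.roots(a)
    # Keep one pole of each complex-conjugate pair.
    roots = roots[np.imag(roots) > 1e-9]
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]
```

Raising `order` gives the model more poles and therefore finer spectral resolution, at the cost of spurious candidates that a tracker would then have to reject.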

Formant Feature (FF) Vectors
- In addition to the frequencies of the poles, their bandwidths and magnitudes are used as well
- The HMMs are trained on mono-phones

FF vs. MFCC with and without energy component
- Mono-phone recognition in train noise
- FF performs better than MFCC in severely noisy conditions

Robustness of dynamic FF to noise
- Mono-phone recognition in train noise
- Dynamic features are much more robust to noise
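
The slides do not say how the dynamic features are computed. As an illustration only, the sketch below assumes the regression-based delta features widely used in speech front-ends. The function name `delta_features` and the `half_window` parameter are hypothetical.

```python
import numpy as np

def delta_features(features, half_window=2):
    """Regression-based deltas for a (num_frames, num_features) array."""
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    # Repeat the first and last frames so every frame has a full window.
    padded = np.pad(features, ((half_window, half_window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, half_window + 1))
    deltas = np.zeros_like(features)
    for n in range(1, half_window + 1):
        ahead = padded[half_window + n:half_window + n + num_frames]
        behind = padded[half_window - n:half_window - n + num_frames]
        deltas += n * (ahead - behind)
    return deltas / denom
```

Because each delta is a weighted difference of neighbouring frames, any constant offset added to a feature trajectory cancels out. This is one plausible reason for the robustness reported on the slide, though the slide itself does not give one.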

The use of formants for consonant recognition
- Mono-phone recognition in train noise
- Higher recognition rates than vowels at higher SNR
- More sensitive to noise because of the lower energy level

De-noising the speech by filtering frequency trajectories

RelAtive SpecTrA (RASTA) Processing
- Filtering the frequency trajectories of the cubic root of the power spectrum using a fixed IIR filter
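
A rough sketch of this step follows. The slide does not give the filter coefficients. The numerator `0.1 * [2, 1, 0, -1, -2]` and the pole at 0.98 are the values commonly quoted for the RASTA band-pass filter and are an assumption here, as are the function names. The filter is applied causally, frame by frame, to each band's trajectory. The output stays in the compressed (cubic-root) domain, since the slide does not say whether the compression is inverted afterwards.

```python
import numpy as np

def rasta_filter(trajectories, pole=0.98):
    """Band-pass filter each column of a (num_frames, num_bands) array over time."""
    # Assumed coefficients: commonly quoted RASTA filter, not taken from the slides.
    numer = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
    x = np.asarray(trajectories, dtype=float)
    y = np.zeros_like(x)
    for t in range(x.shape[0]):
        acc = np.zeros(x.shape[1])
        for k, b in enumerate(numer):
            if t - k >= 0:
                acc += b * x[t - k]
        if t >= 1:
            acc += pole * y[t - 1]  # single-pole recursive (IIR) part
        y[t] = acc
    return y

def rasta_cubic_root(power_spectrogram, pole=0.98):
    """Compress the power spectrum with a cubic root, then filter each band."""
    compressed = np.cbrt(np.asarray(power_spectrogram, dtype=float))
    return rasta_filter(compressed, pole)
```

The numerator coefficients sum to zero, so a constant component in a band's trajectory decays away after the initial transient.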

The use of FIR filters in RASTA
- Filtering the frequency trajectories of the power spectrum using a bank of non-causal FIR filters
- The filters are experimentally derived and not adaptive
[Figure: the filters' impulse responses]

Kalman Filtering
- The Kalman filter adaptively updates itself with the noise covariance

How the Kalman filter is applied to the de-noising problem
[Block diagram; labels: Noise Modelling and Updating, Neighbour Trajectory Segment, Frequency Bin Trajectory, VAD, Noise Modelling, Prior Noise Model and Trajectory Statistics, Spectral Subtraction, Observation, Predictor, Predicted Error Covariance, Noise Covariance, Mean, Kalman Gain, Estimator, Output, Kalman Filtering]
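
The predictor, Kalman gain, and estimator blocks can be sketched for a single frequency-bin trajectory as below. This is a minimal scalar sketch, not the author's system. The first-order state model (`transition`, `process_var`) is an assumption. The per-frame `noise_var` stands in for whatever the noise-modelling stage supplies. The neighbouring-trajectory and spectral-subtraction inputs from the diagram are left out.

```python
import numpy as np

def kalman_denoise_trajectory(observed, noise_var, process_var, transition=1.0):
    """Scalar Kalman filter over one frequency-bin trajectory.

    observed:    noisy trajectory, shape (num_frames,)
    noise_var:   observation-noise variance, scalar or per frame
    process_var: variance of the assumed first-order state model
    """
    observed = np.asarray(observed, dtype=float)
    noise_var = np.broadcast_to(np.asarray(noise_var, dtype=float), observed.shape)
    estimate = observed[0]
    error_var = noise_var[0]
    output = np.empty_like(observed)
    output[0] = estimate
    for t in range(1, len(observed)):
        # Predictor: propagate the previous estimate and its error variance.
        predicted = transition * estimate
        predicted_var = transition ** 2 * error_var + process_var
        # Kalman gain: weigh the prediction against the current noise level.
        gain = predicted_var / (predicted_var + noise_var[t])
        # Estimator: combine the prediction and the observation.
        estimate = predicted + gain * (observed[t] - predicted)
        error_var = (1.0 - gain) * predicted_var
        output[t] = estimate
    return output
```

Passing a per-frame `noise_var` lets the gain drop when the noise estimate rises, which is the adaptive behaviour that a fixed filter lacks.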

Advantages of Kalman
- A more informed noise reduction
- Combines the prediction and the observation of the frequency trajectory
- Adaptively updates the noise model while filtering the trajectory (in comparison with RASTA)
- Could (and probably should) be combined with spectral subtraction for improved performance
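
For reference, a basic power spectral subtraction step might look like the sketch below. The slides do not specify the variant used, so the over-subtraction factor, the spectral floor, and both function names are assumptions. The noise estimate is the average of the frames that a voice activity detector marks as non-speech, and at least one such frame is assumed to exist.

```python
import numpy as np

def estimate_noise_power(power_spectrogram, is_speech):
    """Average the frames flagged as non-speech (assumes at least one exists)."""
    power_spectrogram = np.asarray(power_spectrogram, dtype=float)
    noise_frames = power_spectrogram[~np.asarray(is_speech, dtype=bool)]
    return noise_frames.mean(axis=0)

def spectral_subtraction(noisy_power, noise_power, over_subtraction=1.0, floor=0.01):
    """Subtract the noise power per bin, keeping a small floor of the noisy power."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    cleaned = noisy_power - over_subtraction * np.asarray(noise_power, dtype=float)
    # The floor prevents negative power values after subtraction.
    return np.maximum(cleaned, floor * noisy_power)
```

In a combined system, the output of such a step could serve as the observation of each frequency trajectory before Kalman filtering. That ordering is a guess, since the slide only suggests that the two methods be combined.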