Survey of ICASSP 2013 session: Features for Robust Automatic Speech Recognition. Reporter: Yi-Ting Wang, 2013/06/19.

Outline

Robust automatic speech recognition
Over the years, much effort has been devoted to developing techniques for noise-robust Automatic Speech Recognition (ASR). The goal is to make the ASR system more resistant to the mismatch between training and testing conditions. Noise-reduction techniques include:
– Speech enhancement at the signal level
– Robust feature extraction
– Adapting the back-end models
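As a concrete illustration of the first category (speech enhancement at the signal level), the sketch below implements plain magnitude spectral subtraction. It is not taken from any of the surveyed papers; the frame length, hop size, number of leading noise frames, and spectral floor are illustrative assumptions.

```python
# Minimal magnitude spectral-subtraction sketch (illustrative only).
# Assumes the leading frames of the utterance contain noise only.
import numpy as np

def spectral_subtraction(noisy, frame_len=400, hop=160, noise_frames=10, floor=0.002):
    """Enhance a 1-D waveform by subtracting an average noise magnitude spectrum
    estimated from the leading (assumed speech-free) frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)                    # complex STFT
    mag, phase = np.abs(spec), np.angle(spec)
    noise_mag = mag[:noise_frames].mean(axis=0)           # noise estimate
    clean_mag = np.maximum(mag - noise_mag, floor * mag)  # subtract, with a spectral floor
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    enhanced = np.zeros(len(noisy))
    for i in range(n_frames):                             # overlap-add resynthesis
        enhanced[i * hop:i * hop + frame_len] += clean_frames[i]
    return enhanced
```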

Robust feature extraction
– Feature Transformation
– Feature Compensation
– Feature Reconstruction
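As an example of the feature transformation category, the following is a minimal sketch of utterance-level cepstral mean and variance normalization (CMVN), a standard baseline rather than any specific method from the surveyed session.

```python
# Cepstral mean and variance normalization (CMVN) -- a standard feature
# transformation, shown only to illustrate the category above.
import numpy as np

def cmvn(features, eps=1e-8):
    """Normalize each feature dimension to zero mean and unit variance over
    the utterance.  `features` has shape (n_frames, n_dims)."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)
```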

A robust frontend for ASR: combining denoising, noise masking, and feature normalization

Filtering on the temporal probability sequence in histogram equalization for robust speech recognition
[Figure: comparison of HEQ and FHEQ]
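For context, a minimal sketch of plain utterance-level histogram equalization (HEQ), which Gaussianizes each feature dimension via rank-based CDF matching. This is only the generic HEQ baseline; the temporal filtering on the probability sequence proposed in the paper is not reproduced here.

```python
# Utterance-level histogram equalization (HEQ) sketch: map the empirical CDF
# of each feature dimension onto a standard Gaussian.
import numpy as np
from scipy.stats import norm

def histogram_equalization(features):
    """`features` has shape (n_frames, n_dims); returns Gaussianized features."""
    n_frames, n_dims = features.shape
    equalized = np.empty_like(features, dtype=float)
    for d in range(n_dims):
        ranks = np.argsort(np.argsort(features[:, d]))  # ranks 0 .. n_frames-1
        cdf = (ranks + 0.5) / n_frames                   # empirical CDF in (0, 1)
        equalized[:, d] = norm.ppf(cdf)                  # inverse Gaussian CDF
    return equalized
```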

Ideal ratio mask estimation using deep neural networks for robust speech recognition
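The ideal ratio mask (IRM) target that such a DNN is trained to estimate is conventionally defined from parallel speech and noise power spectrograms. A minimal sketch of the mask definition and its application, assuming the standard formulation with exponent 0.5:

```python
# Ideal ratio mask (IRM) computed from parallel speech and noise power
# spectrograms.  In the paper a DNN learns to estimate this target from
# noisy features; only the mask definition itself is sketched here.
import numpy as np

def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """IRM(t, f) = (S^2 / (S^2 + N^2)) ** beta, with beta = 0.5 by convention."""
    return (speech_power / (speech_power + noise_power + 1e-12)) ** beta

def apply_mask(noisy_spec, mask):
    """Apply a (T, F) mask to a complex (T, F) STFT to enhance the speech."""
    return noisy_spec * mask
```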

Solving the problem in reverberant environments
It is well known that the distortions caused by reverberation, background noise, etc. are highly nonlinear in the cepstral domain. Dereverberation via suppression and enhancement can be applied for reverberation compensation; the drawback is that the result becomes undesirable if the late reverberation is not estimated precisely.
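One common way to realize dereverberation via suppression is spectral subtraction of an estimated late-reverberation power spectrum, e.g. modeling late reverberation as a delayed, exponentially decayed copy of the reverberant power spectrum under an assumed T60. The sketch below follows that generic model (the T60, delay, and floor values are illustrative assumptions, not parameters from the paper); if T60 or the decay model is wrong, the late reverberation is mis-estimated and the output is over- or under-suppressed, which is exactly the drawback noted above.

```python
# Late-reverberation suppression by spectral subtraction (illustrative sketch).
# Late reverberation power is modeled as a delayed, exponentially decayed copy
# of the reverberant power spectrum; T60 is assumed known.
import numpy as np

def suppress_late_reverb(power_spec, t60=0.5, frame_shift=0.010, delay_s=0.05, floor=0.01):
    """`power_spec` has shape (n_frames, n_bins); returns the enhanced power spectrum."""
    delay = max(1, int(round(delay_s / frame_shift)))         # delay in frames (~50 ms)
    decay = np.exp(-6.0 * np.log(10.0) * delay_s / t60)       # energy decay over that delay
    late = np.zeros_like(power_spec)
    late[delay:] = decay * power_spec[:-delay]                # estimated late-reverb power
    return np.maximum(power_spec - late, floor * power_spec)  # subtract with a floor
```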

Noise model transfer using affine transformation with application to large vocabulary reverberant speech recognition

Joint sparse representation based cepstral-domain dereverberation for distant-talking speech recognition
Most existing sparse representation methods only consider sparse modeling in a single signal space, and few consider dictionary learning across different signal spaces.
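For reference, the sketch below shows the generic single-space setting the paper contrasts with: learning one dictionary on cepstral frames and sparse-coding each frame over it. The dictionary size and sparsity level are illustrative assumptions, and scikit-learn is used only for convenience.

```python
# Generic single-space sparse coding over a learned dictionary -- the baseline
# setting contrasted with the paper's joint cross-space dictionary learning.
# n_atoms and n_nonzero are illustrative; n_nonzero must not exceed n_dims.
import numpy as np
from sklearn.decomposition import DictionaryLearning, sparse_encode

def learn_and_encode(cepstra, n_atoms=128, n_nonzero=8):
    """`cepstra` has shape (n_frames, n_dims); learn an overcomplete dictionary
    on the frames and return the sparse code of every frame."""
    dico = DictionaryLearning(n_components=n_atoms,
                              transform_algorithm="omp",
                              transform_n_nonzero_coefs=n_nonzero)
    dico.fit(cepstra)
    codes = sparse_encode(cepstra, dico.components_,
                          algorithm="omp", n_nonzero_coefs=n_nonzero)
    return codes, dico.components_        # codes: (n_frames, n_atoms)
```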

Feature enhancement: solving the mismatch problem