

IMPROVING RECOGNITION PERFORMANCE IN NOISY ENVIRONMENTS

Joseph Picone
Inst. for Signal and Info. Processing
Dept. Electrical and Computer Eng.
Mississippi State University

Contact Information:
Box 9571
Mississippi State University
Mississippi State, Mississippi
Tel: Fax:

Three-time workshop survivor (’97-’99)!

CLSP SUMMER PLANNING WORKSHOP

OVERVIEW: AURORA LVCSR EVALUATION

– WSJ 5K (closed task) with seven (digitally-added) noise conditions
– Common ASR system
– Two participants:
    QIO: QualC., ICSI, OGI
    MFA: Moto., FrTel., Alcatel
– Client/server applications
– Evaluate robustness in noisy environments
– Propose a standard for LVCSR applications

Performance Summary

                        Test Set
Site          Clean   Noise (Sennh)   Noise (MultiM)
Base (TS1)    15%     59%             75%
Base (TS2)    19%     33%             50%
QIO (TS2)     17%     26%             41%
MFA (TS2)     15%     26%             40%
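To make the gap between the baseline and the two submitted front ends easier to read, the Performance Summary figures can be turned into relative reductions. The sketch below assumes the percentages are word error rates (the slide does not label them) and takes Base (TS2) as the reference; the names `RESULTS`, `relative_reduction`, and `compare_to_baseline` are illustrative only.

```python
# Figures copied from the Performance Summary table. Treating them as
# word error rates (WER) is an assumption; the slide does not say so.
RESULTS = {
    "Base (TS1)": {"Clean": 15, "Noise (Sennh)": 59, "Noise (MultiM)": 75},
    "Base (TS2)": {"Clean": 19, "Noise (Sennh)": 33, "Noise (MultiM)": 50},
    "QIO (TS2)": {"Clean": 17, "Noise (Sennh)": 26, "Noise (MultiM)": 41},
    "MFA (TS2)": {"Clean": 15, "Noise (Sennh)": 26, "Noise (MultiM)": 40},
}


def relative_reduction(baseline, system):
    """Relative error reduction in percent: 100 * (baseline - system) / baseline."""
    return 100.0 * (baseline - system) / baseline


def compare_to_baseline(results, baseline="Base (TS2)",
                        sites=("QIO (TS2)", "MFA (TS2)")):
    """Return {site: {condition: relative reduction, rounded to 0.1}}."""
    base = results[baseline]
    return {
        site: {
            cond: round(relative_reduction(base[cond], results[site][cond]), 1)
            for cond in base
        }
        for site in sites
    }


if __name__ == "__main__":
    for site, row in compare_to_baseline(RESULTS).items():
        print(site, row)
```

Under these assumptions, both front ends cut the Sennheiser noisy-condition error by roughly a fifth relative to Base (TS2), with similar gains in the multi-microphone condition.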

STATE OF THE ART: ADAPTIVE SIGNAL PROCESSING

– Commercial front ends use adaptive noise compensation.
– Advanced front ends use a variety of techniques including subspace methods, normalization, and multiple time scales.
– The Aurora LVCSR eval did not address acoustic modeling issues and speaker/channel adaptation (by design).
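As a rough illustration of what "adaptive noise compensation" can mean, here is a minimal spectral-subtraction sketch with a recursively updated noise estimate. It is not the method of any particular commercial or Aurora front end. The function name, the parameters (`noise_frames`, `alpha`, `floor`, `speech_ratio`), and the energy-ratio speech test are all assumptions chosen for brevity. Input is a list of magnitude spectra, one list of non-negative floats per frame.

```python
def spectral_subtraction(frames, noise_frames=5, alpha=0.9,
                         floor=0.05, speech_ratio=2.0):
    """Subtract an adaptive noise estimate from each magnitude spectrum.

    frames       -- list of magnitude spectra (equal-length lists of floats)
    noise_frames -- leading frames assumed to contain noise only
    alpha        -- smoothing factor for the recursive noise update
    floor        -- spectral floor, as a fraction of the noisy magnitude
    speech_ratio -- frames with energy below speech_ratio * noise energy
                    are treated as noise and used to adapt the estimate
    """
    if noise_frames < 1 or len(frames) < noise_frames:
        raise ValueError("need at least noise_frames frames")
    n_bins = len(frames[0])

    # Initial noise estimate: mean of the leading (assumed speech-free) frames.
    noise = [sum(f[k] for f in frames[:noise_frames]) / noise_frames
             for k in range(n_bins)]

    cleaned = []
    for frame in frames:
        frame_energy = sum(x * x for x in frame)
        noise_energy = sum(n * n for n in noise)
        if frame_energy < speech_ratio * noise_energy:
            # Frame looks like noise: adapt the estimate toward it.
            noise = [alpha * n + (1.0 - alpha) * x
                     for n, x in zip(noise, frame)]
        # Subtract the estimate, never going below the spectral floor.
        cleaned.append([max(x - n, floor * x) for x, n in zip(frame, noise)])
    return cleaned
```

The spectral floor keeps the subtraction from producing negative or zero magnitudes. The crude energy test stands in for a real speech/non-speech detector.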

PROPOSAL SUMMARY: SIGNAL PROCESSING VS. ACOUSTIC MODELS

Focus on Aurora task (TS2):
– multiple microphones; representative noise conditions
– adaptation/multipass processing within a single utterance
– establish benchmarks prior to workshop (incl. adaptation)

Some possible themes:
– knowledge vs. statistics
– phone-dependent spectral models of speech and noise
– multi-time scale analysis
– subspace methods to separate speech and noise
– iterative refinement

Parallel research tracks:
– noise robust front end processing
– phone/state-specific features and/or noise models
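One simple reading of "multipass processing within a single utterance" is a two-pass feature normalization: the first pass gathers statistics over the whole utterance, and the second pass applies them. The sketch below uses per-dimension mean and variance normalization as a stand-in. The proposal does not specify this technique, and `normalize_utterance` and `eps` are hypothetical names.

```python
import math


def normalize_utterance(features, eps=1e-8):
    """Two-pass mean/variance normalization over one utterance.

    features -- list of feature vectors (equal-length lists of floats)
    eps      -- small constant that avoids division by zero
    """
    if not features:
        return []
    n = len(features)
    dim = len(features[0])

    # Pass 1: per-dimension mean and standard deviation over the utterance.
    means = [sum(f[d] for f in features) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / n)
        for d in range(dim)
    ]

    # Pass 2: apply the utterance-level statistics to every frame.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in features
    ]
```

Because the statistics come from the utterance itself, no separate adaptation data is needed, at the cost of waiting for the utterance to end before the second pass can run.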

REFERENCES: AURORA AND ICSLP’2002

J. Picone, “Improving Speech Recognition Performance in Noisy Environments,” Mississippi State University, November 8, 2002.

N. Parihar and J. Picone, “DSR Front End LVCSR Evaluation – Baseline Recognition System Description,” Aurora Working Group, European Telecommunications Standards Institute, November 1, 2001.

D. Macho, et al., “Evaluation of a Noise-Robust DSR Front End on Aurora Databases,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

A. Adami, et al., “Qualcomm-ICSI-OGI Features For ASR,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

C.P. Chen, et al., “Front End Post-Processing and Back End Model Enhancement on the Aurora 2.0/3.0 Databases,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

P. Motlíček and L. Burget, “Noise Estimation For Efficient Speech Enhancement and Robust Speech Recognition,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

J. Chen, et al., “Recognition of Noisy Speech Using Normalized Moments,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

J. Wu and Q. Huo, “An Environment Compensated Minimum Classification Error Training Approach and Its Evaluation in Aurora 2 Database,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

G. Saon and J.M. Huerta, “Improvements to the IBM Aurora 2 Multi-Condition System,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September