IMPROVING RECOGNITION PERFORMANCE IN NOISY ENVIRONMENTS

Joseph Picone
Inst. for Signal and Info. Processing
Dept. Electrical and Computer Eng.
Mississippi State University

Contact Information:
Box 9571
Mississippi State University
Mississippi State, Mississippi
Tel: Fax:

Three-time workshop survivor (’97-’99)!

CLSP SUMMER PLANNING WORKSHOP

OVERVIEW
AURORA LVCSR EVALUATION

WSJ 5K (closed task) with seven (digitally-added) noise conditions
Common ASR system
Two participants:
– QIO: QualC., ICSI, OGI
– MFA: Moto., FrTel., Alcatel
Client/server applications
Evaluate robustness in noisy environments
Propose a standard for LVCSR applications

Performance Summary

Site (Test Set)   Clean   Noise (Sennh)   Noise (MultiM)
Base (TS1)        15%     59%             75%
Base (TS2)        19%     33%             50%
QIO (TS2)         17%     26%             41%
MFA (TS2)         15%     26%             40%
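
To make the comparison in the Performance Summary easier to read, the sketch below computes the relative reduction of each advanced front end against the TS2 baseline. It assumes the percentages are error rates (the slide does not label the metric), so lower is better. The names `RESULTS`, `relative_reduction`, and `reductions_vs_baseline` are illustrative and not part of the evaluation.

```python
# Figures copied from the Performance Summary table (percent).
# Assumption: these are error rates, so a lower value is better.
RESULTS = {
    "Base (TS2)": {"Clean": 19, "Noise (Sennh)": 33, "Noise (MultiM)": 50},
    "QIO (TS2)": {"Clean": 17, "Noise (Sennh)": 26, "Noise (MultiM)": 41},
    "MFA (TS2)": {"Clean": 15, "Noise (Sennh)": 26, "Noise (MultiM)": 40},
}


def relative_reduction(baseline, system):
    """Relative reduction of `system` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


def reductions_vs_baseline(results, baseline_key="Base (TS2)"):
    """Relative reduction per site and condition, rounded to one decimal."""
    base = results[baseline_key]
    return {
        site: {
            cond: round(relative_reduction(base[cond], value), 1)
            for cond, value in row.items()
        }
        for site, row in results.items()
        if site != baseline_key
    }


if __name__ == "__main__":
    for site, row in reductions_vs_baseline(RESULTS).items():
        print(site, row)
```

Under that assumption, both advanced front ends reduce the multi-microphone noise figure by roughly a fifth relative to the TS2 baseline (18% for QIO, 20% for MFA).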

STATE OF THE ART
ADAPTIVE SIGNAL PROCESSING

Commercial front ends use adaptive noise compensation.
Advanced front ends use a variety of techniques including subspace methods, normalization, and multiple time scales.
The Aurora LVCSR eval did not address acoustic modeling issues and speaker/channel adaptation (by design).
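
The slide names "normalization" without saying which kind. One widely used form is per-utterance mean and variance normalization of the feature vectors, sketched below as an illustration only; it is not claimed to be what the QIO or MFA front ends do. The function name `cmvn` and the `eps` parameter are assumptions made for this example.

```python
import math


def cmvn(frames, eps=1e-8):
    """Per-utterance mean and variance normalization (illustrative sketch).

    frames: list of equal-length lists, one feature vector per frame.
    Returns frames with each dimension shifted to zero mean and scaled
    to (approximately) unit variance over the utterance.
    """
    n = len(frames)
    dim = len(frames[0])
    # Mean of each feature dimension over the whole utterance.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Population standard deviation of each dimension.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # eps avoids division by zero when a dimension is constant.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


if __name__ == "__main__":
    utterance = [[1.0, 10.0], [3.0, 10.0]]
    print(cmvn(utterance))
```

Because the mean is subtracted per utterance, a constant offset added to every frame (for example, a fixed channel effect in the cepstral domain) leaves the normalized output unchanged.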

PROPOSAL SUMMARY

Focus on Aurora task (TS2):
– multiple microphones; representative noise conditions
– adaptation/multipass processing within a single utterance
– establish benchmarks prior to workshop (incl. adaptation)

SIGNAL PROCESSING VS. ACOUSTIC MODELS

Some possible themes:
– knowledge vs. statistics
– phone-dependent spectral models of speech and noise
– multi-time scale analysis
– subspace methods to separate speech and noise
– iterative refinement

Parallel research tracks:
– noise robust front end processing
– phone/state-specific features and/or noise models
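
The proposal does not specify an algorithm for multipass processing within a single utterance. As a hypothetical illustration of the idea, the sketch below makes two passes over one utterance's magnitude spectra: the first estimates a noise spectrum from the lowest-energy frames, the second applies simple spectral subtraction with a floor. Spectral subtraction is a stand-in chosen for brevity; the function names and the `fraction` and `floor` parameters are assumptions.

```python
def estimate_noise(frames, fraction=0.2):
    """Pass 1: average the lowest-energy frames as a noise estimate.

    frames: list of magnitude spectra (equal-length lists of floats).
    fraction: share of frames assumed to be noise only (an assumption).
    """
    ranked = sorted(frames, key=lambda f: sum(x * x for x in f))
    k = max(1, int(len(ranked) * fraction))
    quiet = ranked[:k]
    dim = len(frames[0])
    return [sum(f[d] for f in quiet) / k for d in range(dim)]


def subtract_noise(frames, noise, floor=0.05):
    """Pass 2: subtract the estimate, keeping at least floor * original."""
    return [
        [max(x - n, floor * x) for x, n in zip(f, noise)]
        for f in frames
    ]


def two_pass(frames, fraction=0.2, floor=0.05):
    """Run both passes on a single utterance; returns (cleaned, noise)."""
    noise = estimate_noise(frames, fraction)
    return subtract_noise(frames, noise, floor), noise


if __name__ == "__main__":
    utterance = [[1.0, 1.0], [1.0, 1.0], [5.0, 3.0], [6.0, 4.0], [1.0, 1.0]]
    cleaned, noise = two_pass(utterance)
    print(noise)
    print(cleaned)
```

The same structure extends to iterative refinement: the cleaned frames from one pass could be used to re-rank frames and re-estimate the noise before a further pass.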

REFERENCES
AURORA AND ICSLP’2002

J. Picone, “Improving Speech Recognition Performance in Noisy Environments,” Mississippi State University, November 8, 2002 (

N. Parihar and J. Picone, “DSR Front End LVCSR Evaluation – Baseline Recognition System Description,” Aurora Working Group, European Telecommunications Standards Institute, November 1, 2001 (

D. Macho, et al., “Evaluation of a Noise-Robust DSR Front End on Aurora Databases,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

A. Adami, et al., “Qualcomm-ICSI-OGI Features For ASR,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

C.P. Chen, et al., “Front End Post-Processing and Back End Model Enhancement on the Aurora 2.0/3.0 Databases,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

P. Motlíček and L. Burget, “Noise Estimation For Efficient Speech Enhancement and Robust Speech Recognition,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

J. Chen, et al., “Recognition of Noisy Speech Using Normalized Moments,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

J. Wu and Q. Huo, “An Environment Compensated Minimum Classification Error Training Approach and Its Evaluation in Aurora 2 Database,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September

G. Saon and J.M. Huerta, “Improvements to the IBM Aurora 2 Multi-Condition System,” International Conference on Spoken Language Processing, Denver, Colorado, USA, pp , September