HIWIRE MEETING Trento, January 11-12, 2007 José C. Segura, Javier Ramírez.

Presentation transcript:

2 HIWIRE Meeting – Trento, January, 2007 Schedule
- PEQ
- HAFE
- IS07 setup
- New improvements in robust VAD
  - Revised multiple observation LRT (MO-LRT)
  - Improve noise reduction and frame-dropping

3 PEQ
- Evaluation
  - AURORA2, AURORA3, AURORA4
  - Compared to HEQ
  - PEQ shows better performance on all databases
- Results using Loquendo recognizer
  - Improved results
  - Slight degradation on clean conditions

4 PEQ / HEQ comparative results

5 HAFE
- In collaboration with TUC-NTUA
- Released two C modules, integrated in HAFE V1.0
- Basic Analysis
  - VAD (LTSD)
  - Wiener filter (optional)
  - Output: WAV / MFCC / FB
- Post-Processing
  - PEQ (optional)
  - Regression computation (optional)
  - Frame-Dropping (optional)
  - CMS / CMVN (optional)
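As an illustration of the optional CMS / CMVN post-processing step listed above, here is a minimal Python sketch. It is not the HAFE C module: the function name `cmvn`, its parameters, and the per-utterance normalization are assumptions made for this example only.

```python
import math

def cmvn(features, normalize_variance=True, eps=1e-10):
    """Per-utterance cepstral mean (and optionally variance) normalization.

    features: list of frames, each frame a list of coefficients.
    With normalize_variance=False this reduces to plain CMS.
    Hypothetical helper, not the HAFE implementation.
    """
    n_frames = len(features)
    dim = len(features[0])
    # Mean of each coefficient over the whole utterance.
    means = [sum(f[d] for f in features) / n_frames for d in range(dim)]
    if not normalize_variance:
        return [[f[d] - means[d] for d in range(dim)] for f in features]
    # Population standard deviation of each coefficient.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / n_frames)
        for d in range(dim)
    ]
    # eps guards against division by zero for constant coefficients.
    return [
        [(f[d] - means[d]) / max(stds[d], eps) for d in range(dim)]
        for f in features
    ]
```

Calling `cmvn(frames, normalize_variance=False)` gives mean subtraction only, which mirrors the CMS / CMVN choice on the slide.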

6 IS07 setup
- Prepared an HTK setup for evaluation on the HIWIRE database
  - Training scripts based on the LORIA ones
  - Test scripts include MLLR adaptation with a variable number of utterances
- Baseline results
  - Only for clean data
  - With and without adaptation

7 IS07 setup (without adaptation)

8 IS07 (with adaptation)

9 A review of MO-LRT VAD
- Multiple observation likelihood ratio test:
  - Given 2N+1 independent observations of the noisy speech
- Hypothesis test:
  - G0: All the observations in the buffer are non-speech
  - G1: All the observations in the buffer are noisy speech
- Gaussian model (equations not preserved in this transcript)
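The slide's equations are missing from the transcript, so the following Python sketch assumes the usual complex-Gaussian model for DFT coefficients, as used in the Sohn et al. VAD. Under that model the per-frame log-likelihood ratio is the sum over frequency bins of gamma*xi/(1+xi) - ln(1+xi), where xi is the a priori SNR and gamma the a posteriori SNR. The multiple observation test then sums the per-frame values over the 2N+1 frames in the buffer. All function and parameter names are hypothetical, and the noise and speech variances are taken as given rather than estimated.

```python
import math

def frame_log_lr(power, noise_var, speech_var):
    """Log-likelihood ratio of one frame under the assumed Gaussian model.

    power:      |X_j|^2 for each frequency bin j
    noise_var:  noise variance per bin
    speech_var: speech variance per bin
    """
    total = 0.0
    for p, lam_n, lam_s in zip(power, noise_var, speech_var):
        xi = lam_s / lam_n      # a priori SNR
        gamma = p / lam_n       # a posteriori SNR
        total += gamma * xi / (1.0 + xi) - math.log1p(xi)
    return total

def mo_lrt(frame_llrs, center, n):
    """Sum per-frame log-LRs over the window [center - n, center + n].

    The window is truncated at the start and end of the utterance
    (an assumption of this sketch).
    """
    lo = max(0, center - n)
    hi = min(len(frame_llrs), center + n + 1)
    return sum(frame_llrs[lo:hi])

def is_speech(frame_llrs, center, n, eta):
    """Decide noisy speech (G1) when the MO-LRT statistic exceeds eta."""
    return mo_lrt(frame_llrs, center, n) > eta
```

Because the observations are assumed independent, the joint log-likelihood ratio factorizes into this sum, which is why a longer buffer smooths the decision at the cost of delay.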

10 Hangover analysis

11 Hangover analysis

12 Revised MO-LRT
- Given 2N+1 independent observations of the noisy speech
- All the possible hypotheses on the individual observations:
  - h_k = 0: x_k = n
  - h_k = 1: x_k = s + n
- Hypothesis subsets

13 Revised MO-LRT
- We assume that just a single speech-to-non-speech or non-speech-to-speech transition can occur in h
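The single-transition assumption can be made concrete with a small enumeration. The Python sketch below lists every hypothesis vector h of length 2N+1 that contains at most one transition and splits the vectors by the label of the central frame. Including the two transition-free vectors, and splitting on the central frame, are assumptions of this sketch. The transcript does not preserve how the slide defines the hypothesis subsets or the decision rule.

```python
def single_transition_hypotheses(n):
    """All label vectors of length 2n+1 with at most one 0/1 transition.

    0 = non-speech (x_k = n), 1 = noisy speech (x_k = s + n).
    """
    m = 2 * n + 1
    hyps = [tuple([0] * m), tuple([1] * m)]   # no transition at all
    for j in range(1, m):
        hyps.append(tuple([0] * j + [1] * (m - j)))   # non-speech to speech
        hyps.append(tuple([1] * j + [0] * (m - j)))   # speech to non-speech
    return hyps

def partition_by_center(hyps):
    """Split hypotheses by the label assigned to the central frame."""
    center = len(hyps[0]) // 2
    g0 = [h for h in hyps if h[center] == 0]
    g1 = [h for h in hyps if h[center] == 1]
    return g0, g1
```

The assumption reduces the hypothesis space from 2^(2N+1) vectors to 2(2N+1), so the search stays linear in the buffer length.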

14 Compared to Sohn et al. VAD.

16 ROC curves in quiet noise conditions (stopped car and engine running) with a close-talking microphone.

17 ROC curves in high noise conditions (high speed over a good road) with a distant-talking microphone.
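For readers unfamiliar with how such curves are produced, the Python sketch below sweeps a decision threshold over per-frame VAD scores and records one operating point per threshold. The definitions used here (speech hit rate against false alarm rate, computed per frame from reference labels) are assumptions of this sketch. The transcript does not show the axes of the original figures.

```python
def roc_points(scores, labels, thresholds):
    """Return (false_alarm_rate, hit_rate) for each threshold.

    scores: per-frame VAD statistic (higher means more speech-like)
    labels: reference labels, 1 = speech, 0 = non-speech
    Assumes both classes are present in labels.
    """
    n_speech = sum(labels)
    n_nonspeech = len(labels) - n_speech
    points = []
    for eta in thresholds:
        decisions = [s > eta for s in scores]
        # Speech frames correctly detected as speech.
        hits = sum(1 for d, y in zip(decisions, labels) if d and y == 1)
        # Non-speech frames wrongly detected as speech.
        false_alarms = sum(1 for d, y in zip(decisions, labels) if d and y == 0)
        points.append((false_alarms / n_nonspeech, hits / n_speech))
    return points
```

Plotting the returned pairs for a dense set of thresholds traces the curve. A detector whose curve lies closer to the top-left corner is better at every operating point.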

18
- Presented at ICASSP 2007:
  - Javier Ramírez, José C. Segura, Juan M. Górriz, “Revised contextual LRT for voice activity detection”, ICASSP
- Under review:
  - Javier Ramírez, José C. Segura, Juan M. Górriz and Luz García, “Improved Voice Activity Detection Using Contextual Multiple Hypothesis Testing for Robust Speech Recognition”, IEEE Transactions on Audio, Speech and Language Processing.