HIWIRE MEETING Athens, November 3-4, 2005 José C. Segura, Ángel de la Torre.


1 HIWIRE MEETING Athens, November 3-4, 2005 José C. Segura, Ángel de la Torre

2 HIWIRE Meeting – Athens, 3 - 4 November, 2005: Schedule
- HIWIRE database evaluations
- Non-linear feature normalization
  - ECDF segmental implementation
  - Parametric equalization
- Robust VAD
  - Bispectrum-based VAD
- Model-based feature compensation
  - VTS results on AURORA4
  - Including uncertainty caused by noise

3 HIWIRE database evaluations
- PARAMETERS: MFCC_0_D_A_Z (39 components)
- MODELS:
  - TIMIT: 46 phone models / 3 states / 128 Gaussians (17,664 Gaussians)
  - WSJ16k: 16,825 triphones / 3,608 tied states / 6 Gaussians (21,648 Gaussians)
  - WSJ16kFon: 40 phone models / 3 states / 128 Gaussians (15,360 Gaussians)
- ADAPTATION:
  - MLLR: 32 regression classes / 50 adaptation utterances
- GRAMMAR:
  - LORIA & Word-Loop
- MODIFICATIONS: Some transcriptions have been modified to match the grammar definition

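4 Transcription modifications
    # awk script: normalize each transcription line to match the grammar.
    BEGIN {
        lista = LISTA;    # LISTA is not defined in the script (presumably set by the caller)
        nfrase = 0;       # sentence counter
    }
    {
        linea = $0;
        gsub("-", "_", linea);                          # hyphens become underscores
        gsub("Due_to_", "Due_to ", linea);
        gsub("Mayday_Mayday", "Mayday Mayday", linea);  # repeated words are split again
        gsub("Pan_Pan", "Pan Pan", linea);
        gsub("three hundred twenty", "three_hundred_twenty", linea);
        gsub("one hundred sixty", "one_hundred_sixty", linea);
        printf("%s\n", tolower(linea));                 # output in lower case
        nfrase = nfrase + 1;
    }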

5 HIWIRE database results
[Tables not recovered: results without adaptation (WER); results with MLLR (WER)]

6 Schedule
- HIWIRE database evaluations
- Non-linear feature normalization
  - ECDF segmental implementation
  - Parametric equalization
- Robust VAD
  - Bispectrum-based VAD
- Model-based feature compensation
  - VTS results on AURORA4
  - Including uncertainty caused by noise

7 ECDF segmental implementation
- ECDF segmental implementation
  - Provided LOQUENDO with a reference “C” implementation of the segmental Gaussian transformation, to be tested within the LOQUENDO recognizer
- Current work
  - Nonlinear feature transformation with a clean reference, to avoid the problem of system retraining

8 Parametric Equalization (1)
- HEQ limitations
  - Influence of the relative amount of silence in utterances
- With a parametric model, a more robust equalization can be obtained
- “Parametric nonlinear feature equalization for robust speech recognition” (submitted to ICASSP’06)

9 Parametric Equalization (2)
- Class-dependent linear equalization
- Soft-decision VAD (two-class Gaussian classifier on C0)
- Nonlinear interpolation
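The three blocks named on this slide can be read as follows; this is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes one Gaussian per class (silence, speech) for a single coefficient, a linear mean/variance mapping from the noisy class statistics to the clean reference, and interpolation of the two mappings by the soft VAD posterior. All function names and parameter values are hypothetical.

```python
import math

def gaussian_pdf(value, mean, std):
    """Density of a 1-D Gaussian."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def speech_posterior(c0, silence, speech, prior_speech=0.5):
    """Soft VAD decision: P(speech | C0) from a two-class Gaussian classifier.

    `silence` and `speech` are (mean, std) pairs for C0 (assumed known here).
    """
    p_speech = prior_speech * gaussian_pdf(c0, *speech)
    p_silence = (1.0 - prior_speech) * gaussian_pdf(c0, *silence)
    return p_speech / (p_speech + p_silence)

def linear_eq(y, noisy, clean):
    """Class-dependent linear equalization: match mean and std of one class."""
    noisy_mean, noisy_std = noisy
    clean_mean, clean_std = clean
    return clean_mean + (y - noisy_mean) * clean_std / noisy_std

def peq(y, p_speech, noisy_classes, clean_classes):
    """Interpolate the two linear mappings with the soft VAD posterior.

    The result is nonlinear in y because p_speech itself depends on the frame.
    """
    sil = linear_eq(y, noisy_classes["silence"], clean_classes["silence"])
    sp = linear_eq(y, noisy_classes["speech"], clean_classes["speech"])
    return (1.0 - p_speech) * sil + p_speech * sp

# Hypothetical class statistics for C0, as (mean, std).
noisy = {"silence": (4.0, 1.0), "speech": (8.0, 2.0)}
clean = {"silence": (0.0, 1.5), "speech": (10.0, 3.0)}
p = speech_posterior(7.0, noisy["silence"], noisy["speech"])
print(round(p, 3), round(peq(7.0, p, noisy, clean), 3))
```

For coefficients other than C0, the same posterior (computed on C0) would weight mappings built from that coefficient's own class statistics.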

10 Parametric Equalization (3)

11 Parametric Equalization (4)
- In comparison with HEQ, PEQ transformations are smoother
- For C0, a monotonic transformation is obtained
- For other coefficients, the interpolated transformation is not monotonic

12 Parametric Equalization (5)
- BASE
  - MFCC_0_D_A_Z (39 components)
- HEQ
  - Quantile-based CDF transformation
  - Clean reference
  - Implemented over MFCC_0 / CMS, with regressions computed after HEQ
- AFE
  - Standard implementation
- PEQ
  - Clean reference
  - Implemented over MFCC_0 / CMS, with regressions computed after PEQ
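As an illustration of the HEQ baseline listed above, a quantile-based CDF transformation toward a clean reference could look like the sketch below. It is an assumption-laden example, not the evaluated system: the number of quantiles, the linear interpolation between them, and the function names are all hypothetical choices.

```python
def quantiles(values, n_q):
    """Return n_q evenly spaced quantiles (min to max) with linear interpolation."""
    s = sorted(values)
    out = []
    for i in range(n_q):
        pos = i * (len(s) - 1) / (n_q - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        frac = pos - lo
        out.append(s[lo] * (1.0 - frac) + s[hi] * frac)
    return out

def interp(v, xs, ys):
    """Piecewise-linear map from the xs quantiles to the ys quantiles."""
    if v <= xs[0]:
        return ys[0]
    if v >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if v <= xs[i]:
            span = xs[i] - xs[i - 1]
            if span == 0:
                return ys[i]
            t = (v - xs[i - 1]) / span
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]

def heq(values, ref_quantiles):
    """Equalize one feature component of an utterance to the reference quantiles."""
    src_q = quantiles(values, len(ref_quantiles))
    return [interp(v, src_q, ref_quantiles) for v in values]

# Reference quantiles would come from clean training data (toy numbers here).
clean_ref = quantiles([0, 1, 2, 3, 4], 5)
print(heq([14, 10, 18, 12, 16], clean_ref))
```

Because the source quantiles are estimated from the utterance itself, the mapping depends on how much silence the utterance contains, which is the limitation the parametric approach is meant to address.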

13 Parametric Equalization (6)
- Current work
  - Development of an on-line version
  - Relax the diagonal covariance assumption
  - Investigate the normalization of dynamic features
  - Use a more detailed model of speech frames (i.e., more than one Gaussian)

14 Schedule
- HIWIRE database evaluations
- Non-linear feature normalization
  - ECDF segmental implementation (LOQ)
  - Parametric equalization
- Robust VAD
  - Bispectrum-based VAD
- Model-based feature compensation
  - VTS results on AURORA4
  - Including uncertainty caused by noise

15 Bispectrum-based VAD (1)
- Motivations:
  - Ability of higher-order statistics to detect signals in noise
  - Polyspectra methods rely on a priori knowledge of the input processes
- Issues to be addressed:
  - Computationally expensive
  - The variance of bispectrum estimators is much higher than that of power spectral estimators for an identical data record size
- Solution: integrated bispectrum
  - J. K. Tugnait, “Detection of non-Gaussian signals using integrated polyspectrum,” IEEE Trans. on Signal Processing, vol. 42, no. 11, pp. 3137–3149, 1994.
  - Computationally efficient, reduced-variance statistical test based on the integrated polyspectra
  - Detection of an unknown random, stationary, non-Gaussian signal in Gaussian noise

16 Bispectrum-based VAD (2)
- Integrated bispectrum:
  - Defined as a cross spectrum between the signal and its square; it is therefore a function of a single frequency variable
- Benefits:
  - Its computation as a cross spectrum leads to significant computational savings
  - The variance of the estimator is of the same order as that of the power spectrum estimator
- Properties:
  - The bispectrum of a Gaussian process is identically zero, and so is its integrated bispectrum
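The definition above (a cross spectrum between the signal and its square) can be sketched with a plain block-averaged estimator. This is a hedged illustration only: the per-block mean removal, the conjugation convention, the 1/N scaling and the naive DFT are assumptions made for a short, dependency-free example, and `integrated_bispectrum` is a hypothetical name.

```python
import cmath

def dft(block):
    """Naive discrete Fourier transform (fine for short illustrative blocks)."""
    n = len(block)
    return [sum(block[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def integrated_bispectrum(signal, block_len):
    """Block-averaged cross spectrum between x(t) and y(t) = x(t)^2 - mean(x^2)."""
    n_blocks = len(signal) // block_len
    if n_blocks == 0:
        raise ValueError("signal shorter than one block")
    acc = [0j] * block_len
    for b in range(n_blocks):
        x = signal[b * block_len:(b + 1) * block_len]
        mean_x = sum(x) / block_len
        x = [v - mean_x for v in x]                  # zero-mean block
        power = sum(v * v for v in x) / block_len
        y = [v * v - power for v in x]               # centered squared signal
        spec_x, spec_y = dft(x), dft(y)
        for k in range(block_len):
            acc[k] += spec_x[k].conjugate() * spec_y[k] / block_len
    return [a / n_blocks for a in acc]

# A symmetric signal gives zero; a skewed one does not.
symmetric = [1.0, -1.0] * 8
skewed = [3.0, -1.0, -1.0, -1.0] * 3
print(max(abs(v) for v in integrated_bispectrum(symmetric, 4)))
print(abs(integrated_bispectrum(skewed, 4)[1]))
```

Each block costs two transforms and one product per frequency bin, which is where the computational saving over a full two-dimensional bispectrum estimate comes from.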

17 Bispectrum-based VAD (3)
- Two alternatives explored for formulating the decision rule:
- Estimation by block averaging:
- MO-LRT
  - Given a set of N = 2m+1 consecutive observations:
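The slide's equations did not survive extraction. As a hedged sketch of the multiple-observation idea only, the decision for frame l can be taken from the sum of per-frame log-likelihood ratios over the N = 2m+1 frames centered on l. The per-frame ratios are taken as given inputs here; the threshold, the handling of edge frames and the name `mo_lrt` are assumptions.

```python
def mo_lrt(frame_llr, m, threshold):
    """Multiple-observation LRT decisions from per-frame log-likelihood ratios.

    frame_llr[k] is assumed to be log p(obs_k | speech) - log p(obs_k | noise).
    Frame l is marked as speech when the sum over frames l-m .. l+m exceeds
    the threshold; windows are truncated at the utterance edges.
    """
    n = len(frame_llr)
    decisions = []
    for l in range(n):
        lo = max(0, l - m)
        hi = min(n, l + m + 1)
        decisions.append(sum(frame_llr[lo:hi]) > threshold)
    return decisions

print(mo_lrt([-1.0, -1.0, 5.0, -1.0, -1.0], m=1, threshold=0.0))
# [False, True, True, True, False]
```

Summing over neighboring frames smooths the decision, at the cost of an m-frame look-ahead.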

18 Bispectrum-based VAD (4)
- Likelihoods
- Variances

19 Bispectrum-based VAD results (1)

20 Bispectrum-based VAD results (2)

21 Bispectrum-based VAD results (3)

22 Schedule
- HIWIRE database evaluations
- Non-linear feature normalization
  - ECDF segmental implementation (LOQ)
  - Parametric equalization
- Robust VAD
  - Bispectrum-based VAD
- Model-based feature compensation
  - VTS results on AURORA4
  - Including uncertainty caused by noise

23 Schedule
- Model-based feature compensation
  - VTS: results on AURORA4
    - VTS formulation
    - VTS vs non-linear feature normalization procedures
    - VTS results on AURORA 4
  - Including uncertainty caused by noise
    - Including uncertainty in noise compensation
    - Wiener filtering + uncertainty: results on Aurora 2
    - Wiener filtering + uncertainty: results on Aurora 4
    - VTS + uncertainty: formulation
    - Numerical integration of probabilities: formulation

24 VTS formulation
- VTS: Vector Taylor Series approach to remove additive (and channel) noise
- References:
  - P. J. Moreno. “Speech recognition in noisy environments.” Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania, Apr. 1996.
  - A. de la Torre. “Técnicas de mejora de la representación en los sistemas de reconocimiento automático del habla.” Ph.D. Thesis, University of Granada, Spain, Apr. 1999.

25 VTS formulation
- VTS provides an estimation of the clean speech in a statistical framework:
- Log-FBO domain, assumed additive noise:
- Effect of noise described using the “correction function” g():
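The equations on this slide were lost in extraction. For orientation, the usual way a correction function arises under the stated assumption (noise additive in the FBO domain, features in the log-FBO domain) is sketched below. The argument convention of g and the omission of the channel term are assumptions; x, n and y denote clean speech, noise and noisy speech in one log-FBO channel.

```latex
\begin{aligned}
e^{y} &= e^{x} + e^{n} \\
y &= \log\left(e^{x} + e^{n}\right)
   = x + \log\left(1 + e^{\,n - x}\right)
   = x + g(n - x),
\qquad g(z) = \log\left(1 + e^{z}\right)
\end{aligned}
```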

26 VTS formulation
- Auxiliary functions f() and h(): 1st and 2nd derivatives:
- VTS provides an estimation of the noisy-speech Gaussian given the clean-speech and the noise Gaussians:
- Noisy-speech Gaussian obtained with the expected values:

27 VTS formulation
- Noisy-speech Gaussian: formulas:
- Models for noise and clean speech:
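Since the formulas themselves are missing from the transcript, the following is a sketch of how a noisy-speech Gaussian can be obtained from the clean-speech and noise Gaussians by a Taylor expansion of y = x + g(n - x), with g(z) = log(1 + exp(z)). Assumptions: one log-FBO channel, independent x and n, a second-order expansion for the mean and a first-order one for the variance, and f and h taken as the first and second derivatives of g. The exact expressions on the slide may differ.

```python
import math

def g(z):
    """Correction function g(z) = log(1 + exp(z)), z = n - x (overflow-safe form)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def f(z):
    """First derivative of g: the logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def h(z):
    """Second derivative of g: f(z) * (1 - f(z))."""
    fz = f(z)
    return fz * (1.0 - fz)

def noisy_gaussian(mu_x, var_x, mu_n, var_n):
    """Approximate mean and variance of y = x + g(n - x) for Gaussian x and n."""
    z = mu_n - mu_x
    # dy/dx = 1 - f, dy/dn = f; both pure second derivatives equal h.
    mean = mu_x + g(z) + 0.5 * h(z) * (var_x + var_n)
    var = (1.0 - f(z)) ** 2 * var_x + f(z) ** 2 * var_n
    return mean, var

print(noisy_gaussian(10.0, 1.0, 8.0, 0.5))
```

When the noise is far below the speech level the result tends to the clean-speech Gaussian, and when it is far above it tends to the noise Gaussian, which is the expected masking behavior.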

28 VTS formulation
- Model for clean speech provides the model for noisy speech, and also P(k|y) (posterior probability of each Gaussian):
- Estimation of clean speech:
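One common way to turn the posteriors P(k|y) into a clean-speech estimate is to subtract the posterior-weighted correction from the observation. The sketch below makes that concrete for a single log-FBO channel; it is an assumption about the general shape of the estimator, not the slide's exact formula. The clean-speech mixture is a list of hypothetical (weight, mean, variance) triples, and the noisy-speech Gaussians use a first-order expansion.

```python
import math

def g(z):
    """Correction function g(z) = log(1 + exp(z)), overflow-safe."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def log_gauss(value, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (value - mean) ** 2 / var)

def vts_clean_estimate(y, gmm, mu_n, var_n):
    """Return (x_hat, posteriors) for one noisy observation y.

    gmm: list of (weight, mean, var) for clean speech (hypothetical values).
    """
    log_post, corrections = [], []
    for weight, mu_x, var_x in gmm:
        z = mu_n - mu_x
        fz = math.exp(-g(-z))                    # logistic function, stable
        mu_y = mu_x + g(z)                       # noisy-speech mean (first order)
        var_y = (1.0 - fz) ** 2 * var_x + fz ** 2 * var_n
        log_post.append(math.log(weight) + log_gauss(y, mu_y, var_y))
        corrections.append(g(z))
    top = max(log_post)
    post = [math.exp(lp - top) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]             # P(k | y)
    x_hat = y - sum(p * c for p, c in zip(post, corrections))
    return x_hat, post

gmm = [(0.5, 5.0, 1.0), (0.5, 15.0, 1.0)]
print(vts_clean_estimate(12.0, gmm, mu_n=10.0, var_n=0.5))
```

Evaluating the correction at the component means keeps the estimator cheap; a higher-order variant would also expand g around each mean.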

29 VTS vs non-linear feature normalization
- VTS:
  - Statistical framework:
    - Model for noise in log-FBO domain: 1 Gaussian PDF
    - Model for clean speech in log-FBO domain: Gaussian mixture
    - Noise assumed to be additive in FBO domain
  - Accurate description of the noise process: ACCURATE COMPENSATION
- Non-linear feature normalization:
  - No a priori assumption
  - Component-by-component
  - MORE FLEXIBLE, LESS ACCURATE

30 VTS results on AURORA 4
Experiment               | Train mode | Test size | WER exp. 01-07 | WER exp. 08-14 | WER exp. 01-14
Baseline                 | Clean      | 166       | 40.53 %        | 50.60 %        | 45.57 %
HEQ                      | Clean      | 166       | 32.19 %        | 42.74 %        | 37.47 %
Parametric non-linear EQ | Clean      | 166       | 28.78 %        | 34.27 %        | 31.53 %
VTS                      | Clean      | 166       | 29.46 %        | 37.22 %        | 33.34 %
VTS (noise known)        | Clean      | 166       | 26.97 %        | 32.25 %        | 26.97 %
AFE                      | Clean      | 166       | 27.57 %        | 34.99 %        | 31.28 %
Baseline                 | Multi      | 166       | 24.58 %        | 29.88 %        | 27.23 %
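In most rows of this table the 01-14 figure equals the plain mean of the 01-07 and 08-14 figures. Assuming that is how the column was computed (an assumption; the slide does not say), a quick consistency check flags the one row that does not fit: for "VTS (noise known)" the mean of 26.97 and 32.25 is 29.61, while the slide reports 26.97.

```python
# (experiment, WER exp. 01-07, WER exp. 08-14, reported WER exp. 01-14)
rows = [
    ("Baseline (clean)", 40.53, 50.60, 45.57),
    ("HEQ", 32.19, 42.74, 37.47),
    ("Parametric non-linear EQ", 28.78, 34.27, 31.53),
    ("VTS", 29.46, 37.22, 33.34),
    ("VTS (noise known)", 26.97, 32.25, 26.97),
    ("AFE", 27.57, 34.99, 31.28),
    ("Baseline (multi)", 24.58, 29.88, 27.23),
]

def inconsistent_rows(table, tolerance=0.006):
    """Names of rows whose reported overall WER is not the mean of the two halves.

    The tolerance allows for rounding to two decimals.
    """
    flagged = []
    for name, first, second, reported in table:
        if abs((first + second) / 2.0 - reported) > tolerance:
            flagged.append(name)
    return flagged

print(inconsistent_rows(rows))  # ['VTS (noise known)']
```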

31 Including uncertainty in noise compensation
- Noise is a random process: we do not know n, but p(n)
- Then, from an observation y we cannot find x, but p(x|y, x, n)
- Usually, compensation procedures provide E[x|y, x, n]
- What about the uncertainty of x?
- Mean and variance of x:

32 Including uncertainty in noise compensation

33 Including uncertainty in noise compensation
- An approach for the estimation of the variance:
- Evaluation of HMM Gaussians:
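The slide's formula for evaluating the HMM Gaussians is missing. A widely used form, shown here as an assumption rather than as the authors' expression, treats the compensated feature as Gaussian with mean x_hat and variance var_x_hat; integrating a model Gaussian against that density simply adds the uncertainty variance to the model variance.

```python
import math

def log_gaussian(value, mean, var):
    """Log-density of a 1-D Gaussian (one dimension of a diagonal-covariance model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (value - mean) ** 2 / var)

def log_gaussian_with_uncertainty(x_hat, var_x_hat, mean, var):
    """Expected model likelihood when x ~ N(x_hat, var_x_hat).

    The integral of N(x; mean, var) * N(x; x_hat, var_x_hat) over x equals
    N(x_hat; mean, var + var_x_hat).
    """
    return log_gaussian(x_hat, mean, var + var_x_hat)

# A reliable frame (no uncertainty) versus an unreliable one (large uncertainty).
print(log_gaussian_with_uncertainty(3.0, 0.0, mean=0.0, var=1.0))
print(log_gaussian_with_uncertainty(3.0, 8.0, mean=0.0, var=1.0))
```

Frames with large uncertainty produce flatter likelihoods across states, so they weigh less in the decoding.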

34 Wiener filt. + uncertainty: results on AURORA 2
- Preliminary results with Wiener filtering:
- Results on Aurora 2 with Wiener filtering + uncertainty
                 | Train mode | WER Set A | WER Set B | WER Set C | Aver. WER
Wiener           | Clean      | 15.75 %   | 15.87 %   | 17.62 %   | 16.17 %
Wiener + Uncert. | Clean      | 12.13 %   | 12.90 %   | 13.28 %   | 12.67 %
Wiener           | Multi      | 8.91 %    | 10.44 %   | 10.95 %   | 9.93 %
Wiener + Uncert. | Multi      | 8.87 %    | 10.34 %   | 10.69 %   | 9.82 %

35 Wiener filter + uncertainty: results on AURORA 4
Experiment               | Train mode | Test size | WER exp. 01-07 | WER exp. 08-14 | WER exp. 01-14
Baseline                 | Clean      | 166       | 40.53 %        | 50.60 %        | 45.57 %
HEQ                      | Clean      | 166       | 32.19 %        | 42.74 %        | 37.47 %
Parametric non-linear EQ | Clean      | 166       | 28.78 %        | 34.27 %        | 31.53 %
VTS                      | Clean      | 166       | 29.46 %        | 37.22 %        | 33.34 %
Wiener + Uncertainty     | Clean      | 166       | 27.68 %        | 33.79 %        | 30.74 %
AFE                      | Clean      | 166       | 27.57 %        | 34.99 %        | 31.28 %
Baseline                 | Multi      | 166       | 24.58 %        | 29.88 %        | 27.23 %

36 VTS + uncertainty: formulation
- VTS-based estimation of clean speech:
- VTS-based estimation of variance:

37 Numerical integration of probabilities: formulation
- Computation of expected values:
- Numerical integration of expected values:
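The formulation on this slide is not recoverable from the transcript. As a hedged illustration of the general idea only, an expected value over Gaussian clean speech x and Gaussian noise n can be approximated on a finite grid instead of with a Taylor expansion. The grid size, the span of five standard deviations and the function names are assumptions.

```python
import math

def expected_value(func, mu_x, std_x, mu_n, std_n, points=41, span=5.0):
    """Approximate E[func(x, n)] for independent Gaussian x and n on a grid."""
    def grid(mu, std):
        nodes = [mu + std * span * (2.0 * i / (points - 1) - 1.0) for i in range(points)]
        weights = [math.exp(-0.5 * ((v - mu) / std) ** 2) for v in nodes]
        total = sum(weights)
        return nodes, [w / total for w in weights]   # normalized weights

    xs, wx = grid(mu_x, std_x)
    ns, wn = grid(mu_n, std_n)
    return sum(a * b * func(x, n) for x, a in zip(xs, wx) for n, b in zip(ns, wn))

def noisy(x, n):
    """Noisy log-FBO value log(exp(x) + exp(n)), overflow-safe."""
    top = max(x, n)
    return top + math.log(math.exp(x - top) + math.exp(n - top))

mean_y = expected_value(noisy, 3.0, 1.0, 0.0, 2.0)
var_y = expected_value(lambda x, n: (noisy(x, n) - mean_y) ** 2, 3.0, 1.0, 0.0, 2.0)
print(mean_y, var_y)
```

The cost grows with the square of the number of grid points per channel, so a scheme like this trades computation for freedom from the small-variance assumption behind the series expansion.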

38 HIWIRE MEETING Athens, November 3-4, 2005 José C. Segura, Ángel de la Torre

