HIWIRE MEETING CRETE, SEPTEMBER 23-24, 2004 JOSÉ C. SEGURA LUNA GSTC UGR.


1 HIWIRE MEETING CRETE, SEPTEMBER 23-24, 2004 JOSÉ C. SEGURA LUNA GSTC UGR

2 Schedule
- VAD for noise suppression and frame-dropping
  - Long-Term Spectral Divergence
  - Subband OS-based detector
- Non-linear feature normalization
  - Histogram equalization
  - OS-based equalization
  - Segmental implementation

3 VAD (1)
- VAD: motivation
  - To get an estimate of the background noise for
    - Wiener filter design
    - Spectral subtraction
  - To discard non-speech frames
[Block diagram. Labels: NOISY SPEECH, WIENER FILTER / SS, VAD, NOISE ESTIMATION, FRAME DROPPING, RECOGNIZER]

4 VAD (2)
- Our approach
  - Use rather long time spans (~100 ms) instead of instantaneous measures
    - Increased discrimination
  - Use a statistical model in the log-FBE domain
    - Smoother estimates
  - Use a feedback decision coupled with noise suppression
    - VAD works on less noisy speech
  - Use order statistics
    - More robust estimation

5 Long-Term Spectral Divergence (1)
- J. Ramírez, J.C. Segura, C. Benítez, A. de la Torre and A.J. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Communication 42 (2004) 271–287
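The slides that followed this citation were figures and did not survive in the transcript. As a rough illustration only, the sketch below assumes the usual definition of the long-term spectral divergence: a long-term spectral envelope (the per-bin maximum magnitude over the frames within `order` frames of the current one) compared against an average noise spectrum, expressed in dB. The function name `ltsd`, the parameter `order`, its default value, and the truncation of the window at the utterance edges are all assumptions made here, not details taken from the slides.

```python
import math

def ltsd(mag_frames, noise_mag, l, order=6):
    """Long-term spectral divergence (dB) at frame l (illustrative sketch).

    mag_frames: list of frames, each a list of spectral magnitudes |X(k, l)|.
    noise_mag:  average noise magnitude spectrum N(k), same length as a frame.
    order:      number of neighbouring frames on each side (arbitrary default).
    """
    # Window of frames around l, truncated at the utterance edges (assumption).
    lo = max(0, l - order)
    hi = min(len(mag_frames), l + order + 1)
    n_bins = len(noise_mag)
    # Long-term spectral envelope: per-bin maximum over the window.
    ltse = [max(mag_frames[j][k] for j in range(lo, hi)) for k in range(n_bins)]
    # Average envelope-to-noise power ratio over all bins, in dB.
    ratio = sum(ltse[k] ** 2 / noise_mag[k] ** 2 for k in range(n_bins)) / n_bins
    return 10.0 * math.log10(ratio)
```

A frame would then be labelled as speech when `ltsd(...)` exceeds a threshold; how that threshold is chosen is not covered by this sketch.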

6 Long-Term Spectral Divergence (2)

7 Long-Term Spectral Divergence (3)

8 Long-Term Spectral Divergence (4)

9 Long-Term Spectral Divergence (5)

10 Long-Term Spectral Divergence (7)
- Recognition experiments with AURORA 2 and 3

11 Long-Term Spectral Divergence (6)

12 Subband OSF VAD (1)
- J. Ramírez, J.C. Segura, C. Benítez, A. de la Torre and A.J. Rubio, "An Effective Subband OSF-based VAD with Noise Reduction for Robust Speech Recognition," IEEE Trans. on Speech and Audio Processing (to appear in 2005)
- Decision is based on an averaged QSNR, defined as an inter-quantile difference
- Feedback structure
  - VAD operates on the noise-reduced signal

13 Subband OSF VAD (2)

14 Subband OSF VAD (3)

15 Subband OSF VAD (4)

16 Subband OSF VAD (5)

17 Accurate VAD
- Open topics
  - New alternatives to improve performance
    - New decision criteria based on OS filters
      - Already used for edge detection in images
  - Computational efficiency
    - Development of computationally efficient algorithms

18 Feature normalization
- Objective
  - Transform features to remove undesired variability
- Linear techniques
  - CMS
    - Cepstral mean subtraction
    - Removes the effect of linear channel distortion
  - CMVN
    - Cepstral mean and variance normalization
    - Extension of CMS to deal with the variance reduction caused by additive noise
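The two linear techniques can be sketched in a few lines. This is a minimal per-utterance illustration, assuming the features arrive as a list of frames (each a list of cepstral coefficients); the function names `cms` and `cmvn` and the `eps` guard against zero variance are choices made for this example.

```python
from statistics import fmean, pstdev

def cms(frames):
    """Cepstral mean subtraction: remove each coefficient's utterance mean."""
    dims = range(len(frames[0]))
    means = [fmean([f[d] for f in frames]) for d in dims]
    return [[f[d] - means[d] for d in dims] for f in frames]

def cmvn(frames, eps=1e-12):
    """Cepstral mean and variance normalization: zero mean, unit variance."""
    dims = range(len(frames[0]))
    means = [fmean([f[d] for f in frames]) for d in dims]
    # Population standard deviation per coefficient; eps avoids division by zero.
    stds = [max(pstdev([f[d] for f in frames]), eps) for d in dims]
    return [[(f[d] - means[d]) / stds[d] for d in dims] for f in frames]
```

Adding a constant offset to every frame leaves the output of `cms` unchanged, and applying a positive gain plus offset to a coefficient leaves the output of `cmvn` unchanged.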

19 Feature normalization
- Non-linear feature distortion
  - Environment effects are non-linear for MFCC features
  - They can hardly be removed with linear techniques
    - Not only the location (mean) and scale (variance) of the feature distributions are affected, but also the shape (higher-order moments of the distribution)
- Non-linear extensions
  - CDF-matching approaches (HEQ and related)
    - Have been shown to be more effective than linear ones
    - Normalize more than just the first two moments of the probability distributions

20 CDF-matching based equalization
- The main idea
  - Transform the features to match a given PDF
  - In the one-dimensional case, CDF matching gives the solution
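The one-dimensional solution can be written out explicitly. The notation below is introduced here for illustration and assumes a monotonically increasing transformation: x is the observed feature, y the equalized feature, C_X the CDF of the observed feature, and C_ref the CDF of the reference distribution.

```latex
% Matching cumulative probabilities before and after the transformation
C_X(x) = C_{\mathrm{ref}}(y)
\quad\Longrightarrow\quad
y = C_{\mathrm{ref}}^{-1}\bigl(C_X(x)\bigr)
```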

21 Equalization and robust classifiers

22 Invariance
- CMS is invariant to additive bias
- CMVN is invariant to linear transformations
- Equalization to a reference distribution is invariant to any invertible transformation (including non-linear ones)

23 HEQ for robust speech recognition (1)
- A. de la Torre, A.M. Peinado, J.C. Segura, J.L. Pérez, C. Benítez and A.J. Rubio, "Histogram equalization of speech representation for robust speech recognition," IEEE Trans. on Speech and Audio Processing (to appear in 2005)
- Transformation of each component of the MFCC vector to a Gaussian reference
- Cumulative distributions are estimated using histograms
- Performance compared with CMS, CMVN and model-based feature compensation (VTS)
- Combination with VTS
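A minimal sketch of histogram-based equalization of one MFCC component to a standard Gaussian reference is given below. It is an illustration under stated assumptions, not the implementation from the cited paper: the function name `heq_gaussian`, the number of bins, and the choice to evaluate the cumulative histogram at the middle of each bin are all made up for this example.

```python
from statistics import NormalDist

def heq_gaussian(values, n_bins=100):
    """Equalize one feature component over an utterance to N(0, 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: a constant feature maps to the reference median.
        return [0.0] * len(values)
    width = (hi - lo) / n_bins

    def bin_of(v):
        # Clamp so that the maximum value falls in the last bin.
        return min(int((v - lo) / width), n_bins - 1)

    # Histogram of the observed values.
    counts = [0] * n_bins
    for v in values:
        counts[bin_of(v)] += 1

    # Cumulative histogram evaluated at the middle of each bin.
    n = len(values)
    cdf = []
    below = 0
    for c in counts:
        cdf.append((below + 0.5 * c) / n)
        below += c

    # Map each value through the inverse CDF of the Gaussian reference.
    ref = NormalDist()
    return [ref.inv_cdf(cdf[bin_of(v)]) for v in values]
```

In a full front-end this would be applied independently to each component of the MFCC vector.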

24 HEQ for robust speech recognition (2)

25 HEQ for robust speech recognition (3)

26 HEQ for robust speech recognition (4)

27 HEQ for robust speech recognition (5)

28 Segmental HEQ (1)
- J.C. Segura, C. Benítez, A. de la Torre, A.J. Rubio and J. Ramírez, "Cepstral Domain Segmental Nonlinear Feature Transformations for Robust Speech Recognition," IEEE Signal Processing Letters, 11(5), May 2004
- A segmental implementation of HEQ for non-stationary noise
- A temporal buffer is used for the histogram estimation instead of the full sentence
  - The algorithmic delay is T frames
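The segmental idea can be sketched as follows, assuming a buffer of up to 2T + 1 frames centred on the current frame, so that T future frames are needed before a frame can be output. For brevity the CDF inside the buffer is estimated from the rank of the current value rather than from an explicit histogram; that substitution, the name `segmental_eq`, and the truncation of the buffer at the utterance edges are assumptions of this example.

```python
from statistics import NormalDist

def segmental_eq(values, T=2):
    """Equalize one feature component to N(0, 1) using a sliding buffer."""
    ref = NormalDist()
    n = len(values)
    out = []
    for t, x in enumerate(values):
        # Buffer of up to 2*T + 1 frames centred on frame t (shorter at the edges).
        window = values[max(0, t - T):min(n, t + T + 1)]
        # 1-based rank of the current value within the buffer.
        rank = sum(1 for w in window if w < x) + 1
        # Rank-based CDF estimate, kept strictly inside (0, 1).
        p = (rank - 0.5) / len(window)
        out.append(ref.inv_cdf(p))
    return out
```

Because the output depends only on ranks inside the buffer, applying any strictly increasing transformation to the input leaves the result unchanged.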

29 Segmental HEQ (2)

30 OSEQ: An efficient implementation (1)
- A very computationally efficient algorithm based on order statistics

31 OSEQ: An efficient implementation (2)

32 Feature normalization
- Open topics
  - Reference distribution
    - Clean speech / Gaussian / others?
  - Dynamic features normalization (Δ and ΔΔ)
    - After, before or simultaneously [Obuchi, Stern, EUSP'03]
  - Progressive normalization
    - Not all MFCCs are equally affected, and they do not have equal discriminative power [de Wet, …, ICASSP'03]
    - Lower-order moments normalization [Hsu, Lee, ICASSP'04]
  - Parametric techniques
    - Current approaches are non-parametric [Haverinen, Kiss, EUSP'03]
  - New applications
    - Speaker independence and adaptation
    - Multi-stream normalization

33 Combination of techniques
- Development of a combined robust front-end
  - An accurate VAD
    - For noise parameter estimation
  - A noise reduction technique
    - Spectral subtraction or Wiener filter
    - Statistical feature compensation
  - A frame-dropping algorithm
    - To discard non-speech frames
  - A feature normalization block
    - For residual non-linear distortion compensation

34 VAD (1)
- Development of a combined robust front-end
[Block diagram. Labels: NOISY SPEECH, WIENER FILTER / SS, VAD, NOISE ESTIMATION, FRAME DROPPING, FEATURE EQUALIZATION, RECOGNIZER]


