Communications & Multimedia Signal Processing Meeting 6 Esfandiar Zavarehei Department of Electronic and Computer Engineering Brunel University 6 July,


1 Communications & Multimedia Signal Processing Meeting 6 Esfandiar Zavarehei Department of Electronic and Computer Engineering Brunel University 6 July, 2005

2 Contents
Review of Noise Reduction Methods (more results)
– Review of the methods
– DFT-Kalman, a new method for parameter estimation
– Evaluation results and sample speech signals
FTLP-HNM Model
– FTLP-HNM for gap restoration
Noise Station
– An interface for the programs

3 Review of Noise Reduction Methods
Most noise reduction systems fit this block diagram:
[Figure: block diagram of a noise reduction system]
The de-noising method is based on:
– Spectral subtraction, or
– Bayesian estimation

4 Spectral Subtraction
The spectral subtraction method is generally formulated as: [equation not shown]
where S, X and N are the speech, noisy speech and noise spectral amplitudes, k is the frequency index, α is the power exponent, A and B are the attenuation and subtraction coefficients respectively, and T is the dynamic threshold.
Spectral subtraction methods vary in how A and B are estimated.
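The equation on this slide did not survive in the transcript. One form that is consistent with the symbols defined above, offered as an assumption rather than as the author's exact expression, is:

$$\hat{S}^{\alpha}(k) = \max\left( A(k)\,X^{\alpha}(k) - B(k)\,N^{\alpha}(k),\; T(k) \right)$$

Here the hat on S (marking an estimate), the frequency dependence of A, B and T, and the choice to floor the result at T are all assumptions. With A=1, B=1 and T=0 this reduces to plain magnitude subtraction for α=1 and power subtraction for α=2, with negative results set to zero.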

5 Spectral Subtraction
Simple SS: constant A and B (e.g. A=1, B=1, T=0, α=1 or 2)
Adaptive spectral subtraction:
– Using the a posteriori SNR (uses only the speech information in the current frame)
– Using the a priori SNR (tracks the fluctuations of speech in successive frames)
– Using both the a posteriori and a priori SNRs (e.g. optimized to give the MMSE)
Different algorithms are used for calculation of the threshold.
The number of negative values resulting from spectral subtraction can be large and depends on the noise spectrum and the SNR.
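A minimal sketch of the simple (constant-coefficient) case may help make the role of A, B, T and α concrete. It assumes the subtraction takes the form S^α = max(A·X^α − B·N^α, T), which the transcript does not confirm; the function name and the returned count of floored bins are hypothetical and only illustrate how often the subtraction goes negative.

```python
def simple_spectral_subtraction(noisy, noise, A=1.0, B=1.0, T=0.0, alpha=2.0):
    """Apply S^alpha = max(A*X^alpha - B*N^alpha, T) to each frequency bin.

    ASSUMPTION: this flooring form is not taken from the slides.
    noisy, noise: sequences of non-negative spectral amplitudes X(k), N(k).
    Returns (estimates, n_floored), where n_floored counts the bins whose
    subtraction result fell below the threshold T.
    """
    if T < 0:
        raise ValueError("T must be non-negative so the root is real")
    estimates = []
    n_floored = 0
    for x, n in zip(noisy, noise):
        diff = A * x ** alpha - B * n ** alpha
        if diff < T:
            # Result is below the threshold (typically negative): floor it.
            diff = T
            n_floored += 1
        # Convert back from the alpha-power domain to an amplitude.
        estimates.append(diff ** (1.0 / alpha))
    return estimates, n_floored
```

Calling it with `alpha=1.0` gives magnitude subtraction and `alpha=2.0` gives power subtraction; the second return value grows as the noise estimate approaches or exceeds the noisy amplitude, which is the low-SNR situation the slide warns about.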

6 Bayesian Estimation
Frames are independent:
– Estimation of ST-DFT components (real and imaginary)
    Gaussian-Gaussian (Wiener)
    Other distributions for speech and noise (various estimators by Martin)
– Estimation of the amplitude, using the noisy phase
    Amplitude, log-amplitude, power (different parameters to be estimated)
    Gaussian, Gaussian mixtures (needs training), Laplacian (computationally not feasible)
    Criteria: MMSE, MAP, joint phase and amplitude MAP, etc.
– Methods for parameter estimation use inter-frame information
Frames are not independent:
– DFT-Kalman

7 Bayesian Estimation
Wiener: speech always suppressed
Distributions vary from phoneme to phoneme and frequency to frequency
Average Symmetric Kullback-Leibler Distance
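Two of the points above can be illustrated with a short sketch. The Wiener gain ξ/(1+ξ), with ξ the a priori SNR, is strictly below 1 for any finite ξ, which is why the speech component is always attenuated. The symmetric Kullback-Leibler distance is written here as the sum of the two directed divergences; the slides do not say which normalization or which pair of distributions was used, so the function below is an assumption (some authors halve the sum).

```python
import math


def wiener_gain(xi):
    """Wiener gain for an a priori SNR xi >= 0 (linear scale, not dB)."""
    if xi < 0:
        raise ValueError("a priori SNR must be non-negative")
    return xi / (1.0 + xi)


def symmetric_kl(p, q):
    """Symmetric KL distance KL(p||q) + KL(q||p) for discrete distributions.

    ASSUMPTION: un-halved sum of the two directed divergences.
    p and q must have equal length and strictly positive entries.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    if any(v <= 0 for v in p) or any(v <= 0 for v in q):
        raise ValueError("entries must be strictly positive")
    # KL(p||q) + KL(q||p) simplifies to sum((p_i - q_i) * ln(p_i / q_i)).
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))
```

In practice `p` and `q` could be normalized histograms, for example an observed distribution of DFT components and a fitted model, but that pairing is illustrative only.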

8 DFT-Kalman
Incorporates the AR model of the short-time DFT trajectories for estimation
Gaussian distribution
Noise in each ST-DFT channel is assumed to be white Gaussian noise (WGN)

9 DFT-Kalman
During noise-only periods the output converges to zero, making the whole output zero.
In order to avoid too small values of the LP error covariance, Q, during speech-active periods:
$Q = \max\left(Q,\; m\,|X(k)|^2\right)$, with $(0.05)^2 < m < (0.30)^2$
Small values of m result in further reduction of the background noise but also in more distortion of the speech signal.
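The effect of flooring Q can be seen in a scalar Kalman filter applied to one ST-DFT trajectory. The sketch below is a simplification under stated assumptions, not the presented method: it uses a first-order AR model with a fixed coefficient `a` (the slides do not give the model order), treats the real component only, and uses x(n)² in place of |X(k)|² for the floor. All names are hypothetical.

```python
def kalman_ar1(x, a, q, r, m):
    """Filter one real-valued ST-DFT trajectory with a scalar Kalman filter.

    ASSUMED model (first order, for illustration only):
        s(n) = a * s(n-1) + e(n),  var(e) = q   (AR excitation)
        x(n) = s(n) + v(n),        var(v) = r   (white Gaussian noise)
    In every frame q is floored at m * x(n)**2, mirroring Q = max(Q, m*|X|^2).
    """
    s_hat, p = 0.0, q
    out = []
    for xn in x:
        q_eff = max(q, m * xn * xn)          # floored excitation variance
        s_pred = a * s_hat                   # state prediction
        p_pred = a * a * p + q_eff           # prediction error variance
        denom = p_pred + r
        gain = p_pred / denom if denom > 0.0 else 0.0
        s_hat = s_pred + gain * (xn - s_pred)  # measurement update
        p = (1.0 - gain) * p_pred
        out.append(s_hat)
    return out
```

A larger `m` raises the Kalman gain, so more of the noisy observation passes through; a smaller `m` suppresses more, which matches the trade-off between noise reduction and speech distortion described on the slide.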

10 DFT-Kalman
Another method is based on spectral subtraction of the ST-DFT trajectories.
An autocorrelation vector is obtained using spectral subtraction at the start of the speech after long noise-only periods: [equation not shown]
where L+1 is the number of samples used in the calculation of the autocorrelation vector and $X_r(n)$ is the real component of the ST-DFT trajectory at frame n and an arbitrary frequency. Similar equations hold for the imaginary components.
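The autocorrelation equation itself is missing from the transcript, so the following is only a sketch of one way such an estimate could be formed. It assumes the channel noise is white and uncorrelated with the speech, in which case the noise contributes only to lag 0 of the autocorrelation and subtracting the noise variance there is equivalent to power spectral subtraction. The biased normalization by the number of samples, the `floor_ratio` safeguard and both function names are assumptions.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimate r(j), j = 0..max_lag, of a real sequence."""
    n = len(x)
    if n == 0 or max_lag >= n:
        raise ValueError("need more samples than lags")
    return [sum(x[i] * x[i - j] for i in range(j, n)) / n
            for j in range(max_lag + 1)]


def subtract_white_noise(r_noisy, noise_var, floor_ratio=0.01):
    """Remove a white-noise contribution from an autocorrelation vector.

    ASSUMPTION: white, speech-uncorrelated noise only affects lag 0.
    Lag 0 is floored at floor_ratio * r_noisy[0] so it stays positive.
    """
    r = list(r_noisy)
    r[0] = max(r[0] - noise_var, floor_ratio * r_noisy[0])
    return r
```

With `x` holding L+1 consecutive values of the real component of a trajectory, `subtract_white_noise(autocorr(x, p), noise_var)` would give a vector from which AR coefficients of order `p` could be derived, for instance by solving the Yule-Walker equations.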

11 DFT-Kalman
This autocorrelation is linearly combined with the estimated autocorrelation obtained from previously estimated samples: [equation not shown]
where $n_1$ is the frame index of the first speech segment detected.
Regardless of the presence of speech, if the variance of the excitation of the AR model is lower than a fixed threshold, a weighted average of the spectral subtraction-based autocorrelation and the autocorrelation of the previous estimates of the ST-DFT trajectories is used: [equation not shown]

12 Evaluation of the methods
The correlation coefficient between different distortion measures and the mean opinion score (MOS) of 90 sentences (noisy, clean and de-noised; 10 listeners) is calculated.
PESQ has the highest correlation with the MOS results.
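The slide does not name the type of correlation coefficient; assuming the usual Pearson coefficient, the comparison between an objective measure and MOS could be sketched as follows. The function name is hypothetical and no scores from the presentation are reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)
```

Here `xs` would hold one objective score per sentence (for example PESQ) and `ys` the corresponding MOS. For distortion measures where a larger value means worse quality, the coefficient is expected to be negative, so measures are usually compared by magnitude.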

13 PESQ – Car Noise
SASS: Simple Amplitude SS
BPSS: a posteriori Power SS
MBSS: Multiband SS
SSAPR: a priori Amplitude SS
PSS: Parametric SS
MMSE STSA: Ephraim’s Amplitude Estimator
MMSE LSA: Ephraim’s Log-Amplitude Estimator
GGDFT: Martin’s Gamma-Gamma DFT Estimator

14 PESQ – Train Noise

15 Mean Opinion Score – Car Noise

16 Mean Opinion Score – Train Noise

17 Sample Speech Signals
Car Noise: Noisy, SASS, BPSS, MBSS, SSAPR, PSS, Wiener, MMSE STSA, MMSE LSA, GGDFT, DFTK, DFTSS
Train Noise: Noisy, SASS, BPSS, MBSS, SSAPR, PSS, Wiener, MMSE STSA, MMSE LSA, GGDFT, DFTK, DFTSS
Clean Signal

18 Future and Present Work
Investigate the effect of incorporating a noise AR model in the Kalman formulation: [equation not shown]
where the F's are the state transition matrices of speech and noise.
Clean speech would be a by-product of the Kalman filtering.
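The state equation is not preserved in the transcript. A common way to include a noise AR model in a Kalman filter is to stack the speech and noise states into one augmented state, sketched below as an assumption rather than the author's formulation:

$$\begin{bmatrix}\mathbf{s}(n)\\ \mathbf{v}(n)\end{bmatrix} = \begin{bmatrix}\mathbf{F}_s & \mathbf{0}\\ \mathbf{0} & \mathbf{F}_v\end{bmatrix}\begin{bmatrix}\mathbf{s}(n-1)\\ \mathbf{v}(n-1)\end{bmatrix} + \begin{bmatrix}\mathbf{e}_s(n)\\ \mathbf{e}_v(n)\end{bmatrix}, \qquad x(n) = \begin{bmatrix}\mathbf{h}_s^{T} & \mathbf{h}_v^{T}\end{bmatrix}\begin{bmatrix}\mathbf{s}(n)\\ \mathbf{v}(n)\end{bmatrix}$$

Here $\mathbf{F}_s$ and $\mathbf{F}_v$ are the speech and noise state transition matrices mentioned on the slide; the state vectors $\mathbf{s}(n)$ and $\mathbf{v}(n)$, the excitation vectors $\mathbf{e}_s(n)$ and $\mathbf{e}_v(n)$, and the selection vectors $\mathbf{h}_s$ and $\mathbf{h}_v$ (which pick the current speech and noise samples out of each state) are hypothetical notation. In this form the observation $x(n)$ is the sum of the current speech and noise samples, and the speech estimate is read from the speech part of the filtered state.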

19 Future and Present Work
Development of the FTLP-HNM model together with the group, and exploration of its potential for:
– Gap restoration,
– Speech enhancement, and
– (possibly) Coding
The problem with phase in gap restoration
Sample

20 Future and Present Work
Further development of the Noise Station program

21 Future and Present Work
Current capabilities:
– Open/Close/Save/Amplify/Play/Resample wave signals
– Frame-by-frame and overall viewing of signal/FFT/LP spectrum/excitation/formants/pitch frequency/harmonics
– Add noise/De-noise (different methods)/Distortion measurement
– Formant/Pitch/Harmonic tracking and viewing
Future capabilities:
– An option for easily adding new methods (de-noising, pitch tracking, etc.)

22 Future and Present Work
Template for the Programs

function output=MMSESTSA84_NS(signal,fs,P)
% output=MMSESTSA84_NS(signal,fs,P)
% HELP AND DIRECTIONS APPEAR HERE
% Author: -
% Date: Dec-04

% INITIALIZE ALL THE PARAMETERS HERE
IS=.25; %INITIAL SILENCE LENGTH
alpha=.99; %DECISION DIRECTED PARAMETER

if (nargin>=3 && isstruct(P)) %EXTRACTING PARAMETERS
    if isfield(P,'alpha')
        alpha=P.alpha; %DECISION DIRECTED PARAMETER
    else
        alpha=.99; %DECISION DIRECTED PARAMETER
    end
    if isfield(P,'IS')
        IS=P.IS; %INITIAL SILENCE LENGTH
    else
        IS=.25; %INITIAL SILENCE LENGTH
    end
end
%THE PROGRAM STARTS HERE...............

