Communications & Multimedia Signal Processing. Meeting 7. Esfandiar Zavarehei, Department of Electronic and Computer Engineering, Brunel University. 23 November, 2005.
Contents: (1) Kalman Filter: Speech and noise tracking; (2) HNM Model: The degree of “Harmonicity”; (3) Bandwidth extension; (4) Future work: noise reduction using HNM.
Kalman Filter: Speech and noise tracking. Previous method: modelling speech with an AR model. Notation: X: noisy signal, S: speech, D: noise; the noise enters through its variance.
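The previous method can be illustrated with a minimal sketch, assuming an observation x[n] = s[n] + d[n] in which the speech s follows an AR(p) model and the noise d is white with a known variance. The function name `kalman_ar_denoise` and its parameters `a`, `q` and `r` are hypothetical, and a real-valued sequence is used for simplicity; in the setting of these slides the same recursion would presumably run along the DFT trajectory of each frequency bin.

```python
import numpy as np

def kalman_ar_denoise(x, a, q, r):
    """Estimate s[n] from x[n] = s[n] + d[n].

    Assumed model (illustrative, not taken from the slides):
      s[n] = sum_i a[i] * s[n-1-i] + e[n], var(e) = q   (speech AR model)
      d[n] is white noise with variance r               (noise variance)
    """
    p = len(a)
    F = np.zeros((p, p))
    F[0, :] = a                    # first row holds the AR coefficients
    F[1:, :-1] = np.eye(p - 1)     # remaining rows shift the past samples
    Q = np.zeros((p, p))
    Q[0, 0] = q                    # excitation drives the newest sample only
    state = np.zeros(p)
    P = np.eye(p)
    s_hat = np.empty(len(x))
    for n, xn in enumerate(x):
        # Prediction step
        state = F @ state
        P = F @ P @ F.T + Q
        # Update step (the observation picks out the first state element)
        innovation = xn - state[0]
        S = P[0, 0] + r
        K = P[:, 0] / S
        state = state + K * innovation
        P = P - np.outer(K, P[0, :])
        s_hat[n] = state[0]
    return s_hat
```

With the true AR coefficients and variances supplied, the filtered output should sit closer to the clean sequence than the noisy input does; in practice both would have to be estimated.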
Kalman Filter: Speech and noise tracking (cont.). New method: modelling speech AND noise with AR models.
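One common way to model both speech and noise with AR models is to stack the two AR states into a single state vector and observe their sum. The sketch below assumes that formulation; it is not necessarily the exact one used for these results, and the names `companion` and `kalman_speech_noise` are hypothetical.

```python
import numpy as np

def companion(a):
    """Companion (shift-register) matrix of an AR model with coefficients a."""
    p = len(a)
    F = np.zeros((p, p))
    F[0, :] = a
    F[1:, :-1] = np.eye(p - 1)
    return F

def kalman_speech_noise(x, a_s, q_s, a_d, q_d):
    """Estimate speech from x[n] = s[n] + d[n], with AR models for s and d.

    a_s, q_s: speech AR coefficients and excitation variance (assumed known).
    a_d, q_d: noise AR coefficients and excitation variance (assumed known).
    """
    ps, pd = len(a_s), len(a_d)
    n_state = ps + pd
    F = np.zeros((n_state, n_state))
    F[:ps, :ps] = companion(a_s)       # speech block
    F[ps:, ps:] = companion(a_d)       # noise block
    Q = np.zeros((n_state, n_state))
    Q[0, 0] = q_s
    Q[ps, ps] = q_d
    h = np.zeros(n_state)
    h[0] = 1.0                         # newest speech sample
    h[ps] = 1.0                        # newest noise sample
    z = np.zeros(n_state)
    P = np.eye(n_state)
    s_hat = np.empty(len(x))
    for n, xn in enumerate(x):
        # Prediction step
        z = F @ z
        P = F @ P @ F.T + Q
        # Update step: the observation is exactly s[n] + d[n],
        # so there is no separate measurement-noise term.
        Ph = P @ h
        S = h @ Ph
        K = Ph / S
        z = z + K * (xn - h @ z)
        P = P - np.outer(K, Ph)
        s_hat[n] = z[0]
    return s_hat
```

Because the noise state carries its own spectral shape, this form can separate coloured noise (car or train noise, for example) from speech in a way that a single noise variance cannot.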
Kalman Filter: Speech and noise tracking (cont.). The noise AR models are obtained from noise-only periods. Results from the new model sound more natural. Audio samples: clean speech, noisy speech, Kalman (old), Kalman (new). Results table (values not recoverable), with headers SNR and Method: DFTKUN, DFTKCN, MMSE, Wiener and PSS compared for car, train and WGN noise.
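As an illustration of obtaining an AR model from a noise-only period, the sketch below uses the Yule-Walker (autocorrelation) method. The slides do not state which estimator was used, so this choice is an assumption, as is the function name `yule_walker`; the input is taken to be a segment already known to contain noise only.

```python
import numpy as np

def yule_walker(x, order):
    """Estimate AR coefficients and excitation variance from a segment x.

    Returns (a, q) such that x[n] ~ sum_i a[i] * x[n-1-i] + e[n], var(e) = q.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates for lags 0..order
    r = np.array([x[: n - k] @ x[k:] for k in range(order + 1)]) / n
    # Toeplitz system R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    q = r[0] - a @ r[1:]
    return a, q
```

The returned pair could be passed as the noise model of a Kalman filter, and re-estimated whenever a new noise-only period is detected.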
HNM Model. Harmonic sub-bands are modelled as the sum of a Gaussian and some random noise R, where R is random noise with a Rayleigh distribution.
HNM Model. Figure: sample reconstructed frame.
HNM Model. Audio samples: original and synthesized speech. Synthesized PESQ: 3.91.
HNM Model. Noise severely affects the amplitudes A_k. Pitch, harmonicity and harmonic frequencies are much less distorted by noise. Simple analysis/synthesis of noisy speech improves its quality (SNR < 10 dB).
LP-HNM. Decompose the signal into an LP model (AR or LSF) and an HNM model of the residual (f_k, A_k, V_k). The amplitudes can be assumed to be equal (whitened by inverse modelling). The frequencies may also be assumed to be multiples of the fundamental frequency (later displaced slightly by LP modelling). LP-HNM synthesized PESQ: 3.50.
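The two simplifying assumptions above (equal amplitudes, frequencies at multiples of the fundamental) can be sketched as a synthesis step: build a flat harmonic excitation and shape it with the all-pole LP filter. This is only an illustration; it ignores the harmonicity V_k (every sub-band is treated as fully harmonic), and the names `harmonic_excitation` and `all_pole_filter` are hypothetical.

```python
import numpy as np

def harmonic_excitation(f0, fs, n_samples):
    """Sum of unit-amplitude cosines at multiples of f0, strictly below fs/2."""
    n_harm = int(np.ceil(fs / (2.0 * f0))) - 1
    t = np.arange(n_samples) / fs
    k = np.arange(1, n_harm + 1)[:, None]
    return np.sum(np.cos(2.0 * np.pi * k * f0 * t[None, :]), axis=0)

def all_pole_filter(e, a, gain=1.0):
    """LP synthesis: y[n] = gain * e[n] + sum_i a[i] * y[n-1-i]."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = gain * e[n]
        for i in range(len(a)):
            if n - 1 - i >= 0:
                acc += a[i] * y[n - 1 - i]
        y[n] = acc
    return y
```

A frame would then be synthesized as `all_pole_filter(harmonic_excitation(f0, fs, n), a, gain)`, with the spectral envelope supplied entirely by the LP coefficients.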
Bandwidth Extension. One application of the model is bandwidth extension: obtaining 16 kHz speech quality from 8 kHz speech. Block diagram: the 8 kHz speech signal goes through LP-HNM analysis and a trained LP-HNM model to give the 16 kHz speech signal.
Bandwidth Extension. Codebook mapping is used to obtain the higher LPF coefficients from the lower LPF coefficients extracted from the 8 kHz signal. A similar method is used to obtain the harmonicity degree of the higher sub-bands.
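Codebook mapping can be sketched as a nearest-neighbour lookup in a low-band codebook whose entries are paired, row by row, with high-band entries. The sketch assumes the paired codebooks have already been trained and uses a plain squared Euclidean distance; the name `codebook_map` and its arguments are hypothetical.

```python
import numpy as np

def codebook_map(low_vec, low_codebook, high_codebook):
    """Return the high-band entry paired with the nearest low-band entry.

    low_codebook:  array of shape (n_entries, n_low)
    high_codebook: array of shape (n_entries, n_high), row j paired with row j
    """
    low_vec = np.asarray(low_vec, dtype=float)
    dist = np.sum((np.asarray(low_codebook) - low_vec) ** 2, axis=1)
    j = int(np.argmin(dist))
    return np.asarray(high_codebook)[j], j
```

The returned index `j` could also select entries from other paired codebooks, for example one holding the harmonicity of the higher sub-bands.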
Bandwidth Extension. A shadow codebook for the LP gain ratio (G_8/G_16) is used for gain mapping. Phase is extrapolated assuming a linear phase for the harmonics; some random noise is added to unvoiced sub-bands. The performance of the system deteriorates in noise.
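The phase extrapolation can be sketched by fitting a straight line to the unwrapped phases of the known low-band harmonics and continuing it to the higher harmonics. The sketch assumes that the random noise for unvoiced sub-bands is added to the phase itself, which the slide does not specify; the name `extrapolate_phase` and the parameter `noise_std` are hypothetical.

```python
import numpy as np

def extrapolate_phase(low_phase, n_high, voiced_high, noise_std=1.0, rng=None):
    """Extend harmonic phases linearly from the low band to n_high harmonics.

    low_phase:   phases (radians) of harmonics 1..K from the narrowband signal
    voiced_high: boolean flags, one per extrapolated harmonic
    """
    rng = np.random.default_rng() if rng is None else rng
    k_low = np.arange(1, len(low_phase) + 1)
    # Linear fit: phase ~ slope * k + intercept
    slope, intercept = np.polyfit(k_low, np.unwrap(low_phase), 1)
    k_high = np.arange(len(low_phase) + 1, len(low_phase) + n_high + 1)
    phase = slope * k_high + intercept
    # Assumption: unvoiced harmonics receive Gaussian phase jitter
    jitter = rng.normal(0.0, noise_std, size=n_high)
    phase = phase + np.where(np.asarray(voiced_high, dtype=bool), 0.0, jitter)
    return np.angle(np.exp(1j * phase))   # wrap to (-pi, pi]
```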
Future Work. Tracking the HNM parameters using a Kalman filter. In other words, rather than tracking DFT trajectories in one frequency bin, it might be better to track only the harmonic bins (reduced computational complexity) along the harmonic frequencies (intuitively makes more sense!).
Future Work. Some harmonics proved very difficult to recover from noise (e.g. 1-3). Investigate the possibility of a model-based approach, similar to the BWE method, for estimating the parameters of those harmonics. The harmonicity of the sub-bands and the reciprocal of the noise level at those frequencies may be used as weights in the mapping process. Audio samples: clean speech, de-noised speech.
Future Work. A is a parameter vector and W is the weighting vector (e.g. the reciprocal of the normalized noise spectrum). B_j is the j-th entry of the codebook. The result can be used for reconstructing speech.
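The formula relating A, W and B_j did not survive on this slide. A plausible reading, offered here purely as an assumption, is a weighted nearest-neighbour search that picks the entry j minimising the sum over elements of W * (A - B_j)^2. The sketch below implements that reading; the name `weighted_codebook_search` is hypothetical.

```python
import numpy as np

def weighted_codebook_search(A, W, codebook):
    """Pick the codebook entry closest to A under element-wise weights W.

    Assumed distance (not confirmed by the slides):
      d_j = sum_i W[i] * (A[i] - B_j[i]) ** 2
    """
    A = np.asarray(A, dtype=float)
    W = np.asarray(W, dtype=float)
    B = np.asarray(codebook, dtype=float)
    dist = np.sum(W * (B - A) ** 2, axis=1)
    j = int(np.argmin(dist))
    return j, B[j]
```

Elements of A that are heavily corrupted by noise receive small weights, so the match is driven by the reliable elements, and the selected entry supplies values for the unreliable ones.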