Published by Eleanore Simpson. Modified over 8 years ago.
1
Presented By: Shamil. C Roll no: 68 E.I Guided By: Asif Ali Lecturer in E.I
2
Introduction; Speech measurement with LDV; Principle of LDV; Measurement setup; Problem formulation; Speech enhancement algorithm; Speckle-noise suppression; LDV-based time-frequency VAD; Spectral gain modification; Experimental results; Conclusion
3
Achieving high speech intelligibility in noisy environments is one of the most challenging and important problems for existing speech-enhancement and speech-recognition systems. Recently, several approaches have been proposed that make use of auxiliary nonacoustic sensors, such as bone and throat microphones. A major drawback of most existing sensors is that they require physical contact between the sensor and the speaker. Here we present an alternative approach that enables remote measurement of speech using an auxiliary laser Doppler vibrometer (LDV) sensor.
5
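The Doppler frequency shift is $f_d(t) = 2\nu(t)\cos(\alpha)/\lambda$, where $\nu(t)$ is the instantaneous throat-vibrational velocity, $\alpha$ is the angle between the object beam and the velocity vector, and $\lambda$ is the laser wavelength. The LDV-output signal after an FM demodulator is $Z(t) = f_b + [2A_v\cos(\alpha)/\lambda]\cos(2\pi f_v t)$, (1) which corresponds to a sinusoidal vibration $\nu(t) = A_v\cos(2\pi f_v t)$ added to a constant frequency offset $f_b$.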
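As a rough numerical illustration of the Doppler relation above, the following sketch evaluates $f_d = 2\nu\cos(\alpha)/\lambda$ for a single velocity value. The function name, the 1 mm/s velocity, and the 60° angle are assumptions chosen for illustration; the 780 nm wavelength is the value quoted for the sensor in this presentation.

```python
import math

def doppler_shift_hz(velocity_m_s, angle_rad, wavelength_m):
    """Doppler shift f_d = 2 * v * cos(alpha) / lambda, in Hz."""
    return 2.0 * velocity_m_s * math.cos(angle_rad) / wavelength_m

WAVELENGTH_M = 780e-9  # laser wavelength quoted in the presentation

# Illustrative (assumed) surface velocity of 1 mm/s.
shift_head_on = doppler_shift_hz(1e-3, 0.0, WAVELENGTH_M)
shift_60deg = doppler_shift_hz(1e-3, math.radians(60), WAVELENGTH_M)

print(f"beam aligned with velocity: {shift_head_on:.1f} Hz")
print(f"beam at 60 degrees:         {shift_60deg:.1f} Hz")
```

Because of the $\cos(\alpha)$ factor, tilting the beam 60° away from the velocity vector halves the measured shift.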
7
The measurements employ the VibroMet™ 500V LDV, which consists of a remote laser-sensor head and an electronic controller. It operates at a wavelength of 780 nm and can detect vibration frequencies from DC to over 40 kHz. Its operational working distance ranges from 1 cm to 5 m.
9
Let $y(n) = x(n) + d(n)$, where $y(n)$ is the signal observed by the acoustic sensor, $x(n)$ is the speech signal, and $d(n)$ is an uncorrelated additive noise signal. In the STFT domain, $Y_{lk} = X_{lk} + D_{lk}$, where $l = 0, 1, \ldots$ is the frame index and $k = 0, 1, \ldots, N-1$ is the frequency-bin index.
10
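The STFT uses overlapping frames of $N$ samples with a framing step of $M$ samples. Let $H_{0lk}$ and $H_{1lk}$ indicate, respectively, the speech absence and presence hypotheses in the time-frequency bin $(l, k)$, i.e., $H_{0lk}: Y_{lk} = D_{lk}$ and $H_{1lk}: Y_{lk} = X_{lk} + D_{lk}$. The clean-speech estimate is obtained by applying a spectral gain to the noisy coefficient: $\hat{X}_{lk} = G_{lk} Y_{lk}$.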
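The framing and the additive model in the STFT domain can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the Hann window, the values N = 512 and M = 128, the 16 kHz sampling rate, and the synthetic signals are all assumptions.

```python
import numpy as np

def stft(signal, frame_len, hop):
    """Return an array of shape (num_frames, frame_len).

    Row l holds the N-point DFT of frame l, so entry [l, k] plays the
    role of Y_lk with k = 0, ..., N-1.
    """
    window = np.hanning(frame_len)  # assumed window; the slides do not name one
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[l * hop : l * hop + frame_len] * window for l in range(num_frames)]
    )
    return np.fft.fft(frames, axis=1)

rng = np.random.default_rng(0)
N, M = 512, 128                                        # frame length and framing step (assumed)
x = np.sin(2 * np.pi * 200 * np.arange(4096) / 16000)  # stand-in for speech
d = 0.1 * rng.standard_normal(4096)                    # additive noise
y = x + d                                              # observed signal

Y, X, D = stft(y, N, M), stft(x, N, M), stft(d, N, M)
print(Y.shape)
print(np.allclose(Y, X + D))
```

Since the STFT is linear, the time-domain relation $y(n) = x(n) + d(n)$ carries over bin by bin, which is what the final check confirms.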
11
The OM-LSA estimator minimizes the mean-squared error of the log-spectral amplitude under signal presence uncertainty, resulting in $G_{lk} = \{G_{H_{1lk}}\}^{P_{lk}} \cdot G_{\min}^{1-P_{lk}}$, where $G_{H_{1lk}}$ is a conditional gain function given $H_{1lk}$, $G_{\min} \ll 1$ is a constant attenuation factor, and $P_{lk}$ is the conditional speech presence probability.
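The OM-LSA gain is a geometric weighting of the conditional gain and the gain floor by the speech presence probability. The sketch below shows only that combination step; the function name, the conditional gain value 0.8, and the −20 dB floor are illustrative assumptions, and the conditional gain itself is taken as given.

```python
import numpy as np

def omlsa_gain(gain_h1, presence_prob, gain_min):
    """Combine G_H1 and G_min as G = G_H1**P * G_min**(1 - P)."""
    gain_h1 = np.asarray(gain_h1, dtype=float)
    p = np.asarray(presence_prob, dtype=float)
    return gain_h1 ** p * gain_min ** (1.0 - p)

G_MIN = 10 ** (-20 / 20)  # assumed gain floor of -20 dB (amplitude 0.1)

# Same conditional gain, three presence probabilities: 1, 0.5, and 0.
gains = omlsa_gain([0.8, 0.8, 0.8], [1.0, 0.5, 0.0], G_MIN)
print(gains)
```

With $P_{lk} = 1$ the gain equals the conditional gain, with $P_{lk} = 0$ it drops to the floor $G_{\min}$, and intermediate probabilities interpolate between the two on a logarithmic scale.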
12
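Denoting by $\xi_{lk}$ and $\gamma_{lk}$ the a priori and a posteriori SNRs, respectively, we get the standard OM-LSA expression of Cohen and Berdugo, $P_{lk} = \{1 + \frac{q_{lk}}{1-q_{lk}}(1+\xi_{lk})\exp(-\upsilon_{lk})\}^{-1}$, where $q_{lk}$ is the a priori probability for speech absence and $\upsilon_{lk} = \gamma_{lk}\xi_{lk}/(1+\xi_{lk})$.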
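For reference, the speech presence probability in the standard OM-LSA formulation of Cohen and Berdugo can be computed as sketched below. The symbol names `q` and `v` and the numerical inputs are assumptions for illustration; the sketch also assumes the a priori and a posteriori SNRs have already been estimated.

```python
import numpy as np

def speech_presence_probability(xi, gamma, q):
    """P = 1 / (1 + q/(1-q) * (1 + xi) * exp(-v)), with v = gamma*xi/(1+xi).

    xi    : a priori SNR
    gamma : a posteriori SNR
    q     : a priori probability of speech absence (must be < 1)
    """
    xi, gamma, q = (np.asarray(a, dtype=float) for a in (xi, gamma, q))
    v = gamma * xi / (1.0 + xi)
    return 1.0 / (1.0 + q / (1.0 - q) * (1.0 + xi) * np.exp(-v))

# Illustrative (assumed) SNR values with increasing speech-absence priors.
probs = speech_presence_probability(xi=1.0, gamma=2.0, q=[0.0, 0.5, 0.9])
print(probs)
```

A prior of $q = 0$ forces the probability to 1, and the probability decreases as the prior for speech absence grows.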
13
Speckle-Noise Suppression. The output of the speckle-noise detector is $W_l(n) = G_l Z_l(n)$, where $G_l = G_{s,\min} \ll 1$ for $I_l = 1$ (speckle noise is present) and $G_l = 1$ otherwise.
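The frame-wise attenuation above can be sketched as follows. The presentation states that the detector uses a kurtosis-based decision rule but the rule itself is not reproduced here, so the excess-kurtosis statistic, the threshold of 10, the gain value 0.01, and the synthetic frames are all assumptions for illustration.

```python
import numpy as np

def excess_kurtosis(frame):
    """Sample excess kurtosis (approximately 0 for Gaussian data)."""
    centered = frame - frame.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

def suppress_speckle(frames, threshold=10.0, gain_min=0.01):
    """Apply W_l(n) = G_l * Z_l(n) with G_l = gain_min when I_l = 1, else 1.

    The indicator I_l is set when the frame's excess kurtosis exceeds an
    assumed threshold; frames are expected to have nonzero variance.
    """
    indicators = np.array([excess_kurtosis(f) > threshold for f in frames])
    gains = np.where(indicators, gain_min, 1.0)
    return frames * gains[:, None], indicators

rng = np.random.default_rng(0)
frames = rng.standard_normal((4, 256))  # stand-in for LDV frames Z_l(n)
frames[2, 100] += 50.0                  # impulsive, speckle-like spike in frame 2

cleaned, indicators = suppress_speckle(frames)
print(indicators)
```

Impulsive frames have heavy-tailed sample distributions and therefore a large kurtosis, which is why a kurtosis test can separate them from ordinary frames.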
15
- Represents the noise-estimate bias
- Smoothed version of the power spectrum
Then, we propose the following soft-decision VAD:
17
Speech in a given frame is defined by:
We attenuate high-energy transient components to the level of the stationary background noise by updating the gain floor to:
- Stationary noise-spectrum estimate
- Smoothed noisy spectrum
19
Speckle noise was successfully attenuated from the LDV-measured signal using a kurtosis-based decision rule. A soft-decision VAD was derived in the time-frequency domain and the gain function of the OM-LSA algorithm was appropriately modified. The effectiveness of the proposed approach in suppressing highly non-stationary noise components was demonstrated.
20
I. Cohen and B. Berdugo, “Speech enhancement for nonstationary noise environment,” Signal Process., vol. 81.
T. F. Quatieri, K. Brady, D. Messing, J. P. Campbell, W. M. Campbell, M. S. Brandstein, C. J. Weinstein, J. D. Tardelli, and P. D. Gatewood, “Exploiting nonacoustic sensors for speech encoding.”
T. Dekens, W. Verhelst, F. Capman, and F. Beaugendre, “Improved speech recognition in noisy environments by using a throat microphone for accurate voicing detection,” in 18th European Signal Processing Conf. (EUSIPCO), Aalborg, Denmark, Aug. 2010, pp. 23–27.
M. Johansmann, G. Siegmund, and M. Pineda, “Targeting the limits of laser doppler vibrometry.”
http://www.metrolaserinc.com