Published by Maximillian Booker. Modified over 9 years ago.
1
Reduction of background noise in artificial larynx
M. Tech. presentation by Santosh Pratapwar
Supervisor: Prof. P. C. Pandey
EE Dept, IIT Bombay, Mar ’04
Sections: 1 Introduction, 2 Noise red. tech., 3 Spect. subtr., 4 QBNE, 5 Invest. QBNE, 6 Conc. & future work
2
Project objective
The objective of the project was to develop a signal processing technique, based on a single microphone input, to enhance speech corrupted by leakage noise.
3
Presentation overview
● Introduction
● Noise reduction techniques
● Spectral subtraction for enhancement of electrolaryngeal speech
● Quantile-based noise estimation
● Investigations with QBNE
● Conclusion & suggestions for further work
4
Natural speech production
Glottal excitation to vocal tract
5
External electronic larynx (Barney et al 1959)
Excitation to vocal tract from external vibrator
6
Problems with artificial larynx
● Difficulty in coordinating controls
● Spectrally deficient
● Unvoiced segments substituted by voiced segments
● Background noise due to leakage of acoustic energy
7
Model of noise generation
Causes of noise generation:
● Leakage of vibrations produced by vibrator membrane
● Improper coupling of vibrator to neck tissue
8
Characteristics of electrolaryngeal speech (Weiss et al 1979)
● SNR ranged over 4-25 dB across subjects
● Most of the energy concentrated in 400-800 Hz
● Second peak found between 1-2 kHz
● 2-3 additional peaks between 2-4 kHz
● Frequency and magnitude of peaks were speaker dependent
9
Methods of noise reduction
● Vibrator design
  - Acoustic shielding of vibrator (Espy-Wilson et al 1996)
  - Piezoelectric vibrators (Katsutoshi et al 1999)
● Signal processing
  - 2-input noise cancellation based on LMS algorithm (Espy-Wilson et al 1996)
  - Single input noise cancellation (Pandey et al 2002) based on spectral subtraction (Boll 1979 & Berouti et al 1979)
10
Earlier work done at IIT Bombay
● Algorithms implemented by Hiren Shah (1999): single input noise cancellation using the following methods
  - Ensemble averaging
  - Fourier transform of ensemble averaged spectrum
  - Single input LMS algorithm
● Algorithm implemented by Santosh Bhandarkar (2001): single input noise cancellation based on spectral subtraction
11
Spectral subtraction for enhancement of electrolaryngeal speech (Pandey et al 2002)
● $s(n) = e(n) * h_v(n)$, $l(n) = e(n) * h_l(n)$
● $x(n) = s(n) + l(n)$
● $X_n(e^{j\omega}) = E_n(e^{j\omega})\,[H_{vn}(e^{j\omega}) + H_{ln}(e^{j\omega})]$
● Assumption: $h_v(n)$ and $h_l(n)$ uncorrelated
  $|X_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,[\,|H_{vn}(e^{j\omega})|^2 + |H_{ln}(e^{j\omega})|^2\,]$
● Noise estimation mode, $s(n) = 0$:
  $|X_n(e^{j\omega})|^2 = |L_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,|H_{ln}(e^{j\omega})|^2$
  $|L(e^{j\omega})|^2$: averaged over many segments
● Speech enhancement mode:
  $|Y_n(e^{j\omega})|^2 = |X_n(e^{j\omega})|^2 - |L(e^{j\omega})|^2$
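The two modes on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's code: the function names, the frame length of 128 samples, the synthetic noise, and the clamping of negative differences to zero are all assumptions made here.

```python
import numpy as np

def average_noise_power(noise_frames):
    """Noise estimation mode: mean power spectrum |L(k)|^2 over
    windowed noise-only frames (one frame per row)."""
    spectra = np.fft.rfft(noise_frames, axis=1)
    return np.mean(np.abs(spectra) ** 2, axis=0)

def subtract_power(frame, noise_power):
    """Speech enhancement mode: |Y_n(k)|^2 = |X_n(k)|^2 - |L(k)|^2.
    Clamping at zero is an assumption; the slide does not say how
    negative differences are handled."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    return np.maximum(power - noise_power, 0.0)

# Synthetic stand-in for the leakage noise (assumption, for illustration).
rng = np.random.default_rng(0)
noise_frames = rng.normal(size=(50, 128))
noise_power = average_noise_power(noise_frames)
clean_power = subtract_power(noise_frames[0], noise_power)
```

The noise estimate is computed once and then reused for every later frame, which is the stationarity assumption that the later slides set out to relax.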
12
Implementation using DFT
● $|Y_n(k)|^2 = |X_n(k)|^2 - |L(k)|^2$
● $y_n(m) = \mathrm{IDFT}\,[\,|Y_n(k)|\,e^{j\angle X_n(k)}\,]$
Modified spectral subtraction (Berouti et al 1979)
● $|Y_n(k)|^\gamma = |X_n(k)|^\gamma - \alpha\,|L(k)|^\gamma$
● $|Y'_n(k)|^\gamma = |Y_n(k)|^\gamma$ if $|Y_n(k)|^\gamma > \beta\,|L(k)|^\gamma$, $= \beta\,|L(k)|^\gamma$ otherwise
  ($\alpha$: subtraction, $\beta$: spectral floor, $\gamma$: exp. factors)
● Output normalization factor for $\gamma < 1$ (Berouti et al 1979):
  $G = \{(|X_n(k)|^2 - |L(k)|^2)\,/\,|Y_n(k)|^2\}^{\cdots}$ (exponent not recoverable from the source)
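A minimal sketch of the modified subtraction rule with spectral floor, followed by resynthesis with the noisy phase. The function names are hypothetical; the defaults α=2, β=0.001, γ=1 are the values quoted on the results slides. The output normalization factor G is left out because its exponent is not recoverable from the slide.

```python
import numpy as np

def modified_subtraction(X, noise_mag, alpha=2.0, beta=0.001, gamma=1.0):
    """X: complex DFT of one frame; noise_mag: estimated |L(k)|.
    Returns the enhanced complex spectrum, reusing the phase of X."""
    x_pow = np.abs(X) ** gamma
    l_pow = noise_mag ** gamma
    y_pow = x_pow - alpha * l_pow                   # over-subtraction
    floor = beta * l_pow                            # spectral floor
    y_pow = np.where(y_pow > floor, y_pow, floor)
    y_mag = y_pow ** (1.0 / gamma)
    return y_mag * np.exp(1j * np.angle(X))

def enhance_frame(frame, noise_mag, **params):
    """DFT -> modified subtraction -> IDFT for one real-valued frame."""
    X = np.fft.rfft(frame)
    Y = modified_subtraction(X, noise_mag, **params)
    return np.fft.irfft(Y, n=len(frame))

X_demo = np.array([3 + 4j, 1 + 0j, 0.5j])
Y_demo = modified_subtraction(X_demo, np.ones(3))
```

In `Y_demo` only the first bin lies above the floor after subtraction, so the other two bins are set to β times the noise magnitude.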
13
Spectral subtraction method (Pandey et al 2002) with average based noise estimation (ABNE) during initial non-speech segment
14
Problem with spectral subtraction with ABNE
● Varying level of musical & broadband noise in the output
Further investigations with ABNE
● Effect of window positioning w. r. t. the excitation pulse
● Spectral subtraction with full-wave rectification (Pollok et al 1993)
● Extended spectral subtraction (Gustafsson et al 1999)
15
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: /a/, /i/, and /u/ using electrolarynx Servox
Figure panels: unprocessed and processed; labelled segments: noise segment, /a/, /u/, /i/
16
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx Servox
17
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx Servox
18
Extended spectral subtraction (Gustafsson et al 1999)
● Spectral subtraction without explicit calculation of phase spectrum $\angle X_n(k)$
  $Y_n(k) = |Y_n(k)|\,e^{j\angle X_n(k)} = |Y_n(k)|\,X_n(k)\,/\,|X_n(k)|$
  $y_n(m) = \mathrm{IDFT}\,[\,Y_n(k)\,]$
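The identity on this slide means the enhanced magnitude can be applied as a real gain on the noisy spectrum, so no angle or complex exponential is ever evaluated. A minimal sketch follows; the function name and the small `eps` guard against division by zero are assumptions added for illustration.

```python
import numpy as np

def apply_magnitude(X, y_mag, eps=1e-12):
    """Return |Y(k)| * X(k) / |X(k)| without computing the phase of X.
    eps (assumption) avoids dividing by zero in empty bins."""
    gain = y_mag / np.maximum(np.abs(X), eps)
    return gain * X

X_demo = np.array([3 + 4j, -2j, 0j])
y_mag_demo = np.array([1.0, 0.5, 0.2])
Y_demo = apply_magnitude(X_demo, y_mag_demo)
```

For non-empty bins the result matches `y_mag * np.exp(1j * np.angle(X))`. A bin where X is exactly zero stays zero, because it carries no phase to reuse.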
19
Extended spectral subtraction (Gustafsson et al 1999)
20
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx Servox
21
Drawback of avg. noise estimation during silence
● Two modes: noise estimation & speech enhancement
● Estimated noise considered stationary over entire speech enhancement mode
Investigations for cont. noise estimation & signal enhancement
● System with voice activity detector (Berouti et al 1979)
● Without involving speech vs non-speech detection (Stahl et al 2000, Evans et al 2002, Houwu et al 2002)
22
Quantile-based noise estimation
Basis for the technique
● During speech segments, frequency bins tend not to be permanently occupied by speech
● Speech / non-speech boundaries detected implicitly on per frequency basis
● Noise estimates updated throughout speech / non-speech periods
Implementation
● QBNE with calculation of cum. prob. dist. function during non-speech segment
● QBNE for continuous updating of noise spectrum
23
QBNE with efficient calculation of C.P.D.F.
Two modes of operation:
● Noise estimation mode:
  - DFT of windowed speech segments
  - Array of past DFT values for each frequency sample formed
  - C.P.D.F. for each frequency sample is computed
  - Quantile derived spectrum obtained from C.P.D.F.
● Speech enhancement mode:
  - Estimated noise used for spectral subtraction
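The noise estimation mode can be sketched as below. This sketch sorts each frequency sample's history rather than building an explicit C.P.D.F. That choice is an assumption made here; it gives the same quantile as reading the empirical distribution function. The function name and the index rule `floor(q * n_frames)` are also assumptions.

```python
import numpy as np

def quantile_noise_estimate(mag_history, q):
    """mag_history: (n_frames, n_bins) array of past |X_n(k)| values.
    q: quantile in [0, 1], either a scalar or one value per bin.
    Returns the quantile-derived magnitude spectrum, one value per bin."""
    sorted_mags = np.sort(mag_history, axis=0)      # sort each bin over time
    n_frames, n_bins = sorted_mags.shape
    q = np.broadcast_to(np.asarray(q, dtype=float), (n_bins,))
    idx = np.clip(np.floor(q * n_frames).astype(int), 0, n_frames - 1)
    return sorted_mags[idx, np.arange(n_bins)]

history = np.array([[5, 1], [1, 2], [3, 3], [2, 4], [4, 5]], dtype=float)
median_est = quantile_noise_estimate(history, 0.5)
per_bin_est = quantile_noise_estimate(history, [0.0, 0.9])
```

Passing one quantile per bin is what later allows frequency dependent quantile selection.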
24
QBNE for continuous updating of noise
● DFT of windowed speech segments
● FIFO array of past spectral values for each freq. sample is formed
● An efficient indexing algorithm used to sort the arrays to obtain particular quantile value:
  - A sorted value buffer and an index buffer, for each frequency sample
  - New data placed at locations of oldest data in sorted buffer by referring index buffer
  - In all sorted buffers only one value needs to be placed at correct position
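A simplified sketch of the idea for a single frequency sample. It is not the slide's exact index-buffer scheme: a `deque` records arrival order and `bisect` keeps a sorted list, so each new frame costs one removal and one insertion instead of a full sort. The class name and quantile index rule are assumptions.

```python
from bisect import bisect_left, insort
from collections import deque

class SlidingQuantile:
    """Fixed-length window over one frequency sample, kept sorted incrementally."""

    def __init__(self, size):
        self.size = size
        self.fifo = deque()        # values in arrival order
        self.sorted_vals = []      # the same values in ascending order

    def push(self, value):
        if len(self.fifo) == self.size:
            oldest = self.fifo.popleft()
            # Drop the oldest value from the sorted buffer.
            del self.sorted_vals[bisect_left(self.sorted_vals, oldest)]
        self.fifo.append(value)
        insort(self.sorted_vals, value)   # only this value is repositioned

    def quantile(self, q):
        if not self.sorted_vals:
            raise ValueError("no data in buffer")
        n = len(self.sorted_vals)
        return self.sorted_vals[min(int(q * n), n - 1)]

sq = SlidingQuantile(3)
for v in (5.0, 1.0, 3.0):
    sq.push(v)
first_median = sq.quantile(0.5)
sq.push(2.0)                       # evicts 5.0, the oldest value
second_median = sq.quantile(0.5)
```

In use, one such buffer would be kept per frequency sample and updated with the magnitude of each new frame.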
25
Spectral subtraction with QBNE
26
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx Servox
27
Investigations with QBNE
● Single quantile value
  - The quantile value giving the best visual match between the quantile derived spectrum and the avg. spectrum of noise is selected
● Two quantile values
  - Two quantiles, one for each of two frequency bands, selected so that the estimated noise is close to the avg. spectrum of noise
● Matched quantile values
  - Frequency dependent quantile selection
  - Estimated spectrum from noisy speech will be a close match to the avg. spectrum of noise
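One way matched quantiles could be computed is sketched below. For each frequency sample, it picks the rank in the sorted noisy-speech magnitudes whose value is closest to the average noise spectrum. The closest-value criterion, the function name, and the mid-rank convention `(idx + 0.5) / n_frames` are all assumptions; the slide does not state how the match was made.

```python
import numpy as np

def matched_quantiles(speech_mags, noise_avg):
    """speech_mags: (n_frames, n_bins) magnitudes from noisy speech.
    noise_avg: (n_bins,) average noise magnitude spectrum.
    Returns one quantile in (0, 1) per frequency sample."""
    sorted_mags = np.sort(speech_mags, axis=0)
    n_frames = sorted_mags.shape[0]
    # Rank whose sorted value is closest to the average noise level.
    idx = np.argmin(np.abs(sorted_mags - noise_avg), axis=0)
    return (idx + 0.5) / n_frames

mags = np.array([[1, 10], [2, 20], [3, 30], [4, 40]], dtype=float)
q_matched = matched_quantiles(mags, np.array([3.1, 12.0]))
```

The mid-rank convention keeps `floor(q * n_frames)` equal to the chosen rank even with floating point rounding.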
28
Investigations with QBNE (contd.)
● Smoothed quantile values
  - Matched quantiles were averaged using 9 frequency values
● SNR based dynamic quantiles
  - Dynamic selection of quantiles depending on signal strength
  $q(k) = [q_1(k) - q_0(k)]\;\mathrm{SNR}(k)\,/\,\mathrm{SNR}_1(k) + q_0(k)$
  $q(k) = q_0(k)$ if $q(k) < 0$; $q(k) = q_1(k)$ if $q(k) > q_1(k)$
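Both selection rules on this slide are short enough to sketch directly. The slide does not define q0, q1, or SNR1. Here they are assumed to be a lower quantile, an upper quantile, and a reference SNR. The edge padding in the 9-point average and the function names are also assumptions. The limits are applied literally as written on the slide.

```python
import numpy as np

def smooth_quantiles(q_matched, width=9):
    """Moving average of matched quantiles over `width` frequency values.
    Edge padding (assumption) keeps the output the same length."""
    kernel = np.ones(width) / width
    padded = np.pad(q_matched, width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def dynamic_quantiles(snr, snr1, q0, q1):
    """q(k) = (q1 - q0) * SNR(k) / SNR1 + q0, limited as on the slide."""
    snr = np.asarray(snr, dtype=float)
    q = (q1 - q0) * snr / snr1 + q0
    q = np.where(q < 0, q0, q)     # lower limit, as written on the slide
    q = np.where(q > q1, q1, q)    # upper limit
    return q

impulse = np.zeros(20)
impulse[10] = 0.9
smoothed = smooth_quantiles(impulse)
q_dyn = dynamic_quantiles([0, 10, 20, 40, -10], snr1=20.0, q0=0.2, q1=0.8)
```

With these illustrative numbers the quantile rises linearly from q0 at 0 dB to q1 at SNR1 and stays at q1 above it.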
29
Plot of matched quantiles, smoothed quantiles, avg. quantiles, and median quantiles
Plot of avg. power spectrum of noise and noise estimated using smoothed quantiles and avg. quantiles
30
Plot of SNR and frequency dependent quantiles for three different applications of vibrator
31
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx Servox
32
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx Servox
33
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1, N=16 ms), speaker: SP, material: question in English “Where were you a year ago?” using electrolarynx Servox
34
Results
Recorded and enhanced speech with (α=2, β=0.001, γ=1), speaker: SP, material: question-answer pair in English “What is your name? My name is Santosh” using electrolarynx NP-1, Servox, and Solatone
35
Conclusion
● QBNE technique implemented for continuous updating of noise spectrum
● Investigated different methods for selection of quantile values for noise estimation
● Results with QBNE during non-speech segment are comparable with results using ABNE
● Smoothed quantiles and SNR based quantiles resulted in better quality speech
● QBNE remains effective over longer durations
● QBNE using SNR based dynamic quantiles is effective during long pauses
36
Suggestions for further work
● Evaluation of intelligibility and quality improvement
● Selection of optimum quantile values to be investigated for different models of electrolarynx and for different users
● Study of phase re-synthesis from magnitude spectrum using cepstral method
● Real-time implementation of noise reduction technique for use in an artificial larynx
● Analysis-synthesis for introducing small amount of jitter to make speech more natural