Enhancement of Electrolaryngeal Speech by Reducing Leakage Noise Using Spectral Subtraction
by Prem C. Pandey, EE Dept, IIT Bombay
Electro Info Com'2007 / St Francis Inst. of Technology, Mumbai / 4-6 Jan '07

Abstract
The transcervical electrolarynx is a vibrator held against the neck tissue to provide excitation to the vocal tract, as a substitute for that provided by a natural larynx. It is of great help in verbal communication to a large number of laryngectomees. Its intelligibility suffers from the presence of background noise caused by leakage of acoustic energy from the vibrator. Pitch-synchronous application of the spectral subtraction method, normally used for enhancement of speech corrupted by uncorrelated random noise, can be used to reduce this self-leakage noise and thereby enhance electrolaryngeal speech. The average magnitude spectrum of the leakage noise, obtained with the lips closed, is subtracted from the magnitude spectrum of the noisy speech, and the signal is reconstructed using the original phase spectrum. However, the spectrum of the leakage noise varies because of variation in the application pressure and movement of the throat tissue. A quantile-based dynamic estimation of the noise magnitude spectrum, without the need for silence/voice detection, was found to be effective in noise reduction.

Overview
● Introduction
● Spectral subtraction for enhancement of electrolaryngeal speech
● Quantile-based noise estimation
● Results, summary, & ongoing work

Introduction 1/5: Natural speech production
Glottal excitation to the vocal tract

Introduction 2/5: External electronic larynx (Barney et al 1959)
Excitation to the vocal tract from an external vibrator

Introduction 3/5: Problems with the artificial larynx
● Difficulty in coordinating controls
● Spectrally deficient output
● Unvoiced segments substituted by voiced segments
● Background noise due to leakage of acoustic energy

Introduction 4/5: Model of noise generation
Causes of noise generation:
● Leakage of vibrations produced by the vibrator membrane
● Improper coupling of the vibrator to the neck tissue

Introduction 5/5: Methods of noise reduction
Vibrator design
● Acoustic shielding of the vibrator (Espy-Wilson et al 1996)
● Piezoelectric vibrators (Katsutoshi et al 1999)
Signal processing
● Two-input noise cancellation based on the LMS algorithm (Espy-Wilson et al 1996)
● Single-input noise cancellation (Pandey et al 2002) based on the spectral subtraction algorithm (Boll 1979; Berouti et al 1979)

Spect. subtrn. 1/4: Spectral subtraction for enhancement of electrolaryngeal speech (Pandey et al 2000)
$s(n) = e(n) * h_v(n)$,  $l(n) = e(n) * h_l(n)$
$x(n) = s(n) + l(n)$
$X_n(e^{j\omega}) = E_n(e^{j\omega})\,[H_{v,n}(e^{j\omega}) + H_{l,n}(e^{j\omega})]$
Assumption: $h_v(n)$ and $h_l(n)$ uncorrelated
$\Rightarrow |X_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,[|H_{v,n}(e^{j\omega})|^2 + |H_{l,n}(e^{j\omega})|^2]$
Noise estimation mode ($s(n) = 0$):
$|X_n(e^{j\omega})|^2 = |L_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,|H_{l,n}(e^{j\omega})|^2$
$|L(e^{j\omega})|^2$: averaged over many segments
Speech enhancement mode:
$|Y_n(e^{j\omega})|^2 = |X_n(e^{j\omega})|^2 - |L(e^{j\omega})|^2$
(contd.)
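The two modes above map onto a short routine: average the noise power spectrum over lips-closed frames, subtract it from each noisy-speech frame, and resynthesize with the original phase. The sketch below is a minimal illustration under assumed framing parameters (frame length, hop, window, sampling rate), not the implementation reported here:

```python
# Minimal sketch of the two-mode scheme (framing parameters are assumptions).
import numpy as np

FRAME = 256            # assumed frame length: 16 ms at an assumed 16 kHz rate
HOP = FRAME // 2
WIN = np.hanning(FRAME)

def frames(x):
    # Split into overlapping windowed frames; assumes len(x) >= FRAME.
    n = (len(x) - FRAME) // HOP + 1
    return np.stack([x[i * HOP : i * HOP + FRAME] * WIN for i in range(n)])

def estimate_noise(leakage):
    """Noise estimation mode: average |L(k)|^2 over lips-closed segments."""
    return np.mean(np.abs(np.fft.rfft(frames(leakage), axis=1)) ** 2, axis=0)

def enhance(noisy, noise_psd):
    """Enhancement mode: |Y|^2 = max(|X|^2 - |L|^2, 0), original phase retained."""
    X = np.fft.rfft(frames(noisy), axis=1)
    Y_mag = np.sqrt(np.maximum(np.abs(X) ** 2 - noise_psd, 0.0))
    y = np.fft.irfft(Y_mag * np.exp(1j * np.angle(X)), n=FRAME, axis=1)
    out = np.zeros(len(noisy))
    for i, f in enumerate(y):              # overlap-add resynthesis
        out[i * HOP : i * HOP + FRAME] += f
    return out
```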

Spect. subtrn. 2/4: Implementation using DFT
$|Y_n(k)|^2 = |X_n(k)|^2 - |L(k)|^2$
$y_n(m) = \mathrm{IDFT}\,[\,|Y_n(k)|\, e^{j \angle X_n(k)}\,]$
Modified spectral subtraction (Berouti et al 1979):
$|Y_n(k)|^\gamma = |X_n(k)|^\gamma - \alpha\,|L(k)|^\gamma$
$|Y_n(k)|^\gamma$ is kept if $|Y_n(k)|^\gamma \ge \beta\,|L(k)|^\gamma$; otherwise it is set to $\beta\,|L(k)|^\gamma$
($\alpha$: subtraction factor, $\beta$: spectral floor factor, $\gamma$: exponent factor)
Output normalization factor for $\gamma < 1$ (Berouti et al 1979):
$G = \{(|X_n(k)|^2 - |L(k)|^2)\,/\,|Y_n(k)|^2\}^{1/\gamma}$
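A per-frame sketch of the modified subtraction rule is given below, using the α = 2, β = 0.001, γ = 1 values quoted in the Results slides as defaults. It is an illustrative assumption, not the authors' code; noise_mag is assumed to hold the estimated noise magnitude spectrum for the rfft bins of the frame.

```python
# Modified spectral subtraction (Berouti et al 1979 style) for one frame.
import numpy as np

def berouti_subtract(x_frame, noise_mag, alpha=2.0, beta=0.001, gamma=1.0):
    """Subtract alpha*|L(k)|^gamma with a beta spectral floor; keep the noisy phase.
    noise_mag must have len(x_frame)//2 + 1 elements (rfft bins)."""
    X = np.fft.rfft(x_frame)
    mag, phase = np.abs(X), np.angle(X)
    sub = mag ** gamma - alpha * noise_mag ** gamma      # |X|^g - a*|L|^g
    floor = beta * noise_mag ** gamma                    # spectral floor b*|L|^g
    Y_mag = np.maximum(sub, floor) ** (1.0 / gamma)
    return np.fft.irfft(Y_mag * np.exp(1j * phase), n=len(x_frame))
```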

Spect. subtrn. 3/4: Spectral subtraction method with ABNE (Pandey et al 2002)

Spect. subtrn. 4/4: Drawback of averaged noise estimation during silence
● Two modes: noise estimation & speech enhancement
● Estimated noise considered stationary over the entire speech enhancement mode
● Some musical & broadband noise in the output
Investigations for continuous noise estimation & signal enhancement
● System with a voice activity detector (Berouti et al 1979)
● Without involving speech vs. non-speech detection (Stahl et al 2000, Evans et al 2002, Houwu et al 2002)

QBNE 1/6: Quantile-based noise estimation
Basis for the technique
● Even during speech segments, individual frequency bins tend not to be permanently occupied by speech
● Speech / non-speech boundaries are detected implicitly on a per-frequency basis
● Noise estimates are updated throughout non-speech and speech periods

QBNE 2/6: Implementation of QBNE
● DFT of windowed speech segments
● A FIFO array of past spectral values is formed for each frequency sample
● An efficient indexing algorithm is used to sort the arrays and obtain a particular quantile value (see the sketch below):
  – A sorted-value buffer and an index buffer are maintained for each frequency sample
  – New data replaces the oldest data in the sorted buffer, located by referring to the index buffer
  – In each sorted buffer, only one value needs to be placed at its correct position
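The buffer scheme can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name, FIFO depth, and quantile value are assumptions, and the index-buffer bookkeeping is replaced by a simple remove-and-insert for brevity (still one insertion per bin per frame, as described above).

```python
# Minimal QBNE sketch: per-bin FIFO of recent magnitudes, kept sorted so the
# chosen quantile can be read off directly as the noise estimate for that bin.
import numpy as np
from bisect import insort

class QuantileNoiseEstimator:            # hypothetical name
    def __init__(self, n_bins, depth=100, quantile=0.5):
        self.depth = depth               # FIFO length (assumed)
        self.q = quantile                # quantile used as the noise estimate
        self.fifo = [[] for _ in range(n_bins)]     # oldest-to-newest per bin
        self.sorted = [[] for _ in range(n_bins)]   # same values, kept sorted

    def update(self, mag_spectrum):
        """Push one frame's magnitude spectrum; return the current noise estimate."""
        est = np.empty(len(mag_spectrum))
        for k, v in enumerate(mag_spectrum):
            fifo, srt = self.fifo[k], self.sorted[k]
            if len(fifo) == self.depth:
                srt.remove(fifo.pop(0))  # drop the oldest value (index buffer omitted)
            fifo.append(v)
            insort(srt, v)               # single insertion keeps the buffer sorted
            est[k] = srt[int(self.q * (len(srt) - 1))]
        return est
```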

QBNE 3/6: Spectral subtraction with QBNE

QBNE 4/6: Investigations with QBNE
● Single quantile value – the quantile giving the best visual match between the quantile-derived spectrum and the averaged noise spectrum is selected
● Two quantile values – two quantiles, one for each of two frequency bands, estimating the noise close to the averaged noise spectrum, are selected
● Frequency-dependent quantile values – the spectrum estimated from the noisy speech closely matches the averaged noise spectrum (see the sketch below)
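One way to obtain such frequency-dependent quantiles is to calibrate them against an averaged noise spectrum: for each bin, pick the quantile of the recent noisy-speech magnitudes whose value is closest to the averaged noise level in that bin. The routine below is an assumed illustration of this matching step, not the selection method actually used; the candidate grid is arbitrary.

```python
# Hypothetical matching step for frequency-dependent quantile selection.
import numpy as np

def match_quantiles(noisy_mags, avg_noise_mag,
                    candidates=np.arange(0.1, 0.95, 0.05)):
    """noisy_mags: (frames, bins) magnitude spectra; avg_noise_mag: (bins,)."""
    q_per_bin = np.empty(noisy_mags.shape[1])
    for k in range(noisy_mags.shape[1]):
        est = np.quantile(noisy_mags[:, k], candidates)   # candidate noise estimates
        q_per_bin[k] = candidates[np.argmin(np.abs(est - avg_noise_mag[k]))]
    return q_per_bin
```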

1 Introduction1 Introduction 2 Spectral subtraction 3 QBNE 4 Results 5 Conclusion, & future work2 Spectral subtraction 3 QBNE4 Results5 Conclusion, & future work IIT Bombay P.C. Pandey / EE Dept / IIT Bombay QBNE 5/6 Investigations with QBNE (Contd..) ● Smoothened quantile values - Matched quantiles were averaged using 9 frequency values ● SNR based dynamic quantiles - Dynamic selection of quantiles depending on signal strength q(k) = [(q 1 (k) - q 0 (k)) SNR (k) / SNR 1 (k)] + q 0 (k) q 0 (k) if q (k) < 0 q 1 (k) if q (k) > q 1 (k) 17

QBNE 6/6: Plot of SNR and frequency-dependent quantiles for three different applications of the vibrator (x-axis: frequency sample)

Results 1/3: Enhancement results
Recorded and enhanced speech (α = 2, β = 0.001, γ = 1, N = 16 ms); speaker: SP; material: /a/, /i/, and /u/ using the Servox electrolarynx
[Panels: noise segment, /a/, /u/, /i/ — unprocessed vs. processed]

Results 2/3: Enhancement results
Recorded and enhanced speech (α = 2, β = 0.001, γ = 1, window length = 16 ms); speaker: SP; material: question-answer pair in English, "What is your name? My name is Santosh", using the Servox electrolarynx

Results 3/3: Enhancement results
Recorded and enhanced speech (α = 2, β = 0.001, γ = 1); speaker: SP; material: question-answer pair in English, "What is your name? My name is Santosh", using the NP-1, Servox, and Solatone electrolarynx models

Conclusion 1/2: Conclusion
● The QBNE technique was implemented for continuous updating of the noise spectrum, and different methods for selecting the quantile values for noise estimation were investigated
● Results with QBNE during non-speech segments are comparable with those using ABNE
● Smoothed quantiles and SNR-based quantiles resulted in better quality speech
● QBNE is effective over longer durations
● QBNE using SNR-based dynamic quantiles is effective during long pauses

Conclusion 2/2: Ongoing work
● Evaluation of intelligibility and quality improvement
● Selection of optimum quantile values for different electrolarynx models and users
● Phase resynthesis from the magnitude spectrum using a cepstral method
● Real-time implementation of the noise reduction using an ADSP-BF533 board
● Analysis-synthesis for introducing a small amount of jitter to improve naturalness
