P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti / DSP 2009 (Santorini, Greece, 5-7 July 2009), Session: S4P

DSP 2009 (Santorini, Greece, 5-7 July 2009), Session: S4P, Paper: S4P.1
Multi-Band Frequency Compression for Sensorineural Hearing Impairment
P. N. Kulkarni (1), P. C. Pandey (2), D. S. Jangamashetti (3)
(1, 2) IIT Bombay, India
(3) Basaveshwar Engg. College, Bagalkot, Kar., India

ABSTRACT
Sensorineural hearing loss is associated with widening of the auditory filters, leading to increased spectral masking and degraded speech perception. Multi-band frequency compression can be used to reduce the effect of spectral masking. The speech spectrum is divided into a number of bands, and the spectral samples in each band are compressed towards the band center by a constant compression factor. In the present study, we investigated the effectiveness of the scheme in improving speech perception for different compression factors. Evaluation using the modified rhyme test showed the maximum improvement in recognition scores for a compression factor of 0.6: about 17 % for the normal-hearing subjects under simulated hearing loss, and 6 – 21 % for the subjects with moderate to severe sensorineural hearing loss.

INTRODUCTION
Sensorineural hearing loss
▪ Widening of auditory filters, resulting in increased spectral masking and degradation in speech perception.
Multi-band frequency compression for reducing the effect of increased spectral masking
▪ Spectrum divided into a number of bands and spectral components in each band compressed towards the center, for enhancing the spectral contrasts.

Earlier work
Critical band based compression [Yasu et al., 2002, 2004]
▪ Magnitude spectrum compressed towards the center of each critical band and combined with the unaltered phase spectrum (segmentation with Hamming window, STFT, spectral modification, and overlap-add synthesis).
▪ Moderate improvement in the VCV recognition score for hearing-impaired subjects (unproc. 35.4%, proc. 38.3%).
Objective of the investigation
▪ To select the most appropriate combination of segmentation, bandwidth, frequency mapping, and compression factor for analysis-synthesis.

SIGNAL PROCESSING
Segmentation
▪ Fixed-frame: 20 ms frames with 50% overlap.
▪ Pitch-synchronous: frames of two local pitch periods, aligned to glottal closure instants (GCIs), with one pitch period overlap.
Spectral analysis and modification
▪ Zero padding, 1024-point FFT.
▪ Compression of complex spectral samples in a set of predefined bands towards the band center by a fixed compression factor (CF).
Resynthesis
▪ IFFT and overlap-add.
[Figure: block diagram from input speech to processed speech]
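The fixed-frame chain above (segmentation, zero-padded FFT, band-wise compression, IFFT, overlap-add) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the periodic Hann window, the constant-bandwidth bands, the nearest-bin superimposition mapping, and the names `compress_spectrum` and `multiband_compress` are all choices made here for brevity.

```python
import numpy as np

def compress_spectrum(X, band_edges, cf):
    """Move each complex bin towards its band centre by the factor cf.

    Assumption: nearest-bin mapping; bins that land on the same output
    index are summed (superimposition of spectral samples).
    """
    Y = np.zeros_like(X)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        k = np.arange(lo, hi)
        centre = 0.5 * (lo + hi - 1)
        k_out = np.rint(centre + cf * (k - centre)).astype(int)
        np.add.at(Y, k_out, X[k])
    return Y

def multiband_compress(x, fs, cf=0.6, n_bands=18, frame_ms=20.0, n_fft=1024):
    """Fixed-frame analysis-synthesis: 20 ms frames, 50% overlap, 1024-point FFT."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = frame_len // 2
    # Periodic Hann window (assumed): overlapping copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]
    n_bins = n_fft // 2 + 1
    # Constant-bandwidth bands (assumed); other band layouts plug in here.
    band_edges = np.linspace(0, n_bins, n_bands + 1).astype(int)
    y = np.zeros(len(x) + n_fft)
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * window
        X = np.fft.rfft(frame, n_fft)              # zero-padded FFT
        Y = compress_spectrum(X, band_edges, cf)   # spectral modification
        y[start:start + n_fft] += np.fft.irfft(Y, n_fft)  # IFFT + overlap-add
    return y[:len(x)]
```

With `cf=1.0` the mapping is the identity, so the output should match the input away from the first and last half-frame, which gives a quick sanity check of the overlap-add stage.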

Factors affecting quality & intelligibility
▪ Segmentation
▪ Bandwidth
▪ Frequency mapping
▪ Comp. factor
Bandwidth
▪ Constant bandwidth (no. of bands: 2 – 18)
▪ 1/3 octave bandwidth
▪ Auditory critical bandwidth (ACB)
[Figure: BW = ACB, comp. factor = 0.6]
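The three bandwidth options could be turned into band edges roughly as below. The slide does not list the edges used, so everything here is an assumption: the critical-band edges are the commonly tabulated Zwicker values, the 1/3-octave bands start at a hypothetical 100 Hz, and the function names are illustrative.

```python
import numpy as np

# Commonly tabulated critical-band (Bark) edges in Hz; assumed, not from the slides.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
                 9500, 12000, 15500]

def constant_band_edges(f_max, n_bands):
    """Equal-width bands from 0 Hz to f_max."""
    return np.linspace(0.0, f_max, n_bands + 1)

def third_octave_band_edges(f_max, f_start=100.0):
    """Edges spaced by a factor of 2**(1/3), with one band below f_start."""
    edges = [0.0, f_start]
    while edges[-1] * 2 ** (1 / 3) < f_max:
        edges.append(edges[-1] * 2 ** (1 / 3))
    edges.append(f_max)
    return np.array(edges)

def critical_band_edges(f_max):
    """Tabulated critical-band edges below f_max, closed off at f_max."""
    edges = [float(f) for f in BARK_EDGES_HZ if f < f_max]
    edges.append(float(f_max))
    return np.array(edges)

def hz_to_bin(edges_hz, fs, n_fft=1024):
    """Convert edge frequencies to the nearest FFT bin indices."""
    return np.rint(np.asarray(edges_hz) * n_fft / fs).astype(int)
```

For example, `hz_to_bin(critical_band_edges(fs / 2), fs)` yields bin edges that could drive a band-wise compression loop over a 1024-point spectrum.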

Frequency mapping
▪ Sample-to-sample
▪ Superimposition of spectral samples
▪ Spectral segment
Spectral segment mapping
▪ Output spectral sample = weighted sum of the complex spectral samples in the input frequency segment [a, b] corresponding to the output sample k'.
▪ m, n: first and last FFT indices in the segment [a, b].
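One way to read spectral segment mapping is sketched below for a single band. The slide's formula did not survive in the transcript, so the segment limits and weights are assumptions made here: each output index k' is taken to collect the input segment of width 1/CF around its back-projected position, and each input bin is weighted by the fraction of its unit-width interval that falls inside [a, b]. The name `segment_mapping` is hypothetical.

```python
import numpy as np

def segment_mapping(X, lo, hi, cf):
    """Compress bins lo..hi-1 of X towards the band centre.

    Assumed weighting: input bin j occupies [j - 0.5, j + 0.5] and
    contributes in proportion to its overlap with the segment [a, b].
    Returns the hi - lo output samples of the band.
    """
    centre = 0.5 * (lo + hi - 1)
    Y = np.zeros(hi - lo, dtype=complex)
    for i, k_out in enumerate(range(lo, hi)):
        # Back-project the output bin's interval to the input axis.
        a = centre + (k_out - 0.5 - centre) / cf
        b = centre + (k_out + 0.5 - centre) / cf
        # Keep the segment inside the band.
        a, b = max(a, lo - 0.5), min(b, hi - 0.5)
        for j in range(lo, hi):
            overlap = min(b, j + 0.5) - max(a, j - 0.5)
            if overlap > 0:
                Y[i] += overlap * X[j]
    return Y

# A four-bin band with CF = 0.5: pairs of input bins merge near the centre.
X = np.array([1.0, 2.0, 3.0, 4.0], dtype=complex)
print(segment_mapping(X, 0, 4, 0.5))
```

Under this weighting every input bin is counted exactly once across the band, so the sum of the complex samples in the band is preserved.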

Processing example
▪ /aka/: (a) unprocessed, (b) processed (fixed-frame segmentation, spectral segment mapping, ACB, CF = 0.6).
▪ Harmonic structure in voiced segments and randomness in unvoiced segments approximately preserved.
[Figure: panels (a) and (b) for /aka/]
MOS tests (normal-hearing subjects)
▪ Highest scores for pitch-synch. segmentation, ACB, spectral seg. mapping [Kulkarni et al., Int. J. Speech Tech., vol. 10, pp , 2007]

LISTENING TESTS
Modified Rhyme Test (MRT) for quantitative evaluation of speech intelligibility
▪ 300 CVC words, presented in six test lists with a carrier phrase.
▪ Automated test procedure for randomized presentation and recording of response and response time.
Experiment I
▪ 6 normal-hearing subjects with simulated hearing loss: broadband masking noise added to the processed speech, with SNR constant on a short-time (10 ms) basis.
▪ SNR: ∞, 6, 3, 0, -3, -6, -9, -12, and -15 dB.
▪ 10,800 presentations per subject (300 words × 4 comp. factors × 9 SNR).
Experiment II
▪ 11 subjects with moderate to severe sensorineural hearing loss (without using their hearing aids).
▪ 1,200 presentations per subject (300 words × 4 comp. factors).
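The masking condition of Experiment I, noise added so that the SNR stays constant over each 10 ms segment, can be illustrated as follows. This is a sketch under assumptions: white Gaussian noise stands in for the broadband masker, segments are non-overlapping, and the function name `add_noise_constant_short_time_snr` is made up for this example.

```python
import numpy as np

def add_noise_constant_short_time_snr(speech, fs, snr_db, seg_ms=10.0, seed=0):
    """Add noise whose level tracks the speech level in each short segment.

    Assumptions: white Gaussian noise, non-overlapping segments of seg_ms.
    """
    rng = np.random.default_rng(seed)
    speech = np.asarray(speech, dtype=float)
    noise = rng.standard_normal(len(speech))
    seg_len = int(round(fs * seg_ms / 1000.0))
    out = speech.copy()
    for start in range(0, len(speech), seg_len):
        sl = slice(start, start + seg_len)
        p_speech = np.mean(speech[sl] ** 2)
        p_noise = np.mean(noise[sl] ** 2)
        if p_speech == 0 or p_noise == 0:
            continue  # silent segment: leave it untouched
        # Scale the noise so that p_speech / (gain**2 * p_noise) hits the target.
        gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
        out[sl] += gain * noise[sl]
    return out
```

Passing `snr_db=np.inf` drives the gain to zero and returns the speech unchanged, matching the SNR = ∞ condition in the list above.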

4. RESULTS
Exp. I: Recognition scores (avg. across 6 normal-hearing subjects)
▪ Processing improved recognition scores for SNR < 0 dB.
▪ Best improvements observed for CF = 0.6 (p < 0.001).
▪ Avg. improvement of 17 % in recognition score for SNR < -6 dB.
▪ SNR advantage of 6 dB at about 60% recognition score.

Exp. II: Recognition scores (for 11 hearing-impaired subjects)
Improvement in % R. S., one range per compression factor: 2 – 8, 6 – 21, 3 – 16
(the 6 – 21 range is for CF = 0.6)

CONCLUSION
For both groups of subjects, max. improvement in recognition scores for CF = 0.6
● Normal-hearing subjects
▪ Avg. improvement of 17 % in recognition score for SNR < -6 dB.
▪ SNR advantage of 6 dB.
● Hearing-impaired subjects
▪ Improvement in % recognition score: 6 – 21.