Presentation transcript:

Slide 1: USE OF HARMONIC PLUS NOISE MODEL FOR REDUCTION OF SELF LEAKAGE IN ELECTROLARYNGEAL SPEECH
ICSCI 2004, Hyderabad, India, 12-15 Feb 2004
Parveen K. Lehana¹, Prem C. Pandey², Santosh S. Pratapwar², Rockey Gupta¹
¹ University of Jammu, India; ² IIT Bombay, India

Slide 2: ABSTRACT
An artificial larynx is an assistive device that provides excitation to the vocal tract as a substitute for a dysfunctional or removed larynx. The speech generated with an electrolarynx, an external vibrator held against the neck tissue, is unnatural and often unintelligible because of the improper shape of the excitation pulses and the presence of background noise caused by sound leakage from the vibrator. The objective of this paper is to enhance the intelligibility of electrolaryngeal speech by reducing the background noise using the harmonic plus noise model (HNM). The alaryngeal speech and the leakage signal are analyzed using HNM, and the average harmonic spectrum of the leakage noise is subtracted from the harmonic magnitude spectrum of the noisy speech in each frame. HNM synthesis is carried out retaining the original phase spectra. Investigations show that the output is more natural and intelligible than both the input speech signal and the enhanced signal obtained from spectral subtraction without HNM analysis and synthesis.

Slide 3: PRESENTATION OVERVIEW
• Introduction
• HNM analysis / synthesis
• Spectral subtraction with HNM
• Methodology
• Results
• Conclusion & future plan

Slide 4: INTRODUCTION (1/5)
NATURAL SPEECH PRODUCTION
Glottal excitation to the vocal tract

Slide 5: INTRODUCTION (2/5)
If the excitation is e(n) and the vocal tract impulse response is h_v(n), the output speech is
s(n) = e(n) * h_v(n)
or, in the frequency domain,
S(e^jω) = E(e^jω) H_v(e^jω)
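
As a rough illustration of this source-filter view (not taken from the slides), the Python sketch below convolves an impulse-train excitation with a hypothetical two-resonance vocal-tract filter; the sampling rate, pitch, and resonance values are assumptions chosen only for demonstration.

```python
import numpy as np
from scipy.signal import lfilter

fs = 8000                        # sampling rate in Hz (assumed)
f0 = 100                         # pitch in Hz (assumed)
n = np.arange(int(0.5 * fs))     # 0.5 s of samples

# Impulse-train excitation e(n) at the pitch period (crude stand-in for glottal pulses)
e = np.zeros(len(n))
e[::fs // f0] = 1.0

# Hypothetical all-pole vocal-tract filter h_v(n) with resonances near 500 Hz and 1500 Hz
poles = []
for fc, bw in [(500, 100), (1500, 150)]:
    r = np.exp(-np.pi * bw / fs)
    poles += [r * np.exp(1j * 2 * np.pi * fc / fs),
              r * np.exp(-1j * 2 * np.pi * fc / fs)]
a = np.real(np.poly(poles))      # denominator coefficients of H_v(z)

# Output speech s(n) = e(n) * h_v(n), implemented as IIR filtering of the excitation
s = lfilter([1.0], a, e)
```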

Slide 6: INTRODUCTION (3/5)
External electronic larynx (transcervical electrolarynx)
Excitation to the vocal tract from an external vibrator (creates background noise)

Slide 7: INTRODUCTION (4/5)
External electronic larynx (transcervical electrolarynx)
Leakage paths:
• back side of the membrane/plate
• improper tissue coupling

Slide 8: INTRODUCTION (5/5)
RESEARCH OBJECTIVE
The objective of this paper is to enhance the intelligibility of electrolaryngeal speech by reducing the background noise using the harmonic plus noise model (HNM).

Slide 9: HNM ANALYSIS / SYNTHESIS (1/3)
HARMONIC PLUS NOISE MODEL (Stylianou, 1995; 2001)
The speech signal is divided into a harmonic part and a noise part.
Parameters:
• maximum voiced frequency
• V/UV decision & pitch
• harmonic amplitudes & phases
• noise parameters
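
As an informal aid (not reproduced from the slides), here is a minimal Python sketch of how the harmonic part of one HNM frame can be synthesized from the pitch, harmonic amplitudes, and phases, keeping only harmonics below the maximum voiced frequency; the sampling rate, frame length, and example parameter values are assumptions.

```python
import numpy as np

def synthesize_harmonic_frame(f0, amplitudes, phases, fs=8000, frame_len=160, fmax_voiced=4000.0):
    """Sum-of-sinusoids harmonic part of one HNM frame (illustrative sketch).

    f0          : pitch in Hz
    amplitudes  : amplitude of each harmonic (k = 1, 2, ...)
    phases      : phase of each harmonic in radians
    fmax_voiced : maximum voiced frequency; harmonics above it are dropped
    """
    t = np.arange(frame_len) / fs
    frame = np.zeros(frame_len)
    for k, (a_k, phi_k) in enumerate(zip(amplitudes, phases), start=1):
        f_k = k * f0
        if f_k > fmax_voiced:            # keep only harmonics in the voiced band
            break
        frame += a_k * np.cos(2 * np.pi * f_k * t + phi_k)
    return frame

# Example: 100 Hz pitch with a few decaying harmonics (assumed values)
harm = synthesize_harmonic_frame(100.0, [1.0, 0.6, 0.3, 0.15], [0.0, 0.5, 1.0, 1.5])
```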

Slide 10: ANALYSIS / SYNTHESIS WITH HNM (2/3)
ANALYSIS
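
The original slide presents the HNM analysis stage as a block diagram that is not reproduced in this transcript. As an illustration of what harmonic analysis typically involves (not necessarily the authors' exact procedure), the sketch below estimates harmonic amplitudes and phases for one frame by least squares, given the pitch; the sampling rate and maximum voiced frequency are assumptions.

```python
import numpy as np

def estimate_harmonics(frame, f0, fs=8000, fmax_voiced=4000.0):
    """Least-squares estimate of harmonic amplitudes and phases for one frame,
    given the pitch f0 (illustrative sketch; windowing details are omitted)."""
    n_harm = int(fmax_voiced // f0)
    t = np.arange(len(frame)) / fs
    # Design matrix with a cosine and a sine column for each harmonic of f0
    cols = []
    for k in range(1, n_harm + 1):
        cols.append(np.cos(2 * np.pi * k * f0 * t))
        cols.append(np.sin(2 * np.pi * k * f0 * t))
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, frame, rcond=None)
    c, s = coef[0::2], coef[1::2]
    amplitudes = np.hypot(c, s)
    phases = np.arctan2(-s, c)   # so that a*cos(wt + phi) = c*cos(wt) + s*sin(wt)
    return amplitudes, phases
```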

Slide 11: ANALYSIS / SYNTHESIS WITH HNM (3/3)
SYNTHESIS

Slide 12: SPECTRAL SUBTRACTION WITH HNM
Signal model: x(n) = e(n) * h_v(n) + e(n) * h_l(n)
Taking the DFT: X_n(e^jω) = E_n(e^jω) [ H_vn(e^jω) + H_ln(e^jω) ]
Assumption: h_v(n) and h_l(n) are uncorrelated, so
|X_n(e^jω)|² = |E_n(e^jω)|² [ |H_vn(e^jω)|² + |H_ln(e^jω)|² ]
During non-speech segments the speech component s(n) = 0, so
|X_n(e^jω)|² = |L_n(e^jω)|² = |E_n(e^jω)|² |H_ln(e^jω)|²
|L(e^jω)|² is obtained by averaging over many non-speech segments.
Subtraction, per harmonic:
|Y_n(k)|^γ = |X_n(k)|^γ − α |L(k)|^γ
|Y_n(k)|^γ is retained if it is ≥ β |L(k)|^γ; otherwise it is set to β |L(k)|^γ
(α: subtraction factor, β: spectral floor, γ: exponent)
Here n is the frame index and k is the harmonic index.
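
A minimal Python sketch of the magnitude subtraction written above, applied to the harmonic amplitudes of one frame. The placement of α, β, and γ follows the standard generalized spectral subtraction form, which is my reading of the slide; the default parameter values are the HNM-derived settings reported on the results slide, and the example magnitudes are assumptions.

```python
import numpy as np

def subtract_harmonic_noise(x_mag, noise_mag, alpha=1.0, beta=0.1, gamma=1.0):
    """Generalized spectral subtraction over the harmonic magnitudes of one frame.

    x_mag     : |X_n(k)|, harmonic magnitudes of the noisy frame
    noise_mag : |L(k)|, average harmonic magnitudes of the leakage noise
    alpha     : subtraction factor; beta: spectral floor; gamma: exponent
    """
    x_mag = np.asarray(x_mag, dtype=float)
    noise_mag = np.asarray(noise_mag, dtype=float)
    diff = x_mag**gamma - alpha * noise_mag**gamma     # |X|^γ − α|L|^γ
    floor = beta * noise_mag**gamma                    # β|L|^γ
    y_pow = np.where(diff >= floor, diff, floor)       # apply the spectral floor
    return y_pow**(1.0 / gamma)                        # back to magnitude |Y_n(k)|

# Example with assumed magnitudes for five harmonics
y_mag = subtract_harmonic_noise([1.0, 0.8, 0.5, 0.3, 0.2], [0.3, 0.25, 0.2, 0.15, 0.1])
```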

Slide 13: METHODOLOGY
STEPS FOR HNM-BASED SPECTRAL SUBTRACTION
• Analyze non-speech segments
• Obtain the average harmonic spectrum of the noise
• Analyze the noisy speech and subtract the average harmonic spectrum of the noise in each frame
• Resynthesize using the noisy-speech phase spectra
For comparison, spectral subtraction using DFT-derived magnitudes is also carried out.
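
To make the sequence of steps concrete, here is a rough end-to-end sketch rather than the authors' code: hnm_analyze and hnm_synthesize are hypothetical stand-ins for an HNM analysis/synthesis implementation, and subtract_harmonic_noise is the subtraction sketch given after Slide 12. The sketch assumes every frame carries the same number of harmonic magnitudes.

```python
import numpy as np

# Hypothetical HNM front end; a real implementation would estimate pitch,
# maximum voiced frequency, harmonic amplitudes/phases, and noise parameters.
def hnm_analyze(frames):
    """Return a list of (f0, magnitudes, phases) tuples, one per frame."""
    raise NotImplementedError

def hnm_synthesize(frame_params):
    """Return a time-domain signal synthesized from per-frame HNM parameters."""
    raise NotImplementedError

def enhance_electrolaryngeal(noisy_frames, nonspeech_frames):
    # 1. Analyze non-speech segments and average their harmonic magnitude spectra
    noise_params = hnm_analyze(nonspeech_frames)
    noise_mag = np.mean([mag for _, mag, _ in noise_params], axis=0)

    # 2. Analyze the noisy speech and subtract the average noise spectrum per frame
    enhanced = []
    for f0, mag, phase in hnm_analyze(noisy_frames):
        clean_mag = subtract_harmonic_noise(mag, noise_mag)   # sketch after Slide 12
        # 3. Keep the noisy-speech phases for resynthesis
        enhanced.append((f0, clean_mag, phase))

    # 4. Resynthesize the enhanced speech
    return hnm_synthesize(enhanced)
```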

Slide 14: RESULTS (1/2)
• Both DFT-derived and HNM-based harmonic spectral subtraction significantly reduce the background noise
• Both require empirical selection of the parameters
• DFT-derived spectral subtraction is more effective during non-speech segments
• HNM-based spectral subtraction is more effective during speech, with less musical noise and enhanced formant structure
• HNM-based spectral subtraction saves parameters and processing time

Slide 15: RESULTS (2/2)
a) Recorded speech signal
b) Processed (DFT-derived) (α = 2, β = 0.001, γ = 1)
c) Processed (HNM-derived) (α = 1, β = 0.1, γ = 1)

Slide 16: CONCLUSION
The HNM-based method provides effective subtraction of noise during speech and hence can be used for improving the intelligibility of electrolaryngeal speech.
FURTHER PLAN
• QBNE combined with HNM-based spectral subtraction
• Phase resynthesis from the enhanced magnitude spectrum
• Effect of artificial jitter in pitch on speech quality
