UNIT-IV

Introduction The speech signal is generated by a system. Generation is via excitation of the system. Speech travels through various media. The nature of speech may deviate due to issues associated with the system, the excitation or the media. The speech signal characteristics change, and hence performance is poor.

Different Sources of Degradation Internally or externally induced stress. Excitation source or vocal tract system disorders. Sensor and channel mismatches. Background noise present in the media. Reverberation present in the media. Other speakers’ speech present in the media. Speech signal degrades due to one or more of the above.

Scope of Speech Enhancement Degradations present in the media: background noise, reverberation and other speakers’ speech. The nature of the degradation is different in each case. The issues to be addressed are different. Noisy speech enhancement. Enhancement of reverberant speech. Processing multispeaker speech.

Approaches for Speech Enhancement Two schools of thought! Estimate the degradation and remove it from the degraded speech. Use speech-specific knowledge and enhance the speech components in the degraded speech. Both have their own merits. An intelligent combination benefits from both.

Human Way of Speech Enhancement Two ears, binaural mechanism. Selective attention. Cognitive processing. Focus on speech components. Request to repeat. We experience it, but do not fully understand it!

Noisy Speech Enhancement sd(n) = s(n) + d(n), where s(n) is clean speech, d(n) is background noise and sd(n) is noisy speech. Note: d(n) combines additively with s(n), hence the name additive background noise. This is the model assumed for noisy speech. Examples of additive background noise: white noise, colored noise, factory noise, babble noise... How to achieve enhancement in the case of noisy speech?
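The additive model can be tried out numerically. The following is a minimal sketch, assuming NumPy, a sine wave as a stand-in for clean speech, and white Gaussian noise; the function name `add_noise_at_snr` and the 8 kHz sampling rate are illustrative choices, not part of the slides.

```python
import numpy as np

def add_noise_at_snr(s, d, snr_db):
    """Return sd(n) = s(n) + g*d(n), with g chosen to give the requested SNR in dB."""
    s = np.asarray(s, dtype=float)
    d = np.asarray(d, dtype=float)
    p_s = np.mean(s ** 2)          # average power of the clean signal
    p_d = np.mean(d ** 2)          # average power of the raw noise
    g = np.sqrt(p_s / (p_d * 10 ** (snr_db / 10)))
    return s + g * d, g

rng = np.random.default_rng(0)
n = np.arange(8000)
s = np.sin(2 * np.pi * 200 * n / 8000)   # stand-in for clean speech s(n)
d = rng.standard_normal(n.size)           # white noise d(n)
sd, g = add_noise_at_snr(s, d, snr_db=10.0)
```

Colored, factory or babble noise would enter the same way; only the sequence passed as `d` changes.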

Enhancement by Estimating Degradation

Spectral Subtraction based Noisy Speech Enhancement We assume that the noise is additive and uncorrelated with the desired signal.
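A minimal single-frame sketch of spectral subtraction follows, assuming NumPy. Several choices here are assumptions rather than anything stated on the slide: magnitude (not power) subtraction, reuse of the noisy phase, a spectral floor of 1% of the noisy magnitude, and a noise estimate averaged over frames presumed to contain noise only. The function names are hypothetical.

```python
import numpy as np

def estimate_noise_mag(noise_frames):
    """Average magnitude spectrum over frames assumed to contain noise only."""
    return np.mean([np.abs(np.fft.rfft(f)) for f in noise_frames], axis=0)

def spectral_subtract(frame, noise_mag, floor=0.01):
    """Subtract a noise magnitude estimate from one frame's spectrum."""
    spectrum = np.fft.rfft(frame)
    mag = np.abs(spectrum)
    phase = np.angle(spectrum)     # the noisy phase is kept unchanged
    # Floor the result so the magnitude never becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))

rng = np.random.default_rng(1)
noise_frames = [0.1 * rng.standard_normal(256) for _ in range(10)]
noise_mag = estimate_noise_mag(noise_frames)
t = np.arange(256)
noisy = np.sin(2 * np.pi * 10 * t / 256) + 0.1 * rng.standard_normal(256)
enhanced = spectral_subtract(noisy, noise_mag)
```

A practical system would apply this to overlapping windowed frames and recombine them by overlap-add; that step is omitted to keep the sketch short.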

Enhancement using Speech-Specific Knowledge Noisy speech model: sd(n) = s(n) + d(n). Let w(n) be a weight function estimated using speech-specific knowledge. w(n) gives more emphasis to the speech-specific high-SNR regions. Enhanced speech: s̃(n) = sd(n) × w(n) = (s(n) + d(n)) × w(n).
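The slide does not say how w(n) is obtained. As a purely illustrative stand-in, the sketch below (assuming NumPy) uses a smoothed short-time energy envelope scaled to the range [w_min, 1], on the assumption that high-energy regions are the high-SNR ones. The names `energy_weight` and `enhance_with_weight` and the parameters `frame_len` and `w_min` are hypothetical.

```python
import numpy as np

def energy_weight(sd, frame_len=200, w_min=0.1):
    """Hypothetical w(n): smoothed short-time energy, scaled to [w_min, 1]."""
    sd = np.asarray(sd, dtype=float)
    kernel = np.ones(frame_len) / frame_len
    energy = np.convolve(sd ** 2, kernel, mode="same")
    peak = energy.max()
    if peak == 0:
        return np.full(sd.shape, w_min)
    return w_min + (1.0 - w_min) * energy / peak

def enhance_with_weight(sd, w):
    """Enhanced speech: sd(n) multiplied sample by sample with w(n)."""
    return np.asarray(sd, dtype=float) * w

rng = np.random.default_rng(2)
n = np.arange(4000)
# First half: noise only. Second half: a tone (stand-in for speech) plus noise.
s = np.where(n >= 2000, np.sin(2 * np.pi * 150 * n / 8000), 0.0)
sd = s + 0.05 * rng.standard_normal(n.size)
w = energy_weight(sd)
s_tilde = enhance_with_weight(sd, w)
```

In this toy signal the weights stay near `w_min` in the noise-only half and approach 1 where the tone is present.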

Enhancement by Comb Filter There are many algorithms for the enhancement of noise-corrupted speech. Specific properties of voiced speech signals, which can be considered quasi-harmonic signals, are exploited here. The voiced speech signal x(t) can be considered as a sum of sine waves whose frequencies are integer multiples of the fundamental frequency F0.
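The formula for this harmonic model did not survive in the transcript. One common way to write such a model, with the amplitudes A_k and phases φ_k introduced here as assumed notation, is:

```latex
x(t) = \sum_{k=1}^{N} A_k \sin\left( 2\pi k F_0 t + \varphi_k \right)
```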

Cont’d The number N is the assumed number of harmonics of the voiced speech signal. A comb filter is a filter with multiple pass bands and stop bands. For transmitting only the harmonic components of the speech signal, the pass bands must be centered at multiples of the speech fundamental frequency, i.e. the frequency response of the comb filter has to be a periodic function with period equal to the fundamental frequency.

Cont’d Because voiced speech signals have a time-varying fundamental frequency, the comb filter for the enhancement of voiced speech has to be an adaptive filter tuned by the instantaneous fundamental frequency of the speech. This means that the comb filter varies from frame to frame. A comb filter can be constructed by frequency transformation of an FIR or IIR prototype filter. Because almost all processing in a speech enhancement algorithm based on spectral subtraction is performed in the spectral domain, it is appropriate to design and apply the comb filter in the spectral domain too.

Cont’d Comb filtering by itself is not sufficient to suppress the noisy background in a noise-degraded speech signal. Further, it is difficult to estimate the fundamental frequency of noisy speech. Therefore we have used comb filtering as a post-processing operation following speech enhancement, e.g. by spectral subtraction. The comb filter is constructed and applied in the frequency domain, only for voiced frames, after the classical spectral speech enhancement by spectral subtraction. For the construction of the comb filter we have to know the actual value of the speech fundamental frequency F0. For its estimation it is appropriate to use a pitch determination algorithm that also works in the spectral domain. If the spectrum after the classical spectral speech enhancement is identified as unvoiced, comb filtering is not applied.
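The frequency-domain construction can be sketched as a gain mask over FFT bins, applied only to voiced frames. This is a minimal sketch assuming NumPy; the rectangular pass bands, the 20 Hz half-width, the stop-band gain of 0.2 and the function names are illustrative assumptions, and F0 and the voiced/unvoiced decision are taken as given rather than estimated.

```python
import numpy as np

def comb_mask(n_fft, fs, f0, half_width_hz=20.0, stop_gain=0.2):
    """Gain per rfft bin: 1 near multiples of f0, stop_gain elsewhere."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    # Index of the nearest harmonic k*f0, with k >= 1.
    k = np.maximum(np.round(freqs / f0), 1)
    dist = np.abs(freqs - k * f0)
    return np.where(dist <= half_width_hz, 1.0, stop_gain)

def apply_comb(frame, fs, f0, voiced=True, **kwargs):
    """Apply the comb mask to one frame; unvoiced frames pass unchanged."""
    frame = np.asarray(frame, dtype=float)
    if not voiced:
        return frame.copy()
    spectrum = np.fft.rfft(frame)
    mask = comb_mask(len(frame), fs, f0, **kwargs)
    return np.fft.irfft(spectrum * mask, n=len(frame))

fs, n_fft, f0 = 8000, 512, 125.0
t = np.arange(n_fft)
harmonic = np.sin(2 * np.pi * f0 * t / fs)          # lies on a pass band
off_harmonic = np.sin(2 * np.pi * 187.5 * t / fs)   # lies between harmonics
kept = apply_comb(harmonic, fs, f0)
attenuated = apply_comb(off_harmonic, fs, f0)
```

Because the mask depends on `f0`, it has to be rebuilt for every frame as the pitch changes, which is the frame-to-frame adaptation described above.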

Enhancement by Wiener Filter Additive noise: let y[n] be a discrete-time noisy sequence, y[n] = x[n] + b[n], where x[n] is the desired signal and b[n] is the unwanted background noise. An alternative to spectral subtraction for recovering a signal corrupted by additive noise is to find a linear filter h[n] such that the sequence x̂[n] = y[n] ∗ h[n], the estimate of x[n], minimizes the mean-squared error (the MMSE criterion).
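For stationary x[n] and b[n] that are uncorrelated with each other and have known power spectra S_x and S_b, the well-known non-causal solution to this problem is the frequency-domain gain H = S_x / (S_x + S_b). The sketch below, assuming NumPy, applies that gain to a single frame; the function names are hypothetical, and in practice the two power spectra have to be estimated, which is not shown.

```python
import numpy as np

def wiener_gain(psd_x, psd_b):
    """H = S_x / (S_x + S_b) per frequency bin; 0 where both spectra are 0."""
    psd_x = np.asarray(psd_x, dtype=float)
    psd_b = np.asarray(psd_b, dtype=float)
    total = psd_x + psd_b
    return np.divide(psd_x, total, out=np.zeros_like(total), where=total > 0)

def wiener_filter_frame(y, psd_x, psd_b):
    """Estimate x from y = x + b by scaling each rfft bin of y with the gain."""
    Y = np.fft.rfft(y)
    H = wiener_gain(psd_x, psd_b)
    # Multiplying DFT bins corresponds to circular convolution within the frame.
    return np.fft.irfft(H * Y, n=len(y))

rng = np.random.default_rng(3)
y = rng.standard_normal(256)
n_bins = 256 // 2 + 1
x_hat = wiener_filter_frame(y, np.ones(n_bins), np.ones(n_bins))
```

The gain lies between 0 and 1 in every bin: it approaches 1 where the signal dominates and 0 where the noise dominates.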