Chapter 5 Homomorphic Processing(1)

In the z-domain, $X(z) = E(z)V(z)$ (ignoring $R(z)$). In the time domain, $x(n)$ is the convolution of the excitation $e(n)$ with the unit sample response $v(n)$ of the system: $x(n) = e(n) * v(n)$.
The key problem is how to recover $e(n)$ and $v(n)$ from $x(n)$. An algorithm that does this is called a deconvolution algorithm. There are two categories: parametric deconvolution and non-parametric deconvolution.
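The convolution model can be checked numerically. The following is only a minimal sketch, assuming NumPy; the signals `e` and `v`, the period `Np` and the decay factor 0.6 are hypothetical toy values, not taken from the slides.

```python
import numpy as np

# Hypothetical toy signals: a short impulse train for the excitation e(n)
# and a decaying exponential for the unit sample response v(n).
Np = 8                               # assumed pitch period in samples
e = np.zeros(32)
e[::Np] = 1.0                        # impulses at n = 0, 8, 16, 24
v = 0.6 ** np.arange(16)             # assumed system response

x = np.convolve(e, v)                # time domain: x(n) = e(n) * v(n)

# Transform domain: with enough zero padding the DFT samples satisfy X = E V.
N = len(x)                           # len(e) + len(v) - 1
X = np.fft.fft(x, N)
EV = np.fft.fft(e, N) * np.fft.fft(v, N)
max_err = np.max(np.abs(X - EV))
print(max_err)                       # rounding error only
```

The DFT length is chosen as the full length of the linear convolution so that the product of DFTs is not affected by circular wrap-around.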

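5.1 Principle of Homomorphic Processing (1)
The general system for homomorphic processing is:
$$x_1(n) \,\triangle\, x_2(n) \;\to\; D_\triangle[\cdot] \;\to\; \hat{x}_1(n) + \hat{x}_2(n) \;\to\; L[\cdot] \;\to\; \hat{y}_1(n) + \hat{y}_2(n) \;\to\; D_\triangle^{-1}[\cdot] \;\to\; y_1(n) \,\triangle\, y_2(n)$$
Here $\triangle$ is an operation (multiplication or convolution). $D_\triangle[\cdot]$ is called the characteristic system; it turns $x_1(n) \,\triangle\, x_2(n)$ into $\hat{x}_1(n) + \hat{x}_2(n)$. $L[\cdot]$ is a linear system whose output is $\hat{y}_1(n) + \hat{y}_2(n)$, and $D_\triangle^{-1}[\cdot]$ produces $y_1(n) \,\triangle\, y_2(n)$.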

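Principle of Homomorphic Processing (2)
Only the homomorphic signal processing system for convolution is discussed here.
Characteristic system $D_*[\cdot]$:
$$X(z) = Z[x(n)] = \sum_{n=N_1}^{N_2} x(n)\, z^{-n}$$
$$\hat{X}(z) = \ln[X(z)]$$
$$\hat{x}(n) = Z^{-1}[\hat{X}(z)] = \frac{1}{2\pi j} \oint \ln[X(z)]\, z^{n-1}\, dz$$
Linear system $L[\cdot]$:
$$\hat{y}(n) = L[\hat{x}(n)]$$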

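Principle of Homomorphic Processing (3)
Inverse characteristic system $D_*^{-1}[\cdot]$:
$$\hat{Y}(z) = Z[\hat{y}(n)] = \sum_{n=-\infty}^{\infty} \hat{y}(n)\, z^{-n}$$
$$Y(z) = \exp[\hat{Y}(z)]$$
$$y(n) = Z^{-1}[Y(z)] = \frac{1}{2\pi j} \oint \exp[\hat{Y}(z)]\, z^{n-1}\, dz$$
If $x(n) = x_1(n) * x_2(n)$, then $\hat{x}(n) = \hat{x}_1(n) + \hat{x}_2(n)$ and $y(n) = y_1(n) * y_2(n)$.
$\hat{x}(n)$ is called the complex cepstrum of $x(n)$.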
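The additivity of the complex cepstrum follows in three steps from the definitions of the characteristic system. The derivation below is a sketch; hats denote the outputs of the characteristic system, and it assumes the complex logarithm is taken with a continuous (unwrapped) phase so that the logarithm of a product splits into a sum.

```latex
\begin{align*}
x(n) = x_1(n) * x_2(n)
  &\;\Rightarrow\; X(z) = X_1(z)\, X_2(z) \\
\hat{X}(z) = \ln X(z)
  &= \ln X_1(z) + \ln X_2(z) = \hat{X}_1(z) + \hat{X}_2(z) \\
\hat{x}(n) = Z^{-1}\!\left[\hat{X}_1(z) + \hat{X}_2(z)\right]
  &= \hat{x}_1(n) + \hat{x}_2(n)
\end{align*}
```

The last step uses only the linearity of the inverse z-transform.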

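Principle of Homomorphic Processing (4)
In most cases the regions of convergence of $X(z)$, $\hat{X}(z)$, $Y(z)$ and $\hat{Y}(z)$ include the unit circle, so the DTFT can be used in place of the z-transform:
$$X(e^{j\omega}) = F[x(n)] = \sum_{n} x(n)\, e^{-j\omega n}$$
$$\hat{X}(e^{j\omega}) = \ln[X(e^{j\omega})]$$
$$\hat{x}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \hat{X}(e^{j\omega})\, e^{j\omega n}\, d\omega$$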

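Principle of Homomorphic Processing (5)
$$\hat{Y}(e^{j\omega}) = F[\hat{y}(n)] = \sum_{n} \hat{y}(n)\, e^{-j\omega n}$$
$$Y(e^{j\omega}) = \exp[\hat{Y}(e^{j\omega})]$$
$$y(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} Y(e^{j\omega})\, e^{j\omega n}\, d\omega$$
Some properties:
$$\hat{X}(e^{j\omega}) = \sum_{n} \hat{x}(n)\, e^{-j\omega n}, \qquad \hat{Y}(e^{j\omega}) = \sum_{n} \hat{y}(n)\, e^{-j\omega n}$$
If $x(n)$ is a real sequence, $\hat{x}(n)$ is also a real sequence.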

Principle of Homomorphic Processing (6)
How is deconvolution done? If $x(n) = x_1(n) * x_2(n)$ in the discrete time domain, then $\hat{x}(n) = \hat{x}_1(n) + \hat{x}_2(n)$ in the complex cepstrum domain.
Suppose $\hat{x}_1(n)$ is 0 outside $[n_1, n_2]$, $\hat{x}_2(n)$ is 0 outside $[n_3, n_4]$, and the two intervals do not overlap. Then a properly designed $L[\cdot]$ can separate $\hat{x}_1(n)$: take $L[\cdot]$ to be a rectangular window over $[n_1, n_2]$. Because it acts like a filter in the cepstrum domain, $L[\cdot]$ is called a lifter.
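The whole chain (characteristic system, lifter, inverse characteristic system) can be sketched with the DFT. This is an illustrative sketch only, assuming NumPy; the two components `x1` and `x2`, the parameters `a`, `b`, `n0`, the DFT length and the lifter cutoff 30 are hypothetical. Both components are minimum phase, so the phase needs no linear-phase correction, and the complex cepstrum of `x1` decays fast enough that it is negligible (though not exactly zero) beyond the cutoff.

```python
import numpy as np

N = 1024                                  # DFT length (assumed)
a, b, n0 = 0.7, 0.5, 40                   # hypothetical signal parameters

x1 = a ** np.arange(64)                   # component concentrated at low n
x2 = np.zeros(n0 + 1)
x2[0], x2[n0] = 1.0, b                    # delta(n) + b * delta(n - n0)
x = np.convolve(x1, x2)                   # x(n) = x1(n) * x2(n)

# Characteristic system: DFT -> complex logarithm -> inverse DFT.
X = np.fft.fft(x, N)
log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
x_hat = np.fft.ifft(log_X).real           # complex cepstrum of x(n)

# Lifter L[.]: rectangular window that keeps 0 <= n < 30.
lifter = np.zeros(N)
lifter[:30] = 1.0
y_hat = lifter * x_hat

# Inverse characteristic system: DFT -> exp -> inverse DFT.
y = np.fft.ifft(np.exp(np.fft.fft(y_hat))).real

err = np.max(np.abs(y[:64] - x1))         # how well x1(n) was recovered
print(err)
```

The contribution of `x2` sits at multiples of `n0` in the complex cepstrum, which is why a window ending before `n0` removes it.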

Principle of Homomorphic Processing (7)
Another homomorphic processing system:
$$X(e^{j\omega}) = F[x(n)] = \sum_{n} x(n)\, e^{-j\omega n}$$
$$C(e^{j\omega}) = \ln|X(e^{j\omega})|$$
$$c(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} C(e^{j\omega})\, e^{j\omega n}\, d\omega$$
The inverse transformation is the same. The only difference is that $\ln[X(e^{j\omega})]$ is replaced by $\ln|X(e^{j\omega})|$. $c(n)$ is called the cepstrum.

Principle of Homomorphic Processing (8)
If $c_1(n)$ and $c_2(n)$ are the cepstra of $x_1(n)$ and $x_2(n)$, and $x(n) = x_1(n) * x_2(n)$, then the cepstrum of $x(n)$ is $c(n) = c_1(n) + c_2(n)$.
The difference from the complex cepstrum is that passing $x(n)$ through the forward and then the inverse transformation no longer returns $x(n)$, because the phase of $X(e^{j\omega})$ is discarded.

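Principle of Homomorphic Processing (9)
$c(n)$ can be obtained from $\hat{x}(n)$. Split $\hat{x}(n)$ into its even and odd parts, $\hat{x}(n) = \hat{x}_e(n) + \hat{x}_o(n)$, with
$$\hat{x}_e(n) = \hat{x}_e(-n), \qquad \hat{x}_o(n) = -\hat{x}_o(-n)$$
$$\hat{x}_e(n) = \frac{\hat{x}(n) + \hat{x}(-n)}{2}, \qquad \hat{x}_o(n) = \frac{\hat{x}(n) - \hat{x}(-n)}{2}$$
For a real sequence, the DTFT of the even part is a real function and the DTFT of the odd part is purely imaginary. Since $\hat{X}(e^{j\omega}) = \ln|X(e^{j\omega})| + j \arg X(e^{j\omega})$, the real part $\ln|X(e^{j\omega})|$ is the DTFT of $\hat{x}_e(n)$, so
$$c(n) = \hat{x}_e(n) = \frac{\hat{x}(n) + \hat{x}(-n)}{2}$$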
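The relation between the cepstrum and the even part of the complex cepstrum can be verified numerically. A minimal sketch, assuming NumPy and a hypothetical test sequence; for this particular sequence the phase stays well inside (-π, π), so the principal-value logarithm `np.log` is used without unwrapping.

```python
import numpy as np

N = 512                                   # DFT length (assumed)
x = 0.6 ** np.arange(40)                  # hypothetical real test sequence

X = np.fft.fft(x, N)
x_hat = np.fft.ifft(np.log(X)).real       # complex cepstrum (principal-value log)
c = np.fft.ifft(np.log(np.abs(X))).real   # cepstrum

# x_hat(-n) on the circular DFT grid lives at index (N - n) mod N.
x_hat_reversed = np.roll(x_hat[::-1], 1)
x_hat_even = 0.5 * (x_hat + x_hat_reversed)

max_diff = np.max(np.abs(c - x_hat_even))
print(max_diff)                           # rounding error only
```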

5.3 Practical Algorithms for finding the Complex cepstrum and cepstrum (1)
Computing $\hat{x}(n)$ directly from $X(z)$ involves solving high-order algebraic equations, so it is not practical. The DTFT formulas can be used instead, but on a computer they have to be evaluated with the DFT or FFT.
Suppose that $x(n)$, after zero padding, has finite length and occupies $[0, N-1]$.

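Practical Algorithms for finding the Complex cepstrum and cepstrum (2)
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}, \qquad k = 0, \dots, N-1$$
$$\hat{X}(k) = \ln X(k), \qquad k = 0, \dots, N-1$$
or
$$C(k) = \ln|X(k)|, \qquad k = 0, \dots, N-1$$
$$\hat{x}(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{X}(k)\, e^{j 2\pi n k / N}, \qquad n = 0, \dots, N-1$$
$$c(n) = \frac{1}{N} \sum_{k=0}^{N-1} C(k)\, e^{j 2\pi n k / N}, \qquad n = 0, \dots, N-1$$
Take care to avoid aliasing: $N > 2\max\{n_a, n_b\}$.
See Fig. 4-5 on page 57 for the system.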
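The DFT formulas translate almost line by line into code. The sketch below assumes NumPy; the function names `complex_cepstrum` and `real_cepstrum` are hypothetical, and the complex version assumes a signal whose phase has no linear component to remove (true for the minimum-phase test sequence used here, not for signals in general).

```python
import numpy as np

def complex_cepstrum(x, N):
    """Complex cepstrum via an N-point DFT (no linear-phase removal)."""
    X = np.fft.fft(x, N)                              # zero-pads x to length N
    log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
    return np.fft.ifft(log_X).real

def real_cepstrum(x, N):
    """Cepstrum c(n): inverse DFT of ln|X(k)|."""
    X = np.fft.fft(x, N)
    return np.fft.ifft(np.log(np.abs(X))).real

a = 0.5
x = a ** np.arange(40)                                # hypothetical test sequence
x_hat = complex_cepstrum(x, 256)
c = real_cepstrum(x, 256)
print(x_hat[1:4])                                     # close to a, a^2/2, a^3/3
print(c[1:4])                                         # close to half of those
```

For a decaying exponential $a^n$ the complex cepstrum is known in closed form, $a^n / n$ for $n \ge 1$, which makes it a convenient sanity check. A larger `N` reduces the time-domain aliasing that the DFT-based computation introduces.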

5.4 Application of Homomorphic Processing
- Characteristics of complex cepstrum and cepstrum of speech signals
- Application in U/V decision and pitch estimation
- Application in extraction of formants
- Application in speech synthesis

Characteristics of complex cepstrum and cepstrum of speech signal (1)
In the z-domain, $X(z) = E(z)V(z)$. In the complex cepstrum domain, $\hat{x}(n) = \hat{e}(n) + \hat{v}(n)$.
For a voiced phone, $e(n)$ is a periodic sequence. Suppose the period is $N_p$ ($N_p = T_p f_s$):
$$e(n) = \sum_{r=0}^{R} \delta(n - r N_p)$$
(see page 59). So $\hat{e}(n) \ne 0$ only at $n = m N_p$, $m = 1, 2, 3, \dots$
$T_p$ is 2.5 ms to 20 ms. If $f_s$ = 10 kHz, $N_p$ is 25 to 200.
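Both the range of $N_p$ and the location of the cepstral peaks of an impulse train can be checked numerically. A minimal sketch, assuming NumPy. One deliberate deviation from the formula above: the impulses are given decaying amplitudes `alpha**r` (a hypothetical choice), because an equal-amplitude train has spectral zeros on the unit circle, where the logarithm is undefined. The values of `Np`, `R`, `N` and the threshold are also illustrative.

```python
import numpy as np

fs = 10_000                                # sampling rate in Hz, as in the text
Np_min = round(2.5e-3 * fs)                # samples in the shortest pitch period
Np_max = round(20e-3 * fs)                 # samples in the longest pitch period

# Hypothetical voiced excitation: impulses every Np samples, amplitudes alpha**r.
Np, R, alpha = 50, 3, 0.5
e = np.zeros(R * Np + 1)
e[::Np] = alpha ** np.arange(R + 1)

N = 2048                                   # DFT length (assumed)
c = np.fft.ifft(np.log(np.abs(np.fft.fft(e, N)))).real   # cepstrum of e(n)

n = np.arange(1, N // 2)
peaks = n[np.abs(c[1:N // 2]) > 1e-4]      # positions of non-negligible values
print(Np_min, Np_max)
print(peaks)                               # multiples of Np only
```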

Characteristics of complex cepstrum and cepstrum of speech signal (2)
$\hat{v}(n)$ is small outside $[-25, 25]$. So if the lifter
$$L[n] = \begin{cases} 1, & |n| < 25 \\ 0, & |n| \ge 25 \end{cases}$$
is used, $\hat{v}(n)$ can be separated, and $v(n)$ can then be estimated by the inverse characteristic system. If instead the lifter
$$L[n] = \begin{cases} 1, & |n| \ge 25 \\ 0, & |n| < 25 \end{cases}$$
is used, $\hat{e}(n)$ can be separated and $e(n)$ can be restored.
For unvoiced phones $e(n)$ is noise-like: $\hat{e}(n)$ has no obvious peaks and is spread over the whole time axis, while $\hat{v}(n)$ is confined to the low-time region. See the examples on pages 60-61, diagrams 4-6 and 4-7.

Application in U/V decision and pitch estimation (1)
In the complex cepstrum and cepstrum of voiced speech there are peaks at multiples of the pitch period $N_p$. This is the main basis for distinguishing unvoiced (U) from voiced (V) speech. The pitch period $T_p$ can also be estimated from $N_p$ and $f_s$, since $T_p = N_p / f_s$.
The difficulty is that for voiced speech the peaks are sometimes not obvious, and for unvoiced speech a random peak is possible. Measures used against this:
- Absolute threshold and relative threshold within one frame
- Decision over several frames
The first peak should be at $N_p$. If $f_s$ = 10 kHz, $N_p$ = 25 to 200, so the search should cover this range. The frame length should be at least 200 points (20 ms).
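A single-frame version of this peak search can be sketched as follows. It assumes NumPy and a synthetic voiced signal; the resonance used for `v`, the decay `alpha`, the DFT length and the absolute threshold 0.1 are all hypothetical, and a real system would work on windowed speech frames and combine the decision over several frames as described above.

```python
import numpy as np

fs = 10_000                               # Hz, as in the text
N = 1024                                  # DFT length (assumed)

# Hypothetical voiced signal: decaying impulse train (period 80 samples)
# convolved with a damped resonance standing in for the vocal tract.
Np_true, alpha = 80, 0.7
e = np.zeros(4 * Np_true + 1)
e[::Np_true] = alpha ** np.arange(5)
k = np.arange(200)
v = 0.9 ** k * np.cos(2 * np.pi * 0.1 * k)
x = np.convolve(e, v)

c = np.fft.ifft(np.log(np.abs(np.fft.fft(x, N)))).real   # cepstrum

# Search the plausible pitch range Np = 25..200 for the largest value.
lo, hi = 25, 200
Np_est = lo + int(np.argmax(c[lo:hi + 1]))
peak = c[Np_est]

threshold = 0.1                           # hypothetical absolute threshold
voiced = peak > threshold
Tp = Np_est / fs                          # pitch period in seconds
print(Np_est, voiced, Tp)
```

Restricting the search to 25..200 keeps the low-time part of the cepstrum, which is dominated by the vocal tract response, out of the decision.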

Application in Extraction of formants
If $\hat{v}(n)$ or $c_v(n)$ has been separated, the logarithmic spectrum $\ln|V(e^{j\omega})|$ can be found by taking the DTFT of $c_v(n)$. Further processing then yields the formants.
The windowing function should not change rapidly (a Hamming window is better than a rectangular window).
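The step from the liftered cepstrum to a smoothed log spectrum can be sketched as below. This is an illustration under stated assumptions, not a full formant tracker: it assumes NumPy, a synthetic signal with a single resonance near 1000 Hz (hypothetical), a low-time lifter with $|n| < 25$, and it takes "further processing" to mean simply picking the largest peak of the envelope.

```python
import numpy as np

fs, N = 10_000, 1024                      # sampling rate from the text; N assumed

# Hypothetical voiced signal with one resonance ("formant") near 1000 Hz.
r, w0 = 0.9, 2 * np.pi * 1000 / fs
k = np.arange(200)
v = r ** k * np.sin(w0 * (k + 1)) / np.sin(w0)   # two-pole resonator response
e = np.zeros(321)
e[::80] = 0.7 ** np.arange(5)             # decaying impulse train, period 80
x = np.convolve(e, v)

c = np.fft.ifft(np.log(np.abs(np.fft.fft(x, N)))).real   # cepstrum of x(n)

# Low-time lifter: keep |n| < 25 (negative n sits at the end of the DFT grid).
lifter = np.zeros(N)
lifter[:25] = 1.0
lifter[N - 24:] = 1.0
c_v = lifter * c

log_envelope = np.fft.fft(c_v).real       # smoothed estimate of ln|V|
k_peak = int(np.argmax(log_envelope[:N // 2]))
formant_hz = k_peak * fs / N
print(formant_hz)                         # near 1000 Hz
```

The lifter here is a rectangular window for brevity; a window with smoother edges would follow the recommendation above more closely.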

Application in Speech Synthesis (1)
For high-quality speech synthesis, prosodic rules must be introduced into the system. In the speech database only single syllables are recorded. When a word is uttered, these syllables must be modified according to the prosodic rules (changes of amplitude, duration, tone and so on).
If $e(n)$ and $v(n)$ for every syllable are separated and stored in the database, these changes are easy to implement: the modified $e(n)$ is convolved with $v(n)$ to generate the new speech for various words. This is one way to do speech synthesis.
In a real system the speech quality is high, but the smoothness of the concatenation between syllables still needs to be improved.
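The resynthesis step, convolving a modified excitation with a stored response, can be sketched in a few lines. This assumes NumPy; the function `synthesize`, its parameters and the stored response `v` are hypothetical, and the only modifications shown are a change of pitch period and amplitude.

```python
import numpy as np

def synthesize(v, Np, n_pulses, gain=1.0):
    """Convolve an impulse-train excitation of period Np with a stored v(n)."""
    e = np.zeros((n_pulses - 1) * Np + 1)
    e[::Np] = gain                        # modified excitation e(n)
    return np.convolve(e, v)

v = 0.8 ** np.arange(20)                  # hypothetical stored response
low = synthesize(v, Np=30, n_pulses=3)    # longer period, lower pitch
high = synthesize(v, Np=25, n_pulses=3, gain=0.5)   # shorter period, quieter
print(len(low), len(high))
```

Because the excitation and the response are stored separately, changing the pitch leaves `v` untouched; only the spacing and amplitude of the impulses change.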