Statistical Signal Processing: Application to Speech, Image & Digital Communication
Prof. Mohammad Reza Alsharif
Department of Information Engineering, University of the Ryukyus, Okinawa, Japan
DSP Lab, Signal Processing Laboratories

Digital signal processing begins with the A/D and D/A converters. Digital filters (DF) then process the sampled data. There are two types of DF: FIR and IIR.
Fig. FIR filter structure: input x(n), tap coefficients a0, a1, a2, ..., a(N-1), aN, output y(n)

Define the vector of input samples: X(n) = [x(n), x(n-1), ..., x(n-N)]^T
Vector of tap coefficients: A = [a0, a1, ..., aN]^T
Then, in vector form: y(n) = A^T X(n)
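The vector form can be sketched in a few lines of Python. This is only an illustration: the function name `fir_filter` and the example tap values are assumptions, and samples before n = 0 are taken as zero.

```python
def fir_filter(taps, signal):
    """Return y(n) = sum_i taps[i] * signal[n - i], with zeros before n = 0."""
    output = []
    for n in range(len(signal)):
        acc = 0.0
        for i, a_i in enumerate(taps):
            if n - i >= 0:
                acc += a_i * signal[n - i]
        output.append(acc)
    return output


# Feeding a unit impulse returns the tap coefficients themselves.
impulse = [1.0, 0.0, 0.0, 0.0]
example_taps = [0.5, 0.25, 0.125]
print(fir_filter(example_taps, impulse))
```

The impulse response of an FIR filter is its tap vector, which is why estimating a room response with such a filter amounts to estimating the taps.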

Usually the tap coefficients are constant. In applications such as echo canceling or equalizers in communication, however, they vary with time. Such a filter is called an adaptive digital filter (ADF).

In acoustic echo canceling (AEC), the ADF has to estimate the acoustic response of the room.
Fig. EC

Usually, we use the least-mean-square (LMS) algorithm to adapt the tap coefficients gradually. In AEC, the microphone signal serves as a reference (pilot) from which the error signal is found.
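A minimal sketch of LMS adaptation in the single-talk case follows. It assumes the common form w(n+1) = w(n) + 2μ e(n) X(n) with e(n) = d(n) - y(n); the names `lms_identify` and `echo_path`, the step size, and the white-noise input are illustrative assumptions, not values from the slides.

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Adapt FIR taps w so that the filter output tracks d; return the final taps."""
    w = [0.0] * num_taps
    for n in range(len(x)):
        # Most recent num_taps input samples, newest first (zeros before the start).
        x_vec = [x[n - i] if n - i >= 0 else 0.0 for i in range(num_taps)]
        y = sum(w_i * x_i for w_i, x_i in zip(w, x_vec))
        e = d[n] - y  # error between microphone signal and echo estimate
        w = [w_i + 2.0 * mu * e * x_i for w_i, x_i in zip(w, x_vec)]
    return w


rng = random.Random(0)
echo_path = [0.8, -0.4, 0.2]  # hypothetical room response
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]  # far-end signal
d = [sum(echo_path[i] * x[n - i] for i in range(len(echo_path)) if n - i >= 0)
     for n in range(len(x))]  # microphone signal: echo only, no near-end talker
w = lms_identify(x, d, num_taps=3, mu=0.05)
print(w)
```

With no near-end speech and no noise, the taps settle very close to the echo path.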

Now, things change when a near-end speaker is present. This condition is called double-talk. The error signal then also contains the near-end speech s(n). This error can disturb the adaptation process, because s(n) has no correlation with the echo signal d(n).

Here, we need statistical processing to avoid this problem. Instead of processing the signals themselves in the ADF, we process the correlation functions of the signals in the ADF.

Outline
● Conventional methods for echo canceling
● The problem of double-talk
● The correlation LMS (CLMS) algorithm
● The Extended CLMS (ECLMS) algorithm
● Frequency domain ECLMS (FECLMS) algorithm
● Frequency Bin ECLMS (FBECLMS) algorithm
● Computer simulation results

Conventional Echo Canceler System

Model for Acoustic Echo Impulse Response

Listening to Effect of Echo: Original Speech Signal; Echo with 250 msec path

Double-talk in echo canceler

Problem with double-talk
Double-talk misleads the adaptive algorithm. The conventional algorithm freezes the tap adaptation in the double-talk condition, resulting in:
1- Reduced speed of adaptation.
2- Misleading the algorithm when it estimates echo path changes.
The newly proposed algorithm is based on processing the correlation functions of the input signal and the desired signal.

The correlation functions processing
Input correlation function:
Cross-correlation between the desired response “d” and the input “x” signals:
The microphone signal “d” is the desired signal:

With the echo signal, y(n):
The near-end talk signal, s(n):
Because “s” and “x” are two independent speech signals, their cross-correlation vanishes; therefore:
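The independence argument can be checked numerically. The sketch below assumes white Gaussian signals in place of speech, a hypothetical two-tap echo path, and a plain sample average as the correlation estimate; none of these come from the slides.

```python
import random


def cross_correlation(a, b, lag):
    """Sample estimate of E[a(n) * b(n - lag)] for a non-negative lag."""
    products = [a[n] * b[n - lag] for n in range(lag, len(a))]
    return sum(products) / len(products)


rng = random.Random(1)
num_samples = 40000
echo_path = [0.6, 0.3]  # hypothetical room response
x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]  # far-end signal
s = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]  # independent near-end talker
y = [sum(h * x[n - i] for i, h in enumerate(echo_path) if n - i >= 0)
     for n in range(num_samples)]  # echo
d = [y_n + s_n for y_n, s_n in zip(y, s)]  # microphone signal during double-talk

r_sx = [cross_correlation(s, x, k) for k in range(2)]
r_dx = [cross_correlation(d, x, k) for k in range(2)]
print("R_sx:", r_sx)  # close to zero
print("R_dx:", r_dx)  # close to the echo path for unit-variance white input
```

Although s(n) is as strong as the input here, the cross-correlation of d and x still reflects the echo path, which is the property the correlation-based algorithms rely on.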

The correlation LMS (CLMS) algorithm
In the correlation filter, we estimate the cross-correlation between d(n) and x(n) by:
Cross-correlation estimation error:
By processing the correlation function of the input, we can continue the tap adaptation in the double-talk condition:

For tap adaptation, we use the steepest descent method, minimizing the MSE:
MSE = E[ |e(n)|^2 ]
Gradient search criterion:
The correlation LMS (CLMS) algorithm:

Echo Canceler by CLMS Algorithm

Stability of the CLMS

The extended CLMS (ECLMS) algorithm
The cost function is the sum of the lag-squared errors:
where:
R is the weight matrix and:

The normalized ECLMS algorithm
The gradient vector of the cost function:

Echo Canceler by ECLMS Algorithm

The recursion formulas for computing the correlations in practice:
After copying the taps to the DF, the output is:
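The slide's own recursion formulas are not preserved in this transcript. As a rough illustration only, the sketch below assumes a common exponentially weighted recursion, R(n) = λ R(n-1) + (1 - λ) a(n) b(n), with a hypothetical forgetting factor λ; the actual formulas used by the authors may differ.

```python
def update_correlation(previous, sample_a, sample_b, forgetting=0.9):
    """One recursion step: R(n) = forgetting * R(n-1) + (1 - forgetting) * a(n) * b(n)."""
    return forgetting * previous + (1.0 - forgetting) * sample_a * sample_b


def running_autocorrelation(x, lag, forgetting=0.9):
    """Track an estimate of R_xx(n, lag) sample by sample; x(n - lag) is 0 before the start."""
    estimate = 0.0
    history = []
    for n in range(len(x)):
        delayed = x[n - lag] if n - lag >= 0 else 0.0
        estimate = update_correlation(estimate, x[n], delayed, forgetting)
        history.append(estimate)
    return history


# For a constant signal the estimate settles at the true value x * x = 1.
constant = [1.0] * 200
print(running_autocorrelation(constant, lag=0)[-1])
```

A recursion of this kind needs one multiply-accumulate per lag and per sample, which is what makes correlation processing feasible in real time.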

Frequency domain ECLMS algorithm
To reduce the computational complexity of ECLMS, we propose the frequency domain ECLMS (FECLMS) algorithm:
Taking the FFT with respect to the time lag k:

where: Estimation of : The cost function: where:

Therefore, the FECLMS algorithm will be :

Normalization of FECLMS
For speech signals, we normalize the convergence factor to the power of each frequency bin:

Echo canceler using FECLMS algorithm

Computational complexity comparison
ECLMS algorithm:
FECLMS algorithm:
N = 64: 8.5%
N = 128: 4.9%
N = 256: 2.7%

Computer Simulation
Impulse Response Estimation Ratio (IRER)
Measure for convergence:
: Echo impulse response
: Adaptive filter tap coefficient
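The exact IRER formula is not preserved in this transcript. The sketch below shows a closely related and widely used convergence measure, the normalized misalignment 10 log10(||h - w||^2 / ||h||^2) in dB. Treating it as a stand-in for IRER is an assumption, and the authors' definition may use a different sign or normalization.

```python
import math


def misalignment_db(echo_response, taps):
    """Normalized misalignment in dB between the true response h and the estimate w."""
    error_energy = sum((h - w) ** 2 for h, w in zip(echo_response, taps))
    reference_energy = sum(h ** 2 for h in echo_response)
    # Note: a perfect estimate gives error_energy == 0, for which log10 is undefined.
    return 10.0 * math.log10(error_energy / reference_energy)


# A 10 % amplitude error on the only non-zero tap corresponds to about -20 dB.
print(misalignment_db([1.0, 0.0], [0.9, 0.0]))
```

More negative values mean a better estimate of the echo impulse response.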

IRER with white-noise N=16, = 0.9

Normalized effect with color-noise = 1 = 0.03,

Direct Calculation of Auto-Correlation in the Frequency Domain

Estimation of Cross-Correlation

Direct Calculation of Cross-Correlation in the Frequency Domain

Reduced-Computation Structure for FECLMS Algorithm with Zero-Padding & HOL-Saved Method Called FBECLMS Algorithm

Comparison of FDAF & FBECLMS algorithms when using zero-padding & HOL-saved method in double-talk condition. Input white noise, double-talk condition, N=16, = 0.9

Convergence characteristics of FDAF algorithm in single and double-talk conditions.

Comparison between LMS, CLMS, FDAF, FECLMS, and proposed FBECLMS in double-talk.

Switching from single to double talk and comparison of various performances.

Switching from FDAF to FBECLMS under double talk condition.

Smart Acoustic Room (SAR)
A SAR is a room in which the acoustic response between two (or more) points can be controlled smartly. By control, we mean obtaining a good estimate of the acoustic path between two points and then generating the appropriate signal to cancel an unwanted noise or to emphasize a desired signal (speech or music).

Application of Smart Acoustic Room (SAR)
● When people in a room want to listen to jazz or classical music, we do not want to use headphones, as they totally isolate the listener from the surroundings.
● In a conference room or a big hall, there are two kinds of audience, who want to listen to the Japanese or to the English speech. If we can give each audience its desired location, then just by sitting in the right place one can hear the desired language.
Fig.1 Application of SAR (room zones: Jazz, Japanese, Classic, English)

SAR model
Fig.2 Smart acoustic room (SAR) model, simplified (labels: Sound, Room, Sound A point, Null point)

SAR model for zero-enforcing (minimization) of the Mic. signal
Fig.3 SAR using LMS or FXLMS algorithm (labels: Sound Source X(n), Adaptive algorithm, h(n), Z(n), W1(n), W2(n), Speaker S1, Speaker S2, Microphone M, e(n), Null Point)

Algorithm
e(n)=x(n)*w1(n)+x(n)*h(n)*w2(n) …(1) (here * denotes convolution)
If e(n)=0, then X(z)W1(z)+X(z)H(z)W2(z)=0 …(2)
H(z)=-W1(z)/W2(z) …(3)
LMS algorithm: hi(n+1)=hi(n)-2μe(n)x(n-i) …(4)
FXLMS algorithm: hi(n+1)=hi(n)-2μe(n)z(n-i) …(5)
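Equations (1), (3) and (5) can be tried out on a toy case. The sketch below assumes a hypothetical two-tap path w1, a path w2 that is a pure gain (so that z(n) = w2 x(n)), white-noise input and exact knowledge of w2; all names and values are illustrative.

```python
import random


def sar_fxlms(x, w1, w2_gain, num_taps, mu):
    """Adapt h with the FXLMS update of Eq. (5) so that the microphone signal e(n) vanishes."""
    h = [0.0] * num_taps
    for n in range(len(x)):
        x_vec = [x[n - i] if n - i >= 0 else 0.0 for i in range(num_taps)]
        direct = sum(w * x[n - i] for i, w in enumerate(w1) if n - i >= 0)  # x * w1
        filtered = sum(h_i * x_i for h_i, x_i in zip(h, x_vec))             # x * h
        e = direct + w2_gain * filtered                                     # Eq. (1)
        z_vec = [w2_gain * x_i for x_i in x_vec]                            # z = x * w2
        h = [h_i - 2.0 * mu * e * z_i for h_i, z_i in zip(h, z_vec)]        # Eq. (5)
    return h


rng = random.Random(2)
x = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
w1 = [0.5, 0.2]   # hypothetical path from speaker S1 to the microphone
w2_gain = 0.8     # hypothetical path from speaker S2, modeled as a pure gain
h = sar_fxlms(x, w1, w2_gain, num_taps=2, mu=0.1)
print(h)  # approaches -w1 / w2_gain, in line with Eq. (3)
```

In this toy setting the adapted filter approaches -W1/W2, so the two speaker contributions cancel at the microphone, which is the null point.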

SAR model by using the virtual microphone
Fig. SAR using the virtual microphone (labels: Sound source x(n), Adaptive filter h(n), y(n), Speaker S1, Speaker S2, W1(n), W2(n), Mic M, e(n); Virtual Speaker S~1, Virtual Speaker S~2, W~1(n), W~2(n), Virtual Mic M~, e~(n))

SAR system by using virtual microphone
e(n)=x(n)*w1(n)+x(n)*h(n)*w2(n) …(6)
If e(n)=0, then H(z)=-W1(z)/W2(z) …(7)
In the same way, if e~(n)=0, then H(z)=-W~1(z)/W~2(z) …(8)
From equations (7) and (8), W1(z)/W2(z)=W~1(z)/W~2(z) …(9)

SAR system by using virtual microphone
The virtual acoustic path can be estimated by:
w~1i(n+1)=w~1i(n)+2μe(n)x(n-i) …(10)
w~2i(n+1)=w~2i(n)-2μe(n)y(n-i) …(11)

Simulation results
The MSE of the SAR algorithms (LMS, FXLMS)
・ Number of executions is 100.
・ Step size of the adaptive filter is 0.01.

Demonstration: Amplitude of an original sound

Demonstration: Amplitude of Sound A point

Demonstration: Amplitude of Null Point

So, in DTEC, we required statistical signal processing of second order (correlation processing). Now, we concentrate on another problem that requires more statistical processing. This problem is called Blind Source Separation (BSS).

Blind Source Separation
Suppose that we have K speakers and M microphones to pick up the audio signals.
Fig. Mixing model: sources s1, s2, ..., sK; microphone signals x1, x2, ..., xM; mixing coefficients a11, a12, ...

Assuming instantaneous mixtures (not convolutive), we have the following relations:
xi(n) = ai1 s1(n) + ai2 s2(n) + ... + aiK sK(n), for i = 1, ..., M

Assuming M ≥ K, and especially M = K, we can write these relations in vector and matrix form as follows:
X(n) = A S(n)
where X(n) = [x1(n), ..., xM(n)]^T, S(n) = [s1(n), ..., sK(n)]^T and A = [aij] is the mixing matrix.

This problem is called blind because we have no information about the sources S(n) or the mixing matrix A; we only observe the mixture signal X(n). So, here we cannot have any pilot (reference) signal, as we do in echo canceling.

But we do have some statistical knowledge about speech signals:
● Speech signals are statistically independent. For zero-mean signals this implies E[Si Sj] = 0 for i ≠ j.
● A speech signal has a super-Gaussian PDF.

If we have two sources and two mixtures, as in the following figures, S1 versus S2 and X1 versus X2 are drawn (figure panels: independent sources; dependent mixtures). These two figures show how the samples of the S’s are spread over a wider area than the samples of the X’s.

So, we can understand that the mixture signals are more dependent than the original sources. Another important phenomenon in the mixing process follows from the central limit theorem (CLT), which states: if a set of signals are independent, with any PDF of finite variance, then their sum x has a PDF that is approximately Gaussian, and increasingly so as more signals are added.

This is an important fact that leads us to BSS. Even when only two sources are mixed with non-unity coefficients, the PDF of the result has a more Gaussian shape.

The BSS problem seeks an un-mixing matrix W such that, when it is applied to the mixtures x, the result y has a PDF that is non-Gaussian: y = W x. So, if W = A^-1, then y = s, or near to it.
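The ideal case, in which the un-mixing matrix equals the inverse of the mixing matrix, can be sketched for two sources. The mixing matrix and source samples below are made-up values; in real BSS the matrix A is unknown and W has to be learned.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inverse_2x2(matrix):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1.0, 0.5], [0.3, 1.0]]                      # hypothetical mixing matrix
sources = [[0.2, -1.0], [1.5, 0.4], [-0.7, 0.9]]  # a few samples of S(n)
mixtures = [mat_vec(A, s) for s in sources]       # X(n) = A S(n)
W = inverse_2x2(A)                                # ideal un-mixing matrix
recovered = [mat_vec(W, x) for x in mixtures]     # Y(n) = W X(n)
print(recovered)
```

The recovered samples match the sources up to rounding error, which is the target that an adaptive BSS algorithm tries to approach without knowing A.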

But we do not know A, so we need to find W in some adaptive way, for instance by defining a function g(y) that, when applied to y, makes its PDF more non-Gaussian (uniform).

The best function could be a CDF (a monotonic function). A sigmoid function such as g(y) = 1 / (1 + e^(-y)), or “tanh”, could also be a good choice. Here, the question is how to measure “non-Gaussianity” in order to find the optimum function g(y) or the optimum un-mixing matrix W.

Let’s talk more statistically here. The expected value of a random process x with PDF p(x) is defined as the first moment:
E[x] = ∫ x p(x) dx

The variance of x is defined as the second central moment of x:
σ^2 = E[(x - E[x])^2]
The fourth moment of x is E[x^4].

If E[x] = 0, the kurtosis of x is defined as a normalized version of the fourth moment:
K = E[x^4] / (E[x^2])^2 - 3
Kurtosis shows how super-Gaussian (peaky) a random signal is. If the process is Gaussian, then E[x^4] = 3 (E[x^2])^2, so K = 0.

K > 0: super-Gaussian
K = 0: Gaussian
K < 0: sub-Gaussian
So, kurtosis is a measure of (non-)Gaussianity. Examples from the figure: speech is super-Gaussian, a sawtooth is sub-Gaussian, and noise is Gaussian.
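The three cases, and the tendency of mixtures to look more Gaussian than their sources, can be checked with a small experiment. The sketch assumes the excess-kurtosis form E[x^4]/(E[x^2])^2 - 3 and uses synthetic uniform, Gaussian and Laplacian samples in place of real signals; the mixing weights 0.6 and 0.8 are arbitrary.

```python
import random


def excess_kurtosis(samples):
    """Sample estimate of E[(x - m)^4] / (E[(x - m)^2])^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [v - mean for v in samples]
    m2 = sum(c * c for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 / (m2 * m2) - 3.0


rng = random.Random(0)
n = 50000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # sub-Gaussian (theory: K = -1.2)
s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # second independent source
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]  # theory: K = 0
laplacian = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0))
             for _ in range(n)]  # super-Gaussian (theory: K = 3)
mixture = [0.6 * a + 0.8 * b for a, b in zip(s1, s2)]  # one mixed signal

k_source = excess_kurtosis(s1)
k_mixture = excess_kurtosis(mixture)
print(k_source, excess_kurtosis(gaussian), excess_kurtosis(laplacian), k_mixture)
```

The mixture's kurtosis lies closer to zero than that of either source, so pushing the kurtosis of y = W x away from zero moves the output back toward a single source.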

So far, we understand that the extracted signals should be as independent as possible and, at the same time, as non-Gaussian as possible; that is, the kurtosis should be as far from zero as possible.

Kurtosis is the measure of non-Gaussianity. But what is the measure of independence? The answer is “entropy”. The entropy “H” of a random process is defined as the average amount of surprise associated with an event z with probability Pr(z = 1) = p, so:
H = -p log2(p) - (1 - p) log2(1 - p)

An event (coin toss) with p = 0.5 has the highest entropy, H = 1, while an event whose probability is near zero or one has the lowest entropy, H ≈ 0.
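These two cases are easy to verify with the binary entropy function in bits. The function name `binary_entropy` is an assumption, and the convention 0 · log2(0) = 0 is used at the end points.

```python
import math


def binary_entropy(p):
    """Entropy in bits of an event that occurs with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # convention: 0 * log2(0) = 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, binary_entropy(p))
```

The values rise from 0 at p = 0 to 1 bit at p = 0.5 and fall back symmetrically to 0 at p = 1.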

So, entropy is a measure of uniformity (for an unbiased coin toss, p = 0.5, independence). Maximum entropy corresponds to complete uniformity (non-Gaussianity). So, one way to obtain mutually independent signals is to find an un-mixing matrix W that maximizes the entropy of the extracted signal after it has passed through a fixed nonlinear function “g” (the monotonic sigmoid function mentioned previously). This un-mixing matrix W also minimizes the mutual information. The independent signals are obtained by maximum entropy (infomax) (Bell & Sejnowski 1995).

A.J. Bell and T.J. Sejnowski. A non-linear information maximization approach that performs blind separation. In Advances in Neural Information Processing Systems 7, pages 467-474. MIT Press, Cambridge, Mass., 1995.

In the simple example shown below, we maximize the entropy of “y = g(u)”. Let u = W X and define “g” as the sigmoid function of “u”.

Then, according to the well-known theorem relating the PDFs of a transformed variable:
p(y) = p(x) / |J|, where J is the Jacobian of the transformation from x to y.
Entropy of “y” is: H(y) = -E[ln p(y)]
Then: H(y) = E[ln |J|] + H(x)

For a kurtotic signal such as speech:
Minimization of mutual information = Maximization of the entropy of “y”

In an adaptive algorithm to find the un-mixing matrix, W is updated along the gradient of the entropy H(y) with respect to W. Since the entropy of the input, H(x), is not related to W, only the Jacobian term of the transformation from x to y contributes to this gradient.

Then, after a lengthy calculation: So: Then:

In vector form: This is the basis of an algorithm called Kullback-Leibler (KL).
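The slide's update formula is not preserved in this transcript. As an illustration only, the sketch below uses the stochastic infomax rule from the cited Bell and Sejnowski work for a logistic nonlinearity, ΔW = μ [ (W^T)^-1 + (1 - 2y) x^T ], restricted to two channels. The function names and the step size are assumptions, and the rule shown on the slide may be written differently.

```python
import math


def sigmoid(u):
    """Logistic function g(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def infomax_step(W, x, mu):
    """One stochastic update of a 2x2 un-mixing matrix W for a single mixture sample x."""
    u = [W[0][0] * x[0] + W[0][1] * x[1],
         W[1][0] * x[0] + W[1][1] * x[1]]  # u = W x
    y = [sigmoid(v) for v in u]            # y = g(u)
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    # Inverse of the transpose of W, written out for the 2x2 case.
    inv_t = [[W[1][1] / det, -W[1][0] / det],
             [-W[0][1] / det, W[0][0] / det]]
    return [[W[i][j] + mu * (inv_t[i][j] + (1.0 - 2.0 * y[i]) * x[j])
             for j in range(2)]
            for i in range(2)]


identity = [[1.0, 0.0], [0.0, 1.0]]
print(infomax_step(identity, [0.0, 0.0], mu=0.1))
print(infomax_step(identity, [1.0, 0.0], mu=0.1))
```

The first term keeps W away from singularity, while the second term shrinks rows whose outputs saturate the sigmoid; for a zero sample only the first term acts.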

Now, let me briefly introduce some of our improvements and applications of BSS.

PDF-matched and short-term modifications to Stone’s BSS
Stone’s BSS is one of the main BSS methods based on predictability maximization. The modification uses two short-term predictors, the desired predictor and its opposite predictor; only short-term predictors are used.

Evaluation Results (Audio)

Evaluation Results (Image)

Generalization of Stone’s BSS by Simultaneous Diagonalizations

Evaluation Results
The generalized method has been applied using the filters from the PDF-matched method. It has been compared with Stone’s BSS, SOBI (Second Order Blind Identification) and AMUSE (Algorithm for Multiple Unknown Signals Extraction) on speech and real image mixtures. It outperforms the others.

Evaluation Results
Numerical evaluation: G matrix index. It is based on the global matrix.

Speech

Image
For images, we used real mixtures (window glass reflection).
Evaluation with Mutual Information

An Efficient BSS-Based Blind MIMO-OFDM System
It can be shown that, in a MIMO-OFDM system, the received symbols are a linear instantaneous mixture of the transmitted symbols at each subcarrier m. This gives N BSS problems, whose solutions are the N un-mixing matrices related to the N subcarriers (0 ≤ m ≤ N − 1).

But even after successful separation of the symbols at each subcarrier, the recomposition of the users’ signals suffers from:
- permutation indeterminacy,
- amplitude scaling ambiguity, and
- phase distortion of symbols,
which are inherent to complex ICA.

The Proposed ICA based MIMO OFDM system
In the above structure, the problems inherent to BSS have been solved successfully.

Image Enhancement Using Statistical Method
Let X = {X(i,j)} denote a given image composed of L discrete gray levels denoted as {X0, X1, ..., X(L-1)}. For a given image X, the probability density function p(Xk) is defined as p(Xk) = nk / n, where nk is the number of times the level Xk appears in the image and n is the total number of pixels.

For automatic image histogram equalization (HE), we are looking for a transform y = T(x) that makes the PDF of the output uniform. As with the PDF theorem used earlier in the BSS (ICA) problem, we have p(y) = p(x) / |dy/dx|.

So, the new HE image “y” is obtained as the integral of the PDF p(x) of the original image from the lowest gray level up to x. This is actually the cumulative distribution function (CDF) of the random variable (original image) “x”.

In Digital Image, HE can be obtained as below:
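The slide's discrete formula is not preserved in this transcript. A minimal sketch of the usual discrete form is given below, assuming integer gray levels 0 to L-1 and the mapping y = round((L-1) · CDF(x)); the function name and the tiny test image are illustrative.

```python
def equalize_histogram(image, levels=256):
    """Map each gray level through the cumulative histogram, scaled to 0..levels-1."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running / total)  # discrete CDF of the gray levels
    return [[round((levels - 1) * cdf[p]) for p in row] for row in image]


image = [[0, 0, 1, 1],
         [1, 2, 2, 3]]
print(equalize_histogram(image, levels=4))
```

Some implementations additionally subtract the smallest non-zero CDF value so that the darkest level maps to 0; that refinement is left out here to stay close to the plain CDF mapping.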

α-Rooting Method of Enhancement
The 2-D DFT is applied to the image as F(p,s) = |F(p,s)| e^(jθ(p,s)), where |F(p,s)| is the magnitude and θ(p,s) is the phase at frequency point (p,s).
The operator M modifies the magnitude of the Fourier transform as M(|F(p,s)|) = |F(p,s)|^α, where α is taken from the interval (0,1).

α-Rooting Method of Enhancement
The components of the Fourier transform are multiplied by the coefficients |F(p,s)|^(α-1). The inverse 2-D DFT of the resulting data then gives the enhanced image.
Block diagram: Image → 2-D DFT → |F|^α (phase kept) → 2-D IDFT → Enhanced Image
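The processing chain can be sketched with a direct (slow) 2-D DFT, which is adequate for a toy image. The function names are assumptions, and a real implementation would use an FFT library instead of the nested sums.

```python
import cmath


def dft2(image):
    """Direct 2-D DFT of a small image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    return [[sum(image[m][n] * cmath.exp(-2j * cmath.pi * (p * m / rows + s * n / cols))
                 for m in range(rows) for n in range(cols))
             for s in range(cols)]
            for p in range(rows)]


def idft2(spectrum):
    """Direct inverse 2-D DFT."""
    rows, cols = len(spectrum), len(spectrum[0])
    return [[sum(spectrum[p][s] * cmath.exp(2j * cmath.pi * (p * m / rows + s * n / cols))
                 for p in range(rows) for s in range(cols)) / (rows * cols)
             for n in range(cols)]
            for m in range(rows)]


def alpha_rooting(image, alpha):
    """Multiply each DFT coefficient by |F|^(alpha - 1), keep the phase, and invert."""
    spectrum = dft2(image)
    modified = [[value * abs(value) ** (alpha - 1.0) if abs(value) > 1e-12 else 0j
                 for value in row]
                for row in spectrum]
    return [[value.real for value in row] for row in idft2(modified)]


image = [[1.0, 2.0], [3.0, 4.0]]
print(alpha_rooting(image, 1.0))  # alpha = 1 leaves the image unchanged
print(alpha_rooting(image, 0.5))  # alpha < 1 compresses large magnitudes
```

Because α < 1 compresses large magnitudes more strongly than small ones, the weaker high-frequency components gain relative weight, which is what sharpens detail.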

Wavelet Transform
The wavelet transform divides the image into four subbands: one with approximation coefficients (AC) and three with detail coefficients. Most of the signal energy is concentrated in the AC.
Diagram: Image → WT → subbands LL, LH, HL, HH
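The energy concentration can be illustrated with a one-level Haar transform, the simplest wavelet. The choice of Haar is an assumption, since the slides do not say which wavelet was used, and the naming of the LH and HL subbands varies between texts.

```python
def haar_dwt2(image):
    """One-level orthonormal 2-D Haar transform; height and width must be even."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(image), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2.0)  # approximation coefficients
            row_lh.append((a - b + d - e) / 2.0)  # detail along the row direction
            row_hl.append((a + b - d - e) / 2.0)  # detail along the column direction
            row_hh.append((a - b - d + e) / 2.0)  # diagonal detail
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def energy(band):
    """Sum of squared coefficients of a subband."""
    return sum(v * v for row in band for v in row)


# A smooth 4x4 ramp stands in for a natural image region.
image = [[10.0 + r + c for c in range(4)] for r in range(4)]
ll, lh, hl, hh = haar_dwt2(image)
total = energy(ll) + energy(lh) + energy(hl) + energy(hh)
print(energy(ll) / total)  # fraction of the energy in the LL subband
```

For smooth image content nearly all of the energy ends up in LL, which is why histogram processing is applied to that subband.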

Image Enhancement Using Wavelet LL Histogram Separation

Original Image & Histogram

Histogram Equalized Image

Mean Value Histogram Separated Equalization

Wavelet LL Histogram Separation

Histogram Equalization in Wavelet Domain (LL)

Comparison between LL, HH of Wavelet and Histogram Equalization

Thank you very much!
