Statistical Signal Processing Application to Speech, Image & Digital Communication Prof. Mohammad Reza Alsharif Department of Information Engineering,


2 Statistical Signal Processing Application to Speech, Image & Digital Communication Prof. Mohammad Reza Alsharif Department of Information Engineering, University of the Ryukyus, Okinawa, Japan Department of Information Engineering – University of the Ryukyus DSP Lab Signal Processing Laboratories

3 Digital signal processing begins with the A/D and D/A converters. Digital filters (DF) then process the sampled data. There are two types of DF: FIR and IIR. [Figure: FIR filter with input x(n), tap coefficients a_0, a_1, a_2, ..., a_(N-1), a_N, and output y(n)]

4 Define the vector of input samples: X(n) = [x(n), x(n-1), ..., x(n-N)]^T. Vector of tap coefficients: a = [a_0, a_1, ..., a_N]^T. Then, in vector form: y(n) = a^T X(n).
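The vector form of the FIR filter maps directly onto code. The following is a minimal Python sketch, assuming zero initial state (x(n) = 0 for n < 0); the function name `fir_filter` and the example coefficients are illustrative and do not come from the slides.

```python
def fir_filter(x, a):
    """Return y(n) = sum_i a[i] * x(n - i), taking x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, coeff in enumerate(a):
            if n - i >= 0:
                acc += coeff * x[n - i]
        y.append(acc)
    return y


# Feeding a unit impulse returns the tap coefficients themselves,
# which is why the taps of an FIR filter are its impulse response.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
taps = [0.5, 0.3, 0.2]
print(fir_filter(impulse, taps))
```

Each output sample is the inner product of the tap vector with the most recent input samples, so the cost per sample grows linearly with the number of taps.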

5 Usually the tap coefficients are constant. In applications such as echo canceling or equalizers in communication, however, they are variable. We then call the filter an adaptive digital filter (ADF).

6 In acoustic echo canceling (AEC), we need to estimate the acoustic response of the room with an ADF. [Figure: echo canceler]

7 Usually, we use the least-mean-square (LMS) algorithm to adapt the tap coefficients gradually. In AEC, the microphone signal serves as a reference (pilot) for finding the error signal.

8 Things change when a near-end speaker is present. This condition is called double-talk. The error signal then also contains the near-end speech s(n). This error can disturb the adaptation process, as s(n) does not have any correlation with the echo signal d(n).
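A small simulation can make the effect of double-talk on LMS concrete. The sketch below rests on assumptions that are not in the slides: the far-end input is white Gaussian noise, the echo path `echo_path` has four made-up taps, and the names `lms_identify` and `misalignment` are illustrative. It uses the common LMS update w(n+1) = w(n) + 2μe(n)x(n).

```python
import random


def lms_identify(h, num_samples, mu, near_end_std=0.0, seed=0):
    """Adapt an FIR filter w toward the echo path h with the LMS rule.

    The far-end input x(n) is white Gaussian noise (an assumption of this
    sketch). The microphone picks up the echo plus, optionally, an
    independent near-end signal s(n) with standard deviation near_end_std.
    """
    rng = random.Random(seed)
    taps = len(h)
    w = [0.0] * taps
    x_buf = [0.0] * taps  # x_buf[i] holds x(n - i)
    for _ in range(num_samples):
        x_buf = [rng.gauss(0.0, 1.0)] + x_buf[:-1]
        echo = sum(hi * xi for hi, xi in zip(h, x_buf))
        s = rng.gauss(0.0, near_end_std) if near_end_std > 0 else 0.0
        mic = echo + s
        e = mic - sum(wi * xi for wi, xi in zip(w, x_buf))
        w = [wi + 2.0 * mu * e * xi for wi, xi in zip(w, x_buf)]
    return w


def misalignment(h, w):
    """Squared coefficient error normalised by the energy of h."""
    return sum((a - b) ** 2 for a, b in zip(h, w)) / sum(a * a for a in h)


echo_path = [0.8, -0.4, 0.2, 0.1]  # hypothetical echo impulse response
w_single = lms_identify(echo_path, 5000, mu=0.01)
w_double = lms_identify(echo_path, 5000, mu=0.01, near_end_std=1.0)
print("single-talk misalignment:", misalignment(echo_path, w_single))
print("double-talk misalignment:", misalignment(echo_path, w_double))
```

Without a near-end signal the taps settle on the echo path. With one, the error never vanishes, so the taps keep fluctuating around the true response.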

9 Here, we need statistical processing to avoid this problem. Instead of processing the signal itself in the ADF, we process the correlation function of the signal in the ADF.

10 Outline
- Conventional methods for echo canceling
- The problem of double-talk
- The correlation LMS (CLMS) algorithm
- The Extended CLMS (ECLMS) algorithm
- Frequency domain ECLMS (FECLMS) algorithm
- Frequency Bin ECLMS (FBECLMS) algorithm
- Computer simulation results

11 Conventional Echo Canceler System

12 Model for Acoustic Echo Impulse Response

13 Listening to Effect of Echo: Original Speech Signal; Echo with 250 msec path

14 Double-talk in echo canceler

15 Problem with double-talk. Double-talk misleads the adaptive algorithm. The conventional algorithm freezes the tap adaptation in the double-talk condition, which results in: 1. Reducing the speed of adaptation. 2. Misleading the algorithm when it estimates echo path changes. The newly proposed algorithm is based on processing the correlation functions of the input signal and the desired signal.

16 The correlation functions processing. Input correlation function: Cross-correlation between the desired response “d” and the input “x” signals: The microphone signal “d” is the desired signal:

17 The near-end talk signal, s(n): Because “s” and “x” are two independent speech signals, their cross-correlation is zero. With the echo signal, y(n):
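The independence argument can be checked numerically. The following sketch is only an illustration: white Gaussian noise stands in for both talkers (real speech is neither white nor Gaussian), `echo_path` is a hypothetical three-tap response, and correlations are estimated by time averaging.

```python
import random


def cross_correlation(a, b, lag):
    """Time-average estimate of E[a(n) b(n - lag)] for lag >= 0."""
    total = sum(a[n] * b[n - lag] for n in range(lag, len(a)))
    return total / (len(a) - lag)


rng = random.Random(1)
num_samples = 20000
echo_path = [0.5, -0.3, 0.1]  # hypothetical echo impulse response

x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]  # far-end input
s = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]  # near-end talker
echo = [sum(h * x[n - i] for i, h in enumerate(echo_path) if n - i >= 0)
        for n in range(num_samples)]
d = [echo[n] + s[n] for n in range(num_samples)]       # microphone signal

r_sx = [cross_correlation(s, x, k) for k in range(3)]
r_dx = [cross_correlation(d, x, k) for k in range(3)]
print("r_sx:", r_sx)  # close to zero at every lag
print("r_dx:", r_dx)  # close to echo_path, since x is white with unit power
```

Because the near-end signal contributes almost nothing to the cross-correlation with x, the correlation domain still carries the echo path information during double-talk.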

18 The correlation LMS (CLMS) algorithm. In the correlation filter, we estimate the correlation between d(n) and x(n) by: Cross-correlation estimation error: By processing the correlation function of the input, we can continue the tap adaptation in the double-talk condition:

19 For tap adaptation, we use the steepest descent method, minimizing the MSE: MSE = E[|e(n)|^2]. Gradient search criterion: The correlation LMS (CLMS) algorithm:

20 Echo Canceler by CLMS Algorithm

21 Stability of the CLMS

22 The extended CLMS (ECLMS) algorithm. The cost function is the sum of lag-squared errors: where: R is the weight matrix and:

24 The normalized ECLMS algorithm. The gradient vector of the cost function:

25 Echo Canceler by ECLMS Algorithm

26 The recursion formulas for computing correlations practically. After copying the taps to the DF, the output is:

27 Frequency domain ECLMS algorithm. To reduce the computational complexity of ECLMS, we propose the frequency domain ECLMS (FECLMS) algorithm: Taking the FFT based on time lag k

29 where: Estimation of : The cost function: where:

30 Therefore, the FECLMS algorithm will be:

31 Normalization of FECLMS. For speech signals, we normalize the convergence factor to the power of each bin:

32 Echo canceler using FECLMS algorithm

33 Computational complexity comparison. ECLMS algorithm: FECLMS algorithm: N = 64: 8.5%; N = 128: 4.9%; N = 256: 2.7%

34 Computer Simulation. Measure for convergence: Impulse Response Estimation Ratio (IRER), defined in terms of the echo impulse response and the adaptive filter tap coefficients.

35 IRER with white noise, N=16, = 0.9

36 Normalized effect with colored noise = 1 = 0.03,

37 Direct Calculation of Auto-Correlation in the Frequency Domain

38 Estimation of Cross-Correlation

39 Direct Calculation of Cross-Correlation in the Frequency Domain

40 Reduced-Computation Structure for FECLMS Algorithm with Zero-Padding & HOL-Saved Method Called FBECLMS Algorithm

41 Comparison of FDAF & FBECLMS algorithms when using zero-padding & HOL-saved method in double-talk condition. Input white noise, double-talk condition, N=16, = 0.9

42 Convergence characteristics of FDAF algorithm in single and double-talk conditions.

43 Comparison between LMS, CLMS, FDAF, FECLMS, and proposed FBECLMS in double-talk.

44 Switching from single to double talk and comparison of various performances.

45 Switching from FDAF to FBECLMS under double talk condition.

46 Smart Acoustic Room (SAR). A SAR is a room in which the acoustic response between two (or more) points can be controlled smartly. By control, we mean obtaining a good estimate of the acoustic path between two points and then generating the appropriate signal to cancel unwanted noise or to emphasize a desired signal (speech or music).

47 Application of Smart Acoustic Room (SAR)
● When people in the same room want to listen to jazz or classical music, we do not want to use headphones, as they totally isolate the person from the surroundings.
● In a conference room or big hall, there are two kinds of audience: those who want to listen to the speech in Japanese and those who want it in English. If we can give each audience its own location, then just by sitting in the right place one can hear the desired language.
[Fig. 1: Application of SAR. A room with zones for jazz, classical, Japanese, and English]

48 SAR model. [Fig. 2: Simplified smart acoustic room (SAR) model, showing the sound source, the room, the Sound A point, and the null point]

49 SAR model for zero-enforcing (minimization of) the microphone signal. [Fig. 3: SAR using the LMS or FXLMS algorithm, showing sound source X(n), filters W1(n), W2(n), and h(n), filtered signal Z(n), speakers S1 and S2, the adaptive algorithm, and microphone M (null point) with error e(n)]

50 Algorithm
e(n) = x(n)*w1(n) + x(n)*h(n)*w2(n) ...(1)
If e(n) = 0, then X(z)W1(z) + X(z)H(z)W2(z) = 0 ...(2)
H(z) = -W1(z)/W2(z) ...(3)
LMS algorithm: h_i(n+1) = h_i(n) - 2μe(n)x(n-i) ...(4)
FXLMS algorithm: h_i(n+1) = h_i(n) - 2μe(n)z(n-i) ...(5)

51 SAR model by using the virtual microphone. [Figure: SAR using the virtual microphone, showing sound source x(n), adaptive filter h(n) with output y(n), speakers S1 and S2 with paths W1(n) and W2(n) to microphone M, and their virtual counterparts (marked with a tilde): virtual speakers S1 and S2, virtual paths, and virtual microphone M with error e(n)]

52 SAR system by using virtual microphone
e(n) = x(n)*w1(n) + x(n)*h(n)*w2(n) ...(6)
If e(n) = 0, then H(z) = -W1(z)/W2(z) ...(7)
The same, if ẽ(n) = 0, then H(z) = -W̃1(z)/W̃2(z) ...(8)
From equations (7) and (8), W1(z)/W2(z) = W̃1(z)/W̃2(z) ...(9)

53 SAR system by using virtual microphone. The virtual acoustic path can be estimated by:
w̃1_i(n+1) = w̃1_i(n) + 2μe(n)x(n-i) ...(10)
w̃2_i(n+1) = w̃2_i(n) - 2μe(n)y(n-i) ...(11)

54 Simulation results: the MSE of the SAR algorithms (LMS, FXLMS). ・ Number of executions is 100. ・ Step size of the adaptive filter is 0.01.

55 Demonstration: amplitude of the original sound

56 Demonstration: amplitude at the Sound A point

57 Demonstration: amplitude at the null point

58 So, in double-talk echo canceling (DTEC), we required second-order statistical signal processing (correlation processing). Now we turn to another problem that requires more statistical processing. This problem is called blind source separation (BSS).

59 Blind Source Separation. Suppose that we have K speakers and M microphones to pick up the audio signals. [Figure: sources s_1, s_2, ..., s_K mixed through coefficients a_11, a_12, ... into microphone signals x_1, x_2, ..., x_M]

60 Assuming instantaneous mixtures (not convolutive), we have the following relations: x_i(n) = a_i1 s_1(n) + a_i2 s_2(n) + ... + a_iK s_K(n), for i = 1, ..., M.

61 Assuming M ≥ K, and especially M = K, we can write these relations in vector and matrix form as follows: X(n) = A S(n), where X(n) = [x_1(n), ..., x_M(n)]^T, S(n) = [s_1(n), ..., s_K(n)]^T, and A = [a_ij] is the mixture matrix.

62 This problem is blind because we have no information about the sources S(n) or the mixture matrix A; we only observe the mixture signal X(n). So here we cannot have any pilot (reference) signal as in echo canceling.

63 But we do have some statistical knowledge about speech signals. Speech signals from different speakers are statistically independent, which for zero-mean signals implies E[S_i S_j] = 0 for i ≠ j. A speech signal has a super-Gaussian PDF.

64 If we have two sources and two mixtures, as in the following figures, S_1 versus S_2 and X_1 versus X_2 are drawn. [Figures: independent sources; dependent mixtures] These two figures show how the samples of the S's are spread over a wider area than the samples of the X's.

65 So we can understand that the mixture signals are more dependent than the original sources. Another important phenomenon in the mixing process follows from the central limit theorem (CLT), which says: if a set of signals are independent, with any PDF, then their sum x has a PDF that is approximately Gaussian.

66 This is an important fact that leads us to BSS. Even when two sources are mixed with non-unity coefficients, the result has a more Gaussian-shaped PDF.

67 The BSS problem seeks an un-mixing matrix W that, when applied to the mixtures x, gives a result y whose PDF is non-Gaussian: y = W x. So, if W = A^(-1), then y = s, or near to it.

68 But we do not know A, so we need to find W in some adaptive way, for instance by defining a function g(y) that, when applied to y, makes its PDF more non-Gaussian (uniform).

69 The best function could be a CDF (a monotonic function). A sigmoid function such as g(y) = 1 / (1 + e^(-y)), or “tanh”, could also be a good choice. Here, the question is how to measure “non-Gaussianity” in order to find the optimum function g(y) or the optimum un-mixing matrix W.

70 Let’s talk more statistically here. The expected value of a random process x with PDF p(x) is defined as the first moment: E[x] = ∫ x p(x) dx.

71 The variance of x is defined as the second central moment of x: σ^2 = E[(x - E[x])^2]. The fourth central moment of x is E[(x - E[x])^4].

72 If x is zero-mean, the kurtosis of x is defined as a normalized version of the fourth moment: K = E[x^4] / (E[x^2])^2 - 3. Kurtosis shows how super-Gaussian (peaky) a random signal is. If the process is Gaussian, then K = 0.

73 K > 0: super-Gaussian (e.g., speech). K = 0: Gaussian (e.g., noise). K < 0: sub-Gaussian (e.g., sawtooth). So kurtosis is a measure of (non-)Gaussianity.
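The three kurtosis classes, and the effect of mixing predicted by the central limit theorem, can be illustrated with synthetic data. This is a sketch under stated assumptions: Laplacian samples stand in for a super-Gaussian signal such as speech, uniform samples for a sub-Gaussian signal, and the mixing weights 0.6 and 0.8 are arbitrary. The function name `excess_kurtosis` is illustrative.

```python
import random


def excess_kurtosis(samples):
    """Sample estimate of K = E[(x - m)^4] / (E[(x - m)^2])^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((v - mean) ** 2 for v in samples) / n
    m4 = sum((v - mean) ** 4 for v in samples) / n
    return m4 / (m2 * m2) - 3.0


rng = random.Random(0)
n = 50000
# Laplacian samples: exponential magnitude with a random sign.
laplace = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
gauss = [rng.gauss(0.0, 1.0) for _ in range(n)]
uniform_1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
uniform_2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
# Instantaneous mixture of two independent sub-Gaussian sources.
mixture = [0.6 * a + 0.8 * b for a, b in zip(uniform_1, uniform_2)]

print("Laplacian:", excess_kurtosis(laplace))    # positive (super-Gaussian)
print("Gaussian: ", excess_kurtosis(gauss))      # near zero
print("uniform:  ", excess_kurtosis(uniform_1))  # negative (sub-Gaussian)
print("mixture:  ", excess_kurtosis(mixture))    # closer to zero than a source
```

The mixture's kurtosis lies between that of a single source and zero, which is the sense in which mixing makes signals "more Gaussian".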

74 So far, we understand that the extracted signals should be as independent as possible and, at the same time, their non-Gaussianity should be high; that is, the kurtosis should be as far from zero as possible.

75 Kurtosis is the measure of non-Gaussianity. But what is the measure of independence? The answer is “entropy”. The entropy “H” of a random process is defined as the average amount of surprise associated with an event. For an event z with probability Pr(z = 1) = p: H = -p log2(p) - (1 - p) log2(1 - p).

76 An event (coin toss) with p = 0.5 has the highest entropy, H = 1, while an event whose probability is near zero or one has entropy near the lowest value, H = 0.
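The coin-toss entropy values quoted here can be reproduced with a few lines. This minimal sketch assumes the entropy is measured in bits (base-2 logarithm), which is what H = 1 at p = 0.5 implies; the name `binary_entropy` is illustrative.

```python
import math


def binary_entropy(p):
    """Entropy in bits of an event with Pr(z = 1) = p."""
    if p in (0.0, 1.0):
        return 0.0  # the limit of p * log2(p) as p -> 0 is 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, binary_entropy(p))
```

The function is symmetric about p = 0.5 and peaks there, so the more uniform the distribution over outcomes, the higher the entropy.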

77 So entropy is a measure of uniformity (for an unbiased coin toss, p = 0.5, independence). Maximum entropy corresponds to complete uniformity (non-Gaussianity). So, one way to obtain mutually independent signals is to find an un-mixing matrix W that

78 maximizes the entropy (of a fixed nonlinear function “g”, the monotonic sigmoid function mentioned previously) of the extracted signal. The un-mixing matrix W also minimizes the mutual information. The independent signals are obtained by maximum entropy (infomax) (Bell & Sejnowski 1995). A. J. Bell and T. J. Sejnowski. A non-linear information maximization approach that performs blind separation. In Advances in Neural Information Processing Systems 7, pages 467-474. MIT Press, Cambridge, Mass., 1995.

79 In a simple example shown below, we maximize the entropy of “y = g(u)”. Let u = W.X and define “g” as a sigmoid function of “u”.

80 Then, according to the well-known theorem relating PDFs: Entropy of “y” is: Then:

81 For a kurtotic signal such as speech: minimization of mutual information = maximization of the entropy of “y”.

82 In an adaptive algorithm to find the un-mixing matrix: ( Since, is not related to W ) for

83 Then, after a lengthy calculation: So: Then:

84 In vector form: This is the basis of an algorithm called Kullback-Leibler (KL).
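For reference, the infomax learning rule published by Bell and Sejnowski for a logistic sigmoid g takes the following form. The symbols η (step size) and 1 (a vector of ones) are notation chosen here, and the slides may have used a different but equivalent form.

```latex
\begin{aligned}
\mathbf{u} &= \mathbf{W}\mathbf{x}, \qquad
\mathbf{y} = g(\mathbf{u}) = \frac{1}{1 + e^{-\mathbf{u}}},\\
\Delta\mathbf{W} &\propto \left[\mathbf{W}^{T}\right]^{-1}
  + (\mathbf{1} - 2\mathbf{y})\,\mathbf{x}^{T},\\
\mathbf{W}(n+1) &= \mathbf{W}(n) + \eta\,\Delta\mathbf{W}.
\end{aligned}
```

The first term keeps W away from singular solutions, and the second term pushes the outputs of the sigmoid toward a uniform distribution, which is the maximum-entropy condition described in the preceding slides.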

85 Now, let me briefly introduce some of our improvements and applications of BSS.

86 PDF-matched and short-term modifications to Stone’s BSS. Stone’s BSS is one of the main BSS methods based on predictability maximization. It uses two short-term predictors: the desired predictor and its opposite predictor. Only short-term predictors are used.

88 Evaluation Results (Audio)

89 Evaluation Results (Image)

90 Generalization of Stone’s BSS by Simultaneous Diagonalizations

91 Evaluation Results. The generalized method has been used by deploying the filters used in the PDF-matched method. It has been compared with Stone’s BSS, SOBI (second-order blind identification), and AMUSE (Algorithm for Multiple Unknown Signals Extraction) on speech and real image mixtures. It outperforms the others.

92 Evaluation Results. Numerical evaluation uses the G matrix index, which is based on the global matrix.

93 Speech

94 Image: Evaluation with Mutual Information. For images, we used real mixtures (window glass reflection).

95 An Efficient BSS-Based Blind MIMO-OFDM System. It can be shown that in a MIMO-OFDM system the received symbols can be expressed as a linear instantaneous mixture of the transmitted symbols at each subcarrier m. This gives N BSS problems, which yield N un-mixing matrices for the N subcarriers (0 ≤ m ≤ N − 1).

96 But even after successful separation of the symbols at each subcarrier, the recomposition of the users' signals suffers from
- permutation indeterminacy,
- amplitude scaling ambiguity, and
- phase distortion of symbols,
which are inherent to complex ICA.

97 The Proposed ICA-Based MIMO-OFDM System. In the above structure, the problems inherent to BSS have been solved successfully.

98 Image Enhancement Using Statistical Method. Let X = {X(i,j)} denote a given image composed of L discrete gray levels denoted as {X_0, X_1, ..., X_(L-1)}. For a given image X, the probability density function p(X_k) is defined as p(X_k) = n_k / n, for k = 0, 1, ..., L-1, where n_k is the number of pixels with level X_k and n is the total number of pixels.

99 For automatic image histogram equalization (HE), we are looking for a transform that is: As with the PDF theorem used earlier in the BSS (ICA) problem, we have:

100 So, the new HE image “y” is obtained as follows: This is actually the cumulative distribution function (CDF) of the random variable (original image) “x”.

101 For a digital image, HE can be obtained as below:
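For a digital image, one common discrete form maps gray level k to (L-1) times the cumulative sum of p(X_j) for j ≤ k. The sketch below assumes that form (the slide's own formula may differ in rounding or scaling); `equalize_histogram` is an illustrative name, and the image is a tiny made-up example.

```python
def equalize_histogram(image, levels):
    """Histogram-equalize a 2-D list of integer gray levels in [0, levels)."""
    counts = [0] * levels
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    total = sum(counts)

    # Cumulative distribution c(k) = sum_{j <= k} n_j / n, mapped to the
    # output range [0, levels - 1] and rounded to the nearest integer.
    mapping = []
    cumulative = 0
    for count in counts:
        cumulative += count
        mapping.append(int((levels - 1) * cumulative / total + 0.5))
    return [[mapping[pixel] for pixel in row] for row in image]


dark_image = [[0, 1], [2, 3]]  # uses only the darkest 4 of 8 levels
print(equalize_histogram(dark_image, levels=8))  # [[2, 4], [5, 7]]
```

The output spreads the four occupied levels across the full range, which is the contrast-stretching effect histogram equalization is meant to achieve.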

102 α-Rooting Method of Enhancement. The 2-D DFT is applied to the image as F(p,s) = |F(p,s)| e^(jθ(p,s)), where |F(p,s)| is the magnitude and θ(p,s) is the phase at frequency point (p,s). Operator M modifies the magnitude of the Fourier transform as M(|F(p,s)|) = |F(p,s)|^α, where α is taken from the interval (0,1).

103 α-Rooting Method of Enhancement. The components of the Fourier transform are multiplied by the coefficients |F(p,s)|^(α-1). Then the inverse 2-D DFT of the resulting data gives the enhanced image. [Figure: Image → 2-D DFT → |F|^α with unchanged phase → 2-D IDFT → Enhanced Image]
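The α-rooting pipeline (2-D DFT, magnitude scaling, inverse 2-D DFT) can be sketched as follows. This is a demonstration only: it uses a direct DFT that is far too slow for real images, the 4×4 `image` is made up, and the names `dft2` and `alpha_rooting` are illustrative. A practical implementation would use an FFT library.

```python
import cmath


def dft2(data, inverse=False):
    """Direct 2-D DFT (or inverse DFT) of a small 2-D list; demo only."""
    rows, cols = len(data), len(data[0])
    sign = 1.0 if inverse else -1.0
    out = []
    for p in range(rows):
        out_row = []
        for s in range(cols):
            acc = 0j
            for n in range(rows):
                for m in range(cols):
                    phase = sign * 2.0 * cmath.pi * (p * n / rows + s * m / cols)
                    acc += data[n][m] * cmath.exp(1j * phase)
            out_row.append(acc / (rows * cols) if inverse else acc)
        out.append(out_row)
    return out


def alpha_rooting(image, alpha):
    """Scale each DFT coefficient by |F|^(alpha - 1), keeping its phase."""
    spectrum = dft2(image)
    modified = [[f * abs(f) ** (alpha - 1.0) if abs(f) > 1e-12 else 0j
                 for f in row] for row in spectrum]
    restored = dft2(modified, inverse=True)
    return [[value.real for value in row] for row in restored]


image = [[10, 12, 11, 13],
         [10, 50, 52, 12],
         [11, 51, 49, 13],
         [12, 11, 12, 10]]
enhanced = alpha_rooting(image, alpha=0.8)
for row in enhanced:
    print([round(v, 2) for v in row])
```

With α = 1 the image is returned unchanged. With α < 1, large-magnitude components (typically the low frequencies) are attenuated more than small ones, which raises the relative weight of detail. The output usually needs rescaling to the display range afterwards.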

104 Wavelet Transform. The wavelet transform divides the image into four subbands: one with approximation coefficients (AC) and three with detail coefficients. Most of the signal energy is concentrated in the AC. [Figure: wavelet transform (WT) of the image into LL, LH, HL, and HH subbands]
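The four-subband split and the energy concentration in the approximation band can be seen with a one-level Haar transform. The Haar wavelet is an assumption made for this sketch, since the slides do not say which wavelet was used; `haar_dwt2` and `energy` are illustrative names, and the test image is a made-up smooth gradient.

```python
def haar_dwt2(image):
    """One-level 2-D Haar transform of an image with even height and width.

    Returns the subbands (LL, LH, HL, HH). Which detail band is called LH
    and which HL varies between texts; the naming here is one common choice.
    """
    rows, cols = len(image), len(image[0])
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2.0)  # approximation
            lh_row.append((a - b + d - e) / 2.0)  # differences along a row
            hl_row.append((a + b - d - e) / 2.0)  # differences along a column
            hh_row.append((a - b - d + e) / 2.0)  # diagonal differences
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def energy(band):
    """Sum of squared coefficients in a 2-D list."""
    return sum(v * v for row in band for v in row)


gradient = [[10, 11, 12, 13],
            [11, 12, 13, 14],
            [12, 13, 14, 15],
            [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(gradient)
total = energy(ll) + energy(lh) + energy(hl) + energy(hh)
print("LL share of energy:", energy(ll) / total)
```

Because the transform is orthonormal, the subband energies add up to the image energy, and for a smooth image almost all of it lands in LL.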

105 Image Enhancement Using Wavelet LL Histogram Separation

106 Original Image & Histogram

107 Histogram Equalized Image

108 Mean Value Histogram Separated Equalization

109 Wavelet LL Histogram Separation

110 Histogram Equalization in Wavelet Domain (LL)

111 Comparison between LL,HH of Wavelet and Histogram Equalization

112 Thank you very much !

