1
An Introduction to Independent Component Analysis (ICA)
吳育德 (Yu-Te Wu), 陽明大學放射醫學科學研究所 (Institute of Radiological Sciences, National Yang-Ming University), 台北榮總整合性腦功能實驗室 (Integrated Brain Function Laboratory, Taipei Veterans General Hospital)
2
The Principle of ICA: a cocktail-party problem
$x_1(t) = a_{11}s_1(t) + a_{12}s_2(t) + a_{13}s_3(t)$
$x_2(t) = a_{21}s_1(t) + a_{22}s_2(t) + a_{23}s_3(t)$
$x_3(t) = a_{31}s_1(t) + a_{32}s_2(t) + a_{33}s_3(t)$
In matrix form, $x(t) = A\,s(t)$: both the sources $s_i(t)$ and the mixing coefficients $a_{ij}$ are unknown; only the mixtures $x_i(t)$ are observed.
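As a minimal numeric sketch of this mixing model (the array names and distributions are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

s = rng.laplace(size=(3, 1000))   # three unknown source signals s_i(t)
A = rng.normal(size=(3, 3))       # unknown mixing coefficients a_ij
x = A @ s                         # the three observed mixtures x_i(t)
# ICA's task: recover s (up to order and scale) from x alone.
```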
3
Independent Component Analysis
Reference: A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis. John Wiley & Sons, 2001.
4
Central limit theorem
The distribution of a sum of independent random variables tends toward a Gaussian distribution.
Observed signal $= m_1\,\mathrm{IC}_1 + m_2\,\mathrm{IC}_2 + \cdots + m_n\,\mathrm{IC}_n$: the mixture tends toward Gaussian, while each individual IC is non-Gaussian.
5
Central Limit Theorem
Partial sum of a sequence $\{z_i\}$ of independent and identically distributed random variables: $x_k = \sum_{i=1}^{k} z_i$.
Since the mean and variance of $x_k$ can grow without bound as $k \to \infty$, consider instead of $x_k$ the standardized variable $y_k = \dfrac{x_k - E\{x_k\}}{\sqrt{\operatorname{var}(x_k)}}$.
The distribution of $y_k$ converges to a Gaussian distribution with zero mean and unit variance as $k \to \infty$.
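A quick numerical illustration (a sketch of my own, not from the slides): standardized sums of i.i.d. uniform variables rapidly look Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

def standardized_sum(k, n_samples=100_000):
    """y_k = (x_k - E{x_k}) / sqrt(var(x_k)) for x_k the sum of k i.i.d. Uniform(0,1)."""
    z = rng.uniform(0.0, 1.0, size=(n_samples, k))
    x = z.sum(axis=1)                   # realizations of the partial sum x_k
    return (x - x.mean()) / x.std()     # standardize to zero mean, unit variance

# Excess kurtosis E{y^4} - 3 tends to the Gaussian value 0 as k grows
# (a single Uniform(0,1) is sub-Gaussian, kurtosis -1.2):
for k in (1, 2, 8, 32):
    y = standardized_sum(k)
    print(k, round(np.mean(y**4) - 3, 3))
```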
6
How to estimate the ICA model
Principle for estimating the model of ICA: maximization of non-Gaussianity.
7
Measures of non-Gaussianity
Kurtosis: $\operatorname{kurt}(x) = E\{(x-\mu)^4\} - 3\,[E\{(x-\mu)^2\}]^2$
Super-Gaussian: kurtosis > 0; Gaussian: kurtosis = 0; sub-Gaussian: kurtosis < 0.
For independent $x_1$ and $x_2$: $\operatorname{kurt}(x_1 + x_2) = \operatorname{kurt}(x_1) + \operatorname{kurt}(x_2)$ and $\operatorname{kurt}(\alpha x_1) = \alpha^4\,\operatorname{kurt}(x_1)$.
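A sample-based check of these three cases (a sketch; the helper name is mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def kurt(x):
    """Sample version of E{(x-mu)^4} - 3*[E{(x-mu)^2}]^2."""
    x = x - x.mean()
    return np.mean(x**4) - 3 * np.mean(x**2) ** 2

n = 1_000_000
print(kurt(rng.laplace(size=n)))            # super-Gaussian: about +3
print(kurt(rng.normal(size=n)))             # Gaussian: about 0
print(kurt(rng.uniform(-1, 1, size=n)))     # sub-Gaussian: about -1.2
```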
8
Whitening process
Assume the measurement $x = As$ is zero mean and $E\{ss^T\} = I$. Let $D$ and $E$ be the eigenvalue and eigenvector matrices of the covariance matrix of $x$, i.e. $E\{xx^T\} = E D E^T$. Then $V = D^{-1/2} E^T$ is a whitening matrix: with $z = Vx = D^{-1/2}E^T x$,
$E\{zz^T\} = V\,E\{xx^T\}\,V^T = D^{-1/2} E^T (E D E^T) E\,D^{-1/2} = I.$
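A minimal sketch of this whitening step via eigendecomposition (the function name is mine, not from the slides):

```python
import numpy as np

def whiten(x):
    """Whiten zero-mean data x (n_channels x n_samples): z = D^{-1/2} E^T x."""
    cov = np.cov(x)                   # sample estimate of E{xx^T}
    d, e = np.linalg.eigh(cov)        # eigenvalues (D) and eigenvectors (E)
    v = np.diag(d ** -0.5) @ e.T      # whitening matrix V = D^{-1/2} E^T
    return v @ x, v
```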
9
Importance of whitening
For the whitened data z, find a vector w such that the linear combination $y = w^T z$ has maximum non-Gaussianity under the constraint $E\{(w^T z)^2\} = 1$. Since z is white, $E\{(w^T z)^2\} = w^T E\{zz^T\}\,w = \|w\|^2$, so the problem reduces to: maximize $|\operatorname{kurt}(w^T z)|$ under the simpler constraint $\|w\| = 1$.
10
Constrained Optimization
$\max F(w)$ subject to $\|w\|^2 = 1$. Form the Lagrangian $L(w,\lambda) = F(w) + \lambda(\|w\|^2 - 1)$. Setting $\frac{\partial L(w,\lambda)}{\partial w} = \frac{\partial F(w)}{\partial w} + 2\lambda w = 0$ gives $\frac{\partial F(w)}{\partial w} = -2\lambda w$: at a stable point, the gradient of F(w) must point in the direction of w, i.e. equal w multiplied by a scalar.
11
Gradient of kurtosis
$F(w) = \operatorname{kurt}(w^T z) = E\{(w^T z)^4\} - 3\,[E\{(w^T z)^2\}]^2 = E\{(w^T z)^4\} - 3\|w\|^4$ (since z is white).
$\frac{\partial F(w)}{\partial w} = 4\,E\{z\,(w^T z)^3\} - 12\,\|w\|^2 w = 4\,[\,E\{z\,(w^T z)^3\} - 3\|w\|^2 w\,]$
Taking the sign of the kurtosis into account, $\frac{\partial\,|\operatorname{kurt}(w^T z)|}{\partial w} = 4\,\operatorname{sign}(\operatorname{kurt}(w^T z))\,[\,E\{z\,(w^T z)^3\} - 3\|w\|^2 w\,]$, with expectations estimated by sample averages, $E\{y\} \approx \frac{1}{T}\sum_{t=1}^{T} y(t)$.
12
Fixed-point algorithm using kurtosis
Gradient step: $w_{k+1} = w_k + \mu\,\nabla F(w_k)$. Note that at a stable point adding the gradient to $w_k$ does not change its direction, since $\nabla F(w_k) = -2\lambda w_k$ there: $w_{k+1} = w_k - 2\mu\lambda w_k = (1 - 2\mu\lambda)\,w_k$, which normalizes back to $w_k$.
Therefore iterate directly on the fixed point: $w \leftarrow E\{z\,(w^T z)^3\} - 3\|w\|^2 w$, then $w \leftarrow w/\|w\|$.
Convergence criterion: $|w_{k+1}^T w_k| = 1$, since $w_k$ and $w_{k+1}$ are unit vectors.
13
Fixed-point algorithm using kurtosis (one-by-one estimation)
1. Centering
2. Whitening
3. Choose m, the number of ICs to estimate. Set counter p ← 1.
4. Choose an initial guess of unit norm for w_p, e.g. randomly.
5. Fixed-point iteration: let $w_p \leftarrow E\{z\,(w_p^T z)^3\} - 3 w_p$.
6. Do deflation decorrelation: $w_p \leftarrow w_p - \sum_{j=1}^{p-1} (w_p^T w_j)\,w_j$.
7. Let $w_p \leftarrow w_p / \|w_p\|$.
8. If w_p has not converged ($|w_p^T w_p^{\text{old}}| \ne 1$), go back to step 5.
9. Set p ← p + 1. If p ≤ m, go back to step 4.
A runnable sketch of steps 3 to 9 follows below.
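A compact sketch of the loop above, assuming already centered and whitened data z (the function name and defaults are mine, not from the slides):

```python
import numpy as np

def fastica_kurtosis(z, m, max_iter=200, tol=1e-10, seed=0):
    """One-by-one (deflation) FastICA on whitened data z (n x T),
    kurtosis-based fixed point: w <- E{z (w^T z)^3} - 3 w."""
    rng = np.random.default_rng(seed)
    n = z.shape[0]
    W = np.zeros((m, n))                                  # estimated rows w_p
    for p in range(m):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)                            # step 4: unit-norm guess
        for _ in range(max_iter):
            w_old = w
            w = (z * (w @ z) ** 3).mean(axis=1) - 3 * w   # step 5: fixed point
            w -= W[:p].T @ (W[:p] @ w)                    # step 6: deflation
            w /= np.linalg.norm(w)                        # step 7: renormalize
            if abs(w @ w_old) > 1 - tol:                  # step 8: converged
                break
        W[p] = w
    return W                                              # estimated sources: W @ z
```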
14
Fixed-point algorithm using negentropy
The kurtosis ($E\{x^4\} - 3$ for zero-mean, unit-variance x) is very sensitive to outliers, which may be erroneous or irrelevant observations. Example: a random variable with sample size 1000, mean 0, and variance 1 that contains one value equal to 10 has kurtosis at least $10^4/1000 - 3 = 7$.
We therefore need a more robust measure of non-Gaussianity: an approximation of negentropy.
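The slide's bound is easy to verify numerically (a sketch of my own):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
x[0] = 10.0                        # a single outlier
x = (x - x.mean()) / x.std()       # re-standardize: mean 0, variance 1
print(np.mean(x**4) - 3)           # roughly 8, above the bound 10**4/1000 - 3 = 7
```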
15
Fixed-point algorithm using negentropy
Entropy: $H(y) = -\int f(y)\,\log f(y)\,dy$
Negentropy: $J(y) = H(y_{\mathrm{gauss}}) - H(y)$, where $y_{\mathrm{gauss}}$ is a Gaussian variable with the same variance as y.
Approximation of negentropy: $J(y) \approx \sum_i k_i\,[\,E\{G_i(y)\} - E\{G_i(\nu)\}\,]^2$, with $\nu \sim N(0,1)$.
16
Fixed-point algorithm using negentropy: maximize J(y)
$w \leftarrow E\{z\,g(w^T z)\} - E\{g'(w^T z)\}\,w$, then $w \leftarrow w/\|w\|$
Convergence criterion: $|w_{k+1}^T w_k| = 1$
17
Fixed-point algorithm using negentropy (one-by-one estimation)
1. Centering
2. Whitening
3. Choose m, the number of ICs to estimate. Set counter p ← 1.
4. Choose an initial guess of unit norm for w_p, e.g. randomly.
5. Fixed-point iteration: let $w_p \leftarrow E\{z\,g(w_p^T z)\} - E\{g'(w_p^T z)\}\,w_p$.
6. Do deflation decorrelation: $w_p \leftarrow w_p - \sum_{j=1}^{p-1} (w_p^T w_j)\,w_j$.
7. Let $w_p \leftarrow w_p / \|w_p\|$.
8. If w_p has not converged, go back to step 5.
9. Set p ← p + 1. If p ≤ m, go back to step 4.
A runnable sketch follows below.
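The same deflation loop with the negentropy update, using the common FastICA nonlinearity $g(u) = \tanh(u)$ (a sketch; names and defaults are mine):

```python
import numpy as np

def fastica_negentropy(z, m, max_iter=200, tol=1e-10, seed=0):
    """Deflation FastICA on whitened z (n x T) with g(u) = tanh(u),
    so g'(u) = 1 - tanh(u)^2; update w <- E{z g(w^T z)} - E{g'(w^T z)} w."""
    rng = np.random.default_rng(seed)
    n = z.shape[0]
    W = np.zeros((m, n))
    for p in range(m):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            w_old = w
            gy = np.tanh(w @ z)                               # g(w^T z)
            w = (z * gy).mean(axis=1) - (1 - gy**2).mean() * w
            w -= W[:p].T @ (W[:p] @ w)                        # deflation decorrelation
            w /= np.linalg.norm(w)
            if abs(w @ w_old) > 1 - tol:
                break
        W[p] = w
    return W
```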
18-19
Implementations: create two uniform sources [figure]
20-21
Implementations: two mixed observed signals [figure]
22-23
Implementations: centering [figure]
24-25
Implementations: whitening [figure]
26-28
Implementations: fixed-point iteration using kurtosis [figure]
29-33
Implementations: fixed-point iteration using negentropy [figure]
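Tying the implementation slides together, a hedged end-to-end run; it assumes the whiten, fastica_kurtosis, and fastica_negentropy sketches above, the mixing matrix is an arbitrary choice of mine, and the plots shown on the slides are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Slides 18-19: create two uniform sources (zero mean, unit variance)
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 10_000))

# Slides 20-21: two mixed observed signals x = A s
A = np.array([[2.0, 3.0], [2.0, 1.0]])
x = A @ s

# Slides 22-23: centering
x = x - x.mean(axis=1, keepdims=True)

# Slides 24-25: whitening
z, V = whiten(x)

# Slides 26-33: both fixed-point variants recover the sources
# up to permutation and sign (cross-correlations near 1 and 0)
s_kurt = fastica_kurtosis(z, m=2) @ z
s_neg = fastica_negentropy(z, m=2) @ z
print(np.abs(np.corrcoef(np.vstack([s, s_kurt]))[:2, 2:]).round(2))
print(np.abs(np.corrcoef(np.vstack([s, s_neg]))[:2, 2:]).round(2))
```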
34
Fixed-point algorithm using negentropy
Entropy: $H(y) = -\int f(y)\,\log f(y)\,dy$. A Gaussian variable has the largest entropy among all random variables of equal variance.
Negentropy: $J(y) = H(y_{\mathrm{gauss}}) - H(y)$; $J(y) = 0$ for Gaussian y and $J(y) > 0$ for non-Gaussian y. Negentropy is the optimal estimator of non-Gaussianity, but it is computationally difficult since it requires an estimate of the pdf. Hence we use an approximation of negentropy.
35
Fixed-point algorithm using negentropy
High-order cumulant approximation: $J(y) \approx \frac{1}{12}\,E\{y^3\}^2 + \frac{1}{48}\,\operatorname{kurt}(y)^2$. Since it is quite common that most random variables have approximately symmetric distributions, the first term vanishes, leaving only the non-robust kurtosis.
Instead, replace the polynomial functions by any non-polynomial functions $G_i$, e.g. $G_1$ odd and $G_2$ even: $J(y) \approx k_1\,[E\{G_1(y)\}]^2 + k_2\,[\,E\{G_2(y)\} - E\{G_2(\nu)\}\,]^2$.
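A quick numeric check of the one-function form $J(y) \propto [\,E\{G(y)\} - E\{G(\nu)\}\,]^2$ with the even choice $G(u) = \log\cosh(u)$ (constants dropped; a sketch of mine, not the slides' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def G(u):
    return np.log(np.cosh(u))             # an even G (the G_2 role), common in FastICA

nu = rng.normal(size=1_000_000)           # Gaussian reference, unit variance
samples = {
    "gaussian": rng.normal(size=1_000_000),
    "uniform": rng.uniform(-np.sqrt(3), np.sqrt(3), size=1_000_000),
    "laplace": rng.laplace(scale=1 / np.sqrt(2), size=1_000_000),
}
for name, y in samples.items():           # all have mean 0, variance 1
    j = (G(y).mean() - G(nu).mean()) ** 2 # ~0 for Gaussian, > 0 otherwise
    print(name, j)
```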
36
Fixed-point algorithm using negentropy
Maximize $E\{G(w^T z)\}$ under the constraint $\|w\| = 1$. According to the Lagrange multiplier condition, the gradient must point in the direction of w: $F(w) \equiv E\{z\,g(w^T z)\} - \beta w = 0$, where $g = G'$.
37
Fixed-point algorithm using negentropy
The plain gradient iteration does not have good convergence properties here, because the non-polynomial moments do not have the same nice algebraic properties as cumulants. Instead, solve the equation by an approximate Newton method.
A true Newton method converges in few steps, but it requires a matrix inversion at every step, a large computational load.
Special properties of the ICA problem allow an approximate Newton method: no matrix inversion is needed, yet it converges in roughly the same number of steps as the true Newton method.
38
Fixed-point algorithm using negentropy
According to the Lagrange multiplier condition, the gradient must point in the direction of w: $F(w) = E\{z\,g(w^T z)\} - \beta w = 0$.
Solve this equation by Newton's method: $J_F(w)\,\Delta w = -F(w)$, i.e. $\Delta w = [J_F(w)]^{-1}\,[-F(w)]$.
The Jacobian $J_F(w) = E\{zz^T g'(w^T z)\} - \beta I \approx E\{zz^T\}\,E\{g'(w^T z)\} - \beta I = [\,E\{g'(w^T z)\} - \beta\,]\,I$ is diagonalized, since z is white.
Newton step: $w \leftarrow w - \dfrac{E\{z\,g(w^T z)\} - \beta w}{E\{g'(w^T z)\} - \beta}$. Multiplying through by $\beta - E\{g'(w^T z)\}$ and simplifying gives
$w \leftarrow E\{z\,g(w^T z)\} - E\{g'(w^T z)\}\,w$, followed by $w \leftarrow w / \|w\|$.
39
Fixed-point algorithm using negentropy
$w \leftarrow E\{z\,g(w^T z)\} - E\{g'(w^T z)\}\,w$, then $w \leftarrow w / \|w\|$
Convergence criterion: $|w_{k+1}^T w_k| = 1$, because ICs can be defined only up to a multiplicative sign: w and -w define the same component.