
1 ICA and PCA  Student: 周節  Advisor: Prof. 王聖智

2 Outline Introduction PCA ICA Reference

3 Introduction Why use these methods? A: For computational and conceptual simplicity, and because the resulting representation is more convenient to analyze. What are these methods? A: The "representation" is sought as a linear transformation of the original data. Well-known linear transformation methods include PCA, ICA, factor analysis, and projection pursuit.
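As a minimal illustration of what "a linear transformation of the original data" means (the matrix and data below are made up, not from the slides):

```python
import numpy as np

# Hypothetical example: a new representation y = W x of 2-D observations x,
# where the rows of W define the new coordinate axes.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 10))         # 2 variables, 10 samples (as columns)
W = np.array([[0.6, 0.8],
              [-0.8, 0.6]])          # some orthonormal transformation
Y = W @ X                            # the transformed "representation"
print(Y.shape)                       # (2, 10)
```

PCA and ICA differ in how they choose the transformation W.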

4 What is PCA? Principal Component Analysis It is a way of identifying patterns in data, and expressing the data in such a way as to highlight their similarities and differences. Reducing the number of dimensions

5 Example: original data
X       Y
2.5000  2.4000
0.5000  0.7000
2.2000  2.9000
1.9000  2.2000
3.1000  3.0000
2.3000  2.7000
2.0000  1.6000
1.0000  1.1000
1.5000  1.6000
1.1000  0.9000

6 Example, step (1): get some data and subtract the mean
X        Y
 0.6900   0.4900
-1.3100  -1.2100
 0.3900   0.9900
 0.0900   0.2900
 1.2900   1.0900
 0.4900   0.7900
 0.1900  -0.3100
-0.8100  -0.8100
-0.3100  -0.3100
-0.7100  -1.0100

7 Example, steps (2)-(3): get the covariance matrix, then its eigenvectors and eigenvalues
Covariance =
 0.6166  0.6154
 0.6154  0.7166
eigenvectors =
-0.7352  0.6779
 0.6779  0.7352
eigenvalues =
 0.0491  0
 0       1.2840
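A minimal NumPy sketch of steps (1)-(3) on the same ten data points; the variable names are my own, not from the slides:

```python
import numpy as np

# Original data from the example (10 samples of X and Y).
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

centered = data - data.mean(axis=0)              # (1) subtract the mean
cov = np.cov(centered, rowvar=False)             # (2) 2 x 2 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # (3) eigen-decomposition

print(cov)           # ~ [[0.6166 0.6154], [0.6154 0.7166]]
print(eigenvalues)   # ~ [0.0491, 1.2840]
print(eigenvectors)  # columns ~ [-0.7352, 0.6779] and [0.6779, 0.7352] (up to sign)
```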

8 Example
eigenvectors =
-0.7352  0.6779
 0.6779  0.7352

9 Example, step (4): choosing components and forming a feature vector
eigenvectors =
-0.7352  0.6779
 0.6779  0.7352
eigenvalues =
 0.0491  0       (call this A)
 0       1.2840  (call this B)
B is bigger, so its eigenvector is the principal component.

10 Example. Then we choose two feature vector sets:
(a) A+B (both eigenvectors), feature_vector_1 =
-0.7352  0.6779
 0.6779  0.7352
(b) Only B (the principal component), feature_vector_2 =
 0.6779  0.7352
Modified_data = feature_vector * old_data
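Continuing the sketch in NumPy (a possible implementation of the projection step; storing samples as columns and eigenvectors as rows is my assumption, chosen so that the slide's formula applies directly):

```python
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
old_data = (data - data.mean(axis=0)).T              # centered, variables as rows

# Feature vectors from step (4), eigenvectors as rows.
feature_vector_1 = np.array([[-0.7352, 0.6779],
                             [ 0.6779, 0.7352]])     # (a) both eigenvectors
feature_vector_2 = np.array([[ 0.6779, 0.7352]])     # (b) only the principal component

modified_data_1 = feature_vector_1 @ old_data        # 2 x 10: rotated data
modified_data_2 = feature_vector_2 @ old_data        # 1 x 10: reduced to one dimension
print(modified_data_1[:, 0])                         # ~ [-0.1751, 0.8280]
```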

11 Example: the transformed data using (a) feature_vector_1
X        Y
-0.1751   0.8280
 0.1429  -1.7776
 0.3844   0.9922
 0.1304   0.2742
-0.2095   1.6758
 0.1753   0.9129
-0.3498  -0.0991
 0.0464  -1.1446
 0.0178  -0.4380
-0.1627  -1.2238

12 example

13 Example: the transformed data using (b) feature_vector_2
x
 0.8280
-1.7776
 0.9922
 0.2742
 1.6758
 0.9129
-0.0991
-1.1446
-0.4380
-1.2238

14 Example, step (5): deriving the new data set from each feature vector, (a) feature_vector_1 and (b) feature_vector_2, via New_data = feature_vector_transpose * Modified_data
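A short sketch of the reconstruction step under the same assumptions as before (samples as columns, eigenvectors as rows of the feature vector):

```python
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
old_data = (data - data.mean(axis=0)).T

feature_vector_1 = np.array([[-0.7352, 0.6779], [0.6779, 0.7352]])  # both eigenvectors
feature_vector_2 = np.array([[0.6779, 0.7352]])                     # principal component only

# New_data = feature_vector_transpose * Modified_data
new_data_1 = feature_vector_1.T @ (feature_vector_1 @ old_data)  # exact reconstruction
new_data_2 = feature_vector_2.T @ (feature_vector_2 @ old_data)  # best 1-D approximation
print(new_data_1[:, 0])   # ~ [0.69, 0.49]      (the centered data, slide 15)
print(new_data_2[:, 0])   # ~ [0.5613, 0.6087]  (lies on the principal axis, slide 16)
```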

15 Example: new data derived with (a) feature_vector_1 (identical to the mean-subtracted data)
X        Y
 0.6900   0.4900
-1.3100  -1.2100
 0.3900   0.9900
 0.0900   0.2900
 1.2900   1.0900
 0.4900   0.7900
 0.1900  -0.3100
-0.8100  -0.8100
-0.3100  -0.3100
-0.7100  -1.0100

16 Example: new data derived with (b) feature_vector_2 (projection onto the principal axis only)
X        Y
 0.5613   0.6087
-1.2050  -1.3068
 0.6726   0.7294
 0.1859   0.2016
 1.1360   1.2320
 0.6189   0.6712
-0.0672  -0.0729
-0.7759  -0.8415
-0.2969  -0.3220
-0.8296  -0.8997

17 example

18 Sum Up. PCA can reduce the dimensionality of the data. It is most suitable when the data are correlated. Geometric meaning: projection onto the principal vectors.

19 What is ICA? Independent Component Analysis. It is used to separate blind (unknown) sources. Start with the "cocktail-party problem".

20 ICA. The principle of ICA: the cocktail-party problem
x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t)

21 ICA (diagram): the observed signals X1, X2, X3 are obtained from the sources S1, S2, S3 by a linear transformation.

22 Math model. Given x1(t), x2(t), x3(t), we want to find s1(t), s2(t), s3(t), where
x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t)
i.e. X = AS
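A small sketch of this mixing model with synthetic signals; the sources and the mixing matrix below are invented purely for illustration:

```python
import numpy as np

t = np.linspace(0, 1, 1000)
rng = np.random.default_rng(0)

# Hypothetical independent, non-Gaussian sources s1(t), s2(t), s3(t).
S = np.vstack([
    np.sign(np.sin(2 * np.pi * 5 * t)),   # square wave
    np.sin(2 * np.pi * 3 * t),            # sine wave
    rng.uniform(-1, 1, t.size),           # uniform noise
])

# Unknown 3 x 3 mixing matrix A; the "microphones" observe X = A S.
A = np.array([[0.6, 0.3, 0.1],
              [0.4, 0.5, 0.2],
              [0.2, 0.3, 0.7]])
X = A @ S   # only X is observed; ICA tries to recover S (and A) from X alone
```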

23 Math model. In X = AS, both A and S are unknown, so we need some assumptions: (1) the components of S are statistically independent; (2) the components of S have non-Gaussian distributions. Goal: find a W such that S = WX.

24 Theorem. By the central limit theorem, the distribution of a sum of independent random variables tends toward a Gaussian distribution. Each observed signal is such a sum, observed signal = a1 s1 + a2 s2 + ... + an sn, so it is closer to Gaussian than the individual non-Gaussian sources.
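A quick numerical illustration of this tendency (a sketch, not from the slides): the kurtosis of a normalized sum of independent uniform variables moves toward the Gaussian value of zero as more terms are added.

```python
import numpy as np

def kurt(y):
    # Kurtosis E{y^4} - 3*(E{y^2})^2; zero for a Gaussian.
    y = y - y.mean()
    return np.mean(y**4) - 3 * np.mean(y**2)**2

rng = np.random.default_rng(0)
for n in (1, 2, 5, 20):
    s = rng.uniform(-1, 1, size=(n, 100_000)).sum(axis=0)  # sum of n uniform variables
    s /= s.std()                                           # rescale to unit variance
    print(n, kurt(s))  # negative (sub-Gaussian), approaching 0 as n grows
```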

25 Theorem. Given x = As, let y = w^T x = w1 x1 + w2 x2 + ... + wn xn, and define z = A^T w, so that y = w^T A s = z^T s = z1 s1 + z2 s2 + ... + zn sn. Thus y is a weighted sum of the (non-Gaussian) sources, and it is least Gaussian precisely when only one zi is nonzero, i.e. when y equals one of the sources.

26 Theorem. So we look for a w that maximizes the non-Gaussianity of y = w^T x = w1 x1 + w2 x2 + ... + wn xn. But how do we measure non-Gaussianity?

27 Theorem. A measure of non-Gaussianity is kurtosis: kurt(y) = E{y^4} - 3*[E{y^2}]^2. As y becomes more Gaussian, kurt(y) gets much closer to zero.
Super-Gaussian: kurtosis > 0
Gaussian: kurtosis = 0
Sub-Gaussian: kurtosis < 0
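A small sketch checking these three cases numerically (the distribution choices are mine, not from the slides):

```python
import numpy as np

def kurt(y):
    # Kurtosis as on this slide: kurt(y) = E{y^4} - 3*(E{y^2})^2.
    y = y - y.mean()
    return np.mean(y**4) - 3 * np.mean(y**2)**2

rng = np.random.default_rng(0)
n = 200_000
print("super-Gaussian (Laplace):", kurt(rng.laplace(size=n)))    # > 0
print("Gaussian:                ", kurt(rng.normal(size=n)))     # ~ 0
print("sub-Gaussian (uniform):  ", kurt(rng.uniform(-1, 1, n)))  # < 0
```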

28 Steps (1) centering & whitening process (2) FastICA algorithm

29 Steps (diagram): a linear transformation maps the sources S1, S2, S3 to the observed signals X1, X2, X3 (correlated); centering & whitening maps X1, X2, X3 to Z1, Z2, Z3 (uncorrelated); FastICA then recovers estimates of S1, S2, S3 (independent).

30 example Original data

31 example (1) centering & whitening process

32 example (2) FastICA algorithm

33 example (2) FastICA algorithm

34 Sum up. ICA is a linear transformation method that minimizes the statistical dependence between the components. It can solve the problem of decomposing unknown mixed signals (Blind Source Separation).

35 Reference
Lindsay I. Smith, "A Tutorial on Principal Components Analysis", February 26, 2002.
Aapo Hyvärinen and Erkki Oja, "Independent Component Analysis: Algorithms and Applications", Neural Networks Research Centre, Helsinki University of Technology.
http://www.cis.hut.fi/projects/ica/icademo/

36 centering & Whitening process Let is zero mean Then is a whitening matrix xEDVxz T 2 1   sxA  T EDV 2 1   TTT EEVxxVzz}{}{  2 1 2 1   EDEDEED TT I  TT E xx  }{ Let D and E be the eigenvalues and eigenvector matrix of covariance matrix of x, i.e.

37 Centering & whitening process. For the whitened data z, find a vector w such that the linear combination y = w^T z has maximum non-Gaussianity under a constraint on w: maximize |kurt(w^T z)| under the simpler constraint that ||w|| = 1.
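As a brute-force illustration of this criterion (a 2-D toy only, with synthetic data; FastICA on the next slide does the search efficiently), one can scan unit vectors w and keep the one with the largest |kurt(w^T z)|:

```python
import numpy as np

def kurt(y):
    y = y - y.mean()
    return np.mean(y**4) - 3 * np.mean(y**2)**2

# Two invented independent sources, mixed and then whitened.
rng = np.random.default_rng(0)
s = np.vstack([rng.laplace(size=5000), rng.uniform(-1, 1, 5000)])
x = np.array([[1.0, 0.5], [0.3, 1.0]]) @ s
x = x - x.mean(axis=1, keepdims=True)
d, e = np.linalg.eigh(np.cov(x))
z = e @ np.diag(d ** -0.5) @ e.T @ x                 # whitened data

# Scan unit-norm vectors w(theta) = (cos theta, sin theta).
thetas = np.linspace(0, np.pi, 360)
scores = [abs(kurt(np.array([np.cos(t), np.sin(t)]) @ z)) for t in thetas]
best = thetas[int(np.argmax(scores))]
w = np.array([np.cos(best), np.sin(best)])           # maximizes |kurt(w^T z)|, ||w|| = 1
print(w, max(scores))
```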

38 FastICA
1. Centering
2. Whitening
3. Choose m, the number of ICs to estimate. Set counter p ← 1.
4. Choose an initial guess of unit norm for wp, e.g. randomly.
5. Let wp ← E{z (wp^T z)^3} − 3 wp (the kurtosis-based fixed-point update).
6. Do deflation decorrelation: wp ← wp − Σ_{j<p} (wp^T wj) wj.
7. Let wp ← wp / ||wp||.
8. If wp has not converged (|⟨wp_new, wp_old⟩| not ≈ 1), go to step 5.
9. Set p ← p + 1. If p ≤ m, go back to step 4.
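Putting the steps together, a rough sketch of deflation-based FastICA with the kurtosis update, assuming the input has already been centered and whitened as on slide 36; the function and variable names are mine, not from the slides:

```python
import numpy as np

def fastica_deflation(z, m, max_iter=200, tol=1e-6, seed=0):
    # Estimate m independent components from whitened data z (variables x samples).
    rng = np.random.default_rng(seed)
    n = z.shape[0]
    W = np.zeros((m, n))
    for p in range(m):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)                            # step 4: random unit-norm start
        for _ in range(max_iter):
            w_old = w
            w = (z * (w @ z) ** 3).mean(axis=1) - 3 * w   # step 5: kurtosis fixed-point update
            w -= W[:p].T @ (W[:p] @ w)                    # step 6: deflation decorrelation
            w /= np.linalg.norm(w)                        # step 7: renormalize
            if abs(w @ w_old) > 1 - tol:                  # step 8: |<w_new, w_old>| ~ 1
                break
        W[p] = w
    return W @ z                                          # estimated sources S = W Z

# Usage sketch, e.g. with z from the whitening snippet above:
# s_est = fastica_deflation(z, m=3)
```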

