ICA and PCA. Student: 周節. Advisor: Prof. 王聖智.
Outline: Introduction, PCA, ICA, Reference
Introduction. Why use these methods? A: For computational and conceptual simplicity, and because the data become more convenient to analyze. What are these methods? A: They seek a "representation" of the data as a linear transformation of the original data. Well-known linear transformation methods include PCA, ICA, factor analysis, and projection pursuit.
What is PCA? Principal Component Analysis. It is a way of identifying patterns in data and expressing the data so as to highlight their similarities and differences, while reducing the number of dimensions.
example: Original data

  X        Y
  2.5000   2.4000
  0.5000   0.7000
  2.2000   2.9000
  1.9000   2.2000
  3.1000   3.0000
  2.3000   2.7000
  2.0000   1.6000
  1.0000   1.1000
  1.5000   1.6000
  1.1000   0.9000
example: (1) Get some data and subtract the mean

  X        Y
  0.6900   0.4900
 -1.3100  -1.2100
  0.3900   0.9900
  0.0900   0.2900
  1.2900   1.0900
  0.4900   0.7900
  0.1900  -0.3100
 -0.8100  -0.8100
 -0.3100  -0.3100
 -0.7100  -1.0100
example: (2) Get the covariance matrix; (3) get its eigenvectors and eigenvalues

Covariance = [ 0.6166  0.6154
               0.6154  0.7166 ]

eigenvectors = [ -0.7352  0.6779
                  0.6779  0.7352 ]

eigenvalues = [ 0.0491  0
                0       1.2840 ]
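A minimal NumPy sketch of steps (1)-(3) on this data set (the variable names are illustrative, and np.linalg.eigh may return the eigenvectors with flipped signs):

```python
import numpy as np

# The 10 (X, Y) points from the example.
data = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
])

# (1) Subtract the mean of each coordinate.
centered = data - data.mean(axis=0)

# (2) Covariance matrix (columns are variables, normalized by n-1).
cov = np.cov(centered, rowvar=False)   # approx [[0.6166, 0.6154], [0.6154, 0.7166]]

# (3) Eigenvectors and eigenvalues of the covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
print(eigenvalues)    # approx [0.0491, 1.2840]
print(eigenvectors)   # columns are the eigenvectors (signs may be flipped)
```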
example: eigenvectors

[ -0.7352  0.6779
   0.6779  0.7352 ]
Example: (4) Choosing components and forming a feature vector

eigenvectors (columns A and B):
[ -0.7352  0.6779
   0.6779  0.7352 ]

eigenvalues:
[ 0.0491  0
  0       1.2840 ]

Eigenvector A has eigenvalue 0.0491; eigenvector B has eigenvalue 1.2840. B is bigger!
Example: Then we choose two feature vector sets:

(a) A+B (feature vector_1):
[ -0.7352  0.6779
   0.6779  0.7352 ]

(b) Only B, the principal component (feature vector_2):
[ 0.6779
  0.7352 ]

Modified_data = feature_vector * old_data
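A sketch of this projection step using the eigenvectors from the previous slide; the names are illustrative, and because the points are stored as rows here, the slide's feature_vector * old_data becomes old_data @ feature_vector:

```python
import numpy as np

# Mean-subtracted data from step (1), one point per row.
centered = np.array([
    [ 0.69,  0.49], [-1.31, -1.21], [ 0.39,  0.99], [ 0.09,  0.29],
    [ 1.29,  1.09], [ 0.49,  0.79], [ 0.19, -0.31], [-0.81, -0.81],
    [-0.31, -0.31], [-0.71, -1.01],
])

eigenvectors = np.array([[-0.7352, 0.6779],   # column A, column B
                         [ 0.6779, 0.7352]])

feature_vector_1 = eigenvectors               # (a) keep A and B
feature_vector_2 = eigenvectors[:, [1]]       # (b) keep only B, the principal component

# Modified_data: coordinates of each point along the chosen eigenvectors.
modified_1 = centered @ feature_vector_1      # 10 x 2
modified_2 = centered @ feature_vector_2      # 10 x 1
print(modified_1[0])   # approx [-0.1751, 0.8280]
print(modified_2[0])   # approx [0.8280]
```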
example: Modified data using (a) feature vector_1

  X        Y
 -0.1751   0.8280
  0.1429  -1.7776
  0.3844   0.9922
  0.1304   0.2742
 -0.2095   1.6758
  0.1753   0.9129
 -0.3498  -0.0991
  0.0464  -1.1446
  0.0178  -0.4380
 -0.1627  -1.2238
example
example: Modified data using (b) feature vector_2

  x
  0.8280
 -1.7776
  0.9922
  0.2742
  1.6758
  0.9129
 -0.0991
 -1.1446
 -0.4380
 -1.2238
Example: (5) Deriving the new data set from the feature vector, for (a) feature vector_1 and (b) feature vector_2:

New_data = feature_vector_transpose * Modified_data
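A sketch of step (5) for case (b), with the same illustrative names as the earlier sketches:

```python
import numpy as np

feature_vector_2 = np.array([[0.6779],
                             [0.7352]])       # principal component B only

# Modified data from step (4b): one coordinate per point.
modified_2 = np.array([[ 0.8280], [-1.7776], [ 0.9922], [ 0.2742], [ 1.6758],
                       [ 0.9129], [-0.0991], [-1.1446], [-0.4380], [-1.2238]])

# New_data = feature_vector_transpose * Modified_data: map the 1-D
# coordinates back into the original (mean-subtracted) X-Y space.
new_data = modified_2 @ feature_vector_2.T
print(new_data[0])     # approx [0.5613, 0.6087]
```

Doing the same with feature vector_1 should reproduce the mean-subtracted data exactly, as the next slide shows; adding the mean back would then recover the original coordinates.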
example: New data from (a) feature vector_1 (the mean-subtracted data is recovered exactly)

  X        Y
  0.6900   0.4900
 -1.3100  -1.2100
  0.3900   0.9900
  0.0900   0.2900
  1.2900   1.0900
  0.4900   0.7900
  0.1900  -0.3100
 -0.8100  -0.8100
 -0.3100  -0.3100
 -0.7100  -1.0100
example: New data from (b) feature vector_2

  X        Y
  0.5613   0.6087
 -1.2050  -1.3068
  0.6726   0.7294
  0.1859   0.2016
  1.1360   1.2320
  0.6189   0.6712
 -0.0672  -0.0729
 -0.7759  -0.8415
 -0.2969  -0.3220
 -0.8296  -0.8997
example
Sum Up: PCA can reduce the dimensionality of the data. It is most suitable when the variables are correlated. Geometric meaning: projection onto the principal vectors.
What is ICA? Independent Component Analysis. It is used to separate blind or unknown sources. Start with the "cocktail-party problem".
ICA. The principle of ICA: a cocktail-party problem.

x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t)
ICA: the observations X1, X2, X3 are obtained from the sources S1, S2, S3 by a linear transformation.
Math model. Given x1(t), x2(t), x3(t), we want to find s1(t), s2(t), s3(t), where

x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t)

In matrix form: X = AS
Math model. In X = AS, both A and S are unknown, so we need some assumptions: (1) the components of S are statistically independent; (2) the components of S have non-Gaussian distributions. Goal: find a W such that S = WX.
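A hedged sketch of the model X = AS; the sources and the mixing matrix below are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)

# Three independent, non-Gaussian sources (illustrative choices).
s1 = np.sign(np.sin(2 * np.pi * t / 100.0))      # square wave (sub-Gaussian)
s2 = rng.laplace(size=n)                         # Laplace noise (super-Gaussian)
s3 = np.sin(2 * np.pi * t / 23.0)                # sinusoid (sub-Gaussian)
S = np.vstack([s1, s2, s3])                      # 3 x n

# Unknown mixing matrix A (made up for the demo).
A = np.array([[0.8, 0.3, 0.5],
              [0.2, 0.9, 0.4],
              [0.6, 0.5, 0.7]])

X = A @ S    # the observed signals x1(t), x2(t), x3(t)
# The goal of ICA is to recover a W with S ~ W X from X alone.
```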
Theorem. By the central limit theorem, the distribution of a sum of independent random variables tends toward a Gaussian distribution. Thus an observed signal a1 s1 + a2 s2 + ... + an sn tends toward Gaussian, even though each source si is non-Gaussian.
Theorem. Given x = As, let y = w^T x and z = A^T w. Then

y = w^T A s = z^T s = z1 s1 + z2 s2 + ... + zn sn,

which is a weighted sum of the non-Gaussian sources and therefore tends toward Gaussian. In terms of the observed signals, y = w1 x1 + w2 x2 + ... + wn xn.
Theorem. Find a w that maximizes the non-Gaussianity of y = w^T x = w1 x1 + w2 x2 + ... + wn xn. But how do we measure non-Gaussianity?
Theorem. A measure of non-Gaussianity is kurtosis:

kurt(y) = E{ y^4 } - 3 * [ E{ y^2 } ]^2

As y approaches a Gaussian, kurt(y) approaches zero.
Super-Gaussian: kurtosis > 0. Gaussian: kurtosis = 0. Sub-Gaussian: kurtosis < 0.
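A small sketch of this kurtosis measure; the helper below is an assumption (not from the slides) and standardizes y to zero mean and unit variance first, so the formula reduces to the excess kurtosis:

```python
import numpy as np

def kurtosis(y):
    """kurt(y) = E{y^4} - 3 * (E{y^2})^2, applied to zero-mean, unit-variance y."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

rng = np.random.default_rng(0)
n = 200_000
print(kurtosis(rng.laplace(size=n)))         # super-Gaussian: clearly > 0 (about +3)
print(kurtosis(rng.normal(size=n)))          # Gaussian: close to 0
print(kurtosis(rng.uniform(-1, 1, size=n)))  # sub-Gaussian: clearly < 0 (about -1.2)

# The central-limit-theorem slide in action: a mixture of many independent
# sources has kurtosis much closer to 0 than the individual sources.
sources = rng.uniform(-1, 1, size=(10, n))
print(kurtosis(sources.sum(axis=0)))         # roughly -0.12, i.e. more Gaussian
```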
Steps (1) centering & whitening process (2) FastICA algorithm
Steps: the sources S1, S2, S3 (independent) pass through a linear transformation to give the observations X1, X2, X3 (correlated). Centering & whitening turns X1, X2, X3 into Z1, Z2, Z3 (uncorrelated), and FastICA turns Z1, Z2, Z3 into estimates of S1, S2, S3 (independent).
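For comparison, this whole pipeline is also available in libraries. A hedged sketch with scikit-learn's FastICA, reusing the made-up mixtures from the earlier sketch (one sample per row):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Rebuild the illustrative mixtures X = A S from the earlier sketch.
rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)
S = np.vstack([np.sign(np.sin(2 * np.pi * t / 100.0)),
               rng.laplace(size=n),
               np.sin(2 * np.pi * t / 23.0)])
A = np.array([[0.8, 0.3, 0.5],
              [0.2, 0.9, 0.4],
              [0.6, 0.5, 0.7]])
X = (A @ S).T                                  # shape (5000, 3), one sample per row

ica = FastICA(n_components=3, random_state=0)  # centering, whitening, FastICA
S_est = ica.fit_transform(X)                   # estimated sources, up to order/scale/sign
```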
example Original data
example (1) centering & whitening process
example (2) FastICA algorithm
example (2) FastICA algorithm
Sum up: ICA is a linear transformation method that minimizes the statistical dependence between components. It can solve the problem of decomposing unknown signals (Blind Source Separation).
Reference
"A Tutorial on Principal Components Analysis", Lindsay I. Smith, February 26, 2002.
"Independent Component Analysis: Algorithms and Applications", Aapo Hyvärinen and Erkki Oja, Neural Networks Research Centre, Helsinki University of Technology.
http://www.cis.hut.fi/projects/ica/icademo/
Centering & whitening process. Let x be zero mean, and let D and E be the eigenvalue and eigenvector matrices of the covariance matrix of x, i.e. E{x x^T} = E D E^T. Then V = E D^(-1/2) E^T is a whitening matrix: for z = V x,

E{z z^T} = V E{x x^T} V^T = E D^(-1/2) E^T (E D E^T) E D^(-1/2) E^T = I
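A sketch of this whitening step on some made-up correlated data (names illustrative); it checks that the covariance of z comes out as the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Some correlated data, one column per sample (3 x n), then centered.
A = np.array([[0.8, 0.3, 0.5],
              [0.2, 0.9, 0.4],
              [0.6, 0.5, 0.7]])
x = A @ rng.laplace(size=(3, 10000))
x = x - x.mean(axis=1, keepdims=True)          # centering

# Eigendecomposition of the covariance matrix: E{x x^T} = E D E^T.
cov = np.cov(x)
d, E = np.linalg.eigh(cov)                     # d holds the eigenvalues (diagonal of D)

# Whitening matrix V = E D^(-1/2) E^T and whitened data z = V x.
V = E @ np.diag(d ** -0.5) @ E.T
z = V @ x

print(np.round(np.cov(z), 2))                  # approximately the identity matrix
```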
Centering & whitening process (continued). For the whitened data z, find a vector w such that the linear combination y = w^T z has maximum non-Gaussianity: maximize |kurt(w^T z)| under the simpler constraint that ||w|| = 1.
FastICA
1. Centering.
2. Whitening.
3. Choose m, the number of ICs to estimate. Set the counter p <- 1.
4. Choose an initial guess of unit norm for w_p, e.g. randomly.
5. Let w_p <- E{ z (w_p^T z)^3 } - 3 w_p (the kurtosis-based fixed-point update).
6. Do deflation decorrelation: w_p <- w_p - sum over j < p of (w_p^T w_j) w_j.
7. Let w_p <- w_p / ||w_p||.
8. If w_p has not converged (|w_p^T w_p,old| not close to 1), go back to step 5.
9. Set p <- p + 1. If p <= m, go back to step 4.
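A compact sketch of this deflation scheme with the kurtosis-based update from step 5; it assumes already-whitened data z (one row per component, one column per sample) and is illustrative rather than the exact implementation behind the slides:

```python
import numpy as np

def fastica_deflation(z, m, max_iter=200, tol=1e-6, seed=0):
    """Estimate m independent components from whitened data z (shape: dims x samples)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((m, z.shape[0]))
    for p in range(m):
        w = rng.standard_normal(z.shape[0])            # step 4: random unit-norm guess
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            w_old = w
            y = w @ z
            w = (z * y ** 3).mean(axis=1) - 3 * w      # step 5: kurtosis fixed point
            w -= W[:p].T @ (W[:p] @ w)                 # step 6: deflation decorrelation
            w /= np.linalg.norm(w)                     # step 7: renormalize
            if abs(w @ w_old) > 1 - tol:               # step 8: convergence check
                break
        W[p] = w                                       # step 9: move to the next component
    return W                                           # rows of the unmixing matrix: s ~ W z
```

Applied to the whitened z from the previous sketch (e.g. fastica_deflation(z, 3)), this should recover the sources up to order, sign, and scale, which is the usual ICA indeterminacy.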