Bayesian belief networks 2. PCA and ICA

Bayesian belief networks 2. PCA and ICA. Peter Andras, andrasp@ieee.org

Principal component analysis (PCA) 1. Idea: high-dimensional data may in fact lie on a lower-dimensional surface.

PCA 2. How do we find the lower-dimensional surface? We look for linear surfaces, i.e., hyperplanes, and we decompose the correlation matrix of the data according to its eigenvectors.

PCA 3. The eigenvectors are called principal component vectors. The new data vectors are formed by the projections of the original data vectors onto the principal component vectors.

PCA 4. Let x_1, x_2, ..., x_n ∈ R^d be the data vectors. The correlation matrix is R = (1/n) Σ_{i=1}^{n} x_i x_i^T.

PCA 5. The eigenvectors are determined by the equation R v = λ v, where λ is a real number (the corresponding eigenvalue). [Figure: example with two eigenvectors.]

PCA 6. In principle we find d eigenvectors if the dimensionality of the data vectors is d. If the data vectors lie on a lower-dimensional linear surface, we find fewer than d eigenvectors with non-zero eigenvalue (equivalently, the determinant of the correlation matrix is zero).

PCA 7. If v_1, v_2, ..., v_m (m < d) are the eigenvectors of R, then the new, transformed data vector for x_i is calculated as the vector of projections (v_1^T x_i, v_2^T x_i, ..., v_m^T x_i).
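
As a concrete illustration of PCA 4-7, here is a minimal NumPy sketch; the random data, the sample size, and the choice m = 2 are illustrative, not from the slides:

```python
import numpy as np

# Illustrative data: n = 200 vectors of dimension d = 5, one per row
X = np.random.randn(200, 5)

# Correlation matrix R = (1/n) * sum_i x_i x_i^T
n = X.shape[0]
R = (X.T @ X) / n

# Eigen-decomposition of the symmetric matrix R
eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]           # re-order: largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the m most important eigenvectors and project the data onto them
m = 2
V = eigvecs[:, :m]                          # d x m matrix whose columns are v_1, ..., v_m
X_new = X @ V                               # row i is (v_1^T x_i, ..., v_m^T x_i)
```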

PCA 8. How do we calculate the eigenvectors of R? First method: use standard matrix algebra methods (this can be very laborious for high-dimensional data). Second method: iterative calculation of the eigenvectors, inspired by artificial neural networks.

PCA 9. Iterative calculation of the eigenvectors. Let w_1 ∈ R^d be a randomly chosen vector such that ||w_1|| = 1. Perform the calculation iteratively: w_1 := w_1 + η y_i (x_i - y_i w_1), where y_i = w_1^T x_i and η is a learning constant (Oja's rule). The algorithm converges to the eigenvector corresponding to the largest eigenvalue (λ_1).
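
A minimal sketch of this iterative algorithm, assuming the update is Oja's rule as written above; the learning constant eta and the number of passes over the data are illustrative choices:

```python
import numpy as np

def first_principal_vector(X, eta=0.01, n_epochs=50, seed=0):
    """Estimate the eigenvector of R with the largest eigenvalue by Oja's rule."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)                  # random start with ||w|| = 1
    for _ in range(n_epochs):
        for x in X:                         # one pass over the data vectors x_i
            y = w @ x                       # y_i = w^T x_i
            w += eta * y * (x - y * w)      # w := w + eta * y_i * (x_i - y_i * w)
    return w / np.linalg.norm(w)

# Usage (X as in the previous sketch): w1 = first_principal_vector(X)
```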

PCA 10. To calculate the subsequent eigenvectors we modify the iterative algorithm. To find w_k we use the calculation formula w_k := w_k + η y_i (x_i - Σ_{j=1}^{k} u_{ji} w_j), where y_i = w_k^T x_i and u_{ji} = w_j^T x_i (Sanger's generalized Hebbian rule). This iterative algorithm converges to w_k, the k-th eigenvector.
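
Continuing the previous sketch, a hedged implementation of the modified rule, assuming the reconstructed Sanger-type update above; W_prev is a hypothetical array holding the already-found eigenvectors as its rows:

```python
import numpy as np

def next_principal_vector(X, W_prev, eta=0.01, n_epochs=50, seed=1):
    """Estimate the k-th eigenvector, given the k-1 previous ones as rows of W_prev."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_epochs):
        for x in X:
            y = w @ x                               # y_i = w_k^T x_i
            u = W_prev @ x                          # u_ji = w_j^T x_i, j = 1..k-1
            residual = x - W_prev.T @ u - y * w     # x_i minus its projections on w_1..w_k
            w += eta * y * residual                 # the Sanger-type update from PCA 10
    return w / np.linalg.norm(w)

# Usage: w2 = next_principal_vector(X, W_prev=w1[None, :])
```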

PCA 11. If the algorithm does not converge, two situations are possible: a. the vector enters a cycle; b. the values do not form any cycle. If we have a cycle, all the vectors of the cycle are eigenvectors and their corresponding eigenvalues are very close. If we have neither convergence nor a cycle, no further eigenvector can be determined.

PCA 12. How do we use PCA for dimension reduction? Select the important eigenvectors. Often all of the eigenvectors can be determined, but only some of them are important. The importance of an eigenvector is indicated by its associated eigenvalue.

PCA 13. Selecting the important eigenvectors. 1. Graphical method: plot the eigenvalues in decreasing order and keep the eigenvectors whose eigenvalues come before the point where the curve flattens out. [Figure: plot of the eigenvalues in decreasing order.]
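
The slide's figure is not reproduced here; assuming the graphical method is the usual eigenvalue (scree) plot, a minimal matplotlib sketch might look like this (the eigenvalues are made-up examples):

```python
import numpy as np
import matplotlib.pyplot as plt

eigvals = np.array([4.1, 2.3, 0.4, 0.15, 0.05])   # example eigenvalues, sorted decreasingly

plt.plot(np.arange(1, eigvals.size + 1), eigvals, "o-")
plt.xlabel("eigenvector index k")
plt.ylabel("eigenvalue")
plt.title("Keep the eigenvectors before the curve flattens out")
plt.show()
```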

PCA 14. Selecting the important eigenvectors. 2. Relative power of the k-th eigenvector: λ_k / (λ_1 + λ_2 + ... + λ_d). 3. Cumulative power of the first k eigenvectors: (λ_1 + ... + λ_k) / (λ_1 + ... + λ_d).
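
A small sketch matching the relative- and cumulative-power formulas above; the eigenvalues and the 95% threshold are illustrative:

```python
import numpy as np

eigvals = np.array([4.1, 2.3, 0.4, 0.15, 0.05])       # example eigenvalues, sorted decreasingly

relative_power = eigvals / eigvals.sum()               # lambda_k / sum_j lambda_j
cumulative_power = np.cumsum(eigvals) / eigvals.sum()  # (lambda_1 + ... + lambda_k) / sum_j lambda_j

# Keep the smallest m whose cumulative power reaches, e.g., 95%
m = int(np.searchsorted(cumulative_power, 0.95)) + 1
print(relative_power.round(3), cumulative_power.round(3), m)
```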

PCA 15. Summary. PCA is used for dimensionality reduction. The data vectors are projected onto the eigenvectors of their correlation matrix to obtain the transformed data vectors. To calculate the PCA easily we can use the iterative algorithm. To reduce the data dimension we keep only the important eigenvectors.

Independent component analysis (ICA) 1. The idea: if the data vectors are linear combinations of statistically independent components, it should be possible to separate them into those components. This is true if the components have non-Gaussian distributions, i.e., distributions with a sharper or flatter peak than a Gaussian.

ICA 2. Suppose x_i = A s_i, where the x_i are the data vectors and the s_i are the vectors of statistically independent components (s_ji). Our goal is to find the mixing matrix A, or more precisely the rows w of its inverse, which recover the components as s_ji = w_j^T x_i. Example: the 'cocktail-party' problem: many independent voices are recorded together; the goal is to separate the individual voices; the recorded mixture is assumed to be linear.
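
A tiny synthetic 'cocktail-party' example of the model x_i = A s_i; the two source signals and the mixing matrix are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 2000)

# Two independent, non-Gaussian source signals (one "voice" per row)
S = np.vstack([np.sign(np.sin(3 * t)),           # square-ish wave
               rng.laplace(size=t.size)])        # heavy-tailed noise

A = np.array([[1.0, 0.5],                        # unknown mixing matrix
              [0.7, 1.2]])

X = A @ S                                        # observed mixtures: column i is x_i = A s_i
```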

ICA 3. How do we find the independent components? Optimize an objective that measures the (non-)Gaussianity of w^T x, for example negentropy or mutual information, subject to ||w|| = 1. All solution vectors w are local minima of this objective, and each of them corresponds to one of the independent components, i.e., to one of the components of the s_i vectors.

ICA 4. How is it done in practice? FastICA algorithm (Hyvärinen and Oja): calculates the w vectors by iteration. The fixed-point calculation formula is w := E{x g(w^T x)} - E{g'(w^T x)} w, followed by the normalization w := w / ||w||, where g is a non-linear function (e.g., g(u) = tanh(u) or g(u) = u^3). w converges to one of the vectors corresponding to one of the independent components.
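
A minimal one-unit FastICA sketch based on the fixed-point formula above, using the tanh non-linearity; it assumes the data have already been centred and whitened, a preprocessing step the slides do not mention explicitly:

```python
import numpy as np

def fastica_one_unit(X, n_iter=200, tol=1e-6, seed=0):
    """One-unit FastICA; X holds one centred, whitened observation per column (d x n)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wx = w @ X                                         # w^T x_i for every sample
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2   # non-linearity and its derivative
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w  # w := E{x g(w^T x)} - E{g'(w^T x)} w
        w_new /= np.linalg.norm(w_new)                     # re-normalize
        if abs(abs(w_new @ w) - 1.0) < tol:                # converged (up to sign)
            return w_new
        w = w_new
    return w

# Usage: w = fastica_one_unit(X_white), where X_white is the whitened mixture matrix
# (the name X_white is hypothetical, not from the slides)
```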

ICA 5. In practice we have to calculate several w vectors. To test whether the extracted components are really independent, we can use statistical tests. Let s_1i = w_1^T x_i and s_2i = w_2^T x_i. We can then test the independence of s_1 and s_2 by calculating their correlation, and test whether they have an identical origin with an F-test (two series may be only weakly correlated yet still have an identical origin). If the tests accept the independence of the two series, we may accept w_2 as a new vector corresponding to a separate independent component.
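
A minimal sketch of the correlation part of the check described above; the F-test for identical origin is omitted, and the series here are random placeholders:

```python
import numpy as np

# Placeholder component series; in practice s1 = w1 @ X and s2 = w2 @ X
rng = np.random.default_rng(0)
s1, s2 = rng.standard_normal(1000), rng.standard_normal(1000)

r = np.corrcoef(s1, s2)[0, 1]         # sample correlation between the two series
print(f"correlation = {r:.3f}")       # near zero is consistent with (not proof of) independence
```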

ICA 6. Remarks. By calculating the independent components we obtain a new representation of the data in which the components share minimal mutual information. We can use ICA to extract independent non-Gaussian components, but we cannot separate mixtures of Gaussian components (at most one Gaussian source can be recovered).