A Quick Practical Guide to PCA and ICA. Ted Brookings, UCSB Physics, 11/13/06
Blind Source Separation: Suppose we have a data set that (1) has many independent components or channels, e.g. an audio track recorded from multiple microphones, or a series of brain images with multiple voxels; (2) we believe is driven by several independent processes, e.g. different people speaking into the microphones, or different neuronal processes occurring within the brain; and (3) comes with no a priori notion of what those processes look like. Our goal is to figure out what the different processes are by grouping together data that is correlated.
Our Simple Example: The data are driven by two sine signals with different frequencies. There are 100 sample times and 200 channels: 150 are a linear combination of Signal1 and Signal2 with Poisson noise, and 50 are pure Poisson noise.
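A data set of this shape can be sketched in a few lines of NumPy. The slide does not give the frequencies, the mixing weights, or the Poisson rate, so the periods (25 and 10 samples), the weight range (0 to 5), and the noise mean (3) below are all assumed values, and the noise is assumed to be additive.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

n_times, n_channels, n_mixed = 100, 200, 150
t = np.arange(n_times)

# Two sine sources; the periods (25 and 10 samples) are assumed values.
signal1 = np.sin(2 * np.pi * t / 25.0)
signal2 = np.sin(2 * np.pi * t / 10.0)
sources = np.vstack([signal1, signal2])            # shape (2, 100)

# Assumed mixing weights: one (w1, w2) pair per driven channel.
mixing = rng.uniform(0.0, 5.0, size=(n_mixed, 2))  # shape (150, 2)

# Assumed noise model: additive Poisson counts with mean 3 on every channel.
noise = rng.poisson(3.0, size=(n_channels, n_times)).astype(float)

X = noise.copy()
X[:n_mixed] += mixing @ sources   # first 150 channels carry the two signals
# The last 50 channels are left as pure Poisson noise.
```

X has one row per channel and one column per sample time, which is the layout assumed by X = W * S.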
PCA (Principal Component Analysis): A linear transform that chooses a new basis whose vectors are perpendicular (orthogonal). The first component explains the most variance, the second component explains the most remaining variance, etc. PCA finds a weight matrix W and a set of signals S that approximate the data X: X = W * S. The columns of the weight matrix are the eigenvectors of the correlation matrix, and the eigenvalues provide the order of the components.
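The eigenvector recipe can be sketched directly with NumPy. This is a minimal illustration, not the code behind the slides: the helper name pca and the synthetic data parameters are assumptions, and it uses the covariance matrix (np.cov) of the mean-subtracted channels; standardizing each channel first would give the correlation-matrix variant.

```python
import numpy as np

def pca(X, k):
    """PCA of X (channels x times), keeping k components.

    Returns W (channels x k), S (k x times) and all eigenvalues in
    descending order, so that X - mean is approximated by W @ S.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # remove each channel's mean
    C = np.cov(Xc)                           # channels x channels
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :k]                       # weights: top-k eigenvectors
    S = W.T @ Xc                             # component time courses
    return W, S, eigvals

# Assumed synthetic data: two sines mixed into 150 of 200 noisy channels.
rng = np.random.default_rng(0)
t = np.arange(100)
sources = np.vstack([np.sin(2 * np.pi * t / 25.0),
                     np.sin(2 * np.pi * t / 10.0)])
X = rng.poisson(3.0, size=(200, 100)).astype(float)
X[:150] += rng.uniform(0.0, 5.0, size=(150, 2)) @ sources

W, S, eigvals = pca(X, k=2)
explained = eigvals[:2].sum() / eigvals.sum()  # variance share of 2 components
```

Because the eigenvectors of a symmetric matrix are orthonormal, W.T @ W is the identity, which is the "perpendicular" property of the PCA basis.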
Spelling Things Out: The meaning of the basis equation: e.g. if W_11 = 0.6 and W_12 = 0.2, then X_1 = 0.6 S_1 + 0.2 S_2. That is, X_1 is actually being generated (at least partly) by the processes S_1 and S_2. X is typically a time series; that is, X is measured at discrete intervals. However, our basis doesn’t change, because the fundamental processes at work are presumed to be constant. Because of this, W is constant in time and S changes with time. The end result of PCA is then S(t) and W, which tell us the activity of each component and how to generate the original data from the components.
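The 0.6 and 0.2 example can be checked numerically. In this small sketch the two component time courses and the second row of W are hypothetical values chosen only for illustration; the point is that W stays fixed while S varies with time.

```python
import numpy as np

t = np.arange(5)                       # five discrete sample times

# Two hypothetical component time courses: S changes with time.
S = np.vstack([np.sin(0.5 * t),
               np.cos(0.3 * t)])       # shape (2, 5)

# W is constant in time. The first row is the example from the text;
# the second row is a made-up extra channel.
W = np.array([[0.6, 0.2],
              [0.1, 0.9]])

X = W @ S                              # shape (2, 5): channels x times

# The first channel is 0.6 times component 1 plus 0.2 times component 2.
first_channel = 0.6 * S[0] + 0.2 * S[1]
```

Each row of W describes how strongly one channel is driven by each component, at every time step.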
PCA Results: Unsurprisingly, PCA discovers two dominant components. We might expect trouble here: PCA will probably go diagonal, i.e. choose axes that are combinations of the two signals.
PCA Results: Oops! The signals are mixed. But… they’re a lot cleaner, because PCA has removed a lot of Gaussian noise.
ICA (Independent Component Analysis): A linear transform that chooses a new basis, which is NOT necessarily perpendicular. The basis is chosen to be maximally independent. There is no particular ordering of the basis vectors.
Er… “Maximally Independent”? [Figure panels: Correlated; Uncorrelated.] The definition is technical and depends somewhat on the algorithm being used. Ultimately it boils down to cross-correlations, including higher-order ones. If two variables are independent, they are uncorrelated; the converse holds only in special cases, such as jointly Gaussian variables. Images from web page by Aapo Hyvärinen.
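On the link between correlation and independence, a quick numerical check is a useful caveat: zero correlation alone does not guarantee independence in general. The sketch below uses a textbook counterexample (y = x^2 with x symmetric about zero), which is an illustration chosen here and not taken from the slides.

```python
import numpy as np

# x is symmetric about zero; y is completely determined by x.
x = np.linspace(-1.0, 1.0, 1001)
y = x ** 2

# The ordinary (second-order) correlation vanishes by symmetry.
corr_xy = np.corrcoef(x, y)[0, 1]

# Yet y clearly depends on x: its average changes with the size of |x|.
y_when_x_small = y[np.abs(x) < 0.5].mean()
y_when_x_large = y[np.abs(x) >= 0.5].mean()
```

This is why ICA algorithms look at statistics beyond plain correlation when judging independence.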
Requirements: At most one of the independent components may be Gaussian-distributed. The number of independent measurements must be at least as large as the number of components: m ≥ n. E.g. at least as many microphones as voices.
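One rough way to judge how Gaussian a signal looks is its excess kurtosis, which is zero for a Gaussian distribution. This is only one of several possible non-Gaussianity measures, and the helper name excess_kurtosis and the test signals below are assumptions made for illustration.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2 - 3.0

t = np.arange(100)
sine = np.sin(2 * np.pi * t / 25.0)                       # four full periods
gaussian = np.random.default_rng(0).normal(size=100_000)  # reference sample

k_sine = excess_kurtosis(sine)          # strongly negative (sub-Gaussian)
k_gaussian = excess_kurtosis(gaussian)  # close to zero
```

A sine wave sampled over whole periods has an excess kurtosis of -1.5, so it is far from Gaussian and a reasonable candidate for ICA to find.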
ICA Results: Ick! We might have expected this, because there’s a ton of Gaussian noise in the system.
Do ICA on the Results of PCA! PCA cleans up the Gaussian noise (and reduces the dimension). Most ICA packages incorporate PCA or some other preprocessing for this reason. ICA then picks the basis that is maximally independent.
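The two-stage idea can be sketched with NumPy alone. This is a simplified illustration under several assumptions, not the code behind the slides: the data parameters are made up, the PCA step whitens the two retained components, and the ICA step is a kurtosis-based (g(y) = y^3) symmetric fixed-point iteration in the style of FastICA. The helper names pca_whiten, sym_decorrelate, and fast_ica are hypothetical.

```python
import numpy as np

def pca_whiten(X, k):
    """Project X (channels x times) onto its top-k principal components,
    rescaled so each retained component has unit variance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    top = np.argsort(eigvals)[::-1][:k]
    return (eigvecs[:, top] / np.sqrt(eigvals[top])).T @ Xc

def sym_decorrelate(B):
    """Return (B B^T)^(-1/2) B, which makes the rows of B orthonormal."""
    vals, vecs = np.linalg.eigh(B @ B.T)
    return (vecs / np.sqrt(vals)) @ vecs.T @ B

def fast_ica(Z, n_iter=200, tol=1e-10, seed=0):
    """Symmetric fixed-point ICA iteration with g(y) = y**3.

    Z must be whitened, shape (k, times). Returns an unmixing matrix B
    such that the estimated sources are B @ Z.
    """
    k, n = Z.shape
    B = sym_decorrelate(np.random.default_rng(seed).normal(size=(k, k)))
    for _ in range(n_iter):
        Y = B @ Z
        # Row-wise update: E[z g(y)] - E[g'(y)] w, with g'(y) = 3 y^2.
        B_new = (Y ** 3 @ Z.T) / n - 3.0 * (Y ** 2).mean(axis=1)[:, None] * B
        B_new = sym_decorrelate(B_new)
        # Converged when every row is unchanged up to sign.
        done = np.max(np.abs(np.abs(np.sum(B_new * B, axis=1)) - 1.0)) < tol
        B = B_new
        if done:
            break
    return B

# Assumed synthetic data: two sines mixed into 150 of 200 noisy channels.
rng = np.random.default_rng(0)
t = np.arange(100)
sources = np.vstack([np.sin(2 * np.pi * t / 25.0),
                     np.sin(2 * np.pi * t / 10.0)])
X = rng.poisson(3.0, size=(200, 100)).astype(float)
X[:150] += rng.uniform(0.0, 5.0, size=(150, 2)) @ sources

Z = pca_whiten(X, k=2)            # PCA: denoise and reduce to 2 dimensions
recovered = fast_ica(Z) @ Z       # ICA: rotate to the independent basis

# Absolute correlation of each recovered component with each true source.
match = np.abs(np.corrcoef(np.vstack([recovered, sources]))[:2, 2:])
```

The recovered components come back in arbitrary order and with arbitrary sign, which is why the comparison uses absolute correlations. For real work a maintained implementation, such as the FastICA class in scikit-learn, is likely a better choice than this sketch.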
For More Info: Check out Wikipedia (seriously). The articles on PCA and ICA are actually good, and they provide links to software packages for C++, Java, Matlab, etc. See especially FastICA. Many of the external links provide good overviews as well.
The Aftermath… Great! Now that we have what we’ve always wanted (a list of “components”), what do we do with them? Since ICA is “blind”, it doesn’t tell us much about the components. We may simply be interested in data reduction, or in categorizing the mechanisms at work. We may be interested in components that correlate with some signal that we drove the experiment with.
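Picking out the components that track a driving signal can be as simple as ranking them by correlation. In this sketch the helper name rank_by_correlation, the driving signal, and the three stand-in components are all hypothetical; in practice the components would come from PCA or ICA.

```python
import numpy as np

def rank_by_correlation(components, reference):
    """Rank components (k x times) by |Pearson correlation| with reference.

    Returns (order, corrs): component indices from best to worst match,
    and the signed correlation of every component.
    """
    corrs = np.array([np.corrcoef(c, reference)[0, 1] for c in components])
    order = np.argsort(-np.abs(corrs))
    return order, corrs

t = np.arange(100)
reference = np.sin(2 * np.pi * t / 25.0)   # hypothetical driving signal

# Hypothetical components standing in for the output of PCA or ICA.
components = np.vstack([
    np.sin(2 * np.pi * t / 10.0),                          # unrelated
    -0.8 * np.sin(2 * np.pi * t / 25.0)
        + 0.1 * np.cos(2 * np.pi * t / 7.0),               # flipped, noisy copy
    np.cos(2 * np.pi * t / 25.0),                          # same frequency, 90 degrees off
])

order, corrs = rank_by_correlation(components, reference)
best = order[0]   # index of the component that best tracks the drive
```

Ranking by absolute value matters because the sign of a component is arbitrary. A plain correlation also misses a phase-shifted copy such as the third component, so a lagged correlation may be worth considering.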