Exercise 1
Submission: Monday 6 Dec, 2010. Delayed submission: 4 points every week.
– How would you efficiently compute the PCA of data whose dimensionality d is much larger than the number of observations n?
– Download the Abalone data from the UC Irvine repository, extract the principal components, compare scatter plots of the original data with scatter plots after projecting onto the principal components, and plot the eigenvalues.
– Abalone: projections onto which principal components are most correlated with the class labels?
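One common way to approach the d much larger than n question is to work with the n x n Gram matrix instead of the d x d covariance matrix. The sketch below illustrates that idea under stated assumptions: the function name `pca_small_n`, the random test data, and the choice of k are all hypothetical and not part of the exercise.

```python
import numpy as np

def pca_small_n(X, k):
    """PCA for X of shape (n, d) with d >> n, via the n x n Gram matrix."""
    Xc = X - X.mean(axis=0)              # center each feature
    n = Xc.shape[0]
    G = Xc @ Xc.T / (n - 1)              # n x n, cheap when n << d
    evals, U = np.linalg.eigh(G)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]  # keep the k largest
    evals, U = evals[order], U[:, order]
    # If G u = lambda u, then Xc^T u is an eigenvector of the covariance
    # matrix with the same eigenvalue; normalise it to unit length.
    V = Xc.T @ U                         # shape (d, k)
    V /= np.linalg.norm(V, axis=0)
    return evals, V

# Hypothetical data: 20 observations in 500 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))
evals, V = pca_small_n(X, k=5)
projected = (X - X.mean(axis=0)) @ V     # scores on the first 5 components
```

The nonzero eigenvalues of the Gram matrix and of the covariance matrix coincide, so the eigenvalue plot requested in the exercise can be produced from `evals` directly.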
Ex1. Part 2
Submit to subject: Ex1 NC and last
1. Given high-dimensional data, is there a way to know whether all possible projections of the data are Gaussian? Explain.
– What if there is some additive Gaussian noise?
Ex1. (cont.)
2. Use Fast ICA (easily found on Google): e/dlcode.html
– Choose your favorite two songs.
– Create 3 mixing matrices and mix the songs with each one.
– Apply fastica to de-mix.
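The mix and de-mix steps above can be sketched as follows. This is a minimal NumPy re-implementation of a symmetric fixed-point ICA with the tanh nonlinearity, not the FastICA package the exercise refers to. The two random non-Gaussian signals stand in for the songs, and the three mixing matrices (one of them symmetric) are arbitrary illustrative choices; with real audio, the rows of `S` would be two equal-length sample arrays.

```python
import numpy as np

def inv_sqrt_sym(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    d, E = np.linalg.eigh(C)
    return (E / np.sqrt(d)) @ E.T

def whiten(X):
    """Center and whiten X of shape (n_signals, n_samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return inv_sqrt_sym(np.cov(Xc)) @ Xc

def sym_decorrelate(W):
    """W <- (W W^T)^(-1/2) W, which makes the rows orthonormal."""
    return inv_sqrt_sym(W @ W.T) @ W

def fastica_sym(X, n_iter=200, tol=1e-8, seed=0):
    """Sketch of symmetric fixed-point ICA; returns the estimated sources."""
    Z = whiten(X)
    m, n = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.normal(size=(m, m)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)                # g(y)
        G_prime = 1.0 - G ** 2            # g'(y)
        # Fixed-point update per row: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = (G @ Z.T) / n - G_prime.mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is unchanged up to sign.
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ Z

# Stand-ins for the two songs (assumption: independent, non-Gaussian).
rng = np.random.default_rng(1)
n = 5000
S = np.vstack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])

# Three illustrative mixing matrices; the second one is symmetric.
A_list = [
    np.array([[1.0, 0.5], [0.3, 1.0]]),
    np.array([[1.0, 0.6], [0.6, 1.0]]),
    np.array([[0.8, -0.4], [0.5, 0.9]]),
]
estimates = [fastica_sym(A @ S) for A in A_list]
```

ICA recovers sources only up to permutation, sign, and scale, so each entry of `estimates` should be compared with `S` through correlations rather than sample by sample.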
Ex1 (cont.)
Discuss the results:
– What happens when the mixing matrix is symmetric?
– Why did you get different results with different mixing matrices?
– Demonstrate that you got close to the original files.
– Try the different nonlinearities of fastica. Which one is best? Can you see that from the data?
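One possible way to demonstrate closeness to the original files is to match each original signal to its best-correlated estimate, ignoring sign and scale. The helper `match_sources` below is a hypothetical name, and the small arrays are toy data chosen only to show the call.

```python
import numpy as np
from itertools import permutations

def match_sources(S, Y):
    """Match rows of Y to rows of S by absolute correlation.

    Returns (perm, scores): Y[perm[i]] is the estimate assigned to S[i],
    and scores[i] is the absolute correlation between the two.
    """
    m = S.shape[0]
    # Cross-correlation block between true sources and estimates.
    C = np.abs(np.corrcoef(np.vstack([S, Y]))[:m, m:])
    # Exhaustive search over assignments; fine for a handful of sources.
    best = max(permutations(range(m)),
               key=lambda p: sum(C[i, p[i]] for i in range(m)))
    return best, np.array([C[i, best[i]] for i in range(m)])

# Toy data: the "estimates" are the sources swapped, rescaled, and sign-flipped.
S = np.array([[1.0, 2.0, 4.0, 3.0, 0.0],
              [0.0, 1.0, 0.0, -1.0, 2.0]])
Y = np.vstack([-3.0 * S[1], 0.5 * S[0]])
perm, scores = match_sources(S, Y)
```

Scores close to 1 for every source indicate a good separation; comparing the scores across mixing matrices and nonlinearities gives a simple quantitative basis for the discussion.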
Ex1 - Final Task
Create a BCM learning rule that can be plugged into the Fast ICA algorithm of Hyvarinen.
– Run it on multimodal distributions as well as other distributions.
– It should run like the regular Fast ICA, but with a new option for the BCM rule.
– Demonstrate how low the Fisher score can go while you still get separation.
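The slide does not define the Fisher score. The sketch below assumes the usual two-class Fisher criterion along a one-dimensional projection, (difference of means)^2 / (sum of variances), and a hypothetical generator `make_bimodal` for two unit-variance Gaussian clusters whose separation can be lowered step by step.

```python
import numpy as np

def fisher_score(a, b):
    """Two-class Fisher criterion for 1-D samples a and b (assumed definition)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

def make_bimodal(delta, n=2000, seed=0):
    """Two unit-variance Gaussian clusters whose means are delta apart."""
    rng = np.random.default_rng(seed)
    a = rng.normal(-delta / 2.0, 1.0, size=n)
    b = rng.normal(+delta / 2.0, 1.0, size=n)
    return a, b

# Exact toy case: means 1 and 5, variances 1 and 1, so the score is 16 / 2.
score = fisher_score([0.0, 2.0], [4.0, 6.0])

# Sweep the separation; the expected score is roughly delta**2 / 2.
sweep = {delta: fisher_score(*make_bimodal(delta)) for delta in (4.0, 2.0, 1.0)}
```

Running the BCM-based separation on data generated for each `delta` and recording where it stops separating the clusters would give the lowest workable Fisher score.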