Clustering Gene Expression Data Using Independent Component Analysis

Stephen C. Billups
University of Colorado at Denver, Department of Mathematics

Larry Hunter
University of Colorado Health Sciences Center, Department of Pharmacology
x = As + ν + noise
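As a rough illustration of the model above, the following sketch draws synthetic data from x = As + ν + noise. It is not the authors' code. The assumptions are: ν is treated as a constant offset vector, the hidden sources s follow a Laplace distribution (one convenient non-Gaussian choice), the noise is isotropic Gaussian, and the function name `sample_ica_model`, the matrix `A`, and all numeric values are hypothetical.

```python
import numpy as np

def sample_ica_model(A, nu, n_samples, noise_std=0.1, seed=0):
    """Draw samples from x = A s + nu + noise (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    n_features, n_sources = A.shape
    # Assumed non-Gaussian sources: independent Laplace variables, one column per sample.
    S = rng.laplace(loc=0.0, scale=1.0, size=(n_sources, n_samples))
    # Assumed isotropic Gaussian observation noise.
    noise = rng.normal(0.0, noise_std, size=(n_features, n_samples))
    # nu is broadcast across samples as a constant offset.
    X = A @ S + nu[:, None] + noise
    return X, S

# Hypothetical mixing matrix: 3 observed expression measurements, 2 hidden effects.
A = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, -0.3]])
nu = np.array([2.0, -1.0, 0.5])
X, S = sample_ica_model(A, nu, n_samples=5000)
```

Each column of `X` is one observed expression vector, and the matching column of `S` holds the hidden effects that generated it.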
Key Points

- ICA clustering attractive for gene expression data:
  - Accounts for and identifies independent hidden effects that influence gene expression.
  - Allows clusters with markedly different shapes and dimensionalities to be identified.
  - Bayesian approach allows prior knowledge to be incorporated (semi-supervised learning).
- Algorithm works only when underlying effects have non-Gaussian distributions.
- The algorithm is made tractable by using a variational Bayesian method with some sensible simplifications.
- Opportunities exist for improving the computational efficiency.
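The non-Gaussianity requirement can be made concrete with a simple diagnostic. If the hidden effects were Gaussian, any rotation of them would have the same joint distribution, so the mixing matrix could only be recovered up to a rotation. The sketch below uses excess kurtosis as one common measure of departure from Gaussianity. It is an illustration only and not part of the authors' algorithm; the function name `excess_kurtosis` and the sample sizes are hypothetical choices.

```python
import numpy as np

def excess_kurtosis(v):
    """Sample excess kurtosis: close to 0 for Gaussian data, positive for heavy tails."""
    v = np.asarray(v, dtype=float)
    z = (v - v.mean()) / v.std()
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(0)
gauss = rng.normal(size=200_000)     # Gaussian reference sample
laplace = rng.laplace(size=200_000)  # heavy-tailed, non-Gaussian sample

k_gauss = excess_kurtosis(gauss)
k_laplace = excess_kurtosis(laplace)
print(f"Gaussian: {k_gauss:.2f}, Laplace: {k_laplace:.2f}")
```

The Gaussian sample should give a value near zero, while the Laplace sample gives a clearly positive value (its theoretical excess kurtosis is 3). Sources with a value near zero would be poor candidates for an ICA-based method.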