K-means Clustering via Principal Component Analysis (Chris Ding and Xiaofeng He, ICML 2004)
03 March 2011, Kwak, Namju
Overview
By adopting the PCA technique while performing K-means clustering, we can derive upper and lower bounds on the K-means clustering objective function.
PCA-guided K-means clustering
– Before performing K-means clustering, obtain the cluster membership indicator vectors q_i.
– Determine the members of each cluster C_i from q_i.
– Calculate the centroids m_i from the C_i.
– With the m_i as initial centroids, run K-means clustering until convergence.
Cluster centroid subspace
– Project the data points onto the dimension-reduced subspace, then perform clustering.
Connectivity analysis
– Determine cluster membership without K-means iterations.
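The PCA-guided initialization above can be sketched as follows. `pca_guided_kmeans` is a hypothetical helper, not the authors' code; it shows the 2-way case, where the sign of the first principal coordinate supplies the initial membership before standard K-means runs to convergence:

```python
import numpy as np

def pca_guided_kmeans(X, K=2, n_iter=100):
    """PCA-guided K-means sketch (hypothetical helper, 2-way case).

    The sign of the first principal coordinate provides the initial
    membership; standard K-means then runs to convergence.
    Assumes both initial clusters are non-empty.
    """
    Y = X - X.mean(axis=0)                       # center the data
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    proj = Y @ Vt[0]                             # first principal coordinate
    labels = (proj > 0).astype(int)              # initial 2-way membership
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        new_labels = d.argmin(axis=1)
        if (new_labels == labels).all():
            break
        labels = new_labels
        centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
    return labels, centroids
```

On well-separated data the PCA-based split is already close to the final partition, so the subsequent K-means iterations converge quickly.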
Introduction
2-way Clustering
Theorem. For K=2, minimization of the K-means clustering objective function J_K is equivalent to maximization of the distance objective J_D, which is always positive.
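The theorem can be checked numerically on a small dataset by enumerating all 2-way partitions: the partition minimizing J_K is exactly the one maximizing J_D, and J_K + J_D/2 is constant across partitions. The helper names below are illustrative, not from the paper:

```python
import numpy as np
from itertools import product

def J_K(X, labels):
    # K-means objective: within-cluster sum of squared distances to centroids
    return sum(((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum()
               for k in np.unique(labels))

def J_D(X, labels):
    # distance objective for a 2-way partition, via average squared distances
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    a, b = labels == 0, labels == 1
    n1, n2, n = a.sum(), b.sum(), len(X)
    return (n1 * n2 / n) * (2 * D[np.ix_(a, b)].mean()
                            - D[np.ix_(a, a)].mean()
                            - D[np.ix_(b, b)].mean())
```

Fixing `labels[0] == 0` removes the 0/1 relabeling symmetry, so each partition is enumerated once.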
2-way Clustering
K-way Clustering
K-way Clustering
Then Q_K = (q_1, ..., q_K) = H_K T, where H_K = (h_1, ..., h_K) collects the cluster indicator vectors. There is a redundancy in H_K: the indicators always satisfy sum_k sqrt(n_k) h_k = e, the vector of all ones. Remove this redundancy by (a) performing a linear transformation T into the q_k's, where T = (t_ij) is a K×K orthonormal matrix (T^T T = I), and (b) requiring that the last column of T is t_K = (sqrt(n_1/n), ..., sqrt(n_K/n))^T. Therefore, always, q_K = sum_k sqrt(n_k/n) h_k = e/sqrt(n).
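The construction can be made concrete: build the indicator matrix H_K for a toy labeling, complete t_K to an orthonormal T via QR, and verify that the last transformed column equals e/sqrt(n). This is an illustrative sketch, not the paper's code:

```python
import numpy as np

# Toy labeling: n = 6 points in K = 3 clusters (illustrative, not from the paper).
labels = np.array([0, 0, 0, 1, 1, 2])
n, K = len(labels), 3
nk = np.bincount(labels)

H = np.zeros((n, K))
for i, k in enumerate(labels):
    H[i, k] = 1.0 / np.sqrt(nk[k])              # h_k = indicator / sqrt(n_k)

# Complete t_K = (sqrt(n_1/n), ..., sqrt(n_K/n)) to an orthonormal basis via QR,
# then reorder so that t_K is the LAST column of T.
tK = np.sqrt(nk / n)
Q, _ = np.linalg.qr(np.column_stack([tK, np.eye(K)[:, :K - 1]]))
if np.dot(Q[:, 0], tK) < 0:                     # fix QR's sign ambiguity
    Q[:, 0] = -Q[:, 0]
T = np.column_stack([Q[:, 1:], Q[:, 0]])

QK = H @ T                                      # transformed indicators q_k
```

Because the columns of H are orthonormal and T is orthonormal, the q_k are orthonormal as well.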
K-way Clustering
The mutual orthogonality of the h_k implies that of the q_k. When the data are centered, q_K^T Y^T Y q_K = 0. Now the K-means objective can be written as J_K = Tr(Y^T Y) - Tr(Q_{K-1}^T Y^T Y Q_{K-1}), where Y is the centered data matrix; J_K does not distinguish the original data {x_i} and the centered data {y_i}. Optimization of J_K becomes the maximization of Tr(Q_{K-1}^T Y^T Y Q_{K-1}).
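The trace form of the objective is easy to verify numerically for a fixed partition: the within-cluster sum of squares on the original data equals Tr(G) - Tr(H^T G H), where G is the centered Gram matrix. Toy data and labeling below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))                    # toy data, rows = points
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
K = 3
nk = np.bincount(labels)

Y = X - X.mean(axis=0)                          # centered data
G = Y @ Y.T                                     # centered Gram matrix, n x n

H = np.zeros((len(X), K))
for i, k in enumerate(labels):
    H[i, k] = 1.0 / np.sqrt(nk[k])

# J_K computed directly on the ORIGINAL data (centering does not change it) ...
jk_direct = sum(((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum()
                for k in range(K))
# ... equals the trace form Tr(G) - Tr(H^T G H) on the centered data.
jk_trace = np.trace(G) - np.trace(H.T @ G @ H)
```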
K-way Clustering
Cluster Centroid Subspace
Cluster Centroid Subspace
Cluster Centroid Subspace
Proposition. In the cluster centroid subspace, between-cluster distances remain nearly as large as in the original space, while within-cluster distances are reduced.
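The proposition can be illustrated for K=2, where the centroid subspace is simply the line through the two cluster centroids; the toy 5-D data below is an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, (50, 5))               # cluster 1 (toy 5-D data)
B = rng.normal(0.0, 1.0, (50, 5))
B[:, 0] += 4.0                                  # cluster 2, shifted along axis 0
mA, mB = A.mean(axis=0), B.mean(axis=0)

# For K = 2 the cluster centroid subspace is the line through the two centroids.
u = (mB - mA) / np.linalg.norm(mB - mA)
P = np.outer(u, u)                              # orthogonal projector onto that line

between = np.linalg.norm(mA - mB)               # between-cluster distance
between_proj = np.linalg.norm(P @ (mA - mB))    # unchanged: centroids lie in subspace
within = ((A - mA) ** 2).sum()                  # within-cluster scatter
within_proj = (((A - mA) @ P) ** 2).sum()       # shrinks after projection
```

The centroids lie inside the subspace, so the between-cluster distance is preserved exactly here, while the within-cluster scatter loses all the noise variance orthogonal to the centroid line.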
Kernel K-means Clustering and Kernel PCA
K-means clustering can be viewed as using the standard dot product (Gram matrix). Thus it can easily be extended to any other kernel, with kernel matrix W = (w_ij), w_ij = φ(x_i)·φ(x_j).
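A minimal kernel K-means sketch that operates only on the kernel matrix, expanding ||φ(x_i) - m_k||^2 through kernel entries. The function name and the warm-start `init` parameter are assumptions for illustration, not the paper's algorithm listing:

```python
import numpy as np

def kernel_kmeans(Kmat, K=2, init=None, n_iter=50, seed=0):
    """Minimal kernel K-means sketch operating only on the kernel matrix.

    The `init` warm-start parameter is an assumption for illustration.
    """
    n = len(Kmat)
    labels = (np.array(init) if init is not None
              else np.random.default_rng(seed).integers(0, K, size=n))
    for _ in range(n_iter):
        d = np.full((n, K), np.inf)
        for k in range(K):
            idx = labels == k
            nk = idx.sum()
            if nk == 0:
                continue
            # ||phi(x_i) - m_k||^2 expanded through the kernel matrix
            d[:, k] = (np.diag(Kmat)
                       - 2.0 * Kmat[:, idx].sum(axis=1) / nk
                       + Kmat[np.ix_(idx, idx)].sum() / nk ** 2)
        new_labels = d.argmin(axis=1)
        if (new_labels == labels).all():
            break
        labels = new_labels
    return labels
```

With an RBF kernel the centroids are never formed explicitly; all distances come from the kernel matrix alone.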
Kernel K-means Clustering and Kernel PCA
Recovering K Clusters
Once the K-1 principal components q_k are computed, how do we recover the non-negative cluster indicators h_k, and therefore the clusters themselves? The key is to compute the orthonormal transformation T.
Theorem. The linear transformation T is formed by the K eigenvectors of a matrix Γ whose entries are specified by coefficients α_ij, where the α_ij are K(K-1)/2 arbitrary positive numbers that sum to one.
Connectivity Analysis
Experiment
4029 gene-expression profiles on human lymphoma are classified into 9 classes. 200 of the 4029 genes are selected based on the F-statistic, and the 3 smallest classes are ignored (K=6). The cluster structure is embedded in the first K-1 = 5 principal components, so we perform K-means clustering in this 5-dimensional eigenspace.
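The selection-then-projection pipeline can be sketched on synthetic stand-in data (the real lymphoma data is not reproduced here); `f_statistic` is an assumed helper implementing the one-way ANOVA F ratio:

```python
import numpy as np

def f_statistic(x, y, K):
    """One-way ANOVA F ratio of one feature x against class labels y (assumed helper)."""
    n, grand = len(x), x.mean()
    between = sum((y == k).sum() * (x[y == k].mean() - grand) ** 2
                  for k in range(K)) / (K - 1)
    within = sum(((x[y == k] - x[y == k].mean()) ** 2).sum()
                 for k in range(K)) / (n - K)
    return between / within

# Synthetic stand-in for the gene-expression setup: select the top features by
# F-statistic, then keep the first K-1 principal coordinates for clustering.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 10)                 # 3 classes, 10 samples each
X = rng.normal(size=(30, 50))                   # 50 "genes"
X[:, 0] += 3.0 * y                              # make feature 0 class-informative
F = np.array([f_statistic(X[:, j], y, 3) for j in range(X.shape[1])])
top = np.argsort(F)[::-1][:5]                   # keep the 5 most discriminative
Y = X[:, top] - X[:, top].mean(axis=0)
_, _, Vt = np.linalg.svd(Y, full_matrices=False)
Z = Y @ Vt[:2].T                                # first K-1 = 2 principal coordinates
```

K-means would then run on the reduced coordinates `Z`, mirroring the 5-dimensional eigenspace used in the paper's experiment.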
Experiment
Confusion matrix B = (b_kl):
– b_kl = number of samples clustered into class k but actually belonging to class l.
– Clustering accuracy.
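A sketch of the confusion matrix, with an accuracy computed by matching each cluster to its dominant true class; the slide omits the exact accuracy formula, so the dominant-class rule here is an assumption:

```python
import numpy as np

def confusion_and_accuracy(pred, true, K):
    """b[k, l] = samples clustered into k but belonging to l (assumed helper)."""
    B = np.zeros((K, K), dtype=int)
    for p, t in zip(pred, true):
        B[p, t] += 1
    # Simplified accuracy: match each cluster to its dominant true class.
    # (This rule can double-count a class; the slide's exact formula is not shown.)
    acc = B.max(axis=1).sum() / len(true)
    return B, acc
```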