Published by Silas Booker. Modified over 9 years ago.
1
Iterative K-Means Algorithm Based on Fisher Discriminant. Mantao Xu, University of Joensuu, Department of Computer Science, Joensuu, Finland. To be presented at: Information Fusion 2004.
2
Problem Formulation. Given N data samples X = {x_1, x_2, …, x_N}, construct the codebook C = {c_1, c_2, …, c_M} such that the mean-square error is minimized. The class membership p(i) is the index of the code vector to which sample x_i is assigned.
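For reference, the usual way to write this objective is given here, in the slide's symbols. This is the standard k-means formulation and is an assumption about what the slide showed, not a copy of it; the partition symbol P is introduced only for illustration.

```latex
\[
\mathrm{MSE}(C, P) \;=\; \frac{1}{N} \sum_{i=1}^{N} \bigl\lVert x_i - c_{p(i)} \bigr\rVert^{2},
\qquad
p(i) \;=\; \arg\min_{1 \le j \le M} \bigl\lVert x_i - c_j \bigr\rVert^{2}.
\]
```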
3
Traditional K-Means Algorithm. Iterations of two steps: first, assignment of a class label to each data vector; then, computation of each cluster centroid by averaging all data vectors assigned to it. Characteristics: randomized initial partition or codebook; convergence to a local minimum; use of L2, L1 and L∞ distances; fast and easy implementation. Extensions: kernel K-Means algorithm; EM algorithm; K-median algorithm.
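The two steps can be sketched in a few lines of NumPy. This is a minimal illustration with the L2 distance, not code from the presentation; the function name `kmeans` and the optional `init` argument are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, init=None, n_iter=100, seed=0):
    """Plain K-Means with squared L2 distance (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Randomized initial codebook: k distinct data vectors.
        rng = np.random.default_rng(seed)
        C = X[rng.choice(len(X), size=k, replace=False)].copy()
    else:
        C = np.array(init, dtype=float)
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Step 1: assign each vector to its nearest centroid.
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition unchanged: a local minimum is reached
        labels = new_labels
        # Step 2: recompute each centroid as the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                C[j] = members.mean(axis=0)
    mse = ((X - C[labels]) ** 2).sum(axis=1).mean()
    return labels, C, mse

# Four points at the corners of a 10 x 1 rectangle.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
_, _, mse_good = kmeans(X, 2, init=X[[0, 2]])  # one seed per side
_, _, mse_bad = kmeans(X, 2, init=X[[0, 1]])   # both seeds on the left side
print(mse_good, mse_bad)
```

On this toy data the first initialization converges to the left/right split (MSE 0.25), while the second gets stuck in the top/bottom split (MSE 25.0), which illustrates why the choice of initial partition matters.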
4
Motivation. Investigation of a clustering algorithm that: iteratively performs the regular K-Means algorithm in search of a solution close to the global optimum; estimates, at each iteration, an initial partition close to the optimal solution; applies a dissimilarity function based on the current partition instead of the L2 distance.
5
Selecting the K-Means initial partition. The initial partition is selected by Fisher discriminant analysis and dynamic programming: a suboptimal partition is estimated by dynamic programming in a one-dimensional subspace of the feature space; the one-dimensional subspace is constructed through linear multi-class Fisher discriminant analysis; the output classes of K-Means in each iteration serve as the input classes of the discriminant analysis in the next iteration.
6
The multi-class Fisher discriminant analysis. The separation of the input classes in the discriminant direction w can be measured by the F-ratio validity index, F(w). The multi-class linear Fisher discriminant w is the direction that minimizes the F-ratio validity index.
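A minimal sketch of how such a direction can be computed follows. It assumes that F(w) is the ratio of within-class to between-class scatter of the data projected onto w, so that minimizing F(w) amounts to taking the leading eigenvector of S_W^{-1} S_B. That definition and the names `fisher_direction` and `f_ratio` are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def fisher_direction(X, labels):
    """Direction w minimizing within/between scatter of the projection X @ w."""
    d = X.shape[1]
    grand_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter matrix
    Sb = np.zeros((d, d))  # between-class scatter matrix
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - grand_mean, mc - grand_mean)
    # Leading eigenvector of pinv(Sw) @ Sb maximizes between/within scatter,
    # i.e. minimizes the assumed F-ratio. pinv guards against a singular Sw.
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)

def f_ratio(X, labels, w):
    """Assumed F-ratio: within-class over between-class scatter along w."""
    z = X @ w
    within = sum(((z[labels == c] - z[labels == c].mean()) ** 2).sum()
                 for c in np.unique(labels))
    between = sum((labels == c).sum() * (z[labels == c].mean() - z.mean()) ** 2
                  for c in np.unique(labels))
    return within / between

# Two classes separated along the first axis, noisy along the second.
rng = np.random.default_rng(0)
A = rng.normal([0.0, 0.0], [0.5, 3.0], size=(200, 2))
B = rng.normal([5.0, 0.0], [0.5, 3.0], size=(200, 2))
X = np.vstack([A, B])
labels = np.repeat([0, 1], 200)
w = fisher_direction(X, labels)
```

In this example the classes differ only along the first coordinate, so the recovered direction should lie close to that axis.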
7
The dynamic programming in the discriminant direction. The optimal convex partition Q_k = {(q_{j-1}, q_j] | j = 1, …, n} in the discriminant direction w can be estimated by dynamic programming, either in terms of the MSE distortion on the discriminant subspace or in terms of the MSE distortion on the original feature space. The slide numbers these two criteria (1) and (2).
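The idea can be sketched for the simpler case in which the distortion is measured on the one-dimensional projection itself. The code below is a textbook O(k n^2) dynamic program over the sorted projected values, not the authors' implementation; the name `dp_partition_1d` is hypothetical.

```python
import numpy as np

def dp_partition_1d(values, k):
    """Split 1-D values into k contiguous intervals with minimum total squared error."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    v = values[order]
    n = len(v)
    # Prefix sums give the squared error of any run v[a:b] in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(v)))
    s2 = np.concatenate(([0.0], np.cumsum(v * v)))

    def sse(a, b):
        s = s1[b] - s1[a]
        return (s2[b] - s2[a]) - s * s / (b - a)

    # cost[j, b]: best error when the first b sorted values form j intervals.
    cost = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for j in range(1, k + 1):
        for b in range(j, n + 1):
            for a in range(j - 1, b):
                c = cost[j - 1, a] + sse(a, b)
                if c < cost[j, b]:
                    cost[j, b] = c
                    back[j, b] = a
    # Backtrack the interval boundaries and map labels to the input order.
    sorted_labels = np.empty(n, dtype=int)
    b = n
    for j in range(k, 0, -1):
        a = back[j, b]
        sorted_labels[a:b] = j - 1
        b = a
    labels = np.empty(n, dtype=int)
    labels[order] = sorted_labels
    return labels, cost[k, n]

# Projected values forming three obvious groups, given in arbitrary order.
z = [5.0, 0.0, 10.0, 0.1, 5.1, 9.9, 0.2]
labels, total_sse = dp_partition_1d(z, 3)
```

Applied to the projections `X @ w`, the returned labels can serve as an initial partition for K-Means. Measuring the distortion in the original feature space instead would require replacing `sse` with a cost computed on the full vectors.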
8
Application of Delta-MSE Dissimilarity. Figure labels: vectors x_1, x_2, x_3, x_4, y_1, y_2, y_3 and clusters G_1, G_2. Delta-MSE(x_4, G, G_1) = RemovalVariance; Delta-MSE(x_4, G, G_2) = AddVariance. Moving a vector x from cluster i to cluster j changes the MSE function [10]; Delta-MSE measures this change.
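The size of this change has a well-known closed form for the total squared error, sketched below. Removing x from a cluster with n_i members and centroid c_i lowers the error by n_i/(n_i - 1) * ||x - c_i||^2, and adding x to a cluster with n_j members and centroid c_j raises it by n_j/(n_j + 1) * ||x - c_j||^2. Reading these two terms as the slide's RemovalVariance and AddVariance is an assumption, and the function name `delta_sse` is hypothetical.

```python
import numpy as np

def delta_sse(x, c_from, n_from, c_to, n_to):
    """Change in total squared error when x moves between two clusters.

    c_from, n_from: centroid and size of the source cluster (x included, n_from > 1).
    c_to, n_to: centroid and size of the target cluster (x not included).
    Divide by the number of samples N to get the change in MSE.
    """
    removal = n_from / (n_from - 1) * np.sum((x - c_from) ** 2)
    addition = n_to / (n_to + 1) * np.sum((x - c_to) ** 2)
    return addition - removal

def sse(points):
    """Total squared error of a cluster around its own centroid."""
    points = np.asarray(points, dtype=float)
    return ((points - points.mean(axis=0)) ** 2).sum()

# Brute-force check on a small example.
A = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])   # source cluster
B = np.array([[10.0, 0.0], [12.0, 0.0]])             # target cluster
x = A[2]
predicted = delta_sse(x, A.mean(axis=0), len(A), B.mean(axis=0), len(B))
before = sse(A) + sse(B)
after = sse(A[:2]) + sse(np.vstack([B, x]))
```

Here `predicted` agrees with `after - before`, so the effect of a move can be evaluated without recomputing either cluster.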
9
Pseudocode of the iterative K-Means algorithm
10
Four K-Means algorithms conducted in experimental tests. K-D tree based K-Means selects its initial cluster centroids from the k-bucket centers of a kd-tree structure that is recursively built by principal component analysis. PCA based K-Means is an intuitive approach that estimates a suboptimal initial partition by applying dynamic programming in the principal component direction. LFD-I is the proposed iterative K-Means algorithm based on dynamic programming criterion (1). LFD-II is the proposed iterative K-Means algorithm based on dynamic programming criterion (2).
11
Comparisons of the four K-Means algorithms. Table 1: Performance comparison (in F-ratio validity indices) of the four K-Means algorithms for the practical numbers of clusters.
12
F-ratios produced by the four K-Means clusterings on the dataset glass
13
F-ratios produced by the four K-Means clusterings on the dataset heart
14
F-ratios produced by the four K-Means clusterings on the dataset image
15
Conclusions. A new approach to the k-center clustering problem that iteratively incorporates Fisher discriminant analysis and the dynamic programming technique. The proposed approach in general outperforms the two other algorithms: the PCA based K-Means algorithm and the kd-tree based K-Means algorithm. The classification performance gain of the proposed approach over the two others increases with the number of clusters.
16
Further Work. Solving the k-center clustering problem by iteratively incorporating kernel Fisher discriminant analysis and the dynamic programming technique. Solving the k-center clustering problem by incorporating the kernel PCA technique and the dynamic programming technique.