r 朱冠宇 r 李哲君 2012/04/17
Hierarchical K-means
K-means can be used in many applications: market research, social network analysis, recommendation systems, and image retrieval systems.
Data: N * d, find K centroids.
Find the nearest centroid: sub: N * d * K, mul: N * d * K, add: N * d * K, min: O(K) per point.
Calculate new centroids: add: N * d, div: K * d.
Each iteration of flat K-means therefore costs O(N * d * K), which grows quickly with K. Hierarchical K-Means (HKM) was introduced to reduce this cost, but it is still slow for high-dimensional, large data sets.
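The operation counts above can be made concrete with a small CPU-side sketch in pure Python. It is not the project's code: the function names (`nearest`, `kmeans`, `build_hkm`, `hkm_lookup`), the parameters `branch` and `depth`, the squared Euclidean distance, the random initialization from data points, and the fixed iteration count are all assumptions chosen for illustration.

```python
import random

def nearest(point, centroids):
    """Return the index of the closest centroid (squared Euclidean distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, c in enumerate(centroids):            # K candidates per point
        # d subtractions, d multiplications, about d additions per pair
        dist = sum((p - q) ** 2 for p, q in zip(point, c))
        if dist < best_dist:                       # the O(K) min search
            best_idx, best_dist = idx, dist
    return best_idx

def kmeans(points, k, iters=10, seed=0):
    """Flat K-means: returns (centroids, labels). Assumes len(points) >= k."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    d = len(points[0])
    for _ in range(iters):
        sums = [[0.0] * d for _ in range(k)]
        counts = [0] * k
        for p in points:                           # assignment step
            j = nearest(p, centroids)
            counts[j] += 1
            for t in range(d):                     # N * d additions in total
                sums[j][t] += p[t]
        for j in range(k):                         # update step: K * d divisions
            if counts[j]:                          # keep old centroid if cluster is empty
                centroids[j] = [s / counts[j] for s in sums[j]]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

def build_hkm(points, branch, depth):
    """Recursively split points into `branch` clusters, up to `depth` levels."""
    if depth == 0 or len(points) < branch:
        return {"centroids": [], "children": [], "size": len(points)}
    centroids, labels = kmeans(points, branch)
    children = []
    for j in range(branch):
        subset = [p for p, lab in zip(points, labels) if lab == j]
        children.append(build_hkm(subset, branch, depth - 1))
    return {"centroids": centroids, "children": children, "size": len(points)}

def hkm_lookup(tree, point):
    """Descend the tree and return the path of branch indices to the leaf."""
    path, node = [], tree
    while node["centroids"]:
        j = nearest(point, node["centroids"])      # only `branch` comparisons per level
        path.append(j)
        node = node["children"][j]
    return tuple(path)
```

With branching factor b and depth L, a lookup compares a point against b * L centroids instead of the b^L leaf clusters a flat search would need, which is where HKM saves work. Building the tree still runs K-means at every node, and that part is what remains slow for high-dimensional, large data.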
Holiday dataset: 1491 pictures, vectors, centroids. Time: 626m55.739s (about 10 h 27 min)
4/16~4/20  K-means (GPU), HKM (CPU) code survey
4/23~5/04  HKM GPU basic version
5/07~5/18  Code speed-up
5/21~6/01  Clustering visualization or some app.
6/01~6/12  Buffer and poster writing
1. HKM CUDA lib for high-dimensional data (>128 for SIFT)
2. HKM (GPU) vs. HKM (CPU) report
3. Clustering visualization (optional)