University of Joensuu, Dept. of Computer Science, P.O. Box 111, FIN Joensuu. K-MST-based clustering. Caiming Zhong, Pasi Fränti
Outline
– Minimum spanning tree (MST)
– MST-based clustering
– K-MST
– K-MST-based clustering
– Fast approximate MST
Minimum Spanning Tree: spanning tree. [Figure panels: Given graph; Spanning tree; Non-spanning tree]
Minimum Spanning Tree: given a graph G = (V, E), the MST T is a spanning tree that minimizes the sum of edge weights (Kruskal's or Prim's algorithm).
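As an illustration of minimizing the sum of weights, here is a minimal sketch of Kruskal's algorithm with a union-find structure. The function name `kruskal_mst`, the `(weight, u, v)` edge format, and the small example graph are assumptions made for this sketch, not part of the slides.

```python
def kruskal_mst(num_vertices, edges):
    """Return (total_weight, tree_edges) for a connected weighted graph.

    `edges` is a list of (weight, u, v) tuples; vertices are 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Walk up to the root of x's component, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):          # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge joins two components: no cycle
            parent[ru] = rv
            total += w
            tree.append((u, v, w))
    return total, tree


# Hypothetical 4-vertex example graph.
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
total_weight, tree_edges = kruskal_mst(4, example_edges)
```

Sorting the edges dominates the running time, so this version suits sparse edge lists; on a complete graph over N points the edge list alone already has on the order of N^2 entries.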
MST-based clustering: most commonly used method 1: removing long MST-edges.
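Removing long MST-edges can be sketched as follows: build the MST, drop the k-1 longest edges, and treat each remaining connected component as a cluster. The names `prim_mst` and `cluster_by_cutting`, the use of Euclidean distance, and the example points are assumptions for illustration; the number of clusters k is taken as given.

```python
import math


def prim_mst(points):
    """Exact MST of the complete Euclidean graph in O(N^2); returns (u, v, w) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known cost of attaching each vertex
    link = [-1] * n         # tree vertex that realises that cost
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if link[u] >= 0:
            edges.append((link[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], link[v] = d, u
    return edges


def cluster_by_cutting(points, k):
    """Remove the k-1 longest MST edges and label the resulting components."""
    edges = sorted(prim_mst(points), key=lambda e: e[2])
    kept = edges[: len(edges) - (k - 1)]
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in kept:
        parent[find(u)] = find(v)
    roots = {}
    # Number the components in order of first appearance.
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


# Hypothetical data: two well-separated groups of three points.
example_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cluster_by_cutting(example_points, 2)
```

This works on the example because the only long edge is the one bridging the two groups; an outlier or a chain of points between clusters would change which edges are longest.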
MST-based clustering: removing long MST-edges doesn’t always work.
MST-based clustering: most commonly used method 2: inconsistent edges. A tree edge AB whose weight W(AB) is significantly larger than the average weight of nearby edges on both sides of AB should be deleted.
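One way to make the inconsistency test concrete is sketched below. The slide does not say how far "nearby" extends or how large "significantly larger" is, so the `depth` and `factor` parameters, their default values, and the rule that a leaf endpoint with no nearby edges imposes no constraint are all assumptions of this sketch.

```python
from collections import defaultdict


def nearby_weights(adj, start, blocked, depth):
    """Weights of tree edges within `depth` steps of `start`, never crossing to `blocked`."""
    weights, frontier, seen = [], [start], {start, blocked}
    for _ in range(depth):
        next_frontier = []
        for u in frontier:
            for v, w in adj[u]:
                if v not in seen:
                    seen.add(v)
                    weights.append(w)
                    next_frontier.append(v)
        frontier = next_frontier
    return weights


def inconsistent_edges(tree_edges, depth=2, factor=2.0):
    """Return tree edges whose weight exceeds `factor` times the local average on both sides."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    flagged = []
    for a, b, w in tree_edges:
        sides = [nearby_weights(adj, a, b, depth), nearby_weights(adj, b, a, depth)]
        sides = [s for s in sides if s]   # assumption: an empty side is ignored
        if sides and all(w > factor * sum(s) / len(s) for s in sides):
            flagged.append((a, b, w))
    return flagged


# Hypothetical path-shaped tree with one unusually long edge in the middle.
example_tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 5.0), (3, 4, 1.0), (4, 5, 1.0)]
flagged = inconsistent_edges(example_tree)
```

Unlike a global cut of the longest edges, this test is local: an edge is compared only with its own neighbourhood, so clusters of different densities can each keep their typical edge lengths.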
K-MST: What is K-MST?
– Let G = (V, E) denote the complete graph.
– Let MST_1 denote the MST of G, computed as MST_1 = mst(V, E).
– Then MST_2 denotes the second-round MST of G: MST_2 = mst(V, E - MST_1).
– In general, MST_k = mst(V, E - MST_1 - … - MST_{k-1}).
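The round-by-round definition above can be sketched directly: compute an MST, remove its edges from the candidate set, and repeat. The function names `mst_excluding` and `k_mst`, the Euclidean complete graph over 2-D points, and the example data are assumptions of this sketch.

```python
import math
from itertools import combinations


def mst_excluding(points, excluded):
    """Kruskal MST over the complete Euclidean graph minus the `excluded` edges."""
    n = len(points)
    candidates = sorted(
        (math.dist(points[u], points[v]), u, v)
        for u, v in combinations(range(n), 2)   # pairs with u < v
        if (u, v) not in excluded
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, u, v in candidates:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    # Note: if the remaining graph is disconnected, this is a spanning forest.
    return tree


def k_mst(points, k):
    """Return [MST_1, ..., MST_k], each built without the edges of earlier rounds."""
    used, rounds = set(), []
    for _ in range(k):
        tree = mst_excluding(points, used)
        rounds.append(tree)
        used.update(tree)
    return rounds


# Hypothetical data: five collinear points.
example_points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
rounds = k_mst(example_points, 2)
```

On this example the first round is the path through consecutive points, so the second round has to use edges that skip over at least one point.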
K-MST: K-MST-based graph
K-MST: Typical clustering problems
– Separated problems and touching problems.
– Separated problems include distance-separated problems and density-separated problems.
K-MST-based clustering: definition of edge weight for separated problems
Three good features: (1) The weights of inter-cluster edges are considerably larger than those of intra-cluster edges. (2) The inter-cluster edges are distributed approximately equally between T1 and T2. (3) Apart from inter-cluster edges, most edges with large weights come from T2.
K-MST-based clustering: touching problems
Partition(cut1) and Partition(cut3) are similar; Partition(cut2) and Partition(cut3) are similar.
Fast approximate MST (FAMST)
– Traditional MST algorithms take O(N^2) time, which makes them impractical for large data sets.
– In practical applications, FAMST generally gives the same result as the exact MST.
– A FAMST can be found in O(N^1.55) time.
Fast approximate MST (FAMST): scheme: Divide-and-Conquer
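The slide names only the general scheme, so the following is a simplified stand-in that shows the divide-and-conquer idea, not the FAMST procedure itself. It assumes a crude partition (sort by x-coordinate, split into chunks of about sqrt(N) points), an exact MST inside each chunk, and one shortest connecting edge between neighbouring chunks. The names `prim` and `approximate_mst` and the example points are hypothetical.

```python
import math


def prim(points, ids):
    """Exact MST over the points indexed by `ids`; returns (u, v, w) edges."""
    best = {i: math.inf for i in ids}   # cheapest known cost of attaching each vertex
    link = {}                           # tree vertex that realises that cost
    best[ids[0]] = 0.0
    remaining = set(ids)
    edges = []
    while remaining:
        u = min(remaining, key=best.__getitem__)
        remaining.remove(u)
        if u in link:
            edges.append((link[u], u, best[u]))
        for v in remaining:
            d = math.dist(points[u], points[v])
            if d < best[v]:
                best[v], link[v] = d, u
    return edges


def approximate_mst(points):
    """Spanning tree built chunk by chunk; its weight is at least the exact MST weight."""
    n = len(points)
    size = max(1, math.isqrt(n))
    # Divide (assumption): order points by x-coordinate and cut into chunks.
    order = sorted(range(n), key=lambda i: points[i])
    chunks = [order[i:i + size] for i in range(0, n, size)]
    edges = []
    for chunk in chunks:                           # conquer: exact MST per chunk
        edges.extend(prim(points, chunk))
    for left, right in zip(chunks, chunks[1:]):    # combine: link neighbouring chunks
        w, u, v = min(
            (math.dist(points[a], points[b]), a, b) for a in left for b in right
        )
        edges.append((u, v, w))
    return edges


# Hypothetical data: 50 deterministic pseudo-scattered points.
example_points = [((i * 37) % 101, (i * 53) % 97) for i in range(50)]
tree = approximate_mst(example_points)
```

With chunks of about sqrt(N) points, this sketch evaluates on the order of N^1.5 distances instead of N^2, at the cost of possibly missing short edges that cross chunk boundaries.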
Fast approximate MST (FAMST): performance