Clustering The process of grouping samples so that the samples are similar within each group.
Clustering Algorithms
Hierarchical clustering: organizes the data into larger groups, which contain smaller groups, like a tree or dendrogram. Algorithms: agglomerative, single-linkage, complete-linkage, average-linkage, Ward, …
Partitional clustering: creates one set of clusters that partitions the data into similar groups. Algorithms: Forgy's, k-means, ISODATA, … SOM, CLICK, CAST, …
Figures of Hierarchical Clustering [Figure sequence: samples 1 2 3 4 5 with merged groups 1′, 2′, 3′]
Hierarchical Clustering Method Distance metric between clusters: single-link, average-link, complete-link, centroid
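The four cluster-to-cluster distances listed above can be sketched as follows. This is a minimal illustration assuming Euclidean distance between samples and the usual textbook definitions (nearest pair, farthest pair, mean over all pairs, distance between centroids); the function names and the sample points are hypothetical.

```python
import math
from itertools import product

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def single_link(a, b):
    """Distance between the closest pair, one point from each cluster."""
    return min(math.dist(p, q) for p, q in product(a, b))

def complete_link(a, b):
    """Distance between the farthest pair, one point from each cluster."""
    return max(math.dist(p, q) for p, q in product(a, b))

def average_link(a, b):
    """Mean distance over all cross-cluster pairs."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid_link(a, b):
    """Distance between the two cluster centroids."""
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters, chosen only for illustration.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]

for name, fn in [("single", single_link), ("average", average_link),
                 ("complete", complete_link), ("centroid", centroid_link)]:
    print(name, round(fn(cluster_a, cluster_b), 3))
```

An agglomerative method would repeatedly merge the two clusters with the smallest such distance; which of the four is used changes the shape of the resulting tree.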
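K-means approach One more input, k (the number of clusters), is required. There are many variants of k-means. Sum-of-squares criterion: minimize $J = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2$, where $\mu_j$ is the centroid of cluster $C_j$.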
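The sum-of-squares criterion is commonly taken to be the total squared distance from each sample to the centroid of its own cluster. A minimal sketch under that assumption, with hypothetical helper names and made-up points, comparing two partitions of the same four samples:

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def sum_of_squares(clusters):
    """Total squared Euclidean distance from each sample to its cluster centroid."""
    total = 0.0
    for cluster in clusters:
        mu = centroid(cluster)
        for x in cluster:
            total += sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return total

# Two hypothetical partitions of the same four samples into k = 2 clusters.
compact = [[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (6.0, 0.0)]]
mixed = [[(0.0, 0.0), (6.0, 0.0)], [(0.0, 2.0), (4.0, 0.0)]]

print(sum_of_squares(compact))  # 4.0
print(sum_of_squares(mixed))    # 28.0
```

The partition with the lower value groups nearby samples together, which is what a k-means variant tries to achieve.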
An example of the k-means approach Two passes. Pass 1: begin with k clusters, each consisting of one of the first k samples. For each of the remaining n-k samples, find the nearest centroid and put the sample in that cluster. After each sample is assigned, re-compute the centroid of the altered cluster. Pass 2: for each sample, find the nearest centroid and put the sample in the cluster identified with that centroid. (The centroids do not need to be re-computed.)
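The two passes above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance, ties broken in favor of the lower cluster index, and a running-mean update for the centroid of the altered cluster; the function names and sample data are hypothetical.

```python
import math

def nearest(centroids, x):
    """Index of the centroid closest to x (lowest index wins ties)."""
    return min(range(len(centroids)), key=lambda j: math.dist(centroids[j], x))

def two_pass_kmeans(samples, k):
    # Pass 1: seed each cluster with one of the first k samples.
    centroids = [tuple(s) for s in samples[:k]]
    counts = [1] * k
    for x in samples[k:]:
        j = nearest(centroids, x)
        counts[j] += 1
        # Re-compute only the altered cluster's centroid (running mean).
        centroids[j] = tuple(c + (xi - c) / counts[j]
                             for c, xi in zip(centroids[j], x))
    # Pass 2: assign every sample to its nearest final centroid,
    # without re-computing the centroids.
    labels = [nearest(centroids, x) for x in samples]
    return centroids, labels

# Hypothetical data: two loose groups around x = 0 and x = 10.
samples = [(0.0, 0.0), (10.0, 0.0), (0.0, 2.0), (10.0, 2.0), (1.0, 1.0)]
centroids, labels = two_pass_kmeans(samples, k=2)
print(centroids)
print(labels)  # [0, 1, 0, 1, 0]
```

Because the seeds are simply the first k samples, the result depends on the order of the input; other k-means variants choose seeds differently or iterate until the assignments stop changing.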
Examples
Self Organizing Maps
Examples
CLICK Uses graph theory (connected components). The edge weights are calculated from statistical probabilities.
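To make the connected-component idea concrete, here is a small sketch. It is not the CLICK algorithm itself: it only finds the connected components of a weighted graph after dropping edges below a cutoff. The edge weights, the `threshold` parameter, and the function name are all assumptions made for illustration.

```python
def connected_components(n, weighted_edges, threshold):
    """Components of the graph on vertices 0..n-1, keeping edges with weight >= threshold."""
    adjacency = {v: [] for v in range(n)}
    for u, v, w in weighted_edges:
        if w >= threshold:
            adjacency[u].append(v)
            adjacency[v].append(u)

    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search from an unvisited vertex collects one component.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(component))
    return components

# Hypothetical similarity weights between five samples.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1), (3, 4, 0.7)]
print(connected_components(5, edges, threshold=0.5))  # [[0, 1, 2], [3, 4]]
```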