University of Eastern Finland, School of Computing, P.O. Box 111, FIN Joensuu, FINLAND
K-means*: Clustering by Gradual Data Transformation
Mikko Malinen and Pasi Fränti
Speech and Image Processing Unit, School of Computing, University of Eastern Finland
K-means* clustering
Gradual transformation of data: fit the data to a model.
(Figure panels: Model, Data, Intermediate, Final)
K-means clustering
Iterate between two steps:
1. Assignment step: assign the points to the nearest centroids.
2. Update step: update the location of the centroids.
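The two steps above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the stopping rule (stop when the assignment no longer changes) and the choice to leave a centroid in place when it has no points are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points.

    Assumption: a centroid with no assigned points stays where it is.
    """
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            new_centroids.append([sum(c) / len(members) for c in zip(*members)])
        else:
            new_centroids.append(list(old))
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate the two steps until the assignment stops changing."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update(points, labels, centroids)
    return centroids, labels


# Toy data: two well-separated pairs of points, one initial centroid in each.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, labels = kmeans(points, centroids=[points[0], points[2]])
```

On this toy input each centroid ends at the mean of its pair. With a poor initialization the same loop can stop in a worse local optimum, which is the weakness the proposed method targets.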
K-means* clustering
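One way to picture "fitting the data to a model" is sketched below. The slides do not give the details, so everything specific here is an assumption for illustration: each point starts on a model centroid (point i on centroid i mod k, a hypothetical mapping), the points are moved back toward their true positions by linear interpolation over a fixed number of steps, and plain k-means is run after every step starting from the previous centroids. The names `gradual_kmeans` and `model_centroids` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Plain k-means; a centroid that loses all its points stays in place."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


def gradual_kmeans(data, model_centroids, steps=10):
    """Move the points from the model back to the data, clustering on the way."""
    k = len(model_centroids)
    # Assumed mapping: point i starts exactly on model centroid i mod k,
    # so the starting configuration is trivially "perfectly clustered".
    start = [list(model_centroids[i % k]) for i in range(len(data))]
    centroids = [list(c) for c in model_centroids]
    labels = None
    for step in range(1, steps + 1):
        t = step / steps  # fraction of the way back to the original data
        moved = [[(1 - t) * s + t * x for s, x in zip(sp, xp)]
                 for sp, xp in zip(start, data)]
        # Warm start: reuse the centroids found at the previous step.
        centroids, labels = kmeans(moved, centroids)
    return centroids, labels


data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, labels = gradual_kmeans(
    data, model_centroids=[[0.0, 0.0], [10.0, 10.0]], steps=10)
```

At the last step `t` equals 1, so the final k-means run operates on the original data. The intermediate steps only serve to carry the centroids gradually from the model toward a good position.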
Example of clustering (s2 dataset)
(Sequence of figures showing the clustering at successive stages of the transformation, each labeled with the % done.)
Empty clusters problem
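An empty cluster arises when no point is assigned to some centroid, so the update step has nothing to average. The slides do not say how the proposed method removes empty clusters. The sketch below shows one common remedy, which is a swapped-in technique and not necessarily the authors': move each empty centroid onto the point that lies farthest from its current centroid. The function name `repair_empty_clusters` is hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(p, centroids[j]))


def repair_empty_clusters(points, centroids):
    """Relocate every centroid that has no points (assumed remedy)."""
    centroids = [list(c) for c in centroids]
    labels = [nearest(p, centroids) for p in points]
    for j in range(len(centroids)):
        if j in labels:
            continue  # cluster j has at least one point
        # Only take a point from a cluster that keeps at least one member,
        # so the repair does not create a new empty cluster.
        candidates = [i for i, l in enumerate(labels) if labels.count(l) > 1]
        if not candidates:
            break
        far = max(candidates,
                  key=lambda i: squared_distance(points[i], centroids[labels[i]]))
        centroids[j] = list(points[far])
        labels[far] = j
    return centroids, labels


# The second centroid is so far away that no point is assigned to it.
points = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
centroids, labels = repair_empty_clusters(points, [[0.5, 0.0], [100.0, 0.0]])
```

After the repair, k-means would normally be resumed so that the remaining points are reassigned against the relocated centroid.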
Time Complexity
Table rows: Initialization; Data set transform; Empty clusters removal; K-means; Algorithm total
Time Complexity (fixed k-means)
Table rows: Initialization; Data set transform; Empty clusters removal; K-means; Algorithm total
Datasets
Dataset    d     n       k
s1         2     5000    15
s2         2     5000    15
s3         2     5000    15
s4         2     5000    15
bridge     16    4096    256
missa      16    6480    256
house      3     34000   256
thyroid    5     215     2
iris       4     150     2
wine       13    178     3
Mean square error
Table columns: Dataset; k-means; proposed; GKM; optimal
Table rows: s1; s2; s3; s4; bridge; missa; house; thyroid; iris; wine
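The mean square error used for the comparison can be computed from a finished clustering as in the short sketch below. The normalization is an assumption: here the total squared distance is divided by the number of points n, whereas some reports also divide by the dimension d. The function name `mean_square_error` is hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_square_error(points, centroids, labels):
    """Average squared distance from each point to its assigned centroid.

    Assumption: normalized by the number of points only, not by dimension.
    """
    total = sum(squared_distance(p, centroids[l])
                for p, l in zip(points, labels))
    return total / len(points)


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids = [[0.0, 0.5], [10.0, 10.5]]
labels = [0, 0, 1, 1]
mse = mean_square_error(points, centroids, labels)  # each point is 0.5 away
```

Every point in this example lies 0.5 from its centroid, so the result is 0.25.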
Mean square error vs. number of steps
(Series of plots, one per slide.)
Number of incorrect clusters
All correct: proposed 36%, k-means 14%
incorrect: proposed 64%, k-means 38%
incorrect: proposed 0%, k-means 34%
incorrect: proposed 0%, k-means 10%
Summary
We have presented a clustering method based on gradual transformation of data and k-means. Instead of fitting the model to the data, we fit the data to a model. The proposed method gives a lower mean square error than k-means.