ECE 539 Project Aditya Ghule Optimization of the determination of the Number of clusters using Self Organizing Map
Need for the determination of the number of clusters Determining the number of clusters in a data set is one of the most fundamental problems in cluster analysis. A good estimate is essential for further analysis such as classification.
Existing approaches in the field Over the years, various algorithms have been developed to determine the number of clusters in a given data set, such as the elbow method, the information-theoretic approach, and the prediction-based resampling method.
Objective Identification of cluster patterns in a given data set by using a Self Organizing Map. The algorithm is optimized to give accurate results when the number of clusters is less than 10.
Logic of algorithm A 1-D Self Organizing Map is applied to the given data set. The distances between adjacent neurons in the SOM are used to determine the number of cluster-like patterns in the data set. The data set is generated in Matlab with a random number of points, variance, and distribution.
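The idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's Matlab implementation: the function names (train_som_1d, adjacent_distances, count_clusters, make_blobs), the linearly decaying learning rate and neighborhood width, and the rule "a gap larger than factor times the median gap marks a cluster boundary" are all hypothetical choices made for this example.

```python
import math
import random
import statistics


def train_som_1d(data, n_neurons=20, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D (chain) SOM on a list of equal-length coordinate tuples."""
    rng = random.Random(seed)
    # Initialise each neuron at a randomly chosen data point (assumption).
    weights = [list(rng.choice(data)) for _ in range(n_neurons)]
    sigma0 = n_neurons / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            # Best-matching unit: the neuron closest to the sample.
            bmu = min(range(n_neurons), key=lambda i: math.dist(weights[i], x))
            for i, w in enumerate(weights):
                # Gaussian neighborhood measured along the chain index.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
            step += 1
    return weights


def adjacent_distances(weights):
    """Euclidean distance between each pair of neighboring neurons."""
    return [math.dist(a, b) for a, b in zip(weights, weights[1:])]


def count_clusters(weights, factor=3.0):
    """Count clusters as 1 + number of unusually large adjacent gaps.

    The threshold (factor * median gap) is an assumed heuristic.
    """
    gaps = adjacent_distances(weights)
    threshold = factor * statistics.median(gaps)
    return 1 + sum(g > threshold for g in gaps)


def make_blobs(centers, n_per_cluster=100, spread=0.3, seed=1):
    """Synthetic 2-D Gaussian blobs standing in for the Matlab data set."""
    rng = random.Random(seed)
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_cluster)]


if __name__ == "__main__":
    data = make_blobs([(0, 0), (5, 5), (10, 0)])
    som = train_som_1d(data)
    print("adjacent distances:", [round(g, 2) for g in adjacent_distances(som)])
    print("estimated clusters:", count_clusters(som))
```

Neurons that settle between two clusters can split one large gap into two smaller ones, so the threshold factor and the number of neurons would likely need tuning for a given data set.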
Plot of the distances between the neurons in the SOM
Plot of the SOM
Plot of the cluster patterns and cluster centers estimated by the algorithm
Thus, cluster patterns can be recognized accurately by using a SOM. The result can then be used for classification by assigning each cluster a class label.
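One simple way to turn estimated cluster centers into class labels is nearest-center assignment. The sketch below assumes the centers are already available; the function name assign_labels and the example coordinates are hypothetical and are not taken from the project.

```python
import math


def assign_labels(points, centers):
    """Return, for each point, the index of its nearest cluster center."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]


centers = [(0.0, 0.0), (5.0, 5.0)]              # hypothetical estimated centers
points = [(0.2, -0.1), (4.8, 5.3), (0.4, 0.3)]  # hypothetical data points
print(assign_labels(points, centers))           # prints [0, 1, 0]
```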
Thank You