Efficient SOM Learning by Data Order Adjustment
Authors: Miyoshi et al.
Advisor: Dr. Hsu
Graduate: Yu-Wei Su
Outline
Motivation
Objective
SOM
Data order and learning convergence
Ending point of convergence
Experiments
Conclusion
Opinion
Motivation
In SOM learning, many factors increase the computational load and the cost of competition.
Objective
Reduce the competition load and speed up SOM learning.
SOM
The SOM (self-organizing map) is an algorithm that maps high-dimensional data onto a low-dimensional grid, usually two-dimensional.
Step 1: find the BMU (best matching unit) for each datum.
Step 2: update the values of the BMU and its neighboring nodes.
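The two steps can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names `find_bmu` and `update`, the square neighborhood, and the fixed learning rate `lr` are assumptions made for the example.

```python
import math

def find_bmu(weights, x):
    """Step 1: return the (row, col) of the node whose weight vector is closest to x."""
    best, best_d = None, math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = math.dist(w, x)  # Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best

def update(weights, x, bmu, lr, radius):
    """Step 2: move the BMU and every node within `radius` grid steps toward x."""
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            # Square neighborhood on the grid: an assumption, not taken from the slides.
            if max(abs(r - bmu[0]), abs(c - bmu[1])) <= radius:
                row[c] = [wi + lr * (xi - wi) for wi, xi in zip(w, x)]

# A 2x2 map with 2-dimensional weight vectors (toy values).
weights = [[[0.0, 0.0], [1.0, 0.0]],
           [[0.0, 1.0], [1.0, 1.0]]]
x = [0.9, 0.2]
bmu = find_bmu(weights, x)
update(weights, x, bmu, lr=0.5, radius=0)
```

With `radius=0` only the BMU itself moves; a larger radius also pulls the surrounding nodes toward the datum.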
Data order and learning convergence
SOM learning takes a long time because of large map sizes, large amounts of input data, high-dimensional data, and so on.
In the early stage of learning, the SOM map changes dynamically and widely, and this change depends on the distances between the input data.
Data order and learning convergence (cont.)
Adjust the data order based on the distance between data classes to reduce the competition load.
Data order
The order of the input data is changed using class distances, which are computed from class centers.
First, select a typical datum in each class as its class center.
Then compute the Euclidean distance between every pair of class centers; this is the class distance.
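A small sketch of the class-distance computation follows. The slides do not say how the typical datum is chosen, so this example assumes it is the datum closest to the class mean; `class_center` and `class_distances` are hypothetical names.

```python
import math

def class_center(data):
    """Pick a 'typical' datum: the one closest to the class mean (an assumption)."""
    dim = len(data[0])
    mean = [sum(d[k] for d in data) / len(data) for k in range(dim)]
    return min(data, key=lambda d: math.dist(d, mean))

def class_distances(classes):
    """Return cd[a][b], the Euclidean distance between the centers of classes a and b."""
    centers = {name: class_center(data) for name, data in classes.items()}
    return {a: {b: math.dist(centers[a], centers[b]) for b in centers}
            for a in centers}

# Two toy classes in two dimensions.
classes = {
    "A": [[0.0, 0.0], [0.2, 0.0], [0.1, 0.1]],
    "B": [[3.0, 4.0], [3.2, 4.0], [3.1, 4.1]],
}
cd = class_distances(classes)
```

The resulting table is symmetric with zeros on the diagonal, so it can be looked up as `cd[a][b]` in either direction.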
Data order (cont.)
Order 1: random order
Order 2: the largest distance order based on the previous data class
scl_i: selected data class i
ucl_i: still unselected data class i
cd(A,B): class distance between A and B
Data order (cont.)
Order 3: the smallest distance order based on the previous data class
Order 4: the smallest distance order based on all classes
scl_i: selected data class i
ucl_i: still unselected data class i
cd(A,B): class distance between A and B
Data order (cont.)
Order 5: the largest distance order based on all classes
Order 6: average distance order based on all classes
scl_i: selected data class i
ucl_i: still unselected data class i
cd(A,B): class distance between A and B
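The selection formulas for the six orders are not reproduced in this text, so the following is only one plausible reading of Orders 2 and 3: a greedy pass that repeatedly picks the unselected class farthest from (Order 2) or nearest to (Order 3) the previously selected class. The starting class, the tie-breaking rule, and the name `order_by_previous` are assumptions; the orders "based on all classes" are left out because their exact definition cannot be recovered here.

```python
def order_by_previous(cd, start, largest=True):
    """Greedy class order: repeatedly pick the unselected class that is
    farthest from (largest=True) or nearest to (largest=False) the
    previously selected class."""
    order = [start]
    unselected = set(cd) - {start}
    pick = max if largest else min
    while unselected:
        prev = order[-1]
        # Sort the names so that ties are broken deterministically.
        nxt = pick(sorted(unselected), key=lambda c: cd[prev][c])
        order.append(nxt)
        unselected.remove(nxt)
    return order

# Hypothetical class distances: four classes placed on a line at 0, 1, 3, 7.
pos = {"A": 0, "B": 1, "C": 3, "D": 7}
cd = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}
far = order_by_previous(cd, "A", largest=True)    # Order 2 reading
near = order_by_previous(cd, "A", largest=False)  # Order 3 reading
```

On this toy input the largest-distance reading jumps back and forth across the line, while the smallest-distance reading walks through neighboring classes in sequence.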
Ending point of convergence
Definition of the converging point of learning:
At each learning step n, record the maximum Euclidean distance xd_n over all nodes.
Test whether the difference |xd_n - xd_(n-1)| is smaller than Th1.
If it is, check how long the difference stays smaller than Th1.
If it stays smaller for long enough, as set by Th2, that step is taken as the ending point of learning.
Ending point of convergence (cont.)
Th1: threshold of difference
Th2: threshold of period
xd_n: n-th maximum distance over all input data and output nodes
ed(A,B): Euclidean distance between A and B
dt_i: input datum i
on_j: output node j
dmax: total number of input data
nmax: total number of output nodes
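The stopping test can be sketched as follows. This is one reading of the definition, with assumptions stated: xd_n is taken as the largest ed(dt_i, on_j) over all data and nodes at step n, and "long enough" is taken to mean that the difference has stayed below Th1 for at least Th2 consecutive steps. The names `max_distance` and `ending_point` are hypothetical.

```python
import math

def max_distance(data, nodes):
    """xd_n: largest Euclidean distance between any input datum and any output node."""
    return max(math.dist(d, o) for d in data for o in nodes)

def ending_point(xd, th1, th2):
    """Return the first step n at which |xd[n] - xd[n-1]| < th1 has held for
    th2 consecutive steps, or None if that never happens."""
    run = 0
    for n in range(1, len(xd)):
        # Extend the run while the change stays small; reset it otherwise.
        run = run + 1 if abs(xd[n] - xd[n - 1]) < th1 else 0
        if run >= th2:
            return n
    return None

# Toy sequence of per-step maximum distances.
xd = [5.0, 3.0, 2.0, 1.9, 1.95, 1.94, 1.94]
end = ending_point(xd, th1=0.2, th2=3)
```

In a training loop, `max_distance` would be evaluated once per learning step and appended to `xd`, and learning would stop as soon as `ending_point` returns a step index.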
Experiments
Experiment data: synthetic data, 5 dimensions, 7 classes of 49 data each
Parameters: map size 8x8; initial neighborhood from 3 to 5; initial learning rate from 0.2 to 0.8; 300 maps initialized at random
Experiments (cont.)
Learning rate function
Neighborhood function
Conclusion
The small-distance data stream gives an improvement of up to 9%.
The large-distance data stream performs about the same as the conventional SOM.
None of the orders makes a remarkable difference in the resulting map.
Opinion
There are no experiments comparing the method with others.
The terminating condition is a good idea.