Data Mining, Neural Network and Genetic Programming High-Order Neural Network and Self-Organising Map Yi Mei Yi.mei@ecs.vuw.ac.nz
Outline: High-order neural network (motivation; architecture; HONN with products). Self-organising map (classification without labels, i.e. clustering; architecture: neurons, weights, positions; training and mapping; weight learning)
High-Order Neural Network We have discussed CNNs, weight smoothing and central weight initialisation. All make use of domain knowledge in object recognition: the relationship between neighbouring pixels. CNN changes the architecture and constrains the weights; weight smoothing constrains the weights; central weight initialisation assumes the central pixels of an object may be more important and so deserve larger weights. The CNN architecture is good, but too complicated to design. Is there a simpler way? High-Order Neural Network (HONN)
High-Order Neural Network [Figure: a conventional feedforward neural network with inputs (pixels), hidden units and outputs; each hidden unit receives single inputs, i.e. 1st-order connections]
High-Order Neural Network [Figure: a 2nd-order network; each hidden unit is connected to pairs of input pixels]
High-Order Neural Network [Figure: a 3rd-order network; each hidden unit is connected to triples of input pixels]
High-Order Neural Network [Figure: a 16 × 16 image gives 256 input nodes; each hidden node has a 5 × 5 receptive field, i.e. a 5 × 5 weight matrix (25th-order); the hidden layer is an 8 × 8 feature map of 64 hidden nodes; a HONN with weight constraint]
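A quick weight count for this slide's numbers: each hidden node sees $5 \times 5 = 25$ inputs, so an unconstrained layer has $64 \times 25 = 1600$ weights, while the weight constraint (one shared 5 × 5 matrix, as in CNN) reduces this to 25.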
High-Order Neural Network Neighbouring pixels are connected to the same hidden node (the network is not fully connected). CNN is a special type of HONN. HONN does not require weight constraints (but can be improved by including them). It is a more general architecture that can take advantage of the geometric relationship between pixels.
High-Order Neural Network Another point of view: a conventional neural network only computes a weighted sum (1st-order), $\mathrm{net} = \sum_i w_i x_i$, which captures only linear correlation between the inputs. High-order correlations (e.g. products) may be useful, and HONN can capture them; a 2nd-order unit adds product terms: $\mathrm{net} = \sum_i w_i x_i + \sum_i \sum_j w_{ij} x_i x_j$.
High-Order Neural Network An example: the XOR problem. It cannot be solved by a (1st-order) perceptron. What about a high-order perceptron? See the sketch below.
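A minimal sketch of the idea (the weights below are one hand-picked solution of my own, not from the slides): once the product feature $x_1 x_2$ is added, XOR becomes linearly separable in the augmented space, so a single high-order unit suffices.

```python
# A 2nd-order perceptron for XOR: net = w1*x1 + w2*x2 + w12*(x1*x2) + b.
# The weights are hand-picked; a perceptron could also learn them, since
# XOR is linearly separable once the product feature x1*x2 is included.

def high_order_perceptron(x1, x2, w1=1.0, w2=1.0, w12=-2.0, b=-0.5):
    net = w1 * x1 + w2 * x2 + w12 * (x1 * x2) + b
    return 1 if net > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, high_order_perceptron(x1, x2))  # prints the XOR truth table
```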
High-Order Neural Network Translation invariance: impose constraints on the weights, similar to CNN; a sketch of how this works follows.
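One way to see the constraint (a sketch of my own, assuming a 1-D input and a 2nd-order unit; the function name and weights are illustrative): if $w_{ij}$ depends only on the displacement $j - i$, every pixel pair with the same relative position shares a weight, so a shifted pattern produces the same response (up to boundary effects).

```python
import numpy as np

# Translation-invariant 2nd-order unit: tie w[i][j] to the displacement (j - i).
def second_order_invariant(x, w_by_shift):
    """x: 1-D input; w_by_shift: dict mapping displacement d to weight w(d)."""
    n = len(x)
    net = 0.0
    for i in range(n):
        for d, w in w_by_shift.items():
            j = i + d
            if 0 <= j < n:
                net += w * x[i] * x[j]   # same weight for every pair at offset d
    return net

x = np.array([0., 1., 1., 0., 0., 0.])
w = {0: 0.5, 1: 1.0, 2: -0.3}
print(second_order_invariant(x, w),
      second_order_invariant(np.roll(x, 2), w))  # same response after a shift
```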
Self-Organising Map Conventional neural networks require class labels (desired outputs), but class labels are hard (time-consuming) to obtain. Can we do classification without class labels? Yes: the SOM.
Self-Organising Map Introduced by T. Kohonen in the 1980s. A type of neural network, but different from the other NNs we have seen (feedforward, CNN): it uses unsupervised learning and does not require class labels. It can visualise high-dimensional data in a low-dimensional space and discover categories and abstractions from raw data.
Self-Organising Map A SOM is a set of nodes (neurons), usually arranged as a rectangular grid. Each neuron has a weight vector (one weight for each input variable) and a position in the map space. [Figure: a grid of neurons, each connected to the inputs]
Self-Organising Map Training: building the map from (training) input examples by adjusting the neuron weights. Mapping: automatically classifying a new input vector (which neuron fires?). SOM performs competitive learning rather than error-correction learning, and uses a neighbourhood function to preserve the topology of the input space.
Self-Organising Map For each input vector, the node whose weight vector is closest to the input is chosen (fires): the Best Matching Unit (BMU). The weights of the BMU and of the neurons close to it in the SOM lattice are adjusted towards the input vector, controlled by a neighbourhood function that shrinks over time and a decreasing learning coefficient. [Figure: the BMU and its neighbours being pulled towards the input]
Self-Organising Map Topology: rectangular or hexagonal lattice
Self-Organising Map Neighbourhood function: the simplest is a threshold (1 within a radius of the BMU, 0 outside); other forms exist, e.g. Gaussian.
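A common Gaussian form (standard in the SOM literature; the symbols match the learning rule two slides below):

```latex
\theta(u, v, s) = \exp\!\left(-\frac{\mathrm{dist}(u, v)^2}{2\,\sigma(s)^2}\right)
```

where $u$ is a node, $v$ the BMU, $\mathrm{dist}(u, v)$ their distance in the lattice, and $\sigma(s)$ a radius that shrinks with the step number $s$.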
Self-Organising Map Distance in the lattice: Euclidean distance, Manhattan distance, box distance (maximum coordinate difference), or the number of links in between.
Learning Self-Organising Map
1. Randomly initialise the weights
2. Get an input vector $D(t)$
3. Calculate the BMU of $D(t)$: $v = \arg\min_u \|D(t) - W_u(s)\|$
4. Update the weight vector of each node $u$ in the map: $W_u(s+1) = W_u(s) + \theta(u, v, s)\,\alpha(s)\,(D(t) - W_u(s))$, where $\theta$ is the neighbourhood function and $\alpha(s)$ the decreasing learning coefficient
5. Increase $s$ and go back to 2 until $s$ reaches the maximal step number
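A minimal runnable sketch of this loop (my own implementation of the standard algorithm; the Gaussian neighbourhood and the exponential decay schedules for $\alpha$ and $\sigma$ are common choices, not taken from the slides):

```python
import numpy as np

def train_som(data, rows, cols, steps=10000, alpha0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    weights = rng.random((rows, cols, data.shape[1]))        # step 1: random init
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)     # lattice positions
    for s in range(steps):
        x = data[rng.integers(len(data))]                    # step 2: an input D(t)
        bmu = np.unravel_index(
            np.argmin(np.linalg.norm(weights - x, axis=2)),
            (rows, cols))                                    # step 3: the BMU v
        alpha = alpha0 * np.exp(-s / steps)                  # decreasing coefficient
        sigma = sigma0 * np.exp(-s / steps)                  # shrinking radius
        d2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)     # squared lattice dist
        theta = np.exp(-d2 / (2 * sigma ** 2))               # Gaussian neighbourhood
        weights += alpha * theta[..., None] * (x - weights)  # step 4: update nodes
    return weights
```

Sampling inputs at random rather than cycling through them in order is typical; any schedule that shrinks $\alpha(s)$ and $\sigma(s)$ towards zero will do.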
An example: colour Input: 3-dimensional vectors (RGB values). SOM: a 40 × 40 lattice. For each neuron, show the 3-D weight vector as an RGB colour.
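Continuing from the train_som sketch above, the colour example looks like this (random RGB data, purely illustrative):

```python
rgb = np.random.default_rng(1).random((500, 3))   # random RGB training vectors
som = train_som(rgb, rows=40, cols=40)            # the 40 x 40 lattice
# som[i, j] is now a 3-D weight vector drawable as the RGB colour of neuron
# (i, j); similar colours end up close together in the lattice.
```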
Another example: clustering digits
Question Consider a SOM for digit clustering. Each digit has 8 features. The SOM is a 30 × 30 lattice. There are 10 classes and 1000 training images (100 per class). How many input neurons? How many output neurons? How many weights in the SOM?
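A worked count for checking your answer (assuming one input neuron per feature and one weight vector per lattice node): 8 input neurons; $30 \times 30 = 900$ output neurons; $900 \times 8 = 7200$ weights. The number of training images does not affect the weight count.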
Summary HONN is a general network that takes into account the geometric relationship between pixels (which pixels should be connected together?). HONN can consider high-order correlations (products) between inputs (pixels). HONN can be handcrafted to be invariant under transformations (translation, scaling, rotation, ...). SOM is an unsupervised neural network: it can find patterns in input data without requiring labels, and can visualise high-dimensional data in a low dimension.