2806 Neural Computation Self-Organizing Maps Lecture 9 2005 Ari Visa.

Presentation transcript:

2806 Neural Computation, Self-Organizing Maps, Lecture 9, 2005, Ari Visa

Agenda
- Some historical notes
- Some theory
- Self-Organizing Map
- Learning Vector Quantization
- Conclusions

Some Historical Notes
- Local ordering (von der Malsburg, 1973)
- A mathematical analysis elucidating the dynamic stability of a cortical map (Amari, 1980)
- Self-organizing feature map (SOM) (Kohonen, 1982)
- A convex neighbourhood function (~ Gaussian) should be used (Erwin, 1992)
- The relationship between the SOM and principal curves is discussed in (Ritter, 1992) and (Cherkassky and Mulier, 1995)

Some Historical Notes
- Vector quantization: the Lloyd algorithm, 1957, for scalar quantization (= Max quantizer)
- The generalized Lloyd algorithm for vector quantization (= k-means algorithm, MacQueen 1967 = LBG algorithm, Linde et al. 1980)
- The idea of learning vector quantization (LVQ) (Kohonen, 1986)
- Convergence properties of the LVQ algorithm studied using the ordinary differential equation (ODE) approach (Baras and LaVigna, 1990)

Some Theory
- The spatial location of an output neuron in a topographic map corresponds to a particular domain or feature of data drawn from the input space (Kohonen, 1990).
- Two approaches: the Willshaw-von der Malsburg model (explains neurobiological details) and the Kohonen model (more general than the first in the sense that it is capable of performing data reduction).

Some Theory
- The principal goal of the self-organizing map is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map, and to perform this transformation adaptively in a topologically ordered fashion.

SOM
The formation of the self-organizing map involves three processes: 1. Competition, 2. Cooperation, 3. Synaptic Adaptation.
Competitive Process
- Input vector x = [x_1, x_2, ..., x_m]^T
- Weight vector w_j = [w_j1, w_j2, ..., w_jm]^T, where j = 1, 2, ..., l (l = total number of neurons in the network)
- Select the neuron with the largest inner product w_j^T x which, for weight vectors of equal norm, is equivalent to choosing the minimum Euclidean distance: i(x) = arg min_j ||x − w_j||
- A continuous input space of activation patterns is mapped onto a discrete output space of neurons by a process of competition among the neurons in the network.
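As a minimal illustration of the competitive step, a NumPy sketch of best-matching-unit selection might look like this (the function name and array shapes are assumptions of the sketch, not from the lecture):

```python
import numpy as np

def best_matching_unit(x, weights):
    """Return the index i(x) of the neuron whose weight vector is closest to x.

    x       : input vector, shape (m,)
    weights : weight matrix, shape (l, m), one row w_j per neuron
    """
    distances = np.linalg.norm(weights - x, axis=1)  # ||x - w_j|| for every neuron j
    return int(np.argmin(distances))                 # i(x) = arg min_j ||x - w_j||
```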

SOM: Cooperation
- How to define a topological neighbourhood that is neurobiologically correct?
- Let h_{j,i} denote the topological neighbourhood centered on the winning neuron i and encompassing a set of excited neurons denoted by j.
- The topological neighbourhood is symmetric about the maximum point.
- The amplitude of the topological neighbourhood decreases monotonically with increasing lateral distance:
  h_{j,i(x)}(n) = exp(−d²_{j,i} / (2σ²(n)))
  σ(n) = σ₀ exp(−n/τ₁), n = 0, 1, 2, ...
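A hedged sketch of the neighbourhood computation on a two-dimensional lattice, assuming lateral distance is measured between the neurons' grid coordinates (the default σ₀ and τ₁ values are illustrative):

```python
import numpy as np

def neighborhood(winner_pos, grid_positions, n, sigma0=2.0, tau1=1000.0):
    """Gaussian topological neighbourhood h_{j,i(x)}(n) for every neuron j.

    winner_pos     : lattice coordinates of the winning neuron, shape (2,)
    grid_positions : lattice coordinates of all neurons, shape (l, 2)
    n              : current iteration index
    """
    sigma = sigma0 * np.exp(-n / tau1)                       # sigma(n) = sigma0 * exp(-n / tau1)
    d2 = np.sum((grid_positions - winner_pos) ** 2, axis=1)  # squared lateral distances d^2_{j,i}
    return np.exp(-d2 / (2.0 * sigma ** 2))
```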

SOM: Adaptation
- w_j(n+1) = w_j(n) + η(n) h_{j,i(x)}(n) [x − w_j(n)]   (note the Hebbian-like learning)
- The synaptic weight vector w_j of the winning neuron i moves toward the input vector x. Upon repeated presentations of the training data, the synaptic weight vectors tend to follow the distribution of the input vectors due to the neighbourhood updating → topological ordering.
- The learning-rate parameter η(n) should be time varying: η(n) = η₀ exp(−n/τ₂), n = 0, 1, 2, ...
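Combining the three processes, one SOM iteration could be sketched as below, reusing the helper functions from the previous two sketches (the parameter defaults are placeholders, not prescribed values):

```python
import numpy as np

def som_step(x, weights, grid_positions, n,
             eta0=0.1, tau2=1000.0, sigma0=2.0, tau1=1000.0):
    """One SOM iteration: competition, cooperation, adaptation."""
    i = best_matching_unit(x, weights)                                    # competition
    h = neighborhood(grid_positions[i], grid_positions, n, sigma0, tau1)  # cooperation
    eta = eta0 * np.exp(-n / tau2)                                        # eta(n) = eta0 * exp(-n / tau2)
    # adaptation: w_j(n+1) = w_j(n) + eta(n) h_{j,i(x)}(n) (x - w_j(n))
    weights += eta * h[:, None] * (x - weights)
    return weights
```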

SOM: Ordering and Convergence
- Self-organizing or ordering phase (at most about 1000 iterations):
  η(n) decreases from about 0.1 to 0.01, with τ₂ = 1000
  h_{j,i}(n) shrinks from the "radius" of the lattice down to the winning neuron and a couple of neighbouring neurons around it, with τ₁ = 1000 / log σ₀
- Convergence phase (fine-tunes the feature map; about 500 × the number of neurons in the network):
  η(n) ≈ 0.01, and h_{j,i}(n) covers the winning neuron and one or zero neighbouring neurons.
(Summary of the SOM algorithm.)
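For illustration only, the two-phase schedule could be organised as follows; the 10×10 lattice, the three-dimensional inputs, and the concrete σ₀ values are assumptions of the sketch, chosen to mimic the recipe above:

```python
import numpy as np

rng = np.random.default_rng(0)
grid_positions = np.array([(i, j) for i in range(10) for j in range(10)], dtype=float)
weights = rng.random((100, 3))   # one weight vector per neuron of a 10x10 lattice
data = rng.random((5000, 3))     # illustrative training sample

# Ordering phase: ~1000 iterations, eta decays from 0.1, wide shrinking neighbourhood.
for n in range(1000):
    x = data[rng.integers(len(data))]
    weights = som_step(x, weights, grid_positions, n, eta0=0.1, tau2=1000.0,
                       sigma0=5.0, tau1=1000.0 / np.log(5.0))

# Convergence phase: ~500 x number of neurons iterations, eta ~ 0.01, tiny fixed neighbourhood.
for n in range(500 * len(weights)):
    x = data[rng.integers(len(data))]
    weights = som_step(x, weights, grid_positions, n, eta0=0.01, tau2=np.inf,
                       sigma0=0.5, tau1=np.inf)
```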

Some Theory
- Property 1. Approximation of the Input Space. The feature map Φ, represented by the set of synaptic weight vectors {w_j} in the output space A, provides a good approximation to the input space H.
- The theoretical basis of this idea is rooted in vector quantization theory.

Some Theory
- c(x) acts as an encoder of the input vector x, and x′(c) acts as a decoder of c(x). The vector x is selected at random from the training sample, subject to an underlying probability density function f_X(x).
- The optimum encoding-decoding scheme is determined by varying the functions c(x) and x′(c) so as to minimize the expected distortion E[d(x, x′(c(x)))], with the distortion measure d(x, x′) defined on the next slide.

Some Theory
- d(x, x′) = ||x − x′||² = (x − x′)^T (x − x′)
- Generalized Lloyd algorithm:
  Condition 1. Given the input vector x, choose the code c = c(x) to minimize the squared-error distortion ||x − x′(c)||².
  Condition 2. Given the code c, compute the reconstruction vector x′(c) as the centroid of the input vectors x that satisfy Condition 1.
- The generalized Lloyd algorithm is closely related to the SOM.
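A compact sketch of the two alternating conditions, which in this batch form amounts to k-means clustering (the initialisation and the fixed iteration count are illustrative choices):

```python
import numpy as np

def generalized_lloyd(data, k, iterations=50, seed=0):
    """Alternate Condition 1 (encode) and Condition 2 (recompute centroids)."""
    rng = np.random.default_rng(seed)
    codebook = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(iterations):
        # Condition 1: assign each x to the code c that minimizes ||x - x'(c)||^2
        codes = np.argmin(((data[:, None, :] - codebook[None, :, :]) ** 2).sum(-1), axis=1)
        # Condition 2: x'(c) becomes the centroid of the input vectors assigned to c
        for c in range(k):
            members = data[codes == c]
            if len(members) > 0:
                codebook[c] = members.mean(axis=0)
    return codebook, codes
```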

Some Theory
- Condition 1. Given the input vector x, choose the code c = c(x) to minimize the distortion measure D₂.
- Condition 2. Given the code c, compute the reconstruction vector x′(c) to satisfy the condition
  x′_new(c) ← x′_old(c) + η π(c − c(x)) [x − x′_old(c)],
  where π(·) is the noise-model (neighbourhood) function and η is the learning-rate parameter.

Some Theory
- Property 2. Topological Ordering. The feature map Φ computed by the SOM algorithm is topologically ordered in the sense that the spatial location of a neuron in the lattice corresponds to a particular domain or feature of input patterns.

Some Theory
- Property 3. Density Matching. The feature map Φ reflects variations in the statistics of the input distribution: regions in the input space H from which the sample vectors x are drawn with a high probability of occurrence are mapped onto larger domains of the output space A, and therefore with better resolution, than regions in H from which sample vectors x are drawn with a low probability of occurrence.
- Minimum-distortion encoding, according to which the curvature terms and all higher-order terms in the distortion measure due to the noise model π(·) are retained: m(x) ∝ f_X^⅓(x)
- Nearest-neighbor encoding, which emerges if the curvature terms are ignored, as in the standard form of the SOM algorithm: m(x) ∝ f_X^⅔(x)

Some Theory
- Property 4. Feature Selection. Given data from an input space with a nonlinear distribution, the self-organizing map is able to select a set of best features for approximating the underlying distribution.

Learning Vector Quantizer
- Vector quantization: an input space is divided into a number of distinct regions, and for each region a reconstruction vector is defined.
- A vector quantizer with minimum encoding distortion is called a Voronoi or nearest-neighbor quantizer.
- The collection of possible reproduction vectors is called the code book of the quantizer, and its members are called code vectors.

Learning Vector Quantizer
- The SOM algorithm provides an approximate method for computing the Voronoi vectors in an unsupervised manner.
- Learning vector quantization (LVQ) is a supervised learning technique that uses class information to move the Voronoi vectors slightly, so as to improve the quality of the classifier's decision regions.

Learning Vector Quantizer
- An input vector x is picked at random from the input space. If the class labels of the input vector x and a Voronoi vector w agree, the Voronoi vector w is moved in the direction of the input vector x. If the class labels of the input vector x and the Voronoi vector w disagree, the Voronoi vector w is moved away from the input vector x.
- Let {w_i}, i = 1, ..., l, denote the set of Voronoi vectors, and {x_i}, i = 1, ..., N, denote the set of input vectors.
- LVQ:
  I. Suppose that the Voronoi vector w_c is the closest to the input vector x_i. Let L_wc denote the class associated with the Voronoi vector w_c and L_xi the class label of the input vector x_i. The Voronoi vector w_c is adjusted as follows:
     If L_wc = L_xi, then w_c(n+1) = w_c(n) + α_n [x_i − w_c(n)], where 0 < α_n < 1.
     If L_wc ≠ L_xi, then w_c(n+1) = w_c(n) − α_n [x_i − w_c(n)], where 0 < α_n < 1.
  II. The other Voronoi vectors are not modified.
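A minimal sketch of this rule in the LVQ1 style; the constant learning rate and the function name are assumptions of the sketch:

```python
import numpy as np

def lvq_epoch(X, labels, voronoi, voronoi_labels, alpha=0.05):
    """One pass of the LVQ rule over the training set.

    X              : input vectors, shape (N, m)
    labels         : class label of each input vector, shape (N,)
    voronoi        : Voronoi (code) vectors, shape (l, m), float
    voronoi_labels : class associated with each Voronoi vector, shape (l,)
    """
    for x, lab in zip(X, labels):
        c = int(np.argmin(np.linalg.norm(voronoi - x, axis=1)))  # closest Voronoi vector w_c
        if voronoi_labels[c] == lab:
            voronoi[c] += alpha * (x - voronoi[c])   # labels agree: move w_c toward x
        else:
            voronoi[c] -= alpha * (x - voronoi[c])   # labels disagree: move w_c away from x
        # II. all other Voronoi vectors are left unchanged
    return voronoi
```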

Learning Vector Quantizer
- Vector quantization is a form of lossy data compression.
- Rate distortion theory (Gray, 1984): better data compression performance can always be achieved by coding vectors instead of scalars, even if the source of data is memoryless, or if the data compression system has memory.
- A multistage hierarchical vector quantizer (Luttrell, 1989).

Learning Vector Quantizer
- First-order autoregressive (AR) model: x(n+1) = ρ x(n) + ε(n), where ρ is the AR coefficient and the ε(n) are independent and identically distributed (iid) Gaussian random variables of zero mean and unit variance.
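Such AR(1) data could be generated as follows; ρ = 0.5 is an arbitrary choice for the sketch, not a value taken from the lecture:

```python
import numpy as np

def ar1_series(n_samples, rho=0.5, seed=0):
    """Generate x(n+1) = rho * x(n) + eps(n), with eps(n) iid N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_samples)
    for n in range(n_samples - 1):
        x[n + 1] = rho * x[n] + rng.standard_normal()
    return x
```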

Some Theory
- Attribute code x_a
- Symbol code x_s
- x = [x_s, 0]^T + [0, x_a]^T
- Contextual map or semantic map
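A tiny illustration of forming the concatenated input x = [x_s, 0]^T + [0, x_a]^T; the dimensions and values are made up for the sketch:

```python
import numpy as np

x_s = np.array([1.0, 0.0, 0.0])   # symbol code part (here a one-hot symbol, an assumption)
x_a = np.array([0.2, 0.7])        # attribute code part
# x = [x_s, 0]^T + [0, x_a]^T, i.e. the two parts occupy disjoint coordinates
x = np.concatenate([x_s, np.zeros_like(x_a)]) + np.concatenate([np.zeros_like(x_s), x_a])
```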

Summary
- The SOM algorithm is neurobiologically inspired, incorporating all the mechanisms that are basic to self-organization: competition, cooperation, and self-amplification.
- Kohonen's SOM algorithm is simple to implement, yet mathematically difficult to analyze in a general setting.
- The self-organizing map may be viewed as a vector quantizer.