Instar Learning Law


Adapted from lecture notes of the course CN510: Cognitive and Neural Modeling, offered in the Department of Cognitive and Neural Systems at Boston University (instructor: Dr. Anatoli Gorchetchnikov).

Competitive Learning
A standard competitive learning neural network is schematized at right. Specific examples include:
–the von der Malsburg (1973) model of visual cortex
–Grossberg (1976) adaptive resonance theory (ART)
–Kohonen (1983) self-organizing feature maps (SOFM, SOM)
Competitive learning networks typically learn in an unsupervised fashion.

Grossberg (1976)
Describes an unsupervised learning scheme that can be used to build “feature detectors” or perform categorization. This paper presents many of the theoretical bases of adaptive resonance theory (ART).

Neural Feature Detectors
A neural feature detector is a (typically sensory) neuron that fires preferentially to inputs of a particular type. For example, many cells in visual cortex fire preferentially to lines of particular orientations. Such a cell could be called a line orientation detector.
–The detector in the figure is “tuned” to vertically oriented lines.
The pattern of firing as a function of some stimulus dimension (e.g., line orientation angle) is called the “tuning curve”.

Tuning vs. Categorizing
Although a feature detector cell is “tuned” to a particular orientation, it may or may not fire for other nearby orientations. Categorization, on the other hand, implies that a cell fires only for inputs in the category coded by the cell. This is closely related to the difference between “choice” and “contrast enhancement” in a recurrent competitive field (RCF): in the competitive learning network of Grossberg (1976), choice leads to a categorization network, and contrast enhancement leads to feature detectors with broader tuning curves.
[Figure: signal functions f(x) plotted against x: a faster-than-linear function and a sigmoid]

The Network of Grossberg (1976)
The network is a standard competitive learning network in which the F2 layer is a shunting recurrent competitive field (RCF). In this tutorial we will consider the case where F2 is a “choice” RCF network (i.e., one with a faster-than-linear signal function).
[Figure: an input layer projects to a competitive layer; competition allows only one or a relatively small number of cells to become active, and only weights projecting to active cells are allowed to increase]

Consider what happens when an input pattern is presented to the F1 stage of the network. Because the RCF at F2 is winner-take-all, the F2 cell with the largest input will win the competition. All other cells will have their activities quenched.

Which F2 cell will have the largest input? We need to consider the relationship between the input pattern (vector) and each cell’s weight vector: the input to an F2 cell is the dot product of the input pattern and the cell’s weight vector.

In vector notation, the input I_j to the F2 cell indexed by j is:

I_j = x · z_j = ||x|| ||z_j|| cos(θ_j)

where x is the vector of F1 cell activities, z_j is the vector of weights projecting to F2 cell j, and θ_j is the angle between these two vectors. Assume for now that the total activity at F1 is normalized, as are the weight vectors projecting to the F2 cells. In this case, the F2 cell with the most input will be the one whose afferent weight vector is closest to parallel to the activity pattern at F1, and thus has the largest cos(θ_j) term in the above equation.
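As a brief sketch (the numbers are made up for illustration, not taken from the notes): with normalized activities and weights, the winner-take-all choice at F2 reduces to an argmax over dot products.

```python
import numpy as np

# Toy setup: 2 F1 cells, 3 F2 cells; each row of Z is one F2 cell's
# afferent weight vector z_j, already unit-normalized.
x = np.array([0.6, 0.8])              # normalized F1 activity pattern (||x|| = 1)
Z = np.array([[1.0, 0.0],             # F2 cell 0
              [0.6, 0.8],             # F2 cell 1 (parallel to x)
              [0.0, 1.0]])            # F2 cell 2

I = Z @ x                             # input to each F2 cell: I_j = x . z_j
winner = int(np.argmax(I))            # winner-take-all: largest dot product wins
print(winner)                         # cell 1: its weight vector is parallel to x
```

Since all vectors are unit length, the dot product equals cos(θ_j), so the winner is the cell whose weight vector makes the smallest angle with the input.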

For a 2-D input space (i.e., 2 F1 cells), the situation can be schematized as follows:

The Instar Learning Law
Grossberg (1976) studied the effects of using an “instar” learning law, with Hebbian growth and post-synaptically gated decay, on the F1-to-F2 weights. If F2 is winner-take-all, learning will only occur in the weight vector projecting to the F2 cell that won the competition; this is competitive learning.

What happens to the afferent weight vector for the F2 cell that won the competition? The weight vector becomes more like the current input vector.

That is, the afferent weight vector becomes more parallel to the current F1 STM pattern This means that the winning F2 cell will become even more responsive to the current input pattern
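As an illustrative sketch (toy numbers of my own, not from the notes), a discrete-time instar update applied only to the winner pulls its weight vector toward the current input, so the cosine similarity between them increases:

```python
import numpy as np

def instar_step(z, x, y, rate=0.5):
    # Instar law in discrete time: z moves toward x at a rate gated by the
    # postsynaptic activity y; losers (y = 0) do not learn at all.
    return z + rate * y * (x - z)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

x = np.array([0.6, 0.8])              # current (normalized) F1 pattern
z = np.array([1.0, 0.0])              # winning F2 cell's afferent weight vector

before = cos_sim(z, x)
for _ in range(10):
    z = instar_step(z, x, y=1.0)      # winner is fully active
after = cos_sim(z, x)

print(before < after)                  # weight vector became more parallel to x
```

After repeated presentations the winner's weights approach x itself, so this cell responds even more strongly the next time the same pattern appears.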

Weight “Normalization” in the Grossberg (1976) Network
If the F1 layer is a shunting competitive network with normalized activity, then the weight vector projecting to each F2 cell will also be approximately normalized after learning has taken place. This can be seen by looking at the instar learning law:

dz_ij/dt = x_2j (x_1i - z_ij)

This can be interpreted as “z_ij tracks x_1i at a rate determined by x_2j”. Thus, each component of the F2 cell’s weight vector tracks the corresponding component of the input vector at F1, and since the input vector is normalized, the weight vector becomes approximately normalized as well.
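A quick numerical sketch of this claim (the cluster center, noise level, and learning rate are my own assumptions): if a cell repeatedly wins for a tight cluster of normalized inputs, its weight vector's norm drifts toward 1.

```python
import numpy as np

rng = np.random.default_rng(0)
u = np.array([1.0, 0.0, 0.0, 0.0])    # center of a tight cluster of inputs
z = np.array([3.0, -2.0, 1.0, 4.0])   # arbitrary initial weights, ||z|| far from 1

for _ in range(200):
    x = u + 0.1 * rng.normal(size=4)  # inputs clustered around u ...
    x /= np.linalg.norm(x)            # ... and normalized, as at a shunting F1
    z += 0.1 * (x - z)                # instar update with the cell active (x_2j = 1)

print(np.linalg.norm(z))              # close to 1: weights approximately normalized
```

The weights settle near a running average of the (unit-length) inputs the cell wins for, so the norm is only approximately 1; the tighter the cluster, the closer it gets.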

“Coding” of the Input Space
The afferent weight vectors for the F2 cells span the input space. When an input arrives at F1, the F2 cell whose afferent weight vector is closest to parallel to the input vector will have the most input and win the competition at F2. Thus, each F2 cell is tuned to (“codes”) a region of the input space roughly centered around its afferent weight vector.

We know that, for a single input pattern, the weight vector afferent to an F2 cell moves toward the input pattern vector. A more general question: what happens to the network as a whole when it is presented with many stimulus patterns? In particular, do the “features” (i.e., regions of input space) coded by the cells constantly change, or do they stabilize? Grossberg (1976) noted that we cannot in general guarantee that:
–an initial partition of input space will be maintained through time, or
–a “stable” partition will ever form

Strange learning sequences might cause the constant recoding of F2 cell firing preferences:

Sparse Patterns Theorem
However, we can guarantee that a partitioning into “sparse” classes (defined below) will persist.
Sparse patterns theorem: assume that you start out with a “sparse partition”, i.e., a partition such that the classifying weight vector for a given subset of input vectors is closer to all vectors in that subset than to any vector from any other subset.

Then, with instar learning and a choice network at F2, you are guaranteed that the initial partitions will not be recoded, and each weight vector will eventually enter the convex hull of the subset of input patterns that it codes.

Proof of Convergence to the Convex Hull
Grossberg (1976) provides an analytical proof; we will look at a simpler and more general “geometric” proof here (Guenther, 1992).
Key concept: for an acute angle ABC (vertex at B), as x is moved from B toward C along line segment BC, the distance between A and x initially decreases.
This concept is applied twice in the following proof.

To prove that the weight vector z will enter the convex hull of the training inputs, we can show that the closest point on the convex hull, c_min, keeps getting closer to z during learning. Consider the hyperplane H passing through c_min and perpendicular to the line segment z c_min.

First, note that all points in the convex hull must lie on H or on the side of H opposite z. To see this, note that any point c in the convex hull lying on the same side of H as z would form an acute angle z c_min c (vertex at c_min), which by the key concept means that some points between c_min and c would be closer to z, violating the definition of c_min. Thus all training inputs lie on the side of H opposite z, or on H itself.

This means that the angle c_min z c (vertex at z) must be acute for any training input c. Thus, incrementing the weight vector z toward any training input c from the convex hull will decrease the distance between z and c_min. That is, the weight vector constantly moves closer to the convex hull, and once in the hull it can never leave. More generally, constantly moving toward arbitrary points in a convex hull eventually brings the weight vector into the hull.
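The geometric argument can be sanity-checked numerically. In this sketch (triangle corners, starting point, and step size are my own choices), the training inputs are the corners of a triangle, whose convex hull is simply the set {(x, y): x >= 0, y >= 0, x + y <= 1}, and the weight vector is repeatedly moved toward a randomly chosen input:

```python
import random

random.seed(1)
# Training inputs: corners of a triangle. Their convex hull is the
# filled triangle {(x, y): x >= 0, y >= 0, x + y <= 1}.
inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
zx, zy = 3.0, 3.0                     # weight vector starts far outside the hull

for _ in range(100):
    cx, cy = random.choice(inputs)    # instar step toward an arbitrary training input
    zx += 0.2 * (cx - zx)
    zy += 0.2 * (cy - zy)

# After many steps the weight vector lies (numerically) inside the hull.
inside = zx >= -1e-3 and zy >= -1e-3 and zx + zy <= 1 + 1e-3
print(inside)
```

After k steps the influence of the starting point shrinks by a factor of 0.8 per step, so z becomes (up to a vanishing remainder) a convex combination of the inputs, which is exactly a point of the hull.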

Beyond Sparse Patterns
In the general case, we cannot rely on inputs falling into sparse classes. How, then, do we guarantee that cells in F2 won’t constantly be recoded? This issue became the motivation for top-down feedback in ART, with top-down weights adjusted according to the outstar learning law.