Counter propagation network (CPN) (§ 5.3)


Counter propagation network (CPN) (§ 5.3)
Basic idea of CPN
- Purpose: fast and coarse approximation of a vector mapping y = φ(x)
  - not to map any given x to its φ(x) with a given precision; instead, the input vectors x are divided into clusters/classes
  - each cluster of x has one output y, which is (hopefully) the average of φ(x) for all x in that class
Architecture (simple case: forward-only CPN):
    x_1              z_1              y_1
    x_i --w_{k,i}--> z_k --v_{j,k}--> y_j
    x_n              z_p              y_m
  (input layer)   (hidden/class layer)   (output layer)
- weights w: from input to hidden (class) layer
- weights v: from hidden (class) layer to output layer
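As a concrete reading of the forward-only architecture, here is a minimal sketch of the forward pass, assuming a distance-based winner-take-all hidden layer; the function name is illustrative, with w of shape (p, n) and v of shape (m, p) as in the diagram above:

```python
# Forward pass of a forward-only CPN: the hidden (class) layer is winner-take-all,
# and the output is simply the outgoing weight vector of the winning cluster node.
import numpy as np

def cpn_forward(x, w, v):
    """x: input (n,); w: input-to-hidden weights (p, n); v: hidden-to-output weights (m, p)."""
    k_star = int(np.argmin(np.linalg.norm(w - x, axis=1)))  # winning cluster node z_k*
    return v[:, k_star]                                      # output y = outgoing weights of z_k*
```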

Learning in two phases:
  training sample (x, d), where d = φ(x) is the desired precise mapping
Phase 1: the weights w coming into the hidden nodes z are trained by competitive learning to become the representative vector of a cluster of input vectors x (use only x, the input part of (x, d)):
  1. For a chosen x, feedforward to determine the winning hidden node z_{k*}
  2. Update its incoming weights: w_{k*}(new) = w_{k*}(old) + α(x − w_{k*}(old))
  3. Reduce α, then repeat steps 1 and 2 until the stop condition is met
Phase 2: the weights v going out of the hidden nodes are trained by the delta rule to become an average output of φ(x) for the input vectors x that cause z_{k*} to win (use both x and d):
  1. For a chosen sample (x, d), feedforward to determine the winning hidden node z_{k*}
  2. (optional) Update w_{k*} as in phase 1
  3. Update its outgoing weights: v_{k*}(new) = v_{k*}(old) + β(d − v_{k*}(old))
  4. Repeat steps 1 – 3 until the stop condition is met
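A minimal sketch of this two-phase training, assuming Euclidean winner selection and a simplified learning-rate schedule; the function name and the schedule constants are illustrative, not from the slides:

```python
# Two-phase forward-only CPN training: competitive learning for the hidden
# weights w, then the delta rule for the outgoing weights v of the winner.
import numpy as np

def train_cpn(samples, p, alpha=0.3, beta=0.1, epochs1=20, epochs2=20):
    X = np.array([x for x, d in samples], dtype=float)
    D = np.array([d for x, d in samples], dtype=float)
    m = D.shape[1]
    w = X[np.random.choice(len(X), p, replace=False)].copy()  # hidden (cluster) weights, shape (p, n)
    v = np.zeros((m, p))                                       # output weights, shape (m, p)

    # Phase 1: competitive learning on the inputs only
    for _ in range(epochs1):
        for x in X:
            k = int(np.argmin(np.linalg.norm(w - x, axis=1)))  # winning hidden node z_k*
            w[k] += alpha * (x - w[k])                         # move w_k* toward x
        alpha *= 0.95                                          # reduce alpha each pass

    # Phase 2: delta rule on the outgoing weights of the winner
    for _ in range(epochs2):
        for x, d in zip(X, D):
            k = int(np.argmin(np.linalg.norm(w - x, axis=1)))
            v[:, k] += beta * (d - v[:, k])                    # drives v_k* toward the cluster average of d
    return w, v
```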

Notes
- A combination of both unsupervised learning (for w in phase 1) and supervised learning (for v in phase 2).
- After phase 1, clusters are formed among the sample inputs x; each w_k is a representative (average) of one cluster.
- After phase 2, each cluster k maps to an output vector y_k, which is the average of φ(x) over all x in that cluster.
- Phase 2 learning can be viewed as following the delta rule: Δv_k = β(d − v_k) whenever z_k wins.
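For intuition (a short derivation added here, not on the slides): the fixed point of the phase-2 update is exactly the cluster average of the desired outputs.

```latex
% Phase-2 (delta-rule) update for the winning hidden node k:
\Delta v_k = \beta \,(d - v_k)
% At the fixed point the expected update over cluster C_k vanishes, so
v_k^{*} = \frac{1}{|C_k|} \sum_{x \in C_k} \varphi(x)
% i.e. v_k converges to the average desired output of the samples that node k wins.
```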

After training, the network works like the look-up of a math table.
- For any input x, find the region where x falls (represented by the winning z node); use that region as the index to look up the table for the function value.
- CPN works in multi-dimensional input space.
- More cluster nodes (z) give a more accurate mapping.
- Training is much faster than BP.
- May have the linear separability problem.

Full CPN
- If both y = φ(x) and its inverse x = φ^{-1}(y) exist, we can establish a bi-directional approximation.
- Two pairs of weight matrices:
  W (x to z) and V (z to y) for the approximate map from x to y
  U (y to z) and T (z to x) for the approximate map from y to x
- When a training sample (x, y) is applied (x to the X layer and y to the Y layer), the two sides can jointly determine the winner z_{k*}, or the winners can be determined separately for the two directions.
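A minimal sketch of joint winner selection during full-CPN training, assuming distance-based competition; W and U denote the x-to-z and y-to-z weight arrays named above, everything else is illustrative:

```python
# Joint winner in the full CPN: during training, both halves of the sample
# (x, y) vote for the cluster node via their distances to its weight vectors.
import numpy as np

def full_cpn_winner(x, y, W, U):
    """W: (p, n) weights x->z; U: (p, m) weights y->z; returns the joint winner k*."""
    score = np.linalg.norm(W - x, axis=1) + np.linalg.norm(U - y, axis=1)
    return int(np.argmin(score))
```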

Adaptive Resonance Theory (ART) (§ 5.4)
ART1: for binary patterns; ART2: for continuous patterns
Motivations: the previous methods have the following problems:
- The number of class nodes is pre-determined and fixed; under- and over-classification may result from training, and some nodes may end up with empty classes.
- No control over the degree of similarity of the inputs grouped in one class.
- Training is non-incremental: with a fixed set of samples, adding new samples often requires retraining the network with the enlarged training set until a new stable state is reached.

Ideas of the ART model:
- Suppose the input samples have been appropriately classified into k clusters (say, by some form of competitive learning).
- Each weight vector w_j is then a representative (average) of all samples in cluster j.
- When a new input vector x arrives:
  1. Find the winner j* among all k cluster nodes.
  2. Compare w_{j*} with x: if they are sufficiently similar (x resonates with class j*), then update w_{j*} based on x;
     else, find/create a free class node and make x its first member.
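A minimal sketch of this idea (not the full ART1 algorithm of the following slides); the cosine similarity measure and the constants rho and alpha are illustrative choices:

```python
# Generic ART-style clustering: find the closest prototype, accept it only if it
# is similar enough to x (resonance), otherwise open a new class with x as its
# first member.
import numpy as np

def similarity(x, w):
    # cosine similarity as an illustrative similarity test
    return float(x @ w) / (np.linalg.norm(x) * np.linalg.norm(w) + 1e-12)

def art_like_clustering(samples, rho=0.7, alpha=0.5):
    prototypes = []                                  # one prototype vector w_j per class
    labels = []
    for x in samples:
        x = np.asarray(x, dtype=float)
        sims = [similarity(x, w) for w in prototypes]
        j = int(np.argmax(sims)) if sims else -1     # winner j*
        if j >= 0 and sims[j] >= rho:                # x resonates with class j*
            prototypes[j] += alpha * (x - prototypes[j])
            labels.append(j)
        else:                                        # create a new class node
            prototypes.append(x.copy())
            labels.append(len(prototypes) - 1)
    return prototypes, labels
```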

To achieve this, we need:
- a mechanism for testing and determining the (dis)similarity between x and w_{j*};
- a control for finding/creating new class nodes;
- all operations implemented by units of local computation.
Only the basic ideas are presented here, simplified from the original ART model: some of the control mechanisms realized in the original model by various specialized neurons are done here by logic statements of the algorithm.

ART1 Architecture

Working of ART1: three phases after each input vector x is applied.
Recognition phase: determine the winner cluster for x
- using the bottom-up weights b;
- the winner j* has the maximal activation y_{j*} = b_{j*} · x;
- x is tentatively classified to cluster j*;
- but the winner may still be far away from x (e.g., |t_{j*} − x| is unacceptably large).
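The recognition phase in isolation is just a dot-product competition; a minimal sketch (B is a hypothetical array holding one bottom-up weight row per cluster node):

```python
# Recognition phase: the tentative winner j* maximizes the bottom-up activation
# y_j = b_j . x over all existing cluster nodes.
import numpy as np

def recognition(x, B):
    y = B @ x                    # bottom-up activations of all cluster nodes
    return int(np.argmax(y))     # tentative winner j*
```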

Working of ART1 (3 phases)
Comparison phase:
- Compute the similarity using the top-down weights t: s = x ∧ t_{j*} (component-wise AND of the binary vectors).
- If (# of 1's in s) / (# of 1's in x) > ρ, accept the classification and update b_{j*} and t_{j*};
- else: remove j* from further consideration, look for another potential winner, or create a new node with x as its first pattern.
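A minimal sketch of this vigilance test for binary vectors; the function name and the example values are illustrative:

```python
# ART1 comparison (vigilance) test: s = x AND t_j*, accept the tentative winner
# only if the fraction of x's 1's retained in s exceeds the vigilance rho.
import numpy as np

def vigilance_test(x, t_j, rho):
    s = x * t_j                          # component-wise AND for 0/1 vectors
    match = s.sum() / max(x.sum(), 1)    # (# of 1's in s) / (# of 1's in x)
    return bool(match > rho), s          # accept classification?  s is also needed for the update

# Example: x resonates with t_j because 2 of its 3 ones are kept (2/3 > 0.6)
x   = np.array([1, 1, 0, 1, 0])
t_j = np.array([1, 1, 0, 0, 1])
print(vigilance_test(x, t_j, rho=0.6))   # (True, array([1, 1, 0, 0, 0]))
```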

Weight update / adaptive phase
- Initial weights (no bias):
  bottom-up: b_{l,j}(0) = 1 / (1 + n)
  top-down:  t_{j,l}(0) = 1
- When a resonance occurs with j*, with s = x ∧ t_{j*}(old):
  t_{j*,l}(new) = s_l
  b_{l,j*}(new) = s_l / (0.5 + Σ_l s_l)
- If k sample patterns are clustered to node j, then t_j = the pattern whose 1's are common to all of these k samples.
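Putting the three phases together, a minimal ART1 sketch for binary inputs; the concrete constants (the fast-learning denominator 0.5, committing a new node directly with x rather than starting from the uncommitted initial weights) follow a standard simplified ART1 formulation and should be read as assumptions where the slides leave details implicit:

```python
# Simplified ART1 for binary (0/1) inputs: recognition, comparison (vigilance
# test), reset/search, and fast-learning weight update.  Constants are standard
# choices, not necessarily those used in the original lecture example.
import numpy as np

class ART1:
    def __init__(self, n_inputs, rho=0.7):
        self.n = n_inputs
        self.rho = rho
        self.B = np.empty((0, n_inputs))   # bottom-up weights, one row per cluster node
        self.T = np.empty((0, n_inputs))   # top-down weights, one row per cluster node

    def present(self, x):
        x = np.asarray(x, dtype=float)
        candidates = list(range(len(self.T)))
        while candidates:
            # recognition phase: tentative winner j* maximizes y_j = b_j . x
            j = max(candidates, key=lambda c: float(self.B[c] @ x))
            # comparison phase: s = x AND t_j*, accept if the match ratio exceeds rho
            s = x * self.T[j]
            if s.sum() / max(x.sum(), 1) > self.rho:
                # adaptive phase (fast learning): t_j* <- s, b_j* <- s / (0.5 + |s|)
                self.T[j] = s
                self.B[j] = s / (0.5 + s.sum())
                return j
            candidates.remove(j)            # reset: inhibit j* and search again
        # no existing cluster resonates: create a new node with x as its first member
        self.T = np.vstack([self.T, x])
        self.B = np.vstack([self.B, x / (0.5 + x.sum())])
        return len(self.T) - 1

# Usage: the first pattern opens cluster 0; the second resonates with it (match 2/3 > 0.5)
net = ART1(n_inputs=4, rho=0.5)
print(net.present([1, 1, 0, 0]))   # -> 0
print(net.present([1, 1, 1, 0]))   # -> 0
```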

Example (worked on the slide): when the first input x(1) is presented, node 1 wins.

Notes
- Classification as a search process.
- No two classes have the same b and t.
- Outliers that do not belong to any cluster will be assigned separate nodes.
- Different orderings of the sample input presentations may result in different classifications.
- Increasing ρ increases the number of classes learned and decreases the average class size.
- Classification may shift during search, but will eventually reach stability.
- There are different versions of ART1 with minor variations.
- ART2 is the same in spirit but differs in details.

ART1 Architecture (figure: input and interface layers, cluster layer, gain control units G1 and G2, and reset unit R, connected by excitatory (+) and inhibitory (−) links)

- cluster units: competitive; receive the input vector x through the bottom-up weights b to determine the winner j*.
- input units: placeholder for the external inputs s.
- interface units: pass s on as the input vector x for classification, and compare x with t_{j*}; controlled by gain control unit G1.
- The three phases need to be sequenced (by the control units G1, G2, and R).

- R = 0: resonance occurs; update t_{j*} and b_{j*}.
- R = 1: x fails the similarity test; inhibit j* from further computation.