
Lecture 6. Learning (III): Associative Learning and Principal Component Analysis

Outline
- Associative Learning: Hebbian Learning
- Using associative learning to compute principal components

(C) 2001 by Yu Hen Hu

Associative (Hebbian) Learning
A learning rule postulated by Hebb (1949) for associative learning at the cellular level:
  Δw_kj(n) = w_kj(n+1) – w_kj(n) = η d_k(n) x_j(n),   or   Δw(n) = η d_k(n) x(n)
The change of a synaptic weight is proportional to the correlation of the (desired) activation d_k(n) and the input x(n). The weight vector w(n) converges to the cross-correlation of d_k(n) and x(n).
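A minimal NumPy sketch of this plain Hebbian update; the function name, learning-rate value, and the random training loop are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def hebbian_update(w, x, d, eta=0.1):
    """One plain Hebbian step: delta_w = eta * d * x.

    w : (n,) current weight vector of output unit k
    x : (n,) input vector x(n)
    d : scalar (desired) activation d_k(n)
    """
    return w + eta * d * x

# Illustrative usage: the weights accumulate the cross-correlation of d and x.
rng = np.random.default_rng(0)
w = np.zeros(3)
for _ in range(100):
    x = rng.standard_normal(3)
    d = x @ np.array([1.0, -2.0, 0.5])      # hypothetical target activation
    w = hebbian_update(w, x, d, eta=0.01)
print(w)    # grows toward a scaled version of E[d * x]
```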

Potential Applications of Associative Learning
Pattern retrieval: each pattern (input x) is associated with a label (output d). The association is stored in the weights W of a single-layer network. When a test pattern is presented at the input, the label of the stored pattern closest to the test pattern should appear at the output. This is the mechanism of an "associative memory."

Linear Associative Memory
Given M pairs of training vectors {(b_m, a_m); m = 1, 2, ..., M}, with input b_m associated to output a_m.
Weight matrix (Hebbian outer-product rule): W = η Σ_{m=1}^{M} a_m b_m^T
Retrieval: y = W x. Let x = b_k; then y = η Σ_m a_m (b_m^T b_k), which is proportional to a_k when the stored inputs b_m are mutually orthogonal.

Example
Consider three input patterns x(1) = [1 2 –1]^T, x(2) = [–1 1 1]^T, x(3) = [1 0 1]^T and their corresponding 2-D labels y(1) = [–1 2]^T, y(2) = [2 4]^T, y(3) = [1 –3]^T.
Compute the cross-correlation matrix with Hebbian learning, using learning rate η = 1:
  W = Σ_m y(m) x(m)^T = [ –2  0  4 ; –5  8  –1 ]
Recall: present x(2). Then y = W x(2) = [6 12]^T = 3·[2 4]^T, which is proportional to y(2)!
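A short NumPy check of this example, forming W as the sum of outer products y(m) x(m)^T with η = 1 and then recalling with x(2); variable names are illustrative.

```python
import numpy as np

# Training pairs from the example (rows are the patterns).
X = np.array([[ 1, 2, -1],
              [-1, 1,  1],
              [ 1, 0,  1]], dtype=float)    # x(1), x(2), x(3)
Y = np.array([[-1,  2],
              [ 2,  4],
              [ 1, -3]], dtype=float)       # y(1), y(2), y(3)

# Hebbian outer-product weight matrix with eta = 1: W = sum_m y(m) x(m)^T
W = Y.T @ X
print(W)            # [[-2.  0.  4.]
                    #  [-5.  8. -1.]]

# Recall: present x(2).
print(W @ X[1])     # [ 6. 12.]  == 3 * y(2)
```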

Hebbian Learning Modifications
Problem with Hebbian learning: when both d_k(n) and x_j(n) are positive, the weight w_kj(n) grows without bound.
Kohonen (1988) proposed adding a forgetting factor:
  Δw_kj(n) = η d_k(n) x_j(n) – α d_k(n) w_kj(n),   or   Δw_kj(n) = α d_k(n)[c x_j(n) – w_kj(n)],   where c = η/α.
Analysis: suppose x_j(n) > 0 and d_k(n) > 0. Once w_kj(n) grows to where c x_j(n) < w_kj(n), we get Δw_kj(n) < 0, which in turn reduces w_kj(n) at future steps. It can be shown that when |α d_k(n)| < 1, |w_kj(n)| remains bounded provided d_k(n) and x_j(n) are bounded.
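A sketch of the forgetting-factor variant, illustrating the boundedness argument above; the constant inputs and the particular η, α values are assumptions chosen for the demonstration.

```python
import numpy as np

def hebbian_forget(w, x, d, eta=0.1, alpha=0.2):
    """Hebbian step with Kohonen's forgetting term:
    delta_w = eta * d * x - alpha * d * w
            = alpha * d * (c * x - w),   with c = eta / alpha
    """
    return w + eta * d * x - alpha * d * w

# With constant positive d and x (and |alpha * d| < 1), w no longer grows
# without bound: it settles near c * x = (eta / alpha) * x.
w = np.zeros(3)
x = np.array([1.0, 2.0, 0.5])
d = 0.8
for _ in range(200):
    w = hebbian_forget(w, x, d)
print(w, (0.1 / 0.2) * x)   # w ≈ 0.5 * x
```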

Principal Component Analysis (PCA)
Principal components: the part of a signal (its feature vectors) that contains most of the signal's energy. Also known as the Karhunen-Loève Transform.
A linear transform of an n-dimensional vector x, Tx, such that ||x – Tx||^2 is minimum among all n-by-n matrices T of rank p < n.
Applications: feature dimension reduction, data compression.

PCA Method
Given {x(k); 1 ≤ k ≤ K}, with each x(k) of dimension n:
1. Form an n-by-n covariance matrix estimate: C = (1/K) Σ_k (x(k) – m)(x(k) – m)^T, where m is the sample mean.
2. Eigenvalue decomposition: C = V Λ V^T, with eigenvalues λ_1 ≥ λ_2 ≥ … ≥ λ_n and eigenvectors v_1, …, v_n as the columns of V.
3. Principal components: y(k) = V_p^T x(k) = [v_1 v_2 … v_p]^T x(k), p ≤ n, and Tx(k) = V_p y(k) = V_p V_p^T x(k).
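A minimal NumPy sketch of these three steps (covariance estimate, eigendecomposition, projection onto the top p eigenvectors). Mean-centering before projection is an assumption consistent with the covariance estimate above; function and variable names are illustrative.

```python
import numpy as np

def pca(X, p):
    """PCA of data matrix X (K samples in rows, n features), keeping p components.

    Returns projections y(k) = V_p^T (x(k) - m) and reconstructions T x(k).
    """
    m = X.mean(axis=0)
    Xc = X - m
    # 1. n-by-n covariance estimate C = (1/K) sum_k (x(k) - m)(x(k) - m)^T
    C = Xc.T @ Xc / X.shape[0]
    # 2. Eigenvalue decomposition C = V Lambda V^T (eigh returns ascending order)
    lam, V = np.linalg.eigh(C)
    order = np.argsort(lam)[::-1]          # sort eigenvalues descending
    Vp = V[:, order[:p]]                   # top-p eigenvectors, n x p
    # 3. Principal components and rank-p reconstruction
    Y = Xc @ Vp                            # y(k) = Vp^T (x(k) - m)
    Xhat = Y @ Vp.T + m                    # T x(k) = Vp Vp^T (x(k) - m) + m
    return Y, Xhat

# Illustrative usage on random correlated data
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 5))
Y, Xhat = pca(X, p=2)
print(Y.shape, np.mean((X - Xhat) ** 2))   # (100, 2), mean-square reconstruction error
```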

Optimality of Principal Components
Among all n-by-n matrices T' of rank p ≤ n,
  ||x – Tx||^2 = Σ_{i=p+1}^{n} λ_i ≤ ||x – T'x||^2.
This is the orthogonality principle: y(k) is the p-dimensional representation of x(k) in the subspace spanned by the columns of V_p, and ||y(k)|| = ||Tx(k)|| is the length of the projection (shadow) of x(k) onto that subspace.

Principal Component Network
Linear neuron model: choose w to maximize E{||y||^2} = w^T E{x x^T} w = w^T C w, subject to ||w||^2 = 1.
Generalized Hebbian learning rule (Sanger) for extracting several components:
  Δw_ji(n) = η [ y_j(n) x_i(n) – y_j(n) Σ_{k ≤ j} w_ki(n) y_k(n) ]
[Figure: single-layer linear network mapping inputs x_1, x_2, …, x_M to outputs y_1, y_2, …, y_N]
Demo code: ghademo.m, ghafun.m
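A hedged NumPy sketch of the generalized Hebbian (Sanger) rule for a linear layer y = W x; this is an illustrative re-implementation, not the contents of ghademo.m or ghafun.m, and the learning rate, iteration count, and data generator are assumptions.

```python
import numpy as np

def gha_step(W, x, eta=0.01):
    """One generalized Hebbian (Sanger) update for a linear layer y = W x.

    W : (p, n) weight matrix, one row per output / principal component
    Update: delta_W = eta * (y x^T - tril(y y^T) W)
    """
    y = W @ x
    return W + eta * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

# Illustrative usage: for zero-mean data, the rows of W converge (up to sign)
# toward the leading eigenvectors of E[x x^T].
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
W = 0.1 * rng.standard_normal((2, 4))
for _ in range(5000):
    x = A @ rng.standard_normal(4)          # zero-mean correlated samples
    W = gha_step(W, x, eta=0.001)
print(W / np.linalg.norm(W, axis=1, keepdims=True))  # approx. top-2 eigenvectors
```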