Simple Learning: Hebbian Learning and the Delta Rule


Simple Learning: Hebbian Learning and the Delta Rule Psych 85-419/719 Feb 1, 2001

“Correlational” Learning Suppose we have a set of samples from some domain we’re interested in, e.g., the people in this class: age, height, weight, etc. Each variable (e.g., age, weight) gives a vector of numbers across people: (22, 30, 19, …), (175, 130, 200, …). We can compute the correlation between any two of those vectors, and use it to predict one variable from another.

In Statistics... We gather samples, and can use a regression to predict one variable from another. For novel instances, we use the regression line to guess what the predicted variable is. [Scatterplot: height on the x-axis (48" to 84") vs. weight on the y-axis (100 to 250), with a fitted regression line.]

Learning: The problem decomposes: we can treat each output unit as its own, simpler problem to solve. So: how do we update the weights from the input units to an output unit?

The Hebb Rule The change in the weight from unit i to unit j is the product of the two units’ activities and a scaling factor u: Δw_ij = u · a_i · a_j. If both units’ activities are positive, or both are negative, the weight goes up; if the signs are opposite, the weight goes down. [Diagram: unit i connected to unit j by weight w_i,j.]

Example: c = a (u = 0.25)

  a    b    c     Δw1    Δw2
 -1   -1   -1    .25    .25
 -1    1   -1    .25   -.25
  1   -1    1    .25   -.25
  1    1    1    .25    .25
            Total:  1.0    0.0
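As a concreteness check, here is a minimal Python sketch of the Hebb rule applied to this example. The rule itself (Δw = u · input · output) and the ±1 patterns come from the slides; everything else (the function name hebb_train, the list layout) is my own illustration.

```python
# Minimal sketch of the Hebb rule: each weight change is
#   delta_w[i] = u * (input activity i) * (output activity)
# Assumes +/-1 activities and scaling factor u = 0.25, as in the slide.

def hebb_train(patterns, u=0.25):
    """Accumulate Hebbian weight changes over (inputs, target) pairs."""
    n_inputs = len(patterns[0][0])
    w = [0.0] * n_inputs
    for inputs, target in patterns:
        for i, a_i in enumerate(inputs):
            w[i] += u * a_i * target   # product of the two units' activities, scaled by u
    return w

# c = a: the target is simply the first input.
patterns_c_eq_a = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), 1), ((1, 1), 1)]
print(hebb_train(patterns_c_eq_a))     # [1.0, 0.0], matching the slide's totals
```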

A Failure: c = a or b Same four ±1 input patterns, but now the target is c = (a or b). [Table: per-pattern weight changes Δw1, Δw2 and running weights w1, w2; totals 0.75, 0.75.] The Hebb rule ends up weighting a and b equally, and with no bias unit the net input is zero on the mixed patterns (-1, +1) and (+1, -1), so the network cannot reproduce the OR target.

With Biases… d = a or b, now with unit c as a bias unit that is always on (+1) and has its own weight w3. [Table: per-pattern weight changes Δw1, Δw2, Δw3 and running weights w1, w2, w3; totals 0.75, 0.75, 0.75.] With equal positive weights on a, b, and the bias, the net input is negative only on the (-1, -1) pattern, so the network now produces d = a or b on every pattern.

A Real Failure A problem with four input units (a, b, c, d) and one output unit e, for which a set of weights solving it does exist, but which the Hebb rule nevertheless gets wrong. [Table: per-pattern weight changes Δw1 through Δw4 and running weights w1 through w4; totals 0.0, 0.0, 0.75, 0.0.] Three of the four weights end up at zero, so the output ignores most of its inputs and misclassifies some patterns.

Properties of the Hebb Rule What if unit activities are always positive? What happens to the weights over a long period of time? The rule doesn’t always work (even when a solution exists). The essential problem: each weight is a function only of the behavior of the two units it connects, not of how well the network as a whole is doing.

Something More Sophisticated: The Delta Rule The weight change is a function of the activity of the “from” unit (unit j) and the error e on the “to” unit (unit i), i.e., the difference between the target and the actual output: Δw_ij = u · e_i · a_j, with e_i = t_i - o_i. This means weight updates are implicitly a function of the overall network behavior, not just of the two units a weight happens to connect.
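The same idea in code: a minimal sketch of the delta rule, again assuming ±1 activities, a simple linear output unit, and learning rate u = 0.25. The function name delta_train and the choice to demonstrate it on d = (a or b) with a bias input are my own illustration, not from the slides.

```python
# Minimal sketch of the delta rule: delta_w[j] = u * error * (input activity j),
# where error = target - output. Unlike the Hebb rule, the update depends on
# how wrong the output unit currently is, not just on the two units' activities.

def delta_train(patterns, u=0.25, epochs=50):
    n_inputs = len(patterns[0][0])
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in patterns:
            output = sum(w_j * a_j for w_j, a_j in zip(w, inputs))
            error = target - output              # error on the "to" unit
            for j, a_j in enumerate(inputs):
                w[j] += u * error * a_j           # weight change = u * error * input
    return w

# d = a or b, with a constant +1 bias input appended to every pattern.
patterns_or = [((-1, -1, 1), -1), ((-1, 1, 1), 1), ((1, -1, 1), 1), ((1, 1, 1), 1)]
w = delta_train(patterns_or)
print([round(w_j, 2) for w_j in w])   # roughly [0.4, 0.3, 0.5]: with a fixed step size the
                                      # weights keep circling the least-squares solution (0.5, 0.5, 0.5)
for inputs, target in patterns_or:
    net = sum(w_j * a_j for w_j, a_j in zip(w, inputs))
    print(inputs, target, round(net, 2))   # net input has the same sign as the target on every pattern
```

With a learning rate that shrinks over time (or with updates summed over all patterns before being applied), the weights would settle on the least-squares solution itself, which is the convergence property the next slides refer to.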

The “Failure” Revisited Applying the delta rule to the same four-input problem that defeated the Hebb rule. [Table: for each pattern, the output error and the resulting weight changes Δw1 through Δw4 and running weights w1 through w4; totals after this pass: -0.5, 0.0, 0.5, 0.0.] And so on: further passes through the patterns keep adjusting the weights as long as any error remains.

Properties of the Delta Rule It can be proven that, with a suitably small learning rate, the delta rule converges to the weight vector that produces the minimum possible sum squared error between targets and outputs.

Proof...
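The transcript gives only the heading for this slide. As a placeholder, here is a brief sketch of the standard argument (my own summary, not necessarily the one given in class): with a linear output unit, the sum squared error is a quadratic bowl in the weights, and the delta rule is gradient descent on it.

```latex
% Sum squared error over patterns p, for a linear output o = sum_j w_j a_j:
E = \tfrac{1}{2} \sum_p \bigl( t^{(p)} - o^{(p)} \bigr)^2,
\qquad o^{(p)} = \sum_j w_j \, a_j^{(p)}

% Its gradient with respect to each weight is minus the summed (error x input):
\frac{\partial E}{\partial w_j}
  = -\sum_p \bigl( t^{(p)} - o^{(p)} \bigr) a_j^{(p)}
  = -\sum_p e^{(p)} a_j^{(p)}

% So the delta rule update steps downhill on E:
\Delta w_j = -u \, \frac{\partial E}{\partial w_j} = u \sum_p e^{(p)} a_j^{(p)}
```

Because E is quadratic with a single (global) minimum, repeatedly stepping downhill with a small enough learning rate reaches that minimum, which is the convergence property stated on the previous slide.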

Limitations

  a    b    a or b   a and b   a xor b
 -1   -1     -1        -1        -1
 -1    1      1        -1         1
  1   -1      1        -1         1
  1    1      1         1        -1

Cannot solve problems that are not linearly separable (e.g., XOR): no single line through the plane of input points (-1,-1), (-1,1), (1,-1), (1,1) puts the two XOR classes on opposite sides. Everyday “or” is often really exclusive or: “I’m going to a Penguins game on Monday or Wednesday” implies that I’m not going to BOTH games!
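To make the XOR claim concrete, here is a short worked argument (my own illustration, not from the slides). Suppose a single unit computed a xor b by thresholding w1·a + w2·b + b0 at zero; the four patterns would then impose contradictory constraints:

```latex
% Output should be +1 when the net input is positive, -1 when it is negative.
-w_1 - w_2 + b_0 < 0 \quad (a=-1,\ b=-1 \ \Rightarrow\ \text{xor} = -1)
-w_1 + w_2 + b_0 > 0 \quad (a=-1,\ b=+1 \ \Rightarrow\ \text{xor} = +1)
\phantom{-}w_1 - w_2 + b_0 > 0 \quad (a=+1,\ b=-1 \ \Rightarrow\ \text{xor} = +1)
\phantom{-}w_1 + w_2 + b_0 < 0 \quad (a=+1,\ b=+1 \ \Rightarrow\ \text{xor} = -1)

% Adding the middle two inequalities gives 2 b_0 > 0; adding the outer two gives 2 b_0 < 0.
% No weights satisfy both, so XOR is not linearly separable.
```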

Next Class… Pattern Association. Read the handout … and Chapter 11 of PDP1. Optional: read Chapter 9 of PDP1 (a brush-up on linear algebra). Homework #2 handed out.