Simple Learning: Hebbian Learning and the Delta Rule


Simple Learning: Hebbian Learning and the Delta Rule Psych 85-419/719 Feb 1, 2001

“Correlational” Learning Suppose we have a set of samples from some domain we’re interested in, e.g., the people in this class: age, height, weight, etc. Each variable (e.g., age, weight) gives a vector of numbers across people: (22, 30, 19, …), (175, 130, 200, …). We can compute the correlation between any two of those vectors, and use it to predict one variable from another.

In Statistics... We gather samples, and can use a regression to predict one variable from another. For novel instances, we use the regression line to guess what the predicted variable is. [Scatterplot: height on the x-axis (48" to 84") vs. weight on the y-axis (100 to 250), with a fitted regression line.]

Learning: The problem decomposes: we can treat each output unit as its own, simpler problem to solve. So: how do we update the weights from the input units to an output unit?

The Hebb Rule The change in the weight from unit i to unit j is the product of the two units’ activities and a scaling factor u: Δw_ij = u · a_i · a_j. If both units’ activities are positive, or both are negative, the weight goes up; if the signs are opposite, the weight goes down. [Diagram: unit i connected to unit j by weight w_i,j.]

Example: c = a (u = 0.25)

  a    b    c     Δw1    Δw2
 -1   -1   -1    .25    .25
 -1    1   -1    .25   -.25
  1   -1    1    .25   -.25
  1    1    1    .25    .25
            Total:  1.0    0.0
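As a concreteness check, here is a minimal Python sketch of the Hebb rule applied to this example. The rule itself (Δw = u · input · output) and the ±1 patterns come from the slides; everything else (the function name hebb_train, the list layout) is my own illustration.

```python
# Minimal sketch of the Hebb rule: each weight change is
#   delta_w[i] = u * (input activity i) * (output activity)
# Assumes +/-1 activities and scaling factor u = 0.25, as in the slide.

def hebb_train(patterns, u=0.25):
    """Accumulate Hebbian weight changes over (inputs, target) pairs."""
    n_inputs = len(patterns[0][0])
    w = [0.0] * n_inputs
    for inputs, target in patterns:
        for i, a_i in enumerate(inputs):
            w[i] += u * a_i * target   # product of the two units' activities, scaled by u
    return w

# c = a: the target is simply the first input.
patterns_c_eq_a = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), 1), ((1, 1), 1)]
print(hebb_train(patterns_c_eq_a))     # [1.0, 0.0], matching the slide's totals
```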

A Failure: c = a or b Same four ±1 input patterns, but now the target is c = (a or b). [Table: per-pattern weight changes Δw1, Δw2 and running weights w1, w2; totals 0.75, 0.75.] The Hebb rule ends up weighting a and b equally, and with no bias unit the net input is zero on the mixed patterns (-1, +1) and (+1, -1), so the network cannot reproduce the OR target.

With Biases… d = a or b, now with unit c as a bias unit that is always on (+1) and has its own weight w3. [Table: per-pattern weight changes Δw1, Δw2, Δw3 and running weights w1, w2, w3; totals 0.75, 0.75, 0.75.] With equal positive weights on a, b, and the bias, the net input is negative only on the (-1, -1) pattern, so the network now produces d = a or b on every pattern.

A Real Failure A problem with four input units (a, b, c, d) and one output unit e, for which a set of weights solving it does exist, but which the Hebb rule nevertheless gets wrong. [Table: per-pattern weight changes Δw1 through Δw4 and running weights w1 through w4; totals 0.0, 0.0, 0.75, 0.0.] Three of the four weights end up at zero, so the output ignores most of its inputs and misclassifies some patterns.

Properties of the Hebb Rule What if unit activities are always positive? What happens to the weights over a long period of time? The rule doesn’t always work (even when a solution exists). The essential problem: each weight is a function only of the behavior of the two units it connects, not of how well the network as a whole is doing.

Something More Sophisticated: The Delta Rule The weight change is a function of the activity of the “from” unit (unit j) and the error e on the “to” unit (unit i), i.e., the difference between the target and the actual output: Δw_ij = u · e_i · a_j, with e_i = t_i - o_i. This means weight updates are implicitly a function of the overall network behavior, not just of the two units a weight happens to connect.
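The same idea in code: a minimal sketch of the delta rule, again assuming ±1 activities, a simple linear output unit, and learning rate u = 0.25. The function name delta_train and the choice to demonstrate it on d = (a or b) with a bias input are my own illustration, not from the slides.

```python
# Minimal sketch of the delta rule: delta_w[j] = u * error * (input activity j),
# where error = target - output. Unlike the Hebb rule, the update depends on
# how wrong the output unit currently is, not just on the two units' activities.

def delta_train(patterns, u=0.25, epochs=50):
    n_inputs = len(patterns[0][0])
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in patterns:
            output = sum(w_j * a_j for w_j, a_j in zip(w, inputs))
            error = target - output              # error on the "to" unit
            for j, a_j in enumerate(inputs):
                w[j] += u * error * a_j           # weight change = u * error * input
    return w

# d = a or b, with a constant +1 bias input appended to every pattern.
patterns_or = [((-1, -1, 1), -1), ((-1, 1, 1), 1), ((1, -1, 1), 1), ((1, 1, 1), 1)]
w = delta_train(patterns_or)
print([round(w_j, 2) for w_j in w])   # roughly [0.4, 0.3, 0.5]: with a fixed step size the
                                      # weights keep circling the least-squares solution (0.5, 0.5, 0.5)
for inputs, target in patterns_or:
    net = sum(w_j * a_j for w_j, a_j in zip(w, inputs))
    print(inputs, target, round(net, 2))   # net input has the same sign as the target on every pattern
```

With a learning rate that shrinks over time (or with updates summed over all patterns before being applied), the weights would settle on the least-squares solution itself, which is the convergence property the next slides refer to.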

The “Failure” Revisited Applying the delta rule to the same four-input problem that defeated the Hebb rule. [Table: for each pattern, the output error and the resulting weight changes Δw1 through Δw4 and running weights w1 through w4; totals after this pass: -0.5, 0.0, 0.5, 0.0.] And so on: further passes through the patterns keep adjusting the weights as long as any error remains.

Properties of the Delta Rule It can be proven that, with a suitably small learning rate, the delta rule converges to the weight vector that produces the minimum possible sum squared error between targets and outputs.

Proof...
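The transcript gives only the heading for this slide. As a placeholder, here is a brief sketch of the standard argument (my own summary, not necessarily the one given in class): with a linear output unit, the sum squared error is a quadratic bowl in the weights, and the delta rule is gradient descent on it.

```latex
% Sum squared error over patterns p, for a linear output o = sum_j w_j a_j:
E = \tfrac{1}{2} \sum_p \bigl( t^{(p)} - o^{(p)} \bigr)^2,
\qquad o^{(p)} = \sum_j w_j \, a_j^{(p)}

% Its gradient with respect to each weight is minus the summed (error x input):
\frac{\partial E}{\partial w_j}
  = -\sum_p \bigl( t^{(p)} - o^{(p)} \bigr) a_j^{(p)}
  = -\sum_p e^{(p)} a_j^{(p)}

% So the delta rule update steps downhill on E:
\Delta w_j = -u \, \frac{\partial E}{\partial w_j} = u \sum_p e^{(p)} a_j^{(p)}
```

Because E is quadratic with a single (global) minimum, repeatedly stepping downhill with a small enough learning rate reaches that minimum, which is the convergence property stated on the previous slide.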

Limitations

  a    b    a or b   a and b   a xor b
 -1   -1     -1        -1        -1
 -1    1      1        -1         1
  1   -1      1        -1         1
  1    1      1         1        -1

Cannot solve problems that are not linearly separable (e.g., XOR): no single line through the plane of input points (-1,-1), (-1,1), (1,-1), (1,1) puts the two XOR classes on opposite sides. Everyday “or” is often really exclusive or: “I’m going to a Penguins game on Monday or Wednesday” implies that I’m not going to BOTH games!
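To make the XOR claim concrete, here is a short worked argument (my own illustration, not from the slides). Suppose a single unit computed a xor b by thresholding w1·a + w2·b + b0 at zero; the four patterns would then impose contradictory constraints:

```latex
% Output should be +1 when the net input is positive, -1 when it is negative.
-w_1 - w_2 + b_0 < 0 \quad (a=-1,\ b=-1 \ \Rightarrow\ \text{xor} = -1)
-w_1 + w_2 + b_0 > 0 \quad (a=-1,\ b=+1 \ \Rightarrow\ \text{xor} = +1)
\phantom{-}w_1 - w_2 + b_0 > 0 \quad (a=+1,\ b=-1 \ \Rightarrow\ \text{xor} = +1)
\phantom{-}w_1 + w_2 + b_0 < 0 \quad (a=+1,\ b=+1 \ \Rightarrow\ \text{xor} = -1)

% Adding the middle two inequalities gives 2 b_0 > 0; adding the outer two gives 2 b_0 < 0.
% No weights satisfy both, so XOR is not linearly separable.
```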

Next Class… Pattern Association. Read the handout … and Chapter 11 of PDP1. Optional: read Chapter 9 of PDP1 (a brush-up on linear algebra). Homework #2 handed out.