IE 585 History of Neural Networks & Introduction to Simple Learning Rules
2 Elements of Neural Networks
3 Types of Transfer Function
– Linear (Identity) Function: y = wx
– Piecewise-Linear Function: y = 1 if wx ≥ 1/(2θ); y = θ·wx + 1/2 if |wx| < 1/(2θ); y = 0 if wx ≤ -1/(2θ)
[Figure: the linear and piecewise-linear transfer functions plotted against the net input wx]
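As a concrete illustration of these two functions, here is a minimal Python sketch (not part of the original slides; the parameter name theta and the sample values are assumptions):

    # Linear (identity) transfer function: the output equals the net input wx.
    def linear(net):
        return net

    # Piecewise-linear transfer function: a ramp of slope theta saturating at 0 and 1.
    def piecewise_linear(net, theta=1.0):
        if net >= 1.0 / (2.0 * theta):
            return 1.0
        if net <= -1.0 / (2.0 * theta):
            return 0.0
        return theta * net + 0.5

    print(linear(0.3))              # 0.3
    print(piecewise_linear(0.0))    # 0.5 (middle of the ramp)
    print(piecewise_linear(2.0))    # 1.0 (saturated)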
4 Types of Transfer Function: Step Function
– Binary transfer function: y = 1 if wx ≥ 0; y = 0 if wx < 0
– Bipolar transfer function: y = 1 if wx ≥ 0; y = -1 if wx < 0
[Figure: the binary and bipolar step functions plotted against the net input wx]
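A minimal Python sketch of the two step functions described above (an illustration, not from the slides):

    # Binary step transfer function: fires 1 when the net input is non-negative, else 0.
    def binary_step(net):
        return 1 if net >= 0 else 0

    # Bipolar step transfer function: fires 1 when the net input is non-negative, else -1.
    def bipolar_step(net):
        return 1 if net >= 0 else -1

    print(binary_step(-0.2), bipolar_step(-0.2))   # 0 -1
    print(binary_step(0.0), bipolar_step(0.0))     # 1 1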
5 Types of Transfer Function: Sigmoid Function
– Binary transfer function: y = 1 / (1 + exp(-wx))
– Bipolar transfer function: y = (1 - exp(-wx)) / (1 + exp(-wx))
[Figure: the binary and bipolar sigmoid functions plotted against the net input wx]
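A minimal Python sketch of the two sigmoids (an illustration, not from the slides):

    import math

    # Binary sigmoid: squashes the net input into (0, 1).
    def binary_sigmoid(net):
        return 1.0 / (1.0 + math.exp(-net))

    # Bipolar sigmoid: squashes the net input into (-1, 1); equal to tanh(net / 2).
    def bipolar_sigmoid(net):
        return (1.0 - math.exp(-net)) / (1.0 + math.exp(-net))

    print(binary_sigmoid(0.0))    # 0.5
    print(bipolar_sigmoid(0.0))   # 0.0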
6 Linear Separability
"OR" example (x1, x2, t): (1, 1, 1), (-1, 1, 1), (1, -1, 1), (-1, -1, -1)
"AND" example (x1, x2, t): (1, 1, 1), (-1, 1, -1), (1, -1, -1), (-1, -1, -1)
[Figure: the OR and AND patterns plotted in the x1-x2 plane; in each case a single straight line separates the + targets from the - targets]
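Both example pattern sets can be checked for separability directly: a hand-picked line classifies every OR pattern and every AND pattern correctly. A minimal sketch; the particular weight and bias values are illustrative choices, not taken from the slides:

    # Bipolar truth tables from the slide: (x1, x2) -> target.
    OR_DATA  = [((1, 1), 1), ((-1, 1), 1), ((1, -1), 1), ((-1, -1), -1)]
    AND_DATA = [((1, 1), 1), ((-1, 1), -1), ((1, -1), -1), ((-1, -1), -1)]

    # True if the line w1*x1 + w2*x2 + b = 0 separates the + targets from the - targets.
    def separates(w1, w2, b, data):
        return all((1 if w1 * x1 + w2 * x2 + b >= 0 else -1) == t
                   for (x1, x2), t in data)

    print(separates(1, 1, 1, OR_DATA))    # True: OR is linearly separable
    print(separates(1, 1, -1, AND_DATA))  # True: AND is linearly separable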
7 Why Are Biases Needed?
Without a bias (threshold) weight, the decision boundary wx = 0 must pass through the origin, so many pattern sets that are linearly separable still cannot be classified correctly.
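One way to see this, sketched here as an illustration (the grid of candidate weights is an arbitrary assumption): with bipolar inputs and the threshold fixed at 0, no pair of weights alone computes AND, but adding a bias weight makes it possible.

    # Bipolar AND patterns: (x1, x2) -> target.
    AND_DATA = [((1, 1), 1), ((-1, 1), -1), ((1, -1), -1), ((-1, -1), -1)]

    # True if the unit with weights (w1, w2) and bias b classifies every AND pattern.
    def classifies(w1, w2, b):
        return all((1 if w1 * x1 + w2 * x2 + b >= 0 else -1) == t
                   for (x1, x2), t in AND_DATA)

    grid = [i / 2 for i in range(-10, 11)]          # candidate weights -5.0 .. 5.0
    print(any(classifies(w1, w2, 0) for w1 in grid for w2 in grid))
    # False: no decision boundary through the origin computes AND
    print(any(classifies(w1, w2, b) for w1 in grid for w2 in grid for b in grid))
    # True: with a bias (threshold) weight, e.g. w1 = w2 = 1, b = -1, it works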
8 Foundations
1943 - McCulloch (neurobiologist) & Pitts (logician) described a model of a biological neuron:
– all-or-nothing activation (0, 1)
– threshold for activation
– fixed structure
– delay in transmitting signals
– no learning
9 McCulloch-Pitts Neuron Example
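The slide's worked example is not reproduced in this scrape; as a stand-in, here is a minimal sketch of a McCulloch-Pitts unit in the spirit of slide 8 (fixed weights, hard threshold, 0/1 activations, no learning), shown computing the usual textbook AND and OR configurations:

    # A McCulloch-Pitts unit fires (outputs 1) iff the weighted input sum reaches
    # the threshold; weights and threshold are fixed in advance, nothing is learned.
    def mcp_neuron(inputs, weights, threshold):
        return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

    for x1 in (0, 1):
        for x2 in (0, 1):
            y_and = mcp_neuron((x1, x2), weights=(1, 1), threshold=2)  # AND unit
            y_or  = mcp_neuron((x1, x2), weights=(1, 1), threshold=1)  # OR unit
            print(x1, x2, "AND:", y_and, "OR:", y_or)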
10 Donald Hebb
The Organization of Behavior, 1949
– first learning rule
– information stored in synapse weights
– weights learn in proportion to the activation of the neuron
– weights are symmetric
11 Donald Hebb
"When the axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased." [D.O. Hebb, The Organization of Behavior]
In other words, in a neural net, the connections between neurons get larger weights if they are repeatedly used during training.
12 Hebb Training Algorithm
1. Set all w = 0
2. α = learning rate (0 < α ≤ 1), x = input, y = output, t = target
   Δw = α·x·y (unsupervised)    Δw = α·x·t (supervised)
3. Train the net for 1 epoch (1 epoch: 1 pass through the training set)
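A minimal Python sketch of this algorithm (assuming the reconstructed supervised update Δw = α·x·t, with the bias treated as a weight on a constant input of 1; in unsupervised mode the unit's own output y would replace t):

    # Hebb rule, supervised mode: one pass (epoch) through the training set,
    # adding alpha * input * target to each weight.
    def hebb_train(patterns, alpha=1.0):
        n = len(patterns[0][0])
        w = [0.0] * (n + 1)                  # last entry is the bias weight wB
        for x, t in patterns:                # one epoch
            xb = list(x) + [1]               # constant bias input
            for i in range(n + 1):
                w[i] += alpha * xb[i] * t    # delta w = alpha * x * t
        return w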
13 Hebb Net Example
Transfer function: y = 1 if wx ≥ 0; y = -1 if wx < 0
"AND" example (x1, x2, t): (1, 1, 1), (-1, 1, -1), (1, -1, -1), (-1, -1, -1)
[Figure: a single neuron y with inputs x1, x2, bias unit B, and weights w1, w2, wB]
14 TRAINING (α = 1, supervised mode, substituting t for y)
Transfer function: y = 1 if wx ≥ 0; y = -1 if wx < 0
x1  x2   t  |  w1  w2  wB  |  y  | Δw1 Δw2 ΔwB
 1   1   1  |   0   0   0  |  1  |  1   1   1
-1   1  -1  |   1   1   1  |  1  |  1  -1  -1
 1  -1  -1  |   2   0   0  |  1  | -1   1  -1
-1  -1  -1  |   1   1  -1  | -1  |  1   1  -1
Final weights: w1 = 2, w2 = 2, wB = -2    (Δw = w_new - w_old)
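The table above can be reproduced step by step; a minimal sketch (α = 1, supervised mode, weights started at zero, bias handled as a third input fixed at 1):

    # Supervised Hebb training on the bipolar AND patterns, pattern order as in the table.
    AND_DATA = [((1, 1, 1), 1), ((-1, 1, 1), -1), ((1, -1, 1), -1), ((-1, -1, 1), -1)]
    w = [0, 0, 0]
    for x, t in AND_DATA:
        y = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
        dw = [t * xi for xi in x]                      # delta w = alpha * x * t, alpha = 1
        w = [wi + di for wi, di in zip(w, dw)]
        print(x, t, y, dw, w)
    # Final weights: [2, 2, -2]  ->  w1 = 2, w2 = 2, wB = -2, matching the table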
15 Linearly Separable
net = 2x1 + 2x2 - 2; decision boundary: x2 = -x1 + 1
[Figure: the AND patterns in the x1-x2 plane; the single + pattern (1, 1) lies on the positive side of the line, the three - patterns on the negative side]
16 Frank Rosenblatt
The perceptron, 1958
– model of biological vision
– self-organized and supervised
– first model simulated on a computer (at Cornell)
17 The Perceptron
18 Perceptron Learning Rule Convergence Theorem
If weights exist that allow the net to respond correctly to all training patterns, then the rule's procedure for adjusting the weights will find such values in a finite number of training steps.
19 Perceptron Training
1. Set all w = 0
2. α = learning rate (0 < α ≤ 1), x = input, y = output, t = target
   if y ≠ t, Δw = α·t·x; otherwise Δw = 0
3. If no weights change in one epoch, stop (1 epoch: 1 pass through the training set)
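A minimal Python sketch of this procedure (assuming the reconstructed update rule: change the weights by α·t·x only when the output is wrong, and stop after an epoch with no changes; the epoch cap is an added safeguard):

    # Perceptron training with bipolar targets; the bias is a weight on a constant input 1.
    def perceptron_train(patterns, alpha=1.0, max_epochs=100):
        n = len(patterns[0][0])
        w = [0.0] * (n + 1)
        for _ in range(max_epochs):
            changed = False
            for x, t in patterns:
                xb = list(x) + [1]
                y = 1 if sum(wi * xi for wi, xi in zip(w, xb)) >= 0 else -1
                if y != t:                                       # update only on errors
                    w = [wi + alpha * t * xi for wi, xi in zip(w, xb)]
                    changed = True
            if not changed:                                      # no change in a full epoch: stop
                break
        return w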
20 Perceptron Example
Transfer function: y = 1 if wx ≥ 0; y = -1 if wx < 0
"OR" example (x1, x2, t): (1, 1, 1), (1, -1, 1), (-1, 1, 1), (-1, -1, -1)
[Figure: a single neuron y with inputs x1, x2 and bias unit B]
21 TRAINING (α = 1)
Transfer function: y = 1 if wx ≥ 0; y = -1 if wx < 0
x1  x2   t  |  w1  w2  wB  |  y  | Δw1 Δw2 ΔwB
-1  -1  -1  |   0   0   0  |  1  |  1   1  -1
-1   1   1  |   1   1  -1  | -1  | -1   1   1
 1  -1   1  |   0   2   0  | -1  |  1  -1   1
 1   1   1  |   1   1   1  |  1  |  0   0   0
Final weights: w1 = 1, w2 = 1, wB = 1    (Δw = w_new - w_old)
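The table above can be reproduced step by step; a minimal sketch (α = 1, weights started at zero, pattern order as in the table, bias handled as a third input fixed at 1):

    # Perceptron training, first epoch, on the bipolar OR patterns of slide 20.
    OR_DATA = [((-1, -1, 1), -1), ((-1, 1, 1), 1), ((1, -1, 1), 1), ((1, 1, 1), 1)]
    w = [0, 0, 0]
    for x, t in OR_DATA:
        y = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
        dw = [t * xi for xi in x] if y != t else [0, 0, 0]   # update only on errors
        w = [wi + di for wi, di in zip(w, dw)]
        print(x, t, y, dw, w)
    # Final weights: [1, 1, 1]  ->  w1 = 1, w2 = 1, wB = 1; a second epoch changes nothing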
22 Linearly Separable
net = x1 + x2 + 1; decision boundary: x2 = -x1 - 1
[Figure: the OR patterns in the x1-x2 plane; the three + patterns lie on the positive side of the line, the single - pattern (-1, -1) on the negative side]
23 Widrow / Hoff
"Adaptive Switching Circuits", 1960
– electrical engineers
– hardware implementations
– the ADALINE (ADAptive Linear Neuron)
– least mean squares error training (LMS)
– Delta rule
24 Widrow - Hoff Discovery of the "LMS" or "Widrow-Hoff" Learning Algorithm The first doctoral student that I had [at Stanford] was a man named Ted Hoff.... the two of us began talking. I was telling Ted about research.... One day I had a session with him, and out of this session came the LMS [least mean squares] algorithm.
25 LMS Learning
The weights between neurons change in proportion to the error between the target and the output; for ADALINEs, the output used during training is the net input (before any transfer function is applied).
26 Derivation of Learning Rule
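The slide's derivation is not reproduced in this scrape. The usual argument, sketched here using the squared error of one training pattern and the ADALINE's net input as its output during training, is:

    E = (t - net)²,  with net = Σ_i w_i x_i
    ∂E/∂w_i = -2 (t - net) x_i
    Δw_i ∝ -∂E/∂w_i,  giving  Δw_i = α (t - net) x_i   (the constant factor is absorbed into the learning rate α)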
27 LMS Training
1. Set all w to small random numbers
2. α = learning rate (0 < α ≤ 1), x = input, y = output, t = target
   Adjust w in the direction of steepest error decrease: Δw = α·(t - net)·x
3. Iterate until the Δw's are very small
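A minimal Python sketch of this procedure (assuming the reconstructed delta-rule update Δw = α·(t - net)·x; the weight initialization range, stopping tolerance, and epoch cap are illustrative choices):

    import random

    # ADALINE / LMS training: weights move down the gradient of the squared error
    # between the target and the net input (no transfer function during training).
    def lms_train(patterns, alpha=0.1, tol=1e-4, max_epochs=1000):
        n = len(patterns[0][0])
        w = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]    # small random start
        for _ in range(max_epochs):
            biggest_change = 0.0
            for x, t in patterns:
                xb = list(x) + [1]                               # constant bias input
                net = sum(wi * xi for wi, xi in zip(w, xb))
                for i in range(n + 1):
                    dw = alpha * (t - net) * xb[i]               # delta rule
                    w[i] += dw
                    biggest_change = max(biggest_change, abs(dw))
            if biggest_change < tol:                             # delta w's are very small: stop
                break
        return w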
28 Rule of Thumb for α
0.1 ≤ n·α ≤ 1.0, where n is the number of inputs
If α is too large, the weights won't converge; if α is too small, the learning process is too slow.
29 LMS Example
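The slide's worked example is not included in this scrape; as an assumed stand-in, running the lms_train sketch above on the bipolar AND patterns from the earlier slides drives the weights toward the least-squares solution (roughly w1 = 0.5, w2 = 0.5, wB = -0.5), which gives the same decision boundary x2 = -x1 + 1 as the Hebb net of slide 15:

    # Uses lms_train from the sketch above.
    AND_DATA = [((1, 1), 1), ((-1, 1), -1), ((1, -1), -1), ((-1, -1), -1)]
    print(lms_train(AND_DATA, alpha=0.1))   # approximately [0.5, 0.5, -0.5]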
30 The Party's Over
Minsky and Papert's Perceptrons, 1969
– looked at simple perceptrons
– can only solve linearly separable problems (no XOR)
– thought multi-layer nets would be similarly useless
– funding for neural net research essentially stops
31 Not Linearly Separable - XOR
"XOR" example (x1, x2, t): (1, 1, -1), (-1, 1, 1), (1, -1, 1), (-1, -1, -1)
[Figure: the four XOR patterns in the x1-x2 plane; no single straight line separates the + targets from the - targets]
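The claim can be checked by brute force: no single line w1·x1 + w2·x2 + wB = 0 classifies all four XOR patterns correctly. A minimal sketch; the grid of candidate weights is an arbitrary illustration:

    # Bipolar XOR patterns: (x1, x2) -> target.
    XOR_DATA = [((1, 1), -1), ((-1, 1), 1), ((1, -1), 1), ((-1, -1), -1)]

    # True if the line w1*x1 + w2*x2 + b = 0 classifies every XOR pattern correctly.
    def separates(w1, w2, b):
        return all((1 if w1 * x1 + w2 * x2 + b >= 0 else -1) == t
                   for (x1, x2), t in XOR_DATA)

    grid = [i / 4 for i in range(-20, 21)]       # candidate weights -5.0 .. 5.0
    print(any(separates(w1, w2, b)
              for w1 in grid for w2 in grid for b in grid))   # False: XOR is not separable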
32 Monks Working Through the Dark Age
– Stephen Grossberg (often with Gail Carpenter), BU - adaptive resonance theory (ART), sigmoid transfer function
– Teuvo Kohonen, Finland - self-organizing map
– John Hopfield, CalTech - Hopfield network for optimization and pattern recall
33 Things Get Rolling Again
– Rumelhart & McClelland, the PDP Group, Parallel Distributed Processing, 1986: popularized multi-layer perceptrons trained by backpropagation
– DARPA (Defense Advanced Research Projects Agency), 1988: study on the most promising applications of neural nets; funding starts up in a big way