NN – cont.
Alexandra I. Cristea
USI intensive course Adaptive Systems, April–May 2003
We have seen how the neuron computes; let's now see:
– What can it compute?
– How can it learn?
What does the neuron compute?
Perceptron, discrete neuron
First, the simplest case:
– no hidden layers
– only one neuron
– get rid of the explicit threshold: the bias b becomes the weight w0 (attached to a constant input 1)
– Y is a Boolean function: weighted sum > 0 → fires, ≤ 0 → doesn't fire
Threshold function f
[Figure: step threshold function f; the threshold t = 1 is absorbed into the bias weight w0 = −t = −1]
Y = X1 OR X2
[Figure: a single perceptron with inputs X1 and X2, both connection weights equal to 1, producing output Y]
Y = X1 AND X2
[Figure: a single perceptron with inputs X1 and X2, both connection weights equal to 0.5, producing output Y]
Y = or(x1,…,xn) w1=w2=…=wn=1
Y = and(x1,…,xn) w1=w2=…=wn=1/n
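The OR and AND constructions above can be checked mechanically. Below is a minimal Python sketch of the discrete perceptron; the bias w0 = −1 is taken from the threshold-function slide, and the unit is assumed to fire when the weighted sum is ≥ 0, so that the boundary case counts as firing (that convention is an assumption made here so the stated weights reproduce OR and AND exactly).

```python
from itertools import product

def perceptron(x, w, w0=-1.0):
    """Discrete perceptron: output 1 if w0 + sum(wi*xi) >= 0, else 0.
    Treating the boundary case (sum exactly 0) as firing is an assumption."""
    s = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= 0 else 0

n = 3
w_or = [1.0] * n          # OR:  w1 = w2 = ... = wn = 1
w_and = [1.0 / n] * n     # AND: w1 = w2 = ... = wn = 1/n

for x in product([0, 1], repeat=n):
    assert perceptron(x, w_or) == (1 if any(x) else 0)
    assert perceptron(x, w_and) == (1 if all(x) else 0)
print("OR and AND reproduced on all", 2 ** n, "input patterns")
```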
What are we actually doing?
[Figure: a two-class point set in the (X1, X2) plane classified by the sign of w0 + w1·X1 + w2·X2; several weight settings are shown, e.g. w0 = −1, w1 = 0.7, w2 = 0.9]
Linearly separable set
[Figures: two-class point sets in the (x1, x2) plane, each correctly classified by the sign of w0 + w1·x1 + w2·x2; example weight settings shown on the slides:
– w0 = −1, w1 = −0.67, w2 = 1
– w0 = −1, w1 = 0.25, w2 = −0.1
– w0 = −1, w1 = 0.25, w2 = 0.04
– w0 = −1, w1 = 0.167, w2 = 0.1]
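Whether a given weight triple really separates a labeled point set can be verified mechanically. The helper below is a small sketch; the function name and the example data are hypothetical, chosen only to exercise the check with the first weight setting listed above.

```python
def separates(points, labels, w0, w1, w2):
    """True if the line w0 + w1*x1 + w2*x2 = 0 classifies every labeled
    point correctly (label 1 on the positive side, label 0 otherwise)."""
    for (x1, x2), t in zip(points, labels):
        y = 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0
        if y != t:
            return False
    return True

# Hypothetical toy data, not taken from the slides
points = [(0.5, 2.0), (1.0, 3.0), (2.0, 0.5), (3.0, 1.0)]
labels = [1, 1, 0, 0]
print(separates(points, labels, w0=-1, w1=-0.67, w2=1))   # True for this toy set
```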
Non-linearly separable Set
[Figures: the non-linearly separable two-class point set in the (x1, x2) plane; every candidate line w0 + w1·x1 + w2·x2 = 0 misclassifies some points, so no values for w0, w1, w2 are given]
Perceptron Classification Theorem A finite set X can be classified correctly by a one-layer perceptron if and only if it is linearly separable.
Typical non-linearly separable set: Y = XOR(x1, x2)
[Figure: the four points (0,0), (0,1), (1,0), (1,1) in the (x1, x2) plane; (0,1) and (1,0) have Y = 1, (0,0) and (1,1) have Y = 0; no single line w0 + w1·x1 + w2·x2 = 0 separates the two classes]
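The theorem and the XOR example can also be illustrated by brute force: no weight triple classifies all four XOR points correctly, since the inequalities required for (0,1) and (1,0) contradict those required for (0,0) and (1,1). The grid search below is only an illustration of this fact; the grid range and resolution are arbitrary choices made here.

```python
import itertools

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def classifies_xor(w0, w1, w2):
    """Does a single perceptron with these weights reproduce XOR?"""
    return all((1 if w0 + w1 * x1 + w2 * x2 > 0 else 0) == t
               for (x1, x2), t in XOR.items())

# Coarse grid over the weights: -5.0 .. 5.0 in steps of 0.25 (arbitrary choice)
grid = [i / 4 for i in range(-20, 21)]
found = any(classifies_xor(w0, w1, w2)
            for w0, w1, w2 in itertools.product(grid, repeat=3))
print("Some weight triple on the grid computes XOR:", found)   # prints False
```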
How does the neuron learn?
Learning: weight computation
[Figure: the weighted inputs W1·X1 and W2·X2 combine to give the output; learning means finding suitable values for the weights W1, W2]
Perceptron Learning Rule (incremental version), Rosenblatt (1962)
(x' is the input x extended with x0' = 1 for the bias weight w0)

FOR i := 0 TO n DO wi := random initial value ENDFOR;
REPEAT
    select a pair (x, t) in X;   (* each pair must have a positive probability of being selected *)
    IF w^T · x' > 0 THEN y := 1 ELSE y := 0 ENDIF;
    IF y ≠ t THEN
        FOR i := 0 TO n DO wi := wi + (t − y) · xi' ENDFOR
    ENDIF;
UNTIL X is correctly classified
Idea of the Perceptron Learning Rule
– if t = 1 but y = 0 (i.e. w^T x ≤ 0): w_new = w + x
– if t = 0 but y = 1 (i.e. w^T x > 0): w_new = w − x
In both cases wi := wi + (t − y) · xi': w changes in the direction of the input (±x).
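A minimal Python sketch of Rosenblatt's rule as stated above; the stopping check, the random selection of pairs, and the OR training data are illustrative choices, not fixed by the slides.

```python
import random

def train_perceptron(X, max_steps=1000, seed=0):
    """Incremental perceptron learning rule (Rosenblatt).
    X is a list of (x, t) pairs, where x is the extended input
    (x[0] = 1 for the bias weight w0) and t is the target, 0 or 1."""
    rng = random.Random(seed)
    n = len(X[0][0])
    w = [rng.uniform(-1, 1) for _ in range(n)]          # random initial weights
    for _ in range(max_steps):
        x, t = rng.choice(X)                            # each pair can be selected
        y = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
        if y != t:
            w = [wi + (t - y) * xi for wi, xi in zip(w, x)]
        # stop once every pair in X is classified correctly
        if all((1 if sum(wi * xi for wi, xi in zip(w, xp)) > 0 else 0) == tp
               for xp, tp in X):
            break
    return w

# OR of two inputs, each input vector extended with a leading 1 for the bias
data = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 1)]
print(train_perceptron(data))
```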
For multi-layer perceptrons with continuous neurons, a simple and successful learning algorithm exists: backpropagation (BKP).
Backpropagation (BKP): the error
[Figure: a network with input layer, hidden layer, and output layer; each output neuron i compares its actual output y_i with the desired output d_i, giving the error e_i = d_i − y_i; the hidden-layer error is not directly observable]
Synapse (forward propagation)
[Figure: neuron1, with output value y1, is connected to neuron2 by a synapse of weight w; the contribution arriving at neuron2's internal activation is w·y1; the weight serves as an amplifier]
Inverse synapse (backward propagation)
[Figure: the same connection traversed backwards: neuron2 carries the error e2; which error e1 does neuron1 receive through the weight w?]
Answer: e1 = w·e2; the weight again serves as an amplifier, this time for the error.
Backpropagation (BKP): the error (continued)
[Figure: the same network with the layers labeled: I1 = system input, O2 = I2 = hidden layer (output of the first weight layer and input of the second), O1 = system output; the output errors e_i = d_i − y_i must be propagated back to the hidden layer]
Backpropagation to the hidden layer
[Figure: hidden neuron j in layer O2 = I2 is connected to the output neurons i by weights w[j,i]; its backpropagated error is the weighted sum of the output errors, e_j = Σ_i w[j,i]·e_i]
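A small numeric sketch of this step with made-up weights and output errors (the numbers are illustrative only): the hidden errors are obtained by sending the output errors backwards through the same weights.

```python
# Hidden layer with 2 neurons (index j), output layer with 3 neurons (index i);
# W[j][i] is the weight from hidden neuron j to output neuron i.
W = [[0.5, -1.0, 0.2],
     [1.5, 0.3, -0.4]]
e_out = [0.1, -0.2, 0.05]        # output errors e_i = d_i - y_i (made up)

# e_j = sum_i W[j][i] * e_i : the weight again acts as an amplifier
e_hidden = [sum(w_ji * e_i for w_ji, e_i in zip(row, e_out)) for row in W]
print(e_hidden)                  # approximately [0.26, 0.07]
```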
Update rule for the two weight types
(layers: I1 = system input, O2 = I2 = hidden layer, O1 = system output; simplification: f' = 1 for a repeater, e.g.)

Hidden layer (I2) → system output (O1):
  Δw[j,i] = α·(d[i] − y[i])·f'(S[i])·h[j] = α·e[i]·f'(S[i])·h[j],   with S[i] = Σ_j w[j,i](t)·h[j]

System input (I1) → hidden layer (O2):
  Δw[k,j] = α·(Σ_i e[i]·w[j,i])·f'(S[j])·x[k] = α·e[j]·f'(S[j])·x[k],   with S[j] = Σ_k w[k,j](t)·x[k]
Backpropagation algorithm

FOR s := 1 TO r DO W_s := initial matrix (often random) ENDFOR;
REPEAT
    select a pair (x, t) in X;
    y_0 := x;
    # forward phase: compute the actual output y_r of the network with input x
    FOR s := 1 TO r DO y_s := F(W_s · y_{s-1}) ENDFOR;
    # y_r is the output vector of the network
    # backpropagation phase: propagate the errors back through the network
    # and adapt the weights of all layers
    d_r := F_r' · (t − y_r);
    FOR s := r DOWNTO 2 DO
        d_{s-1} := F_{s-1}' · W_s^T · d_s;
        W_s := W_s + d_s · y_{s-1}^T
    ENDFOR;
    W_1 := W_1 + d_1 · y_0^T
UNTIL stop criterion
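The matrix form above translates almost line by line into numpy. The sketch below is a minimal illustration for one hidden layer (r = 2) with sigmoid units, trained on XOR; the network size, learning rate alpha, activation function, and iteration count are choices made here, not prescribed by the slides (the pseudocode above omits the learning-rate factor).

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)            # XOR targets

# r = 2 weight layers; a constant 1 is appended to each layer's output as bias input
W1 = rng.uniform(-1, 1, size=(4, 3))   # hidden layer: 4 neurons, 2 inputs + bias
W2 = rng.uniform(-1, 1, size=(1, 5))   # output layer: 1 neuron, 4 hidden + bias
alpha = 0.5

for _ in range(20000):
    i = rng.integers(len(X))                       # select a pair (x, t) in X
    y0 = np.append(X[i], 1.0)                      # y_0 := x  (plus bias input)
    # forward phase
    y1 = np.append(sigmoid(W1 @ y0), 1.0)          # y_1 := F(W_1 y_0)  (plus bias)
    y2 = sigmoid(W2 @ y1)                          # y_2 := F(W_2 y_1)
    # backpropagation phase; for the sigmoid, F' = y (1 - y)
    d2 = y2 * (1 - y2) * (T[i] - y2)               # d_r := F_r' (t - y_r)
    d1 = (y1 * (1 - y1) * (W2.T @ d2))[:-1]        # d_1 := F_1' W_2^T d_2 (drop bias)
    W2 += alpha * np.outer(d2, y1)                 # W_2 := W_2 + alpha d_2 y_1^T
    W1 += alpha * np.outer(d1, y0)                 # W_1 := W_1 + alpha d_1 y_0^T

out = [sigmoid(W2 @ np.append(sigmoid(W1 @ np.append(x, 1.0)), 1.0)) for x in X]
print(np.round(out, 2))                            # should be close to [0, 1, 1, 0]
```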
Conclusion
– We have seen how Boolean functions can be represented with a single-layer perceptron (SLP)
– We have seen a learning algorithm for the SLP (the perceptron learning rule)
– We have seen a learning algorithm for the MLP (backpropagation)
So neurons can represent knowledge AND learn!