Neural Networks (II): Simple Learning Rule
Artificial Intelligence Lecture Note
Information & Communication University, 1999 Spring
Sun Hwa Hahn
Threshold Logic Unit
An artificial neuron that models the functionality of a biological neuron.
- Activation: the weighted sum of the inputs, a = w1*x1 + w2*x2 + ... + wn*xn
- Output: y = f(a), where f is a threshold function
- Threshold functions: hard limiter, sigmoid, stochastic semi-linear
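A minimal sketch of such a unit in Python, assuming the usual convention that the unit fires (y = 1) when the activation reaches the threshold; the function name tlu and the sample numbers are illustrative, not from the slides:

```python
def tlu(weights, inputs, theta):
    """Hard-limiter TLU: fire iff the weighted input sum reaches theta."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= theta else 0

# Illustrative weights and threshold (not from the slides):
print(tlu([0.5, -0.3], [1, 0], 0.2))  # activation  0.5 >= 0.2 -> 1
print(tlu([0.5, -0.3], [1, 1], 0.2))  # activation  0.2 >= 0.2 -> 1
print(tlu([0.5, -0.3], [0, 1], 0.2))  # activation -0.3 <  0.2 -> 0
```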
Threshold Function
- Hard limiter: y = 1 if a >= θ, y = 0 otherwise (a step from 0 to 1 at a = θ)
- Sigmoid: a smooth S-shaped curve saturating at y = 1.0, with y = 0.5 at a = θ
[Figures: hard-limiter step and sigmoid curve, output y plotted against activation a]
Threshold Function
Stochastic semi-linear unit
- The output is interpreted as the probability of outputting '1'.
- Observed over N time slots containing N1 pulses, P(1) ≈ N1/N.
- The probability of firing when the activation is a is the probability that the (random) threshold is less than a, which follows the sigmoid curve.
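A hedged sketch of the stochastic unit, assuming the sigmoid form 1/(1 + e^-((a-θ)/ρ)) with a smoothing parameter ρ (the parameter names and defaults are assumptions):

```python
import math
import random

def sigmoid(a, theta=0.0, rho=1.0):
    """Smooth threshold: 0.5 at a = theta; rho controls the steepness."""
    return 1.0 / (1.0 + math.exp(-(a - theta) / rho))

def stochastic_output(a, theta=0.0, rho=1.0):
    """Emit a '1' pulse with probability sigmoid(a)."""
    return 1 if random.random() < sigmoid(a, theta, rho) else 0

# Over N time slots, the fraction of pulses N1/N approximates sigmoid(a).
a, N = 0.8, 10000
n1 = sum(stochastic_output(a) for _ in range(N))
print(n1 / N, sigmoid(a))
```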
Geometric Interpretation of TLU
Input space and pattern classification
- n inputs form an n-dimensional input space
- each point in the input space represents one input pattern of the input-output mapping
- classification of the outputs is done by a decision line (plane, or hyperplane)
[Figure: a two-input TLU with weights w1 = w2 = 1 computing an activation and a thresholded output]
Linear Separation of Classes
The critical condition for classification occurs where the activation equals the threshold: w1*x1 + w2*x2 = θ. With w1 = w2 = 1 and θ = 1.5 this gives the decision line x1 + x2 = 1.5, crossing each axis at 1.5; patterns on one side output 1, on the other 0 (checked in the sketch below).
[Figure: decision line x1 + x2 = 1.5 in the (x1, x2) plane]
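A quick check, using the slide's w1 = w2 = 1 and θ = 1.5, that the line x1 + x2 = 1.5 separates the four binary patterns:

```python
w, theta = (1.0, 1.0), 1.5
for x1 in (0, 1):
    for x2 in (0, 1):
        a = w[0] * x1 + w[1] * x2            # activation
        print((x1, x2), "->", 1 if a >= theta else 0)
# Only (1, 1) falls on the '1' side of the line x1 + x2 = 1.5,
# so this TLU computes logical AND.
```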
Vectors
A vector v is a quantity that has magnitude and direction; it can be written in polar form (||v||, θ) or in components (v1, v2, ..., vn).
[Figure: vector v of length ||v|| with components v1 and v2 along the x1 and x2 axes]
Comparing Vectors: Inner Product
For n-dimensional vectors v and w: v·w = v1*w1 + v2*w2 + ... + vn*wn = ||v|| ||w|| cos θ, where θ is the angle between them.
- Vector projection of w along v: w_v = (v·w)/||v||
- Vector projection of v along w: v_w = (v·w)/||w||
Inner Product
The more nearly two vectors point in the same direction (the smaller the angle between them), the larger their inner product.
[Figure: three cases — aligned ("cooperative") vectors v and w (v·w large and positive), nearly orthogonal vectors (v·w near 0), and opposing vectors (v·w negative)]
Inner Product and TLUs
The activation is an inner product: a = w·x. The projection of x along w is x_w = (w·x)/||w||, so:
- x_w > θ/||w|| : activation is greater than the threshold (output 1)
- x_w < θ/||w|| : activation is less than the threshold (output 0)
The decision boundary is the set of points whose projection along w equals θ/||w||.
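A short sketch tying the pieces together: the activation computed as an inner product, and the firing test rewritten as a comparison of the projection x_w against θ/||w|| (the vector values are illustrative):

```python
import math

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

w, theta = [1.0, 1.0], 1.5
for x in ([1.0, 1.0], [1.0, 0.0]):
    a = dot(w, x)                    # activation = w . x
    x_w = a / norm(w)                # projection of x along w
    fires = x_w >= theta / norm(w)   # same test as a >= theta
    print(x, "x_w =", round(x_w, 3),
          "boundary =", round(theta / norm(w), 3), "->", fires)
```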
Training TLUs
Training: adjust the weight vector and threshold to obtain the desired classification.
Augmented weight vector: treat the threshold as an extra weight attached to an input that is permanently clamped at -1, so the firing condition a >= θ becomes w·v >= 0.
Pattern Space: 2D vs. 3D
[Figures: the original 2-D pattern space with its decision line vs. the augmented 3-D space, where the decision plane passes through the origin]
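A small sketch of the augmentation, following the convention on the previous slide (input clamped at -1, threshold folded into the weights); the helper name is an assumption:

```python
def augmented_activation(weights, theta, inputs):
    """Fold theta into the weights; 'a >= theta' becomes 'w_aug . v_aug >= 0'."""
    w_aug = list(weights) + [theta]   # threshold becomes the last weight
    v_aug = list(inputs) + [-1.0]     # matching input permanently clamped at -1
    return sum(w * x for w, x in zip(w_aug, v_aug))

# Same decisions as the unaugmented TLU with w = (1, 1), theta = 1.5:
print(augmented_activation([1, 1], 1.5, [1, 1]) >= 0)  # True
print(augmented_activation([1, 1], 1.5, [0, 1]) >= 0)  # False
```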
Changing Vectors
- Vector addition: u = v + w (place w tip-to-tail after v)
- Vector subtraction: u = v - w, i.e., addition of the negative vector, v + (-w)
Adjusting the Weight Vector: Learning
- The weight vector is orthogonal to the decision plane (the plane where w·v = 0).
- In the augmented space, the decision plane must pass through the origin.
Training Set
A training set {(v, t)} consists of input vectors and the target class of each input vector (supervised training, supervised learning).
Misclassification: for the given network, the output differs from the target value. Rotate the weight vector on each misclassification:
- misclassification of 1 as 0: the angle between w and v is > 90°, so rotate w toward v: w' = w + α·v
- misclassification of 0 as 1: the angle between w and v is < 90°, so rotate w away from v: w' = w - α·v
α: learning rate, 0 < α < 1
Learning: Misclassification 1 as 0
The activation is negative when it should have been positive, so rotate the weight vector toward the input vector: w' = w + α·v.
Learning: Misclassification 0 as 1
The activation is positive when it should have been negative, so rotate the weight vector away from the input vector: w' = w - α·v. (A numeric sketch of the rotation follows.)
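A numeric sketch of the rotation (the weight and input values are hypothetical): one update w' = w + α·v shrinks the angle between w and an input misclassified as 0:

```python
import math

def angle_deg(v, w):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(v, w))
    nv = math.sqrt(sum(a * a for a in v))
    nw = math.sqrt(sum(a * a for a in w))
    return math.degrees(math.acos(dot / (nv * nw)))

alpha = 0.25
w = [-1.0, 0.5]   # hypothetical weight vector
v = [1.0, 0.2]    # input misclassified as 0 though its target is 1

print(round(angle_deg(w, v), 1))               # > 90 degrees
w = [wi + alpha * vi for wi, vi in zip(w, v)]  # w' = w + alpha * v
print(round(angle_deg(w, v), 1))               # smaller: w rotated toward v
```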
Learning Rule
Training rule (learning rule) with learning rate α: both misclassification cases combine into one update, w' = w + α(t - y)·v.
Training algorithm for a TLU: the perceptron learning algorithm.

repeat
  for each training vector (v, t)
    evaluate the output y when v is input to the TLU
    if y ≠ t then form a new weight vector w' = w + α(t - y)·v
  end for
until y = t for all vectors
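A runnable sketch of this algorithm on augmented vectors (the helper name train_tlu is an assumption; the loop terminates only when the two classes are linearly separable):

```python
def train_tlu(patterns, w, alpha):
    """Perceptron learning: patterns is a list of (v, t) with v augmented
    (last component -1); w is the initial augmented weight vector."""
    while True:
        all_correct = True
        for v, t in patterns:
            a = sum(wi * vi for wi, vi in zip(w, v))
            y = 1 if a >= 0 else 0
            if y != t:
                # w' = w + alpha * (t - y) * v : toward v for a missed 1,
                # away from v for a missed 0.
                w = [wi + alpha * (t - y) * vi for wi, vi in zip(w, v)]
                all_correct = False
        if all_correct:
            return w
```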
Example
Learn logical AND with initial (augmented) weights (0, 0.4, 0.3) and learning rate α = 0.25. (A run of the sketch above on this example follows.)
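Running the sketch above on AND from the slide's starting point, assuming the weights are ordered (w1, w2, θ) and the patterns are presented in truth-table order:

```python
# Augmented AND patterns: (x1, x2, -1) with target t.
AND = [([0, 0, -1], 0), ([0, 1, -1], 0), ([1, 0, -1], 0), ([1, 1, -1], 1)]
w = train_tlu(AND, [0.0, 0.4, 0.3], 0.25)
print(w)  # -> [0.5, 0.15, 0.55] with this pattern ordering:
          # only (1, 1) then yields a non-negative activation.
```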
Perceptron
[Figure: perceptron architecture]
Example
Learn logical OR with the same initial weights (0, 0.4, 0.3) and learning rate α = 0.25.
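The same routine learns OR; only the targets change (again a sketch, under the same assumed conventions):

```python
OR = [([0, 0, -1], 0), ([0, 1, -1], 1), ([1, 0, -1], 1), ([1, 1, -1], 1)]
w = train_tlu(OR, [0.0, 0.4, 0.3], 0.25)
print(w)  # -> [0.25, 0.4, 0.05] with this pattern ordering
```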