Introduction to Artificial Intelligence (G51IAI) Dr Matthew Hyde. Neural Networks, more precisely "Artificial Neural Networks": simulating, on a computer, what we understand about neural networks in the brain, so that machines can think and act like human beings.
Lecture Outline: recap on perceptrons; Linear Separability; Learning / Training; the Neuron's Activation Function. G51IAI – Introduction to AI
Recap from last lecture: a 'Perceptron' is a single-layer NN (one neuron). Inputs can be any number; there are weights on the edges; the output can only be 0 or 1. [Diagram: inputs 5, 6 and 3 with weights 0.5, 2 and -3, threshold θ = 6, output Z of 0 or 1.]
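The recap can be sketched in a few lines of code; a minimal example, using the figures from the slide's diagram (inputs 5, 6, 3; weights 0.5, 2, -3; threshold θ = 6), and assuming the "fire at or above the threshold" rule used later in the lecture:

```python
def perceptron(inputs, weights, threshold):
    """A single neuron: output 1 if the weighted input sum reaches the threshold, else 0."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0

# Weighted sum = 5*0.5 + 6*2 + 3*(-3) = 2.5 + 12 - 9 = 5.5,
# which is below the threshold of 6, so the neuron outputs 0.
print(perceptron([5, 6, 3], [0.5, 2, -3], 6))   # 0
```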
Truth Tables and Linear Separability
AND function, and OR function. These are called "truth tables" (the outputs may be written as 1/0 or as T/F):

AND              OR
X1 X2 | Z        X1 X2 | Z
 0  0 | 0         0  0 | 0
 0  1 | 0         0  1 | 1
 1  0 | 0         1  0 | 1
 1  1 | 1         1  1 | 1
Important!!! You can represent any truth table graphically, as a diagram: 2-dimensional if there are two inputs, 3-dimensional if there are three. Examples on the board in the lecture, and in the handouts.
3 Inputs means 3 dimensions. [Diagram: a cube on X, Y and Z axes; its eight corners are the input combinations (0,0,0), (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1) and (1,1,1).]
Linear Separability in 3-dimensions Instead of a line, the dots are separated by a plane
Only linearly separable functions can be represented by a Perceptron (Minsky & Papert, 1969). Represented graphically: for AND, the points (0,0), (0,1) and (1,0) (output 0) can be separated from (1,1) (output 1) by a single straight line; for XOR, no straight line separates the two classes of points. Functions which can be separated in this way are called Linearly Separable.
Examples – Handout 3 Linear Separability Fill in the diagrams with the correct dots black or white, for an output of 1 or 0
How to Train your Perceptron
Simple Networks. Both of these represent the AND function:
[Diagram 1: inputs X and Y, weights 1 and 1, threshold θ = 1.5.]
[Diagram 2: inputs X and Y, weights 1 and 1, plus a constant input of -1 with weight 1.5, threshold θ = 0.]
It is sometimes convenient to set the threshold to zero and add a constant negative input: subtracting a constant 1.5 from the weighted sum is equivalent to keeping a threshold of 1.5. Note the relation between the threshold and the weights; the same trick works for any number of inputs.
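The equivalence between the two networks on the slide can be checked directly; a small sketch, using the slide's weights (1 and 1) and threshold (1.5):

```python
def with_threshold(x, y):
    # AND via a threshold of 1.5 on the raw weighted sum
    return 1 if (1 * x + 1 * y) >= 1.5 else 0

def with_bias_input(x, y):
    # The same function with threshold 0 and a constant input of -1, weighted by 1.5
    return 1 if (1 * x + 1 * y + (-1) * 1.5) >= 0 else 0

# Both networks agree with AND on all four input patterns
for x in (0, 1):
    for y in (0, 1):
        assert with_threshold(x, y) == with_bias_input(x, y) == (x and y)
```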
Training a NN: the AND function.

X1 X2 | Z
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

[Diagram: the four input points (0,0), (0,1), (1,0), (1,1); only (1,1) has output 1.]
Randomly Initialise the Network We set the weights randomly, because we do not know what we want it to learn. The weights can change to whatever value is necessary It is normal to initialise them in the range [-1,1]
Randomly Initialise the Network. [Diagram: inputs X (weight 0.5) and Y (weight -0.4), constant input -1 (weight 0.3), threshold θ = 0.]
Learning

While epoch produces an error
    Present network with next inputs (pattern) from epoch
    Err = T – O
    If Err <> 0 then
        Wj = Wj + LR * Ij * Err
    End If
End While

Final important thing to learn: get used to this notation!! Make sure that you can reproduce this pseudocode AND understand what all of the terms mean.
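The pseudocode above can be made runnable; a sketch, assuming the AND training set, a learning rate of 0.1, the slide's initial weights (0.3, 0.5, -0.4), and a constant -1 input in place of the threshold:

```python
def train_perceptron(training_set, weights, learning_rate=0.1):
    """Perceptron learning rule: repeat whole epochs until none produces an error."""
    while True:
        error_in_epoch = False
        for inputs, target in training_set:      # one epoch = the whole training set
            output = 1 if sum(i * w for i, w in zip(inputs, weights)) >= 0 else 0
            err = target - output                # Err = T - O
            if err != 0:
                error_in_epoch = True
                weights = [w + learning_rate * i * err    # Wj = Wj + LR * Ij * Err
                           for i, w in zip(inputs, weights)]
        if not error_in_epoch:
            return weights

# AND, with a constant -1 as the first input of every pattern (threshold set to 0)
and_set = [([-1, 0, 0], 0), ([-1, 0, 1], 0), ([-1, 1, 0], 0), ([-1, 1, 1], 1)]
weights = train_perceptron(and_set, [0.3, 0.5, -0.4])
```

Convergence is guaranteed here because AND is linearly separable; for XOR the loop would never terminate.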
Epoch. The 'epoch' is the entire training set. The training set is the set of four input and output pairs:

INPUT (X, Y)   DESIRED OUTPUT (Z)
 0  0               0
 0  1               0
 1  0               0
 1  1               1
The learning algorithm. Input the first pattern from the training set (the AND table) into the Neural Network. What does the neural network output? Is it what we want it to output? If not, then we work out the error and change some weights.
First training step. Input (1, 1); the desired output is 1. Weighted sum: (-1 × 0.3) + (1 × 0.5) + (1 × -0.4) = -0.2. This is below the threshold of 0, so the actual output is 0.
First training step. We wanted 1; we got 0. Error = T – O = 1 – 0 = 1. If there IS an error, then we change ALL the weights in the network, using the update rule Wj = Wj + LR * Ij * Err.
If there is an error, change ALL the weights: Wj = Wj + (LR * Ij * Err), i.e. New Weight = Old Weight + (Learning Rate × Input Value × Error). For the constant input (-1), with a learning rate of 0.1: New Weight = 0.3 + (0.1 × -1 × 1) = 0.2.
For the X input (1): New Weight = 0.5 + (0.1 × 1 × 1) = 0.6. Similarly, for the Y input: New Weight = -0.4 + (0.1 × 1 × 1) = -0.3.
Effects of the first change. The output was too low (it was 0, but we wanted 1). Weights that contributed negatively have reduced; weights that contributed positively have increased. It is trying to 'correct' the output gradually. [Diagram: the weights (0.3, 0.5, -0.4) have become (0.2, 0.6, -0.3).]
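The first training step can be replayed in code; a sketch using the slide's numbers (weights 0.3, 0.5, -0.4; inputs -1, 1, 1; learning rate 0.1; desired output 1):

```python
learning_rate = 0.1
inputs  = [-1, 1, 1]            # constant bias input, then X = 1, Y = 1
weights = [0.3, 0.5, -0.4]

output = 1 if sum(i * w for i, w in zip(inputs, weights)) >= 0 else 0
err = 1 - output                # Err = T - O: desired 1, actual 0, so Err = 1

new_weights = [w + learning_rate * i * err for i, w in zip(inputs, weights)]
print([round(w, 2) for w in new_weights])   # [0.2, 0.6, -0.3], as on the slide
```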
Epoch not finished yet. The 'epoch' is the entire training set; we do the same for the other 3 input-output pairs of the AND table.
The epoch is now finished Was there an error for any of the inputs? If yes, then the network is not trained yet We do the same for another epoch, from the first inputs again
The epoch is now finished. If there were no errors, then we have the network that we want: it has been trained.

While epoch produces an error
    Present network with next inputs (pattern) from epoch
    Err = T – O
    If Err <> 0 then
        Wj = Wj + LR * Ij * Err
    End If
End While
Effect of the learning rate Set too high The network quickly gets near to what you want But, right at the end, it may ‘bounce around’ the correct weights It may go too far one way, and then when it tries to compensate it will go too far back the other way Wj = Wj + ( LR * Ij * Err )
Effect of the learning rate. Set too high: it may 'bounce around' the correct weights. [Diagram: the AND points (0,0), (1,0), (0,1), (1,1), with the separating line overshooting back and forth.]
Effect of the learning rate. Set too low: the network slowly gets near to what you want. It will eventually converge (for a linearly separable function), but that could take a long time. When setting the learning rate, you have to strike a balance between speed and effectiveness. Wj = Wj + LR * Ij * Err
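The trade-off can be explored by counting how many epochs the learning rule needs at different learning rates; a sketch on the AND function, assuming the slide's initial weights (0.3, 0.5, -0.4) and the bias-input form of the network (the exact counts depend on those choices):

```python
def epochs_to_converge(learning_rate, max_epochs=10000):
    """Train AND (with a -1 bias input) and count epochs until an error-free pass."""
    patterns = [([-1, 0, 0], 0), ([-1, 0, 1], 0), ([-1, 1, 0], 0), ([-1, 1, 1], 1)]
    weights = [0.3, 0.5, -0.4]
    for epoch in range(1, max_epochs + 1):
        error = False
        for inputs, target in patterns:
            output = 1 if sum(i * w for i, w in zip(inputs, weights)) >= 0 else 0
            err = target - output
            if err != 0:
                error = True
                weights = [w + learning_rate * i * err
                           for i, w in zip(inputs, weights)]
        if not error:
            return epoch
    return None   # did not converge within max_epochs

for lr in (0.01, 0.1, 1.0):
    print(lr, epochs_to_converge(lr))
```

All three runs converge, because AND is linearly separable; a very small rate simply takes more epochs to drag the weights into place.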
The Neuron’s Activation Function
Expanding the Model of the Neuron: Outputs other than '1'. The output is 1 or 0; it doesn't matter how far over the threshold we are. [Diagram: a multi-layer network of threshold neurons, each outputting only 0 or 1 regardless of its input sum.]
Example from last lecture: left wheel speed and right wheel speed. The speed of the wheels is not just 0 or 1.
Expanding the Model of the Neuron: Outputs other than ‘1’ So far, the neurons have only output a value of 1 when they fire. If the input sum is greater than the threshold the neuron outputs 1. In fact, the neurons can output any value that you want.
Modelling a Neuron, in more detail. The threshold function seen so far is one kind of activation function.
aj : input value (the output from unit j)
wj,i : weight on the link from unit j to unit i
ini : weighted sum of inputs to unit i, ini = Σj wj,i aj
ai : activation value of unit i, ai = g(ini)
g : activation function
Activation Functions
Stept(x) = 1 if x >= t, else 0
Sign(x) = +1 if x >= 0, else –1
Sigmoid(x) = 1/(1 + e^-x)
The sigmoid's output can itself serve as the input for another neuron.
aj : input value (output from unit j); ini : weighted sum of inputs to unit i; ai : activation value of unit i; g : activation function
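The three activation functions can be written directly; a minimal sketch:

```python
import math

def step(x, t):
    """Threshold function: 1 once the input reaches t, else 0."""
    return 1 if x >= t else 0

def sign(x):
    """+1 for non-negative input, -1 otherwise."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Smoothly squashes any input into (0, 1); sigmoid(0) = 0.5."""
    return 1 / (1 + math.exp(-x))

print(step(5.5, 6), sign(-0.2), sigmoid(0))   # 0 -1 0.5
```

Unlike the step and sign functions, the sigmoid gives a graded output, which is what makes outputs other than 0 and 1 possible.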
Summary Linear Separability Learning Algorithm Pseudocode Activation function (threshold, sigmoid, etc)