
1 Artificial Intelligence Methods: Neural Networks, Lecture 3. Rakesh K. Bissoondeeal

2 Supervised learning in single layer networks: learning in the perceptron (perceptron learning rule) and learning in Adaline (Widrow-Hoff learning rule, also known as the delta rule or least mean squares).

3 Issue common to single layer networks: single layer networks can solve only linearly separable problems. Linear separability: two categories are linearly separable if their members can be separated by a single straight line (more generally, by a hyperplane).

4 Linearly separable: consider a system like AND.
x1 | x2 | x1 AND x2
 1 |  1 | 1
 0 |  1 | 0
 1 |  0 | 0
 0 |  0 | 0
[Figure: the four input points plotted in the x1-x2 plane, with a single straight decision boundary separating (1,1) from the other three points.]

5 Linearly inseparable - XOR: consider a system like XOR.
x1 | x2 | x1 XOR x2
 1 |  1 | 0
 0 |  1 | 1
 1 |  0 | 1
 0 |  0 | 0
[Figure: the four input points plotted in the x1-x2 plane; no single straight line can separate the points with output 1 from those with output 0.]

6 Single layer perceptron: a perceptron neuron has the step function as its transfer function. The output is either 1 or 0: 1 when the net input into the transfer function is 0 or greater, and 0 otherwise (when the net input is less than 0).

7 Single layer perceptron: [Figure: inputs x1 and x2 with weights w1 and w2, plus a bias input fixed at 1 with weight b, feeding the transfer function f.] A bias acts as a weight on a connection from a unit whose value is always one. The bias shifts the function f by b units to the left. If the bias were not included, the decision boundary would be forced to pass through the origin, and many linearly separable functions would become linearly inseparable.
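A minimal Python sketch of this neuron (the function names step and perceptron_output, and the weight and bias values in the usage, are illustrative assumptions, not from the slides):

```python
def step(net):
    """Step transfer function: 1 if the net input is 0 or greater, else 0."""
    return 1 if net >= 0 else 0

def perceptron_output(weights, bias, inputs):
    """Single perceptron neuron: weighted sum of inputs plus bias, then step."""
    net = sum(w * p for w, p in zip(weights, inputs)) + bias
    return step(net)

# Example: with weights [1, 1] and bias -1.5 the neuron computes AND.
print(perceptron_output([1, 1], -1.5, [1, 1]))  # 1
print(perceptron_output([1, 1], -1.5, [1, 0]))  # 0
```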

8 Perceptron learning rule: supervised learning - we have both inputs and target outputs. Let p_i = input i, a = output of the network, t = target. E.g. the AND function:
x1 | x2 | x1 AND x2
 1 |  1 | 1
 0 |  1 | 0
 1 |  0 | 0
 0 |  0 | 0
We train the network with the aim that a new (unseen) input similar to an old (seen) pattern will be classified correctly.

9 Perceptron learning rule: 3 cases to consider. Case 1: an input vector is presented to the network and the output of the network is correct, so a = t and e = t - a = 0; the weights are not changed.

10 Perceptron learning rule. Case 2: if the neuron output is 0 and should have been 1, then a = 0 and t = 1, so e = t - a = 1 - 0 = 1; the inputs are added to their corresponding weights. Case 3: if the neuron output is 1 and should have been 0, then a = 1 and t = 0, so e = t - a = 0 - 1 = -1; the inputs are subtracted from their corresponding weights.

11 Perceptron learning rule: the rule can be more conveniently represented as
w_new = w_old + LR * e * p   (LR = learning rate)
b_new = b_old + LR * e
Convergence: the perceptron learning rule will converge to a solution in a finite number of steps if a solution exists. This includes all classification problems that are linearly separable.

12 Perceptron Learning Algorithm
While an epoch produces an error
    Present the network with the next input from the epoch
    e = t - a
    If e <> 0 then
        w_j = w_j + LR * p_j * e
        b = b + LR * e
    End If
End While
Epoch: a presentation of the entire training set to the neural network. In the case of the AND function, an epoch consists of four sets of inputs being presented to the network (i.e. [0,0], [0,1], [1,0], [1,1]).
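A runnable Python sketch of this algorithm (the function name train_perceptron, the max_epochs safeguard, and the default arguments are illustrative assumptions; the AND data in the usage comes from slide 8):

```python
def step(net):
    return 1 if net >= 0 else 0

def train_perceptron(patterns, targets, lr=1.0, max_epochs=100):
    """Perceptron learning rule: repeat epochs until no pattern produces an error."""
    n = len(patterns[0])
    w = [0.0] * n          # initial weights
    b = 0.0                # initial bias
    for _ in range(max_epochs):
        error_in_epoch = False
        for p, t in zip(patterns, targets):
            a = step(sum(wi * pi for wi, pi in zip(w, p)) + b)
            e = t - a
            if e != 0:
                w = [wi + lr * e * pi for wi, pi in zip(w, p)]
                b = b + lr * e
                error_in_epoch = True
        if not error_in_epoch:   # a full epoch with no errors: converged
            break
    return w, b

# Usage: learn the AND function from slide 8.
and_inputs  = [[0, 0], [0, 1], [1, 0], [1, 1]]
and_targets = [0, 0, 0, 1]
print(train_perceptron(and_inputs, and_targets))
```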

13 Example
x1 | x2 | t
 2 |  2 | 0
 1 | -2 | 1
-2 |  2 | 0
-1 |  1 | 1
Learning rate = 1, initial weights = 0, 0, bias = 0.
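A short sketch that traces this example through one epoch of the perceptron rule from slide 12 (variable names are illustrative; the data, learning rate, and initial values are those given above):

```python
def step(net):
    return 1 if net >= 0 else 0

patterns = [(2, 2), (1, -2), (-2, 2), (-1, 1)]
targets  = [0, 1, 0, 1]
lr, w, b = 1.0, [0.0, 0.0], 0.0   # learning rate, initial weights and bias from slide 13

for p, t in zip(patterns, targets):
    a = step(w[0] * p[0] + w[1] * p[1] + b)
    e = t - a
    if e != 0:
        w = [w[0] + lr * e * p[0], w[1] + lr * e * p[1]]
        b = b + lr * e
    print(p, "t =", t, "a =", a, "e =", e, "w =", w, "b =", b)
# First pattern (2, 2): net = 0, so a = 1 and e = -1, giving w = [-2, -2] and b = -1, and so on.
```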

14 Adaline - Adaptive Linear Filter. Similar to the perceptron, but has the identity function (f(x) = x) as its transfer function instead of the step function. Uses the Widrow-Hoff learning rule (delta rule, least mean square - LMS). More powerful than the perceptron learning rule: the rule provides the basis for the backpropagation algorithm, which can learn with many interconnected neurons and layers.

15 Adaline. The LMS learning rule adjusts the weights and biases so as to minimise the mean squared error for each pattern, and is based on the gradient descent algorithm.

16 Gradient Descent
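As a hedged sketch of the gradient-descent idea behind the LMS rule (the per-pattern squared error E = e^2 is an assumption, chosen so the result matches the factor of 2 in the update on slide 17):

```latex
% Linear (identity) neuron: a = \sum_i w_i p_i + b, with error e = t - a.
% Squared error for one pattern:
E = e^{2} = (t - a)^{2}
% Gradient with respect to one weight (and, analogously, the bias):
\frac{\partial E}{\partial w_i} = -2\,(t - a)\,p_i = -2\,e\,p_i
% Gradient descent step with learning rate LR (move against the gradient):
\Delta w_i = -\mathrm{LR}\,\frac{\partial E}{\partial w_i} = 2\,\mathrm{LR}\,e\,p_i,
\qquad
\Delta b = 2\,\mathrm{LR}\,e
```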

17 The ADALINE. The training algorithm goes through all the training examples a number of times, until a stopping criterion is reached.
Step 1: Initialise all weights and set the learning rate
    w_i = small random values
    LR = 0.2 (for example)
Step 2: While the stopping condition is false (for example, error > 0.01), update the bias and weights
    b(new) = b(old) + 2 * LR * e
    w_i(new) = w_i(old) + 2 * LR * e * p_i
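A runnable Python sketch of this training loop (the function name train_adaline, the tolerance-based stopping test, and the sample data with the linear target t = x1 - x2 are illustrative assumptions):

```python
import random

def train_adaline(patterns, targets, lr=0.2, tol=0.01, max_epochs=1000):
    """Widrow-Hoff (LMS) rule: linear output a = w.p + b, weights updated by 2*LR*e*p."""
    n = len(patterns[0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]   # small random initial weights
    b = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        sq_errors = []
        for p, t in zip(patterns, targets):
            a = sum(wi * pi for wi, pi in zip(w, p)) + b   # identity transfer function
            e = t - a
            w = [wi + 2 * lr * e * pi for wi, pi in zip(w, p)]
            b = b + 2 * lr * e
            sq_errors.append(e * e)
        if sum(sq_errors) / len(sq_errors) < tol:          # stop when the MSE is small enough
            break
    return w, b

# Usage: the target here is the linear function t = x1 - x2, which Adaline can fit exactly;
# the learned weights approach [1, -1] and the bias approaches 0.
print(train_adaline([[0, 0], [0, 1], [1, 0], [1, 1]], [0, -1, 1, 0]))
```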

18 Comparison of the perceptron and Adaline learning rules: one corrects a binary error, the other minimises a continuous error. The perceptron rule converges after a finite number of iterations if the problem is linearly separable; LMS converges asymptotically towards the minimum error, possibly requiring unbounded time.

19 Recommended Reading
Fundamentals of Neural Networks: Architectures, Algorithms and Applications, L. Fausett, 1994.
Artificial Intelligence: A Modern Approach, S. Russell and P. Norvig, 1995.
An Introduction to Neural Networks, 2nd Edition, Morton, I.M.

