Pattern Recognition Lecture 20: Neural Networks 3 Dr. Richard Spillman Pacific Lutheran University

Class Topics: Introduction, Decision Functions, Cluster Analysis, Statistical Decision Theory, Feature Selection, Machine Learning, Neural Nets, Data Mining, Midterm One, Midterm Two, Project Presentations

Review: Features of Neural Nets; Learning in Neural Nets; McCulloch/Pitts Neuron; Hebbian Learning

Review – MCP Neuron One of the first neuron models to be implemented. Its output is 1 (fired) or 0. Each input is weighted, with weights in the range -1 to +1, and the neuron has a threshold value, T. (Diagram: inputs x1, x2, x3, weighted by w1, w2, w3, feed a summing unit with threshold T.) The neuron fires if the following inequality is true: x1·w1 + x2·w2 + x3·w3 > T
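
The firing rule above is easy to state in code. A minimal Python sketch (the function name and the example numbers are illustrative, not from the lecture):

def mcp_neuron(inputs, weights, threshold):
    """Return 1 (fire) if the weighted input sum exceeds the threshold, else 0."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s > threshold else 0

# x1*w1 + x2*w2 + x3*w3 = 0.5 + 0.0 + 0.4 = 0.9 > T = 0.6, so the neuron fires
print(mcp_neuron([1, 0, 1], [0.5, -0.3, 0.4], 0.6))  # -> 1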

OUTLINE Introduction to the Perceptron Adaline Network

Introduction to the Perceptron

The Perceptron The perceptron was suggested by Rosenblatt in 1958. It uses an iterative learning procedure which can be proven to converge to the correct weights for linearly separable data. It has a bias and a threshold activation function. (Diagram: inputs x1, x2, weighted by w1, w2, plus a bias b, feed a summing unit.) Activation function: f(s) = 1 if s > θ; 0 if -θ <= s <= θ; -1 if s < -θ

Perceptron Learning Rule Weights are changed only when an error occurs. In that case the weights are updated using: w_i(new) = w_i(old) + α·t·x_i, where t is the target output (+1 or -1) and α is the learning rate. If an error does not occur, the weights are not changed.
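
A minimal sketch of this rule in Python (the helper name, the starting weights, and the choice θ = 0 are assumptions for illustration):

def perceptron_step(weights, bias, x, t, alpha=0.1, theta=0.0):
    """One perceptron training step; x is the input vector, t the target (+1 or -1)."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    y = 1 if s > theta else (-1 if s < -theta else 0)
    if y != t:  # an error occurred: w_i(new) = w_i(old) + alpha * t * x_i
        weights = [w + alpha * t * xi for w, xi in zip(weights, x)]
        bias = bias + alpha * t
    return weights, bias

# One step on the pattern (1, 1) with target +1, starting from zero weights
w, b = perceptron_step([0.0, 0.0], 0.0, [1, 1], 1)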

Limitations of the Perceptron The perceptron can only learn to distinguish between classifications if the classes are linearly separable. If the problem is not linearly separable then the behaviour of the algorithm is not guaranteed. If the problem is linearly separable, there may be a number of possible solutions. The algorithm as stated gives no indication of the quality of the solution found.

XOR Problem A perceptron network cannot implement the XOR function, which outputs 1 only when the two inputs are different. With a bias input x0 = 1 and weights w0, w1, w2, an output o(x1, x2) = XOR(x1, x2) would require: w0 <= 0 (input 0,0), w0 + w1 > 0 (input 1,0), w0 + w2 > 0 (input 0,1), and w0 + w1 + w2 <= 0 (input 1,1). Adding the middle two gives 2·w0 + w1 + w2 > 0, which with the last inequality forces w0 > 0, contradicting the first. There is no assignment of values to w0, w1 and w2 that satisfies these inequalities: XOR cannot be represented!

MultiLayer Perceptrons Perceptrons can be improved if placed in a multilayered network. (Diagram: inputs x1, x2, x3 feed an input layer, one or more hidden layers, and an output layer producing outputs y1, y2, with weight matrices w_ij, w_jk, w_kl between successive layers.)
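
To make the layered structure concrete, here is a small forward-pass sketch in Python with threshold units; all sizes, weight values, and names are made up for illustration:

def layer(inputs, weights, biases):
    """Propagate inputs through one layer of threshold units."""
    return [1 if b + sum(w * x for w, x in zip(ws, inputs)) > 0 else -1
            for ws, b in zip(weights, biases)]

x = [1, -1, 1]                                                        # inputs x1..x3
hidden = layer(x, [[0.2, -0.5, 0.1], [0.4, 0.3, -0.2]], [0.0, 0.1])   # hidden layer
outputs = layer(hidden, [[0.7, -0.6]], [0.05])                        # output layer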

Adaline Network

Adaline Network A variation on the perceptron network: inputs are +1 or -1, outputs are +1 or -1, and it uses a bias input. Differences: it is trained using the Delta Rule, also known as the least mean squares (LMS) or Widrow-Hoff rule; the activation function during training is the identity function; after training, the activation is a threshold function.

Adaline Algorithm Step 0: initialize the weights to small random values and select a learning rate, α. Step 1: for each input vector s, with target output t, set the inputs to s. Step 2: compute the neuron input: y_in = b + Σ_i x_i·w_i. Step 3: use the delta rule to update the bias and weights: b(new) = b(old) + α(t - y_in); w_i(new) = w_i(old) + α(t - y_in)·x_i. Step 4: stop if the largest weight change across all the training samples is less than a specified tolerance, otherwise cycle through the training set again.
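
A minimal Python sketch of this training loop. Starting from zero weights (rather than small random values) and measuring the change over a whole cycle are assumptions made to keep the example deterministic:

def train_adaline(samples, alpha=0.1, tol=0.01, max_epochs=1000):
    """samples: list of (input_vector, target) pairs; returns (weights, bias)."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(max_epochs):
        w_start, b_start = list(w), b
        for x, t in samples:
            y_in = b + sum(wi * xi for wi, xi in zip(w, x))   # Step 2: net input
            delta = alpha * (t - y_in)                        # Step 3: delta rule
            b = b + delta
            w = [wi + delta * xi for wi, xi in zip(w, x)]
        change = max(abs(b - b_start),
                     *(abs(wi - ws) for wi, ws in zip(w, w_start)))
        if change < tol:          # Step 4: largest weight change over the cycle
            break
    return w, b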

The Learning Rate, α The performance of an ADALINE neuron depends heavily on the choice of the learning rate: if it is too large the system will not converge; if it is too small, convergence will take too long. Typically, α is selected by trial and error: the typical range is 0.01 < α < 10.0, and one often starts at 0.1. Sometimes it is suggested that 0.1 < n·α < 1.0, where n is the number of inputs.

Running Adaline One unique feature of ADALINE is that its activation function is different for training and running. When running ADALINE: initialize the weights to those found during training, compute the net input y_in = b + Σ_i x_i·w_i, and apply the activation function y = 1 if y_in >= 0, y = -1 if y_in < 0.
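
A matching sketch of the run phase, following the conventions of the training sketch above (the helper name is illustrative):

def run_adaline(w, b, x):
    """Apply a trained Adaline: net input followed by a threshold activation."""
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if y_in >= 0 else -1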

Example – AND function Construct an AND function for an ADALINE neuron; let α = 0.1. The training set uses bipolar inputs with a bias input of 1:
x1   x2   bias   target
 1    1     1      +1
 1   -1     1      -1
-1    1     1      -1
-1   -1     1      -1
Initial conditions: set the weights to small random values, here b = 0.1, w1 = 0.2, w2 = 0.3 (the values used in the training runs that follow).

First Training Run Apply the input (1, 1) with target +1. The net input is: y_in = b + Σ_i x_i·w_i = 0.1 + 0.2·1 + 0.3·1 = 0.6. Delta rule: b(new) = b(old) + α(t - y_in); w_i(new) = w_i(old) + α(t - y_in)·x_i. The new weights are: b = 0.1 + 0.1(1 - 0.6) = 0.14; w1 = 0.2 + 0.1(1 - 0.6)·1 = 0.24; w2 = 0.3 + 0.1(1 - 0.6)·1 = 0.34. The largest weight change is 0.04.

Second Training Run Apply the second training pair (1, -1) with target -1. The net input is: y_in = 0.14 + 0.24·1 + 0.34·(-1) = 0.04. The new weights are: b = 0.14 + 0.1(-1 - 0.04) ≈ 0.04; w1 = 0.24 + 0.1(-1 - 0.04)·1 ≈ 0.14; w2 = 0.34 + 0.1(-1 - 0.04)·(-1) ≈ 0.44. The largest weight change is about 0.1.

Third Training Run Apply the third training pair (-1, 1) with target -1. The net input is: y_in = 0.04 + 0.14·(-1) + 0.44·1 = 0.34. The new weights are: b = 0.04 + 0.1(-1 - 0.34) ≈ -0.09; w1 = 0.14 + 0.1(-1 - 0.34)·(-1) ≈ 0.27; w2 = 0.44 + 0.1(-1 - 0.34)·1 ≈ 0.31. The largest weight change is about 0.13.

Fourth Training Run Apply the fourth training pair (-1, -1) with target -1. The net input is: y_in = -0.09 + 0.27·(-1) + 0.31·(-1) = -0.67. The new weights are: b = -0.09 + 0.1(-1 + 0.67) ≈ -0.12; w1 = 0.27 + 0.1(-1 + 0.67)·(-1) ≈ 0.30; w2 = 0.31 + 0.1(-1 + 0.67)·(-1) ≈ 0.34. The largest weight change is about 0.03.

Conclusion Continue to cycle through the four training inputs until the largest change in the weights over a complete cycle is less than some small number (say 0.01). In this case, the solution becomes b = -0.5, w1 = 0.5, w2 = 0.5.
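
A usage sketch that reproduces this example with the hypothetical train_adaline and run_adaline helpers from the earlier sketches (exact values will differ slightly from the hand computation above):

and_data = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
w, b = train_adaline(and_data, alpha=0.1, tol=0.01)
print(w, b)                                           # roughly w1 = w2 = 0.5, b = -0.5
print([run_adaline(w, b, x) for x, _ in and_data])    # -> [1, -1, -1, -1]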

Example – Recognize Digits

Multi-Layered Net (again) (Diagram: input neurons feed, through weighted interconnects, a hidden layer and then the output layer.)

Properties Patterns of activation are presented at the inputs and the resulting activation of the outputs is computed. The values of the weights determine the function computed. A network with one hidden layer is sufficient to represent every Boolean function. With real-valued weights, every continuous real-valued function can be approximated arbitrarily well with a single hidden layer.

Training Given training data, an input vector set { i_n | 1 ≤ n ≤ N } and a corresponding output (target) vector set { t_n | 1 ≤ n ≤ N }, find the weights of the interconnects using the training data so as to minimize the error on the test data.

Error Input, target & response: input vector set { i_n | 1 ≤ n ≤ N }; target vector set { t_n | 1 ≤ n ≤ N }; o_n = neural network output when the input is i_n. Error: E = (1/2) Σ_n ||o_n - t_n||².
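
A small Python sketch of this sum-squared error (the function name is illustrative):

def sum_squared_error(outputs, targets):
    """E = (1/2) * sum_n ||o_n - t_n||^2 over all training pairs."""
    return 0.5 * sum(sum((o - t) ** 2 for o, t in zip(o_n, t_n))
                     for o_n, t_n in zip(outputs, targets))

# Example: two output vectors compared with their targets -> E = 0.05
E = sum_squared_error([[0.9, 0.1], [0.2, 0.8]], [[1.0, 0.0], [0.0, 1.0]])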

Error Minimization Techniques The error is a function of the fixed training and test data and of the neural network weights. Finding weights that minimize the error is a standard optimization problem; candidate techniques include conjugate gradient descent, random search, genetic algorithms, and steepest descent (error backpropagation).
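
As a sketch of the last option, a generic steepest-descent loop in Python; the gradient function is supplied by the caller (for a layered network it would be computed by backpropagation), and the toy objective below is only for illustration:

def steepest_descent(w, grad, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient of the error with respect to the weights."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

# Toy usage: minimize E(w) = (w1 - 3)^2 + (w2 + 1)^2; the result approaches [3, -1]
w_min = steepest_descent([0.0, 0.0], lambda w: [2 * (w[0] - 3), 2 * (w[1] + 1)])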

Possible Quiz What is a limitation of the perceptron? What are the names of the layers in a multilayer net? What is the role of training?

SUMMARY Introduction to the Perceptron Adaline Network