Supervised Learning in ANNs
Introduction to Artificial Intelligence, Lecture 18: Neural Network Application Design I (April 7, 2016)


Supervised Learning in ANNs

In supervised learning, we train an ANN with a set of vector pairs, so-called exemplars. Each pair (x, y) consists of an input vector x and a corresponding output vector y. Whenever the network receives input x, we would like it to provide output y. The exemplars thus describe the function that we want to “teach” our network. Besides learning the exemplars, we would like our network to generalize, that is, give plausible output for inputs that the network had not been trained with.
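As a concrete illustration (not from the lecture), an exemplar set can be written directly as a list of (x, y) vector pairs; here the two-bit XOR function serves as a made-up example:

```python
import numpy as np

# A tiny exemplar set: each pair (x, y) tells the network which output
# vector y we would like to see whenever it receives the input vector x.
# XOR is used here purely as an illustrative example.
exemplars = [
    (np.array([0.0, 0.0]), np.array([0.0])),
    (np.array([0.0, 1.0]), np.array([1.0])),
    (np.array([1.0, 0.0]), np.array([1.0])),
    (np.array([1.0, 1.0]), np.array([0.0])),
]
```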

Supervised Learning in ANNs

There is a tradeoff between a network’s ability to precisely learn the given exemplars and its ability to generalize (i.e., inter- and extrapolate). This problem is similar to fitting a function to a given set of data points. Let us assume that you want to find a fitting function f: R → R for a set of three data points. You try to do this with polynomials of degree one (a straight line), two, and nine.
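To make the curve-fitting analogy concrete, here is a minimal sketch using NumPy. It uses ten made-up, slightly noisy points rather than the slide’s three, so that the degree-9 fit is well defined; the point is only to compare training error against plausibility of the fit:

```python
import numpy as np

# Made-up samples of a quadratic trend with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = 2.0 * x**2 - x + 0.5 + rng.normal(0.0, 0.05, x.size)

for degree in (1, 2, 9):
    coeffs = np.polyfit(x, y, degree)      # least-squares polynomial fit
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training error = {mse:.6f}")

# The degree-9 polynomial achieves the lowest training error but oscillates
# between the samples; the degree-2 polynomial is the most plausible fit.
```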

Supervised Learning in ANNs

[Figure: the data points together with the fitted polynomials of degree 1, degree 2, and degree 9, plotted as f(x) versus x.]

Obviously, the polynomial of degree 2 provides the most plausible fit.

Supervised Learning in ANNs

The same principle applies to ANNs: If an ANN has too few neurons, it may not have enough degrees of freedom to precisely approximate the desired function. If an ANN has too many neurons, it will learn the exemplars perfectly, but its additional degrees of freedom may cause it to show implausible behavior for untrained inputs; it then generalizes poorly. Unfortunately, there are no known equations that could tell you the optimal size of your network for a given application; you always have to experiment.

The Backpropagation Network

The backpropagation network (BPN) is the most popular type of ANN for applications such as classification or function approximation. Like other networks using supervised learning, the BPN is not biologically plausible. The structure of the network is identical to the one we discussed before: three (sometimes more) layers of neurons, only feedforward processing (input layer → hidden layer → output layer), and sigmoid activation functions.
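The following sketch illustrates this structure as a forward pass through a three-layer network; the layer sizes and random weights are placeholders chosen only for illustration:

```python
import numpy as np

def sigmoid(net):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + np.exp(-net))

rng = np.random.default_rng(42)
I, J, K = 4, 3, 2                          # input, hidden, and output layer sizes
W_hidden = rng.normal(0.0, 0.1, (J, I))    # weights: input layer -> hidden layer
W_output = rng.normal(0.0, 0.1, (K, J))    # weights: hidden layer -> output layer

def forward(x):
    """Feedforward processing only: input -> hidden -> output."""
    h = sigmoid(W_hidden @ x)              # hidden-layer activations
    o = sigmoid(W_output @ h)              # output vector of the network
    return o

print(forward(rng.random(I)))              # output for a random input vector
```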

The Backpropagation Network

[Figure: BPN units and activation functions. Input units I1 … II receive the input vector x; hidden units H1 … HJ apply the activation function f(net^h); output units O1 … OK apply f(net^o) and produce the output vector y.]

Learning in the BPN

Before the learning process starts, all weights (synapses) in the network are initialized with pseudorandom numbers. We also have to provide a set of training patterns (exemplars). They can be described as a set of ordered vector pairs {(x1, y1), (x2, y2), …, (xP, yP)}. Then we can start the backpropagation learning algorithm. This algorithm iteratively minimizes the network’s error by finding the gradient of the error surface in weight-space and adjusting the weights in the opposite direction (gradient-descent technique).

Learning in the BPN

Gradient-descent example: finding the absolute minimum of a one-dimensional error function f(x).

[Figure: the curve f(x) with its slope f'(x0) at the current position x0.]

Starting at a position x0, move one step against the slope:

x1 = x0 − η·f'(x0),

where η is a small positive step size. Repeat this iteratively until for some xi, f'(xi) is sufficiently close to 0.
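A minimal numerical version of this procedure; the error function and the step size η are arbitrary illustrative choices, not values from the lecture:

```python
def f(x):
    return (x - 3.0) ** 2 + 1.0       # toy error function, minimum at x = 3

def f_prime(x):
    return 2.0 * (x - 3.0)            # its slope f'(x)

eta = 0.1                             # step size
x = 0.0                               # starting position x0
while abs(f_prime(x)) > 1e-6:         # stop once the slope is close to 0
    x -= eta * f_prime(x)             # x_{i+1} = x_i - eta * f'(x_i)

print(x)                              # approximately 3.0
```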

Learning in the BPN

Gradients of two-dimensional functions: The two-dimensional function in the left diagram is represented by contour lines in the right diagram, where arrows indicate the gradient of the function at different locations. Obviously, the gradient is always pointing in the direction of the steepest increase of the function. In order to find the function’s minimum, we should always move against the gradient.

Learning in the BPN

In the BPN, learning is performed as follows:

1. Randomly select a vector pair (xp, yp) from the training set and call it (x, y).
2. Use x as input to the BPN and successively compute the outputs of all neurons in the network (bottom-up) until you get the network output o.
3. Compute the error δ^o_pk for the pattern p across all K output layer units by using the formula:

   δ^o_pk = (y_pk − o_pk) · f'(net^o_pk)

Learning in the BPN

4. Compute the error δ^h_pj for all J hidden layer units by using the formula:

   δ^h_pj = f'(net^h_pj) · Σ_k δ^o_pk · w_kj

5. Update the connection-weight values to the hidden layer by using the following equation:

   w^h_ji(t+1) = w^h_ji(t) + η · δ^h_pj · x_i

Learning in the BPN

6. Update the connection-weight values to the output layer by using the following equation:

   w^o_kj(t+1) = w^o_kj(t) + η · δ^o_pk · h_pj

Repeat steps 1 to 6 for all vector pairs in the training set; this is called a training epoch (a code sketch of one such training step follows below). Run as many epochs as required to reduce the network error E below a threshold ε:

E = Σ_p Σ_k (y_pk − o_pk)² < ε
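A compact sketch of steps 1 to 6 for a single hidden layer, assuming sigmoid activations so that f'(net) = f(net)·(1 − f(net)), as derived on the following slides. The layer sizes, learning rate η, and the single exemplar are made-up placeholders:

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

rng = np.random.default_rng(0)
I, J, K, eta = 3, 4, 2, 0.5                # layer sizes and learning rate
W_h = rng.normal(0.0, 0.1, (J, I))         # weights: input -> hidden
W_o = rng.normal(0.0, 0.1, (K, J))         # weights: hidden -> output

def train_step(x, y):
    global W_h, W_o
    # Step 2: forward pass (bottom-up) to obtain the network output o.
    h = sigmoid(W_h @ x)
    o = sigmoid(W_o @ h)
    # Step 3: output-layer errors, delta_o_k = (y_k - o_k) * f'(net_o_k).
    delta_o = (y - o) * o * (1.0 - o)
    # Step 4: hidden-layer errors, delta_h_j = f'(net_h_j) * sum_k delta_o_k * w_kj.
    delta_h = h * (1.0 - h) * (W_o.T @ delta_o)
    # Steps 5 and 6: move the weights against the error gradient.
    W_h += eta * np.outer(delta_h, x)
    W_o += eta * np.outer(delta_o, h)
    return np.sum((y - o) ** 2)            # this pattern's contribution to E

# One training epoch: step 1 picks each exemplar of a (made-up) training set.
exemplars = [(np.array([0.0, 1.0, 1.0]), np.array([1.0, 0.0]))]
E = sum(train_step(x, y) for x, y in exemplars)
print(E)
```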

Sigmoidal Neurons

[Figure: the sigmoid activation function f_i(net_i(t)) plotted against net_i(t) for τ = 0.1 and τ = 1.]

In backpropagation networks, we typically choose τ = 1 and θ = 0.

Sigmoidal Neurons

In order to derive a more efficient and straightforward learning algorithm, let us use a simplified form of the sigmoid function. We do not need a modifiable threshold θ; instead, let us set θ = 0 and add an offset (“dummy”) input for each neuron that always provides an input value of 1. Then modifying the weight for the offset input will work just like varying the threshold. The choice τ = 1 works well in most situations and results in a very simple derivative of S(net):

Sigmoidal Neurons

S'(net) = d/d(net) [1 / (1 + e^(−net))] = e^(−net) / (1 + e^(−net))² = S(net) · (1 − S(net))

Very simple and efficient to compute!
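A small sketch tying these simplifications together: with θ = 0 and τ = 1, the threshold is replaced by a constant “dummy” input of 1 whose weight is learned, and the derivative reduces to S(net)·(1 − S(net)). The numeric values below are made up for illustration:

```python
import numpy as np

def S(net):
    """Simplified sigmoid with theta = 0 and tau = 1."""
    return 1.0 / (1.0 + np.exp(-net))

def S_prime(net):
    s = S(net)
    return s * (1.0 - s)                  # S'(net) = S(net) * (1 - S(net))

x = np.array([0.4, 0.7])                  # example input vector
w = np.array([0.2, -0.5, 0.3])            # last weight belongs to the dummy input
x_dummy = np.append(x, 1.0)               # constant offset ("dummy") input of 1
net = w @ x_dummy                         # learned dummy weight acts like -theta
print(S(net), S_prime(net))
```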

Learning in the BPN

Then the derivative of our sigmoid function, for example, f'(net_k) for the output neurons, is:

f'(net_k) = o_k · (1 − o_k),

where o_k = S(net_k) is the output of output unit k.

Learning in the BPN

Now our BPN is ready to go! If we choose the type and number of neurons in our network appropriately, after training the network should show the following behavior: If we input any of the training vectors, the network should yield the expected output vector (with some margin of error). If we input a vector that the network has never “seen” before, it should be able to generalize and yield a plausible output vector based on its knowledge about similar input vectors.