Introduction to Artificial Intelligence, Lecture 16: Neural Network Paradigms III (November 21, 2012)

Learning in the BPN

Gradients of two-dimensional functions: The two-dimensional function in the left diagram is represented by contour lines in the right diagram, where arrows indicate the gradient of the function at different locations. Obviously, the gradient is always pointing in the direction of the steepest increase of the function. In order to find the function's minimum, we should always move against the gradient.
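This is exactly the idea behind gradient-descent learning. As a minimal illustration (the quadratic function, step size, and starting point below are made up, not from the lecture), repeatedly stepping against the gradient approaches the minimum:

```python
import numpy as np

# Toy illustration of the statement above: to find a minimum of a
# two-dimensional function, repeatedly step against its gradient.
# The function f, the step size eta, and the starting point are made up.

def f(p):
    x, y = p
    return (x - 1.0) ** 2 + 2.0 * (y + 0.5) ** 2   # minimum at (1, -0.5)

def grad_f(p):
    x, y = p
    return np.array([2.0 * (x - 1.0), 4.0 * (y + 0.5)])

p = np.array([3.0, 2.0])          # arbitrary starting point
eta = 0.1                         # step size
for _ in range(100):
    p = p - eta * grad_f(p)       # move against the gradient
print(p)                          # close to the minimum at (1, -0.5)
```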

Learning in the BPN

In the BPN, learning is performed as follows:
1. Randomly select a vector pair (x_p, y_p) from the training set and call it (x, y).
2. Use x as input to the BPN and successively compute the outputs of all neurons in the network (bottom-up) until you get the network output o.
3. Compute the error δ^o_pk, for pattern p across all K output-layer units, by using the formula shown below.
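The formula appears only as an image on the slide; the standard backpropagation output-layer error term, which is presumably what the slide shows, is:

```latex
\delta^{o}_{pk} = \left( y_{pk} - o_{pk} \right) f'\!\left( net^{o}_{pk} \right)
```

where y_pk is the desired output, o_pk the actual output of output unit k for pattern p, and net^o_pk that unit's net input.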

Learning in the BPN

4. Compute the error δ^h_pj, for all J hidden-layer units, by using the formula shown below.
5. Update the connection-weight values to the hidden layer by using the equation shown below.
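These two formulas are likewise slide images; in the standard derivation the hidden-layer error (step 4) and the hidden-layer weight update (step 5), with learning rate η and input components x_pi, are presumably:

```latex
\delta^{h}_{pj} = f'\!\left( net^{h}_{pj} \right) \sum_{k=1}^{K} \delta^{o}_{pk}\, w^{o}_{kj},
\qquad
w^{h}_{ji}(t+1) = w^{h}_{ji}(t) + \eta\, \delta^{h}_{pj}\, x_{pi}
```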

Learning in the BPN

6. Update the connection-weight values to the output layer by using the equation shown below.

Repeat steps 1 to 6 for all vector pairs in the training set; this is called a training epoch. Run as many epochs as required to reduce the network error E, defined below, to a value below a threshold ε.
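In the standard derivation, the output-layer update (step 6) and the network error E, presumably the two formulas shown on the slide, are:

```latex
w^{o}_{kj}(t+1) = w^{o}_{kj}(t) + \eta\, \delta^{o}_{pk}\, h_{pj},
\qquad
E = \frac{1}{2} \sum_{p=1}^{P} \sum_{k=1}^{K} \left( y_{pk} - o_{pk} \right)^{2}
```

where h_pj is the output of hidden unit j for pattern p and P is the number of training patterns.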

Sigmoidal Neurons

In backpropagation networks, we typically choose τ = 1 and θ = 0.

[Figure: the sigmoid f_i(net_i(t)) plotted against net_i(t), with curves for τ = 1 and τ = 0.1.]
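The function plotted in the figure is presumably the standard logistic sigmoid with threshold θ and steepness parameter τ:

```latex
f_i\!\left( net_i(t) \right) = \frac{1}{1 + e^{-\left( net_i(t) - \theta \right)/\tau}}
```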

Sigmoidal Neurons

In order to derive a more efficient and straightforward learning algorithm, let us use a simplified form of the sigmoid function. We do not need a modifiable threshold θ; instead, let us set θ = 0 and add an offset ("dummy") input for each neuron that always provides an input value of 1. Then modifying the weight for the offset input will work just like varying the threshold. The choice τ = 1 works well in most situations and results in a very simple derivative of S(net), shown below.
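With θ = 0 and τ = 1, the simplified sigmoid S and its derivative (shown only as images on the slides) reduce to:

```latex
S(net) = \frac{1}{1 + e^{-net}},
\qquad
S'(net) = \frac{e^{-net}}{\left( 1 + e^{-net} \right)^{2}} = S(net)\left( 1 - S(net) \right)
```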

Sigmoidal Neurons

Very simple and efficient to compute!

Learning in the BPN

Then the derivative of our sigmoid function, for example, f'(net_k) for the output neurons, is given below.
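Since o_k = S(net_k) and S' = S(1 - S), this derivative (an image on the slide) is simply:

```latex
f'(net_k) = o_k \left( 1 - o_k \right)
```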

Learning in the BPN

Now our BPN is ready to go! If we choose the type and number of neurons in our network appropriately, after training the network should show the following behavior:
- If we input any of the training vectors, the network should yield the expected output vector (with some margin of error).
- If we input a vector that the network has never "seen" before, it should be able to generalize and yield a plausible output vector based on its knowledge about similar input vectors.
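Putting the six steps together, a minimal sketch of the whole procedure might look as follows. This is not the lecture's code: it assumes one hidden layer, the simplified sigmoid (θ = 0, τ = 1) with an extra offset ("dummy") input, a fixed learning rate η, and made-up layer sizes, and it uses XOR as a toy task.

```python
import numpy as np

# Minimal sketch of the six-step BPN training procedure described above.

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def train_bpn(X, Y, n_hidden=3, eta=0.5, epsilon=0.01, max_epochs=10000, seed=0):
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], Y.shape[1]
    W_h = rng.uniform(-0.5, 0.5, (n_hidden, n_in + 1))    # hidden-layer weights (+ offset)
    W_o = rng.uniform(-0.5, 0.5, (n_out, n_hidden + 1))   # output-layer weights (+ offset)

    for _ in range(max_epochs):
        E = 0.0
        # Step 1: visit the training pairs in random order (the slides pick them randomly).
        for p in rng.permutation(len(X)):
            x = np.append(X[p], 1.0)                  # offset input, always 1
            h = sigmoid(W_h @ x)                      # step 2: hidden activations
            h1 = np.append(h, 1.0)
            o = sigmoid(W_o @ h1)                     # step 2: network output
            delta_o = (Y[p] - o) * o * (1.0 - o)      # step 3: output error terms
            delta_h = h * (1.0 - h) * (W_o[:, :-1].T @ delta_o)   # step 4: hidden error terms
            W_h += eta * np.outer(delta_h, x)         # step 5: update hidden-layer weights
            W_o += eta * np.outer(delta_o, h1)        # step 6: update output-layer weights
            E += 0.5 * np.sum((Y[p] - o) ** 2)
        if E < epsilon:                               # stop once the epoch error is small enough
            break
    return W_h, W_o

# Toy usage: learn XOR (convergence is not guaranteed for every random seed).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
W_h, W_o = train_bpn(X, Y)
```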

Backpropagation Network Variants

The standard BPN network is well-suited for learning static functions, that is, functions whose output depends only on the current input. For many applications, however, we need functions whose output changes depending on previous inputs (for example, think of a deterministic finite automaton). Obviously, pure feedforward networks are unable to achieve such a computation. Only recurrent neural networks (RNNs) can overcome this problem. A well-known recurrent version of the BPN is the Elman Network.

The Elman Network

In comparison to the BPN, the Elman Network has an extra set of input units, so-called context units. These neurons do not receive input from outside the network, but from the network's hidden layer in a one-to-one fashion. Basically, the context units contain a copy of the network's internal state at the previous time step. The context units feed into the hidden layer just like the other input units do, so the network is able to compute a function that not only depends on the current input, but also on the network's internal state (which is determined by previous inputs).
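A rough sketch of the forward pass just described might look like this. It is not from the lecture: the sigmoid activations, unit counts, and weight ranges are assumptions, and training is omitted.

```python
import numpy as np

# Illustrative Elman forward pass: context units hold a copy of the
# previous hidden state and feed into the hidden layer like inputs.

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

class ElmanNetwork:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # The hidden layer receives the external inputs AND the context units.
        self.W_h = rng.uniform(-0.5, 0.5, (n_hidden, n_in + n_hidden))
        self.W_o = rng.uniform(-0.5, 0.5, (n_out, n_hidden))
        self.context = np.zeros(n_hidden)      # copy of the previous hidden state

    def step(self, x):
        h = sigmoid(self.W_h @ np.concatenate([x, self.context]))
        o = sigmoid(self.W_o @ h)
        self.context = h.copy()                # one-to-one copy of the hidden layer
        return o

# Usage: the output at each step depends on the current input and on the
# internal state left behind by the previous inputs.
net = ElmanNetwork(n_in=2, n_hidden=4, n_out=1)
for x in [np.array([0.0, 1.0]), np.array([1.0, 0.0]), np.array([1.0, 1.0])]:
    print(net.step(x))
```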

The Elman Network

[Figure: Elman network architecture, with context units holding a copy of the hidden layer's previous activation and feeding back into the hidden layer.]

The Counterpropagation Network

Another variant of the BPN is the counterpropagation network (CPN). Although this network uses linear neurons, it can learn nonlinear functions by means of a hidden layer of competitive units. Moreover, the network is able to learn a function and its inverse at the same time. However, to simplify things, we will only consider the feedforward mechanism of the CPN.

The Counterpropagation Network

A simple CPN network with two input neurons, three hidden neurons, and two output neurons can be described as follows:

[Figure: input layer with units X1 and X2, hidden layer with units H1, H2, and H3, and output layer with units Y1 and Y2.]

The Counterpropagation Network

The CPN learning process (general form for n input units and m output units):
1. Randomly select a vector pair (x, y) from the training set.
2. Normalize (shrink/expand to "length" 1) the input vector x by dividing every component of x by the magnitude ||x||, defined below.
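The magnitude ||x|| is the usual Euclidean vector length:

```latex
\|\mathbf{x}\| = \sqrt{\sum_{i=1}^{n} x_i^{2}}
```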

The Counterpropagation Network

3. Initialize the input neurons with the normalized vector and compute the activation of the linear hidden-layer units.
4. In the hidden (competitive) layer, determine the unit W with the largest activation (the winner).
5. Adjust the connection weights between W and all N input-layer units according to the formula shown below.
6. Repeat steps 1 to 5 until all training patterns have been processed once.
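The update formula itself is an image on the slide; a common form of this competitive (winner-takes-all) update, with learning rate α, moves the winner's weight vector toward the current normalized input:

```latex
w_{Wi}(t+1) = w_{Wi}(t) + \alpha \left( x_i - w_{Wi}(t) \right), \qquad i = 1, \dots, N
```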