Neural Networks Lecture 13: Adaptive Networks (October 28, 2010)

Slide 1: Adaptive Networks
As you know, there is no equation that would tell you the ideal number of neurons in a multi-layer network. Ideally, we would like to use the smallest number of neurons that allows the network to do its task sufficiently accurately, because a smaller network means:
- fewer weights in the system,
- fewer training samples required,
- faster training,
- typically, better generalization for new test samples.

Slide 2: Adaptive Networks
So far, we have determined the number of hidden-layer units in BPNs by “trial and error.” However, there are algorithmic approaches for adapting the size of a network to a given task. Some techniques start with a large network and then iteratively prune connections and nodes that contribute little to the network function. Other methods start with a minimal network and then add connections and nodes until the network reaches a given performance level. Finally, there are algorithms that combine these “pruning” and “growing” approaches.
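To make the “pruning” idea concrete, here is a minimal sketch (not from the lecture) of one simple criterion, magnitude-based pruning; the threshold value and the surrounding retrain-and-repeat loop are illustrative assumptions.

```python
import numpy as np

def prune_small_weights(weight_matrices, threshold=0.01):
    """Zero out connections whose magnitude falls below a threshold."""
    pruned, removed = [], 0
    for W in weight_matrices:
        mask = np.abs(W) >= threshold      # keep only sufficiently large weights
        removed += int(np.sum(~mask))      # count the connections that were cut
        pruned.append(W * mask)
    return pruned, removed

# Typical use: prune, retrain the surviving weights, and repeat
# until the validation error starts to increase.
```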

Slide 3: Cascade Correlation
None of these algorithms are guaranteed to produce “ideal” networks. (It is not even clear how to define an “ideal” network.) However, numerous algorithms exist that have been shown to yield good results for most applications. We will take a look at one such algorithm named “cascade correlation.” It is of the “network growing” type and can be used to build multi-layer networks of adequate size. However, these networks are not strictly feed-forward in a level-by-level manner.

Slide 4: Refresher: Covariance and Correlation
For a dataset (x_i, y_i) with i = 1, …, n, the covariance is:
cov(x, y) = (1/n) · Σ_i (x_i - x̄)(y_i - ȳ)
[Scatter plots: directly proportional data give cov(x, y) > 0, unrelated data give cov(x, y) ≈ 0, inversely proportional data give cov(x, y) < 0.]
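As a quick numerical illustration of this definition (not part of the original slides), the following snippet computes the covariance with the 1/n normalization; some texts divide by n - 1 instead.

```python
import numpy as np

def covariance(x, y):
    """Covariance of two equally long data vectors (1/n normalization)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

x = np.array([1.0, 2.0, 3.0, 4.0])
print(covariance(x,  2 * x + 1))   # directly proportional data: cov > 0
print(covariance(x, -2 * x + 1))   # inversely proportional data: cov < 0
```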

Slide 5: Refresher: Covariance and Correlation
Covariance tells us something about the strength and direction (directly vs. inversely proportional) of the linear relationship between x and y. For many applications, it is useful to normalize this quantity so that it ranges from -1 to 1. The result is the correlation coefficient r, which for a dataset (x_i, y_i) with i = 1, …, n is given by:
r = cov(x, y) / (σ_x · σ_y) = Σ_i (x_i - x̄)(y_i - ȳ) / √( Σ_i (x_i - x̄)² · Σ_i (y_i - ȳ)² )
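A corresponding sketch for the correlation coefficient (again an illustration, not lecture code): r is the covariance divided by the product of the standard deviations, so it is undefined when either variable has zero variance.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation coefficient r = cov(x, y) / (std(x) * std(y))."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())   # division by zero if a variable is constant

x = np.array([1.0, 2.0, 3.0, 4.0])
print(correlation(x, 3 * x - 2))             # perfect linear relation: r = 1.0
print(correlation(x, [2.0, 1.0, 4.0, 3.0]))  # imperfect direct relation: 0 < r < 1 (here 0.6)
```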

Slide 6: Refresher: Covariance and Correlation
[Scatter plots illustrating the correlation coefficient: r = 1, 0 < r < 1, r ≈ 0, -1 < r < 0, r = -1, and r undefined.]

Slide 7: Refresher: Covariance and Correlation
In the case of high (close to 1) or low (close to -1) correlation coefficients, we can use one variable as a predictor of the other one. To quantify the linear relationship between the two variables, we can use linear regression:
[Scatter plot with a fitted regression line through the data points.]
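For completeness, here is a small sketch (not from the lecture) of fitting such a regression line by least squares; the slope is cov(x, y)/var(x) and the line passes through the point of means.

```python
import numpy as np

def regression_line(x, y):
    """Least-squares line y = a * x + b through the data points."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    a = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)   # slope = cov(x, y) / var(x)
    b = y.mean() - a * x.mean()                                # intercept: line goes through the means
    return a, b

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
a, b = regression_line(x, y)
print(f"y ≈ {a:.2f} * x + {b:.2f}")
```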

Slide 8: Cascade Correlation
Now let us return to the cascade correlation algorithm. We start with a minimal network consisting of only the input neurons (one of them should be a constant offset = 1) and the output neurons, completely connected as usual. The output neurons (and later the hidden neurons) typically use output functions that can also produce negative outputs; e.g., we can subtract 0.5 from our sigmoid function for a (-0.5, 0.5) output range. Then we successively add hidden-layer neurons and train them to reduce the network error step by step:
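The shifted output function mentioned here is easy to state explicitly; this is a small sketch of the sigmoid with 0.5 subtracted, which maps any net input into the range (-0.5, 0.5).

```python
import numpy as np

def shifted_sigmoid(net):
    """Sigmoid shifted down by 0.5, producing outputs in (-0.5, 0.5)."""
    return 1.0 / (1.0 + np.exp(-net)) - 0.5

print(shifted_sigmoid(0.0))     # 0.0 at the midpoint
print(shifted_sigmoid(10.0))    # close to +0.5
print(shifted_sigmoid(-10.0))   # close to -0.5
```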

Slide 9: Cascade Correlation
[Diagram: input nodes x_1, x_2, x_3 connected to the output node o_1. Solid connections are being modified.]

Slide 10: Cascade Correlation
[Diagram: input nodes x_1, x_2, x_3, the first hidden node, and the output node o_1. Solid connections are being modified.]

Slide 11: Cascade Correlation
[Diagram: input nodes x_1, x_2, x_3, the first and second hidden nodes, and the output node o_1. Solid connections are being modified.]
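The diagrams above suggest how the forward pass works once hidden nodes are installed: each new hidden node is connected to all input nodes and to all previously added hidden nodes, and the output node sees all of them. The following is a minimal sketch of that forward pass for a single output node; the function and variable names are illustrative, not from the lecture.

```python
import numpy as np

def cascade_forward(x, hidden_weights, output_weights, f):
    """Forward pass through a cascade-correlation network with one output node.

    x: list of input values (including the constant offset input 1).
    hidden_weights: list of weight vectors; the k-th hidden node is connected
        to all inputs and to the k hidden nodes added before it.
    output_weights: weights of the output node (connected to inputs and all hidden nodes).
    f: activation function, e.g. the shifted sigmoid above.
    """
    activations = list(x)
    for w in hidden_weights:                           # each new node sees everything added before it
        activations.append(f(np.dot(w, activations)))
    return f(np.dot(output_weights, activations))
```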

Slide 12: Cascade Correlation
Weights to each new hidden node are trained to maximize the covariance of the node's output with the current network error.
Covariance:
S(w_new) = Σ_k | Σ_p (x_p - x̄)(E_{k,p} - Ē_k) |
where
- w_new: vector of weights to the new node,
- x_p: output of the new node for the p-th input sample,
- E_{k,p}: error of the k-th output node for the p-th input sample before the new node is added,
- x̄, Ē_k: averages over the training set.
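Under the assumption that the lecture uses the standard cascade-correlation score of Fahlman and Lebiere (sum over output nodes of the magnitude of this covariance, without extra normalization), the following sketch computes S for a candidate node; the gradient-ascent step that actually adjusts w_new to maximize S is omitted.

```python
import numpy as np

def candidate_score(x_outputs, errors):
    """Covariance-based score S for a candidate hidden node.

    x_outputs: shape (P,), candidate node output x_p for each training sample p.
    errors: shape (P, K), error E_{k,p} of output node k for sample p,
        measured before the candidate node is added.
    """
    x_dev = x_outputs - x_outputs.mean()          # x_p - x̄
    e_dev = errors - errors.mean(axis=0)          # E_{k,p} - Ē_k
    return np.sum(np.abs(x_dev @ e_dev))          # sum over k of |sum over p of the products|
```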