Prediction Networks
Prediction
– Predict f(t) based on values of f(t – 1), f(t – 2), …
– Two NN models: feedforward and recurrent
A simple example (section 3.7.3)
– Forecasting the gold price for a month based on its prices in previous months
– Using a BP net with a single hidden layer
  1 output node: forecasted price for month t
  k input nodes (using the prices of the previous k months for prediction)
  k hidden nodes
  Training sample for k = 2: ((x_{t-2}, x_{t-1}), x_t)
  Raw data: gold prices for 100 consecutive months; 90 for training, 10 for cross-validation testing
  One-lag forecasting: predict x_t based on the actual values of x_{t-2} and x_{t-1}
  Multilag forecasting: use predicted values for further forecasting
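A minimal sketch of this setup, assuming a synthetic series and a plain least-squares predictor standing in for the BP net (the slides' gold-price data and trained network are not reproduced here); it shows how the lagged training samples are built and how one-lag and multilag forecasting differ.

```python
import numpy as np

def make_lagged_samples(series, k):
    """Build ((x_{t-k}, ..., x_{t-1}), x_t) training pairs from a 1-D time series."""
    X = np.array([series[t - k:t] for t in range(k, len(series))])
    y = np.array(series[k:])
    return X, y

def multilag_forecast(history, predict, k, steps):
    """Multilag forecasting: feed the model's own predictions back in as inputs."""
    window = list(history[-k:])
    out = []
    for _ in range(steps):
        x_next = predict(np.array(window[-k:]))
        out.append(x_next)
        window.append(x_next)
    return out

# Synthetic stand-in for the 100 monthly gold prices (the actual series is not given here).
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(0.0, 1.0, 100))

k = 2
train = prices[:90]                                   # 90 months for training, 10 held out
X_train, y_train = make_lagged_samples(train, k)

# A plain least-squares linear predictor, standing in for the trained BP net.
W = np.linalg.lstsq(np.c_[X_train, np.ones(len(X_train))], y_train, rcond=None)[0]
predict = lambda x: float(np.r_[x, 1.0] @ W)

one_lag = [predict(prices[t - k:t]) for t in range(90, 100)]   # uses actual past values
multilag = multilag_forecast(train, predict, k, steps=10)      # uses its own predictions
print(one_lag[:3], multilag[:3])
```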

Prediction Networks
Training:
– Three attempts: k = 2, 4, 6
– Learning rate = 0.3, momentum = 0.6
– 25,000 – 50,000 epochs
– The 2-2-1 net gave good predictions
– The two larger nets over-trained
Results
Network      MSE (Training)   MSE (one-lag)   MSE (multilag)
k = 2 net
k = 4 net
k = 6 net                                     0.0176

Prediction Networks
Generic NN model for prediction
– Preprocessor prepares training samples from time-series data
– Train the predictor using those samples (e.g., by BP learning)
Preprocessor
– In the previous example, let k = d + 1 (the previous d + 1 data points are used for prediction)
– More general: the preprocessor combines past data using kernel functions c_i; different kernels give different memory models (how previous data are remembered)
  Examples: exponential trace memory; gamma memory (see p. 141)
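As an illustration, here is a small preprocessor for one common form of exponential trace memory; the transcript does not give the exact kernel, so the standard recursion x̄(t) = (1 – μ)·x(t) + μ·x̄(t – 1) is assumed.

```python
import numpy as np

def exponential_trace(series, mu):
    """One common form of exponential trace memory:
    x_bar(t) = (1 - mu) * x(t) + mu * x_bar(t - 1).
    Recent values dominate when mu is small; older values linger when mu is near 1."""
    out = np.empty(len(series))
    acc = 0.0
    for t, x in enumerate(series):
        acc = (1.0 - mu) * x + mu * acc
        out[t] = acc
    return out

# A preprocessor can feed several such traces (different mu) to the predictor
# instead of raw lagged values.
series = np.sin(np.linspace(0, 6, 50))
features = np.stack([exponential_trace(series, mu) for mu in (0.2, 0.5, 0.8)], axis=1)
print(features.shape)   # (50, 3): three memory traces per time step
```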

Prediction Networks
Recurrent NN architecture
– Cycles in the net
  Output nodes with connections back to hidden/input nodes
  Connections between nodes in the same layer
  A node may connect to itself
– Each node receives external input as well as input from other nodes
– Each node may be affected by the output of every other node
– With a given external input vector, the net often converges to an equilibrium state after a number of iterations (the output of every node stops changing)
An alternative NN model for function approximation
– Fewer nodes, but more flexible/complicated connections
– Learning is often more complicated
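A minimal sketch of running such a net to equilibrium, assuming sigmoid node functions and arbitrary small random weights (convergence is not guaranteed for every weight matrix).

```python
import numpy as np

def run_to_equilibrium(W, x_ext, steps=200, tol=1e-6):
    """Iterate a fully connected recurrent net until node outputs stop changing.
    Each node sees the external input plus the previous outputs of all nodes."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    y = np.zeros(W.shape[0])
    for _ in range(steps):
        y_new = sigmoid(W @ y + x_ext)
        if np.max(np.abs(y_new - y)) < tol:   # equilibrium reached
            return y_new
        y = y_new
    return y                                  # may not have converged

rng = np.random.default_rng(1)
W = rng.normal(0, 0.5, (3, 3))    # 3 nodes, every node connected to every node (and itself)
x_ext = np.array([0.2, -0.1, 0.4])
print(run_to_equilibrium(W, x_ext))
```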

Prediction Networks
Approach I: unfolding to a feedforward net
– Each layer represents one time delay of the network's evolution
– Weights in different layers are identical
– Cannot directly apply BP learning (because weights in different layers are constrained to be identical)
– How many layers to unfold to? Hard to determine
[Figure: a fully connected net of 3 nodes and its equivalent feedforward net of k layers]
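A sketch of the equivalence, assuming sigmoid nodes: running the recurrent net for k steps gives the same result as a forward pass through k unfolded layers that all share one weight matrix.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def recurrent_forward(W, x_ext, k):
    """k iterations of a fully connected recurrent net with external input x_ext."""
    y = np.zeros(W.shape[0])
    for _ in range(k):
        y = sigmoid(W @ y + x_ext)
    return y

def unfolded_forward(layers, x_ext):
    """Forward pass through the unfolded feedforward net; `layers` is a list of
    k weight matrices that are all the same object (shared, identical weights)."""
    y = np.zeros(layers[0].shape[0])
    for W_l in layers:               # each layer corresponds to one time delay
        y = sigmoid(W_l @ y + x_ext)
    return y

rng = np.random.default_rng(2)
W = rng.normal(0, 0.5, (3, 3))
x = np.array([0.3, -0.2, 0.1])
layers = [W] * 5                     # k = 5 unfolded layers, all sharing the same weights
assert np.allclose(recurrent_forward(W, x, 5), unfolded_forward(layers, x))
```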

Prediction Networks
Approach II: gradient descent
– A more general approach
– Error driven: for a given external input, minimize the error between the network's (equilibrium) output and the desired output
– Weight update by gradient descent on this error
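A rough sketch of an error-driven weight update for a small recurrent net; the gradient of the equilibrium error is estimated numerically here, standing in for the analytic recurrent-learning update rule the slides refer to.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def equilibrium_output(W, x_ext, iters=100):
    """Approximate equilibrium outputs by iterating the recurrent net."""
    y = np.zeros(W.shape[0])
    for _ in range(iters):
        y = sigmoid(W @ y + x_ext)
    return y

def error(W, x_ext, d):
    """Squared error between the equilibrium output and the desired output d."""
    y = equilibrium_output(W, x_ext)
    return 0.5 * np.sum((d - y) ** 2)

def gradient_step(W, x_ext, d, lr=0.5, eps=1e-5):
    """One error-driven update; the gradient is estimated by central differences."""
    g = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            Wp, Wm = W.copy(), W.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            g[i, j] = (error(Wp, x_ext, d) - error(Wm, x_ext, d)) / (2 * eps)
    return W - lr * g

rng = np.random.default_rng(3)
W = rng.normal(0, 0.3, (3, 3))
x_ext, d = np.array([0.5, -0.5, 0.2]), np.array([0.9, 0.1, 0.5])
for _ in range(100):
    W = gradient_step(W, x_ext, d)
print(error(W, x_ext, d))   # error should have decreased from its initial value
```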

NN of Radial Basis Functions
Motivation: better performance than sigmoid node functions on
– Some classification problems
– Function interpolation
Definition
– A function is radially symmetric (an RBF) if its output depends only on the distance between the input vector and a stored vector associated with that function
– Its output is therefore a function of the distance ||x – μ||, where μ is the stored (center) vector
– NNs with RBF node functions are called RBF nets

NN of Radial Basis Functions
Gaussian function: the most widely used RBF
– A bell-shaped function centered at u = 0
– Continuous and differentiable
Other RBFs
– Inverse quadratic function, hyperspheric function, etc.
[Plots: Gaussian, inverse quadratic, and hyperspheric functions, each centered at μ]
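Hedged sketches of these three node functions; the exact widths and constants are assumptions (common textbook forms), not values given in the slides.

```python
import numpy as np

def gaussian_rbf(x, center, sigma=1.0):
    """Bell-shaped RBF: output depends only on the distance ||x - center||."""
    u = np.linalg.norm(np.asarray(x, float) - np.asarray(center, float))
    return np.exp(-u**2 / (2.0 * sigma**2))

def inverse_quadratic_rbf(x, center, c=1.0):
    """One common inverse quadratic form: 1 / (u^2 + c^2)."""
    u = np.linalg.norm(np.asarray(x, float) - np.asarray(center, float))
    return 1.0 / (u**2 + c**2)

def hyperspheric_rbf(x, center, radius=1.0):
    """Hard-limited RBF: 1 inside a hypersphere of the given radius, 0 outside."""
    u = np.linalg.norm(np.asarray(x, float) - np.asarray(center, float))
    return 1.0 if u <= radius else 0.0

center = np.array([1.0, 1.0])
for f in (gaussian_rbf, inverse_quadratic_rbf, hyperspheric_rbf):
    print(f.__name__, f([1.2, 0.9], center))
```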

NN of Radial Basis Functions
Pattern classification
– 4 or 5 sigmoid hidden nodes are required for a good classification
– Only 1 RBF node is required if its node function can approximate the circular class region
[Figure: examples of one class clustered inside a circle, examples of the other class outside it]

NN of Radial Basis Functions
XOR problem
– 2-2-1 network
  2 hidden nodes are RBF nodes, centered at t_1 and t_2
  Output node can be step or sigmoid
– When input x is applied
  Each hidden node calculates the distance from x to its center, then its output
  All weights to the hidden nodes are set to 1
  Weights to the output node are trained by LMS
  t_1 and t_2 can also be trained
[Figure: the four XOR inputs (0, 0), (0, 1), (1, 0), (1, 1) in input space and their images in the hidden-node output space]
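A sketch of this 2-2-1 RBF solution to XOR, assuming Gaussian hidden nodes with the customary centers t1 = (1, 1) and t2 = (0, 0); the slide's exact node function and center values are not shown here.

```python
import numpy as np

# Hidden RBF nodes: a plain Gaussian of the squared distance to each center.
t1, t2 = np.array([1.0, 1.0]), np.array([0.0, 0.0])
phi = lambda x, t: np.exp(-np.sum((x - t) ** 2))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0.0, 1.0, 1.0, 0.0])              # XOR targets

# Hidden-layer outputs (all input-to-hidden weights fixed at 1, as in the slide),
# plus a bias column for the output node.
H = np.array([[phi(x, t1), phi(x, t2), 1.0] for x in X])

# LMS (delta-rule) training of the output-node weights.
w = np.zeros(3)
eta = 0.1
for _ in range(2000):
    for h, target in zip(H, d):
        y = h @ w                                # linear output node
        w += eta * (target - y) * h

pred = (H @ w > 0.5).astype(int)
print(pred)                                      # [0 1 1 0]: XOR is solved
```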

NN of Radial Basis Functions
Function interpolation
– Suppose you know f(x_1) and f(x_2) and want to approximate f(x_0), for x_1 < x_0 < x_2, by linear interpolation:
  f(x_0) ≈ f(x_1) + (x_0 – x_1)(f(x_2) – f(x_1)) / (x_2 – x_1)
– Let D_1 and D_2 be the distances of x_0 from x_1 and x_2; then
  f(x_0) ≈ (D_1^{-1} f(x_1) + D_2^{-1} f(x_2)) / (D_1^{-1} + D_2^{-1})
  i.e., a sum of the known function values, weighted and normalized by (inverse) distances
– Generalizes to interpolating from more than 2 known f values
  Only those with a small distance to x_0 are useful
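A small sketch of this distance-weighted interpolation, using the inverse-distance weighting written above.

```python
import numpy as np

def inverse_distance_interp(x0, xs, fs, eps=1e-12):
    """Approximate f(x0) as a sum of known values f(x_i), weighted and
    normalized by inverse distances D_i = |x0 - x_i|."""
    xs, fs = np.asarray(xs, float), np.asarray(fs, float)
    D = np.abs(xs - x0)
    if np.any(D < eps):                    # x0 coincides with a known sample
        return fs[np.argmin(D)]
    w = 1.0 / D
    return np.sum(w * fs) / np.sum(w)

# With exactly two neighbors this reproduces ordinary linear interpolation:
print(inverse_distance_interp(1.5, [1.0, 2.0], [10.0, 20.0]))   # 15.0
# With more known values, the near neighbors dominate the estimate:
print(inverse_distance_interp(1.5, [0.0, 1.0, 2.0, 5.0], [0.0, 10.0, 20.0, 50.0]))
```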

NN of Radial Basis Functions
Example
– 8 samples with known function values
– f(x) can be interpolated using only the 4 nearest neighbors of x
Using RBF nodes to achieve the neighborhood effect
– One hidden node per sample x_i, centered at x_i
– Network output for approximating f(x) is proportional to the known values f(x_i), weighted by the RBF outputs (distant samples contribute almost nothing)
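A sketch of the RBF version of this idea, assuming Gaussian hidden nodes (one per stored sample) and a normalized weighted sum; the width sigma is an illustrative choice.

```python
import numpy as np

def rbf_interpolate(x0, xs, fs, sigma=1.0):
    """Normalized-RBF view: one Gaussian hidden node per stored sample x_i;
    the output weights the f(x_i) by each node's response to x0, so only
    samples near x0 contribute noticeably."""
    xs, fs = np.asarray(xs, float), np.asarray(fs, float)
    phi = np.exp(-((xs - x0) ** 2) / (2.0 * sigma**2))   # one hidden node per sample
    return np.sum(phi * fs) / np.sum(phi)

xs = np.linspace(0.0, 7.0, 8)           # 8 samples with known function values
fs = np.sin(xs)
print(rbf_interpolate(2.3, xs, fs, sigma=0.5), np.sin(2.3))
```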

NN of Radial Basis Functions
Clustering samples
– Too many hidden nodes when the number of samples is large
– Group similar samples into N clusters, each with
  A center vector
  A desired mean output
– Network output: a weighted combination of the cluster RBF responses
– Suppose we know how to determine N and how to cluster all P samples (not an easy task in itself); the remaining parameters can then be determined by learning
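A sketch of the clustering step, using k-means as one possible way to choose the N centers and per-cluster mean outputs (the slides leave the clustering method open).

```python
import numpy as np

def kmeans(X, N, iters=50, seed=0):
    """A small k-means routine: one common way to pick N cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), N, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(N):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

rng = np.random.default_rng(1)
X = rng.normal(0, 1, (200, 2))                 # P = 200 samples, 2-D inputs
d = np.sin(X[:, 0]) + X[:, 1] ** 2             # illustrative target values

N = 10
centers, labels = kmeans(X, N)
# Desired mean output per cluster: the average target of the samples it contains.
mean_output = np.array([d[labels == j].mean() if np.any(labels == j) else 0.0
                        for j in range(N)])
print(centers.shape, mean_output.shape)
```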

NN of Radial Basis Functions
Learning in RBF nets
– Objective: learn the output weights and RBF centers that minimize the squared error between network outputs and desired outputs
– Gradient descent approach: update both the output weights and the centers
– One can also obtain the centers by other clustering techniques, then use GD learning for the output weights only
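A sketch of the second option named in the slide: centers fixed in advance (e.g., from clustering), with gradient descent used only for the output weights. Gaussian nodes and the chosen width are assumptions.

```python
import numpy as np

def rbf_design(X, centers, sigma):
    """Hidden-layer responses: Gaussian of the distance to each center, plus a bias column."""
    d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2.0 * sigma**2))
    return np.c_[Phi, np.ones(len(X))]

def train_output_weights(Phi, d, lr=0.05, epochs=500):
    """Batch gradient descent on the squared error, with the centers held fixed."""
    w = np.zeros(Phi.shape[1])
    for _ in range(epochs):
        err = d - Phi @ w
        w += lr * Phi.T @ err / len(d)     # gradient step on the output weights only
    return w

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, (200, 1))
d = np.sin(X[:, 0])

centers = np.linspace(-3, 3, 10).reshape(-1, 1)     # fixed centers (could come from clustering)
Phi = rbf_design(X, centers, sigma=0.7)
w = train_output_weights(Phi, d)
print(np.mean((d - Phi @ w) ** 2))                  # training MSE after gradient descent
```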

Polynomial Networks
Polynomial networks
– Node functions allow direct computation of polynomials of the inputs
– Approximate higher-order functions with fewer nodes (even without hidden nodes)
– Each node has more connection weights
Higher-order networks
– A node weights products of inputs up to a given order, so the number of weights per node grows combinatorially with the input dimension and the order
– Can be trained by LMS
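A sketch of a single higher-order node trained by LMS, with quadratic terms as an illustrative order-2 case (the feature construction and target are chosen for illustration).

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(x, order=2):
    """All products of inputs up to the given order (plus a constant term):
    the terms a single higher-order node weights directly."""
    x = np.asarray(x, float)
    feats = [1.0]
    for k in range(1, order + 1):
        for idx in combinations_with_replacement(range(len(x)), k):
            feats.append(np.prod(x[list(idx)]))
    return np.array(feats)

# LMS training of one second-order node on a quadratic target.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (300, 2))
d = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 0] * X[:, 1]     # target polynomial

Phi = np.array([poly_features(x, order=2) for x in X])
w = np.zeros(Phi.shape[1])
eta = 0.1
for _ in range(50):
    for h, target in zip(Phi, d):
        w += eta * (target - h @ w) * h                # LMS / delta rule
print(np.round(w, 2))   # approximately [1, 2, 0, 0, -3, 0] for this target
```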

Polynomial Networks
Sigma-pi networks
– Node output is a weighted sum of products of (distinct) inputs
– Do not allow terms with higher powers of the inputs, so they are not general function approximators
– Number of weights per node grows with the number of input subsets used
– Can be trained by LMS
Pi-sigma networks
– One hidden layer of Sigma units: each computes a weighted sum of the inputs
– Output nodes with a Pi function: each computes the product of its hidden-unit sums
Product units
– Node computes a product of inputs raised to integer powers
– The integer powers P_{j,i} can be learned
– Often mixed with other units (e.g., sigmoid)
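Minimal sketches of a pi-sigma forward pass and a product unit, with arbitrary illustrative weights and powers.

```python
import numpy as np

def pi_sigma_forward(x, W, b):
    """Pi-sigma net: hidden Sigma units form weighted sums of the inputs,
    and the output Pi unit multiplies those sums together."""
    sums = W @ x + b          # one weighted sum per hidden Sigma unit
    return np.prod(sums)      # Pi output node: product of the sums

def product_unit(x, P):
    """Product unit: multiplies inputs raised to (learnable) integer powers P_i."""
    return np.prod(np.asarray(x, float) ** np.asarray(P))

x = np.array([0.5, 2.0, 1.5])
W = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, -1.0]])      # 2 hidden Sigma units (weights chosen arbitrarily)
b = np.array([0.1, -0.2])
print(pi_sigma_forward(x, W, b))      # (0.5 + 1.5 + 0.1) * (4.0 - 1.5 - 0.2)
print(product_unit(x, [2, 1, 0]))     # 0.5^2 * 2.0 = 0.5
```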