Additional NN Models Reinforcement learning (RL) Basic ideas:

Additional NN Models
Reinforcement learning (RL): basic ideas
–Supervised learning (delta rule, BP): samples (x, f(x)) are used to learn the function f(.); a precise error can be determined and is used to drive the learning.
–Unsupervised learning (competitive, BM): no target/desired output is provided to help learning; learning is self-organized (clustering).
–Reinforcement learning: in between the two. No target output is given for the input vectors in the training samples, but a judge/critic evaluates the output:
  good: reward signal (+1)
  bad: penalty signal (-1)

RL exists in many places
–Originated from psychology (animal training).
–In the machine learning community there are different theories and algorithms.
–Major difficulty: credit/blame distribution
  chess playing: W/L (multi-step)
  soccer playing: W/L (multi-player)
–In many applications it is much easier to determine good/bad, right/wrong, acceptable/unacceptable than to provide a precise correct answer/error. It is up to the learning process to improve the system's performance based on the critic's signal.

Principle of RL
Let r = +1 denote reward (good output) and r = -1 denote penalty (bad output).
–If r = +1, the system is encouraged to continue what it is doing.
–If r = -1, the system is encouraged not to do what it is doing. It then needs to search for a better output, because r = -1 does not indicate what the good output should be; a common method is random search.
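The reward/penalty-driven random search can be sketched in a few lines of Python. This is a toy illustration: the critic, the hidden target, and all parameter values below are hypothetical, not from the slides. Only the critic sees the target; the learner only ever receives +1 or -1.

```python
import random

def rl_random_search(critic, output, trials=500, step=0.1):
    """Improve `output` using only a +1/-1 critic signal (random search)."""
    for _ in range(trials):
        # propose a random perturbation of the current output
        cand = [o + random.gauss(0, step) for o in output]
        if critic(cand, output) == +1:   # reward: keep doing this
            output = cand                # penalty: discard and search again
    return output

# hypothetical critic: it alone knows the target and only answers good/bad
target = [0.3, -0.7]
def err(y):
    return sum((a - b) ** 2 for a, b in zip(y, target))
def critic(cand, cur):
    return +1 if err(cand) < err(cur) else -1
```

Starting from `[0.0, 0.0]`, repeated perturbations accepted only on reward move the output toward the target, even though a precise error is never revealed to the learner.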

ARP: the associative reward-and-penalty algorithm for NN (Barto and Anandan, 1985)
Architecture (the critic evaluates the output): input x(k), stochastic units z(k) for random search, output y(k).

Random search by stochastic units zi
–either let zi take its value with a probability given by a continuous distribution function of its net input (e.g., P(zi = 1) = sigmoid(neti)),
–or let zi = f(neti + η), where η is a random noise that obeys a certain distribution.
Key: z is not a deterministic function of x; this gives z a chance to be a good output.
Prepare a desired output (temporary).

Compute the errors at the z layer: e(k) = d(k) - E(z(k)), where E(z(k)) is the expected value of z(k), because z is a random variable.
How to compute E(z(k)):
–take the average of z over a period of time, or
–compute it from the distribution, if possible; if the logistic sigmoid function is used, E(z(k)) = sigmoid(net(k)).
Training: BP or another method to minimize the error.
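A minimal sketch of one stochastic 0/1 unit and its error, assuming the logistic sigmoid firing probability described above. Taking the complement of z as the temporary desired output in the penalized case is one common choice, not spelled out on the slide.

```python
import math, random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def sample_z(net):
    # stochastic unit: outputs 1 with probability sigmoid(net), else 0
    return 1 if random.random() < sigmoid(net) else 0

def arp_error(z, r, net):
    # temporary desired output d: z itself if rewarded (r = +1),
    # its complement if penalized (r = -1)  [assumed rule]
    d = z if r == +1 else 1 - z
    # error e = d - E(z), with E(z) = sigmoid(net) for a 0/1 unit
    return d - sigmoid(net)
```

For example, a unit with net = 0 has E(z) = 0.5; if it fired (z = 1) and was rewarded, the error is +0.5, pushing the net input up so that firing becomes more likely.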

(II) Probabilistic Neural Networks
1. Purpose: classify a given input pattern x into one of the pre-defined classes by the Bayesian decision rule.
Suppose there are k predefined classes s1, …, sk:
–P(si): prior probability of class si
–P(x|si): conditional probability of x, given class si
–P(x): probability of x
–P(si|x): posterior probability of si, given x
Example: S = s1 U s2 U … U sk, the set of all patients; si: the set of all patients having disease i; x: a description of a patient (manifestation).

P(x|si): probability that one with disease i will have description x.
P(si|x): probability that one with description x will have disease i.
By Bayes' theorem: P(si|x) = P(x|si) P(si) / P(x), where P(x) = Σj P(x|sj) P(sj).
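The theorem is easy to check numerically. The priors and likelihoods below are made-up values for a two-disease version of the patient example:

```python
# hypothetical numbers: P(si) and P(x|si) for one observed description x
priors     = {"s1": 0.1, "s2": 0.9}
likelihood = {"s1": 0.8, "s2": 0.2}

# P(x) = sum_j P(x|sj) P(sj)
px = sum(priors[s] * likelihood[s] for s in priors)

# P(si|x) = P(x|si) P(si) / P(x)
posterior = {s: priors[s] * likelihood[s] / px for s in priors}
```

Note that although disease s1 fits the description much better (0.8 vs 0.2), its low prior keeps its posterior at 0.08/0.26, roughly 0.31; the priors matter as much as the fit.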

2. PNN architecture: feedforward with 2 hidden layers; learning is not used to minimize an error but to obtain P(x|si).
3. Learning
Assumption: the P(si) are known, and the P(x|si) obey Gaussian distributions; P(x|si) is estimated from the stored training patterns of class si.
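A compact sketch of the resulting classifier, assuming (as the slide does) Gaussian P(x|si) estimated from the stored training patterns of each class, here as an average of Gaussian kernels centered at those patterns. The data, σ, and priors are placeholders:

```python
import numpy as np

def pnn_classify(x, patterns, priors, sigma=0.5):
    """patterns: class -> array of stored training vectors (one pattern unit each)."""
    scores = {}
    for c, X in patterns.items():
        d2 = np.sum((X - x) ** 2, axis=1)
        # P(x|c) estimated as an average of Gaussian kernels at the samples
        scores[c] = priors[c] * np.mean(np.exp(-d2 / (2 * sigma ** 2)))
    return max(scores, key=scores.get)   # Bayesian decision: argmax P(c) P(x|c)
```

Each stored pattern plays the role of one unit in the first hidden layer; the per-class averaging is the second hidden (summation) layer.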

4. Comments:
(1) Bayesian classification: choose the class si that maximizes P(si) P(x|si).
(2) Fast classification (especially if implemented on a parallel machine).
(3) Fast learning.
(4) Trades nodes for time (not good with large training samples/clusters).

(III) Recurrent BP
1. Recurrent networks: networks with feedback links.
–The state (output) of the network evolves along time.
–May or may not have hidden nodes.
–May or may not stabilize as t grows.
–How to learn W so that an initial state (input) will lead to a stable state with the desired output?
2. Unfolding
For any recurrent network with finite evolution time, there is an equivalent feedforward network.
Problems: too many repetitions (too many layers) when the network needs a long time to reach a stable state; standard BP needs to be modified to keep the duplicated weights identical (weight sharing).
3. Recurrent BP (1987)
System: assume at least one fixed point exists for the system with the given initial state; when a fixed point is reached, the error can be obtained (e.g., E = ½ Σk (dk - yk)² over the output nodes).

Take the gradient descent approach to minimize E by updating W. Direct derivation of ∂E/∂wij at the fixed point requires inverting a matrix that depends on that fixed point.

Computing this inverse directly is very time consuming. Pineda's and Almeida's proposal: the needed quantities can be computed by another recurrent net with a structure identical to the original RN, except that the direction of each arc is reversed (the transposed network). If the weight from node j to node i in the original network is Wij, then the weight from node j to node i in the transposed network is Wji.

Weight-update procedure for RBP, with a given input and its desired output:
1. Relax the original network to a fixed point.
2. Compute the error.
3. Relax the transposed network to a fixed point.
4. Update the weights of the original network.
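The four steps can be sketched as follows. This is a generic fixed-point formulation of recurrent BP in my own notation (sigmoid units relaxed via x = sigmoid(Wx + I), error only at the output nodes); it is an illustrative reconstruction, not taken verbatim from the slides:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def rbp_update(W, I, d, out, lr=0.1, steps=300):
    """One RBP weight update: relax, compute error, relax transposed net, update."""
    n = len(I)
    # 1. relax the original network x = sigmoid(W x + I) to a fixed point
    x = np.zeros(n)
    for _ in range(steps):
        x = sigmoid(W @ x + I)
    # 2. error at the output node(s) only
    e = np.zeros(n)
    e[out] = d - x[out]
    # 3. relax the transposed network (same weights, arcs reversed)
    s = x * (1.0 - x)                  # sigmoid'(net) at the fixed point
    v = np.zeros(n)
    for _ in range(steps):
        v = e + W.T @ (s * v)
    # 4. gradient descent: dE/dw_ij = -v_i s_i x_j, so step along +v_i s_i x_j
    return W + lr * np.outer(v * s, x), x
```

With small weights both relaxations are contractions and converge quickly; the resulting update matches a finite-difference check of E = ½(d - x[out])², which is a good way to validate any reimplementation.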

The complete learning algorithm
–Incremental/sequential: W is updated upon the presentation of each learning pair, using the weight-update procedure.
–To ensure the learned network is stable, the learning rate must be small (much smaller than the rate for standard BP learning).
–Time consuming: two relaxation processes are involved for each step of the weight update.
–Better performance than BP in some applications.

(IV) Networks of radial basis functions
1. Motivations: better function approximation.
–BP networks (hidden units are sigmoid): training time is very long; generalization (with non-training inputs) is not always good.
–Counterpropagation (hidden units are WTA): poor approximation, especially with interpolation; any input is forced to be classified into one class, which in turn produces that class's output as its function value.

2. Architecture inputhiddenoutput(similar to BP and CPN) operation/learning: similar to CPN inputhidden: competitive learning for class character hiddenoutput delta rule(LMS error) for mapping difference: hidden units obey Radial Basis function 3. Hidden unit: Gaussian function suppose unit I represent a class of inputs with centroid

Radial basis function: input vectors with equal distance to Ci will have the same output. Each hidden unit i has a receptive field with Ci as its center:
–if x = Ci, unit i has the largest output;
–the farther x is from Ci, the smaller the output;
–the size of the receptive field is determined by σi.
During computation, hidden units are not WTA (no lateral inhibition); with an input x, usually more than one hidden unit has non-zero output. These outputs are combined at the output layer to produce a better approximation.

4. Learning inputhidden Ci: competitive, based on neti : ad hoc(performance not sensitive to hiddenoutput delta rule(LMS)

5. Comments
Compared with BP:
–approximates any L2 function (same as BP), and may have a better approximation;
–usually requires many more training samples and many more hidden units;
–only one hidden layer is needed;
–training is faster.
Compared with CPN: much better function approximation.
Theoretical analysis is still only preliminary.