Pseudoinverse Learning Algorithm for Feedforward Neural Networks
Guo, Ping
Department of Computer Science & Engineering, The Chinese University of Hong Kong

Presentation transcript:

1 Pseudoinverse Learning Algorithm for Feedforward Neural Networks
Guo, Ping
Department of Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong
Supervisor: Professor Michael Lyu
Markers: Professor L.W. Chan and Professor I. King
June 11, 2015

2 Introduction
- Feedforward Neural Network
  - Widely used for pattern classification and universal approximation
  - Supervised learning task
  - Back-propagation (BP) algorithm commonly used to train the network
  - Poor convergence rate and local minima problem
  - Learning-factor problem (learning rate, momentum constant)
  - Time-consuming computation for some tasks with BP
- Pseudoinverse Learning (PIL) Algorithm
  - Batch-mode learning
  - Uses only matrix inner products and pseudoinverse operations

3 Network Structure (a)
- Multilayer neural network (mathematical expression)
- Input matrix, output (target) matrix
- Connection weight matrices
- Nonlinear activation function
- Network mapping function (with two hidden layers)

4 Network Structure (b)
- Multilayer neural network (mathematical expression)
- Denote the l-th layer output
- Network output
- Goal: find the weight matrices from the training data set
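The mathematical expressions on slides 3 and 4 were slide images and are missing from this transcript. As a hedged sketch, using notation assumed from the PIL literature rather than taken from the slides (input matrix $X$, target matrix $O$, $l$-th layer output $Y^l$, weight matrices $W^l$, nonlinear activation $\sigma$), the layer-wise mapping can be written as

\[
Y^0 = X, \qquad Y^l = \sigma\!\left(Y^{l-1} W^{l-1}\right)\ \ (l = 1, \dots, L), \qquad G(X) = Y^L W^L \approx O,
\]

and the learning problem is to find the weight matrices $W^0, \dots, W^L$ from the training data so that the network output approximates the target matrix.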

5 Pseudoinverse Solution (a)
- Existence of the solution
  - Linear algebra theorem
- Best approximation solution (theorem)
  - The best approximate solution of the linear weight equation is the pseudoinverse solution

6 Pseudoinverse Solution (b)
- Minimize the error function
- Learning task
  - If Y is of full rank, the above equation holds exactly
  - The learning task therefore becomes raising the rank of Y
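The error function on this slide was also an image. A hedged reconstruction consistent with the pseudoinverse solution on slide 5, with assumed symbols $Y$ for the last hidden-layer output, $W$ for the output weights, $O$ for the targets, and $N$ for the number of samples:

\[
E = \frac{1}{N}\,\bigl\| Y W - O \bigr\|_F^2 , \qquad W = Y^{+} O .
\]

With $W = Y^{+}O$, the error vanishes exactly when $Y Y^{+} O = O$, which is guaranteed when $Y$ has full row rank; hence the learning task of raising the rank of $Y$.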

7 Pseudoinverse Learning Algorithm
1. Initialize the first layer's output with the training input.
2. Compute its pseudoinverse.
3. Test the stopping criterion: if satisfied, go to step 6; if not, go to the next step.
4. Set the current layer's weight matrix, feed the result as input to the next layer, and compute that layer's output.
5. Compute the pseudoinverse of the new layer output and go to step 3.
6. Set the final (output) weight matrix.
7. Stop training. The real network output is the final layer's mapping of the input.
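The per-step formulas were slide images. The sketch below reconstructs the procedure as PIL is usually described (hidden weights set to the pseudoinverse of the previous layer's output, output weights set to Y+O); the function and parameter names (pil_train, pil_predict, sigma, tol, max_layers) are illustrative, not from the slides.

```python
import numpy as np

def pil_train(X, O, tol=1e-6, max_layers=10, sigma=np.tanh):
    """Hedged sketch of the Pseudoinverse Learning (PIL) algorithm.

    X: (N, d) input matrix, O: (N, m) target matrix.
    Returns a list of weight matrices, one per layer.
    """
    Y = X                                    # step 1: current layer output starts as the input
    weights = []
    for _ in range(max_layers):
        Y_pinv = np.linalg.pinv(Y)           # steps 2/5: pseudoinverse of the layer output
        err = np.linalg.norm(Y @ (Y_pinv @ O) - O) ** 2 / len(O)
        if err <= tol:                       # step 3: criterion met -> go to final layer
            break
        W = Y_pinv                           # step 4: hidden weights = pseudoinverse
        weights.append(W)
        Y = sigma(Y @ W)                     # propagate to the next layer
    weights.append(np.linalg.pinv(Y) @ O)    # step 6: output weights = Y+ O
    return weights                           # step 7: network output is Y @ weights[-1]

def pil_predict(X, weights, sigma=np.tanh):
    """Forward pass matching pil_train's layer construction."""
    Y = X
    for W in weights[:-1]:
        Y = sigma(Y @ W)
    return Y @ weights[-1]
```

In this formulation each added hidden layer has as many units as there are training samples, which is what allows the rank of the layer output to be raised toward N.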

8 Add and Delete Samples (a): Efficient Computation
- Greville's theorem
- Adding a sample: compute the k-th pseudoinverse matrix from the (k-1)-th one
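The slide's own formulas were images; below is the standard statement of Greville's recursion for appending a column $a_k$ to $A_{k-1}$, giving $A_k = [\,A_{k-1}\ \ a_k\,]$ (adding a training sample adds a row to $Y$, i.e. a column to $Y^{\mathsf T}$, so the same recursion applies through $(Y^{\mathsf T})^{+} = (Y^{+})^{\mathsf T}$):

\[
A_k^{+} =
\begin{bmatrix}
A_{k-1}^{+} - d_k b_k^{\mathsf T} \\
b_k^{\mathsf T}
\end{bmatrix},
\qquad
d_k = A_{k-1}^{+} a_k, \qquad c_k = a_k - A_{k-1} d_k,
\]
\[
b_k^{\mathsf T} =
\begin{cases}
c_k^{\mathsf T} / \bigl(c_k^{\mathsf T} c_k\bigr), & c_k \neq 0,\\[2pt]
\bigl(1 + d_k^{\mathsf T} d_k\bigr)^{-1} d_k^{\mathsf T} A_{k-1}^{+}, & c_k = 0 .
\end{cases}
\]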

9 Add and Delete Samples (b): Efficient Computation
- Deleting a sample: compute the k-th pseudoinverse matrix from the (k+1)-th one
- Bordering algorithm

10 Numerical Examples (a): Function Mapping
(1) sin(x) (smooth function)
(2) Nonlinear function: 8-D input, 3-D output
(3) Smooth function
(4) Piecewise smooth function
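As a toy usage example tying Example 1 to the hedged pil_train / pil_predict sketch given after slide 7 (sample counts follow Table 1 on the next slide; the code and its output are illustrative, not the experiments reported here):

```python
import numpy as np

# Example 1 from the slides: learn sin(x) on [0, 2*pi]
# with 20 training samples and 100 test samples.
x_train = np.linspace(0, 2 * np.pi, 20).reshape(-1, 1)
x_test = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)

weights = pil_train(x_train, np.sin(x_train))
y_pred = pil_predict(x_test, weights)

rmse = np.sqrt(np.mean((y_pred - np.sin(x_test)) ** 2))
print(f"test RMSE: {rmse:.3e}")
```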

11 Numerical Examples (b): Function Mapping
Table 1: Generalization ability test results with 20 training samples and 100 test samples. Columns: input range, generalized error E, RMSE, maximum deviation. Rows: Example 1 (input range 0 to 2π), Example 2, Example 3 (input range 0 to π). (Numeric entries not recoverable from the transcript.)
Table 2: Generalization ability test results with 5 or 50 training samples and 100 test samples. Columns: input range, test no. N, generalized error E, RMSE, maximum deviation. Rows: Example 1 (input range 0 to 2π, errors on the order of 10^-5) and Example 4 (input range 0 to 2π). (Numeric entries not recoverable from the transcript.)

12 Numerical Examples (c): Function Mapping
[Four input-output plots: Example 1, Example 3, Example 4 with 5 training samples, and Example 4 with 20 training samples; "*" marks training data, "o" marks test data.]

13 Numerical Examples (d): Real-World Data Set
Software reliability growth model -- Sys1 data. Total of 54 samples, partitioned into 37 training samples and 17 test samples. "*" marks training data, "o" marks test data.

14 Numerical Examples (e): Real-World Data Set
Software reliability growth model -- Sys1 data. Stacked generalization test: the level-0 output is used as the level-1 input. "o" marks level-0 output, "+" marks level-1 output. Generalization is poor.

15 Discussion
- Local minima can be avoided by suitable initialization.
- No user-selected parameters, so the "learning factor" problem is avoided.
- A differentiable activation function is not necessary.
- Batch-mode learning; training is fast.
- Provides an effective way to investigate some computation-intensive techniques.
- Further work: techniques for improving generalization when noisy data are present.

16 Thanks
End of presentation. Q & A.
June 11, 2015