1
ECE 539 Project
Kalman Filter Based Algorithms for Fast Training of Multilayer Perceptrons: Implementation and Applications
Dan Li, Spring 2000
2
Introduction

Multilayer perceptron (MLP)
- A feedforward neural network model
- Extensively used in pattern classification
- Essential issue: the training/learning algorithm

MLP training algorithms
- Error backpropagation (EBP): a conventional iterative gradient algorithm; easy to implement, but with a long and uncertain training process
- S.T. algorithm, proposed by Scalero and Tepedelenlioglu [1], based on Kalman filter techniques
- Layer-by-layer (LBL) algorithm, a modified S.T. algorithm proposed by Wang and Chen [2], also based on Kalman filter techniques
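All three algorithms train the same two-layer M-H-N MLP sketched on the following slides. As a point of reference, here is a minimal forward-pass sketch in that notation (u, y for the hidden layer; v, z for the output layer); the logistic activations and the bias handling are assumptions, not stated on the slides:

```python
import numpy as np

def sigmoid(s):
    """Logistic activation; assumed for both Fh and Fo."""
    return 1.0 / (1.0 + np.exp(-s))

def forward(x, Wh, Wo):
    """Forward pass of an M-H-N MLP in the slides' notation.

    x  : (M+1,) input vector including the constant bias input 1.
    Wh : (H, M+1) hidden-layer weights.
    Wo : (N, H+1) output-layer weights.
    """
    u = Wh @ x              # hidden-layer summations u1..uH
    y = sigmoid(u)          # hidden-layer outputs y1..yH   (Fh)
    y1 = np.append(y, 1.0)  # append bias input for the output layer
    v = Wo @ y1             # output-layer summations v1..vN
    z = sigmoid(v)          # network outputs z1..zN        (Fo)
    return u, y, v, z
```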
3
EBP Algorithm

[Diagram: M-H-N MLP — inputs x1…xM plus a bias input 1, hidden summations u1…uH passed through Fh(.) to give y1…yH, output summations v1…vN passed through Fo(.) to give z1…zN]
[Equations shown for the hidden layer and for the output layer]
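The update equations on this slide were images and did not survive extraction. Below is the standard EBP-with-momentum form they almost certainly follow, written in the slide's notation; the symbols μ (learning rate) and α (momentum factor) are my choices, since the slide's own symbols were lost:

```latex
% Output layer: error term and weight update (hidden unit h -> output n)
\delta_n^{o} = (t_n - z_n)\, F_o'(v_n), \qquad
\Delta w_{nh}^{o}(k) = \mu\, \delta_n^{o}\, y_h + \alpha\, \Delta w_{nh}^{o}(k-1)

% Hidden layer: back-propagated error term and weight update (input m -> hidden h)
\delta_h^{h} = F_h'(u_h) \sum_{n=1}^{N} \delta_n^{o}\, w_{nh}^{o}, \qquad
\Delta w_{hm}^{h}(k) = \mu\, \delta_h^{h}\, x_m + \alpha\, \Delta w_{hm}^{h}(k-1)
```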
4
S.T. Algorithm

[Diagram: the same M-H-N MLP; targets t1…tN are compared with the outputs to form the error e, and desired summations v1*…vN* (output layer) and u1*…uH* (hidden layer) drive the weight updates]
[Equations shown for the hidden layer and for the output layer]
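The S.T. equations were likewise images. In Scalero and Tepedelenlioglu's scheme each node's weight vector is fit to its desired summation by a Kalman (RLS) recursion rather than a gradient step. A hedged sketch of that recursion, with b the forgetting factor and μ the learning rate (my symbols, since the slide's were lost):

```latex
% Desired output-layer summation from the target, via the inverse activation
v_n^{*} = F_o^{-1}(t_n)

% Kalman gain and inverse-correlation update for one node with input vector a
g(k) = \frac{P(k-1)\,a(k)}{b + a^{T}(k)\,P(k-1)\,a(k)}, \qquad
P(k) = \frac{1}{b}\Bigl[P(k-1) - g(k)\,a^{T}(k)\,P(k-1)\Bigr]

% Weight update toward the node's desired summation d*
w(k) = w(k-1) + \mu\, g(k)\,\bigl[d^{*}(k) - a^{T}(k)\,w(k-1)\bigr]
```

Here a is the node's input (x for a hidden node, y for an output node) and d* its desired summation (u* or v*); the hidden-layer u*, per the diagram, is derived by propagating the output error back through the network as in EBP.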
5
LBL Algorithm

[Diagram: the same M-H-N MLP; the desired output summations v1*…vN* are mapped back through the output weights and the inverse activation F⁻¹h(.) to give desired hidden outputs y1*…yH* and desired hidden summations u1*…uH*]
[Equations shown for the hidden layer and for the output layer]
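Reconstructing from the surviving diagram labels (F⁻¹h, y*, u*): in Wang and Chen's layer-by-layer scheme the desired signal is pushed back one layer at a time, so each layer reduces to a linear least-squares problem solved by the same Kalman/RLS recursion. A hedged sketch of the chain the diagram implies:

```latex
v^{*} = F_o^{-1}(t)                                   % desired output summations
W^{o}:\ \min_{W^{o}} \lVert v^{*} - W^{o} y \rVert^2  % output layer: linear LS fit
y^{*} = (W^{o})^{+}\, v^{*}                           % desired hidden outputs (pseudoinverse)
u^{*} = F_h^{-1}(y^{*})                               % desired hidden summations
W^{h}:\ \min_{W^{h}} \lVert u^{*} - W^{h} x \rVert^2  % hidden layer: linear LS fit
```

The pseudoinverse step is what the Conclusions slide refers to as "the (pseudo)inverse of the output in each layer".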
6
Experiment #1: 4-4 Encoding/Decoding
[Figure: learning curves (MSE vs. epoch, up to 1000 epochs) for EBP, S.T., and LBL]
MLP structure: 4-3-4; convergence threshold: MSE = 0.16
EBP: learning rate = 0.3, momentum = 0.8
S.T.: learning rate = 0.3, forgetting factors (hidden and output layers) = 0.9
LBL: learning rate = 0.15, forgetting factors (hidden and output layers) = 0.9
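The 4-4 encoding/decoding task is presumably the classic encoder problem, though the slides do not define it; under that assumption the training set is just the four one-hot patterns mapped to themselves through the 3-unit hidden bottleneck:

```python
import numpy as np

# Assumed 4-4 encoder/decoder task: the 4-3-4 MLP reproduces each
# one-hot 4-bit pattern at its output (auto-association).
X = np.eye(4)   # inputs:  1000, 0100, 0010, 0001
T = np.eye(4)   # targets: identical to the inputs
```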
7
Experiment #2: Pattern Classification (IRIS)
4 input features; 3 classes (coded 001, 010, 100)
75 training patterns, 75 testing patterns
MLP structure: 4-3-3; convergence threshold: MSE = 0.01
EBP: learning rate = 0.3, momentum = 0.8
S.T.: learning rate = 20, forgetting factors (hidden and output layers) = 0.9
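A minimal sketch of reproducing this setup with today's tooling (scikit-learn's bundled copy of IRIS); the alternating split is my assumption, as the slide states only the 75/75 sizes:

```python
import numpy as np
from sklearn.datasets import load_iris

# IRIS: 150 patterns, 4 features, 3 classes.
iris = load_iris()
X, labels = iris.data, iris.target

# One-hot targets matching the slide's class codes (001, 010, 100).
T = np.eye(3)[labels]

# 75/75 train/test split; alternating patterns is an assumption.
X_train, T_train = X[0::2], T[0::2]
X_test,  T_test  = X[1::2], T[1::2]
```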
8
Experiment #3: Pattern Classification (Wine)
13 input features; 3 classes (coded 001, 010, 100)
60 training patterns, 118 testing patterns
MLP structure: [not recoverable from the slide]
EBP: learning rate = 0.3, momentum = 0.8
S.T.: learning rate = 20, forgetting factors (hidden and output layers) = 0.9
LBL: learning rate = 0.2, forgetting factors (hidden and output layers) = 0.9
9
Experiment #4: Image Restoration
[Figure: learning curves (MSE vs. epoch, up to 500 epochs) for EBP (bat), EBP (seq), LBL (bat), LBL (seq); raw 64×64, 8-bit image and the restored images: LBL (bat), LBL (seq), EBP (bat), EBP (seq)]
MLP structure: [not recoverable from the slide]
EBP: learning rate = 0.3, momentum = 0.8
S.T.: learning rate = 0.3, forgetting factors (hidden and output layers) = 0.9
LBL: learning rate = 0.15, forgetting factors (hidden and output layers) = 0.9
10
Experiment #5: Image Reconstruction (I)
Original image: 256×256, 8-bit
[Diagram: schemes of selecting training subsets (shaded area) — Scheme A: 32 input features (region 1–32 of 256); Scheme B: 64 input features (region 1–64 of 256)]
11
Experiment #5: Image Reconstruction (II)
Scheme A
[Figure: learning curves (MSE vs. epoch, up to 200 epochs) for EBP (bat), EBP (seq), LBL (bat), LBL (seq); restored images: LBL (bat), LBL (seq), EBP (seq)]
MLP structure: [not recoverable from the slide]; convergence threshold: MSE = 5
EBP: learning rate = 0.3, momentum = 0.8
LBL: learning rate = 0.15, forgetting factors (hidden and output layers) = 0.9
12
Experiment #5: Image Reconstruction (III)
Scheme B
[Figure: learning curves (MSE vs. epoch, up to 200 epochs) for EBP (bat), EBP (seq), LBL (bat), LBL (seq); restored images: LBL (bat), LBL (seq), EBP (seq)]
MLP structure: [not recoverable from the slide]; convergence threshold: MSE = 5
EBP: learning rate = 0.3, momentum = 0.8
LBL: learning rate = 0.15, forgetting factors (hidden and output layers) = 0.9
13
Experiment #5: Image Reconstruction (IV)
Scheme A, noisy image used for training
[Figure: learning curves (MSE vs. epoch, up to 100 epochs) for EBP (bat), EBP (seq), S.T. (seq), LBL (bat), LBL (seq); restored images: LBL (seq), S.T. (seq), EBP (seq)]
MLP structure: [not recoverable from the slide]; convergence threshold: MSE = 5
EBP: learning rate = 0.3, momentum = 0.8
LBL: learning rate = 0.15, forgetting factors (hidden and output layers) = 0.9
14
Conclusions

Compared with the EBP algorithm, the Kalman-filter-based S.T. and LBL algorithms generally reach a lower training MSE in significantly fewer epochs. However, each iteration of S.T. and LBL takes more CPU time, owing to the computation of the Kalman gain, the inverses of the correlation matrices, and the (pseudo)inverse of each layer's output; LBL often takes even longer per iteration than S.T.

The total computation time therefore depends on how accurate a training result the user demands, i.e., on the choice of the convergence threshold of MSE. In our examples, across several applications, typical choices of this threshold resulted in a shorter overall training time for the Kalman-filter-based methods than for EBP.

There is no definite answer to the question of which algorithm converges faster, LBL or S.T.; it is essentially case-dependent. Note also that the learning rate of the S.T. algorithm has a more flexible range, not bounded to [0, 1], in contrast to the EBP algorithm.
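To make the per-iteration cost argument concrete, here is a minimal, hedged sketch (not the project's code) of one gradient step versus one Kalman/RLS step for a single linear node; the symbols μ (learning rate), b (forgetting factor), and P (inverse input-correlation matrix) are assumptions:

```python
import numpy as np

def ebp_step(w, a, d, mu=0.3):
    """One EBP-style gradient step for a linear node: O(len(a)) work."""
    return w + mu * (d - w @ a) * a

def kalman_step(w, P, a, d, mu=0.3, b=0.9):
    """One S.T.-style Kalman/RLS step: O(len(a)**2) work."""
    g = P @ a / (b + a @ P @ a)        # Kalman gain
    P = (P - np.outer(g, a @ P)) / b   # inverse-correlation update
    w = w + mu * g * (d - w @ a)       # update toward desired summation d
    return w, P
```

The gradient step touches each weight once per pattern, while the Kalman step also updates an H×H matrix per node; that is the epochs-versus-CPU-time trade described above.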
15
References

[1] Robert S. Scalero and Nazif Tepedelenlioglu, "A fast new algorithm for training feedforward neural networks," IEEE Transactions on Signal Processing, vol. 40, no. 1, 1992.
[2] Gou-Jen Wang and Chih-Cheng Chen, "A fast multilayer neural-network training algorithm based on the layer-by-layer optimizing procedures," IEEE Transactions on Neural Networks, vol. 7, no. 3, 1996.
[3] Brijesh Verma, "Fast training of multilayer perceptrons," IEEE Transactions on Neural Networks, vol. 8, no. 6, 1997.
[4] Adriana Dumitras and Vasile Lazarescu, "The influence of the MLP's output dimension on its performance in image restoration," ISCAS '96, vol. 1.