
ECE 539 Project

Presentation Transcript


1. ECE 539 Project
Kalman Filter Based Algorithms for Fast Training of Multilayer Perceptrons: Implementation and Applications
• Dan Li
• Spring, 2000

2. Introduction
• Multilayer perceptron (MLP)
  • A feedforward neural network model
  • Extensively used in pattern classification
  • Essential issue: the training/learning algorithm
• MLP training algorithms
  • Error backpropagation (EBP)
    • A conventional iterative gradient algorithm
    • Easy to implement
    • Long and uncertain training process
  • The S.T. algorithm, proposed by Scalero and Tepedelenlioglu [1] (based on Kalman filter techniques)
  • The layer-by-layer (LBL) algorithm, a modified S.T. algorithm proposed by Wang and Chen [2] (also based on Kalman filter techniques)

3. EBP Algorithm
[Figure: single-hidden-layer MLP; inputs x1…xM feed hidden units with net inputs u1…uH and activation Fh(·) giving y1…yH, which feed output units with net inputs v1…vN and activation Fo(·) giving z1…zN. Weight-update equations are given for the hidden layer and for the output layer.]
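The update equations on this slide were embedded as graphics. As a hedged illustration only, the sketch below shows a standard error-backpropagation update with momentum for a single-hidden-layer MLP with sigmoid activations; the activation choice and the parameter names eta/alpha are assumptions, not taken from the slide.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def ebp_epoch(X, T, Wh, Wo, eta=0.3, alpha=0.8, state=None):
    """One sequential EBP epoch (illustrative).
    X: (P, M) inputs, T: (P, N) targets,
    Wh: (H, M+1) hidden weights, Wo: (N, H+1) output weights (bias in last column)."""
    if state is None:
        state = {"dWh": np.zeros_like(Wh), "dWo": np.zeros_like(Wo)}
    sse = 0.0
    for x, t in zip(X, T):
        xb = np.append(x, 1.0)                 # bias-augmented input
        y = sigmoid(Wh @ xb)                   # hidden outputs y = Fh(u)
        yb = np.append(y, 1.0)
        z = sigmoid(Wo @ yb)                   # network outputs z = Fo(v)
        e = t - z
        sse += float(e @ e)
        delta_o = e * z * (1.0 - z)                            # output-layer local gradient
        delta_h = (Wo[:, :-1].T @ delta_o) * y * (1.0 - y)     # hidden-layer local gradient
        state["dWo"] = eta * np.outer(delta_o, yb) + alpha * state["dWo"]
        state["dWh"] = eta * np.outer(delta_h, xb) + alpha * state["dWh"]
        Wo += state["dWo"]                     # momentum-smoothed weight updates
        Wh += state["dWh"]
    return sse / len(X), state
```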

4. S.T. Algorithm
[Figure: the same MLP, with each target t1…tN passed back through the inverse output activation F_o^{-1}(·) to form desired net inputs v1*…vN* and u1*…uH*; error signals e are formed at the linear (net-input) level of each layer. Weight-update equations are given for the hidden layer and for the output layer.]
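The S.T. update equations are likewise shown only graphically on this slide. As a hedged sketch of the recursive-least-squares style step that the Kalman-filter approach rests on (the target is mapped back through the inverse activation to a desired net input, and the layer's weights are corrected with a Kalman gain built from the inverse correlation matrix of that layer's input), the code below is illustrative; names such as Pinv and b are assumptions, not the paper's notation.

```python
import numpy as np

def inv_sigmoid(z, eps=1e-6):
    """Inverse of the logistic activation, clipped away from 0 and 1."""
    z = np.clip(z, eps, 1.0 - eps)
    return np.log(z / (1.0 - z))

def kalman_layer_update(W, Pinv, y_in, v_star, b=0.9):
    """Rank-one RLS/Kalman-style update of one layer's weights (illustrative).
    W: (N, H+1) weights, Pinv: (H+1, H+1) inverse input-correlation matrix,
    y_in: (H+1,) bias-augmented layer input, v_star: (N,) desired net inputs."""
    k = (Pinv @ y_in) / (b + y_in @ Pinv @ y_in)      # Kalman gain
    Pinv = (Pinv - np.outer(k, y_in @ Pinv)) / b      # recursive inverse-correlation update
    v = W @ y_in                                      # current net inputs
    W = W + np.outer(v_star - v, k)                   # pull net inputs toward v*
    return W, Pinv

# Usage idea for the output layer:
# v_star = inv_sigmoid(t); W_o, Pinv_o = kalman_layer_update(W_o, Pinv_o, yb, v_star)
```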

5. LBL Algorithm
[Figure: the same MLP, with the targets t1…tN passed back through both inverse activations F_o^{-1}(·) and F_h^{-1}(·) to form desired hidden outputs y1*…yH*, desired output net inputs v1*…vN* and desired hidden net inputs u1*…uH*; error signals e are formed at both layers. Weight-update equations are given for the hidden layer and for the output layer.]
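As a rough reading of the layer-by-layer idea of Wang and Chen [2], each layer is given its own desired linear output, obtained by pushing the target back through the inverse output activation and a (pseudo)inverse, and the layer weights are then solved against that desired output. The sketch below only illustrates that structure; a batch least-squares solve stands in for the paper's Kalman-filter recursion, and all names are assumptions.

```python
import numpy as np

def inv_sigmoid(z, eps=1e-6):
    z = np.clip(z, eps, 1.0 - eps)
    return np.log(z / (1.0 - z))

def lbl_batch_step(X, T, Wh, Wo):
    """One batch LBL-style step (illustrative). X: (P, M) inputs, T: (P, N) targets in (0, 1).
    Wh: (H, M+1), Wo: (N, H+1), bias in the last column of each weight matrix."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    Y = 1.0 / (1.0 + np.exp(-(Xb @ Wh.T)))            # current hidden outputs
    Yb = np.hstack([Y, np.ones((len(Y), 1))])
    V_star = inv_sigmoid(T)                           # desired output net inputs
    # Output layer: least-squares fit of Yb @ Wo.T ~= V_star
    Wo = np.linalg.lstsq(Yb, V_star, rcond=None)[0].T
    # Hidden layer: back out desired hidden activations, then fit the same way
    Y_star = (V_star - Wo[:, -1]) @ np.linalg.pinv(Wo[:, :-1]).T
    U_star = inv_sigmoid(Y_star)                      # desired hidden net inputs
    Wh = np.linalg.lstsq(Xb, U_star, rcond=None)[0].T
    return Wh, Wo
```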

6. Experiment #1: 4-4 Encoding/Decoding
[Figure: learning curves, MSE vs. epoch (0–1000), for EBP, S.T. and LBL]
• MLP structure: 4-3-4; convergence threshold: MSE = 0.16
• EBP: learning rate = 0.3; momentum = 0.8
• S.T.: learning rate = 0.3; hidden- and output-layer parameters = 0.9
• LBL: learning rate = 0.15; hidden- and output-layer parameters = 0.9
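For reference, a small sketch of how the 4-4 encoding/decoding task is commonly set up, read here as auto-association (input equals target) through the 4-3-4 bottleneck; the weight-initialization range is an assumption, not taken from the slide.

```python
import numpy as np

X = np.eye(4)                       # the four one-hot input patterns
T = X.copy()                        # auto-association: target equals input
H = 3                               # hidden width of the 4-3-4 network
rng = np.random.default_rng(0)
Wh = rng.uniform(-0.5, 0.5, (H, X.shape[1] + 1))   # hidden weights (+ bias column), assumed init range
Wo = rng.uniform(-0.5, 0.5, (T.shape[1], H + 1))   # output weights (+ bias column)
# X, T, Wh, Wo can be fed to an EBP/S.T./LBL training loop such as the sketches above.
```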

7. Experiment #2: Pattern Classification (IRIS)
• 4 input features
• 3 classes (coded 001, 010, 100)
• 75 training patterns
• 75 testing patterns
• MLP structure: 4-3-3; convergence threshold: MSE = 0.01
• EBP: learning rate = 0.3; momentum = 0.8
• S.T.: learning rate = 20; hidden- and output-layer parameters = 0.9
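A hedged sketch of preparing the IRIS data in the form used here (4 features, one-hot class codes, 75 training and 75 testing patterns). The project's exact split is not given, so a 25-per-class split is assumed, and scikit-learn's bundled copy of the dataset stands in for the original file.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target                  # 150 samples, 4 features, 3 classes
T = np.eye(3)[y]                               # one-hot targets, one bit per class
train_idx, test_idx = [], []
for c in range(3):                             # assumed split: 25 training / 25 testing per class
    idx = np.flatnonzero(y == c)
    train_idx.extend(idx[:25])
    test_idx.extend(idx[25:])
X_train, T_train = X[train_idx], T[train_idx]  # 75 training patterns
X_test, T_test = X[test_idx], T[test_idx]      # 75 testing patterns
```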

8. Experiment #3: Pattern Classification (wine)
• 13 input features
• 3 classes (coded 001, 010, 100)
• 60 training patterns
• 118 testing patterns
• MLP structure: 13-15-3
• EBP: learning rate = 0.3; momentum = 0.8
• S.T.: learning rate = 20; hidden- and output-layer parameters = 0.9
• LBL: learning rate = 0.2; hidden- and output-layer parameters = 0.9

9. Experiment #4: Image Restoration
[Figure: learning curves, MSE vs. epoch (0–500), for EBP and LBL in batch and sequential modes, and the restored images for LBL (batch), EBP (sequential), EBP (batch) and LBL (sequential)]
• Raw image: 64 × 64, 8 bit
• MLP structure: 64-16-64
• EBP: learning rate = 0.3; momentum = 0.8
• S.T.: learning rate = 0.3; hidden- and output-layer parameters = 0.9
• LBL: learning rate = 0.15; hidden- and output-layer parameters = 0.9
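A hedged sketch of one way the 64 × 64, 8-bit image could be turned into 64-dimensional patterns for the 64-16-64 network: each 8 × 8 block is flattened, scaled to [0, 1], and used as both input and target. The block shape is an assumption; the slide does not specify how the 64 inputs are formed.

```python
import numpy as np

def image_to_patterns(img, block=8):
    """img: (64, 64) uint8 array -> (64, 64) float array, one flattened block per row."""
    h, w = img.shape
    patterns = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            patterns.append(img[r:r + block, c:c + block].reshape(-1) / 255.0)
    return np.array(patterns)

# X = T = image_to_patterns(raw_image)   # auto-associative restoration targets (assumed setup)
```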

10. Experiment #5: Image Reconstruction (I)
[Figure: original image and the two schemes, A and B, for selecting training subsets (shaded areas)]
• Original image: 256 × 256, 8 bit
• Scheme A: 32 input features
• Scheme B: 64 input features
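The exact layout of the training subsets is only shown graphically. As one possible reading, each image row is cut into segments of 32 pixels (scheme A) or 64 pixels (scheme B), and training patterns are drawn from a shaded sub-region while the whole image is reconstructed afterward; the strip location below is purely illustrative.

```python
import numpy as np

def row_segments(img, width):
    """Cut each row of a 256 x 256 image into segments of `width` pixels, scaled to [0, 1]."""
    return img.reshape(-1, width) / 255.0

def make_training_subset(img, width, rows=slice(0, 64)):
    """Training patterns from an (assumed) shaded horizontal strip of the image."""
    return row_segments(img[rows], width)

# Scheme A: 32-pixel segments; Scheme B: 64-pixel segments (per the slide)
# X_train_A = make_training_subset(image, 32)
# X_train_B = make_training_subset(image, 64)
```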

11. Experiment #5: Image Reconstruction (II), Scheme A
[Figure: learning curves, MSE vs. epoch (0–200), for EBP and LBL in batch and sequential modes, and the restored images for LBL (batch), LBL (sequential) and EBP (sequential)]
• MLP structure: 32-16-32
• Convergence threshold: MSE = 5
• EBP: learning rate = 0.3; momentum = 0.8
• LBL: learning rate = 0.15; hidden- and output-layer parameters = 0.9
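The epochs-to-convergence comparison in Experiment #5 hinges on the convergence threshold (MSE = 5). A minimal sketch of that stopping rule follows; step_fn is a hypothetical callable that runs one epoch of whichever algorithm is being compared and returns that epoch's MSE.

```python
def train_to_threshold(step_fn, max_epochs=200, threshold=5.0):
    """Run training epoch by epoch and stop once the MSE falls below the threshold.
    step_fn: hypothetical zero-argument callable returning the epoch's MSE."""
    history = []
    for _ in range(max_epochs):
        mse = step_fn()
        history.append(mse)
        if mse < threshold:
            break
    return history            # epochs-to-convergence = len(history)
```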

12. Experiment #5: Image Reconstruction (III), Scheme B
[Figure: learning curves, MSE vs. epoch (0–200), for EBP and LBL in batch and sequential modes, and the restored images for EBP (sequential), LBL (batch) and LBL (sequential)]
• MLP structure: 64-32-64
• Convergence threshold: MSE = 5
• EBP: learning rate = 0.3; momentum = 0.8
• LBL: learning rate = 0.15; hidden- and output-layer parameters = 0.9

13. Experiment #5: Image Reconstruction (IV), Scheme A, Noisy Image for Training
[Figure: learning curves, MSE vs. epoch (0–100), for EBP, S.T. and LBL, and the restored images for EBP (sequential), S.T. (sequential) and LBL (sequential)]
• MLP structure: 32-16-32
• Convergence threshold: MSE = 5
• EBP: learning rate = 0.3; momentum = 0.8
• LBL: learning rate = 0.15; hidden- and output-layer parameters = 0.9
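The slide does not state what kind of noise was added to the training image; as one plausible stand-in, this sketch corrupts the 8-bit image with zero-mean Gaussian noise before the training patterns are formed.

```python
import numpy as np

def add_noise(img, sigma=20.0, rng=None):
    """Corrupt an 8-bit image with zero-mean Gaussian noise (noise model is assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.astype(float) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```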

14. Conclusions
• Compared with the EBP algorithm, the Kalman-filter-based S.T. and LBL algorithms generally reach a lower training MSE in a significantly smaller number of epochs.
• However, the CPU time needed for one iteration is longer for the S.T. and LBL algorithms, because of the computation of the Kalman gain, the inverses of the correlation matrices, and the (pseudo)inverse of each layer's output. LBL often required even more computation time per iteration than the S.T. algorithm.
• The total computation time therefore depends on how accurate a training result the user demands, i.e., on the choice of the convergence threshold for the MSE. In our examples, across the various applications, this choice generally led to a shorter overall training time for the Kalman-filter-based methods than for EBP.
• There is no definite answer to the question of whether LBL or S.T. converges faster; it is essentially case-dependent. Note also that in the S.T. algorithm the learning rate has a more flexible range, not bounded to [0, 1] as in the EBP algorithm.

15. References
[1] Robert S. Scalero and Nazif Tepedelenlioglu, "A fast new algorithm for training feedforward neural networks", IEEE Transactions on Signal Processing, Vol. 40, No. 1, pp. 202-210, 1992.
[2] Gou-Jen Wang and Chih-Cheng Chen, "A fast multilayer neural-network training algorithm based on the layer-by-layer optimizing procedures", IEEE Transactions on Neural Networks, Vol. 7, No. 3, pp. 768-775, 1996.
[3] Brijesh Verma, "Fast training of multilayer perceptrons", IEEE Transactions on Neural Networks, Vol. 8, No. 6, pp. 1314-1320, 1997.
[4] Adriana Dumitras and Vasile Lazarescu, "The influence of the MLP's output dimension on its performance in image restoration", ISCAS '96, Vol. 1, pp. 329-332, 1996.
