Least-squares-based Multilayer perceptron training with weighted adaptation

Least-squares-based Multilayer perceptron training with weighted adaptation -- Software simulation project

EE 690

Design of Embodied Intelligence


Outline

  • Multilayer Perceptron

  • Least-squares based Learning Algorithm

  • Weighted Adaptation in training

  • Signal-to-Noise Ratio Figure and Overfitting

  • Software simulation project



Inputs x

Outputs z

Multilayer perceptron (MLP)

Feedforward (no recurrent connections) network with units arranged in layers



Multilayer perceptron (MLP)

  • Efficient mapping from inputs to outputs

  • Powerful universal function approximator

  • Number of inputs and outputs determined by the data

  • Number of hidden neurons and number of hidden layers are design choices left to the user

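As a concrete reference for the notation used on the following slides, here is a minimal MATLAB sketch of a one-hidden-layer MLP forward pass; the sizes and random weights are illustrative, not taken from the project code.

    M = 3; H = 5; P = 1; N = 100;          % inputs, hidden neurons, outputs, samples
    x  = rand(M, N);                       % inputs x, one sample per column
    W1 = randn(H, M); b1 = randn(H, 1);    % hidden-layer weights and biases
    W2 = randn(P, H); b2 = randn(P, 1);    % output-layer weights and biases
    y1 = W1*x + b1*ones(1, N);             % hidden-layer net input
    z1 = tanh(y1);                         % hidden-layer output (tangent sigmoid)
    y2 = W2*z1 + b2*ones(1, N);            % output-layer net input
    z2 = tanh(y2);                         % outputs z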


Multilayer Perceptron Learning

Back-propagation (BP) training algorithm: determines how much each weight is responsible for the error signal

BP has two phases:

Forward pass phase: feedforward propagation of the input signals through the network

Backward pass phase: propagates the error backwards through the network



Multilayer Perceptron Learning

  • Backward Pass

    We want to know how to modify the weights in order to decrease the error E.

  • Use gradient descent: Δw = -η · ∂E/∂w (a sketch follows this list)

  • Gradient-based adjustment can get stuck in local minima

  • Time-consuming: a large number of learning steps is needed, and the step size η must be configured by hand
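For concreteness, a hedged sketch of one such gradient-descent BP step for a one-hidden-layer tanh MLP, assuming the squared error E = 0.5*sum((z2 - d).^2); the step-size value is arbitrary.

    M = 3; H = 5; N = 100; eta = 0.01;     % eta: the hand-tuned step size
    x  = rand(M, N);  d = rand(1, N);      % training inputs and desired outputs
    W1 = randn(H, M); b1 = randn(H, 1);
    W2 = randn(1, H); b2 = randn(1, 1);
    z1 = tanh(W1*x + b1*ones(1, N));       % forward pass phase
    z2 = tanh(W2*z1 + b2*ones(1, N));
    d2 = (z2 - d) .* (1 - z2.^2);          % backward pass: output-layer delta
    d1 = (W2.'*d2) .* (1 - z1.^2);         % error propagated back to the hidden layer
    W2 = W2 - eta*(d2*z1.');  b2 = b2 - eta*sum(d2, 2);   % gradient-descent updates
    W1 = W1 - eta*(d1*x.');   b1 = b1 - eta*sum(d1, 2);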



Least-squares based Learning Algorithm

  • Least-squares fit (LSF): obtains the minimum sum of squared errors (SSE)

  • For an overdetermined problem, LSF finds the solution with the minimum SSE

  • For an underdetermined problem, the pseudo-inverse finds the solution with the minimum norm

  • Can be applied to optimize either the weights between layers or the signals on the layers (a small example follows)
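A small MATLAB illustration of both cases; the matrices are random stand-ins, not project data.

    A = randn(100, 5);  w_true = randn(5, 1);
    y = A*w_true + 0.01*randn(100, 1);     % overdetermined system A*w = y
    w_sse = A \ y;                         % LSF: minimum-SSE solution
    B = randn(5, 100);  v = randn(5, 1);   % underdetermined system B*u = v
    u_min = pinv(B)*v;                     % pseudo-inverse: minimum-norm solution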


[Network diagram: inputs x; hidden layer with weights W1, bias b1, net input y1, output z1; output layer with weights W2, bias b2, net input y2, output z2; desired output d]

Least-squares based Learning Algorithm (I)

  • I. Start with back-propagation of the desired output signal → signal optimization

  • Propagation of the desired outputs back through the layers

  • Optimization of the weights between layers

(1). Scale z2 (= d) into (-1, 1) and compute y2 = f^-1(z2).

(2). Based on W2, b2, solve W2·z1 = y2 - b2 for the hidden signals z1.

(3). Scale z1 into (-1, 1) and compute y1 = f^-1(z1).

(4). Optimize W1, b1 to satisfy W1·x + b1 = y1.

(5). Evaluate y1, z1 using the new W1 and bias b1.

(6). Optimize W2, b2 to satisfy W2·z1 + b2 = y2.

(7). Evaluate y2, z2 using the new W2 and bias b2.

(8). Evaluate the MSE.
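A hedged MATLAB sketch of one such backward pass, assuming tanh units and a simple rescaling into (-1, 1); the scaling and solver choices are assumptions, since the slides do not show the project code.

    M = 3; H = 5; N = 100;
    x  = rand(M, N);  d = rand(1, N);        % training inputs and desired outputs
    W1 = randn(H, M); b1 = randn(H, 1);
    W2 = randn(1, H); b2 = randn(1, 1);
    sc = @(s) 1.8*(s - min(s,[],2))./(max(s,[],2) - min(s,[],2) + eps) - 0.9;  % rows into (-1, 1)
    z2 = sc(d);  y2 = atanh(z2);             % (1) back through the output nonlinearity
    z1 = pinv(W2)*(y2 - b2*ones(1, N));      % (2) minimum-norm signals: W2*z1 = y2 - b2
    y1 = atanh(sc(z1));                      % (3) back through the hidden nonlinearity
    Wb = ([x; ones(1, N)].' \ y1.').';       % (4) LSF: W1*x + b1 = y1
    W1 = Wb(:, 1:M);  b1 = Wb(:, end);
    y1 = W1*x + b1*ones(1, N); z1 = tanh(y1);% (5) re-evaluate the hidden signals
    Wb = ([z1; ones(1, N)].' \ y2.').';      % (6) LSF: W2*z1 + b2 = y2
    W2 = Wb(:, 1:H);  b2 = Wb(:, end);
    z2 = tanh(W2*z1 + b2*ones(1, N));        % (7) new network output
    mse = mean((z2 - sc(d)).^2);             % (8) evaluate the MSE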


Optimize W1, b1 to satisfy W1·x = y1 - b1

Least-squares based Learning Algorithm (I)

  • Weights optimization with weighted LSF

    The location of x on the transfer function determines its effect on the output signal of this layer

    dy/dx is used as the weighting term in the LSF

[Figure: slope Δy/Δx of the transfer function at two operating points - Weighted LSF]
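A minimal sketch of the weighted-LSF idea: each sample's row of the least-squares system is scaled by the square root of the transfer-function slope dy/dx at that sample, so samples in the saturated region influence the fit less. The exact weighting used in the project code is an assumption here.

    x  = linspace(-3, 3, 50);                % example net inputs to a tanh unit
    y  = tanh(x) + 0.05*randn(1, 50);        % noisy target activations
    w  = 1 - tanh(x).^2;                     % dy/dx of tanh: the weighting term
    A  = [x; ones(1, 50)].';                 % fit a line a*x + b to y
    sw = sqrt(w(:));
    p  = (A .* sw) \ (y(:) .* sw);           % weighted LSF: p(1) = a, p(2) = b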



Least-squares based Learning Algorithm (II)

II. Weights optimization with iterative fitting

W1 can be further adjusted based on the output error

Each hidden neuron acts as a basis function

Start with the 1st hidden neuron, and continue to the other neurons as long as an output error eout remains (a sketch follows)
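The slides adjust W1 neuron by neuron; as a simplified stand-in, this sketch refits each hidden neuron's output weight to the remaining residual, which illustrates the same neuron-by-neuron basis-function fitting idea without reproducing the project's exact update.

    N = 200; H = 6;
    z1   = tanh(randn(H, N));                % hidden-neuron outputs (basis functions)
    d    = sin(linspace(0, 2*pi, N));        % desired output
    W2   = zeros(1, H);
    eout = d;                                % output error still to be fitted
    for h = 1:H                              % start with the 1st neuron, continue while error remains
        W2(h) = (z1(h,:).' \ eout.');        % 1-D LSF of neuron h to the residual
        eout  = eout - W2(h)*z1(h,:);        % remove the fitted component
    end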



Least-squares based Learning Algorithm (III)

  • III. Start with the input feedforward → weights optimization

  • Propagation of the inputs forward through the layers

  • Optimization of both the weights between layers and the signals on the layers

(1). Evaluate y1, z1 using the initial W1 and bias b1.

(2). y2 = f^-1(d).

(3). Optimize W2, b2 to satisfy W2·z1 + b2 = y2.

(4). Based on W2, b2, optimize the signals z1 to satisfy W2·z1 + b2 = y2.

(5). y1 = f^-1(z1).

(6). Optimize W1, b1 to satisfy W1·x + b1 = y1.

(7). Evaluate y1, z1, y2, z2 using the new W1, W2 and biases b1, b2.

(8). Evaluate the MSE.
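A corresponding hedged MATLAB sketch of one forward-order pass, under the same assumptions (tanh units, rescaling into (-1, 1)) as the Algorithm (I) sketch:

    M = 3; H = 5; N = 100;
    x  = rand(M, N);  d = rand(1, N);
    W1 = randn(H, M); b1 = randn(H, 1);
    W2 = randn(1, H); b2 = randn(1, 1);
    sc = @(s) 1.8*(s - min(s,[],2))./(max(s,[],2) - min(s,[],2) + eps) - 0.9;
    y1 = W1*x + b1*ones(1, N); z1 = tanh(y1);% (1) forward pass with the initial W1, b1
    y2 = atanh(sc(d));                       % (2) desired outputs through f^-1
    Wb = ([z1; ones(1, N)].' \ y2.').';      % (3) LSF: W2*z1 + b2 = y2
    W2 = Wb(:, 1:H);  b2 = Wb(:, end);
    z1 = pinv(W2)*(y2 - b2*ones(1, N));      % (4) re-optimize the signals z1
    y1 = atanh(sc(z1));                      % (5) signals through f^-1
    Wb = ([x; ones(1, N)].' \ y1.').';       % (6) LSF: W1*x + b1 = y1
    W1 = Wb(:, 1:M);  b1 = Wb(:, end);
    y1 = W1*x + b1*ones(1, N); z1 = tanh(y1);% (7) re-evaluate all signals
    z2 = tanh(W2*z1 + b2*ones(1, N));
    mse = mean((z2 - sc(d)).^2);             % (8) evaluate the MSE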


Least-squares based Learning Algorithm (III)


  • Signal optimization with weighted adaptation

    The location of x on the transfer function determines how much the signal can be changed



Overfitting problem

  • The learning algorithm can adapt the MLP to fit the training data.

  • For noisy training data, how closely should the data be learned?

  • Overfitting

  • The number of hidden neurons and the number of layers affect the training accuracy; they are determined by the user, which makes the choice critical

  • Optimized Approximation Algorithm - the SNRF criterion


Signal-to-noise ratio figure (SNRF)

  • Sampled data: function value + noise

  • Error signal: approximation error component + noise component

    The noise part should not be learned; the useful signal part should be reduced

  • Assumption: continuous function & WGN as the noise

  • Signal-to-noise ratio figure (SNRF): signal energy / noise energy

  • Compare SNRF_e and SNRF_WGN:

    If there is useful signal left unlearned → keep learning

    If noise dominates in the error signal → learning should stop

Signal-to-noise ratio figure (SNRF)

[Figure: training data and the approximating function; the error signal = approximation error component + noise component]


Optimization using SNRF

  • Start with a small network (small number of neurons or layers)

  • Train the MLP → etrain

  • Compare SNRF_e & SNRF_WGN

  • Add hidden neurons and repeat (a sketch follows this list)

    Stopping criterion: SNRF_e < threshold SNRF_WGN - noise dominates in the error signal, little information is left unlearned, and learning should stop
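A minimal, self-contained sketch of this growth loop, using a generic polynomial fit as a stand-in for MLP training (the project grows and retrains the MLP itself) and the neighbor-correlation SNRF from the earlier sketch; the threshold value is an assumption.

    x = linspace(0, 1, 200).';
    d = sin(6*x) + 0.05*randn(200, 1);       % noisy training data
    snrf = @(e) sum(e(1:end-1).*e(2:end)) / (sum(e.^2) - sum(e(1:end-1).*e(2:end)));
    thr = 0.02;                              % assumed threshold SNRF_WGN for this N
    for h = 1:12                             % grow the model step by step
        p = polyfit(x, d, h);                % stand-in "network" with h parameters
        e = d - polyval(p, x);               % training error signal etrain
        if snrf(e) < thr, break; end         % noise dominates: stop growing
    end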


Optimization using SNRF

  • Set the structure of the MLP

  • Train the MLP with back-propagation iterations → etrain

  • Compare SNRF_e & SNRF_WGN

  • Keep training with more iterations until SNRF_e < threshold SNRF_WGN

The same criterion can be applied to optimize the number of iterations in back-propagation training, avoiding overfitting (overtraining)


M x N matrix: “Features”

1 x N vector: “Values”

Software simulation project

  • Prepare the data (a sketch follows this list)

  • One data sample per column: N samples

  • One feature per row: M features

  • Desired outputs in a row vector: N values

  • Save “features” and “values” in a training MAT file

  • How to call the program

  • Run “main_MLP_LS.m”

  • Specify the MAT file path and name and the MLP parameters in the command window.
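A short sketch of this data preparation; the sizes and the file name are placeholders, while the variable names “features” and “values” follow the slides.

    M = 4; N = 732;                          % example sizes
    features = rand(M, N);                   % M x N matrix, one sample per column
    values   = rand(1, N);                   % 1 x N row vector of desired outputs
    save('my_training_data.mat', 'features', 'values');   % hypothetical file name
    % then run main_MLP_LS.m and answer the prompts shown on the next slide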


Software simulation project

  • Input the path where the data file can be found (C:*): E:\Research\MLP_LSInitial_desired\MLP_LS_package\

  • Input the name of the data file (*.mat): mackey_glass_data.mat

  • There are overall 732 samples. How would you like to divide them into training and testing sets?

    Number of training samples: 500

    Number of testing samples: 232

  • How many layers does the MLP have? 3:2:7

  • How many neurons are there on each hidden layer? 3:1:10

  • What kind of transfer function would you like on the hidden neurons?

  • 0. Linear transfer function

  • 1. Tangent sigmoid

  • 2. Logarithmic sigmoid

  • 2



Software simulation project

  • There are 4 types of training algorithms you can choose from. Which type would you like to use?

  • 1. Least-squares based training (I)

  • 2. Least-squares based training with iterative neuron fitting (II)

  • 3. Least-squares based training with weighted signal adaptation (III)

  • 4. Back-propagation training (BP)

  • 1

  • How many iterations would you like to have in the training? 3

  • How many Monte Carlo runs would you like to have for the training? 2


Software simulation project

  • Results:

    J_train (num_layer, num_neuron)

    J_test (num_layer, num_neuron)

    SNRF (num_layer, num_neuron)

  • Present the training and testing errors for the various configurations of the MLP

  • Present the optimum configuration found by the SNRF criterion

  • Present a comparison of the results, including errors and network structure


Software simulation project

  • Typical databases and literature survey

  • Function approximation & classification datasets:

    “IEEE Neural Networks Council Standards Committee Working Group on Data modeling Benchmarks”

    http://neural.cs.nthu.edu.tw/jang/benchmark/#MG

    “Neural Network Databases and Learning Data”

    http://www.neoxi.com/NNR/Neural_Network_Databases.php

    “UCI Machine Learning Repository”

    http://www.ics.uci.edu/~mlearn/MLRepository.html

  • Data are normalized

  • Multiple inputs, with a single output.

  • For multiple-output data, use separate MLPs.

  • Compare with results from the literature that use the same dataset (*)

