Least-squares-based Multilayer perceptron training with weighted adaptation

Least-squares-based Multilayer perceptron training with weighted adaptation -- Software simulation project

EE 690

Design of Embodied Intelligence


Outline

  • Multilayer Perceptron

  • Least-squares based Learning Algorithm

  • Weighted Adaptation in training

  • Signal-to-Noise Ratio Figure and Overfitting

  • Software simulation project



Inputs x

Outputs z

Multilayer perceptron (MLP)

Feedforward (no recurrent connections) network with units arranged in layers



Multilayer perceptron (MLP)

  • Efficient mapping from inputs to outputs

  • Powerful universal function approximator

  • Number of inputs and outputs determined by the data

  • Number of hidden neurons and number of hidden layers are design choices left to the user

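As a concrete reference for the notation used on the following slides, here is a minimal MATLAB sketch of a one-hidden-layer MLP forward pass; the sizes and random weights are illustrative, not taken from the project code.

    M = 3; H = 5; P = 1; N = 100;          % inputs, hidden neurons, outputs, samples
    x  = rand(M, N);                       % inputs x, one sample per column
    W1 = randn(H, M); b1 = randn(H, 1);    % hidden-layer weights and biases
    W2 = randn(P, H); b2 = randn(P, 1);    % output-layer weights and biases
    y1 = W1*x + b1*ones(1, N);             % hidden-layer net input
    z1 = tanh(y1);                         % hidden-layer output (tangent sigmoid)
    y2 = W2*z1 + b2*ones(1, N);            % output-layer net input
    z2 = tanh(y2);                         % outputs z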


Multilayer Perceptron Learning

Back-propagation (BP) training algorithm: determines how much each weight is responsible for the error signal

BP has two phases:

Forward pass phase: feedforward propagation of the input signals through the network

Backward pass phase: propagates the error backwards through the network



Multilayer Perceptron Learning

  • Backward Pass

    We want to know how to modify the weights in order to decrease the error E.

  • Use gradient descent: Δw = -η · ∂E/∂w (a sketch follows this list)

  • Gradient-based adjustment can get stuck in local minima

  • Time-consuming: a large number of learning steps is needed, and the step size η must be configured by hand
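For concreteness, a hedged sketch of one such gradient-descent BP step for a one-hidden-layer tanh MLP, assuming the squared error E = 0.5*sum((z2 - d).^2); the step-size value is arbitrary.

    M = 3; H = 5; N = 100; eta = 0.01;     % eta: the hand-tuned step size
    x  = rand(M, N);  d = rand(1, N);      % training inputs and desired outputs
    W1 = randn(H, M); b1 = randn(H, 1);
    W2 = randn(1, H); b2 = randn(1, 1);
    z1 = tanh(W1*x + b1*ones(1, N));       % forward pass phase
    z2 = tanh(W2*z1 + b2*ones(1, N));
    d2 = (z2 - d) .* (1 - z2.^2);          % backward pass: output-layer delta
    d1 = (W2.'*d2) .* (1 - z1.^2);         % error propagated back to the hidden layer
    W2 = W2 - eta*(d2*z1.');  b2 = b2 - eta*sum(d2, 2);   % gradient-descent updates
    W1 = W1 - eta*(d1*x.');   b1 = b1 - eta*sum(d1, 2);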



Least-squares based Learning Algorithm

  • Least-squares fit (LSF): obtains the minimum sum of squared errors (SSE)

  • For an overdetermined problem, LSF finds the solution with the minimum SSE

  • For an underdetermined problem, the pseudo-inverse finds the solution with the minimum norm

  • Can be applied to optimize either the weights between layers or the signals on the layers (a small example follows)
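A small MATLAB illustration of both cases; the matrices are random stand-ins, not project data.

    A = randn(100, 5);  w_true = randn(5, 1);
    y = A*w_true + 0.01*randn(100, 1);     % overdetermined system A*w = y
    w_sse = A \ y;                         % LSF: minimum-SSE solution
    B = randn(5, 100);  v = randn(5, 1);   % underdetermined system B*u = v
    u_min = pinv(B)*v;                     % pseudo-inverse: minimum-norm solution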


[Network diagram: inputs x; hidden layer with weights W1, bias b1, net input y1, output z1; output layer with weights W2, bias b2, net input y2, output z2; desired output d]

Least-squares based Learning Algorithm (I)

  • I. Start with back-propagation of the desired output signal → signal optimization

  • Propagation of the desired outputs back through the layers

  • Optimization of the weights between layers

(1). Scale z2 (= d) into (-1, 1) and compute y2 = f^-1(z2).

(2). Based on W2, b2, solve W2·z1 = y2 - b2 for the hidden signals z1.

(3). Scale z1 into (-1, 1) and compute y1 = f^-1(z1).

(4). Optimize W1, b1 to satisfy W1·x + b1 = y1.

(5). Evaluate y1, z1 using the new W1 and bias b1.

(6). Optimize W2, b2 to satisfy W2·z1 + b2 = y2.

(7). Evaluate y2, z2 using the new W2 and bias b2.

(8). Evaluate the MSE.
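A hedged MATLAB sketch of one such backward pass, assuming tanh units and a simple rescaling into (-1, 1); the scaling and solver choices are assumptions, since the slides do not show the project code.

    M = 3; H = 5; N = 100;
    x  = rand(M, N);  d = rand(1, N);        % training inputs and desired outputs
    W1 = randn(H, M); b1 = randn(H, 1);
    W2 = randn(1, H); b2 = randn(1, 1);
    sc = @(s) 1.8*(s - min(s,[],2))./(max(s,[],2) - min(s,[],2) + eps) - 0.9;  % rows into (-1, 1)
    z2 = sc(d);  y2 = atanh(z2);             % (1) back through the output nonlinearity
    z1 = pinv(W2)*(y2 - b2*ones(1, N));      % (2) minimum-norm signals: W2*z1 = y2 - b2
    y1 = atanh(sc(z1));                      % (3) back through the hidden nonlinearity
    Wb = ([x; ones(1, N)].' \ y1.').';       % (4) LSF: W1*x + b1 = y1
    W1 = Wb(:, 1:M);  b1 = Wb(:, end);
    y1 = W1*x + b1*ones(1, N); z1 = tanh(y1);% (5) re-evaluate the hidden signals
    Wb = ([z1; ones(1, N)].' \ y2.').';      % (6) LSF: W2*z1 + b2 = y2
    W2 = Wb(:, 1:H);  b2 = Wb(:, end);
    z2 = tanh(W2*z1 + b2*ones(1, N));        % (7) new network output
    mse = mean((z2 - sc(d)).^2);             % (8) evaluate the MSE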


Optimize W1, b1 to satisfy W1·x = y1 - b1

Least-squares based Learning Algorithm (I)

  • Weights optimization with weighted LSF

    The location of x on the transfer function determines its effect on the output signal of this layer

    dy/dx is used as the weighting term in the LSF

[Figure: slope Δy/Δx of the transfer function at two operating points - Weighted LSF]
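A minimal sketch of the weighted-LSF idea: each sample's row of the least-squares system is scaled by the square root of the transfer-function slope dy/dx at that sample, so samples in the saturated region influence the fit less. The exact weighting used in the project code is an assumption here.

    x  = linspace(-3, 3, 50);                % example net inputs to a tanh unit
    y  = tanh(x) + 0.05*randn(1, 50);        % noisy target activations
    w  = 1 - tanh(x).^2;                     % dy/dx of tanh: the weighting term
    A  = [x; ones(1, 50)].';                 % fit a line a*x + b to y
    sw = sqrt(w(:));
    p  = (A .* sw) \ (y(:) .* sw);           % weighted LSF: p(1) = a, p(2) = b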



Least-squares based Learning Algorithm (II)

II. Weights optimization with iterative fitting

W1 can be further adjusted based on the output error

Each hidden neuron acts as a basis function

Start with the 1st hidden neuron, and continue to the other neurons as long as an output error eout remains (a sketch follows)
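The slides adjust W1 neuron by neuron; as a simplified stand-in, this sketch refits each hidden neuron's output weight to the remaining residual, which illustrates the same neuron-by-neuron basis-function fitting idea without reproducing the project's exact update.

    N = 200; H = 6;
    z1   = tanh(randn(H, N));                % hidden-neuron outputs (basis functions)
    d    = sin(linspace(0, 2*pi, N));        % desired output
    W2   = zeros(1, H);
    eout = d;                                % output error still to be fitted
    for h = 1:H                              % start with the 1st neuron, continue while error remains
        W2(h) = (z1(h,:).' \ eout.');        % 1-D LSF of neuron h to the residual
        eout  = eout - W2(h)*z1(h,:);        % remove the fitted component
    end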



Least-squares based Learning Algorithm (III)

  • III. Start with the input feedforward → weights optimization

  • Propagation of the inputs forward through the layers

  • Optimization of both the weights between layers and the signals on the layers

(1). Evaluate y1, z1 using the initial W1 and bias b1.

(2). y2 = f^-1(d).

(3). Optimize W2, b2 to satisfy W2·z1 + b2 = y2.

(4). Based on W2, b2, optimize the signals z1 to satisfy W2·z1 + b2 = y2.

(5). y1 = f^-1(z1).

(6). Optimize W1, b1 to satisfy W1·x + b1 = y1.

(7). Evaluate y1, z1, y2, z2 using the new W1, W2 and biases b1, b2.

(8). Evaluate the MSE.
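A corresponding hedged MATLAB sketch of one forward-order pass, under the same assumptions (tanh units, rescaling into (-1, 1)) as the Algorithm (I) sketch:

    M = 3; H = 5; N = 100;
    x  = rand(M, N);  d = rand(1, N);
    W1 = randn(H, M); b1 = randn(H, 1);
    W2 = randn(1, H); b2 = randn(1, 1);
    sc = @(s) 1.8*(s - min(s,[],2))./(max(s,[],2) - min(s,[],2) + eps) - 0.9;
    y1 = W1*x + b1*ones(1, N); z1 = tanh(y1);% (1) forward pass with the initial W1, b1
    y2 = atanh(sc(d));                       % (2) desired outputs through f^-1
    Wb = ([z1; ones(1, N)].' \ y2.').';      % (3) LSF: W2*z1 + b2 = y2
    W2 = Wb(:, 1:H);  b2 = Wb(:, end);
    z1 = pinv(W2)*(y2 - b2*ones(1, N));      % (4) re-optimize the signals z1
    y1 = atanh(sc(z1));                      % (5) signals through f^-1
    Wb = ([x; ones(1, N)].' \ y1.').';       % (6) LSF: W1*x + b1 = y1
    W1 = Wb(:, 1:M);  b1 = Wb(:, end);
    y1 = W1*x + b1*ones(1, N); z1 = tanh(y1);% (7) re-evaluate all signals
    z2 = tanh(W2*z1 + b2*ones(1, N));
    mse = mean((z2 - sc(d)).^2);             % (8) evaluate the MSE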


Least-squares based Learning Algorithm (III)


  • Signal optimization with weighted adaptation

    The location of x on the transfer function determines how much the signal can be changed



Overfitting problem

  • The learning algorithm can adapt the MLP to fit the training data.

  • For noisy training data, how closely should the data be learned?

  • Overfitting

  • The number of hidden neurons and the number of layers affect the training accuracy; they are determined by the user, which makes the choice critical

  • Optimized Approximation Algorithm - the SNRF criterion


Signal-to-noise ratio figure (SNRF)

  • Sampled data: function value + noise

  • Error signal: approximation error component + noise component

    The noise part should not be learned; the useful signal part should be reduced

  • Assumption: continuous function & WGN as the noise

  • Signal-to-noise ratio figure (SNRF): signal energy / noise energy

  • Compare SNRF_e and SNRF_WGN:

    If there is useful signal left unlearned → keep learning

    If noise dominates in the error signal → learning should stop

Signal-to-noise ratio figure (SNRF)

[Figure: training data and the approximating function; the error signal = approximation error component + noise component]


Optimization using SNRF

  • Start with a small network (small number of neurons or layers)

  • Train the MLP → etrain

  • Compare SNRF_e & SNRF_WGN

  • Add hidden neurons and repeat (a sketch follows this list)

    Stopping criterion: SNRF_e < threshold SNRF_WGN - noise dominates in the error signal, little information is left unlearned, and learning should stop
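A minimal, self-contained sketch of this growth loop, using a generic polynomial fit as a stand-in for MLP training (the project grows and retrains the MLP itself) and the neighbor-correlation SNRF from the earlier sketch; the threshold value is an assumption.

    x = linspace(0, 1, 200).';
    d = sin(6*x) + 0.05*randn(200, 1);       % noisy training data
    snrf = @(e) sum(e(1:end-1).*e(2:end)) / (sum(e.^2) - sum(e(1:end-1).*e(2:end)));
    thr = 0.02;                              % assumed threshold SNRF_WGN for this N
    for h = 1:12                             % grow the model step by step
        p = polyfit(x, d, h);                % stand-in "network" with h parameters
        e = d - polyval(p, x);               % training error signal etrain
        if snrf(e) < thr, break; end         % noise dominates: stop growing
    end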


Optimization using SNRF

  • Set the structure of the MLP

  • Train the MLP with back-propagation iterations → etrain

  • Compare SNRF_e & SNRF_WGN

  • Keep training with more iterations until SNRF_e < threshold SNRF_WGN

The same criterion can be applied to optimize the number of iterations in back-propagation training, avoiding overfitting (overtraining)


M x N matrix: “Features”

1 x N vector: “Values”

Software simulation project

  • Prepare the data (a sketch follows this list)

  • One data sample per column: N samples

  • One feature per row: M features

  • Desired outputs in a row vector: N values

  • Save “features” and “values” in a training MAT file

  • How to call the program

  • Run “main_MLP_LS.m”

  • Specify the MAT file path and name and the MLP parameters in the command window.
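A short sketch of this data preparation; the sizes and the file name are placeholders, while the variable names “features” and “values” follow the slides.

    M = 4; N = 732;                          % example sizes
    features = rand(M, N);                   % M x N matrix, one sample per column
    values   = rand(1, N);                   % 1 x N row vector of desired outputs
    save('my_training_data.mat', 'features', 'values');   % hypothetical file name
    % then run main_MLP_LS.m and answer the prompts shown on the next slide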


Software simulation project

  • Input the path where the data file can be found (C:*): E:\Research\MLP_LSInitial_desired\MLP_LS_package\

  • Input the name of the data file (*.mat): mackey_glass_data.mat

  • There are overall 732 samples. How would you like to divide them into training and testing sets?

    Number of training samples: 500

    Number of testing samples: 232

  • How many layers does the MLP have? 3:2:7

  • How many neurons are there on each hidden layer? 3:1:10

  • What kind of transfer function would you like on the hidden neurons?

  • 0. Linear transfer function

  • 1. Tangent sigmoid

  • 2. Logarithmic sigmoid

  • 2



Software simulation project

  • There are 4 types of training algorithms you can choose from. Which type would you like to use?

  • 1. Least-squares based training (I)

  • 2. Least-squares based training with iterative neuron fitting (II)

  • 3. Least-squares based training with weighted signal adaptation (III)

  • 4. Back-propagation training (BP)

  • 1

  • How many iterations would you like to have in the training? 3

  • How many Monte Carlo runs would you like to have for the training? 2


Software simulation project

  • Results:

    J_train (num_layer, num_neuron)

    J_test (num_layer, num_neuron)

    SNRF (num_layer, num_neuron)

  • Present the training and testing errors for the various configurations of the MLP

  • Present the optimum configuration found by the SNRF criterion

  • Present a comparison of the results, including errors and network structure


Software simulation project

  • Typical databases and literature survey

  • Function approximation & classification datasets:

    “IEEE Neural Networks Council Standards Committee Working Group on Data modeling Benchmarks”

    http://neural.cs.nthu.edu.tw/jang/benchmark/#MG

    “Neural Network Databases and Learning Data”

    http://www.neoxi.com/NNR/Neural_Network_Databases.php

    “UCI Machine Learning Repository”

    http://www.ics.uci.edu/~mlearn/MLRepository.html

  • Data are normalized

  • Multiple inputs, with a single output.

  • For multiple-output data, use separate MLPs.

  • Compare with results from the literature that use the same dataset (*)

