
# Soft computing



### Soft computing

Lecture 7

Multi-Layer perceptrons

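Why a hidden layer is needed

The XOR problem for a simple perceptron

[Figure: the four points (0,0), (1,0), (0,1) and (1,1) in the X1-X2 plane, each labelled Class 1 or Class 2. For XOR, diagonally opposite points belong to the same class.]

In this case it is not possible to draw a single discriminant line that separates the two classes.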
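The claim above can be checked numerically. The following Python sketch is not part of the original lecture; the function names and the weight grid are illustrative assumptions. It searches a grid of weights for a single threshold unit that reproduces XOR and finds none, then shows that a hand-wired network with two hidden units does reproduce it.

```python
from itertools import product

# XOR truth table: inputs (x1, x2) -> class label
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(s):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if s > 0 else 0


def single_unit_solves_xor(grid):
    """Try every (w1, w2, bias) on the grid for one threshold unit."""
    for w1, w2, b in product(grid, repeat=3):
        if all(step(w1 * x1 + w2 * x2 + b) == t for (x1, x2), t in XOR.items()):
            return True
    return False


def two_layer_xor(x1, x2):
    """Hand-chosen weights: hidden units compute OR and AND, output is OR and not AND."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)


grid = [k / 4 for k in range(-8, 9)]  # weights from -2.0 to 2.0 in steps of 0.25
print(single_unit_solves_xor(grid))   # a single unit never matches all four points
print(all(two_layer_xor(*x) == t for x, t in XOR.items()))
```

The grid search is only a demonstration; the impossibility for a single unit holds for all real weights, since the four XOR constraints on the weighted sum contradict each other.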

Kinds of sigmoid used in perceptrons

Exponential

Rational

Hyperbolic tangent
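For reference, commonly used forms of these three sigmoids are sketched below. The slope parameter $a > 0$ is an assumption of this sketch; the rational form matches the expression used in the `Neuron` procedure later in this lecture.

```latex
\begin{align*}
f(s) &= \frac{1}{1 + e^{-a s}} && \text{(exponential)} \\
f(s) &= \frac{s}{|s| + a} && \text{(rational)} \\
f(s) &= \tanh\frac{s}{a} = \frac{e^{s/a} - e^{-s/a}}{e^{s/a} + e^{-s/a}} && \text{(hyperbolic tangent)}
\end{align*}
```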

Formulas for the error back-propagation algorithm

(1) Modification of the weights of the synapses connecting the j-th neuron with the i-th ones; xj is the state (output) of the j-th neuron.

(2) For the output layer.

(3) For the hidden layers; k is the index of a neuron in the next layer connected with the j-th neuron.

[Figure: order in which the formulas are applied, labelled (2), (1) and (3), (1).]
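A standard form of the three formulas, written to agree with the Pascal learning fragment later in this lecture, is sketched below. The notation is an assumption: $\eta$ is the learning step (`Step`), $g_k$ the target output (`G3`), $y_k$ the actual output (`S3`), $x_j$ the output of neuron $j$, $s$ the weighted input sum of a neuron, and $f$ the activation function.

```latex
\begin{align}
\Delta w_{jk} &= \eta \, \delta_k \, x_j \tag{1} \\
\delta_k &= (g_k - y_k) \, f'(s_k) \tag{2} \\
\delta_j &= f'(s_j) \sum_k \delta_k \, w_{jk} \tag{3}
\end{align}
```

For the exponential (logistic) sigmoid, $f'(s) = y(1 - y)$, which is the factor `D` computed in the learning fragment.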

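Example of implementation

    type
      TNN = class(TObject)
      public
        State: integer;
        N, NR, NOut, NH: integer;  // NR: inputs, NH: hidden neurons, NOut: outputs
        a: real;                   // parameter of the rational sigmoid
        Step: real;                // learning step
        NL: integer;               // number of iterations during learning
        S1: array[1..10000] of integer;        // states of the input layer
        S2: array[1..200] of real;             // states of the hidden layer
        S3: array[1..5] of real;               // states of the output layer
        G3: array[1..5] of real;               // target outputs
        LX, LY: array[1..10000] of integer;    // pixel coordinates read by the inputs
        W1: array[1..10000, 1..200] of real;   // weights input -> hidden
        W2: array[1..200, 1..5] of real;       // weights hidden -> output
        W1n: array[1..10000, 1..200] of real;  // updated values of W1
        W2n: array[1..200, 1..5] of real;      // updated values of W2
        SymOut: array[1..5] of string[32];     // presumably the labels of the outputs
        procedure FormStr;
        procedure Learn;
        procedure Work;
        procedure Neuron(i, j: integer);
      end;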

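Procedure simulating a neuron

    // i selects the layer (1: input, 2: hidden, 3: output), j is the neuron index
    procedure TNN.Neuron(i, j: integer);
    var
      k: integer;
      Sum: real;
    begin
      case i of
        1: begin
             // input neuron: 1 if the pixel at (LX[j], LY[j]) is red, else 0
             if Form1.PaintBox1.Canvas.Pixels[LX[j], LY[j]] = clRed
               then S1[j] := 1
               else S1[j] := 0;
           end;
        2: begin
             // hidden neuron: weighted sum of the input layer
             Sum := 0.0;
             for k := 1 to NR do
               Sum := Sum + S1[k] * W1[k, j];
             // rational sigmoid, clipped to 0 for non-positive sums
             if Sum > 0 then S2[j] := Sum / (abs(Sum) + a)
                        else S2[j] := 0;
           end;
        3: begin
             // output neuron: weighted sum of the hidden layer
             Sum := 0.0;
             for k := 1 to NH do
               Sum := Sum + S2[k] * W2[k, j];
             if Sum > 0 then S3[j] := Sum / (abs(Sum) + a)
                        else S3[j] := 0;
           end;
      end;
    end;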

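Fragment of the learning procedure

    // S accumulates the error propagated back to hidden neuron j;
    // D is the derivative of the activation in the logistic form y*(1-y)
    for i := 1 to NR do
      for j := 1 to NH do
      begin
        S := 0;
        for k := 1 to NOut do
        begin
          if (S3[k] > 0) and (S3[k] < 1) then
            D := S3[k] * (1 - S3[k])
          else
            D := 1;
          // output layer: new weight from hidden neuron j to output k
          W2n[j, k] := W2[j, k] + Step * S2[j] * (G3[k] - S3[k]) * D;
          S := S + D * (G3[k] - S3[k]) * W2[j, k]
        end;
        if (S2[j] > 0) and (S2[j] < 1) then
          D := S2[j] * (1 - S2[j])
        else
          D := 1;
        S := S * D;
        // hidden layer: new weight from input i to hidden neuron j
        W1n[i, j] := W1[i, j] + Step * S * S1[i];
      end;
    end; // end of the enclosing procedure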
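The same update can be sketched in Python for a network with one hidden layer. This is a minimal illustration, not the lecture's implementation: it assumes the logistic sigmoid throughout (the Pascal code uses a rational sigmoid in the forward pass), and the function names and numeric values are hypothetical. The weight layout `w1[i][j]`, `w2[j][k]` follows the `W1` and `W2` arrays above.

```python
import math


def sigmoid(s):
    """Logistic (exponential) sigmoid."""
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, w1, w2):
    """Return hidden states h and outputs y for input vector x."""
    n_hid, n_out = len(w2), len(w2[0])
    h = [sigmoid(sum(x[i] * w1[i][j] for i in range(len(x)))) for j in range(n_hid)]
    y = [sigmoid(sum(h[j] * w2[j][k] for j in range(n_hid))) for k in range(n_out)]
    return h, y


def error(x, g, w1, w2):
    """Half the squared error between targets g and network outputs."""
    _, y = forward(x, w1, w2)
    return 0.5 * sum((gk - yk) ** 2 for gk, yk in zip(g, y))


def backprop_step(x, g, w1, w2, step):
    """One back-propagation update; returns new weight matrices."""
    h, y = forward(x, w1, w2)
    # error terms of the output layer
    d_out = [(gk - yk) * yk * (1.0 - yk) for gk, yk in zip(g, y)]
    # error terms of the hidden layer, using the old hidden-to-output weights
    d_hid = [h[j] * (1.0 - h[j]) * sum(d_out[k] * w2[j][k] for k in range(len(y)))
             for j in range(len(h))]
    w2n = [[w2[j][k] + step * h[j] * d_out[k] for k in range(len(y))]
           for j in range(len(h))]
    w1n = [[w1[i][j] + step * x[i] * d_hid[j] for j in range(len(h))]
           for i in range(len(x))]
    return w1n, w2n


# Illustrative values only: 2 inputs, 3 hidden neurons, 1 output
w1 = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.6]]
w2 = [[0.4], [-0.9], [0.2]]
x, g = [1.0, 0.0], [1.0]
w1n, w2n = backprop_step(x, g, w1, w2, step=0.1)
print(error(x, g, w1, w2), error(x, g, w1n, w2n))
```

As in the Pascal fragment, the new weights are written to separate matrices so that the hidden-layer error terms are computed from the old output weights.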

Some of the test data are now misclassified. The problem is that the network, with two hidden units, now has too much freedom and has fitted a decision surface to the training data which follows its intricacies in pattern space without extracting the underlying trends.

• Classification (recognition): usually binary outputs
• Regression (approximation): analog outputs
Kolmogorov's theorem

“Any continuous function from input to output can be implemented in a three-layer net, given sufficient number of hidden units nH, proper nonlinearities, and weights.”

• Guaranteed ability, in principle, to solve the task
• Low speed of learning
• Possibility of overfitting
• Relearning is impossible
• The structure needed to solve a concrete task is not known in advance
Increasing the speed of learning
• Preprocessing of features before they reach the inputs of the perceptron
• Dynamic learning step (large at the beginning, then decreasing)
• Using second derivatives in the formulas for modification of weights
• Using a hardware implementation
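A dynamic learning step can be sketched as a schedule that starts large and decays with the iteration number. The inverse-time form and the names `learning_step`, `step0` and `decay` are assumptions chosen for illustration; the lecture does not specify a particular schedule.

```python
def learning_step(t, step0=0.5, decay=0.01):
    """Step size at iteration t: equals step0 at t = 0 and shrinks as t grows."""
    return step0 / (1.0 + decay * t)


# Step sizes at a few iteration numbers
for t in (0, 100, 1000):
    print(t, learning_step(t))
```

The returned value would take the place of the constant `Step` in the weight update at each iteration.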
Combating overfitting
• Do not set the target learning error too small or the number of iterations too large
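One common way to apply this advice is to bound the number of iterations, avoid an overly small error target, and also stop once the error on held-out data stops improving. The last criterion is early stopping, a technique the lecture does not name. The sketch below is illustrative only: `stop_iteration`, `patience` and the error values are made up for the example.

```python
def stop_iteration(val_errors, min_error, max_iter, patience=3):
    """Return the iteration at which training would stop.

    val_errors: validation error measured after each iteration.
    Stops when the error target or iteration limit is reached, or when the
    error has not improved for `patience` consecutive iterations.
    """
    best = float("inf")
    since_best = 0
    for t, err in enumerate(val_errors):
        if t >= max_iter or err <= min_error:
            return t
        if err < best:
            best, since_best = err, 0
        else:
            since_best += 1
            if since_best >= patience:
                return t
    return len(val_errors)


# Hypothetical validation errors: improvement stalls after iteration 2
errors = [0.9, 0.5, 0.3, 0.31, 0.35, 0.4, 0.2]
print(stop_iteration(errors, min_error=0.05, max_iter=100))
```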
Choice of structure
• Using constructive learning algorithms
• Deleting nodes (neurons) and the links corresponding to them
• Adding new neurons when needed
• Using genetic algorithms to select a suboptimal structure
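Deleting a hidden neuron together with its links amounts to removing one column from the input-to-hidden weights and one row from the hidden-to-output weights; adding a neuron does the reverse. A minimal sketch, assuming the `w1[i][j]` and `w2[j][k]` layout of the `W1` and `W2` arrays in the implementation example. The function names and the zero initial weight are hypothetical.

```python
def remove_hidden_neuron(w1, w2, j):
    """Return copies of the weight matrices without hidden neuron j."""
    w1_new = [row[:j] + row[j + 1:] for row in w1]
    w2_new = [row[:] for idx, row in enumerate(w2) if idx != j]
    return w1_new, w2_new


def add_hidden_neuron(w1, w2, init=0.0):
    """Return copies of the weight matrices with one extra hidden neuron."""
    w1_new = [row + [init] for row in w1]
    w2_new = [row[:] for row in w2] + [[init] * len(w2[0])]
    return w1_new, w2_new


# 2 inputs, 3 hidden neurons, 1 output (illustrative values)
w1 = [[1, 2, 3], [4, 5, 6]]
w2 = [[7], [8], [9]]
print(remove_hidden_neuron(w1, w2, 1))
print(add_hidden_neuron(w1, w2))
```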
Impossibility of relearning
• Using constructive learning algorithms
• Deleting nodes (neurons) and the links corresponding to them
• Adding new neurons when needed