
National Cheng Kung University / Walsin Lihwa Corp.

Center for Research of E-life Digital Technology

ISMP Lab New Student Training Course: Artificial Neural Networks

Advisor: Prof. 郭耀煌

Master's student: 黃盛裕 (Class of '96)

2008/7/18

- Introduction
- Single Layer Perceptron – Perceptron
- Example
- Single Layer Perceptron – Adaline
- Multilayer Perceptron – Back–propagation neural network
- Competitive Learning - Example
- Radial Basis Function (RBF) Networks
- Q&A and Homework

- Artificial Neural Networks
- simulate the human brain
- can approximate any nonlinear, complex function to arbitrary accuracy

Fig.1

Fig.2

Table 1

Fig.3

- About 10^11 neurons in the human brain
- About 10^14–10^15 interconnections
- Pulse-transmission frequency is millions of times slower than that of electronic circuits
- Face recognition
- a few hundred milliseconds for a human
- a network of artificial neurons operates in only a few milliseconds

Fig.4 Application domains of neural networks: pattern recognition, prediction, economics, optimization, VLSI, control, power & energy, AI, bioinformatics, communication, signal processing, and image processing.

Successful applications can be found in well-constrained environments.

None is flexible enough to perform well outside its domain.

Fig.5

- Pattern classification
- Clustering/categorization
- Function approximation
- Prediction/forecasting
- Optimization (e.g., the traveling salesman problem, TSP)
- Retrieval by content
- Control

- Three periods of extensive activity
- 1940s:
- McCulloch and Pitts’ pioneering work

- 1960s:
- Rosenblatt’s perceptron convergence theorem
- Minsky and Papert’s demonstration of the limitations of a simple perceptron

- 1980s:
- Hopfield’s energy approach in 1982
- Werbos’ Back-propagation learning algorithm

- McCulloch and Pitts proposed the MP neuron model in 1943.
- The Hebb learning rule.

Fig.7

Fig.6

- Introduction
- Single Layer Perceptron – Perceptron
- Example
- Single Layer Perceptron – Adaline
- Multilayer Perceptron – Back–propagation neural network
- Competitive Learning - Example
- Radial Basis Function (RBF) Networks
- Q&A and Homework

Fig.8 Model of an artificial neuron: inputs x1, x2, …, xn; weights (synapses) w1j, w2j, …, wnj; bias θj; summation function; transfer function; output Yj.

The McCulloch-Pitts model (1943)

- An adder for summing the input signals, weighted by the respective synapses of the neuron.
- Summation
- Euclidean distance

- An activation function for limiting the amplitude of the output of a neuron.
- Threshold (step) function
- Piecewise-linear function

Threshold function:

Yj = 1 if netj ≥ 0; Yj = 0 otherwise.

Piecewise-linear function:

Yj = 0 if netj ≤ -0.5; Yj = netj + 0.5 if -0.5 < netj < 0.5; Yj = 1 if netj ≥ 0.5.

- Sigmoid function
- Yj = 1 / (1 + e^(-a·netj)), where a is the slope parameter of the sigmoid function.

- Radial Basis Function
- Yj = e^(-netj² / a), where a is the variance parameter of the radial basis function.
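The transfer functions above can be written directly as code; a minimal sketch, with the parameter a and the boundary conventions assumed from the text:

```python
# The transfer functions above as plain functions; the parameter a and the
# behavior at the boundaries follow the conventions assumed in the text.
import math

def threshold(net):
    return 1.0 if net >= 0 else 0.0             # step function

def piecewise_linear(net):
    if net <= -0.5:
        return 0.0
    if net >= 0.5:
        return 1.0
    return net + 0.5                            # linear between -0.5 and 0.5

def sigmoid(net, a=1.0):
    return 1.0 / (1.0 + math.exp(-a * net))     # a: slope parameter

def gaussian_rbf(net, a=1.0):
    return math.exp(-net * net / a)             # a: variance parameter
```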

Fig.9 A taxonomy of feed-forward and recurrent/feedback network architectures.

- Feed-forward networks
- Static: produce only one set of output values
- Memory-less: independent of the previous state

- Recurrent (or feedback) networks
- Dynamic systems

- Different architectures require different appropriate learning algorithms

- The ability to learn is a fundamental trait of intelligence.
- Automatically learn from examples
- instead of following a set of rules specified by human experts.
- ANNs appear to learn the underlying rules.
- This is the major advantage over traditional expert systems.

- Learning process
- Have a model of the environment
- Understand how network weights are updated

- Three main learning paradigms
- Supervised
- Unsupervised
- Hybrid

- Three fundamental and practical issues of learning theory
- Capacity
- Patterns
- Functions
- Decision boundaries

- Sample complexity
- The number of training samples (over-fitting)

- Computational complexity
- Time required (many learning algorithms have high complexity)

- Three basic types of learning rules:
- Error-correction rules
- Hebbian rule
- If neurons on both sides of a synapse are activated synchronously and repeatedly, the synapse’s strength is selectively increased.

- Competitive learning rules
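The Hebbian rule can be illustrated with a minimal sketch; the fixed post-synaptic activity y below is an assumption for illustration:

```python
# Hebbian learning sketch: the weight on synapse i grows when input x_i and
# the neuron's output y are active together, dw_i = eta * x_i * y.
# A fixed post-synaptic activity y is assumed here for illustration.
eta = 0.1
w = [0.0, 0.0]

x = (1.0, 0.5)       # pre-synaptic activity pattern
y = 1.0              # post-synaptic activity (fires together with x)

for _ in range(10):  # repeated co-activation strengthens the synapses
    w = [w[i] + eta * x[i] * y for i in range(2)]

print(w)             # weights have grown in proportion to x_i * y
```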

Table 2 Well-known learning algorithms.

Fig.10

- The threshold function:
- if v > 0 , then y = +1
- otherwise y = 0

- On-line (Sequential) mode:
- Update weights for each training data
- More accurate
- Require more computational time
- Faster learning convergence

- Off-line (Batch) mode:
- Update weights after applying all training data
- Less accurate
- Require less computational time
- Require extra storage
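The two modes can be contrasted on a toy linear unit; the 1-D data and learning rate here are hypothetical:

```python
# Contrast of the two modes on a toy linear unit y = w*x; the 1-D data and
# learning rate are hypothetical. Both runs fit the same target y = 2*x.
eta = 0.05
data = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]   # (input, target) pairs

# On-line (sequential) mode: update after every single training sample.
w_online = 0.0
for _ in range(100):
    for x, t in data:
        w_online += eta * (t - w_online * x) * x

# Off-line (batch) mode: accumulate the gradient over all samples and
# apply one update per pass (needs extra storage for the accumulated sum).
w_batch = 0.0
for _ in range(100):
    grad = sum((t - w_batch * x) * x for x, t in data)
    w_batch += eta * grad

print(w_online, w_batch)
```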

- However, a single-layer perceptron can only separate linearly separable patterns as long as a monotonic activation is used.
- The back-propagation learning algorithm is based on error-correction principle.

- Inputs are mapped into [-1, 1].
- Outputs are mapped into [0, 1].

- In 1957, the perceptron was proposed. A single-layer perceptron network consists of one or more artificial neurons in parallel; each neuron in the single layer provides one network output and is usually connected to all of the external (or environmental) inputs.
- Supervised
- MP neuron model + Hebb learning

……

……

Fig.11

- Learning Algorithm
- output
- Adjust weight & bias
- Energy function

- Introduction
- Single Layer Perceptron – Perceptron
- Example
- Single Layer Perceptron – Adaline
- Multilayer Perceptron – Back–propagation neural network
- Competitive Learning - Example
- Radial Basis Function (RBF) Networks
- Q&A and Homework

- Use a two-layer perceptron to solve the AND problem

Initial parameters

- η = 0.1
- θ = 0.5
- W13 = 1.0
- W23 = -1.0

Fig.12 (inputs X1, X2; output X3)

- 1st learning cycle
- Input 1st example
- X1=-1, X2=-1, T=0
- net = W13·X1 + W23·X2 - θ = -0.5, Y=0
- δ = T - Y = 0
- ΔW13 = ηδX1 = 0, ΔW23 = 0, Δθ = -ηδ = 0

- Input 2nd–4th examples

- Adjust weights & bias
- W13=1, W23=-0.8, θ=0.5

- 2nd learning cycle

- Adjust weights & bias
- W13=1, W23=-0.6, θ=0.5

- 3rd learning cycle

- Adjust weights & bias
- W13=1, W23=-0.4, θ=0.5

- 4th learning cycle

- Adjust weights & bias
- W13=0.9, W23=-0.3, θ=0.6

- 5th learning cycle

- Adjust weights & bias
- W13=0.9, W23=-0.1, θ=0.6

- 6th learning cycle

- Adjust weights & bias
- W13=0.8, W23=0, θ=0.7

- 7th learning cycle

- Adjust weights & bias
- W13=0.7, W23=0.1, θ=0.8

- 8th learning cycle

- Adjust weights & bias
- W13=0.8, W23=0.2, θ=0.7

- 9th learning cycle

- Adjust weights & bias
- W13=0.8, W23=0.2, θ=0.7

- 10th learning cycle (no change, stop learning)
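The AND training loop above can be sketched in code. The step-activation tie-breaking at net = 0 is an assumption, so intermediate cycles may differ slightly from the table, but the learned parameters still separate the AND patterns:

```python
# Sketch of the perceptron training run above (AND problem, bipolar inputs,
# targets T in {0, 1}). Assumed conventions: step activation Y = 1 when
# net > 0, updates dW = eta*(T - Y)*X and d(theta) = -eta*(T - Y).
# Tie-breaking at net = 0 affects the intermediate cycles, so they may
# differ slightly from the table above, but the result separates AND.
samples = [((-1, -1), 0), ((-1, 1), 0), ((1, -1), 0), ((1, 1), 1)]

eta = 0.1              # learning rate
w = [1.0, -1.0]        # initial weights W13, W23
theta = 0.5            # initial threshold (bias)

for cycle in range(100):            # cap on the number of learning cycles
    changed = False
    for (x1, x2), t in samples:
        net = w[0] * x1 + w[1] * x2 - theta
        y = 1 if net > 0 else 0     # step activation
        delta = t - y
        if delta != 0:
            w[0] += eta * delta * x1
            w[1] += eta * delta * x2
            theta -= eta * delta
            changed = True
    if not changed:                 # full pass with no updates: converged
        break

print(w, theta)
```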

Fig.13

Input values and desired output values:

- x1 = (1, 0, 1)T y1 = -1
- x2 = (0,−1,−1)T y2 = 1
- x3 = (−1,−0.5,−1)T y3 = 1
- the learning constant is assumed to be 0.1
- The initial weight vector is w0 = (1, -1, 0)T

- Step 1:
- <w0, x1> = (1, -1, 0)*(1, 0, 1)T = 1
- Correction is needed since y1 = -1 ≠ sign (1)
- w1 = w0 + 0.1*(-1-1)*x1
- w1 = (1, -1, 0)T – 0.2*(1, 0, 1)T = (0.8, -1, -0.2)T

- Step 2:
- <w1, x2> = 1.2
- y2 = 1 = sign(1.2)
- w2 = w1

- Step 3:
- <w2, x3> = (0.8, -1, -0.2 )*(−1,−0.5,−1)T = -0.1
- Correction is needed since y3 = 1 ≠ sign (-0.1)
- w3 = w2 + 0.1*(1-(-1))*x3
- w3 = (0.8, -1, -0.2)T + 0.2*(−1,−0.5,−1)T = (0.6, -1.1, -0.4)T

- Step 4:
- <w3, x1> = (0.6, -1.1, -0.4)*(1, 0, 1)T = 0.2
- Correction is needed since y1 = -1 ≠ sign (0.2)
- w4 = w3 + 0.1*(-1-1)*x1
- w4 = (0.6, -1.1, -0.4)T– 0.2*(1, 0, 1)T = (0.4, -1.1, -0.6)T

- Step 5:
- <w4, x2> = 1.7
- y2 = 1 = sign(1.7)
- w5 = w4

- Step 6:
- <w5, x3> = 0.75
- y3 = 1 = sign(0.75)
- w6 = w5

- w6 classifies all samples correctly, which terminates the learning process.
- <w6, x1> = -0.2 < 0
- <w6, x2> = 1.7 > 0
- <w6, x3> = 0.75 > 0
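The six steps can be checked mechanically; a sketch assuming sign(0) = +1 (the example never evaluates sign at exactly 0):

```python
# Reproducing the worked example above (sketch): sign activation, learning
# rate 0.1, correction rule w <- w + 0.1 * (y - sign(<w, x>)) * x applied
# only when a sample is misclassified.
X = [(1.0, 0.0, 1.0), (0.0, -1.0, -1.0), (-1.0, -0.5, -1.0)]
Y = [-1, 1, 1]                        # desired outputs y1, y2, y3
w = [1.0, -1.0, 0.0]                  # initial weight vector w0
eta = 0.1

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sign(v):
    return 1 if v >= 0 else -1        # sign convention at v = 0 assumed

for step in range(6):                 # steps 1..6 cycle through x1, x2, x3
    x, y = X[step % 3], Y[step % 3]
    v = dot(w, x)
    if sign(v) != y:                  # correction needed
        w = [wi + eta * (y - sign(v)) * xi for wi, xi in zip(w, x)]

print(w)   # w6, the final weight vector
```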


- Architecture of Adaline
- Applications
- Filtering
- Communication

- Learning algorithm (Least Mean Square, LMS)
- Y = purelin(ΣWX - b) = W1X1 + W2X2 - b
- W(t+1) = W(t) + 2ηe(t)X(t)
- b(t+1) = b(t) + 2ηe(t)
- e(t) = T - Y
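A minimal LMS sketch for this unit, on hypothetical data generated from a known linear target. Note that with Y = WX - b the bias acts as a weight on a constant input of -1, so its update here carries a minus sign:

```python
# LMS (Widrow-Hoff) learning sketch for an Adaline unit with the model
# Y = W1*X1 + W2*X2 - b. The training data below are hypothetical,
# generated from the target y = 2*x1 - x2 - 0.5.
w = [0.0, 0.0]
b = 0.0
eta = 0.05

def predict(x):
    return w[0] * x[0] + w[1] * x[1] - b

data = [((x1, x2), 2 * x1 - x2 - 0.5)
        for x1 in (-1.0, -0.5, 0.0, 0.5, 1.0) for x2 in (-1.0, 0.0, 1.0)]

for epoch in range(500):
    for x, t in data:
        e = t - predict(x)            # e(t) = T - Y
        w[0] += 2 * eta * e * x[0]    # W(t+1) = W(t) + 2*eta*e(t)*X(t)
        w[1] += 2 * eta * e * x[1]
        b -= 2 * eta * e              # bias = weight on a constant -1 input,
                                      # so its update carries a minus sign
```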

Fig.14

W1

X2

W2

Y

Weight

-1

b

Input Layer

Output Layer

- XOR problem
- Plotted over inputs in {-1, 1}², the OR and AND patterns (○ vs ×) are linearly separable, but the XOR patterns are not: no single line can separate the two classes.

- Introduction
- Single Layer Perceptron – Perceptron
- Example
- Single Layer Perceptron – Adaline
- Multilayer Perceptron – Back–propagation neural network
- Competitive Learning - Example
- Radial Basis Function (RBF) Networks
- Q&A and Homework

Fig. 15 Network architectures: a taxonomy of feed-forward and recurrent/feedback network architectures.

(Input layer x1, …, xn → hidden layers with weights W(1), W(2), …, W(L) → output layer y1, …, yn)

Fig. 16 A typical three-layer feed-forward network architecture.

- The most popular class of networks
- which can form arbitrarily complex decision boundaries and represent any Boolean function
- Back-propagation

- Let
- Squared-error cost function
- A geometric interpretation

Fig.17

(Input vector → input layer → hidden layer → output layer → output vector)

- In 1985
- Architecture

Fig.18

- Using the gradient (steepest) descent method to reduce the error.
- Energy function E = (1/2) Σj (Tj - Yj)²
- Weight-update rules are derived layer by layer: output layer → hidden layer, then hidden layer → hidden layer.
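The procedure can be sketched for a small network; the 2-2-1 topology, XOR data, learning rate, and initialization below are assumptions for illustration:

```python
# Back-propagation sketch for a small 2-2-1 sigmoid network trained on XOR
# with gradient descent on E = (1/2)*(T - Y)^2. The topology, learning
# rate, and random initialization are assumptions for illustration.
import math, random

random.seed(1)
rand = lambda: random.uniform(-1.0, 1.0)

# W1[j] = [w_from_x1, w_from_x2, bias] for hidden unit j; W2 = [w_h0, w_h1, bias]
W1 = [[rand(), rand(), rand()] for _ in range(2)]
W2 = [rand(), rand(), rand()]
sig = lambda v: 1.0 / (1.0 + math.exp(-v))

def forward(x):
    h = [sig(W1[j][0] * x[0] + W1[j][1] * x[1] + W1[j][2]) for j in range(2)]
    y = sig(W2[0] * h[0] + W2[1] * h[1] + W2[2])
    return h, y

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR
eta = 0.5

for epoch in range(20000):
    for x, t in data:
        h, y = forward(x)
        delta_o = (t - y) * y * (1 - y)        # output-layer delta
        delta_h = [delta_o * W2[j] * h[j] * (1 - h[j]) for j in range(2)]
        for j in range(2):                     # output <- hidden updates
            W2[j] += eta * delta_o * h[j]
        W2[2] += eta * delta_o                 # output bias
        for j in range(2):                     # hidden <- input updates
            W1[j][0] += eta * delta_h[j] * x[0]
            W1[j][1] += eta * delta_h[j] * x[1]
            W1[j][2] += eta * delta_h[j]

err = sum(0.5 * (t - forward(x)[1]) ** 2 for x, t in data)
print(err)   # total squared error after training
```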

- Introduction
- Single Layer Perceptron – Perceptron
- Example
- Single Layer Perceptron – Adaline
- Multilayer Perceptron – Back–propagation neural network
- Competitive Learning - Example
- Radial Basis Function (RBF) Networks
- Q&A and Homework

- Known as the winner-take-all method
- It is an unsupervised learning scheme
- Often clusters or categorizes the input data

- The simplest network

Fig.19

- A geometric interpretation of competitive learning

Fig. 20 (a) Before learning (b) after learning

Fig.21
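A winner-take-all sketch on hypothetical 2-D data: only the unit closest to each input updates its weight vector, pulling it toward that input:

```python
# Winner-take-all competitive learning sketch: for each input, only the
# closest ("winning") unit moves its weight vector toward the input, so
# the units drift to the cluster centers. The 2-D data are hypothetical.
import math, random

random.seed(0)

# two point clouds around (0, 0) and (2, 2)
data = ([(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(50)] +
        [(random.gauss(2, 0.1), random.gauss(2, 0.1)) for _ in range(50)])
random.shuffle(data)

w = [[0.5, 0.0], [1.5, 2.5]]   # initial weight vectors of the two units
eta = 0.1

def winner(x):
    d = [math.dist(x, wi) for wi in w]   # Euclidean distance to each unit
    return d.index(min(d))

for epoch in range(20):
    for x in data:
        k = winner(x)                    # competition: one winner per input
        w[k][0] += eta * (x[0] - w[k][0])
        w[k][1] += eta * (x[1] - w[k][1])

print(w)   # each weight vector has moved near one cluster center
```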

- Introduction
- Single Layer Perceptron – Perceptron
- Example
- Single Layer Perceptron – Adaline
- Multilayer Perceptron – Back–propagation neural network
- Competitive Learning - Example
- Radial Basis Function (RBF) Networks
- Q&A and Homework

- A special class of feed-forward networks
- Origin: Cover’s Theorem
- Radial basis function (kernel function)
- Gaussian function

Fig.22 RBF network: inputs x1, x2 feed the radial-basis (kernel) units ψ1, ψ2.

- There are a variety of learning algorithms for the RBF network
- The basic one is a two-step learning strategy
- Hybrid learning
- Converges much faster than back-propagation
- But involves a larger number of hidden units
- Runtime speed (after training) is slower
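The two-step strategy can be sketched for a hypothetical 1-D regression problem: fix the centers first, then solve the linear output weights by least squares:

```python
# Two-step RBF learning sketch with Gaussian kernels: (1) fix the centers
# (here a uniform grid, an assumption), then (2) solve the linear output
# weights by least squares. The 1-D sine target is hypothetical.
import math

def phi(x, c, width=1.0):
    return math.exp(-((x - c) ** 2) / (2.0 * width ** 2))

xs = [i / 10.0 for i in range(-20, 21)]
ys = [math.sin(x) for x in xs]

centers = [-2.0, -1.0, 0.0, 1.0, 2.0]          # step 1: fixed centers
n = len(centers)

# step 2: least squares via the normal equations A w = b
Phi = [[phi(x, c) for c in centers] for x in xs]
A = [[sum(row[i] * row[j] for row in Phi) for j in range(n)] for i in range(n)]
b = [sum(row[i] * y for row, y in zip(Phi, ys)) for i in range(n)]

for i in range(n):                             # Gauss-Jordan elimination
    piv = A[i][i]                              # (A is positive definite,
    for j in range(i, n):                      #  so no pivoting needed)
        A[i][j] /= piv
    b[i] /= piv
    for r in range(n):
        if r != i and A[r][i] != 0.0:
            f = A[r][i]
            for j in range(i, n):
                A[r][j] -= f * A[i][j]
            b[r] -= f * b[i]

weights = b                                    # output-layer weights

def predict(x):
    return sum(wi * phi(x, c) for wi, c in zip(weights, centers))
```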

- The efficiencies of RBF network and multilayer perceptron are problem-dependent.

- How many layers are needed for a given task?
- How many units are needed per layer?
- Generalization ability
- How large should the training set be for 'good' generalization?
- Although multilayer feed-forward networks have been widely used, their parameters must still be determined by trial and error.

- Neural networks
- Neural Networks (The Official Journal of the International Neural Network Society, INNS)
- IEEE Transactions on Neural Networks
- International Journal of Neural Systems
- Neurocomputing
- Neural Computation

- Artificial Intelligence (AI)
- Artificial Intelligence: A Modern Approach (2nd Edition), Stuart J. Russell and Peter Norvig

- Machine learning
- Machine Learning, Tom M. Mitchell
- Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Jyh-Shing Roger Jang, Chuen-Tsai Sun, and Eiji Mizutani

- Neural networks
- 類神經網路模式應用與實作, 葉怡成
- 應用類神經網路, 葉怡成
- 類神經網路 – MATLAB的應用, 羅華強
- Neural Networks: A Comprehensive Foundation (2nd Edition), Simon Haykin
- Neural Network Design, Martin T. Hagan, Howard B. Demuth, and Mark H. Beale

- Genetic Algorithm
- Genetic Algorithms in Search, Optimization, and Machine Learning, David E. Goldberg
- Genetic Algorithms + Data Structures = Evolution Programs, Zbigniew Michalewicz
- An Introduction to Genetic Algorithms for Scientists and Engineers, David A. Coley

- Use a two-layer perceptron to solve the OR problem.
- Draw the topology (structure) of the neural network, including the number of nodes in each layer and the associated weight linkages.
- Discuss how the initial parameters (weights, bias, learning rate) affect the learning process.
- Discuss the difference between batch-mode learning and on-line learning.

- Use a two-layer perceptron to solve the XOR problem.
- Discuss why it cannot solve the XOR problem.