Artificial Neural Networks
  • Introduction
  • Design of Primitive Units
    • Perceptrons
  • The Backpropagation Algorithm
  • Advanced Topics
Basics

In contrast to perceptrons, multilayer networks can learn not only multiple decision boundaries, but those boundaries may also be nonlinear.

[Figure: a feedforward network drawn in three layers, with input nodes at the bottom, internal (hidden) nodes in the middle, and output nodes at the top.]

One Single Unit

To make nonlinear partitions of the input space, we need each unit to compute a nonlinear function (unlike the perceptron). One solution is to use the sigmoid unit.

[Figure: a sigmoid unit. The inputs x1, …, xn, plus the constant input x0 = 1, are weighted by w0, w1, …, wn and summed to give net = Σi wi xi; the unit outputs O = σ(net) = 1 / (1 + e^(−net)).]
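As a minimal sketch of this unit in Python with NumPy (the function names here are illustrative, not part of the slides):

```python
import numpy as np

def sigmoid(net):
    """The logistic function: sigma(net) = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + np.exp(-net))

def unit_output(w, x):
    """Output of one sigmoid unit with weight vector w = (w0, w1, ..., wn)."""
    x = np.concatenate(([1.0], x))  # prepend the constant input x0 = 1
    net = np.dot(w, x)              # net = sum_i w_i * x_i
    return sigmoid(net)
```

For instance, unit_output(np.array([0.0, 1.0, 1.0]), np.array([2.0, -1.0])) computes σ(0 + 2 − 1) = σ(1) ≈ 0.73.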

More Precisely

O(x1, x2, …, xn) = σ(W · X)

where σ(W · X) = 1 / (1 + e^(−W · X)) and W · X = Σi wi xi.

Function σ is called the sigmoid or logistic function. It has the following property:

dσ(y)/dy = σ(y)(1 − σ(y))
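This identity is easy to confirm numerically; a small self-contained check (the value of y is chosen arbitrarily):

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

# Compare a central finite difference with sigma(y) * (1 - sigma(y)).
y, eps = 0.3, 1e-6
numeric = (sigmoid(y + eps) - sigmoid(y - eps)) / (2 * eps)
analytic = sigmoid(y) * (1 - sigmoid(y))
print(numeric, analytic)  # both are about 0.2445
```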

Backpropagation Algorithm

Goal: to learn the weights for all links in an interconnected multilayer network.

We begin by defining our measure of error:

E(W) = ½ Σd Σk (tkd − okd)²

where k ranges over the output nodes and d over the training examples.

The idea is again to use gradient descent over the space of weights to find a global minimum (there is no guarantee of finding it).
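In code, the error above can be computed as follows (a sketch; the example-by-output array layout is an assumption):

```python
import numpy as np

def squared_error(t, o):
    """E(W) = 1/2 * sum_d sum_k (t_kd - o_kd)^2, where row d is a
    training example and column k an output node."""
    return 0.5 * np.sum((t - o) ** 2)
```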

Output Nodes

[Figure: the multilayer network with its output nodes highlighted.]

Algorithm


  • Create a network with nin input nodes, nhidden internal nodes, and nout output nodes.
  • Initialize all weights to small random numbers.
  • Until the error is small, do:
    • For each example X:
      • Propagate example X forward through the network.
      • Propagate the errors backward through the network (a skeleton of this loop is sketched below).
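A skeleton of these steps in Python/NumPy might look like this (the layer sizes and the ±0.05 initialization range are illustrative assumptions; the two passes are sketched on the next slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 2, 3, 1   # illustrative layer sizes

# Create the network and initialize all weights to small random numbers.
# W_h: bias-extended inputs -> hidden nodes; W_o: hidden nodes -> outputs.
W_h = rng.uniform(-0.05, 0.05, size=(n_hidden, n_in + 1))
W_o = rng.uniform(-0.05, 0.05, size=(n_out, n_hidden + 1))

# Until the error is small:
#     for each example (x, t):
#         propagate x forward, then propagate the errors backward
```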
Propagating Forward

Given example X, compute the output of every node until we reach the output nodes:

[Figure: example X enters at the input nodes; each internal node and output node computes the sigmoid function of its weighted inputs, layer by layer, until the output nodes are reached.]
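Continuing the sketch above (reusing sigmoid and the two weight matrices from the earlier snippets):

```python
def forward_pass(W_h, W_o, x):
    """Compute the output of every node, layer by layer."""
    x1 = np.concatenate(([1.0], x))    # bias input x0 = 1
    o_h = sigmoid(W_h @ x1)            # internal (hidden) node outputs
    h1 = np.concatenate(([1.0], o_h))  # bias for the output layer
    o = sigmoid(W_o @ h1)              # output node outputs
    return o_h, o
```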

Propagating Error Backward
  • For each output node k compute the error:
      δk = Ok(1 − Ok)(tk − Ok)
  • For each hidden unit h, calculate the error:
      δh = Oh(1 − Oh) Σk Wkh δk
  • Update each network weight (as sketched below):
      Wji = Wji + ΔWji, where ΔWji = η δj Xji
    (Xji is the input from node i to node j, Wji the corresponding weight, and η the learning rate).
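These three formulas translate directly into code (continuing the sketch above; η = 0.1 is an illustrative default):

```python
def backward_pass(W_h, W_o, x, t, eta=0.1):
    """One backpropagation update for a single example (x, t)."""
    o_h, o = forward_pass(W_h, W_o, x)
    x1 = np.concatenate(([1.0], x))
    h1 = np.concatenate(([1.0], o_h))

    # Output node errors: delta_k = O_k (1 - O_k)(t_k - O_k)
    delta_o = o * (1 - o) * (t - o)
    # Hidden node errors: delta_h = O_h (1 - O_h) sum_k W_kh delta_k
    # (the bias column of W_o has no hidden node behind it, so drop it)
    delta_h = o_h * (1 - o_h) * (W_o[:, 1:].T @ delta_o)

    # Weight updates: Delta W_ji = eta * delta_j * x_ji
    W_o = W_o + eta * np.outer(delta_o, h1)
    W_h = W_h + eta * np.outer(delta_h, x1)
    return W_h, W_o
```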
Adding Momentum
  • The weight update rule can be modified so as to depend on the update made in the previous iteration. At iteration n we have (see the sketch below):
      ΔWji(n) = η δj Xji + α ΔWji(n − 1)
  • where α (0 ≤ α ≤ 1) is a constant called the momentum.
    • It can carry gradient descent through small local minima.
    • It increases the step size along flat regions of the error surface.
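As a sketch on top of the update above (α = 0.9 is an illustrative choice, not a value from the slides):

```python
def momentum_delta(dW_prev, delta, x, eta=0.1, alpha=0.9):
    """Delta W(n) = eta * delta_j * x_ji + alpha * Delta W(n - 1).
    Apply with W += dW, then keep dW around for the next iteration."""
    return eta * np.outer(delta, x) + alpha * dW_prev
```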
Remarks on Backpropagation
  • It implements a gradient descent search over the weight space.
  • It may become trapped in local minima.
  • In practice, it is very effective.
  • How to avoid local minima?
    • Add momentum.
    • Use stochastic gradient descent.
    • Train different networks with different initial values for the weights.
Representational Power
  • Boolean functions. Every boolean function can be represented by a network having two layers of units, although the number of hidden units required may grow exponentially with the number of inputs.
  • Continuous functions. Every bounded continuous function can be approximated to arbitrary accuracy by a network having two layers of units.
  • Arbitrary functions. Any function can be approximated to arbitrary accuracy by a network having three layers of units.
Hypothesis Space
  • The hypothesis space is a continuous space, as opposed to the discrete spaces explored by decision trees and the candidate-elimination algorithm.
  • The bias is a smooth, “similarity-based” bias: points close to each other are expected to share the same class.

Example Again

[Figure: a two-dimensional input space with axes x1 and x2, partitioned into smooth decision regions.]

Hidden Representations

An interesting property of neural networks is their ability to capture intermediate representations within the hidden nodes.

[Figure: an 8-3-8 encoder network. The eight one-hot input patterns 10000000, 01000000, 00100000, …, 00000001 are mapped through three hidden nodes back to the identical eight output patterns.]

The hidden nodes learn to encode each of the eight patterns using three bits.
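This classic 8-3-8 experiment can be reproduced with the earlier forward_pass and backward_pass sketches (the learning rate and iteration count here are guesses, not values from the slides):

```python
import numpy as np

X = np.eye(8)   # the eight one-hot patterns, used as both input and target

rng = np.random.default_rng(0)
W_h = rng.uniform(-0.05, 0.05, size=(3, 9))   # 8 inputs + bias -> 3 hidden
W_o = rng.uniform(-0.05, 0.05, size=(8, 4))   # 3 hidden + bias -> 8 outputs

for _ in range(5000):                  # train on the identity mapping
    for x in X:
        W_h, W_o = backward_pass(W_h, W_o, x, t=x, eta=0.3)

for x in X:                            # print the 3-value hidden code
    print(np.round(forward_pass(W_h, W_o, x)[0], 2))
```

After training, the three hidden values per pattern settle near 0 or 1, so the eight patterns get roughly a 3-bit binary code.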

Node Evolution

[Figure: two training plots. Left: the error of different units as a function of iterations. Right: the values of different weights as a function of iterations.]

Generalization and Overfitting

One obvious stopping criterion for backpropagation is to continue iterating until the error falls below some threshold; this, however, can lead to overfitting.

[Figure: error versus number of weight updates. The training-set error keeps decreasing, while the validation-set error eventually begins to rise.]
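A minimal early-stopping loop built on the earlier sketches (train and valid are assumed to be lists of (x, t) pairs; real implementations usually add a patience window and keep the best weights seen):

```python
def train_early_stopping(W_h, W_o, train, valid, max_epochs=10_000):
    """Stop as soon as the validation-set error rises."""
    best = float("inf")
    for _ in range(max_epochs):
        for x, t in train:
            W_h, W_o = backward_pass(W_h, W_o, x, t)
        v_err = sum(0.5 * np.sum((t - forward_pass(W_h, W_o, x)[1]) ** 2)
                    for x, t in valid)
        if v_err > best:
            break            # validation error went up: stop training
        best = v_err
    return W_h, W_o
```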

Solutions
  • Use a validation set and stop once the error on this set is small (as sketched above).
  • Use 10-fold cross-validation.
  • Use weight decay: the weights are decreased slowly on each iteration (as sketched below).
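For the weight-decay option, a common form multiplies each weight by a factor slightly less than one on every update (the decay coefficient γ here is an illustrative assumption):

```python
def decayed_update(W, dW, eta=0.1, gamma=1e-4):
    """Apply the backprop weight change dW, then shrink all weights
    slightly: W <- (1 - eta * gamma) * W + dW."""
    return (1.0 - eta * gamma) * W + dW
```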

Artificial Neural Networks
  • Introduction
  • Design of Primitive Units
    • Perceptrons
  • The Backpropagation Algorithm
  • Advanced Topics
Advanced Topics
  • Alternative error functions: correcting the derivative of the function, minimizing cross-entropy, etc.
  • Dynamically modifying the network structure.
  • Recurrent networks:
    • applied to time-series analysis;
    • the output at time t serves as input to other units at time t + 1.