Neural Networks

Mundhenk and Itti, 2008

CS460, Fall 2008, L. Itti

What are neural networks
  • Neural Networks are diverse and do many things
    • Some are meant to solve AI problems such as classification.
    • Some are meant to simulate the workings of the brain and the nervous system for biological research.

What are neural networks
  • Neural networks use many nodes (neurons) connected by edges (axons, dendrites) to transform an input into a desired output
  • The neurons and their edges are adjusted, frequently using gradual changes to train the neural network.
  • Neural networks can be trained many ways
    • Actor / Critic Learning
      • external reinforcement
    • Hebbian Learning (association)
      • Internal reinforcement
    • Linear Algebra (closed form solution)
  • We tend to use gradient-descent-like learning to make incremental changes over time (see the sketch below).
  • We can introduce Boltzmann-like (stochastic) mechanisms in some networks.
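
A rough illustration of the gradient-descent-style learning referred to above; the data, learning rate, and step count are illustrative assumptions, not values from the lecture:

```python
import numpy as np

# Minimal sketch of gradient-descent learning: nudge a weight vector
# downhill on a squared-error loss, a little at a time.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # inputs
t = X @ np.array([1.5, -0.7])            # targets from a known linear rule
w = np.zeros(2)                          # weights to be learned
eta = 0.1                                # learning rate

for _ in range(200):
    y = X @ w                            # network output (a single linear unit)
    grad = X.T @ (y - t) / len(X)        # gradient of 0.5 * mean squared error
    w -= eta * grad                      # small incremental change each step

print(w)                                 # converges toward [1.5, -0.7]
```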

Examples of neural networks (not covered here, but important)
  • Classical McCulloch-Pitts model (1943)
    • One of the first neural network models devised.
    • A very basic binary neuron model designed to test the early feasibility of neural computation.
  • Hopfield Networks (1982)
    • An associative memory network.
    • Can be used to solve the traveling salesman problem (TSP), but not very well.
  • Kohonen Networks (1982)
    • Classify things based on similarity.
    • Need a metric over the properties of things in order to say how similar they are.

Single Layer Perceptron

[Figure: scatter plot of Accountants, Engineers, and Hair Dressers on two feature axes, "Good at Math" vs. "Likes Star Trek".]

Single Layer Perceptron

[Figure: the same scatter plot, now carved by a single layer perceptron into linear decision regions Ri, Rj, and Rk, one per profession.]

Single Layer Perceptron

[Figure: network diagram of a single layer perceptron. Inputs x0, x1, ..., xd (features such as "Likes Star Trek" and "Good at Math") connect through weights wki to outputs y1, ..., yc, which answer "Engineer / Hair Dresser / Accountant?".]

Single Layer Perceptron

Training involves minimizing the error in the network. As such, we seek to minimize the difference between the expected output and what actually comes out of the network. We do this by changing the weights in the network in a logical fashion.

[Equation: the gradient-descent weight update under the perceptron criterion, annotated with the target variable (output), the vector of activation (input), and a learning rate.]
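
A plausible reconstruction of the update that the labels above annotate, in the standard perceptron-criterion form (the exact notation on the original slide is not recoverable):

```latex
% Gradient descent on the perceptron criterion, for each misclassified
% input vector x_n with target t_n \in \{-1, +1\} and learning rate \eta:
w^{(\tau+1)} = w^{(\tau)} + \eta \, x_n \, t_n
% Equivalently, as a delta rule on the output error:
\Delta w_i = \eta \,(t - y)\, x_i
```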

Notes about Single Layer Perceptrons
  • The classification problem must be linearly separable (more on that in the next slides)
  • Can be solved with a closed-form linear-algebra solution (e.g. least squares), so there is no need to actually train (see the sketch below).
  • Simple to create and use, but limited in power.
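
A minimal sketch of what such a closed-form solution can look like, assuming the usual least-squares / pseudoinverse formulation (an illustration, not necessarily the exact method the lecture has in mind):

```python
import numpy as np

# "Train" a single-layer linear network in closed form with least squares:
# W = pinv(X) @ T, no iterative weight updates needed.
# The toy data below is an illustrative assumption.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                 # 200 samples, 3 input features
X = np.hstack([X, np.ones((200, 1))])         # absorb the bias as an always-on input
T = (X[:, 0] - X[:, 1] > 0).astype(float).reshape(-1, 1)   # linearly separable targets

W = np.linalg.pinv(X) @ T                     # closed-form weight solution
accuracy = np.mean(((X @ W) > 0.5) == T)
print(accuracy)                               # high accuracy, with no training loop
```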

Linear Separability
  • Sometimes regions cannot be separated by a single straight line
  • Can we draw a single line that divides the state "Ohionois" from Indiana?
  • Single layer perceptrons fail at this, since each output draws only a single linear decision boundary (see the XOR example below).
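
The textbook minimal example of this failure is XOR (not on the slide, but standard): label (0,1) and (1,0) positive and (0,0) and (1,1) negative; no single line can separate the two classes:

```latex
% Suppose a linear boundary  w_1 x_1 + w_2 x_2 + b  did separate XOR:
%   w_2 + b > 0,\qquad w_1 + b > 0          (the two positive points)
%   b < 0,\qquad w_1 + w_2 + b < 0          (the two negative points)
% Adding the first two:  w_1 + w_2 + 2b > 0,  so  w_1 + w_2 + b > -b > 0,
% which contradicts  w_1 + w_2 + b < 0.  Hence no single line exists.
```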

Single Layer Perceptron

[Figure: the scatter plot again, now with College Students added among the Accountants and Engineers on the "Good at Math" vs. "Likes Star Trek" axes. No longer linearly separable!]

Solutions: Multi-Layer Perceptron (Back-Propagation Neural Network in practice)

[Figure: the same data, with Accountants, College Students, and Engineers now separated by the more complex decision boundaries a multi-layer perceptron can form.]

Multi Layer Perceptron

[Figure: network diagram of a multi-layer perceptron. Inputs x0, x1, ..., xd (e.g. "Good At Math", "Likes Star Trek") connect through weights wij to hidden units z0, z1, ..., zm, each followed by a sigmoid; the hidden units connect through weights wkj, again through sigmoids, to outputs y1, ..., yc. Note: the bias has been absorbed into the computation weights.]
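
A minimal sketch of the forward pass this diagram describes (two weight layers, sigmoids after the hidden units, bias absorbed as an always-on input); the layer sizes and values are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Forward pass of a two-layer perceptron matching the diagram:
#   inputs x0..xd  --w_ij-->  hidden z0..zm (sigmoid)  --w_kj-->  outputs y1..yc
# x0 and z0 are fixed at 1 so the biases are absorbed into the weights.
d, m, c = 2, 4, 3                         # illustrative layer sizes
rng = np.random.default_rng(2)
W1 = rng.normal(size=(m, d + 1))          # hidden weights w_ij (bias column included)
W2 = rng.normal(size=(c, m + 1))          # output weights w_kj (bias column included)

x = np.array([0.8, 0.1])                  # e.g. "good at math", "likes Star Trek"
x = np.concatenate(([1.0], x))            # absorb bias: x0 = 1
z = sigmoid(W1 @ x)                       # hidden activations z1..zm
z = np.concatenate(([1.0], z))            # absorb bias: z0 = 1
y = sigmoid(W2 @ z)                       # outputs y1..yc (final sigmoid is optional)
print(y)
```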

Multi Layer Perceptron
  • Adding a sigmoid after each node is especially important in a multi layer perceptron
    • Without it, a multi layer perceptron is just the composition of two linear systems, and the composition of linear systems is itself a linear system, so adding a second layer buys us nothing (see the equation below).
    • The non-linearity is the icing on the multi-layer cake that is the multi-layer perceptron.
    • The sigmoid on the final output layer is optional, depending on the application.
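
In symbols: stacking two purely linear layers collapses to a single linear layer,

```latex
y = W_2\,(W_1 x) = (W_2 W_1)\,x ,
\qquad\text{whereas}\qquad
y = W_2\,\sigma(W_1 x)
```

does not collapse, because the sigmoid breaks the linearity.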

Multi Layer Perceptron
  • Main Problem:
    • Assignment of credit in the “hidden” layers
      • How do you find the weights for the hidden layers when their output is further transformed by the next layer?


Assignment of Credit
  • Example:
  • A mad scientist wants to make billions of dollars by controlling the stock market. He will do this by controlling the stock purchases of several wealthy people. The scientist controls insider information that CEOs can give to brokers, who can forward it to the wealthy people. He also has a device that controls how much different people trust each other.
  • Using his ability to plant insider information and control trust between people, he will control the purchases made by wealthy individuals and manipulate the stock market.


[Figure: the analogy as a layered network. Planted insider information feeds the fat-cat CEOs; their information passes, weighted by trust, to the brokers; the brokers' information passes, weighted by trust, to the rich dudes, who make the purchases.]

Idea
  • For information pathways that produce the purchases you desire, do nothing; for pathways that produce purchases you do not desire, alter the trust.
  • Carefully adjust trust over several attempts until the purchases are what you desire (gradient descent).
  • Remember: adjustments are (usually) proportional to the sum of squared error differences (written out below).
    • Adjust weights in proportion to the difference between what you got out of the network and what you expected.
    • Note: there are other methods for measuring error.
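
Written out, the "sum of squared error differences" and the proportional adjustment typically take the form (standard notation, not copied from the slide):

```latex
E = \tfrac{1}{2}\sum_{n}\bigl(y_n - t_n\bigr)^2 ,
\qquad
\Delta w \;\propto\; -\,\frac{\partial E}{\partial w}
```

where y_n is what came out of the network and t_n is what you expected.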


[Figure: the analogy network with its outputs scored 65%, 25%, and 95% correct. The weight changes are graded accordingly: do some for the 65%-correct pathway, do much for the 25%-correct pathway, do not much for the 95%-correct pathway, and weaken connections that pushed the purchases the wrong way.]


[Figure: the same network, with the nodes feeding the 25%-correct output marked as high error contributors and those feeding the 95%-correct output marked as low error contributors.]

Backward Propagation

[Figure: the graded adjustments (do much, do some, do not much, weaken) are propagated backward from the purchases toward the planted insider information.]


Simple case: each weight is updated by subtracting from it an error term scaled by a learning rate and by the input (activation) feeding that weight. The update can be applied after every single pattern (single update) or summed over all patterns before the weights change (batch update).

[Equations: the single and batch forms of the update for the layer-one and layer-two weights, annotated with the error, the error term, the learning rate, and the derivative of the activation function (sigmoid).]
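
A plausible reconstruction of the updates those labels annotate, in standard back-propagation notation (the symbols on the original slide are not recoverable):

```latex
% Error terms (deltas), using the derivative of the sigmoid activation \sigma':
\delta_k = (y_k - t_k)\,\sigma'(a_k)             % layer two (output units)
\qquad
\delta_j = \sigma'(a_j)\sum_k w_{kj}\,\delta_k    % layer one (hidden units)

% Single update (after one pattern), with learning rate \eta:
\Delta w_{kj} = -\eta\,\delta_k\,z_j ,\qquad \Delta w_{ji} = -\eta\,\delta_j\,x_i

% Batch update: sum the same terms over all patterns n before changing the weights:
\Delta w_{kj} = -\eta\sum_n \delta_k^{(n)}\,z_j^{(n)}
```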


NSL Matlab Back Propagation Model


Pretty Outputs

[Figure: plots of the Layer 1 and Layer 2 activations from the NSL Matlab back propagation model.]

Double Layer Perceptron
  • Any given decision boundary can be approximated arbitrarily closely by a two-layer network with sigmoidal activation functions.
  • May use the logistic sigmoid or the tanh sigmoid.
  • May be thought of as a linear model built on a set of unknown basis functions that are learned along with it. (Thus, if you already know the right basis functions, you don't need a double layer perceptron.)
  • Is frequently considered the second best solution to many problems: very flexible, but not necessarily optimal.

Multi Layer Perceptron

(Neural Networks for Pattern Recognition, Christopher M. Bishop, 1995, Oxford University Press, Oxford)


(Neural Networks for Pattern Recognition, Christopher M. Bishop, 1995, Oxford University Press, Oxford)

What about biological models?
  • Sometimes we create neural networks to simulate a brain (human or animal)
  • These models need to be biologically plausible so that we can draw conclusions from them about the workings of a natural neural mechanism
  • They are frequently complex and computationally expensive, but not always.
  • They sometimes have direct applications.
    • Biological models sometimes reverse-engineer processes that the brain carries out but that we do not yet know how to engineer ourselves.

Example – Prey Selection
  • This is the Didday / Amari-Arbib model of prey selection.
  • How do animals such as dragonflies know how to select and snap at prey?
  • An insect may contrast against a background such as the sky.
  • The prey's activity on the retina should be the maximum, but how does that location stick out?

Amari-Arbib Winner-Take-All Model (MaxSelector)

This is the model with global inhibition (TMB2 Sec. 3.4) that restructures the Didday model (TMB2 Sec. 3.3), which has a whole layer of inhibitory neurons.
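
A minimal simulation sketch of a winner-take-all layer with pooled global inhibition, loosely in the spirit of the Ulayer/Vlayer arrangement described here; all constants are illustrative assumptions, not the published model's parameters:

```python
import numpy as np

def max_selector(inputs, steps=200, dt=0.1):
    """Toy winner-take-all: only the unit with the largest input stays active."""
    inputs = np.asarray(inputs, dtype=float)
    u = inputs.copy()                      # Ulayer: one excitatory unit per location
    for _ in range(steps):
        fu = np.maximum(u, 0.0)            # rectified Ulayer output
        v = fu.sum()                       # Vlayer stand-in: pooled (global) activity
        # each unit is driven by its input and its own activity,
        # and inhibited by the pooled activity of the whole layer
        u += dt * (-u + inputs + 1.5 * fu - v)
    return np.maximum(u, 0.0)

print(max_selector([0.3, 0.9, 0.5]))       # only the middle (largest) input survives
```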

The Two NSL Modules of the Maximum Selector Model

MaxSelector module (parent module) with two interconnected modules Ulayer and Vlayer (child modules).


Visualizing the MaxSelector

[Figure: a leaky-beaker analogy. Water flows from a spigot into a leaky beaker sitting on a scale; the weight in the scale controls the spigot, so more weight means faster flow; a trap door under the beaker can dump its contents.]


Visualizing the MaxSelector

[Figure: below a certain threshold the trap door opens, and it remains open unless the flow can get back above threshold; inflow and outflow balance at 1.]


Max Selector

[Figure: NSL block diagram of the Max Selector: the Ulayer (with leak, constant, and time-constant parameters) and the Vlayer (with a constant parameter and a step-function output) connected in a loop.]

Dominey, Arbib & Joseph Model

Review…
  • Takes in visual features
  • Decides where to saccade (move eyes) to, given input features
  • Can learn sequences of eye movements
  • Uses dopamine (a neurotransmitter) to modulate reinforcement learning

A little bio review
  • The brain contains many neurotransmitters. Today we are interested in:
    • Dopamine – Related to reinforcement learning and reward
    • GABA – Is a general purpose neural inhibitor
    • Other related neurotransmitters: Acetylcholine and Norepinephrine may also be related to reinforcement learning but will not be covered.

Dopamine fun-facts
  • Is related to reinforcement learning
  • Is related to reward
  • Is particularly present in SNc and Striatum
  • Connects strongly with GABAergic interneurons and may provide indirect inhibition through them.
  • Is implicated in Parkinson’s disease and schizophrenia
  • Dopamine agonists include amphetamines as well as the precursor L-DOPA used to treat Parkinson's disease.
  • Dopamine antagonists include the “typical” neuroleptics used to treat schizophrenia (e.g. Thorazine, Haldol)

GABA fun-facts
  • Is short for γ-aminobutyric acid
  • Is related to inhibition of neural activity
  • GABAergic interneurons provide most of the brain’s inhibition
  • They act as brakes and gates in the brain
  • Is implicated in anxiety and perhaps epilepsy
  • GABA agonists include the benzodiazepines used to treat anxiety (e.g. Valium) and epilepsy

A little bio review (cont’d)
  • We will be interested in a few major parts of the brain, for instance:
    • Basal Ganglia (BG) – Plays a major role in the critic and in reinforcement
    • Prefrontal Cortex (PFC) – Is associated with working memory and task-related storage
    • Inferotemporal Cortex (IT) – Is related to the “what” pathway and feature understanding
    • Posterior Parietal Cortex (PP) – Is related to the “how” pathway and feature location


Where are these things?

[Figure: brain diagram locating the BG, caudate, thalamus, PP, superior colliculus, SNr/SNc, PFC, and IT.]

What this model should do
  • Learn proper motor reaction to some perceived stimulus.
    • In this case a visual feature tells us where we should saccade to
  • Learn a sequence of motor reactions
    • We augment the model such that it not only saccades to the correct location, but can repeat sequences of saccades
  • Utilize reinforcement learning to build meaningful neural connections to create correct saccades
    • We will reinforce connections from IT to learn correct saccades
    • We will reinforce reciprocal connections with PFC to learn sequences

Why this is cool
  • Demonstrates how we could learn complex sequences such as eye movements, speech or grasping movements.
    • Take note of how this model may be generalized!
  • Shows how dopamine may work in the brain and gives us clues to its workings including its involvement in several disease processes.

Model Generalization: Scenario
  • You know that monkeys make better stock purchases than brokers, so you want to create a machine that uses monkeys to make stock purchases for you. You get a bunch of monkeys that pull on a lever when they see stock X’s P/E graph in the Wall Street Journal. The monkeys always pull in the same way, but you don’t know how hard they will pull. This produces an action: how much of stock X to purchase or sell.


[Figure: lever-pulling monkeys watch a bar graph of stock X’s P/E.]


[Figure: the monkeys watch stock X’s P/E and pull their levers; each lever sends a current down some wires, which increases or decreases a peanut smell, making hamsters run faster or slower, which increases or decreases a flow of water, making buckets heavier or lighter.]


[Figure: the full chain: monkeys watching stock X’s P/E pull levers, currents adjust a peanut smell, hamsters run faster or slower, the water flow makes buckets heavier or lighter, and the buckets tip a scale whose reading (BUY X / SELL X) tells how much of stock X to buy or sell.]


[Figure: the same machine with feedback added: after each BUY X / SELL X decision, the question “Was this what you wanted?” produces punishment or reward; connections that were active during a wrong choice are decreased, and the less active ones are increased.]


Where to saccade to?

[Figure: the brain version of the scheme: visual features from V4 and IT project through learned weights to the caudate, with the SNc/striatum and SNr in the loop; the FEF and superior colliculus then choose a left or right saccade. Note: this is way oversimplified!]

Model Working Overview (Single Saccade)
  • Abstract visual features come in through IT.
  • The signal is sent to the caudate and causes a random saccade.
  • PP/FEF note how far off the saccade is and signal for dopamine reinforcement. Weights from IT are adjusted according to the reinforcement signal.
    • We use Hebbian and anti-Hebbian rules to update the weights (see the sketch below).
  • The next saccade is less random and creates a gradient in reinforcement.
  • IT connection weights to the caudate are adjusted until the saccade goes where it is supposed to.
  • The only learning at this stage is on the IT-to-caudate weights.
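
A minimal sketch of a reinforcement-modulated Hebbian / anti-Hebbian update of the IT-to-caudate weights as described above; the function name, learning rate, and normalization step are illustrative assumptions, not the model's published equations:

```python
import numpy as np

def update_it_to_caudate(W, it_activity, caudate_activity, reinforcement, lr=0.05):
    """Hebbian when reinforcement > 0, anti-Hebbian when reinforcement < 0."""
    # strengthen (or weaken) connections between co-active IT and caudate units
    W = W + lr * reinforcement * np.outer(caudate_activity, it_activity)
    # keep each caudate unit's incoming weight vector normalized (assumed normalizer)
    return W / np.maximum(np.linalg.norm(W, axis=1, keepdims=True), 1e-8)

# toy usage: a saccade close to the target earns positive reinforcement
rng = np.random.default_rng(3)
W = np.abs(rng.normal(size=(4, 6)))            # caudate units x IT features
it = np.array([0.0, 1.0, 0.2, 0.0, 0.0, 0.1])  # the active visual feature in IT
caud = np.array([0.1, 0.9, 0.0, 0.0])          # the caudate response it produced
W = update_it_to_caudate(W, it, caud, reinforcement=+1.0)
```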


Model Working Overview

[Figure: three panels (Start, After Some Training, Finished) showing the abstract visual features in IT (the monkeys), the weights from IT to the caudate (the wires), the caudate activation (the hamsters), and the SNr mask (the faucet tap).]

Rewards and Weights

[Figure: the weight-update loop. Visual input drives the IT cell firing rates (how hard did this monkey pull?); the old weights wij carry the IT activity to the caudate, which drives the saccade; dopamine activation, reflecting the reward contingency, combines with a constant and a normalizer to produce the new weights (change the wires from the monkeys).]


[Figure: the analogy with feedback. Stock X’s P/E drives the machine; after each BUY X / SELL X choice, punishment or reward adjusts the connections to the hamsters (decrease connections active during a wrong choice, increase the less active ones); in addition, the tipping of the scale increases or decreases a banana smell delivered to different monkeys, feeding the output back to the input side.]


[Figure: the same diagram, with the feedback pathway back to the monkeys now labeled: Prefrontal Cortex!]

Model Working Overview (Sequential Saccade)
  • We augment the model to include reciprocal connections with PFC
  • PFC co-activates with the caudate
    • PFC input acts, in some sense, like another feature input alongside IT.
    • Indirect activation from the caudate through SNr to the thalamus updates the PFC state.
    • Think of kicking a soccer ball to a friend (caudate) who kicks it to another friend (thalamus) who then kicks it back to you (PFC).
  • PFC weights are updated using the same rules as the IT weight update.
  • Thus, we now update two sets of weights: one from IT to the caudate and one from PFC to the caudate.


Model Working Overview

[Figure: panels (Start, Target Update) showing the abstract features in PFC (the new monkeys), the weights from PFC to the caudate (the wires), and the caudate activation (the hamsters).]

What does dopamine have to do with all of this?
  • Dopamine gives a priming reward signal that diminishes as the input stimulus becomes less novel.
  • Such a mechanism may provide a gradient for learning rates.
  • Priming is very fast and, in a sense, may act like a very short-term form of plasticity.


[Figure: in the analogy, the reward from “Was this what you wanted?” after each BUY X / SELL X decision causes the hamsters to run faster.]


[Figure: as the hamsters run faster, the reward itself is turned down: the more strongly the machine already responds, the less additional reward is delivered.]


Dopamine Priming

Dopamine priming

[Figure: visual input drives IT, which projects through the weights wij to the caudate and out to the saccade; dopamine activation feeds the loop, and if caudate activity is high, the dopamine is reduced.]
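
Read as pseudocode, the “if caudate activity is high, reduce dopa” box might look like the following sketch; the function and constants are illustrative assumptions only:

```python
# Illustrative dopamine-priming rule: the priming/reward signal starts high for
# a novel stimulus and is turned down as the caudate already responds strongly
# (i.e., as the stimulus-response mapping becomes familiar).
def dopamine_signal(caudate_activity, baseline=1.0, gain=1.0):
    return max(0.0, baseline - gain * caudate_activity)

for caudate_activity in (0.0, 0.4, 0.9):         # caudate response grows with learning
    print(dopamine_signal(caudate_activity))     # priming signal shrinks toward zero
```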

To review: What do we have?
  • We have a model that
    • Learns a correct response given some unknown input
    • Learns a sequence of outputs given some unknown input
    • Uses reinforcement learning to accomplish these things
