Presentation Transcript

Neural Networks

Ellen Walker

Hiram College

• Characterized by (Rich & Knight)

• Large number of very simple neuron-like processing elements

• Large number of weighted connections between these elements

• Highly parallel, distributed control

• Emphasis on automatic learning of internal representations (weights)

• Constraint networks

• Positive and negative connections denote constraints between the values of nodes

• Weights set by programmer

• Layered networks

• Weights represent contribution from one intermediate value to the next

• Weights are learned using feedback

• A constraint network

• Every node is connected to every other node

• If the weight is 0, the connection doesn’t matter

• To use the network, set the values of the nodes and let the nodes adjust their values according to the weights.

• The “result” is the set of all values in the stabilized network.

• Nodes represent features of objects

• Compatible features support each other (weights > 0)

• Stable states (local minima) are “valid” interpretations

• Noise features (incompatible) will be suppressed (network will fall into nearest stable state)

• Algorithm to find a stable state for a Hopfield network (serial or parallel); a minimal code sketch follows this list

• Pick a node

• Compute the weighted sum: Σ (incoming weight × neighbor value)

• If this sum > 0, set node = 1; else set node = -1

• When values aren’t changing, network is stable

• Result can depend on order of nodes chosen
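
A minimal sketch of this settling procedure (Python with NumPy; the random visiting order and the settle helper are illustrative choices, not from the slides):

```python
import numpy as np

def hopfield_pass(weights, state):
    """One serial pass: visit each node and threshold its weighted input.
    weights: symmetric (n, n) array, zero diagonal; state: vector of +/-1 values."""
    for i in np.random.permutation(len(state)):
        activation = weights[i] @ state          # [incoming weights] * [neighbors]
        state[i] = 1 if activation > 0 else -1
    return state

def settle(weights, state, max_passes=100):
    """Repeat passes until no value changes; the network is then stable.
    The result can depend on the order in which the nodes are visited."""
    for _ in range(max_passes):
        previous = state.copy()
        state = hopfield_pass(weights, state)
        if np.array_equal(previous, state):
            break
    return state
```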

• Given an object, each vertex constrains the labels of its connected lines

Lines denote positive links between compatible labels

Each gray box contains 4 mutually exclusive nodes (with negative links between them)

• Alternative training method for a Hopfield network, based on simulated annealing

• Goal: to find the most stable state (rather than the nearest)

• Boltzmann rule is probabilistic, based on the “temperature” of the system

• Deterministic update rule

• Probabilistic update rule

• As the temperature decreases, the probabilistic rule approaches the deterministic one (sketched below)
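
The equations on these slides did not survive the transcript. A standard formulation (an assumption on my part, not recovered from the slides) fires a node with probability 1 / (1 + e^(-a/T)), where a is the node's weighted input sum and T the temperature:

```python
import math
import random

def boltzmann_update(activation, temperature):
    """Probabilistic rule: fire with probability 1 / (1 + exp(-activation / T))."""
    x = activation / temperature
    x = max(min(x, 500.0), -500.0)   # clamp so math.exp cannot overflow at low T
    p = 1.0 / (1.0 + math.exp(-x))
    return 1 if random.random() < p else -1
```

As T falls toward 0, p saturates at 0 or 1 and the rule reduces to the deterministic threshold rule; annealing lowers T gradually, so early on the network can jump out of poor local minima and end near the most stable state rather than merely the nearest one.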

• We earlier talked about function fitting

• Finding a function that approximates a set of data so that

• Function fits the data well

• Function generalizes to fit additional data

• n inputs (i1…in), plus a constant bias input i0 = 1

• n+1 weights (w0…wn)

• 1 output:

• 1 if g(i) > 0

• 0 if g(i) < 0

• g(i) = w0·i0 + w1·i1 + … + wn·in (the weighted sum of the inputs)

• g(i) = 0 defines a linear surface (a hyperplane), and the output is 1 exactly when the input point lies above this surface
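
As a sketch, the unit just thresholds the weighted sum (Python; the function name and the convention that inputs[0] holds the bias are mine):

```python
def perceptron_output(weights, inputs):
    """inputs[0] is the constant bias input, fixed at 1."""
    g = sum(w * x for w, x in zip(weights, inputs))   # g(i) = sum of wj * ij
    return 1 if g > 0 else 0
```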

• Initialize weights randomly

• Collect all misclassified examples

• If there are none, we’re done.

• Else compute gradient & update weights

• Add to the weights all misclassified points that should have fired; subtract all points that fired but should not have

• Repeat from step 2 until done. (Guaranteed to converge, so the loop will end, provided the data is linearly separable; a sketch follows.)
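
A minimal sketch of that loop (the learning rate, the round cap for non-separable data, and the reuse of perceptron_output above are my assumptions):

```python
import random

def train_perceptron(examples, n, lr=1.0, max_rounds=1000):
    """examples: (inputs, target) pairs with inputs[0] == 1 and target in {0, 1}."""
    weights = [random.uniform(-1, 1) for _ in range(n + 1)]   # random initialization
    for _ in range(max_rounds):
        # collect all misclassified examples
        wrong = [(x, t) for x, t in examples
                 if perceptron_output(weights, x) != t]
        if not wrong:       # none misclassified: we're done
            break
        for x, t in wrong:
            # add points that should have fired, subtract points that should not have
            sign = 1 if t == 1 else -1
            weights = [w + lr * sign * xi for w, xi in zip(weights, x)]
    return weights
```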

• We have a model and a training algorithm, but we can only compute linearly separable functions!

• Most interesting functions (XOR, for example) are not linearly separable.

• Solution: use more than one line (multiple perceptrons)

[Figure: a three-layer network with input, hidden, and output layers]

Layered, fully-connected (between layers), feed-forward

• Compute a result:

• input->hidden->output

• Compute the error for each output node, based on the desired result

• Propagate errors back:

• Output->hidden, hidden->input

• Repeat above for every example in the training set (one epoch)

• Repeat the above until a stopping criterion is reached (a sketch of the full training loop follows this list):

• Good enough average performance on training set

• Little enough change in network

• Hundreds of epochs…
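
Putting these steps together as a minimal sketch (the single hidden layer size, sigmoid activation, fixed learning rate, and epoch cap are assumptions of mine; the slides do not specify them):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_backprop(X, T, n_hidden=4, lr=0.5, epochs=1000):
    """X: (examples, n_inputs) inputs; T: (examples, n_outputs) targets in [0, 1]."""
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-1, 1, (X.shape[1], n_hidden))   # input -> hidden weights
    W2 = rng.uniform(-1, 1, (n_hidden, T.shape[1]))   # hidden -> output weights
    for _ in range(epochs):          # one pass over the training set = one epoch
        for x, t in zip(X, T):
            h = sigmoid(x @ W1)      # forward: input -> hidden
            y = sigmoid(h @ W2)      # forward: hidden -> output
            # error at the output nodes, then propagated back to the hidden layer
            delta_out = (t - y) * y * (1 - y)
            delta_hid = (delta_out @ W2.T) * h * (1 - h)
            W2 += lr * np.outer(h, delta_out)   # update hidden -> output weights
            W1 += lr * np.outer(x, delta_hid)   # update input -> hidden weights
    return W1, W2
```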

• If the network is trained correctly, results will generalize to unseen data

• If overtrained, the network will "memorize" the training data and give essentially random outputs on unseen data

• Tricks to avoid memorization

• Limit number of hidden nodes

• Insert noise into training data

• Kohonen network for classification (a training sketch follows this list)

• Create inhibitory links among nodes of output layer (“winner take all”)

• For each item in training data:

• Determine an input vector

• Run network - find max output node

• Reinforce (increase) weights to maximum node

• Normalize weights so they sum to 1
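
A winner-take-all training sketch following those steps (NumPy, the learning rate, and the epoch count are my assumptions; np.argmax stands in for the inhibitory links that let the maximum node win):

```python
import numpy as np

def train_kohonen(X, n_outputs, lr=0.2, epochs=10):
    """X: (examples, n_inputs), one row per training item."""
    rng = np.random.default_rng(0)
    W = rng.uniform(0, 1, (n_outputs, X.shape[1]))
    W /= W.sum(axis=1, keepdims=True)      # start with weights summing to 1
    for _ in range(epochs):
        for x in X:                        # each item in the training data
            winner = np.argmax(W @ x)      # run network: find the max output node
            W[winner] += lr * x            # reinforce weights to the winning node
            W[winner] /= W[winner].sum()   # normalize so they sum to 1
    return W
```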

• Distributed representation

• Concept = pattern

• Examples: Hopfield, backpropagation

• Localist representation

• Concept = single node

• Example: Kohonen

• Distributed representations can be more robust (performance degrades gracefully with damage) and more efficient (patterns over n nodes can represent far more than n concepts)