
Neural Networks



Presentation Transcript


  1. Neural Networks Applications for Aphasiology W. Katz, COMD 6305 UTD (With materials from UC San Diego PDP group, the Universities of Maastricht and Amsterdam, and University of Bath – Ian Walker)

  2. Connectionism • = Parallel Distributed Processing (PDP) • Style of modeling • Based on networks of interconnected simple processing devices

  3. Basic components include: • Set of processing units • Modifiable connections between units • An optimal learning procedure

  4. Associative learning • Brain: “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased” (Donald Hebb, 1949) • Mind: “When two elementary brain processes have been active together or in immediate succession, one of them, on reoccurring, tends to propagate its excitement into the other” (William James, 1890)

  5. Localist versus distributed? Localist models • one unit to represent each concept • e.g., a person’s name; the meaning of the word “cat” • units are usually grouped, with inhibition within groups and excitation between groups Distributed models • concepts represented by more than one unit • meaning of the word “cat” is represented by a number of units: “mammal”, “purrs”, “has fur”, etc. • better at coping with incomplete data • probably closer to how our minds work

  6. Some history… • A major approach in cognitive science • First wave in the 1950s and 1960s • Culminated with the publication of Perceptrons by Minsky & Papert (1969) • Second wave began in the 1980s, still going strong! (McClelland & Rumelhart, Hinton, and others)

  7. Logical problems to be solved • Separable – solvable via a general learning algorithm • Exists • Doesn’t exist • Not • And • Or (inclusive) • If-then • Non-separable – requires a multi-layer net (trained via back-propagation) • Symmetry • Parity (equalness, such as odd-even) • Exclusive OR (XOR)
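A quick way to see the separable/non-separable split is to run the classic perceptron learning rule on both kinds of problem. This sketch (the function names and parameters are illustrative, not from the slides) learns inclusive OR perfectly, while no two-layer weight setting can ever get XOR fully right:

```python
# A hypothetical demo (not from the slides): the perceptron learning
# rule converges on the linearly separable OR problem, but no weight
# setting lets a two-layer net classify all four XOR patterns.

def train_perceptron(samples, epochs=25, lr=0.1):
    """Perceptron learning rule for two inputs; returns (weights, bias)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if (w[0] * x1 + w[1] * x2 + b) > 0 else 0
            err = target - out          # 0 when correct, +/-1 when wrong
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def accuracy(samples, w, b):
    hits = sum(
        (1 if (w[0] * x1 + w[1] * x2 + b) > 0 else 0) == target
        for (x1, x2), target in samples
    )
    return hits / len(samples)

OR  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_or, b_or = train_perceptron(OR)
w_xor, b_xor = train_perceptron(XOR)
print("OR accuracy:", accuracy(OR, w_or, b_or))      # converges to 1.0
print("XOR accuracy:", accuracy(XOR, w_xor, b_xor))  # stuck below 1.0
```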

  8. OR – true if A is true OR B is true, or if both are true (“inclusive”) XOR – true whenever the inputs differ

  9. Two-Layer Nets OK for Linearly Separable Problems * See next slide…

  10. Multi-Layer Nets… Required for Non-Separable Problems (e.g. XOR) [figure: AND and OR shown as linearly separable cases]

  11. A basic connectionist unit • Think of a unit (“neurone”) as a light-bulb with a light sensor attached • If some light falls on the sensor, the bulb lights up by the same amount: an input of 10 gives an output of 10; an input of 15 gives an output of 15

  12. A simple network: a decision maker Problem: “you must only leave when both Sarah and Steven have arrived” • A Sarah detector and a Steven detector (each 0 or 1) feed a decision unit with weight ×1 each; the unit has a resting level of −1 and says GO when its sum is > 0 • Nobody present: (0×1) + (0×1) + (−1) = −1 → STAY • Only Sarah present: (1×1) + (0×1) + (−1) = 0 → STAY • Only Steven present: (0×1) + (1×1) + (−1) = 0 → STAY • Both people present: (1×1) + (1×1) + (−1) = 1 → GO • In technical terms, this is a logical AND gate
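The decision maker above can be written directly in code; this is just the slide's arithmetic (weights ×1, resting level −1, GO when the sum exceeds 0):

```python
# The slide's decision maker as code: two binary detector units feed a
# unit with weight 1 each and a resting level (bias) of -1; the answer
# is GO only when the summed input exceeds 0 -- a logical AND gate.

def decision_unit(sarah_present, steven_present):
    net = (sarah_present * 1) + (steven_present * 1) + (-1)
    return "GO" if net > 0 else "STAY"

for sarah in (0, 1):
    for steven in (0, 1):
        print(sarah, steven, "->", decision_unit(sarah, steven))
# Prints STAY for (0,0), (0,1), (1,0) and GO only for (1,1).
```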

  13. The Perceptron – Frank Rosenblatt (1958, 1962) • Two layers • binary nodes that take values 0 or 1 • continuous weights, initially chosen randomly

  14. Simple example • weights: 0.4 and −0.1; inputs: 0 and 1; one binary output node • net input = (0.4 × 0) + (−0.1 × 1) = −0.1 • the net input is not above threshold, so the output is 0
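The same computation as a small function; the threshold-at-zero output rule is an assumption carried over from the decision-maker slide:

```python
# The slide's worked example in code: a binary output node with weights
# 0.4 and -0.1 receives inputs 0 and 1; the net input comes out
# negative, so the node stays off.

def perceptron_output(inputs, weights, threshold=0.0):
    net = sum(w * x for w, x in zip(weights, inputs))
    return (1 if net > threshold else 0), net

out, net = perceptron_output(inputs=(0, 1), weights=(0.4, -0.1))
print("net input:", net)   # 0.4 * 0 + (-0.1) * 1 = -0.1
print("output:", out)      # -0.1 is not above 0, so the node outputs 0
```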

  15. The Perceptron was a big hit… • Spawned the first wave in ‘connectionism’ • Great interest and optimism about the future of neural networks • First neural network hardware – built in the late 50s and early 60s

  16. Perceptron – Limitations • Only binary input-output values (later fixed via the delta rule…) • Only two layers – prevented solving nonlinearly separable problems, e.g. XOR

  17. Exclusive OR (XOR)
  In    Out
  0 1   1
  1 0   1
  1 1   0
  0 0   0

  18. An extra layer is necessary to represent the XOR • No solid training procedure existed in 1969 to accomplish this • Thus began the search for the third (or “hidden”) layer

  19. Error back-propagation – Rumelhart, Hinton, and Williams (1986) • Meet the hidden layer

  20. The backprop trick • To find the error value δh for a given node h in a hidden layer… • …take the weighted sum of the errors of all nodes that h connects to (the “to-nodes” of h, with errors δ1, δ2, δ3, … δn reached over weights w1, w2, w3, … wn) • Backpropagation of errors: δh = w1·δ1 + w2·δ2 + w3·δ3 + … + wn·δn

  21. Characteristics of back-propagation • Works for any number of layers • Uses continuous nodes • Must have a differentiable activation rule • Typically logistic: S-shape between 0 and 1 • Initial weights are random • Gradient descent in error space
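Putting the last few slides together, here is a minimal sketch of back-propagation with one hidden layer of logistic nodes, random initial weights, and gradient descent, trained on XOR. The layer size, learning rate, and epoch count are illustrative choices, not values from the slides:

```python
# A minimal back-propagation sketch: 2 inputs -> 4 logistic hidden
# nodes -> 1 logistic output, random initial weights, gradient descent
# on the squared error, trained online on the four XOR patterns.
import math
import random

random.seed(1)
H = 4                                   # hidden-layer size (assumed)
w_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(H)]
w_out = [random.uniform(-1, 1) for _ in range(H + 1)]   # last entry: bias

def logistic(x):                        # differentiable S-shape in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2):
    h = [logistic(w[0] * x1 + w[1] * x2 + w[2]) for w in w_hid]
    o = logistic(sum(w_out[i] * h[i] for i in range(H)) + w_out[H])
    return h, o

XOR = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
lr = 0.5
for _ in range(20000):
    for x1, x2, target in XOR:
        h, o = forward(x1, x2)
        # Output-node error, scaled by the logistic derivative o(1 - o).
        d_out = (target - o) * o * (1 - o)
        # The "backprop trick": a hidden node's error is the weighted sum
        # of its to-nodes' errors (here just one), times its derivative.
        d_hid = [d_out * w_out[i] * h[i] * (1 - h[i]) for i in range(H)]
        for i in range(H):              # gradient-descent weight updates
            w_out[i] += lr * d_out * h[i]
            w_hid[i][0] += lr * d_hid[i] * x1
            w_hid[i][1] += lr * d_hid[i] * x2
            w_hid[i][2] += lr * d_hid[i]
        w_out[H] += lr * d_out

outputs = {(x1, x2): forward(x1, x2)[1] for x1, x2, target in XOR}
print(outputs)
```

After training, each output should round to the XOR of its inputs; note how slowly it gets there (tens of thousands of sweeps for four patterns), which is the "learning is slow" point on the next slide.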

  22. NetTalk: Backprop’s ‘killer-app’ • AI project by Sejnowski and Rosenberg (1986) • Text as input, phonetic transcription for comparison • Learns to pronounce text based on associations between letters and sounds • Trained, not programmed • http://www.youtube.com/watch?v=gakJlr3GecE

  23. Backprop: Pro / Con Pro: • Easy to use • Few parameters to set • Algorithm is easy to implement • Can be applied to a wide range of data • Very popular Con: • Learning is slow • New learning will rapidly overwrite old representations, unless these are repeated with the new patterns • This makes it hard to keep networks up-to-date with new information • Also makes it very implausible as a psychological model of human memory

  24. Benefits of connectionism? • Models are analogous to how the brain works • Can be used to test theories about the mind’s organization concerning: • Generalization • Coping with incomplete/noisy data • Graceful degradation of error • One can look for emergent properties • Old-style information-processing box-and-arrows diagrams can be more explicitly tested

  25. A bright future (– a neural modeler, Dijon, France): “Gradually, neural network models and the computers they run on will become good enough to give us a deep understanding of neurophysiological processes and their behavioral counterparts and to make precise predictions about them.” “They will be used to study epilepsy, Alzheimer’s disease, and the effects of various kinds of stroke, without requiring the presence of human patients.” “They will be, in short, like the models used in all of the other hard sciences. Neural modeling and neurobiology will then have achieved a truly symbiotic relationship.”

  26. ?

  27. The debates • Fodor & Pylyshyn (1988) argue that such models hold no real information about the relationships between nodes • i.e., a link between two nodes does not capture the real essence of the relationship between the two concepts. And all links are the same…. • But how do we know that links in the mind/brain are any different? • Others argue the models are too low-level • How do we account for things like “concepts”, “planning”, etc? • Biological plausibility? • What about consciousness?

  28. Summary • Connectionism – a mathematical modelling technique which uses simple interconnected units to replicate complex behaviours • Models are exposed to various situations and from this learn general rules • The distinction between their processing of information and their storage of it is very blurred • Can simulate many aspects of human cognition, most notably generalization, processing of incomplete information, and graceful degradation • Have provided a new metaphor for psychologists to think about the mind • Overall value – still controversial
