CSM10: Introduction to neural networks and historical background • Tony Browne: A.Browne@surrey.ac.uk
Motivation • Conventional (rule-based) systems perform badly at some tasks. For example, a face recognition system may fail to recognise the same face if it is smiling (brittleness). • For many problems we do not know the solution and would like a system to work it out for us (i.e. learn a solution from the available data). • How does the human brain do this? • Can we copy what the brain does?
Brain cells connected • Each brain cell has lots of connections with other brain cells by means of nerve fibres (the wiring connecting brain cells together). There are about 4 million miles of nerve fibres in each brain. Some fibres may have up to 10,000 branches in them.
Neurons • Each brain cell has lots of connections with other cells, possibly over 25,000. The junctions at the ends of the neurones are called synapses. • Axon - the nerve fibre leaving the cell body. A neurone typically has a single axon, which may divide into many branches. • Vesicles - these contain the transmitter substances.
Neurons • Transmitters - these are small chemicals used by brain cells as messengers. They are stored in the vesicles in the nerve ending ready to be released • Receptors - these are structures on the surface of the receiving cell which have a space designed just for the transmitter (if the transmitter is a key, receptors are the lock into which they fit)
Neurons • Enzymes - these surround the synapse and break down any spare transmitter that might leak out to other synapses nearby. • Electrical signal - This is the way in which one brain cell sends a message to another. The signal travels down the nerve fibre rather like an electrical "Mexican Wave".
Signal transmission • 1. A brain cell decides to send a message to another cell in order to make something happen e.g. tighten a muscle, release a hormone, think about something, pass on a message etc. • 2. An electrical impulse is sent from the brain cell down one of the nerve fibres/neurones towards the end. It travels at about 120 miles per hour.
Signal transmission • 3. This message or impulse arrives at the end of the nerve fibre. When it arrives, a chemical transmitter is released from the nerve end. • 4. The transmitter then travels across the gap between the first nerve fibre and the next/receiving one. • 5. The transmitter hits a receptor on the other side. It fits into it just like a key fitting into a lock.
Signal transmission • 6. When the transmitter hits the receptor, the receptor changes shape. This causes changes inside the nerve ending which sets off an electrical message in that nerve fibre on to the next brain/nerve cell. This sequence then carries on until the effect occurs e.g. the muscle moves etc.
Signal transmission • 7. The transmitter is either broken down by enzymes (10%) and removed, or taken back up again into the nerve ending (i.e. recycled) - a process known as re-uptake. • 8. The nerve fibre and synapse are then ready for the next message.
Important points • Messages pass in one direction only • There is only one type of transmitter per synapse • The transmitter allows an electrical message to be turned into a chemical message and back into an electrical message.
Transmitter substances • There are over 80 known transmitter substances in the brain, but each nerve ending has only one type. • In many mental health problems, it is known that some of these transmitters get out of balance, e.g. you have too much or too little of a particular transmitter.
Serotonin (5-HT) • In the body, 5-HT is involved with blood pressure and gut control. • In the brain, it controls mood, emotions, sleep/wake, feeding, temperature regulation, etc. • Too much serotonin and you feel sick, less hungry, get headaches or migraines • Too little and you feel depressed, drowsy etc. • Antidepressants such as Prozac boost levels of serotonin
Dopamine • Three main pathways of dopamine neurones in the brain. One controls muscle tension and another controls emotions, perceptions, sorting out what is real/important/imaginary etc. • Not enough dopamine in the first group and your muscles tighten up and shake (Parkinson's disease). • Too much dopamine in the second group gives too much perception e.g. you may see, hear or imagine things that are not real (schizophrenia)
Noradrenaline (NA) (sometimes called "norepinephrine" or NE) • In the body, it controls the heart and blood pressure, in the brain, it controls sleep, wakefulness, arousal, mood, emotion and drive • Too much noradrenaline and you may feel anxious, jittery etc. • Too little and you may feel depressed, sedated, dizzy, have low blood pressure etc. • Some antidepressants affect NA
Acetylcholine (ACh) • In the body, acetylcholine passes the messages which make muscles contract. • In the brain, it controls arousal, the ability to use memory, learning tasks etc. • Too much in your body and your muscles tighten up. • Too little can produce dry mouth, blurred vision and constipation, as well as confusion, drowsiness, slowness at learning etc.
Glutamate • Acts as an ‘accelerator’ in the brain • Too much and you become anxious, excited and some parts of your brain may become overactive. • Too little and you may become drowsy or sedated • Some people may be sensitive to glutamate in food (monosodium glutamate)
GABA • Acts as a ‘brake’ in the brain • Too much and you become drowsy or sedated. • Too little and you may become anxious and excited • Valium and other sedatives act on GABA
Computational modelling • We can model what happens in neurons/synapses in software and use these models to answer interesting questions • How does the mind work? (how can we model cognitive processes?) • Can we use such machine learning systems to solve problems we do not know the solutions to? • Such problems include bioinformatics (why does a particular genetic sequence do what it does?), drug design, machine vision, robotics.
Two main types of learning • Supervised learning – a ‘teaching’ target is provided to indicate to the network what the correct output should be • Unsupervised learning – no ‘teaching’ input – network self-organises into a solution
Perceptrons • Perceptron (1950s). See Fig. It had three areas of units: a sensory area, an association area and a response area. • Impulses generated by the sensory points are transmitted to the units in the association layer. Each association layer unit is connected to a random set of units in the sensory layer.
Perceptrons • Connections may be either excitatory or inhibitory (+1, 0 and –1). • When a pattern appears on the sensory layer, an association layer unit becomes active if the sum of its inputs exceeds a threshold value. It then produces an output which is sent to the next layer of units, the response units.
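The threshold behaviour of a single association layer unit can be sketched as follows. This is a minimal illustration only: the function name `association_unit`, the output values 1/0, the example pattern, the weights and the threshold are all assumptions made for the example, not details of the original Perceptron.

```python
def association_unit(sensory_inputs, weights, threshold):
    """Return 1 (active) if the weighted sum of inputs exceeds the threshold, else 0."""
    # Each weight is +1 (excitatory), 0 (no connection) or -1 (inhibitory).
    net = sum(x * w for x, w in zip(sensory_inputs, weights))
    return 1 if net > threshold else 0


# Hypothetical pattern on four sensory points:
# two excitatory connections, one unconnected point, one inhibitory connection.
pattern = [1, 1, 0, 1]
weights = [1, 1, 0, -1]
print(association_unit(pattern, weights, threshold=0))  # net = 1 + 1 + 0 - 1 = 1, so the unit is active
```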
Perceptrons • Response units are connected randomly to the association layer units, with inhibitory feedback.
Perceptrons • The response layer units respond in a similar way to the association layer units: if the sum of their inputs exceeds a threshold they give an output value of +1, otherwise their output is -1. • The Perceptron is a learning device. In its initial configuration it is incapable of distinguishing patterns, but it can learn this capability through a training process.
Perceptrons • During training a pattern is applied to the sensory area, and the stimulus is propagated through the layers until a response layer unit is activated. If the correct response layer unit is activated, the output of the corresponding association layer units is increased; if the incorrect response layer unit is active, the output of the corresponding association layer units is decreased.
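The idea of strengthening or weakening connections according to whether the response was correct can be illustrated with the simplified error-correction rule usually taught for a single threshold unit. Note that this is a swapped-in textbook simplification, not the exact multi-area procedure described above: the function names, the bias term, the learning rate and the AND training set are all assumptions chosen for the example.

```python
def predict(weights, bias, inputs):
    """Response unit: +1 if the net input exceeds 0, otherwise -1."""
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > 0 else -1


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Adjust weights only when the response is wrong; stop after an error-free pass."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, target in samples:
            if predict(weights, bias, inputs) != target:
                # Move the weights towards the target response for this pattern.
                weights = [w + learning_rate * target * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * target
                errors += 1
        if errors == 0:
            return weights, bias, epoch
    return weights, bias, max_epochs


# Hypothetical linearly separable task: logical AND with targets +1 / -1.
and_samples = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, bias, epochs_used = train_perceptron(and_samples)
print(weights, bias, epochs_used)
print([predict(weights, bias, x) for x, _ in and_samples])
```

Because AND is linearly separable, training stops after a finite number of passes through the data.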
Perceptrons • Perceptron convergence theorem: states that if a pattern can be learned by the Perceptron then it will be learned in a finite number of training cycles. • ADALINEs (1960s) were similar to perceptrons but used a least-mean-squares (LMS) error-based learning mechanism
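A least-mean-squares update of the kind ADALINEs used can be sketched as below. This is a hedged illustration rather than a reconstruction of the original device: the function names, the bias term, the learning rate, the number of epochs and the AND training set are assumptions. The key difference from the perceptron rule is that the error is measured on the raw weighted sum, before any threshold is applied.

```python
def net_input(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias term (no threshold applied)."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


def train_adaline(samples, learning_rate=0.05, epochs=500):
    """LMS rule: nudge the weights in proportion to (target - net input)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - net_input(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical task: logical AND with targets +1 / -1.
and_data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, bias = train_adaline(and_data)
# Thresholding the trained net input at 0 gives the class decision.
print([1 if net_input(weights, bias, x) > 0 else -1 for x, _ in and_data])
```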
Perceptrons • Problems with Perceptrons: (Minsky and Papert, 1969) - the end of neural networks research? • One of the main points was that Perceptrons can differentiate patterns only if they are linearly separable, which places a severe restriction on the applicability of the Perceptron.
Linear inseparability e.g. XOR problem:
Inputs    Target
x1  x2
0   0     0
0   1     1
1   0     1
1   1     0
Linear inseparability • A simplified Perceptron, with x1 and x2 representing inputs from the sensory area, two units in the association layer and one in the response layer, is shown in Fig. • The output function of the output unit is 1 if its net input is greater than the threshold T, and 0 if it is less than this threshold. This type of node is called a linear threshold unit.
Linear inseparability • The problem is to select values of the weights such that each pair of input values results in the proper output value. If we refer to the next Fig. we see that this cannot be done.
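The impossibility can also be checked algebraically. As a sketch, assume the output unit receives x1 and x2 through two weights, written here as w_1 and w_2 (symbols introduced for this illustration), and outputs 1 when its net input exceeds T and 0 otherwise. The four XOR patterns then require:

```latex
\begin{align*}
(x_1, x_2) = (0,0),\ \text{target } 0 &: \quad 0 \le T \\
(x_1, x_2) = (0,1),\ \text{target } 1 &: \quad w_2 > T \\
(x_1, x_2) = (1,0),\ \text{target } 1 &: \quad w_1 > T \\
(x_1, x_2) = (1,1),\ \text{target } 0 &: \quad w_1 + w_2 \le T
\end{align*}
```

Adding the second and third conditions gives w_1 + w_2 > 2T, and since the first condition says T is not negative, 2T is at least T. So w_1 + w_2 > T, which contradicts the fourth condition. No choice of weights and threshold satisfies all four.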
Linear inseparability • There is no way to position the line so that the two points of each class both lie in the same region. • Hyperplanes: We could partition the space correctly if we had three regions. One region would belong to one output class, and the other two would belong to the other output class (there is no reason why disjoint regions cannot belong to the same output class), as in the next Fig.
Linear inseparability • We can achieve this by adding an extra layer of weights to the network. This expands the dimensionality of the space in which the XOR problem is represented and allows us to produce a hyperplane which correctly partitions the space. An example with an appropriate set of connection weights and thresholds is shown in the following Fig.
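One possible set of hand-chosen weights and thresholds that solves XOR with an extra layer is sketched below. The values are illustrative assumptions and are not necessarily those in the figure: one hidden unit behaves like OR, the other like AND, and the output unit fires when OR is on but AND is off.

```python
def threshold_unit(inputs, weights, threshold):
    """Linear threshold unit: 1 if the weighted sum exceeds the threshold, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > threshold else 0


def xor_network(x1, x2):
    # Hidden layer (assumed weights): an OR-like unit and an AND-like unit.
    h_or = threshold_unit((x1, x2), (1, 1), 0.5)
    h_and = threshold_unit((x1, x2), (1, 1), 1.5)
    # Output unit: excited by the OR unit, inhibited by the AND unit.
    return threshold_unit((h_or, h_and), (1, -1), 0.5)


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_network(x1, x2))
# Prints the XOR truth table: outputs 0, 1, 1, 0
```

In the hidden-unit space the four input patterns map to (0,0), (1,0), (1,0) and (1,1), which a single line can separate.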
Multi-Layered Perceptrons • However, no learning algorithm was known that could modify the extra layer of weights • So research in Perceptrons almost died out until the mid-1980s • Then an algorithm was developed by two American psychologists that could modify these other weights • This opened the way for learning in Multi-Layered Perceptrons (MLPs)