Connectionism & Consciousness. This week's question: How have connectionism, AI and dynamical systems influenced cognitive information processing accounts, and what issues do they raise with regard to conscious and unconscious processing?
How is memory organised?
If memory used a library addressing system, memory errors
would be unpredictable. BUT memory errors tend to be near
misses that are related in terms of meaning.
If asked a question that demands information you have not encoded directly, your memory system often pulls related information that allows you to make an inference to answer the question. So the question is:
How does the memory system 'know' the right information to retrieve, given the cues that are present in the environment?
A hierarchical structure can provide property descriptions of concepts (animal to bird to chicken):
true or false?
1,000 msec: A canary is a canary
1,160 msec: A canary is a bird
1,240 msec: A canary is an animal
BUT did not always hold:
sometimes faster at ‘a chicken is an animal’ than ‘a chicken is a bird’
Links represent associations between semantically related concepts.
Stimulation from the environment activates nodes that send some activation to linked nodes, which can also become active.
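The spreading-activation idea above can be sketched in a few lines. Everything in this snippet (node names, link weights, the decay factor and the awareness threshold) is an invented illustration, not a published model.

```python
# Toy spreading-activation network: stimulation activates a node, which
# passes decayed activation along weighted links; nodes whose activation
# crosses a threshold 'enter awareness'. All values are illustrative.

network = {
    "canary": {"bird": 0.8, "yellow": 0.6},
    "bird":   {"animal": 0.7, "wings": 0.9},
    "animal": {}, "yellow": {}, "wings": {},
}

def spread(source, steps=2, decay=0.5):
    """Propagate activation outward from a stimulated node."""
    activation = {source: 1.0}
    frontier = {source: 1.0}
    for _ in range(steps):
        nxt = {}
        for node, act in frontier.items():
            for neighbour, weight in network.get(node, {}).items():
                boost = act * weight * decay
                activation[neighbour] = activation.get(neighbour, 0.0) + boost
                nxt[neighbour] = nxt.get(neighbour, 0.0) + boost
        frontier = nxt
    return activation

THRESHOLD = 0.2
active = {n for n, a in spread("canary").items() if a >= THRESHOLD}
print(active)  # 'canary', 'bird' and 'yellow' reach awareness
```

Note how directly related nodes ("bird") end up more active than nodes two links away ("animal"), mirroring the near-miss pattern of memory errors.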
Rumelhart, Hinton & McClelland (1986) proposed 6 properties of
a semantic network:
Concepts in memory become active and when activity
surpasses a threshold they enter awareness.
Vagueness about how to determine location & strength of links
If one item primes another - assume it is linked (becomes circular)
How far does the activation spread?
Mediated priming –lion primes stripes (presumably through tiger)
Ratcliff & McKoon (1994): if each word spreads activation to 20 words, then in 3 steps you have 8,000 concepts – spreading activation becomes pointless if most concepts are active most of the time in ordinary conversation (does this point to the importance of context?)
Parallel Distributed Processing – distributed representation
A concept is not represented with a local representation but is distributed over a number of nodes simultaneously.
If part of the system is damaged it does not shut down but performance
gets worse (degradation).
Degradation also seems to be a feature of human memory – minimal memory damage causes minimal memory loss (unlike computer memory, where minimal damage can be catastrophic).
Learning ability – automatically finds prototypes & exceptions.
Generalisation – responds to a new stimulus the way it responded to an old stimulus – a key property of human memory.
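Graceful degradation can be illustrated with a toy distributed pattern; the pattern size and damage levels below are arbitrary choices for the sketch, not values from any model of memory.

```python
# Graceful degradation in a distributed representation: a concept is a
# pattern over many nodes, so zeroing a few nodes worsens but does not
# abolish the match to the stored pattern.
import random

random.seed(0)
pattern = [random.choice([0.0, 1.0]) for _ in range(100)]  # stored concept

def similarity(a, b):
    """Proportion of nodes whose activity matches."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def damage(p, fraction):
    """Simulated lesion: zero a random fraction of the nodes."""
    lesioned = list(p)
    for i in random.sample(range(len(p)), int(fraction * len(p))):
        lesioned[i] = 0.0
    return lesioned

# Minimal damage -> minimal loss; similarity falls off gradually.
for frac in (0.05, 0.25, 0.50):
    print(frac, similarity(pattern, damage(pattern, frac)))
```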
Criticisms of Parallel Distributed Processing representation
Catastrophic interference if the network learns one set and then another set of associations – not a feature of human memory.
Pinker & Prince: aspects of children's learning that are not accounted for by the model and that require the implementation of rules, which is difficult for PDP models.
Selection of words
Order of words
Selection of phonological code
Can be demonstrated experimentally
Supply the appropriate words for the following:
Results suggest that word meaning is accessed first
(feeling of knowing)
Then the sound form (phonological) of the word is accessed
(but may fail!)
Results argue for 2-stage process of lexical retrieval.
Pictures of neurons
pattern of activation
nodes and connection strength
More frequent activation - stronger connection strength
Nodes as entities
Associative strength databases
Ask 100 people: what is the first word that comes to mind when I say 'salt'? Most say 'pepper'. But associative strength is not equivalently bidirectional, because if presented with the word 'pepper' most say 'corn'.
Note these are associative-semantic pairs: as well as strong associative strengths, all are types of food and therefore also have a semantic relationship.
Associative-only pairs (e.g. traffic – jam): no semantic connection between the words apart from the association, where the pair is a symbol for a third and completely different concept.
High associative strength >.1 (tomato-sauce)
Low associative strength = 0 – absent from the tables! (tiger – neuron)
nodes and connection strength
More frequent activation - stronger connection strength
Nodes as properties NOT entities
A pattern of activation within the network results in the
concept being activated.
Red, round, can be eaten, grows on trees, juicy = ?
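A minimal sketch of this idea: the concept is read off from whichever stored property set best matches the currently active features. The feature lists are invented for the example.

```python
# A pattern of active property nodes picks out a concept by overlap.

concepts = {
    "apple":  {"red", "round", "edible", "grows on trees", "juicy"},
    "tomato": {"red", "round", "edible", "juicy"},
    "ball":   {"round"},
}

def best_match(active_features):
    """Return the concept whose properties overlap most with the input."""
    return max(concepts, key=lambda c: len(concepts[c] & active_features))

print(best_match({"red", "round", "edible", "grows on trees", "juicy"}))  # apple
```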
Semantic distance databases
> .3 = high semantic distance score (bag – box)
< .1 = low semantic distance score (chatter – box)
two different groups. Of course one pile may be sufficient
depending on how much there is to do. If you have to go
somewhere else due to lack of facilities, that is the next step;
otherwise you are pretty well set. It is important not to
overdo things. That is, it is better to do fewer things at once
rather than too many. In the short run this might not seem
important, but complications can easily arise. A mistake can
be expensive as well. At first the whole procedure will seem
complicated. Soon, however, it will become just another
facet of life. After the procedure is completed, one arranges
the material into different groups again. Then they can be put
into their appropriate places. Eventually they will be used
once more and the whole procedure will be repeated.
However this is part of life.
e.g. in reading simple common words (drink)
e.g. in reading complex uncommon words (hemidecortication)
Example of a connectionist model that does not learn
(McClelland & Rumelhart, 1981)
models that learn through back-propagation training
input level – visual feature unit level
letter level – units correspond to individual letters
output level – each unit corresponds to a word
Fragment of an IAC model of word recognition – draw inhibitory (o) and facilitatory ( ) connections!
Back-propagation being the most widely used connectionist learning rule
1980s: difference between the brain & a digital computer may be important.
some irregular verbs are unique (is – was, go – went)
some group in terms of phonological similarity (sing, ring, catch, teach – sang, rang, caught, taught) that is generalised to a novel word (pling – plang [analogy to ring – rang])
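The pling → plang generalisation can be caricatured with a toy analogy rule. This is not the Rumelhart & McClelland network; the similarity measure and the stored verbs are assumptions made for illustration.

```python
# Inflection by phonological analogy: a novel verb that sounds like a
# stored irregular inherits its stem change; otherwise add '-ed'.

past = {"sing": "sang", "ring": "rang", "drink": "drank"}  # stored irregulars

def overlap(a, b):
    """Crude phonological similarity: length of the shared ending."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def inflect(novel):
    model = max(past, key=lambda v: overlap(v, novel))
    if overlap(model, novel) >= 3 and "i" in novel:
        return novel.replace("i", "a", 1)  # analogical stem change i -> a
    return novel + "ed"                    # regular default

print(inflect("pling"))  # plang (analogy to ring -> rang)
print(inflect("jump"))   # jumped
```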
Rumelhart & McClelland (1986) produced a connectionist learning
simulation that produced the same U-shaped performance over time as
that of the children, and therefore performance may not necessarily
arise from explicit rules.
Pinker & Prince (1988) disagree and state that the qualitative difference between irregular & regular verbs means that the latter are rule-governed.
Braitenberg (1984)- Vehicles: Experiments in Synthetic Psychology
The book consisted of 12 short thought experiments
Reader invited to imagine different primitive vehicles
Each was a block of wood with wheels at the rear, sensors (for light) at the front, and connections between the sensors and the motor that drove the wheels.
The nature of the sensors and their connection to the motor varied.
Braitenberg then considered how each vehicle might behave when
placed on a surface with other vehicles and exposed to a stimulus
Some moved toward the light and then veered off
Some sped aggressively towards the light and crashed into it
Some circled the light
Each vehicle chapter heading bore a name – "love", "hate", "values", "logic" – and it was easy to imagine the vehicles as animated and motivated by anger, affection or complex reasoning, BUT the circuits inside were extremely simple.
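Braitenberg's point, that trivially simple circuits can look motivated, can be sketched as sensor-motor wiring. The sensor values are made up; crossed versus uncrossed wiring follows his simplest two-sensor vehicles.

```python
# Two light sensors each drive one motor. Whether the wiring is crossed
# determines whether the vehicle veers away from or toward the light,
# even though the circuit is trivially simple.

def motor_speeds(left_sensor, right_sensor, crossed):
    """Each sensor excites one motor; crossing the wires flips behaviour."""
    if crossed:
        return right_sensor, left_sensor  # left motor driven by right sensor
    return left_sensor, right_sensor      # ipsilateral wiring

# Light is brighter on the vehicle's left (left sensor reads higher).
# Uncrossed: left motor faster -> veers right, away from the light.
# Crossed:   right motor faster -> turns left, toward the light.
print(motor_speeds(0.9, 0.2, crossed=False))  # (0.9, 0.2)
print(motor_speeds(0.9, 0.2, crossed=True))   # (0.2, 0.9)
```

An observer sees "fear" in one vehicle and "aggression" in the other, yet the only difference is which sensor feeds which motor.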
Conway's Game of Life: a 2-D grid of cells.
Cell is either on (alive) or off (dead)
At the tick of the clock a cell may change its state according to the following rules:
If the cell is alive and has exactly 2 or 3 neighbours which are also
alive, it survives to the next cycle.
If the cell is dead but has exactly 3 alive neighbours it is born.
In all other cases the cell remains dead or dies.
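The rules above can be implemented directly; a minimal sketch (the coordinate scheme and the 'blinker' pattern are just illustrative choices):

```python
# Live cells are (row, col) pairs held in a set; every other cell is dead.

def neighbours(cell):
    r, c = cell
    return {(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)}

def tick(live):
    """Apply the survival and birth rules once across the grid."""
    candidates = live | {n for cell in live for n in neighbours(cell)}
    return {cell for cell in candidates
            if (cell in live and len(neighbours(cell) & live) in (2, 3))
            or (cell not in live and len(neighbours(cell) & live) == 3)}

# A 'blinker' oscillates between a row and a column of three cells.
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = tick(blinker)        # the column {(0, 1), (1, 1), (2, 1)}
after_two = tick(after_one)      # back to the original row
print(after_one, after_two)
```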
Imagine the shaded ‘on’ cells are part of a much larger grid
GLIDER: looks like biological behaviour - Over time the pattern of
‘on’ cells changes which looks as if it is falling and deforming and in
the process gliding down to the right.
Investigated the way in which such glider systems can solve computational problems.
Biological systems evolve, AI systems are built!
Biological change has a random element (genetic variation) and a
quasi-directed element (some better adapted) that alters the genetic
makeup of succeeding generations. Holland (1975) proposed a
“genetic algorithm” (GA) that has much in common with natural
evolution. Powerful when there are higher-order interactions between sub-parts of the problem – widely used in conjunction with neural networks.
There are multiple solutions – they depend on how different questions are answered.
A GA models each solution as an artificial chromosome (a vector of 1s & 0s). How well it does is its fitness; preferential replication of the fitter solutions constructs a new generation, but at the same time random switching of 1s & 0s occurs. The new generation is tested and its fitness passed to a third generation, and so on until the best solution is found.
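A minimal genetic algorithm in this spirit might look as follows. The chromosome length, population size, mutation rate and the toy fitness function (count the 1s) are all illustrative assumptions, not Holland's own settings.

```python
# Minimal genetic algorithm sketch: fitness-weighted replication plus
# random bit-flip mutation, repeated over generations.
import random

random.seed(1)
LENGTH, POP, GENERATIONS, MUTATION = 20, 30, 50, 0.02

def fitness(chrom):
    return sum(chrom)  # toy objective: maximise the number of 1s

def mutate(chrom):
    """Random switching of 1s and 0s."""
    return [1 - g if random.random() < MUTATION else g for g in chrom]

population = [[random.randint(0, 1) for _ in range(LENGTH)]
              for _ in range(POP)]
for _ in range(GENERATIONS):
    # Preferential replication: fitter chromosomes leave more offspring.
    parents = random.choices(population,
                             weights=[fitness(c) + 1 for c in population],
                             k=POP)
    population = [mutate(p) for p in parents]

best = max(population, key=fitness)
print(fitness(best))  # close to the maximum of 20 after 50 generations
```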
Goals: behaviours are usually adaptive and not taught to us.
Connectionist models are disturbingly passive.
Nolfi et al (1994)
Goal-directed behaviour in a neural network, 10x10 grid with a small
number of cells containing food
Tested 100 organisms (simple neural network) in the grid by input of
direction & strength (smell of food) connected to hidden units
(organism’s brain), projected to 2 output motors (allowed forward,
left, right, pause). Random weights (connection strengths) in the initial set-up – behaviour was disorganised (some stood still for their whole life! some marched and then fell off). If an organism stumbled on food it could be cloned. Random weight changes were introduced too. After 50 generations very
different individuals evolved (looked like – goal directed to food – but
remember Braitenberg’s warning!!!!).
role of the environment
importance of the organism’s body
social nature of intelligence
The digital framework ‘saw’ mind as computer
Connectionism in part ‘sees’ mind as brain
Some researchers studying the dynamical quality of motor
activity state it is difficult to ignore the complex manner in
which behaviours change over time. These researchers do
not use the digital framework but instead use the dynamical
systems framework as an alternative way of thinking about cognition.
A dynamical system changes over time according to some rule; modelling the changes over time uses differential equations.
MIND AS MOTION
Reasons why one should think of cognition as a dynamical system
2. Continuity in state
3. Multiple simultaneous interactions
4. Self-organisation & emergence of structure
All 3 approaches pay closer attention to how natural systems might elucidate cognition.
None are complete.
Together they complement each other.
3 approaches attempt to deal with the
shortcomings of cognitive models
Concern with biological implementation
Concern with problems in learning
Concern with problems of representation
Can an inductive learning procedure discover abstract
generalisations, using only examples (associations?), rather than
explicitly formulated instructions (categorical?)?
How do the resulting knowledge bases capture generalisations?
Are there important differences between traditional rule systems and
the ways in which networks represent generalisations?
Rejects cognition as highly developed mental activity (chess)
Emphasis is on intelligence for survival
Role of evolution in achieving adaptation
Role of Adaptation
Emergence of structures and behaviours that were not designed
Emergence of structures and behaviours as an outgrowth of interaction
Concern with interaction & emergence
Mathematical framework for understanding emergence &
high-order interaction found in connectionist and AI models.
Deeper commitment to incorporation of the importance of
time into models.
Hinton & Shallice (1991)
(cf. the linguistic equivalent: hidden units in a connectionist network do not always acquire an easily identifiable specific function)
meanings are represented by a pattern of
activation distributed over many microfeatures
a general degradation of performance will result from the loss of microfeatures and not from the loss of whole items.
The clearest indication of item loss is taken as response consistency
If ‘vampire’ as a unit is lost then the meaning of the word will be
unavailable. The same would apply to ‘bites’
If a unit is lost that is not easily encoded linguistically then
the consequences may not be obvious in a linguistic task, but it
may mean that higher level linguistically encoded units become
permanently unavailable or are less easily accessed.
Connectionist models are sensitive to multiple constraints.
The availability of items will depend on the degree to which the task provides constraints.
Alzheimer's Disease patients perform relatively well on highly constrained tasks.
We can reach agreement on the usage of words without
any external referent
It would explain how we acquire words describing private experiences.
How do you know that I mean the same thing by
‘I’m sad today’ as you do?
In the context in which these words repeatedly do and do not occur.
Accuracy for associative word pairs was high; this may reflect the time needed to visually recognise the word pair as a separate distinct compound (e.g. 'pollen' and 'count' as the compound 'pollen-count').
Semantic word pairs were not primed in the LVF (RH), and in addition accuracy was lower.
related prime | unrelated prime | target
1–18 | 1–18 | 1–18
19–36 | 19–36 | 19–36
Participant group 1: items 1–18 with related primes, items 19–36 with unrelated primes
Participant group 2: items 19–36 with related primes, items 1–18 with unrelated primes
Trials presented in random order.
36 target words
Each prime type is presented once to a 'word' target and once to a 'nonword' target.
Prime word presented for 100 msec
Presented for 150 msec
2,000 msec between trials
SOA (stimulus onset asynchrony) = 250 msec (100 + 150): the time between presentation of the prime and the target word
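Expressed as a check, reading the 150 msec as the interval between prime offset and target onset (which is what the slide's own arithmetic, 100 + 150, implies):

```python
# SOA arithmetic for the priming trial timeline.
prime_duration_ms = 100    # prime on screen
prime_to_target_ms = 150   # prime offset to target onset
soa_ms = prime_duration_ms + prime_to_target_ms
print(soa_ms)  # 250
```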
Each critical item type is presented once to a 'word' target and once to a 'nonword' target.
representation & computation
Representations are ‘things’ that stand for something.
When you visually perceive a cat you don’t have a cat in
your head, but a mental image (percept) of the cat.
To shout ‘cat’ when you have this percept requires that
you have some idea of what a cat is (cat concept).
Percepts, ideas & concepts are necessarily entities that
stand for something – internal representations.
What is the ontological status of internal representations?
What representation means varies from discipline to
discipline; from theory to theory.
It is almost universally assumed that all cognitive
processes are computational processes that require
internal representations as the medium of computation.
Are intelligent systems computational systems?
Is a symbolic computational framework plausible for
explaining biological cognitive processing?
Do computer simulations explain how the mind/brain works?
OR is the status of internal representations problematic?
What makes something a representation?
What is the relation between computationalism,
representationalism & the medium of computation?
Questioning the notion that computation and symbolic-
digital processing are the same thing.
Are all internal representations either symbols or analog quantities?
Cognition without representation?
Intrinsic internal representations are required if a system traffics in entities whose content is separate from our descriptions / interpretations.
BUT mental images of past experience are intrinsic.
By contrast, extrinsic representations (arbitrary rocks as content-entities) do depend on our descriptions / interpretations.
“If brains implement symbolic-digital processing, we should be
able to structurally decompose brains into their discrete rules and
symbols. But we cannot. Whilst we can describe the brain as if it
implemented symbolic-digital processing, that is a far cry from the
brain actually doing it”
Connectionist (distributed networks) and anti-computationalists
take the absence of discrete rules & symbols as evidence that the
brain is not a symbolic-digital computer.
Brains could implement non-symbolic analog processing; there are important differences between artificial networks & real populations of neurons in the brain. Analog quantities do the computational work in brains, but in artificial neural nets it is the activation pattern among processing units. BOTH of these are poor candidates for being representations; instead they are the medium of computation.
Given the right sort of interpretation, analog quantities or distributed patterns of activation could be representations.
It is the interpretation process that makes them representations, and therefore at best they are extrinsic representations that may be descriptively useful constructs, but they are not intrinsic internal representations.
How much computational labour do biologically intelligent systems off-load to their environment, thus minimising the need for internal representations?
Hybrid symbolic/connectionist model
(i) text (ii) reader LTM
(i) share an argument (ii) meaningful relationship
Stage 1: Construction
Propositions are generated from
all text propositions & all relevant LTM propositions
(therefore information in excess of relevance at this stage)
Related propositions are linked
Hit[boy,girl] linked to Run[girl,home] – girl shared
Information from LTM imported [girl,female]
Inhibition [-ve weighted links]: nodes inhibit one another
Process is BOTTOM-UP : simple local rules
Representation is comprehensive rather than coherent
Stage 2: Integration
Constructed representation converted into a coherent one
Emphasis = important information = develop situation model
Accomplished by spreading activation between nodes
Nodes with strong connection strengths share activation
Nodes with more activation spread more to their neighbours
BETTER-CONNECTED nodes gain most of the activation;
less well connected nodes get little or no activation
Critical: Comprehension is cyclic
Information not currently in use → LT store
Strength of a LT proposition = activation accrued over cycles
Nodes kept active for more than one cycle accrue more activation → LTM
Leads to a situation model:
The most important nodes [as measured by connectedness]
gain most of the activation
Model predicts reader’s memory will be best for
best-connected propositions = model of the story.
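The integration stage can be sketched as repeated spreading of activation over a small proposition network until the pattern settles. The propositions and connection weights below are invented for illustration, not taken from Kintsch's simulations.

```python
# Integration sketch: activation is spread repeatedly over a proposition
# network; well-connected nodes end up with most of the activation.

links = {
    ("Hit[boy,girl]", "Run[girl,home]"): 1.0,  # share 'girl'
    ("Run[girl,home]", "Female[girl]"): 1.0,   # imported from LTM
    ("Hit[boy,girl]", "Female[girl]"): 1.0,
    ("Hit[boy,girl]", "Blue[sky]"): 0.1,       # weakly connected
}

nodes = {n for pair in links for n in pair}
weight = {n: {} for n in nodes}
for (a, b), w in links.items():
    weight[a][b] = weight[b][a] = w

activation = {n: 1.0 for n in nodes}
for _ in range(20):
    new = {n: sum(w * activation[m] for m, w in weight[n].items())
           for n in nodes}
    total = sum(new.values())
    activation = {n: v / total for n, v in new.items()}  # normalise

# The best-connected proposition dominates; the weakly linked one fades.
print(max(activation, key=activation.get))  # Hit[boy,girl]
print(min(activation, key=activation.get))  # Blue[sky]
```

The settled pattern is the model's prediction of what is remembered best: the best-connected propositions form the situation model.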
Kintsch: comprehension as a broad framework for cognition
Networks show same conjunction errors as Kahneman & Tversky (1982)
Framework can be used to model preference choices
Model is agnostic to probability theory – no numerical data, only the valence (+/-) of propositions
If a Construction-Integration model is able to model participant decisions,
and if its success depends on underlying text representations,
then there is a strong argument for the role of content in preferential judgement.
Belief: some of the differences in choices caused by framing & content manipulations may be due to how the problem elements are organised in a representation of the problem.
3 types of representation:
(ii) Decision tree structure
Temporally ordered sequence of events linked by antecedent –
consequent (causal) relationships dominate over a hierarchical
structure. Decision problem with several courses of action, each
represented as a story line
contingent future paths are subordinate to the main sequence in each storyline.
Format of traditional decision problem
Several alternative courses of action
Alternative courses conditioned on chance/cause nodes
Leads to outcomes with consequences
content-rich domains: tabular model utterly fails
gambles: narrative model fails
legal: good fit for narrative & decision tree models
stock: good fit for decision tree
grade: fit for narrative & decision tree
2. MEMORY RECALL: critical to whether the simulation is capturing representations. Models do not fit this criterion equally.
legal: narrative better at predicting what parts of the story are remembered. DISCARD decision tree.
gamble : decision tree is excellent at predicting what parts of the story
are remembered. Use more formal representations. DISCARD narrative.
grade : no consensus representation, both moderate correlations
stock : neither model does well.
Narrative representations are more appropriate to legal stories whereas decision trees are more appropriate for gambles.
Despite an identical underlying decision structure
they produce different mental representations of that
structure in participants.
Decision making relies on a variety of cognitive processes.
Domain content is associated with different processes; findings from one domain (e.g. gambles) do not generalise to medical, legal or financial decision contexts.
Content is critical in determining decisions.
If the context changes, there is a change in the way we think about decisions, interacting with personal & moral relevance, so that the information inputs to the decision process are represented differently.
Information processing theories that integrate cognitive computational models & decision-making processes
Combining the CI model & models of LTM
The Construction-Integration model simulates on-line processing, but the relationship with the structure of LTM is unknown; combining them may allow stronger predictions of decisions.
Use CI to assess what knowledge is used in making a decision and to specify the mental representation.