
Computational Cognitive Modelling



  1. Computational Cognitive Modelling COGS 511-Lecture 7 Computational Cognitive Modelling in Neuropsychology – Lesioning Connectionist Models COGS 511

  2. Related Readings. Readings (Course Pack): Farah, Locality, Ch 36; Cohen and Servan-Schreiber, Context, Cortex and Dopamine, Ch 14. Optional: McClelland et al., Complementary Learning Systems, Ch 17. Last three articles from Polk and Seifert (2002). McLeod et al., Connectionist neuropsychology: lesioning networks, in McLeod et al. (1998). Sibley and Kello (2005) Double Dissociations. Cognitive Systems Research 6: 61-69. D. Stein and J. Ludik (1998) Neural Networks and Psychopathology: Connectionist Models in Practice and Research. Cambridge University Press. Saffran et al. (2000) Computational Modeling of Language Disorders. In Gazzaniga (ed) The New Cognitive Neurosciences, 2nd ed. MIT Press (both available from METU library). Suggested More Recent Readings: Frank, M. and E. D. Claus (2006) Anatomy of a Decision: Striato-orbitofrontal Interactions in Reinforcement Learning, Decision Making and Reversal. Psychological Review 113(2): 300-326. O’Reilly, R. and M. Frank (2006) Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Computation 18(2): 283-328. (More in the notes part of the slides.)

  3. Desiderata for Lesioning Models • Be able to provide specific descriptions for • graceful degradation and recovery patterns of performance after brain damage • variation among patients and between patients and normal subjects • temporal dynamics of processing, e.g. response time and event-related potential data • Fine-grained hypotheses of brain-behaviour mapping • Selectively evaluate underspecified areas of cognitive representations in areas such as semantics • Newer models: Integrate cognitive neuroscience evidence.

  4. Lesioning Connectionist Models • Weakening/Removing Connections • Deleting Units • Increasing Noise/rate of decay of activation • Aims: • Testing the Validity of the Model • Bringing New Insights into the Pathological Conditions
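
The lesioning operations listed on this slide can be sketched on a single layer of connections. This is a minimal illustration, assuming the layer is stored as a list of weight rows (one row per sending unit); the function names, weights, and activations are hypothetical and are not taken from any of the models discussed in the lecture.

```python
import random

def weaken_connections(weights, factor):
    """Scale every connection weight by `factor` (0 < factor < 1 weakens)."""
    return [[w * factor for w in row] for row in weights]

def remove_connections(weights, proportion, rng):
    """Zero each individual connection with probability `proportion`."""
    return [[0.0 if rng.random() < proportion else w for w in row]
            for row in weights]

def delete_units(weights, unit_indices):
    """Silence whole sending units by zeroing all their outgoing weights."""
    return [[0.0] * len(row) if i in unit_indices else list(row)
            for i, row in enumerate(weights)]

def add_noise(net_inputs, sd, rng):
    """Add zero-mean Gaussian noise to the net input of each unit."""
    return [x + rng.gauss(0.0, sd) for x in net_inputs]

def net_input(activations, weights):
    """Net input to each receiving unit: sum of activation times weight."""
    n_receiving = len(weights[0])
    return [sum(a * row[j] for a, row in zip(activations, weights))
            for j in range(n_receiving)]

rng = random.Random(0)
# Hypothetical layer: 4 sending units (rows), 3 receiving units (columns).
weights = [[0.8, -0.2, 0.1],
           [0.4, 0.6, -0.5],
           [-0.3, 0.9, 0.2],
           [0.5, 0.1, 0.7]]
activations = [1.0, 0.5, 0.0, 1.0]

intact = net_input(activations, weights)
lesioned = net_input(activations, delete_units(weights, {0, 3}))
print("intact net input:  ", intact)
print("lesioned net input:", lesioned)
```

Comparing the model's behaviour before and after such a manipulation is what allows it to be tested against patient data.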

  5. Lesioning Cognitive Architectures • fMRI/ACT-R mapping studies; possible lesioning by manipulation of parameters such as noise, decay and effort • Haarmann, Just and Carpenter (1997) model of sentence comprehension in aphasia using CAPS: • Hypothesis: The comprehension deficit results from decreased verbal working memory capacity • Realized by: Decreased activation available to working memory elements.

  6. The capacity limit of working memory • Capacity limit: the observation that people’s performance declines rapidly with an increase in memory demand in a wide variety of experimental tasks. • Memory demand: the number of independent items that must be held simultaneously available for processing (Oberauer and Kliegl, 2006). • A highly general phenomenon, found and characterized in a variety of experimental tasks.

  7. Resource accounts of capacity limits • Limited pool of activation • that must be shared among all memory and processing tasks within broad content domains, e.g. one resource pool for verbal tasks and another one for spatial tasks (Just and Carpenter, 1992) • a limited amount of source activation that must be shared among the chunks held in working memory at any time (Anderson, Reder, and Lebiere, 1996) • Criticized for being too unconstrained and empirically empty (Navon, 1984).

  8. Alternatives • Capacity limit arising from rapid time-based decay (Barrouillet et al., 2004; Page and Norris, 1998; Kieras et al., 1999). • Capacity limit arising from mutual interference between partially overlapping representations held in working memory (Nairne, 1990; Saito and Miyake, 2004). • Or a limit on the focus of attention, which can be directed to a maximum of about four chunks (Cowan, 2001).

  9. Neurologically Driven Connectionist Models • Separate modules for representation, maintenance, and selective gating. No inhibition to introduce capacity limitations. • frontal cortex: a maintenance mechanism for the processing of sensory information • basal ganglia: selective gating mechanisms that help modulate which incoming items are kept recurrently activated • the amygdala: learning and memory through dynamic gating (Frank, Loughry, and O’Reilly, 2001; O’Reilly, 2006; O’Reilly and Frank, 2006). • A recurrent network of different kinds of spiking neurons, built with synaptic remodeling as a developmental mechanism (Macoveanu et al., 2006), in which factors such as overall connectivity, synaptic specificity and maximal synaptic strength cause differences in working memory capacity.

  10. Cohen and Servan-Schreiber (1992) • Set of connectionist models that simulate both normal and schizophrenic patterns of performance on • the Stroop task and the Continuous Performance Test (selective attention) • a Lexical Disambiguation Task (language processing) • Single Functional Deficit: A disturbance in the internal representation of context • Related to a Biological Deficit: Reduction of the Effects of Dopamine in Prefrontal Cortex

  11. Background Terminology • Neurotransmitters: any of a group of chemical agents released by neurons to stimulate neighbouring neurons, thus allowing impulses to be passed from one cell to the next throughout the nervous system, e.g. • acetylcholine, • norepinephrine (noradrenaline), • dopamine, and • serotonin. • Neuromodulators are substances that do not directly activate ion-channel receptors but that, acting together with neurotransmitters, enhance the excitatory or inhibitory responses of the receptors.

  12. Empirical Evidence Suggests • Schizophrenics suffer from an inability to construct and maintain internal representations of context for the control of action. Context information is relevant to but distinct from the content of the actual response (it is not thought to be related to short-term memory). • Prefrontal cortex plays a role in maintaining such representations. • An intact mesocortical dopamine system is necessary for the functioning of prefrontal cortex. • Disturbances of both the mesocortical dopamine system and the prefrontal cortex appear to be involved in schizophrenia.

  13. Stroop Task • Commonly used for studying selective attention: the ability to respond to one set of stimuli even when other, more compelling stimuli are available. • Two subtasks: • Name the color of the ink in which a word is printed • Read the word aloud while ignoring the ink color • Three conditions: • Conflict stimuli: the word RED printed in a mismatching ink color • Congruent stimuli: the word RED printed in red ink • Control stimuli: XXXX printed in colored ink (color naming) or RED printed in black ink (word reading) • Basic Effects: • Word reading is faster than color naming • Ink color has no effect on the speed of word reading • Words have a large effect on color naming

  14. Schizophrenics on Stroop Task • An Overall Slowing of Responses • A Statistically Disproportionate Slowing of Responses in the Interference Condition of the Color-naming Task • Is this a result of generally slowed performance or of a specific attentional deficit (difficulty with selective attention)? • Internal Context: Task instructions for a specific block of trials (a block consisting of one type of task, e.g. colour naming)

  15. Continuous Performance Test • Aim: Detecting a target event among a sequence of briefly presented stimuli (e.g. detect X appearing in a sequence of letters, or respond to X only when it follows A: CPT-AX) while avoiding responses to distractor stimuli. • Hit rate: percentage of correctly reported targets • False alarms: erroneous responses to distractors. • Schizophrenics show lower hit rates and similar or higher false alarm rates compared to normal subjects, especially in tasks that are context dependent, e.g. a target event consisting of the consecutive reoccurrence of any stimulus (CPT-Double). • Context changes from trial to trial, which places an additional demand on the internal representation of context compared with the Stroop task.
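
To make the two performance measures concrete, the sketch below scores a run of the CPT-Double variant, in which a trial counts as a target when its stimulus repeats the immediately preceding one. The function name and the stimulus and response sequences are made up for illustration; they are not data from any study.

```python
def score_cpt_double(stimuli, responses):
    """Return (hit_rate, false_alarm_rate) for a CPT-Double run.

    stimuli   -- sequence of presented items
    responses -- sequence of booleans, True where the subject responded
    """
    hits = misses = false_alarms = correct_rejections = 0
    for i, responded in enumerate(responses):
        # A target is an immediate repetition of the previous stimulus.
        is_target = i > 0 and stimuli[i] == stimuli[i - 1]
        if is_target and responded:
            hits += 1
        elif is_target:
            misses += 1
        elif responded:
            false_alarms += 1
        else:
            correct_rejections += 1
    n_targets = hits + misses
    n_distractors = false_alarms + correct_rejections
    hit_rate = hits / n_targets if n_targets else 0.0
    false_alarm_rate = false_alarms / n_distractors if n_distractors else 0.0
    return hit_rate, false_alarm_rate

# Hypothetical run: targets occur at positions 2, 5 and 8.
stimuli = list("ABBCDDEFFA")
responses = [False, False, True, True, False, True, False, False, False, False]
hit_rate, false_alarm_rate = score_cpt_double(stimuli, responses)
print(f"hit rate = {hit_rate:.2f}, false alarm rate = {false_alarm_rate:.2f}")
```

Deciding whether a trial is a target requires remembering the previous stimulus, which is the sense in which this variant depends on context.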

  16. Schizophrenic Language Deficits • Tendency to select the dominant meaning of a homonym (here, PEN as a writing instrument) even when the context suggests otherwise: “The farmer needed a new pen for his cattle.” • This occurs only when the context comes first: “You cannot keep chickens [CLEAR SCREEN, PAUSE] without a pen.” • Three conditions: weak meaning correct, context last; weak meaning correct, context first; dominant meaning correct, context first. • Schizophrenics made significantly more dominant meaning errors than did controls when the weak meaning was correct, and only when context came first. • In “clozing” speech (i.e. guessing deleted words), schizophrenics performed comparably well when contextual cues were local (two or three words), but did not improve as normal subjects do when more context (more words) was provided.

  17. Biological Disturbances in Schizophrenia • Studies of prefrontal cortex and its role in maintaining representations, especially in the inhibition of reflexive or previously reinforced tendencies. • A-not-B task, Wisconsin Card Sorting Test • Physiological Imaging Studies • Prefrontal cortex is a primary projection area for the mesocortical dopamine system, a disturbance of which has been clearly shown in schizophrenia. • Dopamine overactivity (hallucinations, delusions) and dopamine underactivity (cognitive impairment, social dysfunction); the latter is mostly associated with frontal lobe activity.

  18. Assumptions of the Connectionist Simulations • The physiological influence of dopamine can be simulated by changes in the gain parameter (which scales the net input in the activation equation) of individual units. • Caveats: Potentiation has not been directly validated experimentally; potentiation of both excitatory and inhibitory inputs has not been fully explained (at the time of writing). • The gain of all units in a supposed modular area is changed equally to simulate the neuromodulatory effect. • Gain is reduced to simulate schizophrenia.

  19. Reducing Gain (Cohen and Servan-Schreiber, 1992)
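
The gain manipulation can be written compactly. The following is a sketch of a logistic activation function with a gain term, where g is the gain, net_j is the summed weighted input to unit j, and b is a bias term. The symbols are chosen here for illustration, and the bias term is an assumption that is not stated on the slides.

```latex
a_j = \frac{1}{1 + e^{-(g \cdot \mathrm{net}_j + b)}},
\qquad
\mathrm{net}_j = \sum_i a_i \, w_{ij}
```

With g = 1 this is the ordinary logistic function. Lowering g flattens the curve, so a given difference in net input produces a smaller difference in activation.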

  20. Modelling the Stroop Effect • Multilayer feedforward network with local input and output units, trained with backpropagation • Two pathways: Color naming and word reading • The task is specified via task demand units that have excitatory and inhibitory links to the relevant pathways • Reaction time: Number of cycles it takes for an output unit to reach a threshold • The model has more training on the word reading task • Able to simulate Stroop task effects in normal subjects

  21. (Cohen and Servan-Schreiber, 1992)

  22. Lesioning • The gain of only the task demand units is reduced. Result: An increase in overall response time and a disproportionate increase on color naming conflict trials. • Explanation: All processing depends on selective attention and context representation to some degree; weaker, less automatic processing (i.e. color naming) more so.

  23. Lesioning for a Generalized Deficit • Cascade rate: The rate at which information accumulates (increased activity) for units in the network in response to an input. • The cascade rates of all units were reduced until the model matched the empirical data in the word reading condition. • Alternatively, the thresholds of all units were raised. • Both manipulations led to an overall slowing of responses but could not reproduce the disproportionate slowing in the interference condition, which argues against the view that a generalized deficit causes the symptoms of schizophrenia.
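
The two generalized-deficit manipulations can be illustrated with a single unit whose net input builds up over cycles until its activation crosses a response threshold. This is a simplified sketch, not the full Stroop network: the running-average update rule, the input strength, and all parameter values are assumptions chosen for illustration.

```python
import math

def cycles_to_threshold(input_strength, cascade_rate=0.1, threshold=0.6,
                        max_cycles=1000):
    """Count processing cycles until a unit's activation reaches threshold.

    On every cycle the net input moves a fraction `cascade_rate` of the
    way toward the external input; the activation is the logistic
    function of the net input. Returns None if threshold is never reached.
    """
    net = 0.0
    for cycle in range(1, max_cycles + 1):
        net = cascade_rate * input_strength + (1.0 - cascade_rate) * net
        activation = 1.0 / (1.0 + math.exp(-net))
        if activation >= threshold:
            return cycle
    return None

baseline = cycles_to_threshold(2.0)
slower_cascade = cycles_to_threshold(2.0, cascade_rate=0.05)
raised_threshold = cycles_to_threshold(2.0, threshold=0.8)
print("baseline cycles:        ", baseline)
print("reduced cascade rate:   ", slower_cascade)
print("raised threshold:       ", raised_threshold)
```

Both changes lengthen the simulated reaction time of this unit regardless of which pathway drives it, which is why they produce overall slowing rather than a selective effect.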

  24. (Cohen and Servan-Schreiber, 1992)

  25. Interpretation • Weights in the colour naming channel are weaker because colour naming has been practised less, so the input to the hidden units is lower in the colour naming channel. With a logistic activation function, the same decrease in gain causes a larger decrease in activation in the colour naming channel.
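
A small numerical example may help with this argument. The sketch below applies the same gain reduction (from 1.0 to 0.6) to a strongly driven unit and a weakly driven unit and compares the loss of activation. The two net input values are hypothetical, and applying the gain directly to these units is a simplification of the model; the size of the effect depends on where the inputs fall on the logistic curve.

```python
import math

def logistic(net, gain):
    """Logistic activation with a gain term scaling the net input."""
    return 1.0 / (1.0 + math.exp(-gain * net))

def activation_drop(net, normal_gain=1.0, reduced_gain=0.6):
    """Activation lost when the gain falls from normal to reduced."""
    return logistic(net, normal_gain) - logistic(net, reduced_gain)

strong_input = 4.0  # hypothetical: well-practised word reading channel
weak_input = 2.0    # hypothetical: less-practised colour naming channel

strong_drop = activation_drop(strong_input)
weak_drop = activation_drop(weak_input)
print(f"drop with strong input: {strong_drop:.3f}")
print(f"drop with weak input:   {weak_drop:.3f}")
```

For these values the weakly driven unit loses more activation, because the strongly driven unit sits on the flat, saturated part of the curve, where scaling the input changes the output very little.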

  26. Simulation of Continuous Performance Test • CPT-Double: A target consisting of any consecutive reoccurrence of a stimulus (e.g. a B immediately following another B). • A recurrent network of four modules: an input module (visual features of individual letters), a prior stimulus module (context, not necessarily individual letters as shown, thus not necessarily a short-term declarative memory), an intermediate associative module, and a response module

  27. (Cohen and Servan-Schreiber, 1992)

  28. Lesioning • Normal subjects’ performance was simulated by adding noise to processing, adjusted to fit the normal subjects’ data. • Schizophrenic performance was simulated by reducing gain (from 1 to 0.6, as in the Stroop simulation) in the units of the prior stimulus module. Result: 45% misses and 9% false alarms (responses to similar stimuli or random errors), as in schizophrenic performance

  29. (Cohen and Servan-Schreiber, 1992)

  30. Context Dependent Lexical Disambiguation • Exploring dominant response biases and maintenance of context within a single model • Same basic network as in the CPT task: a word input module, a response module (i.e. meanings such as “a writing implement” for “pen”), a discourse module (context, i.e. “writing”) and a semantic module. • Training on 30 input words, some of which are lexically ambiguous. Dominant meanings were presented more frequently in the training set. • Also trained on context words (through the discourse module, simultaneously with input words) that share contexts with input words, e.g. CHICKEN in “farming”

  31. Simulations and Lesioning • First present a context word (e.g. CHICKEN) and then an input word (e.g. PEN), both through the input module. The model chooses the appropriate response for the context. Two conditions: context last, context first • Overall noise was adjusted to match normal subjects’ performance. Lesioning was again by reduction of gain in the discourse units. Errors in choosing the less dominant meaning when context came first were similar to schizophrenics’ performance. Gain had no significant effect when context came last or when the dominant meaning was the correct one. • When the representation of context is degraded, it is more susceptible to the cumulative effects of noise.

  32. When the weak meaning is correct... (Cohen and Servan-Schreiber, 1992)

  33. Evaluation of All Three Models • Single parameter change (except for behaviour of control subjects) • Tested in three domains, on models developed independently • Support for hypotheses about dopamine and prefrontal cortex; new predictions

  34. Background Terminology • Neocortex (cf. cerebral cortex) is part of the outermost layer of our brains. It is responsible for our highest mental functions (e.g. planning and strategy formation/execution). Assumed to be the structure in the brain that differentiates mammals from other vertebrates and to be responsible for the evolution of intelligence. • Hippocampus: A structure in the limbic system of the brain involved in emotion, motivation, navigation by cognitive maps, learning, and the consolidation of long-term memory. • Amnesia is a general term used to describe instances of memory loss occurring most often as a result of damage to the brain from trauma, stroke, Alzheimer’s disease, alcohol and drug toxicity, or infection. • Anterograde Amnesia • Retrograde Amnesia

  35. Complementary Learning Systems (McClelland et al., 1995) • Hippocampal system: special role in learning and memory • Extensive lesions of it can cause a profound deficit in learning new material, while material acquired well before the lesion is intact. • Lesions are selective for certain kinds of learning: e.g. paired association studies; some relation to declarative memory; spatial learning • Some kinds of learning seem to be spared, characterized by implicit memory forms, e.g. gradually acquired skills, repetition priming tasks • Selective memory deficit for material acquired shortly before the date of the lesion: graded retrograde amnesia (can extend back as far as 15 years). Performance on recent material can be worse than performance on somewhat older material. • Consolidation: Change in the dependence on the hippocampal system over time.

  36. Complementary Learning Systems Hypotheses • Neocortical Processing System: Task performance through patterns of activation that can act as cues for each other. Each information processing task leads to adaptive adjustments of connections, but one or a few repetitions are not enough to reinstate a pattern. • Hippocampal memory system: Neocortical representations are stored as compressed copies in hippocampal memory; connections between the two systems are bidirectional. Potential for attractor patterns to become stable memories. • Reinstatement in the hippocampal memory system causes reinstatement in neocortex. This would allow control of behavioural responses, and memories ultimately become independent of the hippocampus as neocortical connections are readjusted (a gradual consolidation process in episodic, semantic and encyclopedic memory).

  37. Key Questions • Why are the changes not made directly in the neocortex? • Why does the incorporation of new material into the neocortex take so long? Connectionist modelling may provide an answer.

  38. Experiences with Connectionist Models • Interleaved learning in connectionist systems: items are acquired very gradually, interleaved with exposure to other items. What counts as related may not be apparent from the surface properties: in the conceptual semantics of living things, e.g. how do you know that a robin breathes? • Semantic networks vs connectionist models (the latter learn similar representations for similar concepts and are relieved of the problematic issues of the former, such as dealing with exceptions and multiple inheritance)

  39. Studies based on Rumelhart (1990) • Network on propositions about living things, learning by backpropagation with a small learning rate, trained for 500 epochs (interleaved learning) • Input: Activation of the appropriate concept name and relation term (isa, has, can, and is) • Output: Activation of the correct completion(s) of the input • E.g. robin isa bird, robin isa animal, etc.

  40. (McClelland et al., 1995)

  41. Emerging Similarity Structures • Concept representation units (first layer of hidden units connected to input units): initially all concepts are similar to each other; gradually similarity clusters form among similar concepts (similarity measured by the sum of squared differences between the activations of the corresponding elements). • After training, further interleaved training assigns sparrow a representation similar to those of robin and canary. • Completions for relation terms on which sparrow has never been trained (is, can, has) can then be derived.
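
The similarity measure mentioned on this slide, the sum of squared differences between corresponding activations, can be sketched as follows. The activation vectors are invented for illustration and are not values from the trained network; the helper `nearest_concept` is likewise hypothetical.

```python
def squared_distance(a, b):
    """Sum of squared differences between corresponding activations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_concept(name, representations):
    """Return the other concept whose representation is closest to `name`."""
    others = (c for c in representations if c != name)
    return min(others, key=lambda c: squared_distance(representations[name],
                                                      representations[c]))

# Hypothetical activations over four concept representation units.
representations = {
    "robin":  [0.9, 0.8, 0.1, 0.2],
    "canary": [0.8, 0.9, 0.2, 0.1],
    "oak":    [0.1, 0.2, 0.9, 0.8],
    "pine":   [0.2, 0.1, 0.8, 0.9],
}

bird_bird = squared_distance(representations["robin"], representations["canary"])
bird_tree = squared_distance(representations["robin"], representations["oak"])
print(f"robin vs canary: {bird_bird:.2f}")
print(f"robin vs oak:    {bird_tree:.2f}")
print("nearest to robin:", nearest_concept("robin", representations))
```

Small distances within a category and large distances between categories are what the clusters described above amount to.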

  42. (McClelland et al., 1995)

  43. Catastrophic Interference • AB-AC paradigm: Retroactive interference of one set of associations (AC; e.g. locomotive-dishtowel) on recall of a previously acquired set of associations (AB; e.g. locomotive-banana). • Feedforward, multilayer network of stimulus units (A), context units (like the selective attention units in the Stroop task), and output units for the response (B or C) (McCloskey and Cohen, 1989). • Humans still achieve more than 50% correct recall on the AB list after AC performance reaches asymptote, whereas the network's performance on the AB list drops to about 0% correct. • Reducing catastrophic interference by reducing overlap in shared representations also implies a loss of the ability to exploit similar structures.

  44. (McClelland et al., 1995)

  45. Remedy • Catastrophic interference can be reduced dramatically if new material is added gradually, interleaved with ongoing exposure to other examples from the same domain. • A study using the Rumelhart (1990) network on the acquisition of new information (e.g. penguins are birds but cannot fly) through focused vs interleaved learning showed that learning proceeds much faster in focused learning, but interference is also much greater. • Focal dystonia: an impairment in the limbs caused by focused learning in the human sensory-motor system
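
The contrast between focused and interleaved learning can be reproduced in a very small setting. The sketch below uses a single-layer linear network trained with the delta rule, which is a much simpler stand-in for the backpropagation network discussed in the lecture; the patterns, targets, learning rate, and stopping criterion are all hypothetical.

```python
def predict(weights, x):
    """Output of a linear layer: one weighted sum per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def squared_error(weights, items):
    """Total squared error over a list of (input, target) pairs."""
    return sum((t - y) ** 2
               for x, target in items
               for t, y in zip(target, predict(weights, x)))

def train(weights, items, lr=0.1, tol=0.01, max_epochs=1000):
    """Delta-rule training in place; returns the number of epochs needed
    to bring the error on `items` below `tol`."""
    for epoch in range(1, max_epochs + 1):
        for x, target in items:
            output = predict(weights, x)
            for j, row in enumerate(weights):
                delta = lr * (target[j] - output[j])
                for i in range(len(row)):
                    row[i] += delta * x[i]
        if squared_error(weights, items) < tol:
            return epoch
    return max_epochs

# Two old items, and a new item whose input overlaps with both of them.
old_items = [([1, 0, 1, 0], [1, 0]), ([0, 1, 0, 1], [0, 1])]
new_item = ([1, 0, 0, 1], [0, 1])

def pretrained():
    """A fresh network that has already learned the old items."""
    weights = [[0.0] * 4 for _ in range(2)]
    train(weights, old_items, tol=1e-6)
    return weights

# Focused learning: train on the new item alone.
focused = pretrained()
focused_epochs = train(focused, [new_item])
focused_old_error = squared_error(focused, old_items)

# Interleaved learning: train on the new item mixed with the old items.
interleaved = pretrained()
interleaved_epochs = train(interleaved, old_items + [new_item])
interleaved_old_error = squared_error(interleaved, old_items)

print("focused:     epochs =", focused_epochs,
      " error on old items =", round(focused_old_error, 3))
print("interleaved: epochs =", interleaved_epochs,
      " error on old items =", round(interleaved_old_error, 3))
```

In this toy setting focused training reaches criterion on the new item in fewer epochs but leaves a sizeable error on the old items, while interleaved training takes longer and keeps the old items intact.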

  46. Conclusions • The hippocampus provides a medium for the initial storage of memories (thus rapidly forming memory traces) in a form that avoids interference with the knowledge in the neocortical system. • Incorporation into the neocortex takes a long time in order to reduce interference with the structured material already stored there.

  47. Farah (1994): A Critique of the “Locality” Assumption • The assumption: Damage to one component of the functional architecture will have “local” effects; nondamaged components will function normally. • Views on the range of interactivity (independence and encapsulation) differ among researchers. • Empirical evidence from double dissociations in the literature • Phonological vs surface dyslexia: suggests dual mechanisms for grapheme-phoneme translation vs whole word reading • Function words vs content words in spoken language processing • Face recognition module: overt and covert mechanisms • Multiple learning mechanisms for implicit vs declarative memory

  48. Two Empirical Issues • Modularity vs interactionism is a matter of degree, and thus it is important to know the degree of deviation from locality • Is the locality assumption indispensable to cognitive neuropsychology? “Selective deficit in ability A implies a component of the functional architecture dedicated to A.” • A critical investigation of dissociations in three different domains via connectionist models

  49. Sibley and Kello (2005) • Simulating the defining features of the double dissociation between phonological and surface dyslexia in word reading. • Phonological dyslexia: poor reading aloud of nonwords (e.g. SHONG) with relatively intact word reading; damage to the sublexical route (orthography to phonology)? • Surface dyslexia: poor reading aloud of words with irregular spelling-sound correspondences (e.g. PINT) with relatively intact nonword reading; damage to the lexical route (localist/semantic representations)?

  50. Simulating Double Dissociations • The input gain parameter is manipulated in both a feedforward multilayer perceptron and a recurrent network. • No architectural separation of sublexical vs lexical processes nor of phonological vs semantic processes. • At high levels of input gain, processing is conjunctive: conjunctions of input patterns are computed, so novel words are not supported, similar to phonological dyslexia. • At low levels of input gain, processing is componential: each input dimension is used independently, so irregular items are not supported, similar to surface dyslexia.
