
Explanation and Simulation in Cognitive Science


Presentation Transcript


  1. Explanation and Simulation in Cognitive Science • Simulation and computational modeling • Symbolic models • Connectionist models • Comparing symbolism and connectionism • Hybrid architectures • Cognitive architectures

  2. Simulation and Computational Modeling • With detailed and explicit cognitive theories, we can implement the theory as a computational model • And then execute the model to: • Simulate cognitive capacity • Derive predictions from the theory • The predictions can then be compared to empirical data

  3. Questions • What kinds of theories are amenable to simulation? • What techniques work for simulation? • Is simulating the mind different from simulating the weather?

  4. The Mind & the Weather • The mind may just be a complex dynamic system, but it isn’t amenable to generic simulation techniques: • The relation between theory and implementation is indirect: theories tend to be rather abstract • The relation between simulation results and empirical data is indirect: simulations tend to be incomplete • The need to simulate helps make theories more concrete • But “improvement” of the simulation must be theory-driven, not just an attempt to capture the data

  5. Symbolic Models • High-level functions (e.g., problem solving, reasoning, language) appear to involve explicit symbol manipulation • Example: Chess and shopping seem to involve representation of aspects of the world and systematic manipulation of those representations

  6. Central Assumptions • Mental representations exist • Representations are structured • Representations are semantically interpretable

  7. What’s in a representation? • Representation must consist of symbols • Symbols must have parts • Parts must have independent meanings • Those meanings must contribute to the meanings of the symbols which contain them • e.g., “34” contains “3” and “4”, parts which have independent meanings • the meaning of “34” is a function of the meaning of “3” in the tens position and “4” in the units position

  8. In favor of structured mental representations • Productivity • It is through structuring that thought is productive (finite number of elements, infinite number of possible combinations) • Systematicity • If you can think “John loves Mary”, you can think “Mary loves John” • Compositionality • The meaning of “John loves Mary” is a function of its parts, and their modes of combination • Rationality • If you know A and B is true, then you can infer A is true (Fodor & Pylyshyn, 1988)

  9. What do you do with them? • Suppose we accept that there are symbolic representations • How can they be manipulated? …by a computing machine • Any such approach has three components • A representational system • A processing strategy • A set of predefined machine operations

  10. Automata Theory • Identifies a family of increasingly powerful computing machines • Finite state automata • Push-down automata • Turing machines

  11. Automata, in brief (Figure 2.2 in Green et al., Chapter 2) • This FSA takes as input a sequence of on and off messages, and accepts any sequence ending with an “on” • A PDA adds a stack: an infinite-capacity, limited-access memory, so that what the machine does depends on the input, the current state, plus the memory
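The automaton just described is small enough to write down directly. Below is a minimal sketch (not Green et al.'s exact Figure 2.2; the state names are illustrative assumptions) of an FSA that reads "on"/"off" messages and accepts any sequence ending with "on":

```python
def accepts(messages):
    """Return True if the sequence of 'on'/'off' messages ends with 'on'."""
    state = "LAST_OFF"                      # start state: no trailing "on" seen
    for msg in messages:
        if msg == "on":
            state = "LAST_ON"
        elif msg == "off":
            state = "LAST_OFF"
        else:
            raise ValueError(f"unexpected input: {msg}")
    return state == "LAST_ON"               # accepting state

print(accepts(["off", "on", "off", "on"]))  # True
print(accepts(["on", "off"]))               # False
```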

  12. A Turing machine changes this memory to allow any location to be accessed at any time. And the state transition function specifies read/write instructions, as well as which state to move to next. • Any effective procedure can be implemented on an appropriately programmed Turing machine • And a Universal Turing machine can emulate any Turing machine, via a description on the tape of that machine and its inputs • Hence, philosophical disputes: • Is the brain Turing powerful? • Does machine design matter or not?

  13. More practical architectures • Von Neumann machines: • Strictly less powerful than Turing machines (finite memory) • Distinguished area of memory for stored programs • Makes them conceptually easier to use than TMs • A special memory location points to the next instruction on each processing cycle: fetch instruction, move pointer to next instruction, execute current instruction
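The fetch/advance/execute cycle can be sketched in a few lines. The tiny instruction set (INC, HALT) below is a hypothetical illustration, not any real machine's:

```python
memory = {"x": 0}                                         # data memory
program = [("INC", "x"), ("INC", "x"), ("HALT", None)]    # stored program

ip = 0                                       # the "special memory location"
while True:
    op, arg = program[ip]                    # fetch the current instruction
    ip += 1                                  # move the pointer to the next one
    if op == "INC":                          # execute the fetched instruction
        memory[arg] += 1
    elif op == "HALT":
        break

print(memory["x"])                           # 2
```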

  14. Production Systems • Introduced by Newell & Simon (1972) • Cyclic processor with two main memory structures • Long-term memory with rules (~productions) • Working memory with a symbolic representation of the current system state • Example: IF goal(sweeten(X)) AND available(sugar) THEN action(add(sugar, X)) AND retract(goal(sweeten(X)))

  15. Recognize phase (pattern matching) • Find all rules in LTM that match elements in WM • Act phase (conflict resolution) • Choose one matching rule, execute, update WM and (possibly) perform action • Complex sequences of behavior can thus result • Power of pattern matcher can be varied, allowing different use of WM • Power of conflict resolution will influence behavior given multiple matches • Most specific? • This works well for problem-solving. Would it work for pole-balancing?
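A minimal sketch of the recognize-act cycle, using the sweeten(X) rule from slide 14. The string-based pattern matcher and pick-the-first-match "conflict resolution" here are illustrative simplifications, not Newell & Simon's implementation:

```python
working_memory = {"goal(sweeten(tea))", "available(sugar)"}

def rule_sweeten(wm):
    """IF goal(sweeten(X)) AND available(sugar) THEN add sugar to X, retract the goal."""
    for fact in wm:
        if fact.startswith("goal(sweeten(") and "available(sugar)" in wm:
            x = fact[len("goal(sweeten("):-2]                 # crude binding of X
            return lambda: (wm.add(f"added(sugar, {x})"), wm.discard(fact))
    return None

rules = [rule_sweeten]

while True:
    matches = [m for m in (r(working_memory) for r in rules) if m]  # recognize phase
    if not matches:
        break
    matches[0]()                            # act phase: trivially pick the first match

print(working_memory)   # {'available(sugar)', 'added(sugar, tea)'}
```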

  16. Connectionist Models • The basic assumption • There are many processors connected together, and operating simultaneously • Processors: units, nodes, artificial neurons

  17. A connectionist network is… • A set of nodes, connected in some fashion • Nodes have varying activation levels • Nodes interact via the flow of activation along the connections • Connections are usually directed (one-way flow), and weighted (strength and nature of interaction; positive weight = excitatory; negative = inhibitory) • A node’s activation will be computed from the weighted sum of its inputs
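The core computation of a single unit can be written in one line: a weighted sum of the inputs, usually passed through a squashing function. The logistic function used below is one common choice, assumed here for illustration:

```python
import math

def unit_activation(inputs, weights, bias=0.0):
    net = sum(i * w for i, w in zip(inputs, weights)) + bias   # weighted sum of inputs
    return 1.0 / (1.0 + math.exp(-net))                        # squash into (0, 1)

# positive weight = excitatory, negative weight = inhibitory
print(unit_activation([1.0, 1.0], [0.8, -0.5]))                # ~0.57
```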

  18. Local vs. Distributed Representation • Parallel Distributed Processing is a (the?) major branch of connectionism • In principle, a connectionist node could have an interpretable meaning • E.g., active when ‘red’ input, or ‘grandmother’, or whatever • However, an individual PDP node will not have such an interpretable meaning • Activation over whole set of nodes corresponds to ‘red’ • Individual node participates in many such representations

  19. PDP • PDP systems lack systematicity and compositionality • Three main types of networks: • Associative • Feed-forward • Recurrent

  20. Associative • To recognize and reconstruct patterns • Present activation pattern to subset of units • Let network ‘settle’ in stable activation pattern (reconstruction of previously learned state)
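A Hopfield-style sketch of this settling process (one illustrative choice of associative network): a single pattern is stored in the weights, a degraded cue is presented, and repeated updates restore the learned state:

```python
import numpy as np

pattern = np.array([1, -1, 1, -1, 1])            # a previously learned pattern
W = np.outer(pattern, pattern).astype(float)     # store it in the weights
np.fill_diagonal(W, 0.0)                         # no self-connections

state = np.array([1, -1, -1, -1, 1])             # degraded cue: one unit is wrong
for _ in range(5):                               # let the network settle
    for i in range(len(state)):
        state[i] = 1 if W[i] @ state >= 0 else -1

print(state)                                     # recovers [ 1 -1  1 -1  1]
```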

  21. Feedforward • Not for reconstruction, but for mapping from one domain to another • Nodes are organized into layers • Activation spreads through the layers in sequence • A given layer can be thought of as an “activation vector” • Simplest case: • Input layer (stimulus) • Output layer (response) • Two-layer networks are very restricted in power; intermediate (hidden) layers provide most of the additional computational power needed.
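A sketch of a single forward pass through an input, hidden, and output layer. The weights below are random, purely to show how activation vectors flow between layers; a real model would learn them (see the learning slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 1.0])            # input layer: the stimulus
W_ih = rng.normal(size=(4, 3))           # input -> hidden weights
W_ho = rng.normal(size=(2, 4))           # hidden -> output weights

hidden = sigmoid(W_ih @ x)               # hidden-layer activation vector
output = sigmoid(W_ho @ hidden)          # output layer: the response
print(output)
```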

  22. Recurrent • Feedforward nets compute mappings given the current input only. Recurrent networks allow the mapping to take into account previous input. • Jordan (1986) and Elman (1990) introduced networks with: • Feedback links from the output or hidden layers to context units, and • Feedforward links from the context units to the hidden units • Jordan network output depends on the current input and the previous output • Elman network output depends on the current input and the whole of the previous input history
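A sketch of one processing step of an Elman-style network: the hidden layer receives the current input plus a copy of its own previous activation via the context units, so the output can reflect input history. Layer sizes and weights here are arbitrary illustrations:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
W_in  = rng.normal(size=(4, 3))          # input -> hidden
W_ctx = rng.normal(size=(4, 4))          # context -> hidden (feedforward links)
W_out = rng.normal(size=(2, 4))          # hidden -> output

context = np.zeros(4)                    # context units start empty
for x in [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]:
    hidden = sigmoid(W_in @ x + W_ctx @ context)
    output = sigmoid(W_out @ hidden)
    context = hidden.copy()              # feedback link: copy hidden layer into context
    print(output)                        # depends on current and previous inputs
```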

  23. Key Points about PDP • It’s not just that a net can recognize a pattern or perform a mapping • It’s the fact that it can learn to do so, on the basis of limited data • And the way that networks respond to damage is crucial

  24. Learning • Present network with series of training patterns • Adjust the weights on connections so that the patterns are encoded in the weights • Most training algorithms perform small adjustments to the weights per trial, but require many presentations of the training set to reach a reasonable degree of performance • There are many different learning algorithms

  25. Learning (contd.) • Associative nets support the Hebbian learning rule: • Adjust the weight of a connection by an amount proportional to the correlation in activity of the corresponding nodes • So if both are active, increase the weight; if both are inactive, increase the weight; if they differ, decrease the weight • Important because this is biologically plausible…and very effective
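The rule as stated above, written as a one-line weight update. Activations are coded +1 (active) / -1 (inactive) so that "both inactive" also strengthens the connection, matching the correlational version described on the slide:

```python
import numpy as np

def hebb_update(W, pre, post, lr=0.1):
    return W + lr * np.outer(post, pre)          # delta_w[i, j] = lr * post[i] * pre[j]

pre  = np.array([ 1, -1,  1])                    # +1 = active, -1 = inactive
post = np.array([ 1,  1, -1])
W = hebb_update(np.zeros((3, 3)), pre, post)
print(W)   # weights grow where pre and post agree, shrink where they differ
```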

  26. Learning (contd.) • Feedforward and recurrent nets often exploit the backpropagation of error rule • Actual output compared to expected output • Difference computed and propagated back to input, layer by layer, requiring weight adjustments • Note: unlike Hebb, this is supervised learning
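A compact sketch of supervised learning with backpropagation of error on a one-hidden-layer network, trained here on XOR purely as an illustration (the task, layer sizes, and learning rate are assumptions, not part of the slide):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # training inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # expected outputs

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)                # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)                # hidden -> output
lr = 0.5

for _ in range(5000):                           # many presentations of the training set
    H = sigmoid(X @ W1 + b1)                    # forward pass: hidden layer
    Y = sigmoid(H @ W2 + b2)                    # forward pass: actual output
    d_out = (Y - T) * Y * (1 - Y)               # error at the output layer
    d_hid = (d_out @ W2.T) * H * (1 - H)        # error propagated back a layer
    W2 -= lr * H.T @ d_out;  b2 -= lr * d_out.sum(axis=0)     # small weight adjustments
    W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(axis=0)

print(np.round(Y, 2))                           # typically close to [[0], [1], [1], [0]]
```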

  27. Psychological Relevance • Given a network of fixed size, if there are too few units to encode the training set, then interference occurs • This is suboptimal, but is better than nothing, since at least approximate answers are provided • And this is the flip side of generalization, which provides output for unseen input • E.g., weep → wept; bid → bid

  28. Damage • Either remove a proportion of connections • Or introduce random noise into activation propagation • And behavior can simulate that of people with various forms of neurological damage • “Graceful degradation”: impairment, but residual function

  29. Example of Damage • Hinton & Shallice (1991), Plaut & Shallice (1993) on deep dyslexia: • Visual error (‘cat’ read as ‘cot’) • Semantic error (‘cat’ read as ‘dog’) • Networks constructed for orthography-to-phonology mapping, lesioned in various ways, producing behavior similar to human subjects

  30. Symbolic Networks • Though distributed representations have proved very important, some researchers prefer localist approaches • Semantic networks: • Frequently used in AI-based approaches, and in cognitive approaches which focus on conceptual knowledge • One node per concept; typed links between concepts • Inference: link-following
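A minimal sketch of a localist semantic network: one node per concept, typed links between concepts, and inference by following "isa" links. The concepts and link types below are illustrative assumptions:

```python
links = {
    "canary": {"isa": "bird", "can": "sing"},
    "bird":   {"isa": "animal", "has": "wings"},
    "animal": {"can": "move"},
}

def infer(concept, link_type):
    """Follow 'isa' links upward until a value for the requested link is found."""
    while concept is not None:
        if link_type in links.get(concept, {}):
            return links[concept][link_type]
        concept = links.get(concept, {}).get("isa")   # climb the isa hierarchy
    return None

print(infer("canary", "can"))   # 'sing'  (stored directly on the canary node)
print(infer("canary", "has"))   # 'wings' (inherited from bird via link-following)
```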

  31. Production systems with spreading activation • Anderson’s work (ACT, ACT*, ACT-R) • Symbolic networks with continuous activation values • ACT-R never removes working memory elements; activation instead decays over time • Productions chosen on basis of (co-) activation

  32. Interactive Activation Networks • Essentially, localist connectionist networks • Featuring self-excitatory and lateral inhibitory links, which ensure that there’s always a winner in a competition (e.g., McClelland & Rumelhart’s model of letter perception) • Appropriate combinations of levels, with feedback loops between them, allow modeling of complex data-driven and expectation-driven behavior
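A sketch of the competition dynamics: localist nodes with self-excitation, lateral inhibition, and decay. With the illustrative parameter values below (assumptions, not McClelland & Rumelhart's), the node with the strongest bottom-up input ends up as the sole winner:

```python
import numpy as np

a = np.array([0.20, 0.30, 0.25])               # initial bottom-up activations
self_excite, inhibit, decay = 0.2, 0.3, 0.1    # illustrative parameters

for _ in range(50):
    lateral = inhibit * (a.sum() - a)          # inhibition from all other nodes
    a = a + self_excite * a - lateral - decay * a
    a = np.clip(a, 0.0, 1.0)                   # keep activations in [0, 1]

print(np.round(a, 2))                          # [0. 1. 0.]: the strongest node wins
```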

  33. Comparing Symbolism & Connectionism • As is so often the case in science, the two approaches were initially presented as exclusive alternatives

  34. Connectionist: • Interference • Generalization • Graceful degradation • Symbolists complain: • Connectionists don’t capture structured information • Network computation is opaque • Networks are “merely” implementation-level

  35. Symbolic: • Productive • Systematic • Compositional • Connectionists complain: • Symbolists don't relate assumed structures to the brain • They relate them to von Neumann machines

  36. Connectionists can claim: • Complex rule-oriented behavior *emerges* from interaction of subsymbolic behavior • So symbolic models describe, but do not explain

  37. Symbolists can claim: • Though PDP models can learn implicit rules, the learning mechanisms are usually not neurally plausible after all • Performance is highly dependent on exact choice of architecture

  38. Hybrid Architectures • But really, the truth is that different tasks demand different technologies • Hybrid approaches explicitly assume: • Neither the connectionist nor the symbolic approach is flawed • Their techniques are compatible

  39. Two main hybrid options: • Physically hybrid models: • Contain subsystems of both types • Issues: interfacing, modularity (e.g., use Interactive Activation Network to integrate results) • Non-physically hybrid models • Subsystems of only one type, but described two ways • Issue: levels of description (e.g., connectionist production systems)

  40. Cognitive Architectures • Most modeling is aimed at specific processes or tasks • But it has been argued that: • Most real tasks involve many cognitive processes • Most cognitive processes are used in many tasks • Hence, we need unified theories of cognition

  41. Examples • ACT-R (Anderson) • Soar (Newell) • Both based on production system technology • Task-specific knowledge coded into the productions • Single processing mechanism, single learning mechanism

  42. Like computer architectures, cognitive architectures tend to make some tasks easy, at the price of making others hard • Unlike computer architectures, cognitive architectures must include learning mechanisms • But note that the unified approaches sacrifice genuine task-appropriateness and perhaps also biological plausibility

  43. A Cognitive Architecture is: • A fixed arrangement of particular functional components • A processing strategy
