
Functional Constraints on Architectural Mechanisms

Presentation Transcript


  1. Functional Constraints on Architectural Mechanisms
  Christian Lebiere (cl@cmu.edu), Carnegie Mellon University
  Bradley Best (bjbest@adcogsys.com), Adaptive Cognitive Systems

  2. Introduction
  • Goal: Strong Cogsci – a single integrated model of human abilities that is robust, adaptive and general
  • Not just an architecture that supports it (Newell test evaluation), but a system that actually does it
  • Not strong AI (means matter), nor weak Cogsci (general)
  • Plausible strategies:
    • Build a single model from scratch – the traditional AI strategy
    • Incremental assembly – the successful CS systems strategy
    • But little/no reuse of models limits complexity!
  2009 ACT-R Workshop

  3. Model Fitting Constraint
  • Fitting computational models to human data is the “coin of the realm” of cognitive modeling
  • Is it a sufficient constraint to achieve convergence toward the goal of model integration and robustness?
  • Good news: cognitive architectures are increasingly converging toward a common modular organization
  • Bad news: still very little model reuse – almost every task results in a new model developed tabula rasa
  • Question: have we gotten the tradeoff right between precision (fitting data) and generality (reuse/integration)?

  4. You Can’t Play 20 Models…
  • 35 years ago, Newell raised a similar issue with convergence in experimental psychology
  • He attributed much of the problem to the lack of emphasis on the control structure used to solve a problem
  • He offered 3 prognoses for “putting it together”:
    • Complete processing models (and the production-system suggestion) – check!
    • Analyze a complex task (the chess suggestion) – progress, but…
    • One program for many tasks (integration, e.g., WAIS) – fail?
  • What have been the obstacles to putting it together?

  5. Obstacles to Integration
  • Models tend to be highly task-specific – they usually cannot be used directly even for closely related tasks
  • They tend to represent the final point of the process, from initial task discovery to final asymptotic behavior
  • The modeler’s meta-cognitive knowledge of the task gets hardwired into the model
    • Experience with High-Level Language (HLSR) compilation
  • Task discovery processes, including metacognitive processes, should be part of the model/architecture
    • Tackles a broader category of tasks through adaptation

  6. Forcing Functions for Integration
  • Model comparison challenges (e.g. DSF) that feature:
    • Breadth of applicability (e.g. multiple conditions)
    • Unknown conditions and/or data (tuning vs. testing sets)
    • Integration of multiple functionalities (control, prediction)
  • Unpredictable domains, e.g. adversarial behavior:
    • Breadth and variability of behavior
    • Constant push for adaptivity and unpredictability
    • Strong incentive to maximize functionality
  • Architectural implications of model integration?
    • Focus on both control and representation structure

  7. A Tour of Four Modules
  [Diagram: Goal, Imaginal, Retrieval, Visual and Manual buffers mediating between the Intentional, Working Memory, Declarative, Procedural, Vision and Motor modules and the environment]
  • All modules have shortcomings in robustness and generality
  • The ability to craft models for lab tasks does not guarantee plausible behavior in open-ended situations

  8. Module 1: Declarative
  • Base-level learning can lead to looping if unchecked
    • The most active chunk is retrieved, then its activation is boosted…
    • Very hard to control if compiling a higher-level model
    • Many logical conditions require repeated retrieval loops
  • Old solution: tag the chunk on retrieval (e.g. list learning):

      +retrieval>
         isa item
         index =index
       - retrieved =goal
      =retrieval>
         retrieved =goal

  • New solution: declarative finsts to perform the tagging:

      (sgp :declarative-num-finsts 5 :declarative-finst-span 10)
      +retrieval>
         isa item
         index =index
         :recently-retrieved nil

  9. Base-Level Inhibition (BLI)
  • Provides inhibition of return, resulting in a soft, adaptive round-robin in free-recall procedures w/o requiring any additional constraints
  • Also applies in other domains: arithmetic, web navigation, physical environments
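
The looping pathology and the inhibition-of-return fix can be sketched outside ACT-R. The following Python sketch uses the standard base-level activation (log of summed decaying traces) plus a simplified additive inhibition term that fades with time since the last retrieval; the inhibition form and all parameter values are illustrative assumptions, not the exact BLI equation:

```python
import math

def activation(times, now, inhibit=0.0):
    """Base-level activation: log of summed decaying traces of past
    retrievals, minus an inhibition term that is strong right after
    the last retrieval and fades as time passes (inhibition of return).
    The additive inhibition form is a simplification, not ACT-R's BLI."""
    base = math.log(sum((now - t) ** -0.5 for t in times))
    last = now - max(times)              # time since last retrieval
    return base - inhibit / last

def free_recall(inhibit, steps=4):
    """Repeatedly retrieve the most active chunk, boosting it each time."""
    history = {"a": [0.0], "b": [0.0]}   # both chunks studied at t=0
    recalled = []
    for step in range(1, steps + 1):
        now = float(step)
        act = {c: activation(t, now, inhibit) for c, t in history.items()}
        best = max(act, key=act.get)
        recalled.append(best)
        history[best].append(now)        # retrieval reinforces the chunk
    return recalled

# Default base-level learning loops on the first winner, since each
# retrieval boosts its activation further; inhibition of return
# suppresses the just-retrieved chunk, yielding the soft round-robin.
looping = free_recall(inhibit=0.0)   # ["a", "a", "a", "a"]
robin = free_recall(inhibit=3.0)     # ["a", "b", "a", "b"]
```

With stronger or weaker inhibition the alternation softens rather than being a hard, fixed rotation, which is the distinction the next slide draws against the tag/finst version.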

  10. Emergent Robustness
  [Graph: Frequencies of Free Recall as a Function of Item Rank]
  • Running the retrieval mechanism unsupervised leads to the gradual emergence of an internal power-law distribution
  • It differs both from the pathological behavior of the default BLL and from the hard, fixed round-robin of the tag/finst version

  11. Module 2: Procedural
  • Procedural module:
    • Production rule sets need careful crafting to cover all cases
    • Degenerate behavior in real environments (stuck, looping, etc.)
    • Especially difficult in continuous domains (ad hoc thresholds, etc.)
  • Generalization of production applicability:
    • Often need to use the declarative module to leverage semantic generalization through the partial matching mechanism
  • Unification between symbolic (matching) and subsymbolic (selection) processes is desirable for robustness, adaptivity and generalization

  12. Production Partial Matching (PPM)
  • Same principle as partial matching in declarative memory
    • Unification is good and logical given the representation (neural models)
  • Matching utility:
    • Dynamic generalization: the production condition defines an ideal “prototype” situation, not a range of application conditions
    • Adaptivity: generalization expands with success as utility rises, and contracts with failure as the production over-generalizes
  • Safe version: explicit ~ test modifier, similar to -, <, >, etc.
  • Learning new productions can collapse across a range and learn differential sensitivity to individual buffer slot values
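
The matching-utility idea can be sketched as follows: a production's selection score is its learned utility minus a penalty scaled by how far the current buffer values sit from the ideal values named in its condition. The function names, the linear penalty, and the similarity scale are illustrative assumptions, not the ACT-R implementation:

```python
def match_utility(utility, condition, situation, similarity, penalty=2.0):
    """Selection score for a production whose condition only partially
    matches: learned utility minus a penalty scaled by total mismatch
    across condition slots."""
    mismatch = sum(1.0 - similarity(condition[slot], situation[slot])
                   for slot in condition)
    return utility - penalty * mismatch

def sim(a, b):
    """Similarity of two magnitudes on a 0-10 scale: closer is more similar."""
    return 1.0 - abs(a - b) / 10.0

# The condition names the ideal "prototype" situation, not a range.
ideal = {"stick": 7}
u_exact = match_utility(3.0, ideal, {"stick": 7}, sim)   # no mismatch
u_near = match_utility(3.0, ideal, {"stick": 5}, sim)    # penalized, but may still fire
```

As the production's utility grows with success, nearby situations like `{"stick": 5}` still clear the selection competition (generalization expands); as utility shrinks with failure, only close matches fire (generalization contracts), which is the adaptivity bullet above.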

  13. Building Sticks Task
  • Standard Production Model (Lovett, 1996):
    • 4 productions: Force-over, Force-under, Decide-over, Decide-under
    • Hardwired range
    • Utility learning
  • Instance-based Model (Lebiere, 1997):
    • Chunks: under, over, target & choice slots
    • Partial matching on closeness of over and under to target
    • Base-level learning w/ degree of match
  • New Partial-Matching Production Model:
    • 2 productions: Over (match over stick against target), Under (match under stick against target)
    • Utility learning mixed with degree of match

  14. Procedural or Instance-based?
  • One of Newell’s decried “oppositions” reappears in the computational modeling context
  • Neuroscience (e.g., fMRI) might provide arbitrating data between modules, but likely not within a module
  • The correct solution is likely a combination of initial declarative retrieval and later procedural selection
  • Need a smooth transition from the declarative to the procedural mechanism, without a modeler-induced discontinuity in the form of an arbitrary control structure

  15. Module 3: Working Memory
  • Current WM: named, fixed buffers, types, slots
  • Pros:
    • Precise reference makes complex information processing not only possible but relatively easy
    • Familiar analogy to traditional programming
  • Cons:
    • Substantial modeling effort required
    • Modeling often time-consuming and error-prone
    • Hard limit on flexibility of representation
  • Fine in laboratory tasks, more problematic in open-ended, dynamic, unpredictable environments

  16. Representation Implications
  • Explicit slot (and also type, buffer) management
  • Add more slots to represent all information needed
    • Pro: slots have clear semantics
    • Con: profligate; dilution of spreading activation
  • Reuse slots for different purposes over time
    • Pro: keeps structures relatively compact
    • Con: uncertain semantics (what is in this slot right now?)
  • Use different (goal) types over time
    • Pro: cleaner semantics, hierarchical control
    • Con: increased management of context transfer
  • More buffers, or reuse buffers as storage
    • Less of that for now, but same general drawbacks as slots and types
    • Integration issues (episodic memory)

  17. Working Memory Module
  • Replace chunk structures in buffers with sets of values associated with fast-decaying short-term activation
    • Faster decay rate than LTM and no reinforcement
  • Generalize pattern matching to an ordered set of values
    • Double match of semantic and positional content
  • Assumptions about context permanence
  • Short-term maintenance w/ quick decay (sequence learning)
  • Explicit rehearsal possible, but with an impact on strength and ordering
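
The proposal above can be sketched as a data structure: an ordered set of values whose traces decay quickly and are never reinforced, matched on both content and recency. The class name, decay form, and parameters are illustrative assumptions, not the module's actual specification:

```python
class WorkingMemory:
    """Sketch: an ordered set of values with fast-decaying short-term
    activation (no long-term reinforcement), replacing named typed slots."""

    def __init__(self, decay=1.5):
        self.decay = decay       # faster than typical LTM decay
        self.items = []          # (value, time added), in presentation order

    def add(self, value, now):
        self.items.append((value, now))

    def strength(self, index, now):
        """Power-law decay of a single trace; the small offset avoids an
        infinite value for a just-added item."""
        _, t = self.items[index]
        return (now - t + 0.05) ** -self.decay

    def match(self, value, now):
        """Double match: semantic (same value) and positional (the decayed
        strength indexes how recently, i.e. where in order, it was added)."""
        return [(i, self.strength(i, now))
                for i, (v, _) in enumerate(self.items) if v == value]

wm = WorkingMemory()
for t, letter in enumerate("ABAC"):
    wm.add(letter, float(t))
hits = wm.match("A", now=4.0)
# Two 'A' traces survive; the later one (t=2) is stronger than the
# earlier one (t=0), so position falls out of activation for free.
```

Because strength is continuous, retrieval under noise naturally produces the graded forgetting that fixed named slots cannot.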

  18. N-Back Task
  • N-back working memory task: is the current stimulus the same as the one n back?
  • Default ACT-R model holds and shifts items in a buffer: perfect recall!
  • Working memory model adds the item to WM, where it then decays and is partially matched
  • Performance decreases with noise and n up to a plateau – good fit to data

      (p back4
         =goal>
            isa nback
            stimulus =stimulus
            match nil
         +intentional>
            =back1 =back2 =back3 =back4
      ==>
         !output! (Stimulus =stimulus retrieving 4-back =back4)
         =goal>
            match =back4)

      (p back4
         =goal>
            isa nback
            stimulus =stimulus
            match nil
         =imaginal>
            isa four-back
            back1 =back1
            back2 =back2
            back3 =back3
            back4 =back4
      ==>
         !output! (Stimulus =stimulus matching 4-back =back4)
         =goal>
            match =back4)
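
The qualitative claim — accuracy falling with n down to a plateau — can be sketched outside ACT-R. In this Python sketch (all parameters and the decay form are illustrative assumptions, not fits to the model or data), a match is detected when the decayed strength of the trace n positions back, perturbed by noise, exceeds a retrieval threshold:

```python
import random

def accuracy(n, decay=0.8, noise=0.3, threshold=0.2, trials=2000):
    """Probability of detecting an n-back match: the n-back trace's
    decayed strength plus Gaussian noise must clear a threshold."""
    rng = random.Random(0)           # fixed seed for reproducibility
    hits = 0
    for _ in range(trials):
        strength = (n + 1) ** -decay + rng.gauss(0, noise)
        hits += strength > threshold
    return hits / trials

# Accuracy falls as n grows; once the trace has essentially decayed
# away, performance is carried by noise alone, producing the plateau.
acc = [accuracy(n) for n in (1, 2, 4, 8)]
```

The default buffer-shifting model has no such decay, which is why it predicts perfect recall at every n.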

  19. Module 4: Episodic Memory
  • Need integration of information in LTM across modalities
  • Main role of episodic memory is to support goal management
  • Store snapshots of working memory
    • The concept of a chunk slot is replaced with activation
    • Similar to connectionist temporal-synchrony binding
  • Straightforward matching of the WM context to episodic chunks
    • Double, symmetrical match of semantic and activation content
  • Issues:
    • Creation signal: similar to the current chunk switch in a buffer
    • Reinforcement upon rehearsal?
    • Relation to traditional LTM? Similar to the role of HC in training PC?

  20. List Memory
  • Pervasive task requires a multi-level indexing representation
  • “Micro-chunks” vs. the traditional representation
  • Captures positional confusions and failures
  • Is it a strategy choice or an architectural feature?
  • How best to provide this function pervasively?

      +retrieval>
         isa item
         parent =group
         position fourth
         :recently-retrieved nil
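
The positional-confusion claim can be sketched as follows: items are indexed by (group, position), and retrieval partially matches on the position index under noise, so neighboring positions occasionally win. The similarity scale, noise level, and function names are illustrative assumptions, not the model's parameters:

```python
import random

# Multi-level indexing: each item is a chunk keyed by (group, position).
POSITIONS = ("first", "second", "third")

def position_similarity(a, b):
    """Nearby positions are more similar, so they compete at retrieval."""
    return 1.0 - abs(POSITIONS.index(a) - POSITIONS.index(b)) / len(POSITIONS)

def retrieve(items, group, position, rng, noise=0.15):
    """items maps (group, position) -> value; noisy partial matching on
    the position slot returns mostly the right item, sometimes a neighbor."""
    scores = {key: position_similarity(position, key[1]) + rng.gauss(0, noise)
              for key in items if key[0] == group}
    return items[max(scores, key=scores.get)]

items = {("g1", "first"): "A", ("g1", "second"): "B", ("g1", "third"): "C"}
rng = random.Random(1)
recalls = [retrieve(items, "g1", "second", rng) for _ in range(400)]
# Mostly "B", with occasional positional confusions to "A" or "C" --
# the graded error pattern a hard index lookup could never produce.
```

Raising the noise (or flattening the similarity gradient) increases the confusion rate, which is the lever a model would tune against positional-error data.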

  21. Related Work
  • Instruction following (Anderson and Taatgen)
    • General model for simple step-following tasks
  • Minimal control principle (Taatgen)
    • Limit modeler-imposed control structure
  • Threading and multitasking (Salvucci and Taatgen)
    • Combine independent models and reduce interference
  • Metacognition (Anderson)
    • Enable a model to discover an original solution to a new problem
  • Call for new thinking on “an increasingly watered down set of principles for the representation of knowledge” (Anderson)

  22. Conclusion
  • Available data is often not enough to discriminate between competing models of single tasks
    • Newell may have been too optimistic about the ability to uniquely infer the control method given the data and the system
  • More data can help, but often leads to more specialized and complex models, and away from integration
  • Focus on functionality, especially Newell’s 2nd (complex tasks) and 3rd (multiple tasks) criteria, for further discrimination
  • Focusing on tasks that require open-ended behavior can enhance the robustness and generality of cognitive architectures without compromising their fidelity
