CLS, Rapid Schema Consistent Learning, and Similarity-weighted Interleaved learning

  1. CLS, Rapid Schema Consistent Learning, and Similarity-weighted Interleaved Learning. Psychology 209, Feb 26, 2019

  2. Your knowledge is in your connections!
  • An experience is a pattern of activation over neurons in one or more brain regions.
  • The trace left in memory is the set of adjustments to the strengths of the connections.
  • Each experience leaves such a trace, but the traces are not separable or distinct.
  • Rather, they are superimposed in the same set of connection weights.
  • Recall involves the recreation of a pattern of activation, using a part or associate of it as a cue.
  • The reinstatement depends on the knowledge in the connection weights, which in general will reflect influences of many different experiences.
  • Thus, memory is always a constructive process, dependent on contributions from many different experiences.
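A minimal sketch of this idea in code, assuming a simple Hebbian autoassociator rather than any specific model from the lecture: traces of several experiences are superimposed in a single weight matrix, and recall reconstructs a stored pattern from a partial cue.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three "experiences": random +/-1 activation patterns over 200 units.
patterns = rng.choice([-1.0, 1.0], size=(3, 200))

# Every experience adjusts the SAME weight matrix (Hebbian outer product),
# so the traces are superimposed rather than stored separately.
W = np.zeros((200, 200))
for p in patterns:
    W += np.outer(p, p) / 200
np.fill_diagonal(W, 0.0)

# Recall: cue with only the first half of pattern 0 and let the net settle.
state = patterns[0].copy()
state[100:] = 0.0
for _ in range(10):
    state = np.sign(W @ state)

print("fraction of units recalled correctly:", np.mean(state == patterns[0]))
```

The reconstruction succeeds even though no single connection "contains" the memory: the pattern is recreated from knowledge spread across all the weights, which also carry the other experiences.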

  3. Effect of a Hippocampal Lesion
  • Intact performance on tests of intelligence, general knowledge, language, and other acquired skills
  • Dramatic deficits in formation of some types of new memories:
    • Explicit memories for episodes and events
    • Paired-associate learning
    • Arbitrary new factual information
  • Spared priming and skill acquisition
  • Temporally graded retrograde amnesia: the lesion impairs recent memories, leaving remote memories intact.
  Note: HM's lesion was bilateral.

  4. Key Points
  • We learn about the general pattern of experiences, not just specific things.
  • Gradual learning in the cortex builds implicit semantic and procedural knowledge that forms much of the basis of our cognitive abilities.
  • The hippocampal system complements the cortex by allowing us to learn specific things without interference with existing structured knowledge.
  • In general, these systems must be thought of as working together rather than as alternative sources of information.
  • Much of behavior and cognition depends on both specific and general knowledge.

  5. Emergence of Meaning in Learned Distributed Representations through Gradual Interleaved Learning
  • Distributed representations (what ML calls embeddings) that capture aspects of meaning emerge through a gradual learning process.
  • The progression of learning and the representations formed capture many aspects of cognitive development:
    • Progressive differentiation
    • Sensitivity to coherent covariation across contexts
    • Reorganization of conceptual knowledge

  6. The Rumelhart Model

  7. The Training Data: All propositions true of items at the bottom level of the tree, e.g.: Robin can {grow, move, fly}
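To make the setup concrete, here is a sketch of how such training data might be encoded for the Rumelhart network; only the robin propositions shown on the slide are from the source, and the exact encoding below is an illustrative assumption.

```python
# Each training example pairs an (item, relation) input with the set of
# attributes true of that item. The full set would include one entry per
# true proposition for every item at the bottom level of the tree.
ITEMS = ["robin", "canary", "sunfish", "salmon", "oak", "pine", "rose", "daisy"]
RELATIONS = ["isa", "is", "can", "has"]

TRAIN = {
    ("robin", "can"): {"grow", "move", "fly"},
    # ... remaining (item, relation) -> attributes entries ...
}

def one_hot(name, vocab):
    """Encode a symbol as a one-hot input vector over its vocabulary."""
    v = [0.0] * len(vocab)
    v[vocab.index(name)] = 1.0
    return v
```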

  8. [Figure: learned representations at three points in Experience: Early, Later, and Later Still]

  9. What happens in this system if we try to learn something new, such as a penguin?

  10. Learning Something New
  • Used a network already trained with eight items and their properties.
  • Added one new input unit, fully connected to the representation layer.
  • Trained the network with the following pairs of items:
    • penguin-isa living thing-animal-bird
    • penguin-can grow-move-swim

  11. Rapid Learning Leads to Catastrophic Interference
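A minimal numpy demonstration of the effect, assuming a toy one-hidden-layer sigmoid network and random stand-in targets rather than the actual Rumelhart architecture and training set; all sizes and names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in: 8 one-hot items -> 12 binary attributes, plus one extra
# input unit reserved for the penguin.
n_old, n_hid, n_attr = 8, 16, 12
targets_old = (rng.random((n_old, n_attr)) > 0.5).astype(float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W1 = rng.normal(0, 0.1, (n_old + 1, n_hid))
W2 = rng.normal(0, 0.1, (n_hid, n_attr))

def train(examples, epochs, lr=0.5):
    """Plain backprop on sum-squared error over the given examples."""
    global W1, W2
    for _ in range(epochs):
        for x, t in examples:
            h = sigmoid(x @ W1)
            y = sigmoid(h @ W2)
            dy = (y - t) * y * (1 - y)
            dh = (dy @ W2.T) * h * (1 - h)
            W2 -= lr * np.outer(h, dy)
            W1 -= lr * np.outer(x, dh)

def sse(examples):
    return sum(np.sum((sigmoid(sigmoid(x @ W1) @ W2) - t) ** 2)
               for x, t in examples)

old = [(np.eye(n_old + 1)[i], targets_old[i]) for i in range(n_old)]
train(old, epochs=500)                     # pre-train on the eight items
print("old-item SSE before penguin:", round(sse(old), 2))

# Focused training on the penguin alone: the shared weights are pulled
# toward the new item, damaging the old knowledge.
penguin = [(np.eye(n_old + 1)[n_old], (rng.random(n_attr) > 0.5).astype(float))]
train(penguin, epochs=200)
print("old-item SSE after penguin: ", round(sse(old), 2))
```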

  12. Avoiding Catastrophic Interference with Interleaved Learning
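Continuing the sketch above (same network, weights, and helper functions): interleaving just mixes the new item into the ordinary training stream, so each penguin update is accompanied by updates on the familiar items.

```python
# Interleaved learning: the penguin is trained together with the old items,
# so every weight change remains constrained by the existing knowledge.
train(old + penguin, epochs=200)
print("old-item SSE after interleaving:", round(sse(old), 2))
```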

  13. Richard Morris: Rapid Consolidation of Schema Consistent Information

  14. Tse et al. (Science, 2007, 2011). During training, 2 wells were uncovered on each trial.

  15. Schemata and Schema Consistent Information
  • What is a 'schema'?
    • An organized knowledge structure into which new information can be incorporated.
  • What is schema consistent information?
    • Information that can be added to a schema without disturbing it.
  • What about a penguin?
    • Partially consistent
    • Partially inconsistent
  • In contrast, consider:
    • a trout
    • a cardinal

  16. New Simulations
  • Initial training with eight items and their properties, as before.
  • Added one new input unit fully connected to the representation layer, also as before.
  • Trained the network on one of the following pairs of items:
    • penguin-isa & penguin-can
    • trout-isa & trout-can
    • cardinal-isa & cardinal-can

  17. New Learning of Consistent and Partially Inconsistent Information [Figure: results panels labeled LEARNING and INTERFERENCE]

  18. Connection Weight Changes after Simulated NPA, OPA and NM Analogs (Tse et al., 2011)

  19. How Does It Work?

  20. How Does It Work?

  21. Remaining Questions
  • Are all aspects of new learning integrated into cortex-like networks at the same rate?
    • No, some aspects are integrated much more slowly than others.
  • Is it possible to avoid replaying everything one already knows when one wants to learn new things with arbitrary structure?
    • Yes, at least in some circumstances that we will explore.
  • Perhaps the answers to these questions will allow us to make more efficient use of both cortical and hippocampal resources for learning.

  22. Toward an Explicit Mathematical Theory of Interleaved Learning
  • Characterizing structure in a dataset to be learned
  • The deep linear network that can learn this structure
  • The dynamics of learning the structure in the dataset
    • Initial learning of a base dataset
    • Subsequent learning of an additional item
  • Using similarity-weighted interleaved learning to increase the efficiency of interleaved learning
  • Initial thoughts on how we might use the hippocampus more efficiently

  23. Hierarchical structure in a synthetic data set [Figure: hierarchical tree over the items Sparrow, Hawk, Salmon, Sunfish, Oak, Maple, Rose, Daisy, and SparrowHawk]

  24. Processing and Learning in a Deep Linear Network (Saxe et al., 2013a, b, …)
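A sketch of such a network, assuming one-hot item inputs and a random stand-in attribute matrix (the real dataset is the hierarchical one on slide 23); dimensions and learning rate are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Deep linear network: one hidden layer and NO nonlinearity, so the
# input-output map is just the product W1 @ W2 of the two weight layers.
n_items, n_hidden, n_attrs = 8, 8, 15
X = np.eye(n_items)                                        # one-hot item inputs
Y = (rng.random((n_items, n_attrs)) > 0.5).astype(float)   # stand-in attributes

W1 = rng.normal(0.0, 0.01, (n_items, n_hidden))            # small random init
W2 = rng.normal(0.0, 0.01, (n_hidden, n_attrs))

lr = 0.1
for step in range(5000):
    E = X @ W1 @ W2 - Y                          # linear forward pass, error
    W2 -= lr * (X @ W1).T @ E / n_items          # gradient of 1/2 mean SSE
    W1 -= lr * X.T @ (E @ W2.T) / n_items

print("final SSE:", round(float(np.sum((X @ W1 @ W2 - Y) ** 2)), 4))
```

Because the map is linear, the learning dynamics can be analyzed exactly in terms of the SVD of the input-output data, which is what the next slides exploit.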

  25. SVD of the dataset
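A self-contained sketch of the decomposition, again with a random stand-in matrix in place of the actual hierarchical dataset:

```python
import numpy as np

rng = np.random.default_rng(2)
Y = (rng.random((8, 15)) > 0.5).astype(float)  # stand-in item-attribute matrix

# Each SVD mode pairs an item direction (a column of U) with an attribute
# direction (a row of Vt) at strength s. In the hierarchical dataset, the
# strongest modes correspond to the broadest splits (e.g., plant vs. animal).
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
print("mode strengths (singular values):", np.round(s, 2))
```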

  26. Dynamics of Learning – one-hot inputs [Figure: SSE and mode strength a(t) over training] Solid lines are simulated values of a(t); dashed lines are based on the equation. Variable discrepancy affects the takeoff point, but not the shape.
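The equation referred to here is presumably the closed-form mode-strength trajectory for deep linear networks from Saxe et al. (2013); a sketch with illustrative values of the initial strength a0 and time constant tau:

```python
import numpy as np

def mode_strength(t, s, a0=1e-3, tau=1.0):
    """Sigmoidal trajectory of one SVD mode in a deep linear network
    (Saxe et al., 2013): a(t) = s*e^(2st/tau) / (e^(2st/tau) - 1 + s/a0).
    It starts near a0, takes off around t ~ (tau/2s)*ln(s/a0), and
    saturates at the singular value s."""
    e = np.exp(2.0 * s * t / tau)
    return s * e / (e - 1.0 + s / a0)

# Stronger modes take off earlier, which is why the broad structure of
# the dataset is learned before its fine distinctions.
t = np.linspace(0.0, 10.0, 6)
for s in (3.0, 2.0, 1.0):
    print(f"s = {s}:", np.round(mode_strength(t, s), 2))
```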

  27. Dynamics of Learning – auto-associator [Figure: SSE and mode strength a(t) over training] Solid lines are simulated values of a(t); dashed lines are based on the equation. Dynamics are a bit more predictable.

  28. Adding a new member of an existing category [Figure: the hierarchy with the new item SparrowHawk added alongside Sparrow, Hawk, Salmon, Sunfish, Oak, Maple, Rose, and Daisy]

  29. SVD of New Complete Dataset

  30. The consequence of standard interleaved learning

  31. SVD Analysis of Network Output for Birds [Figure: panels showing the Adjusted Dimensions and the New Dimension]

  32. Similarity-Weighted Interleaved Learning [Figure: comparison of Full Interleaving, Similarity-weighted Interleaving, and Uniform Interleaving]
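A sketch of what the similarity weighting could look like: when interleaving, old items are replayed with probability proportional to their similarity to the new item, rather than uniformly. The cosine-similarity weighting here is an illustrative guess, not necessarily the scheme used in the simulations.

```python
import numpy as np

def similarity_weighted_batch(old_attrs, new_attrs, n, rng):
    """Sample indices of old items to interleave with the new item,
    weighting replay toward items similar to the new one
    (cosine similarity over attribute vectors)."""
    sims = old_attrs @ new_attrs / (
        np.linalg.norm(old_attrs, axis=1) * np.linalg.norm(new_attrs))
    p = np.clip(sims, 1e-6, None)        # keep all weights positive
    p = p / p.sum()                      # normalize to a distribution
    return rng.choice(len(old_attrs), size=n, p=p)
```

Under this scheme the other birds are replayed far more often than the plants when the new item is a SparrowHawk, so far less total replay is needed than with uniform interleaving.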

  33. Freezing the Output Weights Initially [Figure: comparison of Full Interleaving, Similarity-weighted Interleaving, and Uniform Interleaving]

  34. Discussion
  • Integration of fine-grained structure into a deep network may always be a slow process.
  • Sometimes this fine-grained structure is ultimately fairly arbitrary and idiosyncratic, although other times it may be part of a deeper pattern the learner has not previously seen.
  • One way to address such integration (see the sketch below):
    • Initial reliance on a sparse / item-specific representation
    • This could be made more efficient by storing only the 'correction vector' in the hippocampus
    • Gradual integration through interleaved learning
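One way to read the 'correction vector' idea (an interpretation sketched here, not code from the lecture): the hippocampus need only store the difference between the correct output for the new item and what the cortical network already predicts, since everything schema-consistent is already in the cortical weights.

```python
import numpy as np

def store_correction(cortical_net, new_input, new_target):
    """Fast (hippocampus-like) store keeps only what the cortical network
    gets wrong about the new item: target minus prediction."""
    return new_target - cortical_net(new_input)

def recall(cortical_net, new_input, correction):
    """Recall adds the stored correction back onto the cortical output.
    As interleaved learning absorbs the new item into the cortical
    weights, the needed correction shrinks toward zero."""
    return cortical_net(new_input) + correction
```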

  35. SparrowHawk Error Vector After Easy Integration Phase is Complete

  36. Questions, Answers, and Next Steps
  • Are all aspects of new learning integrated into cortex-like networks at the same rate?
    • No, some aspects are integrated much more slowly than others.
  • Is it possible to avoid replaying everything one already knows when one wants to learn new things with arbitrary structure?
    • Yes, at least in some circumstances that we have explored.
  • Perhaps the answers to these questions will allow us to make more efficient use of both cortical and hippocampal resources for learning.
