Issue #2: Shift Invariance

Presentation Transcript


Issue #2: Shift Invariance

  • Backprop cannot handle shift invariance (it cannot generalize from 0011, 0110 to 1100)

  • But the cup is on the table whether you see it right in the center or out of the corner of your eye (i.e., in different areas of the retinal map)

  • What structure can we utilize to make the input shift-invariant?
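One answer is weight sharing, the structure convolutional networks use. The sketch below is a minimal illustration, not from the slides: a single shared detector w (a hypothetical "11" feature detector) slid across the input plus max-pooling responds identically to all shifts of the pattern.

```python
# A minimal sketch (assumed illustration): weight sharing plus pooling
# makes the response independent of where the feature appears.
import numpy as np

def shared_weight_detector(x, w):
    """Slide one shared weight vector w across input x and max-pool."""
    k = len(w)
    responses = [float(np.dot(x[i:i + k], w)) for i in range(len(x) - k + 1)]
    return max(responses)  # pooling discards position, keeping only presence

w = np.array([1.0, 1.0])  # hypothetical shared detector for the "11" feature
for pattern in ([0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]):
    print(pattern, shared_weight_detector(np.array(pattern, float), w))
# All three shifted patterns yield the same response (2.0); a fully
# connected net trained with backprop sees them as unrelated input vectors.
```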


Topological Relations

  • Separation

  • Contact

  • Coincidence:

    • Overlap

    • Inclusion

  • Encircle/surround


Limitations

  • Scale

  • Uniqueness/Plausibility

  • Grammar

  • Abstract Concepts

  • Inference

  • Representation


How does activity lead to structural change?

  • The brain (pre-natal, post-natal, and adult) exhibits a surprising degree of activity dependent tuning and plasticity.

  • To understand the nature and limits of these tuning and plasticity mechanisms, we study

    • how activity is converted to structural change (e.g., ocular dominance column formation)

  • It is centrally important for us to understand these mechanisms to arrive at biological accounts of perceptual, motor, cognitive and language learning

    • Biological Learning is concerned with this topic.


Learning and Memory: Introduction

  • Memory
    • Declarative
      • Episodic: memory of a situation
      • Semantic: general facts
    • Non-Declarative
      • Procedural: skills


Learning and Memory: Introduction

There are two different types of learning

  • Skill Learning

  • Fact and Situation Learning

    • General Fact Learning

    • Episodic Learning

  • There is good evidence that the processes underlying skill (procedural) learning are partially different from those underlying fact/situation (declarative) learning.


    Skill and Fact Learning involve different mechanisms

    • Certain brain injuries involving the hippocampal region render their victims incapable of learning new facts, situations, or faces.

      • But these people can still learn new skills, including relatively abstract skills like solving puzzles.

    • Fact learning can be single-instance based. Skill learning requires repeated exposure to stimuli.

    • Implications for Language Learning?


    Short term memory

    • How do we remember someone’s telephone number just after they tell us or the words in this sentence?

    • Short term memory is known to have a different biological basis than long term memory of either facts or skills.

      • We now know that this kind of short term memory depends upon ongoing electrical activity in the brain.

      • You can keep something in mind by rehearsing it, but this will interfere with your thinking about anything else. (Phonological Loop)


    Long term memory

    • But we do recall memories from decades past.

      • These long term memories are known to be based on structural changes in the synaptic connections between neurons.

      • Such permanent changes require the construction of new protein molecules and their establishment in the membranes of the synapses connecting neurons, and this can take several hours.

    • Thus there is a huge time gap between short term memory that lasts only for a few seconds and the building of long-term memory that takes hours to accomplish.

    • In addition to bridging the time gap, the brain needs mechanisms for converting the content of a memory from electrical to structural form.


    Situational Memory

    • Think about an old situation that you still remember well. Your memory will include multiple modalities: vision, emotion, sound, smell, etc.

    • The standard theory is that memories in each particular modality activate much of the brain circuitry from the original experience.

    • There is general agreement that the Hippocampal area contains circuitry that can bind together the various aspects of an important experience into a coherent memory.

    • This process is believed to involve calcium-based long-term potentiation (LTP).


    Models of Learning

    • Hebbian ~ coincidence

    • Recruitment ~ one trial

    • Supervised ~ correction (backprop)

    • Reinforcement ~ delayed reward

    • Unsupervised ~ similarity


    Hebb’s Rule

    • The key idea underlying theories of neural learning goes back to the Canadian psychologist Donald Hebb and is called Hebb's rule.

    • From an information processing perspective, the goal of the system is to increase the strength of the neural connections that are effective.
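As a hedged sketch (not from the slides), the simplest computational form of this idea strengthens each connection in proportion to the coincidence of pre- and postsynaptic activity; the learning rate eta is an illustrative parameter:

```python
# Minimal Hebbian update sketch: neurons that fire together wire together.
import numpy as np

def hebbian_update(w, x, y, eta=0.1):
    """Strengthen each weight in proportion to the coincidence of
    presynaptic activity x and postsynaptic activity y."""
    return w + eta * y * x

w = np.zeros(4)                      # initial synaptic strengths
x = np.array([1.0, 0.0, 1.0, 0.0])  # presynaptic firing pattern
y = 1.0                              # postsynaptic cell fires on this trial
print(hebbian_update(w, x, y))       # only co-active synapses strengthen
```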



    LTP and Hebb’s Rule

    • Hebb’s Rule: neurons that fire together wire together

    • Long Term Potentiation (LTP) is the biological basis of Hebb’s Rule

    • Calcium channels are the key mechanism


    Chemical realization of Hebb’s rule

    • It turns out that there are elegant chemical processes that realize Hebbian learning at two distinct time scales

      • Early Long Term Potentiation (LTP)

      • Late LTP

    • These provide the temporal and structural bridge from short term electrical activity, through intermediate memory, to long term structural changes.


    Long Term Potentiation (LTP)

    • These changes make each of the winning synapses more potent for an intermediate period, lasting from hours to days (LTP).

    • In addition, repetition of a pattern of successful firing triggers additional chemical changes that lead, in time, to an increase in the number of receptor channels associated with successful synapses - the requisite structural change for long term memory.

      • There are also related processes for weakening synapses and also for strengthening pairs of synapses that are active at about the same time.


    The Hebb rule is found with long term potentiation (LTP) in the hippocampus

    [Figure: LTP induction in hippocampal pyramidal cells via the Schaffer collateral pathway, using 1 sec. stimulus trains at 100 Hz]


    Computational Models based on Hebb's rule

    The activity-dependent tuning of the developing nervous system, as well as post-natal learning and development, is well modeled by Hebb's rule.

    Explicit memory in mammals appears to involve LTP in the hippocampus.

    Many computational modeling systems incorporate versions of Hebb's rule.

    • Winner-Take-All (see the sketch after this list):

      • Units compete to learn, or update their weights.

      • The processing element with the largest output is declared the winner

      • Lateral inhibition of its competitors.

  • Recruitment Learning

    • Learning Triangle Nodes

  • LTP in Episodic Memory Formation
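A hedged sketch of the winner-take-all idea above (function names and sizes are illustrative, not from the slides): units compete, the one with the largest output wins while lateral inhibition silences the rest, and only the winner's active synapses strengthen, Hebb-style:

```python
import numpy as np

def winner_take_all_step(W, x, eta=0.1):
    """Competitive Hebbian learning: the unit with the largest output
    is declared the winner and only its weights are updated."""
    outputs = W @ x
    winner = int(np.argmax(outputs))            # competition
    W[winner] += eta * x                        # Hebb: strengthen active synapses
    W[winner] /= np.linalg.norm(W[winner])      # keep the weights bounded
    return winner, W

rng = np.random.default_rng(0)
W = rng.random((3, 4))                          # 3 competing units, 4 inputs
x = np.array([1.0, 0.0, 1.0, 0.0])
winner, W = winner_take_all_step(W, x)
print("winner:", winner)
```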


    Hebb’s rule is not sufficient

    • What happens if the neural circuit fires perfectly, but the result is very bad for the animal, like eating something sickening?

      • A pure invocation of Hebb’s rule would strengthen all participating connections, which can’t be good.

      • On the other hand, it isn’t right to weaken all the active connections involved; much of the activity was just recognizing the situation – we would like to change only those connections that led to the wrong decision.

    • No one knows how to specify a learning rule that will change exactly the offending connections when an error occurs.

      • Computer systems, and presumably nature as well, rely upon statistical learning rules that tend to make the right changes over time. More in later lectures.


    Models of Learning

    • Hebbian ~ coincidence

    • Recruitment ~ one trial

    • Supervised ~ correction (backprop)

    • Reinforcement ~ delayed reward

    • Unsupervised ~ similarity


    LTP and one-shot memory

    Twin requirements of LTP induction:

    • presynaptic activity + postsynaptic depolarization:

      • LTP requires synchronous activity at multiple synapses of a postsynaptic cell (cooperativity)

      • ideal for transforming a transient synchronous-activity-based expression of a relation between multiple items into a persistent synaptic-efficacy-based encoding of the relation (Shastri, 2001)


    Recruiting connections

    • Given that LTP involves synaptic strength changes and Hebb’s rule involves coincident-activation based strengthening of connections

      • How can connections between two nodes be recruited using Hebb's rule?


    The Idea of Recruitment Learning

    [Figure: a layered random network linking node X to node Y through K intermediate layers of N units each, with fan-out B per unit and branching factor F = B/N]

    The point is, with a fan-out of 1000, if we allow 2 intermediate layers, we can almost always find a path.

    • Suppose we want to link up node X to node Y

    • The idea is to pick the two nodes in the middle to link them up

    • Can we be sure that we can find a path to get from X to Y?


    Finding a Connection

    P = (1 − F)^(B^K)

    P = probability of NO link between X and Y

    N = number of units in a "layer"

    B = number of randomly outgoing units per unit

    F = B/N, the branching factor

    K = number of intermediate layers, 2 in the example

    [Table: values of P for N = 10^6, 10^7, 10^8 and varying K; entries not recoverable]

    # Paths = (1 − P_(K−1)) · N · F = (1 − P_(K−1)) · B, since N · F = B
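A hedged Monte Carlo check of this calculation (parameters scaled down from the slide's fan-out-of-1000 example so it runs quickly, and ignoring collisions between paths):

```python
import random

def no_link_probability(N=5000, B=8, K=2, trials=500):
    """Estimate P, the probability that no path from X to Y exists in a
    layered random net where each unit sends B random links (F = B/N)."""
    failures = 0
    for _ in range(trials):
        reached = {0}                        # X, a single source unit
        for _ in range(K):                   # spread through K intermediate layers
            nxt = set()
            for _ in reached:
                nxt.update(random.sample(range(N), B))
            reached = nxt
        # Y is linked if any reached unit sends one of its B links to Y
        if not any(random.random() < B / N for _ in reached):
            failures += 1
    return failures / trials

print("simulated P:", no_link_probability())
print("formula   P:", (1 - 8 / 5000) ** (8 ** 2))   # approx. 0.90
```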


    [Figures: paths from X to Y recruited through the intermediate layers]


    Finding a Connection in Random Networks

    For networks with N nodes and branching factor F, there is a high probability of finding good links. (Valiant, 1995)


    Recruiting a Connection in Random Networks

    • Informal Algorithm:

      • Activate the two nodes to be linked

      • Have nodes with double activation strengthen their active synapses (Hebb)

    • There is evidence for a "now print" signal based on LTP (episodic memory)
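A hedged sketch of this informal algorithm, using one intermediate layer for brevity (function and parameter names are illustrative): activate both ends, find the doubly activated units in between, and strengthen their active synapses:

```python
import random

def recruit(x, y, adjacency, weights, boost=1.0):
    """Recruit intermediate units linking x to y: units receiving
    activation from both sides strengthen their active synapses."""
    from_x = set(adjacency[x])                             # units x activates
    into_y = {u for u in adjacency if y in adjacency[u]}   # units that reach y
    for u in from_x & into_y:                              # double activation
        weights[(x, u)] = weights.get((x, u), 0.1) + boost
        weights[(u, y)] = weights.get((u, y), 0.1) + boost
    return from_x & into_y

# Toy random net: 100 units, each with 10 random outgoing links
random.seed(1)
adjacency = {u: random.sample(range(100), 10) for u in range(100)}
weights = {}
print("recruited:", recruit(0, 99, adjacency, weights))
```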


    Triangle nodes and feature structures

    [Figures: triangle nodes binding three units A, B, C]

    Representing concepts using triangle nodes


    Recruiting triangle nodes

    • Let’s say we are trying to remember a green circle

    • currently weak connections between concepts (dotted lines)

    [Figure: dotted (weak) links from has-color to blue and green, and from has-shape to round and oval]


    Strengthen these connections

    • and you end up with this picture:

    [Figure: a recruited "Green circle" node linked via has-color to green (not blue) and via has-shape to round (not oval)]


    [Figures: the resulting binding, has-color → GREEN and has-shape → ROUND]
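A hedged sketch of triangle-node binding for the green-circle example above (class and item names are illustrative, not from the slides): a triangle node binds three items, and activating any two of them recalls the third:

```python
# Each triangle node binds three items; when two of its three
# connections are active, it activates the third (pattern completion).
class TriangleNode:
    def __init__(self, a, b, c):
        self.items = {a, b, c}

    def complete(self, active):
        """If two of the three bound items are active, recall the third."""
        hit = self.items & active
        return self.items - hit if len(hit) == 2 else set()

# Recruit a node binding the concept to its features (cf. the figure)
green_circle = TriangleNode("green-circle", ("has-color", "green"),
                            ("has-shape", "round"))

# Seeing the concept plus one feature recalls the other feature
print(green_circle.complete({"green-circle", ("has-color", "green")}))
# -> {('has-shape', 'round')}
```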


    Models of Learning

    • Hebbian ~ coincidence

    • Recruitment ~ one trial

    • Supervised ~ correction (backprop)

    • Reinforcement ~ delayed reward, soon

    • Unsupervised ~ similarity


    5 Levels of Neural Theory of Language

    [Diagram: the five levels and the methods at each, connected by abstraction, with the course assessments (quiz, midterm, finals) alongside]

    • Cognition and Language: psycholinguistic experiments (spatial relations, motor control, metaphor, grammar)

    • Computation: Structured Connectionism (SHRUTI, triangle nodes)

    • Neural nets and learning

    • Computational Neurobiology

    • Biology: neural development


    Tinbergen’s Four Questions

    • How does it work?

    • How does it improve fitness?

    • How does it develop and adapt?

    • How did it evolve?


    [Figure: a postsynaptic neuron; image: science-education.nih.gov]


    Brains ~ Computers

    Brains                           Computers
    1,000 operations/sec             1,000,000,000 ops/sec
    100,000,000,000 units            1-100 processors
    10,000 connections/unit          ~4 connections
    graded, stochastic               binary, deterministic
    embodied                         abstract
    fault tolerant                   crashes
    evolves, learns                  designed, programmed


    [Figure: artist's rendition of a typical cell membrane]


    [Figure: flexor and crossed-extensor reflex (Sherrington, 1900): reflex circuits with interneurons, responding to a painful stimulus]


    Gaits of the cat: an informal computational model


    Neural Tissue

    • The skin and neural tissue arise from a single layer, known as the ectoderm

      • in response to signals provided by an adjacent layer, known as the mesoderm.

      • A number of molecules interact to determine whether the ectoderm becomes neural tissue or develops in another way to become skin.


    Critical Periods in Development

    • There are critical periods in development (pre- and post-natal) where stimulation is essential for fine-tuning of brain connections.

    • Other examples of columns

      • Orientation columns


    Pre-Natal Tuning: Internally generated tuning signals

    • But in the womb, what provides the feedback to establish which neural circuits are the right ones to strengthen?

      • Not a problem for motor circuits - the feedback and control networks for basic physical actions can be refined as the infant moves its limbs and indeed, this is what happens.

      • But there is no vision in the womb. Recent research shows that systematic moving patterns of activity are spontaneously generated pre-natally in the retina. A predictable pattern, changing over time, provides excellent training data for tuning the connections between visual maps.

    • The pre-natal development of the auditory system is also interesting and is directly relevant to our story.

      • Research indicates that infants, immediately after birth, preferentially recognize the sounds of their native language over others. The assumption is that similar activity-dependent tuning mechanisms work with speech signals perceived in the womb.


    Post-natal environmental tuning

    • The pre-natal tuning of neural connections using simulated activity can work quite well –

      • a newborn colt or calf is essentially functional at birth.

      • This is necessary because the herd is always on the move.

      • Many animals, including people, do much of their development after birth and activity-dependent mechanisms can exploit experience in the real world.

    • In fact, such experience is absolutely necessary for normal development.

    • As we saw, early experiments with kittens showed that there are fairly short critical periods during which animals deprived of visual input could lose forever their ability to see motion, vertical lines, etc.

      • For a similar reason, if a human child has one weak eye, the doctor will sometimes place a patch over the stronger one, forcing the weaker eye to gain experience.


    Learning Rule – Gradient Descent on Root Mean Square (RMS) Error

    • Learn the weights w_i that minimize the squared error over the training data D:

      E(w) = ½ Σ_(d∈D) Σ_(k∈O) (t_kd − o_kd)²,  where O = output layer


    Backpropagation Algorithm

    • Initialize all weights to small random numbers

    • For each training example do

      • For each hidden unit h: compute its output o_h

      • For each output unit k: compute its output o_k

      • For each output unit k: δ_k = o_k (1 − o_k)(t_k − o_k)

      • For each hidden unit h: δ_h = o_h (1 − o_h) Σ_k w_kh δ_k

      • Update each network weight w_ij: w_ij ← w_ij + Δw_ij

    with Δw_ij = η δ_j x_ij
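A minimal runnable sketch of this algorithm for one sigmoid hidden layer (a hedged illustration; the layer sizes, eta, and training target are arbitrary, and biases are omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, t, W_h, W_o, eta=0.5):
    """One stochastic-gradient step of the algorithm above."""
    o_h = sigmoid(W_h @ x)                          # hidden unit outputs
    o_k = sigmoid(W_o @ o_h)                        # output unit outputs
    delta_k = o_k * (1 - o_k) * (t - o_k)           # output unit errors
    delta_h = o_h * (1 - o_h) * (W_o.T @ delta_k)   # backpropagated errors
    W_o += eta * np.outer(delta_k, o_h)             # delta_j times its input
    W_h += eta * np.outer(delta_h, x)
    return float(np.sum((t - o_k) ** 2))

rng = np.random.default_rng(0)
W_h = rng.normal(0, 0.1, (3, 2))                    # 2 inputs -> 3 hidden units
W_o = rng.normal(0, 0.1, (1, 3))                    # 3 hidden -> 1 output
x, t = np.array([1.0, 0.0]), np.array([1.0])
for _ in range(1000):
    err = backprop_step(x, t, W_h, W_o)
print("squared error after training:", err)        # decreases toward 0
```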


    Distributed vs Localist Rep'n

    • What are the drawbacks of each representation?


    Distributed vs Localist Rep'n

    • What happens if you want to represent a group?

    • What happens if one neuron dies?

    • How many persons can you represent with n bits? Distributed: 2^n; localist: n.
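A hedged illustration of the capacity difference just noted (not from the slides): with n units, a localist code dedicates one unit per person, while a distributed code uses whole activation patterns:

```python
from itertools import product

n = 4  # number of units (bits)

# Localist: one dedicated unit per person -> n representable persons
localist = [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

# Distributed: any binary pattern over the n units -> 2^n representable persons
distributed = list(product([0, 1], repeat=n))

print(len(localist), "localist codes;", len(distributed), "distributed codes")
# 4 localist codes; 16 distributed codes. The trade-off: if one unit dies,
# a localist code loses one person, a distributed code degrades every pattern.
```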


    Word Superiority Effect


    Modeling lexical access errors

    • Semantic error

    • Formal error (i.e. errors related by form)

    • Mixed error (semantic + formal)

    • Phonological access error


    Phonological access error: Selection of incorrect phonemes

    [Figure: spreading-activation lexical network: word nodes (FOG, DOG, CAT, RAT, MAT) connected through a syllable frame (Onset, Vowel, Coda) to phoneme nodes: onsets f, r, d, k, m; vowels ae, o; codas t, g]

    Adapted from Gary Dell, "Producing words from pictures or from other words"
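A hedged toy version of how such a network can produce this kind of error (the weights, noise level, and feedback rule are invented for illustration, not Dell's actual model): phonemes shared with the target feed activation back to formal neighbors, so naming DOG occasionally produces FOG:

```python
# Toy spreading-activation sketch: shared phonemes (o, g) give FOG extra
# activation when the target is DOG, so noise occasionally lets FOG win.
import random

words = {"DOG": ["d", "o", "g"], "FOG": ["f", "o", "g"],
         "CAT": ["k", "ae", "t"], "RAT": ["r", "ae", "t"]}

def name_picture(target, noise=1.0):
    act = {}
    for word, phonemes in words.items():
        shared = len(set(phonemes) & set(words[target]))  # phoneme feedback
        act[word] = (2.0 if word == target else 0.0) + 0.5 * shared \
                    + random.gauss(0, noise)
    return max(act, key=act.get)

random.seed(0)
outputs = [name_picture("DOG") for _ in range(1000)]
print({w: outputs.count(w) for w in words})  # mostly DOG, occasionally FOG
```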


    MRI and fMRI

    • MRI: Images of brain structure.

    • fMRI: Images of brain function.

    • Tissues differ in magnetic susceptibility (grey matter, white matter, cerebrospinal fluid)


    Mirror Neurons: Area F5


    Cortical Mechanism for Action Recognition

    The observed action flows through:

    • STS: provides an early description of the action

    • Parietal mirror neurons (PF, inferior parietal lobule): add additional somatosensory information to the movement to be imitated

    • Frontal mirror neurons (F5, BA 44): code the goal of the action to be imitated and send copies of the motor plans necessary to imitate actions, for monitoring purposes


    Somatotopy of Action Observation

    [Figure: brain activation for observed foot, hand, and mouth actions]

    Buccino et al., Eur J Neurosci, 2001


    The WCS Color Chips

    • Basic color terms:

      • Single word (not blue-green)

      • Frequently used (not mauve)

      • Refers primarily to colors (not lime)

      • Applies to any object (not blonde)

    FYI: English has 11 basic color terms.


    Concepts are not categorical

