
What is special about representation of space in perception and thought?

Spatial representation in the mind/brain: Do we need a global topographical map? Zenon Pylyshyn Rutgers Center for Cognitive Science and Institute Jean Nicod. What is special about representation of space in perception and thought? Do we need a single global spatial representation?


Presentation Transcript


  1. Spatial representation in the mind/brain: Do we need a global topographical map? Zenon Pylyshyn, Rutgers Center for Cognitive Science and Institute Jean Nicod. What is special about representation of space in perception and thought? Do we need a single global spatial representation? Do we need a topographical display in the brain? Workshop on Frames of Reference, Paris, November 17-19, 2005

  2. Outline of talk • Representing space in LTM vs in Working Memory (WM) • Some conditions on representing space in WM • Why a unitary global spatial ‘display’ is often assumed as the form of representation and a few reasons why that’s wrong • An alternative way of satisfying the conditions on spatial ‘representation’: The Projection Hypothesis • Aside on Spatial Index (FINST) Theory • How the projection hypothesis explains the spatial properties of certain representations: Examples from the visual modality • How to generalize this story to proprioception: The spatial sense • Where is the global allocentric display we thought we needed?

  3. What is special about spatial representation? • I have suggested (Pylyshyn, 1973) that there is no reason why a form of representation adequate for general knowledge (i.e., a Language of Thought or LOT) cannot also serve for encoding the content of spatial representations in memory • The difference between representing spatial relations and representing other contents may lie in their being different topics requiring a different conceptual vocabulary, rather than in their having a different form or medium • This general-LOT format fails to account for certain phenomena that are observed when vision and spatial reasoning are actively engaged in solving problems or in determining actions – i.e., when spatial representations are functioning in working memory.

  4. Spatial representation during perception and reasoning • I have outlined a number of ways that the representation of space in WM is different in form from that of other contents of WM. In this talk I will focus on one of these ways, namely the way that such representations deal with space • Because such representations are not tied to vision or conscious visual experience, they are best referred to as spatial representations rather than mental images • By the end I will conclude that even calling them spatial representations is somewhat misleading – but that comes later!

  5. What are some constraints on a theory of spatial representation? • I begin by trying to set out some functional requirements (or boundary conditions) that may apply to a system for representing space and spatial relations in working memory in perception and especially in spatial reasoning • I will later argue that the wrong conclusions have been drawn from these requirements about the form of such spatial representations

  6. Some conditions on a system of codes for representing spatial relations (1) • The system must be able to represent magnitudes • Psychophysical evidence shows that we encode magnitudes (at least relative magnitudes) and that these magnitudes (i.e., the semantics of the codes) have systematic effects in behavior (e.g., the phenomena of scalar variance ratio, Fechner’s law, the symbolic distance effect, etc.). • Thus something about the form of the representation itself must explain these systematic magnitude effects (e.g., phenomena such as those listed above would not arise if the magnitudes were encoded symbolically as numerals)

  7. Some conditions on a system of codes for representing spatial relations (2) • The system must represent stable spatial configurations • Spatial configurations involve relations over multiple objects – in that sense they are holistic and require simultaneous access to multiple objects (i.e., multiple arguments in relational predicates must be simultaneously bound) • What is special about such configurations is that they may allow some spatial ‘inferences’ by pattern lookup without reference to independent geometrical axioms (such as the axiom of transitivity) • Example of 3-term series problems and ‘spatial paralogic’

  8. Some conditions on a system of codes for representing spatial relations (3) • The system must somehow ‘capture’ the continuity and connectedness of space. This requirement leaves many unanswered questions: • Does continuity entail that empty places are represented as such? • Does continuity entail that the representational system itself determines that distances meet metrical axioms (e.g., the triangle inequality AB + BC ≥ AC) or that they are Euclidean? • Does continuity entail that the representation of movements of objects is constrained so that in getting from A to B objects must pass through ‘intermediate’ locations? • The proposal I will present later gives a partial answer to these

  9. Some conditions on a system of codes for representing spatial relations (4) • The system must represent spatial properties across modalities, including proprioceptive and efferent ‘modalities’ • Spatial representations must be able to engage the motor system in a fairly direct manner • One of the characteristics of what we call a ‘spatial representation’ is that we can ‘point to’ represented things (e.g., in our mental image). That’s why a proposition such as LEFT-OF(A,B) seems an inadequate representation for <A,B> • But note that motor actions towards perceptual and imagined representations are not identical because they engage different perceptual-motor pathways (Goodale et al. 1994)

  10. Some conditions on a system of codes for representing spatial relations (5) • The system must be able to represent spatial relations in 3D • When relations in depth are encoded, they must be in a format similar to the encoding of relations in the plane, since the two have to operate together (e.g., in determining the Euclidean distance between points in 3D space) • Experimental evidence from such phenomena as ‘mental rotation’ or ‘mental scanning’ shows identical functions in depth as in the plane

  11. Summary of constraints to be met: A system of spatial representations must somehow do the following: • It must represent magnitudes • It must represent holistic configurations which enable at least some direct one-step inferences (by pattern-matching) • It must capture connectedness and continuity • It must represent spatial relations seamlessly across modalities and engage the motor system • It must represent distances in depth as well as in the plane in a uniform manner (i.e., it must represent 3D) • I will return to these constraints when I discuss a different proposal for how we ‘represent’ space

  12. An additional major assumption about spatial representation The foregoing list of constraints has frequently led people to make one additional assumption about spatial representation that I will argue is not justified: • The single frame of reference assumption is the assumption that we represent spatial layouts in perception or in thought in a single global frame of reference, as opposed to a patchwork of distinct but coordinated frames • Every theory I know that attempts to explain mental imagery or cross-modal coordination makes this assumption, explicitly or implicitly

  13. Why a single ‘display’ for vision? In vision the global spatial-display theory explains why our visual experience is panoramic and stable even though the visual inputs are highly local, partial and constantly changing. But many studies have shown that there is no such rich stable panoramic display (e.g., change blindness, superposition, etc.; see O’Regan, 1992).

  14. Why a single ‘display’ for spatial reasoning? The global spatial-display theory also explains how a mental representation can meet the spatial conditions listed earlier – it does so by creating a 2D image in a real spatial medium. Such a display was assumed to use the same global spatial medium that is used in vision. But both display assumptions have serious problems.

  15. The global spatial display assumption • There are many deep problems with the assumption that spatial properties are represented in vision and reasoning by an inner spatial display which corresponds to our experience of a stable world (perceived or imagined), many of which I have discussed in connection with the ‘picture theory’ of mental imagery (BBS, 2002) • V1 can’t serve as the medium for an image representation for many reasons given in my BBS paper and book – e.g., not stable, not broad enough, not 3D, images not presented in the right form (no Emmert’s law, no amodal completion, image size not in the right form, no image rotation…) • One of the main problems relevant to the present discussion is the assumption that visual spatial perception, cross-modal spatial integration, visuomotor control, and spatial reasoning derive from a single representation in an allocentric frame of reference • There are many reasons to doubt that there is a single global allocentric representation (‘master map’) for spatial information…

  16. Many reasons to reject the Master Map assumption • There are many known frames of reference between perception and motor control, relying on both external and internal sensors • While gaze-centered coordinates are common in motor control they are gain-modulated by inputs from eye, head and body positions as well as by motor intentions (Andersen & Buneo, 2002; Duhamel et al., 1992) • Visual information is also represented in hand- and body-centered (also personal & peripersonal) frames of reference (Làdavas, 2002) • Spatial neglect appears in many different frames of reference • Motor control necessarily involves many different frames of reference, including proprioceptive, kinesthetic, joint-angle, and even dynamic frames of reference based on muscle spindle and joint tendon receptors • Earlier (downstream) frames of reference are often not overwritten but may continue to have observable consequences on errors in kinesthetically-guided movements (Baud-Bovy & Viviani, 1998), so multiple frames can coexist in the nervous system

  17. A different way of approaching the question of spatial representation • Because of the many problems with the global spatial display assumption, I have proposed a provisional hypothesis that preserves some of the advantages of the global spatial display, but assumes that the relevant spatial properties are in the perceived world and can be accessed if we have the right access mechanisms for selecting and indexing objects in the perceived world • For ease of reference let’s call this the Projection Hypothesis because it is somewhat analogous to ‘projecting’ the spatial display onto the real space that we perceive – even though only objects’ identities (labels) and locations, and none of their other visual properties, are ‘projected’

  18. The projection hypothesis The projection hypothesis claims that the perceptual systems rely on the spatial properties of the concurrently perceived world to meet the 5 conditions outlined earlier. The hypothesis rests on three theoretical postulates: • We have a system of “pointers” (such as the FINST mechanism) by which a small number of perceived objects in the world can be selected and indexed. FINSTs are reference pointers to these target objects and remain attached to them despite changes in their locations • When we perceive a scene that contains indexed objects, our perceptual system is able to treat those indexed objects as though they were assigned unique visual labels. (Thus it can detect previously-unnoticed patterns among indexed objects) • Our LTM representation of locations need not meet the 5 conditions because it is not directly used in spatial reasoning or motor control

  19. SHORT DETOUR (while gray background)! Visual Index (FINST) Theory • Because FINST Indexes play a central role in this story I will make a short detour to illustrate this mechanism and to give some examples of indexes at work

  20. Pick out the 3 dots I will cue and keep track of them • After you pick out the 3 cued dots, I’ll ask you to move your attention from the center one to the dot below it. Describe the new relation among the three dots. • In a field of identical elements you can select several of them and move your attention among them so long as they are not too close together (Intriligator & Cavanagh, 2001)

  21. Several objects must be picked out at once in making relational judgments You must have the ability to pick out several individual items and keep track of them since in order to make relational judgments, such as inside or on-the-same-contour you must pick out the relevant individual objects first. Are dots Inside-same-contour? On-same-contour?

  22. Other experimental demonstrations of FINST indexes • Recognizing the cardinality of small sets of things: Subitizing vs counting (Trick, 1994) • Searching through subsets – selecting items to search through (Burkell, 1997) • Selecting subsets and maintaining the selection during a saccade (Currie, 2002) • Multiple Object Tracking (MOT)

  23. Subset selection for search Burkell, J., & Pylyshyn, Z. W. (1997). Searching through subsets: A test of the visual indexing hypothesis. Spatial Vision, 11(2), 225-258.

  24. Subset search results: • Only properties of the subset matter • If the subset is a single-feature search it is fast and the slope (RT vs number of items) is shallow • If the subset is a conjunction search set, it takes longer and is more sensitive to the set size • The distance between targets does not matter, so observers don’t seem to be scanning the display looking for the target but can switch their attention directly to the subset items

  25. Selective search is also found when a saccade occurs between the late onset cues and start of search Even with a saccade between selection and access, items can be accessed efficiently

  26. Demonstrating the function of FINSTs with Multiple Object Tracking (MOT) • In a typical MOT experiment, 8 simple identical objects are presented on a screen and 4 of them are briefly distinguished in some visual manner – usually by flashing them on and off. • After these 4 targets are briefly identified, all objects resume their identical appearance and move randomly. The observers’ task is to keep track of the ones that had been designated as targets at the start • After a period of 5-10 seconds the motion stops and observers must indicate, using a mouse, which objects are the targets

  27. Keep track of the objects that flash

  28. How do we do it? What properties of individual objects do we use?

  29. Keep track of the objects that flash

  30. Our explanation is that FINST indexes are bound to targets when they flash and remain bound for the duration of the trial. At the end of the trial they allow attention to be moved to each target in turn so that the targets can be selected

  31. FINST indexes allow selected objects to be accessed directly and without searching for specific properties: Indexes stay bound to objects as the objects move

  32. If you were like the cartoon character Plastic Man and could place your fingers on things in the world so as to refer to them uniquely, and if you could then move your gaze or attention to any of them at will, you would possess fingers of instantiation

  33. If you were like the cartoon character Plastic Man and could place your fingers on things in the world so as to refer to them uniquely, and if you could then move your gaze or attention to them, you would possess FINgers of INSTantiation (or FINSTs)

  34. End of aside on FINSTs! Summary • The FINST mechanism provides a limited set of indexical pointers bound to perceived objects • FINSTs can associate perceived objects with objects of thought • The binding is stable over some period of time (e.g., a few seconds) and continues despite motion of the objects or eye movements. • Perception is able to treat the indexed objects as though they were perceptually marked
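The summary above can be illustrated with a minimal sketch. It is not the author's model: it only shows the idea that an index holds a reference to an object rather than a record of its properties or location, so the binding survives motion without being updated. The names `Obj` and `IndexPool`, the labels, and the capacity of 4 are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """A scene object: its position changes, its identity does not."""
    x: float
    y: float


class IndexPool:
    """Hypothetical pool of FINST-like indexes with a small fixed capacity."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.bound = {}  # label -> reference to the object itself

    def bind(self, label, obj):
        if label not in self.bound and len(self.bound) >= self.capacity:
            raise RuntimeError("no free index")
        self.bound[label] = obj  # keep the reference, not a copy of (x, y)

    def locate(self, label):
        """Read the current location off the object only when it is needed."""
        obj = self.bound[label]
        return (obj.x, obj.y)


# Eight identical objects; four are designated as targets (as in MOT).
scene = [Obj(float(i), 0.0) for i in range(8)]
pool = IndexPool()
for i in (0, 2, 5, 7):
    pool.bind(f"target{i}", scene[i])

# Every object moves; nothing in the pool is touched.
for o in scene:
    o.x += 3.0
    o.y -= 1.0

print(pool.locate("target2"))  # (5.0, -1.0)
```

Because the pool never stores coordinates, nothing has to be recomputed when the objects move, which is the sense in which the indexes "stay bound" in this sketch.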

  35. Examples of the projection hypothesis • To illustrate how the projection hypothesis works, first consider index-based projection in the visual modality, where indexes can convert some apparently mental-space phenomena into perceived-space phenomena (although I will return to the non-visual case shortly, the visual case is more salient and tends to dominate other modalities) • Examples from some ‘mental imagery’ experiments • Mental scanning (Kosslyn, 1973) • Mental image superposition (Podgorny & Shepard, 1978) • Visual-motor adaptation (Finke, 1979) • S-R compatibility to imagined locations (Tlauka, 1998)

  36. Studies of mental scanning: often cited to suggest that representations have metrical properties [Figure: time to “see” feature on image, plotted against distance on image]

  37. Brain image or index-based projection? • A way to do this task: • Associate places on the imagined map with places in the world that you perceive • Move your attention or gaze from one place to another as they are named

  38. Using a perceived room to anchor FINSTs tagged with map labels

  39. Using vision with selected ‘labeled’ objects • If you ‘project’ the pattern of map places by picking out objects in the room in front of you that correspond roughly to these memorized locations, then you can scan attention from one such marked object to another. The space here is real and the equation time = distance / speed is a physical principle, not tacit knowledge about the world. • You can also use the tagged objects to infer configurational properties you may not have noticed, despite somehow memorizing the location of all objects • Which 3 or more places on the map are collinear? • Which place on the map is furthest North, South, East, West? • Which 3 places form an isosceles triangle? • Such configurational consequences can be detected as opposed to logically inferred, so long as they involve only a few places, because the visual system can examine a scene with labeled indexed objects
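A small worked example may make the slide's point concrete. Once labels are attached to objects that occupy real locations, scanning time and configurational facts follow from the geometry of those locations and need not be stored as separate knowledge. The place names and coordinates below are invented for illustration (they are not data from any study), and the program merely computes what, on the hypothesis, the visual system would read off the scene.

```python
import math
from itertools import combinations

# Hypothetical scene coordinates (arbitrary units) of room objects that have
# been tagged with map labels. The y-axis is taken to point "North".
places = {
    "hut": (0.0, 0.0),
    "well": (2.0, 1.0),
    "tree": (4.0, 2.0),
    "rock": (1.0, 5.0),
}


def scan_time(a, b, speed=2.0):
    """time = distance / speed for moving attention or gaze from a to b."""
    return math.dist(places[a], places[b]) / speed


def collinear(a, b, c, tol=1e-9):
    """True if three places lie on one line (the triangle has zero area)."""
    (x1, y1), (x2, y2), (x3, y3) = places[a], places[b], places[c]
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) <= tol


# Configurational facts that were never stored explicitly.
northmost = max(places, key=lambda name: places[name][1])
lines = [trio for trio in combinations(places, 3) if collinear(*trio)]

print(northmost)  # rock
print(lines)      # [('hut', 'well', 'tree')]
```

In this layout the tree is twice as far from the hut as the well is, so `scan_time("hut", "tree")` comes out at twice `scan_time("hut", "well")`: the linear time-distance relation falls out of the layout itself.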

  40. Another example of a result attributable to FINST-based projection: Podgorny-Shepard experiment Remember the following pattern and imagine it after it is gone Are the following dots on or off the imagined pattern?

  41. The pattern of reaction times is the same for perceived shapes as for recalled shapes • Both when the F display is seen and when the F is imagined, the time to judge that the dot was on the F was fastest when the dot was at the vertex of the F and slower when it was on an arm of the F (slowest when it was one square away). • Does this show that the F and dots are superimposed on a display in the brain and perceived with the visual system? • A more plausible explanation is that the cells corresponding to rows and columns of the F in the matrix are indexed and thus made distinct, allowing vision to be used to judge whether the dots fall on those rows/columns.

  42. Skip? Perceptual-motor adaptation to imagined hand position (Finke, 1979) • If you wear displacing prism lenses and repeatedly reach for objects in front of you for just a few minutes, you adapt to the erroneous feedback. When the lenses are removed you overshoot in the opposite direction. • If, instead of wearing lenses, you move your hand invisibly while you imagine that your hidden hand is at the displaced location, you get the same adaptation phenomena • Does this show that both your imagined hand and other properties of the scene are displayed somewhere in your visual system? • All you need are indexes to several objects in the visual scene, together with a distinct label for each (e.g., hand, block). This allows attention or even gaze to move to them. • No visual details (e.g. hand properties) need to be imagined • Some real visual objects (e.g., texture) need to be visible to bind indexes – just a blank background will not work (cf. Rossetti)

  43. S-R Compatibility effect with a visual display The Simon effect: It is faster to make a response in the direction of an attended object than in another direction. Response for A is faster when YES is on the left in these displays

  44. S-R Compatibility effect with a recalled (mental) display The same RT pattern occurs for a recalled display as for a perceived one. RT is faster when the A is recalled (imagined) as being on the left

  45. In all these examples you only need to index a few visual objects located in appropriate places • In all examples that we have seen, the results can be explained without appealing to a global spatial display, by assuming that: • Vision can index a few visible objects (including texture elements on an apparently plain surface) and • Vision can treat indexed objects as distinct or visually labeled

  46. Reminder of the constraints to be met by a system of spatial representations • Represent magnitudes • Represent configurations • Capture connectedness and continuity • Represent spatial relations across modalities and must be able to engage the motor system • Represent 3D distances and relations By anchoring mental particulars to a few perceived objects in a scene, the visual system is able to exploit the above properties of the perceived world

  47. Visual indexes can anchor spatial representations to a scene containing visual objects: But how does this work without vision (e.g., in the dark)? • We must rely on our remarkable capacity to orient to (point to, navigate towards, …) perceived and recalled objects (including proprioceptive ‘objects’) in space without vision. Call this general capacity our location- or spatial-sense • How can the projection hypothesis account for this apparently world-centered spatial sense without assuming a global allocentric frame of reference? • Answer: Just as it does with vision, by binding represented objects to (non-visually) perceived objects in the world • Indexing non-visual ‘objects’ must exploit auditory and general proprioceptive signals, and perhaps even preparatory motor programs (Andersen & Buneo, 2002; Duhamel, Colby & Goldberg, 1992)

  48. The real problem of our sense of space • In order to solve the problem of how we index generalized ‘objects’ in the world using proprioceptive inputs we need to solve the problem of how we recognize two such inputs as corresponding to (reaching) the same object in space • This is the problem of computing the equivalence of movements, or of proprioceptive inputs, that correspond to reaching the same object. Solving this problem requires solving the problem of coordinating signals between different afferent and efferent frames of reference • That’s why mechanisms of coordinate transformation are of central importance – they make it possible to compute the relevant equivalence classes • Such mechanisms are ubiquitous in PPC, SC and elsewhere

  49. Proprioception, coordinate transformations and the allocentric frame of reference • Coordinate transformations provide the basis for computing the equivalence classes of proprioceptive signals {S} associated with reaching or sensing individual objects in space (S ≡ S′ iff there is an appropriate coordinate transformation from S to S′) • Because of the ability to compute the set {S} corresponding to reaching/sensing places in the world, proprioception is able to provide allocentric information (cf. Rossetti’s point that we should not equate proprioception with egocentric and vision with allocentric frames of reference) • Computing {S} is the problem that Henri Poincaré recognized as central to understanding our sense of space (see Poincaré’s “Why space has three dimensions” in Dernières Pensées, 1913). Without this we could not reach objects in the dark or from memory!

  50. Coordinate transformations are the basis for the illusory “global frame of reference” • A coordinate transformation operation takes a representation of an object relative to one coordinate system – say retinal coordinates – and produces a representation of that object relative to another frame of reference – say relative to the location of a hand in proprioceptive or kinematical coordinates. • An important consequence of these mechanisms is that, as Colby & Goldberg (1999, p. 319) put it, “Direct sensory-to-motor coordinate transformation obviates the need for a single representation of space in environmental coordinates”
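A toy sketch of such a transformation is given below, under a strong simplifying assumption that is mine and not the talk's: every frame differs from the others only by a 2D translation (no rotation, no gain modulation). The function name `to_hand_centered` and all the numbers are hypothetical. The sketch shows two different sensory states being mapped directly to the same hand-centered reach vector, which is what makes them equivalent in the sense of slide 49.

```python
def to_hand_centered(retinal, gaze, hand):
    """Map a retinal location directly to a hand-centered one.

    Assumption: frames differ only by 2D translation, so the target's
    hand-relative position is retinal + gaze - hand, where gaze and hand
    are both expressed relative to the body.
    """
    return (retinal[0] + gaze[0] - hand[0],
            retinal[1] + gaze[1] - hand[1])


hand = (10.0, -5.0)

# Two different sensory states: the eye has moved between them, so the
# same object falls on a different retinal location.
before = to_hand_centered(retinal=(4.0, 2.0), gaze=(0.0, 0.0), hand=hand)
after = to_hand_centered(retinal=(-1.0, 2.0), gaze=(5.0, 0.0), hand=hand)

# Both states yield the same reach vector, so they pick out the same object.
print(before, after)  # (-6.0, 7.0) (-6.0, 7.0)
```

The mapping goes straight from sensory to motor coordinates. The body-relative sum `retinal + gaze` is only an intermediate term in the arithmetic and is never stored as a map of the environment.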
