The Illusion of Mental Pictures

Zenon Pylyshyn, Rutgers University, Center for Cognitive Science, http://ruccs.rutgers.edu/faculty/pylyshyn.html

The illusion about the causal role of mental pictures in thought The term “image” and the associated experience of “imaging” give rise to many tacit assumptions. The most damaging tacit assumption may be this one: When we engage in what we call imaging or visualizing, there is, somewhere in our head, something that we see more or less the way we see the world, and which corresponds to a possible or actual visual scene. This, in turn, leads to a very widespread error: the Intentional Fallacy or, as it was known in early (Wundt) psychology, the stimulus error.

The most common (maybe universal) mistake in thinking about mental imagery: the intentional or phenomenological fallacy. It is hard not to confound properties of the imagined world with properties of our representation (or image) of it. Consider such properties as these, and ask yourself whether the imagined world and the image both have the property (e.g., if you imagine something hot, does your representation also have the property of being hot):

Intuitions about which property in the world maps onto the same property in its representation • Shape ✓ • Relative Size ✓ • Orientation ✓ (can you rotate it?) • Motion ✓ (does it move continuously through intervening positions?) • Duration ✓ (reaction time?) • Relative distances ✓ (configurations) • Colour, brightness ? ? • Weight, density ✗ • Temperature ✗

Examples to probe your intuition and your tacit knowledge Imagine seeing these events unfolding… • Drop a rubber ball on the pavement. Plot bounce height vs time. Suppose you get this pattern: [Figure: bounce height plotted against time since first drop] What is responsible for this pattern in your image? • Drop a heavy ball and a light ball from, say, the Leaning Tower of Pisa. Indicate when they hit the ground and which one hits first. (It turns out that people’s tacit beliefs are Aristotelian rather than Galilean.)

What color do you see when two colored light beams overlap? What would actually happen if you witnessed this display? Two complementary colored light beams => white. Two complementary colored filters or paints => black. A more important question: Why did you imagine seeing the particular colour you reported?

Where would the water go if you poured it into a beaker full of sugar? Is there conservation of volume in your image? If not, why not?

What do these observations tell us about images? • Finding that your image mimics nature is not a discovery about images. It is a discovery about your tacit beliefs of what would happen in the world. • Asking someone to imagine some event φ is asking them what they would see if they were to observe φ actually happen. So it’s not surprising that one finds, e.g., that it takes longer to see tiny details on a small object than on a large object. • The question of what one can make of reports of conscious experiences is problematic.

Aside: What can we conclude from the contents of conscious experience? • What is the status of Tammet’s description of how he does multiplication? • Is he lying about what he does, or is he just mistaken? Is there a correct answer? • Does his description constitute an explanation of how he does multiplications?

I turn now to what I think is a deeper and more interesting topic in imagery research • Representing Space The more basic question Space, the next frontier?

The most interesting questions about mental imagery come together in the problem of representing spatial properties Representation of Space in Mental Images The intuition that images are laid out in space is very strong. WHY?

What is assumed when we claim that images are spatial? • While it is intuitive to say that our image seems to present some scene as if it were laid out on a spatial display, it’s not clear what that commits us to. • A quick look at what spatial properties we need the image to have reveals that the requirements are both extensive and not obviously attainable without displaying the image on a real spatial display or blackboard.

Some conditions on a system of codes for representing spatial relations What are some constraints that must be met by a form of representation adequate for representing spatial properties (i.e., by the architecture of a spatial representation system)? • Does the form or medium of representation place constraints on the possible representation of the movement of objects, so that in getting from A to B objects must pass through ‘intermediate’ locations? • Must the space between A and B be explicitly represented as being empty? In rotating the image of an object from θ1 to θn, does the representation of the object go through a continuous series of representations of the object at <θ1, θ2, θ3, … θn>? • Does the format itself dictate that the metric axioms hold, i.e., D(a,b) + D(b,c) ≥ D(a,c), D(a,b) = D(b,a), and D(a,a) = 0? • Must the representation also meet Euclidean axioms, so that D(a,c)² = D(a,b)² + D(b,c)² when ∠abc = 90°?
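The distance conditions listed above can be made concrete with a small check. What follows is only a sketch under stated assumptions: the function names, the sample points, and the lookup-table "code" for distance are all hypothetical and are not from the lecture. It suggests that a format which merely stores distance values is free to violate the conditions, whereas distances computed over coordinates in a Euclidean plane cannot.

```python
import itertools
import math


def euclidean(p, q):
    """Straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def satisfies_metric_axioms(points, dist, tol=1e-9):
    """Check D(a,a) = 0, D(a,b) = D(b,a), and D(a,b) + D(b,c) >= D(a,c)."""
    for a in points:
        if abs(dist(a, a)) > tol:
            return False
    for a, b in itertools.permutations(points, 2):
        if abs(dist(a, b) - dist(b, a)) > tol:
            return False
    for a, b, c in itertools.permutations(points, 3):
        if dist(a, b) + dist(b, c) < dist(a, c) - tol:
            return False
    return True


def satisfies_pythagoras(a, b, c, dist, tol=1e-9):
    """For a right angle at b, check D(a,c)^2 = D(a,b)^2 + D(b,c)^2."""
    return abs(dist(a, c) ** 2 - (dist(a, b) ** 2 + dist(b, c) ** 2)) <= tol


# Hypothetical symbolic "code": distances are just stored values,
# and nothing in the format forces them to be symmetric.
STORED = {("A", "A"): 0.0, ("B", "B"): 0.0, ("A", "B"): 3.0, ("B", "A"): 1.0}


def table_distance(p, q):
    """Look up a stored distance value; no geometry is involved."""
    return STORED[(p, q)]


if __name__ == "__main__":
    triangle = [(0, 0), (3, 0), (3, 4)]  # right angle at (3, 0)
    print(satisfies_metric_axioms(triangle, euclidean))
    print(satisfies_pythagoras(*triangle, euclidean))
    print(satisfies_metric_axioms(["A", "B"], table_distance))
```

The design point is that the checks are applied to the distance function from outside. Whether they pass depends on how that function is defined, which is the question the slide raises about the format of images.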

Spatial/Pictorial character of mental images • People have claimed that observations of mental images show that images have spatial properties, just as pictures have spatial properties, which is why I refer to these claims as Picture Theories • Picture theorists have claimed that images “Preserve Metrical Spatial Information” (Kosslyn, 1978) • One of the most explicit statements of this Picture Theory is the following quotation from Kosslyn (1994), in which he introduces the notion of depiction as the defining character of visual mental images:

Images as depictive representations (Kosslyn, 1994, p 5) “A depictive representation is a type of picture, which specifies the locations and values of configurations of points in a space. … In a depictive representation, not only is the shape of the represented parts immediately available…, but so is the shape of the empty space … one cannot represent a shape in a depictive representation without also specifying a size and orientation….” • This is the claim that the form of image representation compels certain properties of images. This is the main assumption entailed by the mental picture theory. • I will return to this notion of obligatory properties that are compelled by the nature of the image representation. That issue is at the heart of the imagery misunderstanding (aka ‘debate’) and appears in the very next paragraph quoted.

Images as displayed in “functional space” “The space in which the points appear need not be physical…, but can be like an array in a computer, which specifies spatial relations purely functionally. That is, the physical locations in the computer of each point in an array are not themselves arranged in an array; it is only by virtue of how this information is ‘read’ and processed that it comes to function as if it were arranged into an array (with some points being close, some far, some falling along a diagonal, etc).” (Kosslyn, 1994, p 5) • But it is crucial why the information is ‘read’ in one way rather than in another. If it’s just that this comports with how we consciously experience images, this is not an explanation. The functional space / real space distinction helps to make clear what is involved in claiming that images have spatial properties. Here are three examples in which the appeal to inherent spatial properties of images is cited in explaining experimental findings.
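The point about how an array is ‘read’ can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the addresses, the two layouts, and the function names are hypothetical. The stored contents are identical in both cases; which cell counts as the "right-hand neighbour" of another depends entirely on the read convention that is applied.

```python
# Hypothetical storage: contents keyed by arbitrary addresses, in no spatial order.
MEMORY = {907: "a", 12: "b", 455: "c", 3: "d"}

# Two hypothetical read conventions, each mapping (row, col) to an address.
LAYOUT_ONE = {(0, 0): 907, (0, 1): 12, (1, 0): 455, (1, 1): 3}
LAYOUT_TWO = {(0, 0): 907, (0, 1): 3, (1, 0): 12, (1, 1): 455}


def read(layout, row, col):
    """Fetch the content that a given read convention places at (row, col)."""
    return MEMORY[layout[(row, col)]]


def right_neighbour(layout, row, col):
    """Content 'next to' (row, col) on the right, under the given convention."""
    return read(layout, row, col + 1)


if __name__ == "__main__":
    print(right_neighbour(LAYOUT_ONE, 0, 0))  # "b" under the first convention
    print(right_neighbour(LAYOUT_TWO, 0, 0))  # "d" under the second convention
```

Nothing in MEMORY favours one layout over the other, so an appeal to functional space still owes an account of why one read convention is used rather than another.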

Example 1. Do images have (or just represent) size? • Basic Experimental Finding: It takes longer to detect tiny features in a ‘small’ image than in a ‘large’ image. What does this tell us about representation of size? • What other explanation can there be for this result? • What if the experiment found that it was faster to detect details in a small image? What would you conclude? • Suppose you were asked to report details in a large, blurred, low-definition image as opposed to a small high-definition image. What would you expect in that case? Why?

Example 2: Do Mental Scanning experiments support the claim that images have spatial properties? • Experiments have shown that under certain conditions it takes longer to imagine an object moving between two objects the further apart they are imagined to be. The same is true when one imagines attention moving between them. • These have been reviewed and described in: • Denis, M., & Kosslyn, S. M. (1999). Scanning visual mental images: A window on the mind. Cahiers de Psychologie Cognitive / Current Psychology of Cognition, 18(4), 409-465. • Rarely cited are experiments by Liam Bannon and me (described in Pylyshyn, 1981) which I will summarize for you. A window on the mind!

Studies of mental scanning: Does it show that images have metrical space? (Pylyshyn & Bannon. Described in Pylyshyn, 1981) • The image scanning effect is Cognitively Penetrable • i.e., it depends on Tacit Knowledge.

The central problem with imagistic explanations… To what can we attribute the linear relation between imagined distance and time? • There is a natural law that holds in the represented (possible) world: Time = distance / speed • But no such law holds of the representation. The equation does not apply to represented (functional) space, since there is no ‘law’ Time = (representation of distance) / (representation of speed) • The relation holds because either the observer has tacit knowledge that it is true in the represented world, or the architecture of the imagery system embodies this lawlike generalization.
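The contrast drawn above can be sketched as a toy illustration. It is not a model of the Pylyshyn and Bannon data: the function names, the units, and the fixed cost are hypothetical. The first function reproduces what an observer believes would happen in the represented world, so time grows linearly with imagined distance. The second shows that the same represented distance need not affect time at all if the task is understood differently, which is the sense in which the effect is cognitively penetrable.

```python
def scan_time_from_tacit_knowledge(imagined_distance, imagined_speed):
    """Reproduce the believed regularity of the world: time = distance / speed."""
    return imagined_distance / imagined_speed


def scan_time_without_simulated_travel(imagined_distance, fixed_cost=1.0):
    """Hypothetical alternative reading of the task: attention is simply
    redirected, so the represented distance is available but not used."""
    return fixed_cost


if __name__ == "__main__":
    for distance in (2.0, 4.0, 8.0):
        print(
            distance,
            scan_time_from_tacit_knowledge(distance, imagined_speed=2.0),
            scan_time_without_simulated_travel(distance),
        )
```

In both functions the distance is only a represented value. The linear pattern appears in the first because the function was written to apply the law, not because anything in the representation compels it.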

Example 3. One of the least controversial examples of image transformation: Mental rotation Time to judge whether (a)-(b) or (b)-(c) are the same except for orientation increases linearly with the angle between them (Shepard & Metzler, 1971)

The most common assumption is that to judge whether these two figures are the same shape is to mentally rotate one of them into congruence. Is this how the process looked to you? When you make it rotate in your mind, does it seem to retain its rigid 3D shape without having to recompute it?

The obligatory constraint • What does the “rotate” explanation assume? Philosopher Jesse Prinz describes an assumption involving functional space, which he refers to as “a spatial medium”: “If visual-image rotation uses a spatial medium [functional space], then images must traverse intermediate positions when they rotate from one position to another. A [symbolic] system can be designed to represent intermediate positions during rotation, but that is not obligatory.” J Prinz (2002, p 118) • This is a very important observation. But it needs to answer the further question: What makes it obligatory that the object must ‘pass through intermediate positions’ when rotating in ‘functional space’? These terms apply to the represented world, not to the representation!

The important distinction between fixed properties or format and represented content • It is only obligatory that a certain pattern must occur if the pattern is caused by fixed properties of the architecture as opposed to being due to properties of what is represented (i.e., what the observer tacitly knows about the behavior of that which is represented) • If it is obligatory only because the theorist says it is, then score that as a free empirical parameter (a wild card). • The important consequence is that if we allow one theory to stipulate what is obligatory without there being a principle that mandates it, then any other theory can stipulate the same thing. Such theories are unconstrained and explain nothing. • This failure of image theories is quite general – all picture theories suffer from the same lack of principled constraints.

How are ‘obligatory’ constraints realized? • Obligatory constraints must be built into the architecture. • How can you tell whether the constraints are architectural or based on known regularities in the represented domain? • What kind of architecture could possibly enforce rigidity of shape and continuous rotation? • Neither a spatial display nor a functional space will do. • Such properties could not be part of the architecture because we can easily imagine objects for which rigidity does not hold (e.g. imagine rotating a snake!). • There is also evidence that ‘mental rotation’ is incremental and depends on conceptual complexity of the shape and the comparison task.

An aside on the parable of the mystery box • A Cognitive Scientist, out walking in the forest one day, comes upon a black box which happens to have a meter and recording tape (as in an EKG or EEG). • Curious as to how the box works, the Cognitive Scientist examines lots of tape generated by the box and finds the following regular pattern:

An illustrative example: Behavior of Mystery Box • What does this behavior pattern tell us about the nature of the box?

An illustrative example: Mystery Code Box • Careful study revealed that pattern #2 only occurs in this special context when it is preceded by pattern A • What does this behavior pattern tell us about the nature of the box?

The Code Box • The mystery revealed: The current function of this box is to transmit messages in English, using International Morse Code. • In Morse Code: i = ·· ; e = · ; c = –·–· • The observed pattern comes not from a property of the box (not from its architecture) but from a spelling rule in English: i before e except after c.
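The code box can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and the two sample words are hypothetical choices, and only the six Morse letters needed for those words are included. The encoder plays the role of the box's architecture and is identical for every message. The pattern on the tape changes only because the English words fed into it obey a spelling rule.

```python
# International Morse Code for the letters used in the sample words.
MORSE = {"c": "-.-.", "e": ".", "i": "..", "p": ".--.", "r": ".-.", "v": "...-"}


def encode(word):
    """The 'box': turn a word into a list of Morse symbols, one per letter."""
    return [MORSE[letter] for letter in word.lower()]


def contains(signal, pattern):
    """True if the symbols in pattern occur consecutively in signal."""
    n = len(pattern)
    return any(signal[i:i + n] == pattern for i in range(len(signal) - n + 1))


if __name__ == "__main__":
    for word in ("piece", "receive"):
        signal = encode(word)
        print(word, " ".join(signal))
        print("  i-then-e pattern:", contains(signal, encode("ie")))
        print("  e-then-i after c:", contains(signal, encode("cei")))
```

Feeding the same encoder words from a language with different spelling conventions would change the observed regularities without any change to the encoder, which is the sense in which the pattern belongs to what is represented and not to the architecture.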

The Moral: Regularities in behavior may be due to either: • The inherent nature of the system or its structure or the way it is “wired”. Call these relatively fixed properties the system’s architecture. • The content of the system’s representational states: what the system represents.

Apply this idea to mental scanning • We noted earlier that the relation between imagined distance and scanning time holds either because the observer has tacit knowledge that it is true in the represented world, or because the architecture of the imagery system somehow embodies this lawlike generalization. • These are the same options we faced in the mystery box example. There we concluded that it was not because of the architecture of the box but because of what its states represent – and because the regularities hold in that represented world. The same is true here for the same reason: The system is capable of representing different relations – in other words as we showed, it is cognitively penetrable.

Now apply this idea to mental rotation • In the case of mental rotation – as Prinz noted – we would have a principled explanation if we had an account of why it was obligatory for the imagined object to pass through the sequence of angles that Shepard and others had observed. As with mental scanning, just saying that it does, or citing something called a functional space, is not an explanation. • We are still left with the two types of explanation illustrated in the mystery box parable: Either the brain is built so that in order to match objects at different orientations it has to deploy the mental operation of rotation, or else the observer chooses to match the objects by an operation experienced as “rotation” because he or she believes that this is how one would compare shapes of rigid bodies at different orientations in the world.

What next? • Given the failure to explain spatial results in terms of a functional space, we turn now to an obvious way in which we might try to explain the imagery results: we locate the picture in the space in the brain, because it is the only place where there is a literal physical space. Moreover, there are reasons to think that the primary visual cortex may be where the picture that we consciously experience is spatially displayed.

The good news for picture theories What are some plausible reasons why we might find a mechanism of imagery in visual cortex? • There is neuroanatomical evidence for a retinotopic layout in the earliest visual area of the brain (V1). • Neural imaging data shows that V1 is more active during mental imagery than during other forms of thought. • Transcranial magnetic stimulation (TMS) of visual areas interferes more with imagery than with other forms of thought. • Clinical cases of visual agnosia show that some impairments of vision have associated impairments of imagery (Bisiach, Farah). • Recent psychophysical observations of imagery show parallels with corresponding observations of vision, and these can be related in both cases to certain cells in V1 (e.g., oblique effect).

Neuroscience evidence shows that the retinal pattern of activation is displayed on the surface of the cortex There is a topographical projection of retinal activity on the visual cortex of the cat and monkey. Tootell, R. B., Silverman, M. S., Switkes, E., & de Valois, R. L. (1982). Deoxyglucose analysis of retinotopic organization in primate striate cortex. Science, 218, 902-904.

A more detailed look at two examples where neuroscience evidence is used • Claims that fMRI and PET evidence supports the assumption that larger mental images have correspondingly different regions of cortical excitation. • Claims that the Oblique Effect in imagery supports the assumption that images are laid out on the visual cortex.

1. Image size and the visual cortex • There is evidence that when imagining “large” objects that overflow one’s phenomenal image, a different pattern of activation in visual cortex occurs than when imagining a small object. • This in itself is not remarkable since all scientists accept that a difference in mental experience must be accompanied by some difference in the neural state – this is called the supervenience assumption: no mental differences without physical differences. This also follows from materialism.

Image size and neural encoding • In vision: cells in the parafoveal area of the retina project onto the more frontal parts of the visual cortex. Thus when objects are large enough so that they fall onto the parafovea, they will activate frontal parts of the visual cortex. • In imagery: it is claimed that imagining large objects (which fill the visual field) leads to increased activity in the frontal part of the visual cortex. Some have taken this as prima facie evidence that perceived (large) size is neurally encoded the same way as imagined (large) size.

Image size and the visual cortex… But the explanation for why large visual objects activate more frontal parts of the visual cortex depends on the fact that fibers from parafoveal cells connect to these frontal areas. This can’t be the case with mental images unless they are also on the retina! And how does the fact that large mental images activate frontal parts of the visual cortex explain why tiny details are easier to detect in large mental images? Imagery theorists repeatedly make this same mistake of citing activation patterns that arise from connections between the retina and the visual cortex, and which therefore do not work unless mental images are projected onto the retina. I will give just one more example of such a neural explanation because the error in that case is particularly egregious, yet missed by the theorists.

2. The oblique effect and visual cortex • In vision, when a set of lines is to be discriminated (distinguished from a single blur) the discrimination is better when the lines are vertical or horizontal than when they are at a 45° angle. This is called the Oblique Effect. It is a low-level effect that occurs in the early vision module. • Does the Oblique effect occur with mental images?

Do images have low-level visual properties? • Imagine a grating in which the bars are: • Horizontal • Vertical • Oblique (45°) • Imagine the bars getting closer and closer together. In which of these displays do the bars blur together first? • In vision, the oblique bars blur sooner (called the oblique effect) • In imagery, a similar result was reported by Kosslyn et al. (1999)

Same Neurological explanation for both cases? • An accepted explanation of the psychophysical case (where lines are seen) is that in primary visual cortex (V1) there are more cells tuned to horizontal and vertical orientations than to oblique orientations, so horizontal and vertical discrimination is more sensitive. Can this fact also explain why imagined bars show the same pattern? • This argument rests on a misunderstanding of how the orientation-specific cells are tuned to specific orientations: the tuning comes from the way they are connected to photoreceptive cells on the retina. Vertical cells are more often connected to vertical columns of photocells while horizontal cells are more often connected to horizontal rows of photocells (relative to the retina).

Neurological explanations for both cases? • If activation patterns of bars were somehow projected onto the surface of cortex by mental imagery, as assumed by picture-theorists, then no overall bias toward vertical-horizontal bars would occur. Horizontal cells would be no more likely to be activated by horizontal patterns on the surface of the visual cortex than by vertical patterns. The only way that images of horizontal bars would preferentially activate horizontal cells is if the images were on the retina!

What happens when horizontal/vertical cells are activated by means other than retinal patterns? [Figure: cells on the cortical surface, labelled 9 vertical, 9 horizontal, 5 oblique] The proportion of Vertical, Horizontal & Oblique cells remains the same in all cases – they are located at random on the surface of visual cortex!

Many reasons to reject the idea that mental images are patterns of activity projected onto visual cortex • Size of mental image ≠ size of imagined object • Orientation of imagined objects ≠ orientation of mental images (see Shepard) • Other properties of the cortical image are unsuitable for carrying out the functions alleged for mental images.

Summary of the bad news for picture theories: Drawing conclusions about the form of visual images from neuroscience data faces many hurdles • The capacity for imagery and for vision are independent. All imagery results are observed in the blind as well as in patients with no visual cortex. So there is nothing visual about them. • Cortical topography is 2-D, but mental images are 3-D – all phenomena (e.g. rotation) occur in depth as well as in the plane. • Patterns in the visual cortex are in retinal coordinates whereas images are primarily in world-coordinates • Unless you make a special effort, your image of parts of the room stays fixed in room coordinates when you move your eyes or turn your head or walk around the room.

…Problems with drawing conclusions about mental imagery from neuroscience data • Accessing information from an image is very different from accessing it from the perceived world. Order of access from images is highly constrained. [e.g., spell a familiar word backwards; or close your eyes and imagine this square matrix and name the numbers along a diagonal.] • Some have tried to explain this by postulating rapid decay of images, but this is not consistent with the data: the times involved in these demonstrations are no longer than those in scanning or rotation. • Conceptual rather than graphical properties are relevant to image complexity (e.g., mental rotation), suggesting that image representations are conceptual. • If images consist in patterns on visual cortex, then they behave differently from the same patterns when acquired from vision. For example, the important Emmert’s law applies to retinal and cortical images but not to mental images, a fact largely unnoticed.

…Problems with drawing conclusions about mental imagery from neuroscience data • The signature properties of vision (e.g., spontaneous 3D interpretation, automatic reversals, apparent motion, motion aftereffects, etc) are absent in images. Imagine this figure, then an exact copy beside it, then connect each vertex of the two copies. What do you see? • A cortical display account of most imagery findings is incompatible with the cognitive penetrability of mental imagery phenomena, such as scanning and image size effects. • The fact that the Mind’s Eye is so much like a real eye (e.g., oblique effect, visual angle, resolution fall-off) should serve to warn us that we may be studying what observers know about how the world looks to them, rather than what form their images take.

…Problems with drawing conclusions about mental imagery from neuroscience data • Many clinical cases cited by picture theorists can be explained by appeal to tacit knowledge and attention • The ‘tunnel effect’ found in vision and imagery (Farah) is plausibly due to the patient knowing how things looked to her after her surgery (Experiments were done a year after her surgery so she had time to experience how things looked). • Hemispatial neglect seems to be an attention deficit, which explains the neglect in imagery reported by Bisiach. A recent study shows that image neglect does not appear if patients have their eyes closed (Bartolomeo & Chokron, 2002). This fits well with the account I have offered in which the spatial character of mental images derives from concurrently perceived space (I will give examples later).

Where do we stand? • It seems that a literal picture-in-the-brain theory is untenable for many reasons – including the major empirical differences between mental images and cortical images. The pictorial quality of images may be an illusion that arises from the similarity of the experience of imaging and of seeing. • So how do we explain the similarity of the experience of imagining and of seeing – the fact that they both seem to involve a pictorial panoramic display? • Maybe neither the visual nor the imagery experience reveals the form of the stored information.
