Cognition – 2/e Dr. Daniel B. Willingham

What Makes Visual Perception Hard?
• Vision is hard because the pattern of light that falls on your retina is consistent with many different scenes. For example, is the figure to the left a square, or a cube seen face-on? Or is it the base of a four-sided pyramid?

What Makes Visual Perception Hard? - Continued
• We have to deal with ambiguities.
• The visual system does this by making assumptions.
• So what is the visual system for?
  1) It identifies objects
  2) It helps us navigate in the world

What Makes Visual Perception Hard? - Continued
• The chief problem for the visual system is the inverse projection problem: the problem of recovering three-dimensional shape from a two-dimensional projection, such as the projection on the retina.
• A single two-dimensional projection may represent many different three-dimensional objects.
• Can you see the Necker cube on the left as either of the cubes below?
• Thus, the visual system must deal with indeterminacy in shape and orientation.
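The inverse projection problem can be illustrated with a minimal sketch. It assumes a simple pinhole (perspective) projection, which is a simplification of the eye's optics; the function name `project`, the focal length, and the sample points are illustrative choices, not material from the slides.

```python
# Pinhole (perspective) projection: a 3-D point (x, y, z) maps to the
# 2-D image point (f * x / z, f * y / z). The focal length f is an
# arbitrary illustrative value, not a property of the eye.

def project(point, f=1.0):
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the viewer (z > 0)")
    return (f * x / z, f * y / z)

# Three different 3-D points that lie along one line of sight.
near = (1.0, 2.0, 4.0)
middle = (2.0, 4.0, 8.0)
far = (4.0, 8.0, 16.0)

for p in (near, middle, far):
    print(p, "->", project(p))  # every point lands on the same image point
```

Because the projection divides by depth, the image alone cannot say which of the three points produced it. Recovering shape from the image means choosing among such candidates, which is why the visual system needs assumptions.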

What Makes Visual Perception Hard? - Continued
• The Second Important Problem - Surface Features
• The visual system must deal with an object’s surface features: its color, how dark or light it is, etc.
• How does the visual system deal with luminance, the amount of light the eye receives?
• There are indeterminacies in
  • Light Source
  • Reflectance
  • Shadow
• The Mach Card as an Example of Indeterminacy: Is the gray part a gray surface of a two-part object, or is it a shadow cast by the white part?
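The luminance indeterminacy can be made concrete with a small sketch. It assumes the common simplification that luminance is the product of the illumination falling on a surface and the surface's reflectance; the numbers are arbitrary illustrative units, not measurements.

```python
# Simplified model (an assumption for illustration):
#   luminance = illumination * reflectance
# where reflectance is the fraction of incoming light a surface returns.

def luminance(illumination, reflectance):
    if not 0.0 <= reflectance <= 1.0:
        raise ValueError("reflectance must be between 0 and 1")
    return illumination * reflectance

# A light surface in dim light (for example, in shadow) ...
light_surface_in_shadow = luminance(illumination=120.0, reflectance=0.75)
# ... and a dark surface in bright light.
dark_surface_in_light = luminance(illumination=360.0, reflectance=0.25)

print(light_surface_in_shadow, dark_surface_in_light)  # the same value reaches the eye
```

The eye receives only the product, so the same luminance is consistent with a light surface in shadow and a dark surface in full light, which is the ambiguity the Mach card demonstrates.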

What Makes Visual Perception Hard? - Continued
• The Third Important Problem - Object size and distance are indeterminate in a two-dimensional representation.
• For example, the images of the sun and moon seem the same size, but the two bodies are at very different distances and are very different sizes. Could you tell which is larger or farther away by visual inspection alone?
• Is Face Perception Special?
  • Prosopagnosia - a syndrome that affects face recognition
  • A sheep farmer lost his ability to recognize faces but could still recognize his sheep (McNeil and Warrington, 1993)
  • However, imaging of the fusiform gyrus suggests that the loss is really one of visual expertise (Gauthier et al., 1999)
  • The issue is still debated.
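The sun and moon example can be checked with a short visual-angle calculation. This is a sketch using approximate mean diameters and distances in kilometres (rounded astronomical values, not figures from the slides); the function name `visual_angle_deg` is an illustrative choice.

```python
import math

def visual_angle_deg(size, distance):
    """Angle subtended at the eye by an object of a given size and distance."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Approximate mean values in kilometres (rounded).
sun = visual_angle_deg(size=1_391_000, distance=149_600_000)
moon = visual_angle_deg(size=3_474, distance=384_400)

print(f"sun:  {sun:.2f} degrees")
print(f"moon: {moon:.2f} degrees")
```

Both come out at roughly half a degree even though the sun is about 400 times larger and about 400 times farther away. The retinal image fixes only the ratio of size to distance, not either quantity on its own.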

How Are Visual Ambiguities Resolved?
• Shape
• Brightness
• Distance and Size
• Top-Down Influences in Vision
• An Alternative: The Ecological Approach

Shape
• The visual system uses assumptions in processing objects
• Object orientations are assumed not to be unusual
• Likelihood principle (Helmholtz): The suggestion that, among the many ways of interpreting an ambiguous visual stimulus, the visual system will interpret it as the stimulus that is most likely to occur in the world
  • Example – lines that are parallel in a 2-dimensional representation are likely to be parallel in the 3-dimensional world
• Frame of Reference: The position or orientation of an object is defined relative to something else. For example, which object is a square and which is a diamond? (Palmer et al., 1988)

Brightness
• Assumptions:
  • Surfaces are uniformly colored
  • Gradual changes could be caused by shadow
• Three Factors Contribute to Luminance
  • Light Source
  • Shading
  • Shadow – see Photo 2.1 for a fine example
• The visual system needs to analyze complex scenes to find simple, meaningful components. This is depicted in:
  • Adelson's (1998) illusion in Fig. 2.5
  • Gilchrist’s (1997) demonstration of the effects of local contrast, tested binocularly & monocularly – see Fig. 2.6

Distance and Size
• How can we determine the true size and distance of objects? Through the use of cues.
• Three Cues in the Visual System
  • Accommodation – lens shape changes as you focus on objects
  • Convergence – the angle of the eyes changes as you focus on objects
  • Stereopsis – based on retinal disparity, the difference in position of an object’s image on each retina
• For example – see Fig 2.8 and the discussion of the correspondence problem. Random dot stereograms are a prime example.
• Correspondence problem: To use disparity as a cue to depth, one must match up the left and right retinal images. The correspondence problem refers to the difficulty that the retinal images may contain many possible matches.
• Random dot stereograms: Special stimuli with no cues to depth except retinal disparity. They are made by shifting a patch of random dot elements to the left or right in one of two otherwise identical dot matrices. Developed by Julesz.
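The construction of a random dot stereogram, and the matching step behind the correspondence problem, can be sketched as follows. This is a minimal illustration using only the standard library; the matrix size, patch size, shift, and the helper names `make_stereogram` and `matches_at` are assumptions for the example, not Julesz's actual stimuli or procedure.

```python
import random

def make_stereogram(size=20, square=8, shift=2, seed=0):
    """Return (left, right, patch): two dot matrices that are identical
    except that a central square patch is shifted sideways in `right`."""
    start = (size - square) // 2
    if not 0 < shift <= min(square, start):
        raise ValueError("shift must be positive and fit inside the matrix")
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    for r in range(start, start + square):
        # Copy the patch `shift` columns to the left in the right-eye image.
        for c in range(start, start + square):
            right[r][c - shift] = left[r][c]
        # Fill the strip uncovered by the shift with fresh random dots.
        for c in range(start + square - shift, start + square):
            right[r][c] = rng.randint(0, 1)
    return left, right, range(start, start + square)

def matches_at(left, right, patch, d):
    """Count patch dots in `left` that equal the dot d columns over in `right`."""
    return sum(left[r][c] == right[r][c - d] for r in patch for c in patch)

left, right, patch = make_stereogram()
scores = {d: matches_at(left, right, patch, d) for d in range(5)}
print(scores)
print("best-matching shift:", max(scores, key=scores.get))
```

Neither matrix contains a visible square on its own; the patch exists only as a consistent offset between the two images. Any single dot has many candidate partners in the other image, which is the correspondence problem. This sketch sidesteps it by comparing a whole patch at each candidate shift.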

Distance and Size - Continued
Cues in the Environment
• Familiar Size
  • Using one’s knowledge of the typical size of an object as a cue to the likely size and distance of an object.
  • For example, if a child appears larger than an adult, it is likely that the child is closer to the observer.
• Pictorial Cues – Cues to distance that can be used in 2-dimensional pictures:
  • Occlusion: An object that occludes another is closer
  • Texture Gradient: A field is assumed to have a uniform texture, so if more detail is visible in part of the field, that part is assumed to be closer
  • Linear Perspective: Parallel lines converge in the distance, so the closer they are to converging, the farther away the location
  • Relative Height: Objects higher in the picture plane are farther away – Photo 2.2
  • Atmospheric Perspective: Objects in the distance look less distinct and have a bluish tinge because they are viewed through dust & water particles that scatter light – Photo 2.2

Top-Down Influences in Vision
• Parsing Problem
  • One important issue is the parsing problem.
  • For some ambiguous figures, it seems impossible to identify the figure without knowing what its parts are, but its parts cannot be identified unless one knows what the figure is.
  • Palmer suggests it is resolved by using top-down and bottom-up information simultaneously
• Bottom-up processing (as previously discussed) can’t handle all of vision.
• Top-down processing is needed, in which conceptual knowledge influences the processing or interpretation of lower-level perceptual processes.
• For Example – What is this, and is this 15? What are these from?

It’s a Face!

An Alternative: The Ecological Approach
• Previously, the discussion was about the computational approach. This is the dominant approach; it assumes that the information provided by the environment is impoverished, so the cognitive system must do computation to derive the richness of the environment.
• Gibson proposes the Ecological Approach. It emphasizes that the environment has rich sources of information in it and that the computations the visual system needs to perform are probably not that extensive. Two examples are:
  • Object Size – Gibsonians suggest the use of eyeheight over the previously discussed cues. Eyeheight is the height of the observer’s eyes from the ground. Wraga (1999 a & b) supports this.
  • Distance for Navigation – How do we catch a baseball: through computation, or by using a simple environmental cue? McBeath et al. (1995) found that people catch a fly ball by running so that the trajectory of the ball looks like a straight line.
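One way eyeheight could specify object size is sketched below. It assumes level ground and an object standing on that ground, and it uses the geometric relation often called the horizon ratio: the horizon crosses every such object at the observer's eye height. This is an illustration of the geometry only; it is not taken from the slides and is not a description of the procedures in the cited studies. The function and parameter names are hypothetical.

```python
import math

def object_height(eye_height, base_declination_deg, top_elevation_deg):
    """Height of an object standing on level ground, from eyeheight.

    base_declination_deg: angle of the object's base below the horizon.
    top_elevation_deg:    angle of the object's top above the horizon
                          (negative if the top is below the horizon).
    """
    tan_base = math.tan(math.radians(base_declination_deg))
    if tan_base <= 0:
        raise ValueError("the base must lie below the horizon")
    tan_top = math.tan(math.radians(top_elevation_deg))
    # distance = eye_height / tan_base, and height = eye_height + distance * tan_top,
    # so distance cancels out of the final expression.
    return eye_height * (1 + tan_top / tan_base)

# An object whose top sits exactly on the horizon is as tall as the eyes are high.
print(object_height(1.6, base_declination_deg=10, top_elevation_deg=0))
# Equal angles above and below the horizon: about twice eye height.
print(object_height(1.6, base_declination_deg=30, top_elevation_deg=30))
```

Distance never has to be estimated separately here, which fits the ecological claim that the needed information is available in the scene without extensive computation.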

What is Visual Perception For?
• Identifying Objects
• Navigation
There seem to be separate psychological and physiological systems for these two processes.

Identifying Objects
• Core Question: What does the memory representation that supports the recognition of objects look like?
• Two Families of Answers
  • Viewer-centered representations
    • A mental representation of what an object looks like relative to the observer.
  • Object-centered representations
    • A mental representation of what an object looks like relative to the object itself. The representation can support recognition of the object when it is viewed from any perspective.

Identifying Objects - Continued
• Viewer-Centered Theories
  • Template Theory – an older theory
    • A simple template-matching theory of object recognition says that you compare what you see to templates stored in memory
    • The problem is that you need a tremendous number of templates
  • Feature-matching Theories
    • A theory of visual object identification proposing that the memory representation of an object is a list of features
    • Example: The letter T is conceptualized as having two features, a horizontal line __ and a vertical line | .
    • Feature theories have advantages over templates
      • They can handle transformations better
      • They are supported by the neurophysiological discovery of seeming “edge” and “line” detectors by Hubel and Wiesel (1979)
    • Disadvantage – feature theories have problems with natural objects
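The contrast between template matching and feature matching can be sketched with tiny dot-matrix letters. This is a toy illustration under stated assumptions: the 6x5 grids, the cell-agreement score, and the run-length "line detector" are invented for the example and are not models proposed in the slides.

```python
def to_grid(rows):
    """Turn strings of '#' (ink) and '.' (blank) into a grid of booleans."""
    return [[ch == "#" for ch in row] for row in rows]

def template_score(image, template):
    """Fraction of cells on which the image and the stored template agree."""
    cells = [a == b for ri, rt in zip(image, template) for a, b in zip(ri, rt)]
    return sum(cells) / len(cells)

def longest_run(cells):
    best = run = 0
    for cell in cells:
        run = run + 1 if cell else 0
        best = max(best, run)
    return best

def features(image, min_len=3):
    """List the line features present, wherever they are in the image."""
    found = set()
    if any(longest_run(row) >= min_len for row in image):
        found.add("horizontal line")
    if any(longest_run(col) >= min_len for col in zip(*image)):
        found.add("vertical line")
    return found

stored_T = to_grid([".###..", "..#...", "..#...", "..#...", "......"])
shifted_T = to_grid(["......", "...###", "....#.", "....#.", "....#."])
blank = to_grid(["......"] * 5)

print(template_score(stored_T, stored_T))   # 1.0
print(template_score(shifted_T, stored_T))  # 0.6
print(template_score(blank, stored_T))      # 0.8
print(features(stored_T) == features(shifted_T))  # True
```

A shifted T agrees with the stored template on fewer cells than an empty image does, which is why template theories need so many templates. The feature list is unchanged by the shift, but two features alone would not separate a T from an L or a plus sign, so real feature theories need more than this sketch provides.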

Identifying Objects - Continued
• Object-Centered Theories
  • In case you don’t remember: a mental representation of what an object looks like relative to the object itself. The representation can support recognition of the object when it is viewed from any perspective.
  • Relies on the ability to recognize object parts as they are relative to each other. Possible processes and related theories are:
    • Good continuation: Points that can be interpreted as connecting a straight or smoothly curving line will be interpreted that way rather than as connecting sharply angled lines. See also the likelihood principle. Do you see the figure below as two crossed lines ( X ) or as the two angles ( > < ) to the right?

Identifying Objects - Continued
• Object-Centered Theories - Continued
  • Sophisticated versions of this principle, based on boundary properties (convexities, maximum curvature), can be applied to object-centered feature theories (Hoffman and Richards)
  • An influential version is Biederman’s Recognition by Components (1987), which uses geons (36 basic shapes – e.g., cylinders, bricks, etc.) that act like a visual alphabet. They are easily distinguished.
  • Pro – Biederman found that obscuring vertices impairs object recognition, while obscuring other parts of objects has a lesser effect. Which is easiest to recognize as a cup? The left or the right?

Identifying Objects - Continued
• Object-Centered Theories - Continued
  • Con – Biederman – Not all natural objects can be decomposed into geons. What about a shoe?
• Viewer-Centered Theories - Again
  • A compromise position has been suggested – we store multiple viewer-centered representations of objects and apply transformations
  • Supporting work:
    • Shepard and Cooper’s (1986) mental rotation of letters suggests transformations are used
    • Tarr (1995) found that if subjects are trained with multiple views of the standard mental rotation block figures, they access the best view for the task
• Object vs. Viewer – who wins?
  • It has been suggested that the mind uses two methods of recognition
    • Decomposition-into-parts for objects that have telltale geons
    • Multiple views for objects that are closely related

Navigation
• What/where hypothesis: The visual system segregates analysis of what objects are (object recognition) and where they are (spatial location).
  • Ungerleider and Mishkin (1982) tested monkeys on two tasks:
    • Location task (called the landmark task) – disrupted by parietal lesions
    • Object identity task (matching task) – disrupted by temporal lesions
• What/how hypothesis (Goodale & Milner, 1992): An alternative to the what/where hypothesis, this proposal holds that the visual system segregates analysis of what objects are (object recognition and location) and how to manipulate them (visual information dedicated to the motor system).

Navigation - Continued
• What/how Hypothesis (Goodale & Milner, 1992) - continued
• Haffenden and Goodale (1998) – Fooling one system and not the other
  • Subjects reach for the center object in both displays.
  • Do the center objects look the same size? One looks bigger.
  • When subjects reach, do they alter their grasp based on the perceptual effect? Answer: Subjects don’t alter their grip. The “what” system was fooled but the “how” system wasn’t.
• See a similar result by Proffitt et al. (1995) with hill slopes – Fig. 2.22

Functional Imaging and the Representation of Space
• Ungerleider and Mishkin (1982): spatial information is the province of the dorsal stream (parietal lobe)
• Goodale & Milner (1992): spatial information is in both streams
  • Ventral – layout of objects in space
  • Dorsal – an object’s position, for acting on that object
• Epstein & Kanwisher (1998): The ventral stream contains an area sensitive to spatial aspects of the local environment – the parahippocampal place area (PPA), as seen in Box 2-2
