Perceptual Grouping and Gestalt Laws • Law of Similarity: items that look similar are perceived as being part of the same form • Law of Good Continuation: this figure is perceived as a square and a triangle, not as a combination of strange shapes • Law of Proximity: items that are near each other are grouped together
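The law of proximity can be illustrated with a small sketch. This is only a toy model, not a claim about how the visual system groups items: it assumes grouping can be approximated by chaining together points that lie within a fixed distance of each other. The function name `group_by_proximity` and the parameter `max_gap` are hypothetical, chosen for this illustration.

```python
from math import dist

def group_by_proximity(points, max_gap):
    """Group points so that any two points within max_gap share a group."""
    groups = []  # each group is a list of points
    for p in points:
        # Find every existing group with a member within max_gap of p.
        near = [g for g in groups if any(dist(p, q) <= max_gap for q in g)]
        merged = [p]
        for g in near:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    return [sorted(g) for g in groups]

# Six dots in a row with one large gap in the middle (toy data).
dots = [(0, 0), (1, 0), (2, 0), (6, 0), (7, 0), (8, 0)]
clusters = sorted(group_by_proximity(dots, max_gap=1.5))
print(clusters)
```

With this spacing the dots fall into two groups of three, which matches the intuition that the row is seen as two clusters rather than six separate items.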
Common Fate • Johansson point-light displays (from: Emily Grossman) • Collection of point-light displays: http://astro.temple.edu/~tshipley/mocap/dotMovie.html http://www.tutkie.tut.ac.jp/~mich/kitazaki.hm.html
Figure–Ground Segregation • An ambiguous drawing which can be seen either as two faces or as a goblet
Task: are the two x’s on the same shape? Result: faster performance with upright letters • The study by Vecera and Farah (1997) suggests the Gestaltists exaggerated the role of bottom-up processes in segmentation
Gestaltists de-emphasised the complexities involved when laws of grouping are in conflict. • (a) Display involving a conflict between proximity and similarity; (b) display with a conflict between shape and colour; (c) a different display with a conflict between shape and colour.
Geisler et al. (2001) • Is our perceptual grouping mechanism tuned to the natural visual environment? Gestalt mechanisms should have a basis in the statistics of the natural world • Analyzed pairs of edges in natural images (figure axes: angle, distance)
Geisler et al. (2001) • Color shows how likely pairs of edges are to belong to the same physical contour as distance and angle difference vary • Finding: adjacent edges with similar orientations are likely to belong to the same contour • The study supports the Gestalt law of good continuation • Figure 3D from Geisler et al. (2001)
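The kind of edge co-occurrence analysis described above can be sketched in a few lines. This is a simplified illustration, not the procedure from the paper: it assumes each edge pair is already labeled as same-contour or not, and it reports the plain fraction of same-contour pairs per (distance, angle difference) bin. The data in `toy_pairs` are invented for the example, and the names `same_contour_rate`, `dist_step`, and `angle_step` are hypothetical.

```python
from collections import defaultdict

def same_contour_rate(pairs, dist_step, angle_step):
    """Fraction of same-contour pairs in each (distance, angle) bin."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [same, total]
    for distance, angle_diff, same in pairs:
        key = (int(distance // dist_step), int(angle_diff // angle_step))
        counts[key][0] += int(same)
        counts[key][1] += 1
    return {key: same / total for key, (same, total) in counts.items()}

# Invented toy data: (distance, orientation difference in degrees, same contour?)
toy_pairs = [
    (1.0, 5.0, True), (1.5, 10.0, True), (1.2, 8.0, True), (1.8, 12.0, False),
    (1.0, 70.0, False), (1.6, 80.0, False),
    (9.0, 5.0, True), (8.5, 10.0, False), (9.5, 3.0, False),
    (9.0, 75.0, False),
]
rates = same_contour_rate(toy_pairs, dist_step=5.0, angle_step=45.0)
for key in sorted(rates):
    print(key, round(rates[key], 2))
```

In this toy set the bin for nearby, similarly oriented edges has the highest rate, which is the pattern the slide describes. Real image statistics would need many labeled edge pairs.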
Geisler et al. (2001) • Psychophysical experiment: detect which image contains a winding contour • Performance could be predicted by assuming observers use the statistics of the natural world
Theories of Object Recognition • Template matching models • Marr’s theory • Recognition-by-components
Some Challenges for Object Recognition Theories • The binding problem: binding different features (color, orientation, etc.) to yield a unitary percept. • Bottom-up vs. top-down processing: how much is assumed top-down vs. extracted from the image? • Viewpoint invariance: a major issue is how to recognize objects irrespective of the viewpoint from which we see them.
Template matching • Detect patterns by matching visual input with a set of templates: see if any template matches. • What about invariance to translation, scaling and rotation? • Solution: find the template that best aligns to the image (using translation, rotation, scaling)
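A minimal sketch of template matching follows. It handles translation only: it slides a small binary template over a binary image and scores each offset by the number of matching cells. Rotation and scaling, which the slide also mentions, are left out to keep the example short. The function name `best_match` and the two grids are hypothetical and exist only for illustration.

```python
def best_match(image, template):
    """Slide template over image; return (row, col, score) of the best fit.

    The score is the number of cells that agree at that offset.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                image[r + i][c + j] == template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            if best is None or score > best[2]:
                best = (r, c, score)
    return best

# A 3x3 "H"-like template and a 5x5 image containing it at offset (1, 1).
template = [[1, 0, 1],
            [1, 1, 1],
            [1, 0, 1]]
image = [[0, 0, 0, 0, 0],
         [0, 1, 0, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 0, 1, 0],
         [0, 0, 0, 0, 0]]
print(best_match(image, template))
```

Extending the search to rotated and rescaled copies of each template multiplies the number of comparisons, which hints at why this approach scales poorly.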
Figure 2-15 (p. 58): examples of the letter M. Problem: template matching is not powerful enough for general object recognition
Marr’s Theory (1982) 1) pixel-based representation (light intensity) 2) primal sketch (discontinuities in intensity) 3) 2½-D sketch (oriented surfaces, relative depth between surfaces) 4) 3-D model (shapes, spatial relationships, volumes)
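The step from light intensities to intensity discontinuities can be sketched as follows. This is a deliberately crude stand-in, assuming a simple neighbor-difference threshold rather than the filtering Marr's own work relies on. The function name `intensity_edges`, the `threshold` parameter, and the toy image are hypothetical.

```python
def intensity_edges(image, threshold):
    """Mark pixels whose intensity differs from the left or upper
    neighbour by more than threshold."""
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx = abs(image[r][c] - image[r][c - 1]) if c > 0 else 0
            dy = abs(image[r][c] - image[r - 1][c]) if r > 0 else 0
            if max(dx, dy) > threshold:
                edges[r][c] = 1
    return edges

# Toy image: a dark region on the left, a bright region on the right.
image = [[10, 10, 90, 90],
         [10, 10, 90, 90],
         [10, 10, 90, 90]]
edge_map = intensity_edges(image, threshold=40)
for row in edge_map:
    print(row)
```

The output marks a single vertical line where the dark and bright regions meet, the kind of discontinuity that a primal sketch makes explicit.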
The hierarchical organisation of the human figure (from Marr & Nishihara, 1978) at various levels: (a) axis of the whole body; (b) axes at the level of arms, legs, and head; (c) arm divided into upper and lower arm; (d) a lower arm with separate hand; and (e) the palm and fingers of a hand.
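The hierarchy in this figure maps naturally onto a tree. The sketch below assumes a simplified figure with one arm and one leg, and the label "forearm" for the non-hand part of the lower arm is an assumption added for the example. The helper `parts_at_depth` is hypothetical.

```python
# Each node is (name, list of child nodes).
figure = ("body", [
    ("head", []),
    ("arm", [
        ("upper arm", []),
        ("lower arm", [
            ("forearm", []),
            ("hand", [("palm", []), ("fingers", [])]),
        ]),
    ]),
    ("leg", []),
])

def parts_at_depth(node, depth):
    """Return the names of all parts found depth levels below node."""
    name, children = node
    if depth == 0:
        return [name]
    return [p for child in children for p in parts_at_depth(child, depth - 1)]

for level in range(5):
    print(level, parts_at_depth(figure, level))
```

Each deeper level describes a smaller part in its own local frame, so a coarse description (body, limbs) is available before the fine detail (palm, fingers).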
Biederman’s Recognition-by-Components Theory • Adapted from Biederman (1987)
Recognition-by-Components Theory • Biederman (1987): five invariant properties of edges • Curvature: points on a curve • Parallel: sets of points in parallel • Cotermination: edges terminating at a common point • Symmetry: versus asymmetry • Collinearity: points sharing a common line
Recognition by Components • Complex objects are made up of arrangements of basic, component parts: geons. • “Alphabet” of 36 geons • Recognition involves recognizing object elements (geons) and their configuration
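The idea that recognition depends on both the geons and their configuration can be sketched as a lookup over structural descriptions. This is an illustrative toy, not Biederman's model: the geon labels, the relation names ("side-of", "top-of"), and the functions `describe` and `recognize` are all assumptions made for the example.

```python
def describe(parts, relations):
    """Build a hashable structural description: geons plus their relations."""
    return (frozenset(parts), frozenset(relations))

# Two objects built from the same geons in different configurations.
known_objects = {
    "cup": describe(
        {"cylinder", "curved handle"},
        {("curved handle", "side-of", "cylinder")},
    ),
    "pail": describe(
        {"cylinder", "curved handle"},
        {("curved handle", "top-of", "cylinder")},
    ),
}

def recognize(description):
    """Return the names of stored objects whose description matches exactly."""
    return [name for name, d in known_objects.items() if d == description]

seen = describe(
    {"cylinder", "curved handle"},
    {("curved handle", "top-of", "cylinder")},
)
print(recognize(seen))
```

The two stored objects share the same parts, so only the relation between the parts tells them apart. That is the sense in which configuration matters as much as the geons themselves.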
Why these 36 geons? • Choice of shape vocabulary seems a bit arbitrary • However, the choice of geons was based on non-accidental properties. The same geon can be recognized across a variety of different perspectives, except for a few “accidental” views.
Biederman (1987). Participants were presented with degraded line drawings of objects. Recognition was much harder to achieve when parts of the contour providing information about concavities were omitted than when other parts of the contour were deleted. This confirms the assumption that information about concavities is important for object recognition.
Potential difficulties • Structural description not enough, also need metric info • Difficult to extract geons from real images • Ambiguity in the structural description: most often we have several candidates • For some objects, deriving a structural representation can be difficult (Edelman, 1997)
Viewpoint-dependent and Viewpoint-invariant Theories • Biederman (1987): recognition by components predicts that ease of object recognition is not affected by the observer’s viewpoint, provided that all geons and their configuration can be identified • However, Tarr (1995) and Tarr and Bülthoff (1995, 1998) find that changes in viewpoint often do reduce the speed and/or accuracy of object recognition
Examples of “Greebles”. In the top row, four different “families” are represented. For each family, two members of different “genders” are shown (e.g., Ribu is one gender and Pila is the other). The bottom row shows a new set of Greeble figures constructed on the same logic but asymmetrical in structure.
Speed of Greeble matching as a function of stage of training and difference in orientation between successive Greeble stimuli. Based on data in Gauthier and Tarr (2002).
Task-dependent effects (Vanrie et al., 2002) • Non-matching stimuli in a same-different task • Invariance condition: features are slightly rotated; this results in view-invariant features • Rotation condition: uses mirror images; this requires mental rotation and viewpoint-dependent processes
Speed of performance in (a) the invariance condition and (b) the rotation condition as a function of angular difference and trial type (matching vs. non-matching). Based on data in Vanrie et al. (2002).
Problem for many object recognition theories: how to model the role of context? • Context can often help in the identification of an object • Later identification of objects is more accurate when the object is embedded in a coherent context
Associative Agnosia • Failure or deficit in recognizing objects. Patients can draw and copy objects, but cannot recognize them or understand the purpose of some objects
Apperceptive Agnosia • Inability to integrate features of an object into an overall pattern. Patients cannot distinguish between objects, despite clear differences in color and shape.
Prosopagnosia • Inability to recognize faces http://www.radicalface.com/mediac/400_0/media/
Riddoch and Humphreys (2001). A hierarchical model of object recognition and naming, specifying different component processes which, when impaired, can produce varieties of apperceptive and associative agnosia.