Object Recognition Using Alignment Brian J. Stankiewicz
Approaches to Human Object Recognition • Alignment Approach • Store image(s) in memory • Use image transformations to bring the new view into alignment with a stored image.
Approaches to Human Object Recognition • Alignment Approach • Template matching failures
Approaches to Human Object Recognition • Alignment Approach Many different exemplars of category of object. How does one handle this type of variability?
Approaches to Human Object Recognition • Structural Description • Pre-process image before storing in memory • Decompose object into simple parts • Describe the object’s shape in terms of its parts • Parts are described using specific non-accidental properties
Structural Descriptions • Objects are decomposed into “parts”. • Objects are described by specifying configuration of parts and their relations.
Structural Descriptions • Each part is described by specifying the values of particular shape parameters. • Varying a parameter varies the shape.
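A minimal sketch of what a structural description could look like as a data structure. The parameter names (axis, cross_section, taper), the relation labels, and the mug/pail objects are hypothetical illustrations, not the parameter set of any particular theory.

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class Part:
    name: str
    axis: str            # hypothetical parameter: "straight" or "curved"
    cross_section: str   # hypothetical parameter: "round" or "square"
    taper: float         # hypothetical parameter: 0.0 = parallel sides

@dataclass
class StructuralDescription:
    parts: list
    relations: list = field(default_factory=list)  # (part_a, relation, part_b)

body = Part("body", "straight", "round", 0.0)
handle = Part("handle", "curved", "round", 0.0)

# Same parts, different relation between them: two different descriptions.
mug = StructuralDescription([body, handle], [("handle", "side-attached-to", "body")])
pail = StructuralDescription([body, handle], [("handle", "top-attached-to", "body")])

# Varying one shape parameter yields a differently shaped part.
tapered_body = replace(body, taper=0.5)
```

The description stores parts and relations rather than an image, so nothing in it depends on the viewpoint from which the object was seen.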
Structural Descriptions • Challenge. • How do you decompose image into objects and objects into parts? • How do you determine the shape parameters of a part given an image? • This topic will be covered next week in Biederman and Biederman & Cooper papers.
Today… • Begin by investigating the effect of viewpoint on object recognition. • Look for evidence of alignment approach • Shepard & Metzler • Mental rotation of 3D shapes • Picture Plane and Depth rotations • Tarr & Pinker • Mental rotation of 2D shapes • Picture plane rotation only • Multiple-Views Hypothesis
Shepard & Metzler • Wanted to understand how humans recognize different views of the same object. • Different images of same 3D shape can be produced by manipulating viewpoint • Investigated the effect of depth and picture-plane rotations.
Shepard & Metzler: Stimuli • “Novel” stimuli: Not a lot of previous experience • Fairly difficult task • Cannot simply use simple features • Able to carefully control view information.
Shepard & Metzler: Procedure • Two images presented simultaneously • Images of identical or “mirror reflected” objects • Subjects indicated whether two images depicted same object • Responded by pulling a “lever” • Record response times
Shepard & Metzler: Results • Response times increased linearly with orientation • Suggests that subjects are “mentally rotating” images to determine match. • [Figure: RT to “same” responses vs. angle of rotation]
Shepard & Metzler: Results • Reaction times increased linearly with depth orientation • Suggests a similar mechanism
Shepard & Metzler: Results • Not only are both depth and picture-plane rotations linearly increasing, but they have very similar slopes. • Suggestive of a single “mental rotation” mechanism.
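The linear pattern can be written as a simple model. The symbols below (a, b, θ, and the subscripts) are notation assumed here for illustration and do not come from the slides.

```latex
\begin{align}
RT(\theta) &= a + b\,\theta
  && \text{($\theta$: angular difference between the two views)} \\
\text{rotation rate} &= 1/b
  && \text{(degrees per unit time)} \\
b_{\text{picture-plane}} &\approx b_{\text{depth}}
  && \text{(similar slopes in both conditions)}
\end{align}
```

Under this reading, the intercept a collects everything that does not depend on angle (encoding, comparison, response), and similar slopes for the two kinds of rotation are what a single rotation mechanism would predict.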
Object recognition • Two fundamental approaches to human object recognition • Alignment approaches • Object recognition through an alignment process • Structural description approach • Decompose an object into its component parts • Describe the object’s shape in terms of its parts and the relations among the parts.
What is alignment? • Definition • A process that transforms images so that a new view and a stored image are brought into alignment. • Why do we need alignment? • We cannot reliably recognize objects by template matching alone • Need some process that transforms the input image to bring it into alignment
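A minimal sketch of the idea, under strong simplifying assumptions: shapes are lists of 2D landmark points, the correspondence between points is already known, and the only transformation tried is rotation in the picture plane. The function names and the "L" shape are hypothetical.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def mismatch(a, b):
    """Sum of squared distances between corresponding points."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))

def align(stored, view, step=5):
    """Try rotations of the stored shape; return (best_angle, best_mismatch)."""
    candidates = ((ang, mismatch(rotate(stored, ang), view))
                  for ang in range(0, 360, step))
    return min(candidates, key=lambda pair: pair[1])

# Hypothetical stored shape: an asymmetric "L" as ordered landmark points.
stored = [(0, 0), (0, 2), (0, 4), (1, 0), (2, 0)]
view = rotate(stored, 90)            # new view: same shape, rotated 90 degrees

direct = mismatch(stored, view)      # plain template match, no transformation
best_angle, best_err = align(stored, view)
```

Here the direct template comparison gives a large mismatch even though the shape is identical, while the search over rotations finds an angle at which the mismatch vanishes. A real alignment model would also have to solve the correspondence problem, which this sketch assumes away.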
2 studies in alignment approaches • Shepard & Metzler • Mental rotation of 3D object shapes • A single mental rotation mechanism • Evidence: similar results for depth-rotated and picture-plane-rotated pairs. • Tarr & Pinker • Multiple view hypothesis (?)
Tarr & Pinker • Wanted to investigate “mental rotation” in more detail • Two hypotheses • Single canonical image stored in memory and all new images are aligned to that single representation • Multiple-Views stored in memory. • Align new view to closest stored view
Tarr & Pinker: Method • Train subjects to recognize small set of novel, letter-like objects. • Did a “handedness” task • Is the image the trained image (standard) or its mirror reversal?
Tarr & Pinker: Stimuli • Novel, letter-like images. • Subjects trained on 3 of the images • Reduce stimulus-specific effects
Tarr & Pinker: Procedure • Trained subjects on 4 different orientations • (0°, 45°, -90°, 135°) • Tested on trained and “surprise” orientations • Measured response times
Tarr & Pinker: Exp. 1 Results • Initial reaction times similar to S&M • Performance improves by block 13 • Surprise orientations slower than trained • Blocks 1-12: practice • Block 13: practice + surprise
Tarr & Pinker: Exp. 1 Results • Fit the best-fitting line to compute the slope • Rotation required for each surprise orientation • 90°: 45° • -135°: 45° • -45°: 45° • but 180°: 90° • “4 different orientation images stored in memory?”
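A minimal sketch of the slope computation using ordinary least squares. The response times below are made-up values generated from an assumed line (500 ms + 2.5 ms per degree) purely to exercise the code; they are not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# ILLUSTRATIVE ONLY: synthetic RTs (ms) from an assumed linear rule.
angles = [0, 45, 90, 135, 180]               # degrees of required rotation
rts = [500 + 2.5 * a for a in angles]        # hypothetical response times

slope, intercept = fit_line(angles, rts)     # slope in ms per degree
rate = 1000 / slope                          # implied rotation rate, deg/s
```

A steep slope means each extra degree of misorientation costs a lot of time; a slope near zero means response time barely depends on orientation.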
Tarr & Pinker: Exp. 1 Results High slope = much rotation = single canonical image
Tarr & Pinker: Exp. 1 Summary • Stimuli showed a similar result to previous findings • Increased RT with disparate orientations from training • Subjects showed improvement following training • Even after training, subjects were slower on non-trained (intermediate) orientations
Tarr & Pinker: Exp. 2 Motivation • Demonstrated an improvement in recognition times with training. • Not a demonstration of canonical or multiple views. • Experiment 2: train on a few orientations and test on multiple orientations. • See if there is evidence for rotating to the “nearest” trained orientation.
Tarr & Pinker: Exp. 2 Methods • Similar to Experiment 1 • However, classification task rather than “handedness” task. • Three objects: “Kip”, “Kef”, “Kor”, and distractors • Record response times
Tarr & Pinker: Exp. 2 Procedure • Train on 3 orientations • Test on multiple intervening orientations • Look for rotation functions to nearest trained orientation
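A minimal sketch of how the two hypotheses make different predictions. It assumes rotation always takes the shortest angular path, and it takes 0°, 105° and 210° (the orientations named in the Exp. 2 summary) as the trained views; both are assumptions made here for illustration.

```python
def angular_distance(a, b):
    """Smallest rotation in degrees (0..180) between orientations a and b."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def rotation_needed(test, stored_views):
    """Rotation to the nearest stored view (shortest path assumed)."""
    return min(angular_distance(test, v) for v in stored_views)

canonical = [0]                # single canonical view hypothesis
trained = [0, 105, 210]        # multiple-views hypothesis (assumed views)

# Predicted rotation at each test orientation, sampled every 15 degrees.
predictions = {
    t: (rotation_needed(t, canonical), rotation_needed(t, trained))
    for t in range(0, 360, 15)
}
```

With a single canonical view the predicted rotation grows steadily with distance from 0°, while the multiple-views account predicts a dip at every trained orientation. If RT is linear in the required rotation, those dips should appear in the RT curve as well.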
Tarr & Pinker: Exp. 2 Summary • Investigated whether RT increases linearly with distance from a canonical view or from the closest trained view. • Showed mixed evidence. • For 0° and 210° there appears to be a dip in the surrounding RTs • Suggests rotation to nearest orientation • For 105° no evidence of alignment.
Tarr & Pinker: Exp. 2 Results • Mental rotation in Block 1 • By block 13, trained orientations are fast • Mental rotation rate for untrained orientations is slower.
Tarr & Pinker: Exp. 3 • Wanted to see if “handedness” played a role in recognition times. • Experiment 1 showed effect for handedness judgment. • Subjects might engage in handedness judgment unnecessarily. • Trained on both “standard” and “reversed” images • Tested on both sets of images • No handedness judgment required