Efficient Object Recognition using Feature Triplets

Feature triplets for object recognition Larry Zitnick

Text analogies Words are to topics as features are to categories. Example topics: Politics, Biology, Mathematics, Nature. [Figure: bags of unordered words such as "wood", "beam", "doorway", "reinforce", "structure"] (Baeza-Yates and Ribeiro-Neto, 99; Squire et al. 00; Sivic et al. 03; Sivic et al. 05)

Text analogies “A smaller window is desirable to avoid unwanted smoothing in a disparity map.” Home improvement or computer vision?

Background clutter Real images have background clutter… [Figure: the sentence “a smaller window is desirable to avoid unwanted smoothing in a disparity map” surrounded by unrelated words such as "hat", "sky", "nail", "forest"; vocabulary words shown: stereo, object, solve, matrix, rectify, train, edge]

Spatial relationships Real images are 2D… [Figure: the words of the example sentence and the clutter words scattered across a 2D layout]

N-gram model for vision? Local model: Global model: Constellation model (Weber et al. 00, Fergus et al. 03) [Figure label: 1-gram]

“Words” should be: Discriminative – belong only to a few objects/categories Predictive – informative of occurrence and position of neighboring words.

Possible words SIFT: Harris/Hessian-Affine, MSER, etc. (Lowe, 2004; Mikolajczyk and Schmid, 2004; Matas et al., 2004) Two features: Rotation + scale = affine. Doublets (Sivic et al., 2005)

Feature triplets Group neighboring features into sets of 3 Lazebnik et al., 2003 100 features ≈ 1,000 triplets
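The grouping step can be sketched as follows. This is a minimal illustration, not the presenter's procedure: the slide does not say how neighbors are chosen, so the nearest-neighbor rule, the function name `build_triplets`, and the parameter `k` are all assumptions. With `k = 5`, each feature forms at most 10 triplets with pairs of its neighbors, which gives the same order of magnitude as the "100 features ≈ 1,000 triplets" figure before duplicates are merged.

```python
import itertools
import math
import random


def build_triplets(points, k=5):
    """Group each feature with every pair of its k nearest neighbors.

    `points` is a list of (x, y) feature locations. Returns a set of
    sorted index triples, so each distinct triplet appears once.
    The neighbor rule and k are assumptions, not taken from the slides.
    """
    triplets = set()
    for i, p in enumerate(points):
        # Indices of the k nearest other features by Euclidean distance.
        neighbors = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )[:k]
        # Each pair of neighbors forms one triplet with feature i.
        for a, b in itertools.combinations(neighbors, 2):
            triplets.add(tuple(sorted((i, a, b))))
    return triplets


random.seed(0)
features = [(random.random(), random.random()) for _ in range(100)]
triplets = build_triplets(features, k=5)
print(len(features), "features ->", len(triplets), "triplets")
```

Storing sorted index triples in a set merges triplets that several features generate in common, so the final count is at most 10 per feature.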

Advantages • More discriminative – 3x more descriptors; “computer vision” is more discriminative than “computer” and “vision” • More predictive – robust affine transformation
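One reason a triplet is predictive: three non-collinear point correspondences are enough to determine a 2D affine transformation. The sketch below solves for that transformation directly. The function name and the `(a, b, c, d, e, f)` parameterization are illustrative choices, not taken from the slides.

```python
def affine_from_triplet(src, dst):
    """Solve for the affine map taking three src points to three dst points.

    Returns (a, b, c, d, e, f) with x' = a*x + b*y + c and
    y' = d*x + e*y + f. Raises ValueError if src points are collinear.
    """
    (x0, y0), (x1, y1), (x2, y2) = src
    (u0, w0), (u1, w1), (u2, w2) = dst

    # Determinant of the 2x2 matrix of source point differences.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")

    def solve(v0, v1, v2):
        # Cramer's rule for one output coordinate.
        p = ((v1 - v0) * (y2 - y0) - (v2 - v0) * (y1 - y0)) / det
        q = ((x1 - x0) * (v2 - v0) - (x2 - x0) * (v1 - v0)) / det
        r = v0 - p * x0 - q * y0
        return p, q, r

    a, b, c = solve(u0, u1, u2)
    d, e, f = solve(w0, w1, w2)
    return a, b, c, d, e, f


def apply_affine(params, point):
    """Apply the affine map to a single (x, y) point."""
    a, b, c, d, e, f = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Usage: a map that scales x by 2, y by 3, then shifts by (2, 3).
params = affine_from_triplet([(0, 0), (1, 0), (0, 1)], [(2, 3), (4, 3), (2, 6)])
print(apply_affine(params, (1, 1)))
```

Once the map is known from one matched triplet, it can be used to predict where neighboring features should appear in the other image.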

Outline • Object instance recognition in large databases (Jie Sun, Georgia Tech) • Object category recognition (Xiangyang Lan, Cornell)

Object instance recognition J. Sun, C. L. Zitnick, R. Szeliski • Efficient recognition with large databases (> 1 million objects) • Image centric [Figure labels: Query image, Training image]

Affine Feature invariance SIFT feature space Rotation + Scale

Sampling descriptor patches Canonical frame Image Similar to Brown and Lowe, 2002

Triplet centric Scale & rotation invariant [Figure: features f0 and f1 shown in several configurations]

Triplet centric Affine Feature invariance SIFT feature space Rotation + Scale

Feature vocabulary K-means clustering. 1,000 clusters = 1,000,000,000 possible triplets. Increased redundancy = higher computational cost. Each feature has a different descriptor for each triplet. Verification: geometric hashing technique.
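The count follows from 1,000 × 1,000 × 1,000 ordered combinations of cluster ids. A small sketch of how such triplet "words" might be indexed for lookup is given below. The inverted index and the names `triplet_word`, `add_image`, and `query` are assumptions made for illustration; the slides do not describe the data structure used, and the geometric hashing verification step is not shown here.

```python
from collections import defaultdict

K = 1000  # vocabulary size stated on the slide


def triplet_word(c0, c1, c2, k=K):
    """Pack three ordered cluster ids into one integer in [0, k**3)."""
    return (c0 * k + c1) * k + c2


# Hypothetical inverted index: triplet word -> ids of images containing it.
index = defaultdict(set)


def add_image(image_id, triplets):
    """Register every (c0, c1, c2) triplet of a training image."""
    for c0, c1, c2 in triplets:
        index[triplet_word(c0, c1, c2)].add(image_id)


def query(triplets):
    """Count shared triplet words per training image, best match first."""
    votes = defaultdict(int)
    for c0, c1, c2 in triplets:
        for image_id in index.get(triplet_word(c0, c1, c2), ()):
            votes[image_id] += 1
    return sorted(votes.items(), key=lambda kv: -kv[1])


add_image("mug", [(1, 2, 3), (4, 5, 6), (7, 8, 9)])
add_image("book", [(1, 2, 3), (10, 11, 12)])
print(K ** 3)                               # number of possible triplet words
print(query([(1, 2, 3), (4, 5, 6)]))        # ranked candidate images
```

Because only a tiny fraction of the possible words occurs in any one image, a sparse mapping of this kind keeps each lookup cheap even when the vocabulary is very large.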

Results

Object category recognition X. Lan, C. L. Zitnick, R. Szeliski • Feature-based approach to object category recognition • Model based. Related models: Bag of words model (Sivic et al. 03); Constellation model (Weber et al. 00, Fergus et al. 03); 3D model from features (Rothganger et al. 03)

Our Approach • Use local representation • Model spatial relationships between neighboring triplets • Allows global deformations [Figure: neighboring triplets t1 and t2]

Spatial Relations: Triplet Tree • Design decision - use triplet tree for inference efficiency.

Two Essential Problems • Compute probability of objects given tree (find the “topic”) • Find tree structure (find the “grammatically correct” structure)

Scoring the object G – tree graph; O_k – object; t_i – triplet; P(t_i) – triplet’s parent
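The scoring equation itself is not preserved in this transcript. Purely as an illustration of how the notation could fit together, and not as the authors' formula, the sketch below assumes a tree-structured factorization: an object prior, a term for the root triplet, and one term per remaining triplet conditioned on its parent. All function names, table layouts, and probabilities are hypothetical.

```python
import math


def log_score(obj, parent_of, prior, root_prob, trans_prob):
    """Log score of object `obj` for a triplet tree (assumed form).

    parent_of  : dict mapping each triplet t_i to its parent P(t_i),
                 with None for the root.
    prior      : dict obj -> prior probability.
    root_prob  : dict obj -> {triplet: probability of the root triplet}.
    trans_prob : dict obj -> {(child, parent): transition probability}.
    """
    total = math.log(prior[obj])
    for triplet, parent in parent_of.items():
        if parent is None:
            total += math.log(root_prob[obj][triplet])
        else:
            total += math.log(trans_prob[obj][(triplet, parent)])
    return total


# Toy numbers, invented for the example.
tree = {"t0": None, "t1": "t0", "t2": "t0"}
prior = {"face": 0.5}
root_prob = {"face": {"t0": 0.8}}
trans_prob = {"face": {("t1", "t0"): 0.9, ("t2", "t0"): 0.7}}

score = log_score("face", tree, prior, root_prob, trans_prob)
print(score)
```

Working in log space turns the product over tree edges into a sum, which avoids underflow when the tree has many triplets.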

Transition Probability [Figure: triplets t_i and t_j, feature f_ij, and the canonical frame]

Face Data Set: Triplet Data

Creating Vocabulary • K-means (Sivic et al. 03) • Two features are corresponding if they’re assigned to same cluster.
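The correspondence rule can be sketched as a nearest-center lookup. The example assumes the k-means centers have already been computed. The two-dimensional toy descriptors and the function names are illustrative only.

```python
import math


def assign(descriptor, centers):
    """Return the index of the cluster center nearest to `descriptor`."""
    return min(
        range(len(centers)),
        key=lambda i: math.dist(descriptor, centers[i]),
    )


def corresponding(d1, d2, centers):
    """Two features correspond if they fall in the same cluster."""
    return assign(d1, centers) == assign(d2, centers)


# Toy 2D "descriptors"; real descriptors would have many more dimensions.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
print(assign((1.0, 1.0), centers))
print(corresponding((1.0, 1.0), (-0.5, 2.0), centers))
print(corresponding((1.0, 1.0), (9.0, 1.0), centers))
```

Hard assignment of this kind is simple, but two descriptors that sit close together on opposite sides of a cluster boundary will not be matched.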

Expanding tree • Rank potential triplets in queue. • Find most likely triplet given the probability of all the objects.
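The expansion described above resembles a greedy best-first search with a priority queue. The sketch below is one way to write that loop, under stated assumptions: `neighbors` and `likelihood` are hypothetical callables standing in for the candidate generation and for the triplet likelihood (which, per the slide, would take the current probabilities of all objects into account), and `max_nodes` is an invented stopping rule.

```python
import heapq


def expand_tree(root, neighbors, likelihood, max_nodes=10):
    """Greedily grow a tree, always adding the most likely candidate.

    neighbors(t)              -> iterable of candidate triplets near t
    likelihood(child, parent) -> score, higher is better
    Returns a dict mapping each triplet to its parent (None for root).
    """
    parent_of = {root: None}
    heap = []
    counter = 0  # tie-breaker so the heap never compares triplets

    def push_candidates(parent):
        nonlocal counter
        for child in neighbors(parent):
            if child not in parent_of:
                # heapq is a min-heap, so negate the score.
                entry = (-likelihood(child, parent), counter, child, parent)
                heapq.heappush(heap, entry)
                counter += 1

    push_candidates(root)
    while heap and len(parent_of) < max_nodes:
        _, _, child, parent = heapq.heappop(heap)
        if child in parent_of:
            continue  # already attached through a better edge
        parent_of[child] = parent
        push_candidates(child)
    return parent_of


# Toy candidate graph and scores, invented for the example.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
scores = {("B", "A"): 0.9, ("C", "A"): 0.4, ("D", "B"): 0.2, ("D", "C"): 0.8}

tree = expand_tree("A", graph.__getitem__, lambda c, p: scores[(c, p)])
print(tree)
```

In the toy run, triplet D is attached under C rather than B because that edge has the higher score when D is finally popped from the queue.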

Example: Known man Caltech 101

Example: Unknown woman Caltech 101

Example: Harder classes Caltech 101

Example: 3D object instance UIUC database

Average warped images

UIUC Data Set: multiple objects

UIUC Data Set: Multi Objects

Discussion • Local constellation model • Triplet approach = tri-gram model? • Tree object models (greedy and efficient) • Alternative: Model interactions between neighboring triplets using MRF and BP. • Model subclasses of objects • How to model background? • Further experiments needed
