
Efficient Object Recognition using Feature Triplets

This article explores the use of feature triplets for efficient object recognition in large databases. It discusses the importance of spatial relationships between neighboring triplets and proposes a triplet tree model for inference efficiency. It also explores different approaches for creating vocabulary and expanding the tree structure. Examples and discussions are provided to demonstrate the effectiveness of the proposed methods.

rfrick




Presentation Transcript


  1. Feature triplets for object recognition (Larry Zitnick)

  2. Text analogies • Words : Topics = Features : Categories • [Figure: a scrambled bag of words (wood, the, should, beam, doorway, be, reinforce, placed, structure, above, to, blue, for, car, jump, shampoo, this, bag) beside candidate topics: Politics, Biology, Mathematics, Nature] (Baeza-Yates and Ribeiro-Neto, 99; Squire et al. 00; Sivic et al. 03; Sivic et al. 05)

  3. Text analogies • “A smaller window is desirable to avoid unwanted smoothing in a disparity map.” • Home improvement or computer vision?

  4. Background clutter • Real images have background clutter. • [Figure: the sentence “a smaller window is desirable to avoid unwanted smoothing in a disparity map” embedded among unrelated words (year, for, hat, blue, a, as, sky, in, of, nail, up, we, forest, draw), shown beside the word set “stereo object solve matrix rectify train edge”]

  5. Spatial relationships • Real images are 2D. • [Figure: the words of the example sentence scattered in two dimensions among clutter words such as rain, sky, blue, forest, hat, year, nail]

  6. N-gram model for vision? • Local model: 1-gram • Global model: Constellation model (Weber et al. 00, Fergus et al. 03)

  7. “Words” should be: • Discriminative – belong to only a few objects/categories • Predictive – informative of the occurrence and position of neighboring words

  8. Possible words • SIFT: Harris/Hessian-Affine, MSER, etc. (Lowe, 2004; Mikolajczyk and Schmid, 2004; Matas et al., 2004) • Two features: rotation + scale = affine (doublets, Sivic et al., 2005)

  9. Feature triplets • Group neighboring features into sets of 3 (Lazebnik et al., 2003) • 100 features ≈ 1,000 triplets
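
One way to read the count on this slide: if each feature is grouped with every pair drawn from its k nearest neighbors, it contributes C(k, 2) triplets, so k = 5 gives 10 per feature and at most 1,000 for 100 features once duplicates are merged. The sketch below assumes that neighbor-pair scheme and k = 5. Neither is stated in the slides, and `build_triplets` is a hypothetical name.

```python
from itertools import combinations

import numpy as np


def build_triplets(points, k=5):
    """Group each feature with every pair of its k nearest neighbors.

    Assumption: the slides do not give the grouping rule; this is one
    plausible scheme. Returns a set of sorted index triples.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances between feature locations.
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)  # a feature is not its own neighbor

    triplets = set()
    for i in range(len(points)):
        neighbors = np.argsort(dist2[i])[:k]
        for a, b in combinations(neighbors, 2):
            # Sorting merges the same triplet found from different features.
            triplets.add(tuple(sorted((i, int(a), int(b)))))
    return triplets


rng = np.random.default_rng(0)
features = rng.uniform(0, 100, size=(100, 2))  # 100 synthetic feature locations
triplets = build_triplets(features, k=5)
print(len(triplets))  # at most 100 * C(5, 2) = 1000
```

Storing each triplet as a sorted tuple makes it order-independent, which is why the final count usually lands below the 1,000 upper bound.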

  10. Advantages • More discriminative – 3x more descriptors; “computer vision” is more discriminative than “computer” and “vision” taken separately • More predictive – robust affine transformation

  11. Outline • Object instance recognition in large databases • Jie Sun (Georgia Tech) • Object category recognition • Xiangyang Lan (Cornell)

  12. Object instance recognition (J. Sun, C. L. Zitnick, R. Szeliski) • Efficient recognition with large databases (> 1 million objects) • Image centric • [Figure: query image and training image]

  13. Affine Feature invariance SIFT feature space Rotation + Scale

  14. Sampling descriptor patches • Image and canonical frame • Similar to Brown and Lowe, 2002

  15. Triplet centric • Scale & rotation invariant • [Figure: feature pairs f0, f1]

  16. Triplet centric Affine Feature invariance SIFT feature space Rotation + Scale
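
Three non-collinear feature locations determine a 2D affine transform exactly, which is what lets a triplet define its own canonical frame. A minimal sketch of that step, assuming the canonical frame is the unit triangle (0,0), (1,0), (0,1); the slides do not specify the frame, and `affine_to_canonical` and `apply_affine` are hypothetical helpers.

```python
import numpy as np

# Assumed canonical frame: the unit right triangle.
CANONICAL = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])


def affine_to_canonical(tri):
    """Return the 2x3 matrix A such that A @ [x, y, 1] maps tri onto CANONICAL."""
    tri = np.asarray(tri, dtype=float)
    # Each row of src is [x, y, 1]; solving src @ X = CANONICAL gives X = A.T.
    src = np.hstack([tri, np.ones((3, 1))])
    if abs(np.linalg.det(src)) < 1e-12:
        raise ValueError("triplet points are collinear")
    return np.linalg.solve(src, CANONICAL).T


def apply_affine(A, pts):
    """Apply a 2x3 affine matrix to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]


tri = np.array([[2.0, 3.0], [5.0, 4.0], [3.0, 8.0]])
A = affine_to_canonical(tri)
print(np.round(apply_affine(A, tri), 6))  # the triplet lands on the canonical triangle
```

Because the transform is fixed by the triplet itself, a nearby point expressed in this frame keeps the same coordinates after any affine distortion of the image, which is the invariance the slide refers to.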

  17. Feature vocabulary • K-means clustering • 1,000 clusters = 1,000,000,000 possible triplets • Increased redundancy = higher computational cost • Each feature has a different descriptor for each triplet • Verification: geometric hashing technique
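
The count follows from treating a triplet as an ordered combination of three quantized features: 1,000 choices per slot gives 1,000^3 = 10^9 triplet words. A minimal sketch of that indexing, assuming ordered slots (the slides do not say whether order matters); `triplet_word_id` and `decode_triplet_word` are illustrative names.

```python
def triplet_word_id(c0, c1, c2, vocab_size=1000):
    """Combine three cluster indices into one triplet word index.

    Assumption: the three slots are ordered, so (a, b, c) != (b, a, c).
    """
    for c in (c0, c1, c2):
        if not 0 <= c < vocab_size:
            raise ValueError("cluster index out of range")
    return (c0 * vocab_size + c1) * vocab_size + c2


def decode_triplet_word(word_id, vocab_size=1000):
    """Recover the three cluster indices from a triplet word index."""
    rest, c2 = divmod(word_id, vocab_size)
    c0, c1 = divmod(rest, vocab_size)
    return c0, c1, c2


print(1000 ** 3)                       # 1000000000 possible triplet words
print(triplet_word_id(999, 999, 999))  # 999999999, the largest index
```

A vocabulary this large is far too sparse to store densely, so in practice such indices would serve as hash keys rather than array offsets.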

  18. Results

  19. Object category recognition (X. Lan, C. L. Zitnick, R. Szeliski) • Feature-based approach to object category recognition • Model based • Bag of words model (Sivic et al. 03) • Constellation model (Weber et al. 00, Fergus et al. 03) • 3D model from features (Rothganger et al. 03)

  20. Our Approach • Use local representation • Model spatial relationships between neighboring triplets • Allows global deformations • [Figure: neighboring triplets t1 and t2]

  21. Spatial Relations: Triplet Tree • Design decision - use triplet tree for inference efficiency.

  22. Two Essential Problems • Compute the probability of objects given the tree: find the “topic” • Find the tree structure: find a “grammatically correct” structure

  23. Scoring the object • G – tree graph • O_k – object • t_i – triplet • P(t_i) – triplet’s parent
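
The transcript keeps only the symbol definitions from this slide. One common way to score a tree with these symbols is to factor the object score into a root term and one parent-conditioned term per remaining triplet, then sum log-probabilities. The sketch below assumes that factorization; it is not taken from the slides, the probability values are made up, and `score_object` is a hypothetical name.

```python
import math


def score_object(parent, log_prob_root, log_prob_transition):
    """Sum log-probabilities over a tree G for one object O_k.

    parent: dict mapping triplet id -> parent triplet id (None for the root).
    log_prob_root: dict triplet id -> log P(t | O_k), used for the root only.
    log_prob_transition: dict (child, parent) -> log P(t_i | P(t_i), O_k).
    Assumed factorization, for illustration only.
    """
    total = 0.0
    for t, p in parent.items():
        if p is None:
            total += log_prob_root[t]
        else:
            total += log_prob_transition[(t, p)]
    return total


# Made-up numbers for a three-triplet tree rooted at t0.
tree = {"t0": None, "t1": "t0", "t2": "t0"}
root_lp = {"t0": math.log(0.5)}
trans_lp = {("t1", "t0"): math.log(0.8), ("t2", "t0"): math.log(0.25)}
score = score_object(tree, root_lp, trans_lp)
print(math.exp(score))  # product 0.5 * 0.8 * 0.25
```

Working in log space avoids underflow when a tree contains many triplets, and comparing the score across objects picks the most likely one under this assumed model.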

  24. Transition Probability • [Figure: triplets t_i and t_j, features f_ij, canonical frame]

  25. Face Data Set: Triplet Data

  26. Face Data Set: Triplet Data

  27. Face Data Set: Triplet Data

  28. Creating Vocabulary • K-means (Sivic et al. 03) • Two features correspond if they are assigned to the same cluster.
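
The correspondence rule can be sketched with a plain Lloyd's k-means in NumPy. The descriptor values, the two-cluster vocabulary, and the evenly spaced initialization are all assumptions chosen to keep the example small and reproducible; `kmeans`, `assign`, and `corresponding` are illustrative names.

```python
import numpy as np


def assign(data, centroids):
    """Index of the nearest centroid for each row of data."""
    d = np.asarray(data, dtype=float)[:, None, :] - np.asarray(centroids, dtype=float)[None, :, :]
    return np.argmin((d ** 2).sum(axis=2), axis=1)


def kmeans(data, k, iters=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    data = np.asarray(data, dtype=float)
    # Assumption: evenly spaced samples as initial centroids (deterministic).
    init = np.linspace(0, len(data) - 1, k).astype(int)
    centroids = data[init].copy()
    for _ in range(iters):
        labels = assign(data, centroids)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids, assign(data, centroids)


def corresponding(f_a, f_b, centroids):
    """Two features correspond if they fall in the same cluster."""
    la, lb = assign([f_a, f_b], centroids)
    return bool(la == lb)


# Toy 2D "descriptors": two well-separated groups.
descriptors = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                        [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
centroids, labels = kmeans(descriptors, k=2)
print(corresponding([0.2, 0.3], [0.9, 0.8], centroids))    # same cluster
print(corresponding([0.2, 0.3], [10.0, 10.2], centroids))  # different clusters
```

Hard assignment makes matching a constant-time lookup, at the cost of missing features that sit just across a cluster boundary.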

  29. Expanding tree • Rank potential triplets in a queue. • Find the most likely triplet given the probability of all the objects.
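
The two bullets describe a best-first expansion: candidate triplets adjacent to the current tree wait in a priority queue, and the most likely one is attached next. A minimal sketch using Python's `heapq`, assuming each candidate already carries a single likelihood score (the slides do not say how the probabilities over objects are combined) and that `neighbors` lists which triplets adjoin which. All names and numbers are illustrative.

```python
import heapq


def expand_tree(root, neighbors, score, max_size):
    """Greedy best-first growth of a triplet tree.

    neighbors: dict triplet -> iterable of adjacent triplets.
    score: dict triplet -> likelihood (higher is better); assumed given.
    Returns dict child -> parent, with the root mapped to None.
    """
    parent = {root: None}
    # heapq is a min-heap, so scores are negated to pop the best first.
    queue = [(-score[n], n, root) for n in neighbors.get(root, ())]
    heapq.heapify(queue)
    while queue and len(parent) < max_size:
        _, t, p = heapq.heappop(queue)
        if t in parent:
            continue  # already in the tree
        parent[t] = p
        for n in neighbors.get(t, ()):
            if n not in parent:
                heapq.heappush(queue, (-score[n], n, t))
    return parent


neighbors = {"t0": ["t1", "t2"], "t1": ["t3"], "t2": ["t3"], "t3": []}
score = {"t1": 0.9, "t2": 0.4, "t3": 0.7}
print(expand_tree("t0", neighbors, score, max_size=3))
# {'t0': None, 't1': 't0', 't3': 't1'}
```

Each triplet gets exactly one parent, so the result stays a tree, and the greedy order means low-scoring candidates such as `t2` are only attached if the size budget allows.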

  30. Example: Known man Caltech 101

  31. Example: Unknown woman Caltech 101

  32. Example: Harder classes Caltech 101

  33. Example: 3D object instance UIUC database

  34. Example: 3D object instance UIUC database

  35. Average warped images

  36. Average warped images

  37. UIUC Data Set: multiple objects

  38. UIUC Data Set: multiple objects

  39. UIUC Data Set: multiple objects

  40. UIUC Data Set: multiple objects

  41. UIUC Data Set: multiple objects

  42. UIUC Data Set: Multi Objects

  43. Discussion • Local constellation model • Triplet approach = tri-gram model? • Tree object models (greedy and efficient) • Alternative: Model interactions between neighboring triplets using MRF and BP. • Model subclasses of objects • How to model background? • Further experiments needed
