
Accounting for the relative importance of objects in image retrieval

BMVC 2010. Sung Ju Hwang and Kristen Grauman, University of Texas at Austin. Accounting for the relative importance of objects in image retrieval. Journal version: Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search.


Presentation Transcript


  1. BMVC 2010 Sung Ju Hwang and Kristen Grauman University of Texas at Austin Accounting for the relative importance of objects in image retrieval

  2. Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search International Journal of Computer Vision, 2011 Sung Ju Hwang and Kristen Grauman

  3. Relative importance of objects An image can contain many different objects, but some are more “important” than others. (Objects: architecture, sky, mountain, bird, cow, water)

  4. Relative importance of objects Some objects are background. (Objects: architecture, sky, mountain, bird, cow, water)

  5. Relative importance of objects Some objects are less salient. (Objects: architecture, sky, mountain, bird, cow, water)

  6. Relative importance of objects Some objects are more prominent, or perceptually define the scene. (Objects: architecture, sky, mountain, bird, cow, water)

  7. Our goal Goal: Retrieve those images that share important objects with the query image. versus How to learn a representation that accounts for this?

  8. Idea: image tags as importance cue The order in which a person assigns tags provides implicit cues about each object's importance to the scene. TAGS: Cow, Birds, Architecture, Water, Sky

  9. Approach overview: Building the image database Cow Grass Horse Grass … Car House Grass Sky Tagged training images Learn projections from each feature space into common “semantic space” Extract visual and tag-based features

  10. Approach overview: Retrieval from the database Untagged query image Retrieved images Cow Tree Grass Cow Tree Image database Tag list query Retrieved tag-list • Image-to-image retrieval • Image-to-tag auto annotation • Tag-to-image retrieval

  11. Visual features Visual Words capture local appearance (k-means on DoG+SIFT); Gist captures the total scene structure [Torralba et al.]; the Color Histogram captures the HSV color distribution.

  12. Tag features Traditional bag-of-(text)words. Word frequency (tag count): Cow 1, Bird 1, Water 1, Architecture 1, Mountain 1, Sky 1, Car 0, Person 0. Tag-list: Cow, Bird, Water, Architecture, Mountain, Sky.
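The traditional bag-of-words tag feature can be sketched as follows (a minimal illustration using the slide's example vocabulary; tag order carries no information here):

```python
def bag_of_words(tag_list, vocabulary):
    """Count how often each vocabulary word appears in the tag-list."""
    counts = {word: 0 for word in vocabulary}
    for tag in tag_list:
        if tag in counts:
            counts[tag] += 1
    return counts

vocab = ["Cow", "Bird", "Water", "Architecture", "Mountain", "Sky", "Car", "Person"]
feats = bag_of_words(["Cow", "Bird", "Water", "Architecture", "Mountain", "Sky"], vocab)
# tagged words get count 1, absent words (Car, Person) get 0
```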

  13. Tag features Absolute rank in this image’s tag-list. Word values: Cow 1, Bird 0.63, Water 0.50, Architecture 0.43, Mountain 0.39, Sky 0.36, Car 0, Person 0. Tag-list: Cow, Bird, Water, Architecture, Mountain, Sky.
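The values on this slide are consistent with a 1/log2(rank+1) discount (rank 2 → 0.63, rank 3 → 0.50, rank 4 → 0.43, …), so the absolute-rank feature could be sketched as follows; the exact discount formula is inferred from the listed numbers, not stated on the slide:

```python
import math

def absolute_rank_features(tag_list, vocabulary):
    """Score each tag by its position in the tag-list with a
    logarithmic discount; words not tagged at all get 0."""
    feats = {word: 0.0 for word in vocabulary}
    for rank, tag in enumerate(tag_list, start=1):
        if tag in feats:
            feats[tag] = 1.0 / math.log2(rank + 1)
    return feats

vocab = ["Cow", "Bird", "Water", "Architecture", "Mountain", "Sky", "Car", "Person"]
feats = absolute_rank_features(
    ["Cow", "Bird", "Water", "Architecture", "Mountain", "Sky"], vocab
)
# reproduces the slide's values: Cow 1.0, Bird 0.63, Water 0.50, ...
```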

  14. Tag features Relative rank: the percentile rank obtained from the rank distribution of that word in all tag-lists. Word values: Cow 0.9, Bird 0.6, Water 0.8, Architecture 0.5, Mountain 0.8, Sky 0.8, Car 0, Person 0. Tag-list: Cow, Bird, Water, Architecture, Mountain, Sky.

  15. Learning mappings to semantic space Canonical Correlation Analysis (CCA): choose projection directions that maximize the correlation of views projected from the same instance. View 1 and View 2 map into the semantic space: a new common feature space.

  16. Kernel Canonical Correlation Analysis [Akaho 2001, Fyfe et al. 2001, Hardoon et al. 2004] Linear CCA: given paired data \{(x_i, y_i)\}, select directions w_x, w_y so as to maximize the correlation \rho = \frac{w_x^\top C_{xy} w_y}{\sqrt{(w_x^\top C_{xx} w_x)\,(w_y^\top C_{yy} w_y)}}. Kernel CCA: given a pair of kernel functions k_x, k_y, the objective is the same, but the projections live in kernel space: f_x(x) = \sum_i \alpha_i k_x(x_i, x), f_y(y) = \sum_i \beta_i k_y(y_i, y).
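The linear CCA objective above can be solved in closed form by whitening each view and taking the top singular pair of the whitened cross-covariance. A minimal NumPy sketch of that (linear, not kernelized) solution:

```python
import numpy as np

def cca_first_directions(X, Y, reg=1e-6):
    """First pair of canonical directions for paired views
    X (n x dx) and Y (n x dy). Returns (w_x, w_y, rho) where
    rho is the first canonical correlation."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])  # small ridge for stability
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, 0], Wy @ Vt[0], s[0]

# toy paired data: two views generated from a shared latent signal
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 3))
X = z @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(200, 8))
Y = z @ rng.normal(size=(3, 5)) + 0.1 * rng.normal(size=(200, 5))
w_x, w_y, rho = cca_first_directions(X, Y)
# rho should be close to 1, since the two views share a strong latent signal
```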

  17. Recap: Building the image database Visual feature space tag feature space Semantic space

  18. Experiments We compare the retrieval performance of our method (the KCCA semantic space) with two baselines: a Visual-Only baseline and a Words+Visual baseline [Hardoon et al. 2004, Yakhnenko et al. 2009]. (Figure: each query image shown with its 1st retrieved image.)

  19. Evaluation We use Normalized Discounted Cumulative Gain at top K (NDCG@K) to evaluate retrieval performance [Kekäläinen & Järvelin, 2002]: each of the top K retrieved examples contributes a reward term (the score for the pth example) discounted by its rank, normalized by the sum of all the scores under an ideal ranking, so doing well in the top ranks is more important.
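One common form of NDCG@K is sketched below; the paper's specific reward term (the per-example score s(p)) plugs into this same template:

```python
import math

def ndcg_at_k(scores, k):
    """NDCG@K: discounted cumulative gain of the returned ranking,
    normalized by the gain of the ideal (sorted) ranking, so that
    high scores early in the list count the most."""
    dcg = sum(s / math.log2(p + 2) for p, s in enumerate(scores[:k]))
    ideal = sorted(scores, reverse=True)[:k]
    idcg = sum(s / math.log2(p + 2) for p, s in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

A perfectly ordered list scores 1.0; the same rewards in reverse order score strictly less, reflecting the emphasis on the top ranks.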

  20. Evaluation We present the NDCG@K score using two different reward terms: object presence/scale, which rewards similarity between the query’s objects and scales and those in the retrieved image(s); and ordered tag similarity, which rewards similarity between the query’s ground-truth tag ranks (absolute and relative) and those in the retrieved image(s). (Example tag-lists: Cow, Tree, Grass, Person vs. Cow, Tree, Fence, Grass.)

  21. Dataset LabelMe: 6352 images (database: 3799 images, query: 2553 images), ~23 tags/image. PASCAL: 9963 images (database: 5011 images, query: 4952 images), ~5.5 tags/image.

  22. Image-to-image retrieval We want to retrieve images most similar to the given query image in terms of object importance. Visual kernel space Tag-list kernel space Untagged query image Image database Retrieved images
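Once the projections are learned, retrieval reduces to nearest-neighbor search in the semantic space. A sketch (the projection matrix would come from the learned (K)CCA mapping, and cosine similarity is an assumed choice of ranking function):

```python
import numpy as np

def retrieve_top_k(query_visual, visual_projection, db_semantic, k=5):
    """Project the untagged query's visual features into the semantic
    space and rank database images by cosine similarity."""
    q = query_visual @ visual_projection
    q = q / np.linalg.norm(q)
    db = db_semantic / np.linalg.norm(db_semantic, axis=1, keepdims=True)
    return np.argsort(-(db @ q))[:k]

# toy check: identity projection, query aligned with database row 1
P = np.eye(3)
db = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
top = retrieve_top_k(np.array([0.0, 1.0, 0.0]), P, db, k=2)
```

Because only the visual projection is needed at query time, the query image requires no tags, which is what makes the untagged-query scenario on this slide possible.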

  23. Image-to-image retrieval results Query Image Visual only Words + Visual Our method

  24. Image-to-image retrieval results Query Image Visual only Words + Visual Our method

  25. Image-to-image retrieval results Our method better retrieves images that share the query’s important objects, by both measures: retrieval accuracy measured by object+scale similarity, and retrieval accuracy measured by ordered tag-list similarity (39% improvement).

  26. Tag-to-image retrieval We want to retrieve the images that are best described by the given tag list Visual kernel space Tag-list kernel space Cow Person Tree Grass Image database Retrieved images Query tags

  27. Tag-to-image retrieval results Our method better respects the importance cues implied by the user’s keyword query. 31% improvement

  28. Image-to-tag auto annotation We want to annotate query image with ordered tags that best describe the scene. Visual kernel space Tag-list kernel space Untagged query image Cow Tree Grass Field Cow Fence Cow Grass Image database Output tag-lists
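A simple way to realize the auto-annotation step is k-nearest-neighbor tag transfer in the semantic space. The aggregation rule below (summing 1/position votes across neighbor tag-lists) is an assumption for illustration, not the paper's stated method:

```python
from collections import defaultdict

def knn_tag_transfer(neighbor_tag_lists):
    """Aggregate the ordered tag-lists of the k nearest database
    images: each tag is scored by summing 1/position votes, so tags
    that appear early and often among the neighbors rank first."""
    votes = defaultdict(float)
    for tag_list in neighbor_tag_lists:
        for pos, tag in enumerate(tag_list, start=1):
            votes[tag] += 1.0 / pos
    return sorted(votes, key=votes.get, reverse=True)

# hypothetical tag-lists of the query's 3 nearest neighbors
neighbors = [["Cow", "Tree", "Grass"], ["Cow", "Fence"], ["Cow", "Grass"]]
ranked = knn_tag_transfer(neighbors)  # "Cow" is voted first by all three
```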

  29. Image-to-tag auto annotation results (Example output tag-lists: Tree, Boat, Grass, Water, Person; Boat, Person, Water, Sky, Rock; Person, Tree, Car, Chair, Window; Bottle, Knife, Napkin, Light, Fork. k = number of nearest neighbors used.)

  30. Thank you
