
Image Ranking and Retrieval based on Multi-Attribute Queries CVPR2011


Presentation Transcript


  1. Image Ranking and Retrieval based on Multi-Attribute Queries, CVPR 2011 • Behjat Siddiquie (1), Rogerio S. Feris (2), Larry S. Davis (1) • (1) University of Maryland, College Park • (2) IBM T. J. Watson Research Center

  2. Outline • 1. Introduction • 2. Multi Attribute Retrieval and Ranking • 3. Experiments and Results • 4. Conclusion

  3. Outline • 1. Introduction

  4. Introduction • Attributes are correlated: a person who has a mustache is almost certainly male, and a person who is Asian is unlikely to have blonde hair • A new framework for multi-attribute image retrieval and ranking, which retrieves images based not only on the words in the query, but also on the remaining attributes in the vocabulary that could provide information about the query

  5. Introduction • There are three key contributions: • 1. Ranking and retrieval are handled within the same formulation • 2. Training and retrieval use multi-label queries; this is non-trivial, as the number of possible multi-label queries for a vocabulary of size L is 2^L • 3. Attributes within a single object category, and even across multiple object categories, are shown to be interdependent
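To make the size of the query space concrete, the short sketch below enumerates every subset of a toy vocabulary and evaluates 2^L for L = 27, the number of attributes annotated on LFW later in the talk. The vocabulary entries and the helper name `all_queries` are illustrative assumptions, not part of the presentation.

```python
from itertools import combinations

def all_queries(vocabulary):
    """Yield every subset of the vocabulary (the empty query included)."""
    items = list(vocabulary)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield set(subset)

# Hypothetical three-word vocabulary, small enough to enumerate by hand.
vocab = ["male", "beard", "eyeglasses"]
queries = list(all_queries(vocab))

# With 27 attributes, enumeration is already impractical.
num_queries_27 = 2 ** 27

print(len(queries))      # 2**3 subsets
print(num_queries_27)
```

Training one model per query is therefore not feasible, which is why a single model that scores arbitrary queries is needed.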

  6. Outline • 2. Multi Attribute Retrieval and Ranking - 2.1. Retrieval - 2.2. Ranking

  7. Retrieval • Given a set of labels X and a set of training images Y • For each label x_i (x_i ∈ X), a mapping is learned to predict the set of images y (y ⊂ Y) that contain the label x_i • Given a multi-attribute query Q, where Q ⊂ X, the goal is to retrieve the images from the set Y that are relevant to Q

  8. Retrieval • The prediction function f_w : Q → y returns the set y* ⊂ Y that maximizes the score under the weight vector w • w is composed of two components: • w^a, for modeling the appearance of individual attributes • w^p, for modeling the dependencies between them

  9. Retrieval • φ_a(x_i, y_k): the feature vector representing image y_k for attribute x_i • φ_p(x_j, y_k): indicates the presence of attribute x_j in image y_k • w_i^a: a standard linear model for recognizing attribute x_i based on the feature representation φ_a(x_i, y_k) • w_ij^p: a potential function encoding the correlation between the pair of attributes x_i and x_j
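The sketch below shows how the two components could combine into a retrieval score. It assumes the score of a set of images is a sum of per-image terms: an appearance term (a linear model on the feature vector) for each query attribute, plus a pairwise term linking each query attribute to every attribute in the vocabulary. Under that assumption the best set is simply every image with a positive per-image score. The function names, dictionary layout, and toy numbers are all hypothetical.

```python
def image_score(query, k, vocabulary, w_a, w_p, phi_a, phi_p):
    """Contribution of image k to the score of any set that contains it."""
    score = 0.0
    for i in query:
        # Appearance term: linear model for attribute i on image k's features.
        score += sum(w * f for w, f in zip(w_a[i], phi_a[i][k]))
        # Pairwise term: correlation between attribute i and each attribute j.
        for j in vocabulary:
            score += w_p[i][j] * phi_p[j][k]
    return score

def retrieve(query, images, vocabulary, w_a, w_p, phi_a, phi_p):
    """With an additive score, the maximizing set is all positive-score images."""
    return {k for k in images
            if image_score(query, k, vocabulary, w_a, w_p, phi_a, phi_p) > 0}

# Toy data (assumed values); only entries for the query attribute are listed.
vocabulary = ["male", "mustache"]
images = [0, 1, 2]
w_a = {"mustache": [0.5, 0.5]}
w_p = {"mustache": {"male": 0.5, "mustache": 0.0}}
phi_a = {"mustache": {0: [1.0, 0.5], 1: [-1.0, -0.5], 2: [-0.2, 0.1]}}
phi_p = {"male": {0: 1, 1: 0, 2: 1}, "mustache": {0: 1, 1: 0, 2: 0}}

retrieved = retrieve(["mustache"], images, vocabulary, w_a, w_p, phi_a, phi_p)
print(retrieved)
```

In this toy setup image 2 has a slightly negative appearance score for "mustache" but is still retrieved because the correlated attribute "male" is present, which is the effect the pairwise weights are meant to capture.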

  10. Retrieval • Train a model w which, given a multi-label query Q ⊂ X, can correctly predict the subset of images in a test set that contain all the labels x_i ∈ Q • C: a parameter controlling the trade-off between the training error and regularization • 𝒬 (with Q_t ∈ 𝒬): the set of training queries • ξ_t: the slack variable corresponding to query Q_t • Δ(y_t*, y_t): the loss function
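The symbols listed on this slide match the usual structured max-margin (structural SVM) training problem. A sketch of that generic form is given below as an assumption about what the slide's objective looks like; the joint feature map ψ(Q, y), standing for the combined appearance and pairwise features of a query and an image set, is notation introduced here for illustration.

```latex
\begin{aligned}
\min_{w,\,\xi}\quad & w^{\top} w + C \sum_{t} \xi_t \\
\text{s.t.}\quad & w^{\top}\psi(Q_t, y_t^{*}) - w^{\top}\psi(Q_t, y_t)
  \;\ge\; \Delta(y_t^{*}, y_t) - \xi_t,
  \qquad \xi_t \ge 0, \quad \forall t,\ \forall y_t \subset \mathcal{Y}
\end{aligned}
```

Each constraint asks the correct set y_t* to outscore any other set y_t by a margin that grows with the loss Δ(y_t*, y_t), with ξ_t absorbing violations.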

  11. Retrieval • Δ(y_t*, y_t): the loss function can be chosen so that training optimizes different performance metrics
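As an illustration of metric-specific losses, the sketch below defines three set-based losses between a predicted set and the correct set: one based on precision, one on recall, and one on Hamming distance. These are common choices and are assumptions here; the function names and the conventions for empty sets are hypothetical.

```python
def precision_loss(predicted, relevant):
    """1 - precision. Assumed convention: an empty prediction costs 1
    unless the relevant set is also empty."""
    if not predicted:
        return 0.0 if not relevant else 1.0
    return 1.0 - len(predicted & relevant) / len(predicted)

def recall_loss(predicted, relevant):
    """1 - recall. Assumed convention: no relevant images means zero loss."""
    if not relevant:
        return 0.0
    return 1.0 - len(predicted & relevant) / len(relevant)

def hamming_loss(predicted, relevant, all_images):
    """Fraction of images whose membership differs between the two sets."""
    return len(predicted ^ relevant) / len(all_images)

all_images = {1, 2, 3, 4, 5, 6}
predicted = {1, 2, 3}
relevant = {2, 3, 4}

p_loss = precision_loss(predicted, relevant)
r_loss = recall_loss(predicted, relevant)
h_loss = hamming_loss(predicted, relevant, all_images)
print(p_loss, r_loss, h_loss)
```

Swapping the loss changes which errors the margin penalizes most: the precision-based loss punishes spurious retrievals, while the recall-based loss punishes missed images.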

  12. Ranking • The output of the prediction function f_w : Q → z is a permutation z* of the set of images Y • π(Y) is the set of all possible permutations of the set of images Y

  13. Ranking • A(r): any non-increasing function • r(z_k): the rank of image z_k • If we care only about the ranks of the top K images, A(r) can be chosen so that only the top K ranks contribute
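One non-increasing choice that ignores everything below rank K is A(r) = max(K + 1 - r, 0), with ranks starting at 1. The sketch below uses this form as an assumption to show the behaviour; the function name `top_k_weight` is hypothetical.

```python
def top_k_weight(rank, k):
    """Assumed form A(r) = max(K + 1 - r, 0) for a 1-based rank r."""
    return max(k + 1 - rank, 0)

# Weights for ranks 1..5 when only the top 3 positions matter.
weights = [top_k_weight(r, 3) for r in range(1, 6)]
print(weights)  # [3, 2, 1, 0, 0]
```

Images at ranks 4 and 5 receive weight 0, so their ordering has no effect on the score, while the top position carries the largest weight.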

  14. Ranking • Given a query Q, we divide the training images into |Q| + 1 sets based on their relevance. The most relevant set consists of images that contain all the attributes in the query Q; these are assigned a relevance rel(j) = |Q| • Example query: young Asian woman wearing sunglasses • Here |Q| = 4, so rel(j) ranges from 0 to 4

  15. Ranking • This ensures that, if no image contains all the query attributes, the images that contain the largest number of them are ranked highest • While we have assigned equal weights to all the attributes, one could assign - higher weights to attributes that are difficult to modify (race or gender) - lower weights to attributes that are easily changed (wearing sunglasses)
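The relevance levels described above can be sketched as a count of how many query attributes an image contains, optionally weighted per attribute. The attribute names, image identifiers, and the `weights` parameter below are illustrative assumptions.

```python
def relevance(query, image_attributes, weights=None):
    """Sum the weights of the query attributes present in the image.
    With no weights given, every attribute counts 1, so the result is
    the number of matched attributes (0 to |Q|)."""
    weights = weights or {}
    return sum(weights.get(a, 1) for a in query if a in image_attributes)

query = ["young", "asian", "woman", "sunglasses"]

# Hypothetical annotated images.
images = {
    "img_a": {"young", "asian", "woman", "sunglasses"},
    "img_b": {"young", "woman", "sunglasses"},
    "img_c": {"senior", "male"},
}

rels = {name: relevance(query, attrs) for name, attrs in images.items()}
ranking = sorted(images, key=lambda name: rels[name], reverse=True)
print(rels)
print(ranking)
```

If `img_a` were removed, `img_b` would lead the ranking because it matches the largest number of query attributes. Passing something like `weights={"woman": 2}` would implement the idea of favouring attributes that are hard to modify.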

  16. Ranking • A max-margin framework is used for training the ranking model • Δ(z*, z) is a function denoting the loss incurred in predicting the permutation z instead of the correct permutation z*

  17. Ranking • Δ(z*, z) = 1 - NDCG@100(z*, z) • The normalized discounted cumulative gain (NDCG) score is a standard measure used for evaluating ranking algorithms • rel(j): the relevance of the j-th ranked image • Z: a normalization constant that ensures the correct ranking results in an NDCG score of 1
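A minimal sketch of the loss 1 - NDCG@k follows, assuming the widely used gain (2^rel - 1) and a logarithmic position discount. The base of the logarithm cancels in the normalization, so base 2 is used here. The function names and the convention for rankings with no relevant images are assumptions.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k entries (ranks start at 1)."""
    return sum((2 ** rel - 1) / math.log2(1 + j)
               for j, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal ordering (the constant Z)."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 1.0  # assumed convention: nothing relevant, any order is fine
    return dcg_at_k(relevances, k) / ideal

def ndcg_loss(relevances, k=100):
    """Loss of a predicted ranking, given rel(j) for each ranked position."""
    return 1.0 - ndcg_at_k(relevances, k)

perfect = [4, 3, 0]   # relevances listed in predicted rank order
reversed_order = [0, 3, 4]

print(ndcg_loss(perfect))
print(ndcg_loss(reversed_order))
```

The correctly ordered list gives a loss of 0, and any ordering that places less relevant images first gives a strictly positive loss.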

  18. Outline • 3. Experiments and Results - 3.1. Evaluation - 3.2. Labeled Faces in the Wild (LFW) - 3.3. FaceTracer Dataset - 3.4. PASCAL

  19. Evaluation • Retrieval: - 1. Reverse Multi-Label Learning (RMLL) [19] - 2. TagProp [9] • References: [19] J. Petterson and T. S. Caetano. Reverse multi-label learning. NIPS, 2010. [9] M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid. Discriminative metric learning in nearest neighbor models for image auto-annotation. ICCV, 2009.

  20. Evaluation • Ranking: - 1. rankSVM [12] - 2. rankBoost [7] - 3. Direct Optimization of Ranking Measures (DORM) [18] - 4. TagProp [9] • References: [12] T. Joachims. Optimizing search engines using clickthrough data. KDD, 2002. [7] Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. JMLR, 2003. [18] Q. V. Le and A. J. Smola. Direct optimization of ranking measures. http://arxiv.org/abs/0704.3359, 2007. [9] M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid. Discriminative metric learning in nearest neighbor models for image auto-annotation. ICCV, 2009.

  21. Evaluation • Datasets: - 1. Labeled Faces in the Wild (LFW) [11] - 2. FaceTracer [15] - 3. PASCAL VOC 2008 [4] • References: [11] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical report, 2007. [15] N. Kumar, P. Belhumeur, and S. Nayar. FaceTracer: A search engine for large collections of images with faces. ECCV, 2008. [4] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2008 (VOC2008) Results.

  22. Labeled Faces in the Wild (LFW) • A subset of 9992 images from LFW was annotated with a set of 27 attributes (Table 1). We randomly chose 50% of these images for training and used the rest for testing

  23. Labeled Faces in the Wild (LFW) • The attribute detector for hat or bald will give higher weights to features extracted from the topmost grid cells in the horizontal parts and layout configurations

  24. Retrieval Performance on the LFW dataset

  25. Ranking Performance on the LFW dataset

  26. Mutually exclusive: (White, Asian), (Eyeglasses, No-Eyewear), (Short-Hair, Long-Hair) • Rarely co-occur: (Kid, Beard), (Lipstick, Male) • Commonly co-occur: (Middle-aged, Eyeglasses), (Senior, Gray-Hair)

  27. Ranking Performance on the FaceTracer dataset • FaceTracer contains many more images of babies and small children than LFW

  28. Retrieval Performance on the PASCAL dataset

  29. Ranking Performance on the PASCAL dataset

  30. Outline • 4. Conclusion

  31. Conclusion • Presented an approach for ranking and retrieval of images based on multi-attribute queries. A structured prediction framework integrates ranking and retrieval within the same formulation • Furthermore, the approach models the correlations between different attributes, leading to improved ranking and retrieval performance

  32. In future • We plan to explore image retrieval for more complex queries, such as scene descriptions consisting of the objects present along with their attributes and the relationships among them
