
Presentation Transcript


  1. Discriminative Relevance Feedback With Virtual Textual Representation For Efficient Image Retrieval Suman Karthik and C.V. Jawahar

  2. Introduction • Multimedia data is growing exponentially. • Cheap, high-quality digital imaging devices. • Sharing of multimedia data on the internet. • Content-based organization and retrieval is a viable way of accessing this data.

  3. What do users look for? • *In CBIR systems users look for ‘things’, not ‘stuff’. • More than global image properties. • Traditional object recognition won’t work. • Two choices: • Rely on text to identify objects. • Look at regions (objects or parts of objects). • The region-based image retrieval paradigm was successfully applied by Carson et al. in Blobworld [1999-2004] and Wang et al. in SIMPLIcity [2001]. *Chad Carson: Blobworld

  4. Practical CBIR systems • Practical large-scale deployment of CBIR systems requires: • Efficient indexing and retrieval of thousands of documents. • A flexible framework for retrieval based on various types of features, for example specialized features for highly specific sub-domains like faces, vehicles, or monuments. • The ability to scale up to millions of images without a significant performance trade-off.

  5. Lessons From Text Retrieval • Large-scale text retrieval systems have been successfully deployed. • Web search engines are familiar examples. • Efficient indexing and retrieval of millions of documents has been achieved. • The text retrieval frameworks are adaptive enough to be applied to specialized domains.

  6. Virtual Textual Representation • Images as text documents. • The color (YUV), compactness and location of a segment are used to encode the segment as text. [Pipeline diagram: Image → Segmentation → Segments → Transformation → Words → Text Document]
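
As a rough illustration of this encoding (the bin counts, feature ranges and word format below are assumptions made for the sketch, not the exact scheme in the paper), a segment's YUV color, compactness and position can be uniformly quantized and concatenated into a short string, so that an image becomes an ordinary text document of such words:

# Minimal sketch: turning one image segment into a "word" by uniform
# quantization of its features. Bin counts and feature ranges here are
# illustrative assumptions.
def segment_to_word(yy, uu, vv, compactness, px, py,
                    color_bins=8, shape_bins=4, pos_bins=4):
    def bin_index(value, low, high, bins):
        # Clamp, then map the value to one of `bins` uniform cells.
        value = min(max(value, low), high)
        return min(int((value - low) / (high - low) * bins), bins - 1)
    parts = [
        bin_index(yy, 0, 255, color_bins),            # luma Y
        bin_index(uu, 0, 255, color_bins),            # chroma U
        bin_index(vv, 0, 255, color_bins),            # chroma V
        bin_index(compactness, 0.0, 1.0, shape_bins), # segment compactness
        bin_index(px, 0.0, 1.0, pos_bins),            # normalized centroid x
        bin_index(py, 0.0, 1.0, pos_bins),            # normalized centroid y
    ]
    return "w" + "".join(str(p) for p in parts)

# An image is then simply the document of its segments' words.
segments = [(120, 90, 140, 0.7, 0.2, 0.3), (200, 128, 128, 0.4, 0.8, 0.6)]
image_document = " ".join(segment_to_word(*s) for s in segments)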

  7. Transformation [Diagram: the feature space, with axes for color, compactness and position, is quantized into bins; each bin is represented by a string or word.]

  8. Transformation • Quantization can be achieved in a number of ways. • Uniform vector-space quantization for data sets with a uniform feature-point distribution. • Density-based quantization of the feature space can be achieved with simple k-means quantization (see the sketch below). • Irrespective of the quantization applied, each cell in the vector space has a representative string. • Each image segment is assigned to a cell and takes on the representative string of that cell.
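
A minimal sketch of the density-based option, assuming scikit-learn's KMeans is available as the quantizer (the vocabulary size and word format are illustrative choices, not values from the paper):

# Density-based quantization of segment feature vectors with k-means.
# Each cluster acts as a cell of the feature space and receives a
# representative string ("word").
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(segment_features, n_words=256):
    # Cluster the observed feature vectors; each cluster index becomes a word.
    return KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(segment_features)

def segments_to_document(vocabulary, segment_features):
    # Assign each segment to its cell and emit that cell's representative string.
    cells = vocabulary.predict(segment_features)
    return " ".join("w%03d" % c for c in cells)

# Example with random stand-in features (color, compactness, position, ...):
features = np.random.rand(1000, 6)
vocabulary = build_vocabulary(features, n_words=64)
document = segments_to_document(vocabulary, features[:5])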

  9. Discriminative Relevance Feedback • Discriminative regions are given higher weight than representative regions. • Image segments that can differentiate between roses and other flowers are given higher weight with respect to the class roses. • Regions aiding classification rather than clustering are chosen. • Image segments containing humans are able to differentiate between ‘surfer’ and ‘wave’ images.
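
One way to realize this idea on top of the virtual textual representation, sketched here with a simple smoothed likelihood-ratio score rather than the paper's exact weighting, is to weight each word by how strongly it separates the images marked relevant from those marked irrelevant:

# Illustrative discriminative weighting from relevance feedback: a word
# (quantized segment) scores high if it occurs in many relevant images but
# few irrelevant ones, e.g. "human" segments separating surfers from waves.
from collections import Counter

def discriminative_weights(relevant_docs, irrelevant_docs, smoothing=1.0):
    # relevant_docs / irrelevant_docs: lists of documents, each a list of words.
    rel = Counter(w for doc in relevant_docs for w in set(doc))
    irr = Counter(w for doc in irrelevant_docs for w in set(doc))
    n_rel, n_irr = len(relevant_docs), len(irrelevant_docs)
    weights = {}
    for word in set(rel) | set(irr):
        p_rel = (rel[word] + smoothing) / (n_rel + 2 * smoothing)
        p_irr = (irr[word] + smoothing) / (n_irr + 2 * smoothing)
        weights[word] = p_rel / p_irr   # discriminative, not merely frequent
    return weights

weights = discriminative_weights(
    relevant_docs=[["w012", "w040"], ["w012", "w007"]],
    irrelevant_docs=[["w040", "w033"]],
)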

  10. Intuitive way of learning content • Over-segmentation and subsequent deduction of content can be achieved if the problem is modeled like this: [Pipeline diagram: Image → Segmentation → Segments → Transformation → Words → Text Document]

  11. Performance • Discriminative relevance feedback consistently outperformed the region-based importance method. • Shown are the precision figures for discriminative relevance feedback and for Bayesian region-importance relevance feedback.

  12. Indexing • CBIR systems usually use spatial databases to index and retrieve data. • Blobworld uses variants of R-trees. *[Megan Thomas, Chad Carson, Joseph M. Hellerstein. Creating a Customized Access Method for Blobworld. ICDE 2000] • Relevance feedback skews the feature space, rendering spatial databases inefficient. *[Peng et al. Kernel Indexing for Relevance Feedback Image Retrieval]

  13. Elastic Bucket Trie [Diagram: starting from an empty (null) trie, strings such as BBC, CAB and CBA are inserted; nodes branch on the characters A, B and C, and when a bucket overflows it is split into child nodes. A query (e.g. CAB) follows the trie down to the retrieved bucket.]
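
A toy sketch of the bucket-and-split behaviour in the diagram (a deliberate simplification of the Elastic Bucket Trie rather than the full data structure from the paper): words live in leaf buckets, an overflowing bucket is split on the next character, and a query walks the trie down to the retrieved bucket.

class BucketTrieNode:
    # Each node either holds a bucket of words (leaf) or children keyed by
    # the character at this node's depth.
    def __init__(self, capacity=2, depth=0):
        self.capacity = capacity
        self.depth = depth
        self.bucket = []       # leaf bucket; set to None once the node splits
        self.children = {}     # character -> BucketTrieNode

    def insert(self, word):
        if self.bucket is not None:                # still a leaf
            self.bucket.append(word)
            if len(self.bucket) > self.capacity:   # overflow -> split
                words, self.bucket = self.bucket, None
                for w in words:
                    self._child_for(w).insert(w)
        else:
            self._child_for(word).insert(word)

    def _child_for(self, word):
        ch = word[self.depth] if self.depth < len(word) else ""
        if ch not in self.children:
            self.children[ch] = BucketTrieNode(self.capacity, self.depth + 1)
        return self.children[ch]

    def query(self, word):
        # Follow the query word down the trie and return the bucket it reaches.
        if self.bucket is not None:
            return self.bucket
        ch = word[self.depth] if self.depth < len(word) else ""
        child = self.children.get(ch)
        return child.query(word) if child else []

root = BucketTrieNode(capacity=2)
for w in ["CAB", "CBA", "BBC", "BAC"]:
    root.insert(w)               # inserting BBC overflows the root bucket and splits it
candidates = root.query("CAB")   # the retrieved bucket: ["CAB", "CBA"]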

  14. Spatial Data Structures vs. EBT • Spatial data structures: • Become inefficient when used with relevance feedback. • Require costly arithmetic operations. • The number of splits of the spatial data structure is not fixed. • Strictly adhere to the spatial characteristics of the feature space. • Elastic Bucket Trie: • Not affected by relevance feedback schemes. • Requires only bit operations. • The number of splits of the EBT is limited. • The trie need not adhere to an underlying spatial structure, though that is also possible.* *Suman Karthik, C.V. Jawahar. Efficient Region Based Indexing and Retrieval for Images with Elastic Bucket Tries. ICPR 2006.

  15. Relevance feedback and EBT • Typical relevance feedback algorithms need to be modified to work with text. • Keywords emerge through relevance feedback, signifying associations between key segments. • The EBT can be used without any modifications with discriminative relevance feedback.
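
A small sketch of how the pieces fit together (the ranking rule below is an illustrative choice, not the paper's): candidate images retrieved from the index are ranked by the discriminative weights of the query words they share.

def rank_candidates(query_words, candidate_docs, weights):
    # candidate_docs: dict of image_id -> list of words (e.g. images whose
    # words fell into the same retrieved buckets as the query words).
    scores = {}
    for image_id, words in candidate_docs.items():
        shared = set(query_words) & set(words)
        scores[image_id] = sum(weights.get(w, 1.0) for w in shared)
    return sorted(scores, key=scores.get, reverse=True)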

  16. Bag of words • The scheme is very similar to contemporary bag-of-words approaches. • Interest-point-based bag-of-words approaches can also be adapted to work within our framework.* • Any type of vector quantization of the feature space used by these schemes can use the EBT. *R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman. Learning object categories from Google's image search. ICCV, 2005.
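
A sketch of such an adaptation, assuming OpenCV's SIFT as the interest-point descriptor and a k-means codebook as the quantizer (both stand-ins for whatever detector and vocabulary a system actually uses): local descriptors are quantized into visual words, and the resulting document can be indexed with exactly the same text machinery.

import cv2
import numpy as np

def image_to_visual_words(image_path, codebook):
    # codebook: a k-means model fit beforehand on a sample of descriptors,
    # e.g. KMeans(n_clusters=1000).fit(sample_descriptors)
    sift = cv2.SIFT_create()
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, descriptors = sift.detectAndCompute(image, None)
    if descriptors is None:
        return ""
    word_ids = codebook.predict(descriptors.astype(np.float64))
    return " ".join("v%04d" % i for i in word_ids)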

  17. Future Work • Usage of heterogeneous strings to describe an image. • Text encoding of images that is highly distinctive. • Text encoding of images that is robust to segmentation inaccuracies. • Text based image mining to discover concepts and their features.

  18. THE END
