This study explores efficient techniques for content-based image retrieval in large-scale dynamic image databases. It addresses the challenges of scalability and efficiency in retrieving images from dynamic databases and proposes novel methods for quantization and semantic indexing. The study also introduces relevance feedback methods to improve retrieval performance.
 
                
Efficient Image Retrieval Methods For Large Scale Dynamic Image Databases Suman Karthik 200407013 Advisor: Dr. C.V.Jawahar
Images • Cheap Imaging Hardware • Plummeting Storage costs • User Generated Content
Image Databases • Large Scale • Millions to billions of images • Dynamic • Highly dynamic in nature [Figure: number of images on Flickr from December 2005 to November 2007, in millions]
CBIR • Content Based IR • Uses image content • Pros • Good Quality • Annotation agnostic • Cons • Inefficient • Not scalable [Figure labels: shape, color, texture]
Bag Of Words • Feature Extraction: compute SIFT descriptors [Lowe '99] • Vector Quantization: descriptors mapped to visual words W • Index: inverted index over documents D1, D2, D3 • Semantic Indexing: pLSA [Hofmann, 2001] [Figure labels: plate diagram with d, z, w, N, D] * J. Sivic & Zisserman, 2003; Nistér & Stewénius, 2006; Philbin, Sivic, Zisserman et al., 2008
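The bag-of-words pipeline on this slide can be sketched in a few lines. This is a minimal illustration, not the implementation used in the talk: it assumes descriptors are already extracted (small NumPy arrays stand in for SIFT), that a codebook is given, and that ranking is a simple count of shared visual words. The names `quantize`, `build_inverted_index`, and `query` are hypothetical.

```python
from collections import Counter, defaultdict

import numpy as np


def quantize(descriptors, codebook):
    """Map each descriptor to the index of its nearest codebook centre."""
    # Pairwise squared distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_inverted_index(docs, codebook):
    """docs: {doc_id: descriptor array} -> {word: {doc_id: term frequency}}."""
    index = defaultdict(dict)
    for doc_id, desc in docs.items():
        words = quantize(desc, codebook).tolist()
        for word, tf in Counter(words).items():
            index[word][doc_id] = tf
    return index


def query(index, descriptors, codebook):
    """Rank documents by the number of visual-word occurrences shared with the query."""
    scores = Counter()
    words = quantize(descriptors, codebook).tolist()
    for word, qtf in Counter(words).items():
        for doc_id, tf in index.get(word, {}).items():
            scores[doc_id] += min(qtf, tf)
    return [doc_id for doc_id, _ in scores.most_common()]
```

Only the posting lists of the query's words are touched at query time, which is what makes the inverted index efficient.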
Dynamic Databases • Large scale • New images added continuously • High rate of change • Nature of data not known a priori [Figure labels: Internet, Videos, Images]
Text vs Images in Dynamic Databases • Text • Vocabulary known • Rate of change of vocabulary low • Stable vocabulary • Images • Vocabulary unknown • Rate of change of vocabulary high • Unstable vocabulary
Quantization and Semantic Indexing in Dynamic Databases • As the DB changes, the vocabulary becomes outdated • Updating the vocabulary is too costly • Not incremental • Cannot keep up with rate of change • As the DB changes, the semantic index becomes invalid • Updating the semantic index is resource intensive • Not incremental • Cannot keep up with rate of change or scale
Dynamic Databases • Quantization and semantic indexing methods are a bottleneck [Figure: Internet, videos, and images feed a dynamic database through feature extraction, vector quantization, indexing, and semantic indexing]
Objective 1 A. Motivation CBIR is inefficient and not scalable B. Objective Develop methods to improve efficiency and scalability of CBIR C. Contributions C 1.1 – Virtual Textual Representation C 1.2 – A new efficient indexing structure C 1.3 – Relevance feedback methods that improve performance
Objective 2 A. Motivation Quantization is a bottleneck for BoW when dealing with dynamic image databases B. Objective Develop an incremental quantization method for the BoW model to successfully deal with dynamic image databases C. Contributions C 2.1 – Incremental Vector Quantization C 2.2 – Comparison of retrieval performance with existing methods C 2.3 – Comparison of incremental quantization with existing methods
Objective 3 A. Motivation Semantic Indexing is not scalable for BoW when dealing with dynamic image databases B. Objective Develop incremental semantic indexing method for BoW model to successfully deal with dynamic image databases C. Contributions C 3.1 – Bipartite Graph Model C 3.2 – An algorithm for semantic indexing on BGM C 3.3 – Search engines for images
Literature • Global image retrieval • Region based image retrieval • Region based relevance feedback • Costly nearest neighbor based retrieval • Spatial indexing • Relevance feedback heavily used * Blobworld: A System for Region-Based Image Indexing and Retrieval, Chad Carson, Megan Thomas, Serge Belongie, Joseph M. Hellerstein, Jitendra Malik, in Third International Conference on Visual Information Systems, 1999 * Region-Based Relevance Feedback in Image Retrieval, Feng Jing, Mingjing Li, Hong-Jiang Zhang, Bo Zhang, Proc. IEEE International Symposium on Circuits and Systems, 2002 * Image Retrieval: Past, Present, and Future, Yong Rui, Thomas S. Huang, Shih-Fu Chang, in International Symposium on Multimedia Information Processing, 1997
Transformation • Feature space • Quantization • Bins represented by strings or words [Figure labels: axes Compactness, Position, Color]
Virtual Textual Representation • Transformation • Text: document, words • Image: segmentation, segments • Quantization • Uniform quantization (grid) • Density based quantization (k-means) • Each cell is a string
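As an illustration of "each cell is a string", the sketch below applies uniform (grid) quantization to a segment's feature vector and emits one letter per dimension. It assumes features are normalised to [0, 1] and that one letter per dimension is an acceptable encoding; both are assumptions made for this example, and `segment_to_word` and `image_to_document` are hypothetical names.

```python
import string


def segment_to_word(features, bins_per_dim=4):
    """Map a feature vector in [0, 1]^d to a string, one letter per dimension."""
    letters = string.ascii_uppercase
    if not 1 <= bins_per_dim <= len(letters):
        raise ValueError("bins_per_dim must be between 1 and 26")
    word = []
    for value in features:
        # Clamp so that value == 1.0 lands in the last bin and negatives in the first.
        cell = min(max(int(value * bins_per_dim), 0), bins_per_dim - 1)
        word.append(letters[cell])
    return "".join(word)


def image_to_document(segments, bins_per_dim=4):
    """Turn an image (a list of segment feature vectors) into a text 'document'."""
    return " ".join(segment_to_word(s, bins_per_dim) for s in segments)
```

Once images are documents of such words, an off-the-shelf text index can store and retrieve them.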
CBIR Indexing • Spatial Databases • Relevance feedback skews the feature space, rendering spatial databases inefficient*. * Indexing for Relevance Feedback Image Retrieval, Jing Peng, Douglas R. Heisterkamp, in Proceedings of the IEEE International Conference on Image Processing (ICIP'03)
Elastic Bucket Trie [Figure: trie whose nodes are keyed by the symbols A, B, C, with buckets at the leaves; labels: Null, Insert BBC, Query CAB, CBA, Nodes, Overflow, Split, Buckets, Retrieved Bucket]
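The slide shows the Elastic Bucket Trie only as a diagram, so the following is a rough sketch of one way such a structure could behave, not the author's definition. It assumes fixed-length words, leaf buckets with a fixed capacity, and a split on overflow that redistributes a bucket by the next symbol. The class name `EBTNode` and the capacity of 2 are hypothetical.

```python
class EBTNode:
    """Trie node that is either a leaf bucket or an internal node with children."""

    def __init__(self, depth=0, capacity=2):
        self.depth = depth
        self.capacity = capacity
        self.children = None   # None while this node is still a leaf bucket
        self.bucket = []       # (word, item) pairs held by a leaf

    def insert(self, word, item):
        if self.children is not None:
            self._child_for(word).insert(word, item)
            return
        self.bucket.append((word, item))
        # Overflow: split the bucket, unless the word has no symbols left.
        if len(self.bucket) > self.capacity and self.depth < len(word):
            self._split()

    def _child_for(self, word):
        symbol = word[self.depth]
        if symbol not in self.children:
            self.children[symbol] = EBTNode(self.depth + 1, self.capacity)
        return self.children[symbol]

    def _split(self):
        pending, self.bucket, self.children = self.bucket, [], {}
        for word, item in pending:
            self._child_for(word).insert(word, item)

    def query(self, word):
        """Descend along the word and return the items of the bucket reached."""
        node = self
        while node.children is not None:
            node = node.children.get(word[node.depth])
            if node is None:
                return []
        return [item for _, item in node.bucket]
```

A query returns the whole bucket it lands in, so words sharing a prefix with the query come back together as candidate neighbours.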
Relevance Feedback [Figure: loop between query, retrieved images, and relevance feedback]
Region importance based relevance feedback • Relevant images • Extracted words • Keyword selection • Keywords • Pseudo image for next iteration • Errors in retrieval
Discriminative Relevance Feedback • Classification is given precedence over clustering. • Discriminative segments become the keywords. • Non-discriminative segments are ignored. [Figure labels: FLOWERS, ROSES, SURFERS, WAVES]
Discriminative Relevance Feedback • Relevant images • Irrelevant images • Extracted words • Keyword selection • Keywords • Pseudo image for next iteration • No errors in retrieval
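The keyword selection step can be sketched as follows. The slides do not give the scoring rule, so this example assumes a simple one: a word is discriminative when it occurs in a larger fraction of relevant images than of irrelevant ones. The function name `select_keywords` and the `top_k` parameter are hypothetical.

```python
from collections import Counter


def select_keywords(relevant, irrelevant, top_k=3):
    """Pick words that separate relevant from irrelevant images.

    relevant, irrelevant: lists of images, each image a list of words.
    Assumed score: document frequency in relevant minus that in irrelevant.
    """
    rel_df = Counter(w for image in relevant for w in set(image))
    irr_df = Counter(w for image in irrelevant for w in set(image))
    scores = {}
    for word, count in rel_df.items():
        rel_rate = count / len(relevant)
        irr_rate = irr_df[word] / len(irrelevant) if irrelevant else 0.0
        if rel_rate > irr_rate:
            scores[word] = rel_rate - irr_rate
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]
```

The returned keywords would form the pseudo image submitted as the query in the next feedback iteration.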
Performance • Discriminative Relevance Feedback consistently outperforms the Region Based Importance method. [Figure labels: High F-score, Low F-score]
[Figure: comparison of this work with prior CBIR systems; labels: region based relevance feedback; Blobworld (no indexing); our work; non-spatial indexing; Simplicity (no indexing); global image retrieval; local image retrieval; early CBIR; spatial indexing; global relevance feedback or no relevance feedback]
Analysis • Relevance feedback algorithms need to be modified to work with text. • Keywords emerge with relevance feedback signifying association between key segments. • EBT can be used without any modifications with discriminative relevance feedback. • Advent of Bag of Words model for image retrieval
Literature • Kmeans • Hierarchical Kmeans • Kmeans, soft assignment • Time consuming offline quantization • Representative data available a priori • Quantization is not incremental * Video Google: A Text Retrieval Approach to Object Matching in Videos, Josef Sivic, Andrew Zisserman, ICCV 2003 * Scalable Recognition with a Vocabulary Tree, D. Nistér and H. Stewénius, CVPR 2006 * Lost in quantization: Improving particular object retrieval in large scale image databases, James Philbin, Ondrej Chum, Michael Isard, Josef Sivic, Andrew Zisserman, CVPR 2008
Quantization Losses • Perceptual loss • Under quantization • Synonymy • Poor precision • Binning loss • Over quantization • Polysemy • Poor recall
Incremental Vector Quantization • Control perceptual loss • Minimize binning loss • Create quality code books • Data dependent • Incremental in nature
Algorithm • Puts an upper bound on perceptual loss • Soft bin assignment: minimizes binning loss • Builds quality codebooks by ignoring noise • L: minimum cardinality of a cell [Figure labels: r, L = 2]
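The three properties on this slide can be illustrated with a small online quantizer. This is a sketch under stated assumptions, not the IVQ algorithm itself: it assumes each cell is a ball of radius r around its first point (bounding perceptual loss), that a point is assigned to every cell it falls in (soft assignment), and that cells with fewer than L points are treated as noise. The class name `IncrementalQuantizer` is hypothetical.

```python
import numpy as np


class IncrementalQuantizer:
    """Online quantizer with fixed-radius cells and a minimum cell cardinality."""

    def __init__(self, radius, min_cardinality=2):
        self.radius = radius                    # r: bound on distance to a cell centre
        self.min_cardinality = min_cardinality  # L: cells below this are ignored as noise
        self.centres = []                       # one centre per cell
        self.counts = []                        # points seen per cell

    def add(self, point):
        """Assign a point to every cell within the radius, or open a new cell."""
        point = np.asarray(point, dtype=float)
        hits = [i for i, centre in enumerate(self.centres)
                if np.linalg.norm(point - centre) <= self.radius]
        if not hits:
            self.centres.append(point)
            self.counts.append(1)
            return [len(self.centres) - 1]
        for i in hits:
            self.counts[i] += 1
        return hits

    def codebook(self):
        """Indices of cells that have reached the minimum cardinality."""
        return [i for i, n in enumerate(self.counts) if n >= self.min_cardinality]
```

Each insertion changes only the cells near the new point, so earlier assignments stay valid. The linear scan over centres is for brevity; an approximate nearest-neighbour structure such as LSH could replace it.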
An experiment • Given • All possible feature points in a feature space that could be generated by natural processes. • Quantize • K-means with a priori knowledge of the entire data • IVQ with no a priori information. • Performance • F-score • Time taken for incremental quantization
F-score • IVQ outperforms Kmeans • IVQ: 1115 bins • Kmeans: 1000 bins
Time • IVQ quantizes in 0.1 seconds • IVQ time complexity is linear • Kmeans takes 1000 seconds • Kmeans time complexity is exponential • IVQ outperforms Kmeans
Holiday Dataset • Datasets • Holiday dataset • 1491 images • 500 categories • Pre-processing • SIFT feature extraction. • quantization using k-means. • quantization using IVQ
Incremental Quantization • Datasets • ALOI dataset • 100,000 images • 1000 batches of 100 images each • Pre-processing • SIFT feature extraction. • quantization using k-means/online k-means. • quantization using IVQ [Table legend: S = seconds, D = days; batch = 100 images of the 100,000 image ALOI dataset, added sequentially]
Analysis • IVQ uses more bins than Kmeans (constant perceptual loss) • IVQ is efficient due to local changes • LSH used to accelerate IVQ • Semantic indexing can improve mAP
[Figure: taxonomy of quantization methods; labels: offline quantization, online quantization, density based, non density based, incremental, non incremental; methods: Kmeans, Online Kmeans, Regular Lattice, Adaptive Vocabulary Tree (global), IVQ (local)]
Semantic Indexing • Words clustered around latent topics • Visual words clustered around latent topics • Example topics: Animal (whippet, GSD, doberman), Flower (daffodil, tulip, rose) • P(w|d) • LSI, pLSA, LDA * Hofmann, 1999; Blei, Ng & Jordan, 2004; R. Lienhart and M. Slaney, 2007
Literature • Visual pLSA • Visual LDA • Spatial semantic indexing • High space complexity due to large matrix operations. • Slow, resource intensive offline processing. * Discovering Objects and Their Location in Images, Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, and Bill Freeman, ICCV 2005 * Image Retrieval on Large-Scale Image Databases, Eva Hörster, Rainer Lienhart, Malcolm Slaney, CIVR 2007 * Spatial Latent Dirichlet Allocation, X. Wang and E. Grimson, in Proceedings of Neural Information Processing Systems Conference (NIPS) 2007
Bipartite Graph Model • Cash Flow Algorithm • Vector space model is encoded as a bipartite graph of words and documents. • TF values retained as edge weights. • IDF values retained as term weights. [Figure: bipartite graph linking words w1 to w6 (subprime, reforms, war, Iraq, elections, democrats) to documents d1 to d5 (Financial Crisis, Bush Popularity, Saddam Captured, Iraq Pullout, Obama Elected), with numeric cash values on the edges]
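The slides describe cash flow only as propagating cash across the graph until a cut-off, so the sketch below fills in the details with assumptions: the query's words share an initial amount of cash, each word splits its cash among its documents in proportion to TF, each document keeps a fixed fraction as its score and forwards the rest to its other words in proportion to TF times IDF. The function name `cash_flow` and the parameters `cash`, `cutoff`, and `keep` are hypothetical.

```python
from collections import defaultdict


def cash_flow(word_docs, query_words, idf=None, cash=100.0, cutoff=1.0, keep=0.5):
    """Propagate cash over a word-document bipartite graph.

    word_docs: {word: {doc: tf}} with positive tf (TF values as edge weights).
    idf: optional {word: weight} (IDF values as term weights), default 1.0.
    Returns {doc: cash retained}, which serves as the relevance score.
    """
    idf = idf or {}
    doc_words = defaultdict(dict)
    for word, docs in word_docs.items():
        for doc, tf in docs.items():
            doc_words[doc][word] = tf

    known = [w for w in query_words if w in word_docs]
    scores = defaultdict(float)
    frontier = [(w, cash / len(known)) for w in known]
    while frontier:
        word, amount = frontier.pop()
        if amount < cutoff:
            continue  # cut-off: amounts this small are no longer propagated
        total_tf = sum(word_docs[word].values())
        for doc, tf in word_docs[word].items():
            share = amount * tf / total_tf
            scores[doc] += keep * share
            # Forward the remainder to the document's other words.
            weights = {w: t * idf.get(w, 1.0)
                       for w, t in doc_words[doc].items() if w != word}
            total_w = sum(weights.values())
            if total_w > 0:
                for w, weight in weights.items():
                    frontier.append((w, (1 - keep) * share * weight / total_w))
    return dict(scores)
```

Because every document keeps part of what it receives, the amounts shrink at each hop and the loop terminates. Documents that share no word with the query can still receive cash through intermediate words, which a plain inverted-index lookup would not return.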
BGM with BoW … • Feature extraction • Local detectors, SIFT • Vector quantization • K-means • BGM insertion • Words, Documents • TF • IDF
Why BGM is Superior? [Figure: results for a query image with words w1 and w2 under the inverted index and under Cash Flow; the Cash Flow panel also shows words w3, w4, and w5]
Naïve vs BGM • Datasets • 9000 images from Flickr. • 9 Sports Categories • 5 Animal Categories • Pre-processing • SIFT feature extraction. • quantization using k-means. • F-score • F = 2*(p*r)/(p+r), with precision p and recall r
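The F-score above can be computed for a single query as follows. The function name `f_score` and the convention of returning 0.0 when nothing relevant is retrieved are choices made for this example.

```python
def f_score(relevant, retrieved):
    """Harmonic mean of precision p and recall r: 2*(p*r)/(p+r)."""
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)
    if hits == 0:
        return 0.0  # avoids 0/0 when precision and recall are both zero
    p = hits / len(retrieved)   # fraction of retrieved images that are relevant
    r = hits / len(relevant)    # fraction of relevant images that were retrieved
    return 2 * (p * r) / (p + r)
```

For example, retrieving images 1, 2, and 5 when images 1 to 4 are relevant gives p = 2/3 and r = 1/2, so F = 4/7.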
BGM vs pLSA, IpLSA • Datasets • Holiday dataset • 1491 images • 500 categories • Pre-processing • SIFT feature extraction. • quantization using k-means. • pLSA • Cannot scale for large databases. • Cannot update incrementally. • Latent topic initialization difficult • Space complexity high • IpLSA • Cannot scale for large databases. • Cannot update new latent topics. • Latent topic initialization difficult • Space complexity high • BGM+Cashflow • Efficient • Low space complexity [Tables: results with number of concepts known and with number of concepts unknown]
Near Duplicate Retrieval • Dataset: 500,000 movie frames • SIFT vectors • Kmeans quantization • Indexed using text search library Ferret. • Efficient Indexing and retrieval • Effectively scalable to large data. • Query frame given as query to Ferret index. • Cash propagated to every node until cut-off.
Sample Retrieval [Figure: query frames and retrieved frames from Fastest Indian, Fight Club, and Harry Potter]
Analysis • Low index insert time for new images • Less than 200 seconds to insert 1000 images in a million image index • Marginally higher retrieval time • Due to multiple levels of graph traversal • Memory usage minimal • Works without knowing the number of concepts a priori • BGM is a hybrid model • Generative and discriminative