180 likes | 263 Views
This project report details the implementation of a Vocabulary Tree and Inverted Index for scalable image matching, using SURF features and library choices such as VLFeat and OpenSURF. Methods, challenges, and schedule for compression and analysis are also discussed.
E N D
Scalable Image Matching Mid Project Report David Strickland ENGN 256 Spring 2013
Reference Paper: Inverted Index Compression for Scalable Image Matching • Chen, D.M.; Tsai, S.S.; Chandrasekhar, V.; Takacs, G.; Vedantham, R.; Grzeszczuk, R.; Girod, B., "Inverted Index Compression for Scalable Image Matching," Data Compression Conference (DCC), 2010 , vol., no., pp.525,525, 24-26 March 2010
Vocabulary Tree + Inverted Index • The Vocabulary Tree is a tree-structured vector quantizer constructed by hierarchical k-means clustering of feature descriptors. • Inverted Index: Each node has two lists • Image IDs • Array of counts 1Image from Chen et al.
Process Recap • Detect Features • Extract Feature Locations and Descriptors • Quantize Descriptors into a Vocabulary Tree • Score Database Images using Inverted Index • Pairwise Match on top scoring Images
Schedule I: VT/II Implementation • Week 1: Research Vocabulary Tree / Inverted Index, Determine which libraries to use • Week 2: Implement Feature Locator/Descriptors • Week 3: Implement Quantization of Descriptors in VT • Week 4: Implement Database scoring scheme using Inverted Index • Week 5: Milestone: Mid Project Presentation, Combine Previous parts, Pairwise Match to retrieve a single image
Library Choices • VLFeat • Includes hierarchical integer k means methods • VLFeat is available for MATLAB or C • Also includes SIFT and dense SIFT feature detection • FreeImage • C++ library that handles image input/output • OpenSURF • C++ SURF implementation • OpenCV • Required by OpenSURF • Dirent.h • Provides POSIX bindings for windows C++, useful for file I/O
Language Choice: MATLAB or C++ • MATLAB • + I/O is simple • + Integration is easy • + VLFeat tutorials are all for the MATLAB bindings • + Data is easy to handle, array manipulation is simple • - Proprietary, would require MATLAB to use/modify • - Not as fast as C++ • C++ • + No license required • + Extremely fast, as you control everything • - Integration is difficult (different data structure schemes) • - You have to control everything • Language Choice: C++ • Builds character
Feature Detection • SURF • C++: OpenSURF • Requires OpenCV, which handles the image I/O • SURF features used in the paper (Chen et al.) • SIFT • Dense SIFT feature detection • Included in VLFeat library • Same # of features for every image, but large # of features • Analysis/Comparison of DTII results when using SURF features vs DSIFT features would provide new/useful information
Pairwise Image Matching • Simple Nearest Neighbor algorithms can be applied to the features of the two images to calculate the number of matching images • Whichever image has the highest fraction of matches is the best match • Methods for this are included in some of the libraries • The Dictionary Tree + Inverted Index scoring method will produce the similarity scores for all the database images • The highest scoring images can then be scored using pairwise matching to find the best image match
Dictionary Tree Creation • HIKM – hierarchical k means clustering is done via VLFeat’s library • All the feature descriptors are used to create hierarchical clusters • Creates the dictionary tree, each node is a cluster center 1Image from http://www.vlfeat.org/overview/hikm.html
Inverted Index Creation • Inverted Index • The descriptors of each feature of each image traverse the tree to build the inverted index • Each leaf node visited adds to the associated image’s array of counts • Each leaf node has its own inverted index 1Image from Chen et al.
Similarity Scoring & Memory Usage • When matching an image, each image ik1 in the database of N images is given a similarity score • For each node visited by query descriptors the node’s inverted list of images all have the scores incremented : Where:
Current Progress • Implemented: • Image I/O, SURF feature detection • Read/Write SURF features to file • HIKM (Dictionary Tree) Creation • Inverted Index Creation • Image Scoring Methods • Pairwise matching for SURF method • Combined everything together for Image Matching via DTII • To Do: • Read/Write HIKM tree to file • Read/Write Inverted Index to file • Compression • Add additional feature types (e.g. dsift) + analysis
Challenges • File I/O needed to be handled manually in C++ • Different libraries format data differently • Very little documentation available for some libraries • Dictionary Tree & Inverted Index creation times are large • Finding SURF features for every image is time consuming • DT + II is very large, memory management is important • Saving Information (variables etc.) to disk is non-trivial
Schedule II: Compression & Analysis • Week 6: Inverted Index Image ID storage • Week 7: DSIFT feature version of the DT+II • Week 8: Soft Binned Tree, Analysis • Week 9: Final Project Presentation
Inverted Index Compression • Encode each inverted index’s Image IDs by consecutive differences • Inverted index compression techniques can significantly reduce memory usage by up to 5X without any loss in recognition accuracy • Reorder database to minimize differences • Minimize:
Soft-binned feature descriptor histograms • Classify a feature descriptor to k nearest tree nodes instead of just nearest tree node • Soft-binned tree gives improvement in classification accuracy • Disadvantage: • Each database feature now appears in k different inverted lists • Results in larger lists
References • 1David M. Chen, Sam S. Tsai, Vijay Chandrasekhar, Gabriel Takacs, Ramakrishna Vedantham, Radek Grzeszczuk, Bernd Girod, “Inverted Index Compression for Scalable Image Matching”