Enhancing Object Discovery via Contextual Object Graphs in Unlabeled Images
This research explores innovative techniques for discovering unfamiliar objects within unlabeled images by combining contextual cues from known categories with appearance descriptors. The proposed approach utilizes a novel object-graph descriptor to model spatial layouts and enhance category discovery amidst intra-category variations and occlusions. By performing multiple segmentations and employing unsupervised learning methods, the study aims to improve the accuracy of identifying unknown objects based on their context in relation to familiar categories, paving the way for advancements in visual recognition.
Presentation Transcript
Object-Graphs for Context-Aware Visual Category Discovery Cheng-Ming Chiang Advisor: Sheng-Jyh Wang 2012/7/9 Reference: Y. J. Lee and K. Grauman, "Object-Graphs for Context-Aware Visual Category Discovery," PAMI, 2012.
Outline • Introduction • Related Work • Approach • Results • Conclusion and Future Work • Reference
Introduction • How to discover unfamiliar objects in unlabeled images? • Unsupervised visual category discovery • Existing unsupervised techniques usually use appearance alone to detect visual themes, so they may struggle with • Occluded objects • Large intra-category variations • Low-resolution data
Introduction • A new idea: How could visual discovery benefit from familiar objects? • Model the interaction between a set of detected known categories and the unknown to-be-discovered categories • Object-level context cues + Appearance descriptors • Introduce a novel object-graph descriptor to encode the 2D and 3D spatial layout
Related Work • State-of-the-art discovery method, appearance alone • Reference: B. C. Russell, W. T. Freeman, A. A. Efros, J. Sivic, and A. Zisserman, "Using Multiple Segmentations to Discover Objects and their Extent in Image Collections," CVPR, 2006
Related Work • Is it possible to learn visual object classes and their segmentations simply from looking at images? • Challenges: • How to recognize visually similar objects? • How to segment them from their background? • In fact, both object recognition and image segmentation can be thought of as parts of one large grouping problem • Projecting groups onto a particular image gives segmentation • Projecting groups onto the image index gives recognition
Related Work • Algorithm: given a large, unlabeled collection of images • For each image in the collection, compute multiple candidate segmentations • For each segment in each segmentation, compute a histogram of “visual words” • Perform topic discovery on the set of all segments in the image collection (using Latent Dirichlet Allocation) • For each discovered topic, sort all segments by how well they are explained by this topic
Related Work • Generating multiple segmentations • Produce enough segmentations to have a high chance of obtaining “good” segments that contain potential objects • Obtaining visual words • Compute SIFT descriptors for each image and quantize them into 2000 visual words • Each image segment is represented by a histogram of the visual words contained within the segment
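The visual-word step above can be sketched in a few lines. This is a minimal illustration, not the method of Russell et al.: the function names, the toy 2-D "descriptors", and the 3-word codebook are assumptions made for the example (real SIFT descriptors are 128-D and the slide uses 2000 words), and nearest-centroid assignment with squared Euclidean distance is assumed as the quantizer.

```python
from collections import Counter

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to `descriptor` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))

def segment_histogram(descriptors, codebook):
    """Count how often each visual word occurs among the descriptors inside one segment."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]

# Toy data (hypothetical): a 3-word codebook and four descriptors from one segment.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
segment_descriptors = [(0.1, -0.1), (0.9, 1.2), (4.8, 5.1), (1.1, 0.9)]
hist = segment_histogram(segment_descriptors, codebook)
print(hist)
```

Each segment from each candidate segmentation would get one such histogram, and the collection of histograms is what the topic model consumes.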
Related Work • The topic discovery models • To analyze the collection of segments and discover ‘topics’ • Sorting the soup of segments • Find good segments within each topic
Outline: Approach • Identifying unknown objects • Object graphs: modeling the topology of category predictions • Three-dimensional object graphs • Category discovery amid familiar objects
Approach • Goal • Discover categories in unlabeled image collections using appearance and object-level semantic context cues • Generate multiple segmentations for each image and classify each region as known or unknown • Model each unknown region’s surrounding context as an object graph • Group the unknown regions based on their appearance similarity and their relationship to the surrounding known regions
Identifying Unknown Objects • Predict which regions are likely instances of the previously learned categories • Learn classifiers for the N known categories • Generate multiple segmentations per image • Given a region s, calculate the posterior for each class • A “known” object will have a single peak among the posteriors ⇒ lower entropy • An “unknown” object will have multiple peaks among the posteriors ⇒ higher entropy
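The entropy criterion above can be illustrated with a small sketch. The function name and the example posteriors are hypothetical, and dividing by log2(N) so that the score falls in [0, 1] is an assumption; the slide only states that peaked posteriors give lower entropy and spread-out posteriors give higher entropy.

```python
import math

def posterior_entropy(posteriors):
    """Shannon entropy of class posteriors, divided by log2(N) (an assumed normalization).

    Expects N >= 2 posteriors that sum to 1; zero entries contribute nothing.
    """
    n = len(posteriors)
    h = -sum(p * math.log2(p) for p in posteriors if p > 0)
    return h / math.log2(n)

# Hypothetical posteriors over N = 4 known categories.
peaked = [0.94, 0.02, 0.02, 0.02]   # one dominant class: looks "known"
flat = [0.25, 0.25, 0.25, 0.25]     # no dominant class: looks "unknown"

print(posterior_entropy(peaked), posterior_entropy(flat))
```

A region whose classifier output resembles `peaked` scores close to 0, while one resembling `flat` scores at the maximum.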
Identifying Unknown Objects • Select a cutoff threshold equal to the midpoint of the entropy range • Lighter/darker colors indicate higher/lower entropy
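A possible reading of the cutoff rule, as a sketch: the function name and the sample entropies are hypothetical, and the range is assumed to be the theoretical one for normalized entropy, [0, 1], so the midpoint is 0.5. If the observed minimum and maximum were intended instead, they can be passed as `lo` and `hi`.

```python
def split_known_unknown(entropies, lo=0.0, hi=1.0):
    """Label regions by comparing entropy with the midpoint of [lo, hi].

    Returns (threshold, indices of known regions, indices of unknown regions).
    """
    threshold = (lo + hi) / 2.0
    known = [i for i, e in enumerate(entropies) if e <= threshold]
    unknown = [i for i, e in enumerate(entropies) if e > threshold]
    return threshold, known, unknown

# Hypothetical normalized entropies for five regions.
entropies = [0.10, 0.85, 0.30, 0.90, 0.15]
threshold, known, unknown = split_known_unknown(entropies)
print(threshold, known, unknown)
```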
Object Graphs: Modeling the Topology of Category Predictions • Model the unknown regions’ surrounding contextual information as a graph • Regions with similar surrounding context will have similar graphs • Generate superpixels for each image, except within the unknown region • Roughly 50 superpixels per image
Object Graphs: Modeling the Topology of Category Predictions • From stage 1, we have the posteriors for each segment • Then, map the per-region posteriors to per-pixel posteriors • Calculate the posteriors for each superpixel region
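The mapping from region posteriors to superpixel posteriors might look like the sketch below. The slide does not say how the mapping is done, so two assumptions are made here: a pixel's posterior is the average, over all segmentations, of the posterior of the segment that contains it, and a superpixel's posterior is the average over its pixels. All names and the toy 1x2 image are hypothetical.

```python
def pixel_posteriors(segmentations, n_classes):
    """Average segment posteriors over segmentations, per pixel (assumed rule).

    segmentations: list of (label_map, seg_posteriors); label_map[y][x] is a
    segment id and seg_posteriors[id] is that segment's list of class posteriors.
    """
    height, width = len(segmentations[0][0]), len(segmentations[0][0][0])
    out = [[[0.0] * n_classes for _ in range(width)] for _ in range(height)]
    for label_map, seg_post in segmentations:
        for y in range(height):
            for x in range(width):
                for c, p in enumerate(seg_post[label_map[y][x]]):
                    out[y][x][c] += p / len(segmentations)
    return out

def superpixel_posteriors(pixel_post, superpixel_map, n_classes):
    """Average the per-pixel posteriors inside each superpixel."""
    sums, counts = {}, {}
    for y, row in enumerate(superpixel_map):
        for x, sp in enumerate(row):
            acc = sums.setdefault(sp, [0.0] * n_classes)
            for c in range(n_classes):
                acc[c] += pixel_post[y][x][c]
            counts[sp] = counts.get(sp, 0) + 1
    return {sp: [v / counts[sp] for v in acc] for sp, acc in sums.items()}

# Toy 1x2 image with two segmentations and two classes.
seg_a = ([[0, 1]], {0: [1.0, 0.0], 1: [0.0, 1.0]})
seg_b = ([[0, 0]], {0: [0.5, 0.5]})
pix = pixel_posteriors([seg_a, seg_b], n_classes=2)
sp_post = superpixel_posteriors(pix, [[7, 7]], n_classes=2)
print(pix, sp_post)
```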
Object Graphs: Modeling the Topology of Category Predictions • For each unknown segment s, we compute a series of histograms using the posteriors computed within its neighboring superpixels • Each histogram records the posteriors within the spatially nearest segments of s for each of two orientations, above and below the segment
Object Graphs: Modeling the Topology of Category Predictions • Concatenate the component histograms to produce the final object-graph descriptor • Use R = 20 in the example • The result is a single fixed-length vector
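A rough sketch of how such a descriptor could be assembled is given below. It is not the authors' exact layout: the function names are hypothetical, distances are assumed to be Euclidean between centroids, "above" is assumed to mean a smaller image row, and each histogram is assumed to be the mean posterior of the r nearest superpixels for r = 1..R. Whether the real descriptor also includes the region's own posterior is not stated in the transcript.

```python
import math

def mean_posterior(posteriors_list, n_classes):
    """Element-wise mean of a list of posterior vectors (zeros if the list is empty)."""
    if not posteriors_list:
        return [0.0] * n_classes
    return [sum(p[c] for p in posteriors_list) / len(posteriors_list)
            for c in range(n_classes)]

def object_graph(region_xy, superpixels, R, n_classes):
    """Concatenate, for r = 1..R, the mean posteriors of the r nearest
    superpixels above and below the region (assumed construction).

    superpixels: list of ((x, y), posteriors); y grows downward as in image coordinates.
    """
    rx, ry = region_xy

    def dist(sp):
        (x, y), _ = sp
        return math.hypot(x - rx, y - ry)

    above = sorted((sp for sp in superpixels if sp[0][1] < ry), key=dist)
    below = sorted((sp for sp in superpixels if sp[0][1] >= ry), key=dist)
    descriptor = []
    for r in range(1, R + 1):
        descriptor += mean_posterior([p for _, p in above[:r]], n_classes)
        descriptor += mean_posterior([p for _, p in below[:r]], n_classes)
    return descriptor

# Toy scene: two superpixels above the unknown region and one below, two classes.
superpixels = [((5, 3), [1.0, 0.0]), ((5, 1), [0.0, 1.0]), ((5, 8), [0.5, 0.5])]
g = object_graph((5, 5), superpixels, R=2, n_classes=2)
print(g)
```

Under these assumptions the descriptor has 2 x R x N entries, so two unknown regions with similar surroundings produce nearby vectors regardless of their own appearance.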
Object Graphs: Modeling the Topology of Category Predictions Similar object graphs for the unknown regions
Three-Dimensional Object Graphs • Is the 2D object graph a reliable descriptor? • Relationship between a car and the road • Introduce a 3D variant of the object graph • Use depth information to estimate the proximity and relative orientations of surrounding familiar objects • Use regions rather than superpixels for 3D object-graph nodes • Employ the method of Hoiem et al. to estimate depth • D. Hoiem, A. N. Stein, A. A. Efros, and M. Hebert, “Recovering Occlusion Boundaries from a Single Image,” ICCV, 2007.
Three-Dimensional Object Graphs More robust to camera pose variations
Category Discovery amid Familiar Objects • Combine object-level context with region-based appearance to form groups from unknown regions • Object-level context: 2D or 3D object-graph descriptors • Appearance descriptors: • Texton Histograms (TH): edge filters + Gaussian filter + Laplacian-of-Gaussian filter • Color Histograms (CH): Lab color space • Pyramid HOG (pHOG): three pyramid levels with eight bins
Category Discovery amid Familiar Objects • Similarity measure • Compute the affinities between all pairs of unknown regions to generate an affinity matrix, where a kernel function for two histogram inputs scores each pair of descriptors • Use the spectral clustering method to group the regions
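The affinity computation could be sketched as follows. The kernel formula did not survive in the transcript, so this example assumes a chi-square kernel, a common choice for histogram inputs, and assumes the appearance and object-graph kernels are simply summed with equal weight. The function names and toy histograms are hypothetical, and the spectral clustering step itself is omitted.

```python
import math

def chi2_kernel(h1, h2):
    """Chi-square kernel exp(-0.5 * sum((a - b)^2 / (a + b))) for two histograms (assumed)."""
    d = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-0.5 * d)

def affinity_matrix(appearance, context):
    """Pairwise affinity: appearance kernel plus object-graph kernel (assumed equal weights)."""
    n = len(appearance)
    return [[chi2_kernel(appearance[i], appearance[j]) +
             chi2_kernel(context[i], context[j])
             for j in range(n)] for i in range(n)]

# Three unknown regions: 0 and 1 match in both cues, 2 differs in both.
appearance = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
context = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
K = affinity_matrix(appearance, context)
print(K)
```

The resulting matrix is symmetric and could be handed to any spectral clustering routine that accepts a precomputed affinity.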
Algorithm Summarization • Offline training • Unlabeled novel images as the input
Algorithm Summarization • Generate multiple segmentations • Compute the posteriors and classify each segment as either known or unknown
Algorithm Summarization • Generate superpixel regions and compute the posteriors
Algorithm Summarization • Build an object-graph descriptor for each unknown region
Algorithm Summarization • Compute affinities between all pairs of unknown regions • Cluster using those affinities to group the objects
Outline: Results • Unsupervised discovery accuracy • Comparison to the state of the art • Discovered categories: qualitative results
Unsupervised Discovery Accuracy • Appearance + object graph vs. appearance alone, for different sets of known objects • As the number of unknowns increases, the accuracy of the object graph decreases
Unsupervised Discovery Accuracy • Greater improvement for high appearance variance
Comparison to the State of the Art • Compare to "Using Multiple Segmentations to Discover Objects and their Extent in Image Collections" • Use a bag-of-features representation with SIFT features
Reference [1] Y. J. Lee and K. Grauman, "Object-Graphs for Context-Aware Visual Category Discovery," PAMI, 2012. [2] D. Hoiem, A. N. Stein, A. A. Efros, and M. Hebert, "Recovering Occlusion Boundaries from a Single Image," ICCV, 2007. [3] B. C. Russell, W. T. Freeman, A. A. Efros, J. Sivic, and A. Zisserman, "Using Multiple Segmentations to Discover Objects and their Extent in Image Collections," CVPR, 2006. [4] http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html