Group Sparse Coding

Group Sparse Coding. Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow. Google, Mountain View, CA (NIPS 2009). Presented by Miao Liu, July 23, 2010. *Figures and formulae are directly copied from the original paper. Outline: Introduction, Group Coding, Dictionary Learning.

Presentation Transcript


  1. Group Sparse Coding • Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow • Google, Mountain View, CA (NIPS 2009) • Presented by Miao Liu, July 23, 2010 • *Figures and formulae are directly copied from the original paper

  2. Outline • Introduction • Group Coding • Dictionary Learning • Results and Discussion

  3. Introduction • Bag-of-words document representations • Encode a document as a vector of the counts of descriptors (words) • Widely used in text, image, and video processing • It is easy to determine a suitable word dictionary for text documents • For images and videos • There is no simple mapping from the raw document to descriptor counts • Visual descriptors (color, texture, angles, and shapes) must be extracted • Descriptors are measured at appropriate locations (regular grids, special interest points, multiple scales) • A more careful design of the dictionary is needed
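As a minimal illustration of the bag-of-words idea on slide 3, the sketch below encodes a text document as a vector of word counts over a fixed dictionary. The dictionary, the sample sentence, and the function name `bag_of_words` are assumptions made for illustration; none of them come from the presentation.

```python
from collections import Counter


def bag_of_words(document, dictionary):
    """Return the count of each dictionary word in the document."""
    # Lowercase and split on whitespace; real tokenizers are more careful.
    counts = Counter(document.lower().split())
    # Words outside the dictionary are ignored; absent words count as 0.
    return [counts[word] for word in dictionary]


# Hypothetical dictionary and document, chosen only for this example.
dictionary = ["sparse", "coding", "image", "group"]
doc = "Group sparse coding extends sparse coding to a group of patches"
vector = bag_of_words(doc, dictionary)
print(vector)
```

For text, the dictionary is simply a list of words. The slide's point is that images and videos have no such ready-made list, so the dictionary itself has to be designed or learned.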

  4. Dictionary Construction • Unsupervised vector quantization (VQ), often k-means clustering • Pro: maximally sparse per descriptor occurrence • Cons: • Does not guarantee a sparse coding of the whole image • Not robust with respect to descriptor variability • Regularized optimization • Encode each visual descriptor as a weighted sum of dictionary elements • Mixed-norm regularizers • Take into account the structure of bags of visual descriptors in images • Presenting sets of images from a given category
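The VQ baseline on slide 4 can be sketched as nearest-code-word assignment followed by a histogram. This is an illustrative sketch only: the two-dimensional codebook and descriptors are made-up values, the function names are hypothetical, and in practice the codebook would come from k-means on real descriptors.

```python
def nearest_codeword(x, codebook):
    """Index of the codebook entry closest to x (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda j: sq_dist(x, codebook[j]))


def vq_histogram(descriptors, codebook):
    """Count how many descriptors are assigned to each code word."""
    hist = [0] * len(codebook)
    for x in descriptors:
        hist[nearest_codeword(x, codebook)] += 1
    return hist


# Hypothetical codebook and descriptors, for illustration only.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.2), (0.8, -0.1), (0.1, 0.9)]
hist = vq_histogram(descriptors, codebook)
print(hist)
```

Each descriptor activates exactly one code word, which is the "maximally sparse per descriptor" property. The image-level histogram can still touch every code word, which is the first drawback listed on the slide.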

  5. Problem Statement • The main goal: encode groups of instances (e.g., image patches) in terms of dictionary code words (some kind of average patches) • Notation • The m-th group • The subscript m is dropped for single-group operations • Subgoals • Encoding ( ) • Learning a good dictionary from a set of training groups

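  6. Group Coding • Given a group of instances and a dictionary, group coding is achieved by solving a regularized reconstruction problem [formula not preserved] • A regularization parameter balances fidelity and reconstruction complexity • Coordinate descent is applied to solve the problem • Finally, the solution is compressed into a single vector by taking the p-norm of each dictionary word's weight vector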
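The formulas on slide 6 did not survive in the transcript. For orientation, a mixed-norm group coding objective of the kind the bullets describe can be written as follows. The notation is assumed rather than taken from the slide ($\mathcal{G}$ for the group, $x_i$ for its instances, $d_j$ for dictionary words, $\alpha_j^i$ for the weight of word $j$ in the reconstruction of instance $i$, $\lambda$ for the regularization parameter), and the exact form should be checked against the original paper.

```latex
\min_{A}\; Q(A, \mathcal{G}, \mathcal{D})
  = \frac{1}{2} \sum_{i \in \mathcal{G}}
      \Bigl\| x_i - \sum_{j=1}^{|\mathcal{D}|} \alpha_j^i \, d_j \Bigr\|^2
  + \lambda \sum_{j=1}^{|\mathcal{D}|} \| \alpha_j \|_p ,
\qquad
\alpha_j = \bigl(\alpha_j^1, \ldots, \alpha_j^{|\mathcal{G}|}\bigr)
```

In this form the first term measures reconstruction fidelity and the second measures reconstruction complexity. Because the norm is taken over each word's weights across the whole group, a word is encouraged to be either used by several instances in the group or dropped for all of them.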

  7. Group coding • Define • Optimum for p=1 • Optimum for p=2
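The closed-form optima on slide 7 are not preserved in the transcript. The sketch below shows the standard shrinkage operators that solve a single coordinate-descent step of this kind, assuming a unit-norm dictionary word and a vector `mu` that holds, for each instance in the group, the correlation between the word and the residual left by all other words. The names `mu`, `lam`, `update_p1`, `update_p2`, and `pool` are assumptions for illustration, not the slide's notation.

```python
import math


def update_p1(mu, lam):
    """Soft-thresholding: minimizer of 0.5*||a - mu||^2 + lam*||a||_1."""
    # Each coordinate shrinks toward zero by lam, independently of the others.
    return [math.copysign(max(abs(m) - lam, 0.0), m) for m in mu]


def update_p2(mu, lam):
    """Group shrinkage: minimizer of 0.5*||a - mu||^2 + lam*||a||_2."""
    norm = math.sqrt(sum(m * m for m in mu))
    if norm <= lam:
        # The whole weight vector is zeroed: the word is unused by the group.
        return [0.0] * len(mu)
    scale = 1.0 - lam / norm
    return [scale * m for m in mu]


def pool(alpha_j, p):
    """p-norm of one word's weights across the group (final compression)."""
    return sum(abs(a) ** p for a in alpha_j) ** (1.0 / p)


print(update_p1([3.0, -0.5, -2.0], 1.0))
print(update_p2([0.3, 0.4], 1.0))
print(pool([3.0, -4.0], 2))
```

The difference between the two cases is the point of the slide: with p=1 each weight is thresholded on its own, while with p=2 the weights of a word either survive together, rescaled, or vanish together.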

  8. Dictionary Learning • A good dictionary should balance • Reconstruction error • Reconstruction complexity • Overall complexity relative to the given training set • Seek a learning method that facilitates both • Induction of new dictionary words • Removal of dictionary words that have low predictive power • Applying • Let • Objective

  9. Dictionary Learning • In this paper, p=2 • Define auxiliary variables • Define a vector (appearing in the gradient of the objective function) • Similar to the argument in group coding, one can obtain
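The dictionary learning objective on slides 8 and 9 is not preserved in the transcript. One form consistent with the bullets (a per-group coding cost summed over the training groups, plus a norm penalty on each dictionary word) is sketched below. Here $Q(A_m, \mathcal{G}_m, \mathcal{D})$ denotes the group coding cost of the $m$-th training group, and $\gamma$, $d_k$, $A_m$, and $n$ are assumed notation; the expression should be verified against the original paper.

```latex
Q(A, \mathcal{D})
  = \sum_{m=1}^{n} Q(A_m, \mathcal{G}_m, \mathcal{D})
  + \gamma \sum_{k=1}^{|\mathcal{D}|} \| d_k \|_p
```

With p=2, the penalty acts on each dictionary word as a whole, so a word with low predictive power can be shrunk exactly to zero and removed, which matches the removal goal stated on slide 8.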

  10. Experimental Setting • Compare with a previous sparse coding method by measuring the impact on classification on the PASCAL VOC (Visual Object Classes) 2007 dataset • Images from 20 classes, including people, animals, vehicles, and indoor objects • Around 2500 images each for training and validation; 5000 images for testing • Extract local descriptors based on Gabor wavelet responses at • Four orientations ( ) • Spatial scales and offsets (27 combinations) • The 27 (scale, offset) pairs were chosen by optimizing a previous image recognition task, unrelated to this paper

  11. Results and Discussion

  12. Results and Discussion

  13. Results and Discussion
