Group Sparse Coding. Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow. Google, Mountain View, CA (NIPS 2009). Presented by Miao Liu, July 23, 2010. *Figures and formulae are directly copied from the original paper.


Group Sparse Coding

Samy Bengio, Fernando Pereira,

Yoram Singer, Dennis Strelow

Google

Mountain View, CA

(NIPS 2009)

Presented by Miao Liu

July 23, 2010

*Figures and formulae are directly copied from the original paper

Outline

  • Introduction

  • Group Coding

  • Dictionary Learning

  • Results and Discussion

Introduction

  • Bag-of-words document representations

    • Encode document by a vector of the counts of descriptors (words)

    • Widely used in text, image, and video processing

  • Easy to determine a suitable word dictionary for text documents.

  • For images and videos

    • No simple mapping from the raw document to descriptor counts

    • Requires extracting visual descriptors (color, texture, angles, and shapes)

    • Descriptors must be measured at appropriate locations (regular grids, special interest points, multiple scales)

    • More careful design of the dictionary is needed
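The bag-of-words encoding described above can be illustrated for the easy case of text. This is a minimal sketch; the vocabulary, the sample sentence, and the helper name `bag_of_words` are made up for illustration and do not come from the paper.

```python
from collections import Counter

def bag_of_words(tokens, vocabulary):
    """Return the count of each vocabulary word in the token list."""
    counts = Counter(tokens)
    # Words outside the vocabulary are ignored; missing words count as 0.
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary and document, chosen only for illustration.
vocabulary = ["group", "sparse", "coding", "dictionary"]
document = "group sparse coding learns a sparse dictionary".split()
vector = bag_of_words(document, vocabulary)
```

For text the vocabulary is given by the language itself. For images no such list exists, which is why the dictionary has to be constructed.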

Dictionary Construction

  • Unsupervised vector quantization (VQ), often k-means clustering

    • Pro: maximally sparse per descriptor occurrence

    • Cons:

      • Does not guarantee a sparse coding of the whole image

      • Not robust to descriptor variability

  • ℓ1-regularized optimization

    • Encode each visual descriptor as a weighted sum of dictionary elements

  • Mixed-norm regularizers

    • Take into account the structure of bags of visual descriptors in images

    • Can also represent sets of images from a given category
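The vector quantization baseline can be sketched in a few lines: each descriptor is assigned to its single nearest codeword, and the image is encoded as a histogram of those assignments. The codebook and descriptors below are toy values, and `vq_histogram` is a hypothetical helper name. In practice the codebook would typically come from k-means over training descriptors.

```python
import numpy as np

def vq_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and count assignments."""
    # Squared Euclidean distances, shape (n_descriptors, n_codewords).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(dists, axis=1)  # exactly one codeword per descriptor
    return np.bincount(nearest, minlength=codebook.shape[0])

# Toy data: 3 codewords in 2-D and 5 descriptors from one image.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [0.0, 0.8],
                        [1.1, -0.1], [0.05, 0.05]])
hist = vq_histogram(descriptors, codebook)
```

Each descriptor activates exactly one codeword, which is the "maximally sparse per descriptor" property. The histogram over the whole image can still be dense, since nothing couples the assignments of different descriptors.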

Problem Statement

  • Main goal: encode groups of instances (e.g., image patches) in terms of dictionary code words (which act as a kind of average patch)

  • Notation

    • The m-th group

    • The subscript m is dropped when operating on a single group.

  • Subgoals

    • Encoding a group given a dictionary

    • Learning a good dictionary from a set of training groups

Group Coding

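  • Given a group and a dictionary, group coding is achieved by solving a regularized least-squares reconstruction problem

    • The regularization weight balances fidelity and reconstruction complexity.

  • Coordinate descent is applied to solve the above problem.

  • Finally, the reconstruction weights are compressed into a single vector by taking the p-norm of each dictionary word's weights across the group.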
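The formula on this slide is not preserved in the transcript. One reconstruction consistent with the bullets above is sketched below. Every symbol is an assumption introduced here for illustration: $x_i$ is the $i$-th instance of the group $G$, $d_j$ the $j$-th word of the dictionary $D$, $\alpha^j_i$ the weight of word $j$ in the reconstruction of instance $i$, and $\lambda \ge 0$ the trade-off weight. The original paper may impose additional constraints on the weights (for example non-negativity), which are not assumed here.

```latex
\begin{aligned}
Q(A, G, D) &= \frac{1}{2} \sum_{i \in G} \Bigl\| x_i - \sum_{j=1}^{|D|} \alpha^j_i \, d_j \Bigr\|_2^2
            + \lambda \sum_{j=1}^{|D|} \bigl\| \alpha^j \bigr\|_p , \\
\alpha^j &= \bigl( \alpha^j_1, \dots, \alpha^j_{|G|} \bigr), \qquad
A = \bigl\{ \alpha^j \bigr\}_{j=1}^{|D|} .
\end{aligned}
```

The first term is the reconstruction error (fidelity). The second term sums the $p$-norms of the per-word weight vectors, so for $p = 2$ a word tends to be either used by the group or switched off for all of its instances at once. The compressed code of the group would then be the vector $\bigl( \|\alpha^1\|_p, \dots, \|\alpha^{|D|}\|_p \bigr)$.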

Group Coding

  • Define

  • Optimum for p=1

  • Optimum for p=2
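The closed-form optima on this slide are missing from the transcript. Under the assumption that the objective is a squared reconstruction error plus `lam` times the sum of p-norms of the per-word weight vectors, with unconstrained real weights, the per-word optimum is soft-thresholding for p=1 and a group shrinkage for p=2. The sketch below implements coordinate descent with those two updates. The names `group_code`, `compress`, `mu`, and `lam` are hypothetical, and this is not necessarily the exact procedure of the paper.

```python
import numpy as np

def group_code(X, D, lam, p=2, n_sweeps=50):
    """Coordinate descent over dictionary words for the assumed objective
        0.5 * sum_i ||X[i] - sum_j A[j, i] * D[j]||^2 + lam * sum_j ||A[j]||_p.
    X: (n_instances, dim) group; D: (n_words, dim) dictionary.
    Returns A with shape (n_words, n_instances)."""
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)
    A = np.zeros((D.shape[0], X.shape[0]))
    sq_norms = np.sum(D ** 2, axis=1)
    R = X.copy()  # residual X - A.T @ D, which equals X while A is zero
    for _ in range(n_sweeps):
        for r in range(D.shape[0]):
            if sq_norms[r] == 0.0:
                continue  # an all-zero word cannot reduce the error
            R_r = R + np.outer(A[r], D[r])  # residual with word r taken out
            mu = R_r @ D[r]                 # correlation of word r with each instance
            if p == 1:
                # Soft-thresholding, applied to each instance separately.
                shrunk = np.sign(mu) * np.maximum(np.abs(mu) - lam, 0.0)
            elif p == 2:
                # Group shrinkage: the whole weight vector is scaled or zeroed.
                norm_mu = np.linalg.norm(mu)
                scale = max(1.0 - lam / norm_mu, 0.0) if norm_mu > 0.0 else 0.0
                shrunk = scale * mu
            else:
                raise ValueError("only p=1 and p=2 are handled in this sketch")
            A[r] = shrunk / sq_norms[r]
            R = R_r - np.outer(A[r], D[r])
    return A

def compress(A, p=2):
    """One number per dictionary word: the p-norm of its weights over the group."""
    return np.linalg.norm(A, ord=p, axis=1)

# Toy group of 3 instances in 2-D, coded with an orthonormal 2-word dictionary.
X = np.array([[3.0, 0.1], [4.0, -0.2], [5.0, 0.1]])
D = np.eye(2)
A = group_code(X, D, lam=1.0, p=2)
code = compress(A, p=2)
```

In the toy example the second word correlates only weakly with every instance, so its whole weight vector is set to zero and the compressed code is sparse at the group level.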

Dictionary Learning

  • A good dictionary should balance

    • Reconstruction error

    • Reconstruction complexity

    • Overall complexity relative to the given training set

  • Seek a learning method that facilitates both

    • induction of new dictionary words

    • removal of dictionary words that have low predictive power

  • Applying

  • Let

  • Objective
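The objective on this slide is missing from the transcript. A form that matches the three criteria listed above might look as follows, assuming $Q(A_m, G_m, D)$ denotes the per-group coding cost (reconstruction error plus reconstruction complexity) of the $m$-th of $n$ training groups, and $\gamma \ge 0$ is a hypothetical weight on the overall dictionary complexity. All symbols are assumptions introduced here.

```latex
Q\bigl(\{A_m\}_{m=1}^{n}, D\bigr)
  = \sum_{m=1}^{n} Q(A_m, G_m, D)
  + \gamma \sum_{k=1}^{|D|} \| d_k \|_p
```

With a penalty of this form, a sufficiently large $\gamma$ drives a weakly used word $d_k$ to zero, which would correspond to removing that word from the dictionary.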

Dictionary Learning

  • In this paper p=2

  • Define auxiliary variables

  • Define a vector (appearing in the gradient of the objective function)

  • Similar to the argument in group coding, one can obtain
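The resulting update is not preserved in the transcript, but the argument can be sketched for p=2. Assume the codes are held fixed and the objective is the summed squared reconstruction error plus `gamma` times the sum of Euclidean norms of the dictionary words. Setting the gradient with respect to one word to zero then gives a shrinkage step of the same shape as in group coding. In the sketch below `v` plays the role of the vector that appears in the gradient; its exact definition on the slide is unknown, and `update_dictionary` and `gamma` are hypothetical names.

```python
import numpy as np

def update_dictionary(groups, codes, D, gamma, n_sweeps=20):
    """Block coordinate descent over dictionary words for the assumed objective
        sum_m 0.5 * ||X_m - A_m.T @ D||_F^2 + gamma * sum_k ||D[k]||_2
    with the codes A_m held fixed.
    groups: list of (n_i, dim) arrays; codes: list of (n_words, n_i) arrays."""
    D = np.array(D, dtype=float)       # work on a float copy
    X = np.vstack(groups)              # all instances, shape (N, dim)
    A = np.hstack(codes)               # all weights, shape (n_words, N)
    weight_sq = np.sum(A ** 2, axis=1)
    R = X - A.T @ D                    # current reconstruction residual
    for _ in range(n_sweeps):
        for r in range(D.shape[0]):
            if weight_sq[r] == 0.0:
                D[r] = 0.0             # unused word: only the penalty acts on it
                continue
            R_r = R + np.outer(A[r], D[r])  # residual with word r taken out
            v = A[r] @ R_r                  # weighted sum of residuals, shape (dim,)
            norm_v = np.linalg.norm(v)
            scale = max(1.0 - gamma / norm_v, 0.0) if norm_v > 0.0 else 0.0
            D[r] = scale * v / weight_sq[r]
            R = R_r - np.outer(A[r], D[r])
    return D

# One group with a single instance and a one-word dictionary.
groups = [np.array([[3.0, 4.0]])]
codes = [np.array([[1.0]])]
D_kept = update_dictionary(groups, codes, np.zeros((1, 2)), gamma=1.0)
D_removed = update_dictionary(groups, codes, np.zeros((1, 2)), gamma=10.0)
```

When the norm of `v` falls below `gamma` the word is set exactly to zero, which is one way to realize the removal of dictionary words with low predictive power.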

Experimental Setting

  • Compare with previous sparse coding methods by measuring the impact on classification on the PASCAL VOC (Visual Object Classes) 2007 dataset

    • Images from 20 classes, including people, animals, vehicles, and indoor objects

    • Around 2500 images each for training and validation; 5000 images for testing

  • Extract local descriptors based on Gabor wavelet responses at

    • Four orientations

    • Spatial scales and offsets (27 combinations)

  • The 27 (scale, offset) pairs were chosen by optimizing a previous image recognition task, unrelated to this paper.
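To make the idea of orientation-selective Gabor responses concrete, here is a generic sketch of a four-orientation filter bank applied to one patch. It is not the authors' descriptor extractor. The orientation values (multiples of 45 degrees), kernel size, wavelength, and envelope width are all assumptions, since the transcript does not preserve them, and `gabor_kernel` and `gabor_responses` are hypothetical helper names.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times an oriented plane wave."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)   # coordinate along the wave
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * x_rot / wavelength)
    return envelope * carrier

def gabor_responses(patch, bank):
    """Magnitude of the patch's inner product with each kernel in the bank."""
    return np.array([np.abs(np.sum(patch * np.conj(k))) for k in bank])

# Assumed orientations: 0, 45, 90, and 135 degrees.
thetas = [k * np.pi / 4.0 for k in range(4)]
bank = [gabor_kernel(size=9, wavelength=4.0, theta=t, sigma=2.0) for t in thetas]

# Test patch: a grating that varies along x only, with period 4 pixels.
_, xs = np.mgrid[-4:5, -4:5]
patch = np.cos(2.0 * np.pi * xs / 4.0)
responses = gabor_responses(patch, bank)
```

The grating varies only along x, so the kernel whose wave also runs along x (the first orientation) responds most strongly. Repeating this over several scales and offsets would give one descriptor vector per image location.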