
Shared features and Joint Boosting

Sharing visual features for multiclass and multiview object detection. A. Torralba, K. P. Murphy and W. T. Freeman. PAMI, vol. 29, no. 5, pp. 854-869, May 2007.

Yuandong Tian

Outline
  • Motivation to choose this paper
  • Motivation of this paper
  • Basic ideas in boosting
  • Joint Boost
  • Feature used in this paper
  • My results in face recognition
Motivation to choose this paper

Axiom: Computer vision is hard.

Assumption (smart-stationary): Equally smart people are equally distributed over time.

Conjecture: If computer vision cannot be solved in 30 years, it will never be solved!

Wrong!

Because we are standing on the shoulders of giants.

Where are the Giants?
  • More computing resources?
  • Lots of data?
  • Advances in algorithms?
  • Machine Learning?

What I believe

Cruel Reality

Why does ML seem not to help much in CV (at least for now)?

My answer: CV and ML are weakly coupled.

A typical question in CV

Q: Why do we use feature A instead of feature B?

A1: Feature A gives better performance.

A2: Feature A has some fancy properties.

A3: The following step requires the feature to have a certain property that only A has.

A strongly-coupled answer.

Typical CV pipeline

Preprocessing Steps (“Computer Vision”) -> Feature/Similarity -> ML black box

The preprocessing and feature stages have some domain-specific structure; the ML black box is designed for generic structure.

Contribution of this paper
  • Tune the ML algorithm in a CV context
  • A good attempt to break open the black box and integrate CV and ML
Outline
  • Motivation to choose this paper
  • Motivation of this paper
  • Basic ideas in boosting
  • Joint Boost
  • Feature used in this paper
  • My results in face recognition
This paper
  • Object Recognition Problem
    • Many object categories.
    • Few images per category
  • Solution—Feature sharing
    • Find common features that distinguish a subset of classes against the rest.
Feature sharing

Concept of Feature Sharing


Typical behavior of feature sharing:

Template-like features: 100% accuracy for a single object, but too specific.

Wavelet-like features: weaker discriminative power, but shared by many classes.

Why feature sharing?
  • ML: Regularization—avoid over-fitting
    • Essentially more positive samples
    • Reuse the data
  • CV: Utilize the intrinsic structure of object categories
    • Use domain-specific prior to bias the machine learning algorithm
Outline
  • Motivation to choose this paper
  • Motivation of this paper
  • Basic ideas in boosting
  • Joint Boost
  • Feature used in this paper
  • My results in face recognition
Basic idea in Boosting
  • Concept: Binary classification
    • samples with labels (+1 or -1)
  • Goal: Find a function (classifier) H which
    • maps positive samples to positive values (and negative samples to negative values)
  • Optimization: Minimize the exponential loss w.r.t. the classifier H (written out below)
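For reference, the objective behind these bullets (in my own notation; the slides render it as an image) is the exponential loss over samples v_i with labels z_i, minimized over an additive classifier H:

```latex
J \;=\; \sum_{i} e^{-z_i H(v_i)}, \qquad z_i \in \{-1,+1\},
\qquad H(v) \;=\; \sum_{m=1}^{M} h_m(v).
```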

Basic idea in boosting(2)
  • Boosting: Assume H is additive, H = h_1 + h_2 + ... + h_M
  • Each h_m is a “weak” learner (classifier).
    • Almost random, but uniformly better than random
  • Example:
    • Single-feature classifier: makes its decision based on only a single dimension

What a weak learner looks like

Key point:

The addition of weak classifiers gives a strong classifier!

Basic idea in boosting(3)
  • How to minimize?
    • Greedy Approach
      • Fix H, add one h in each iteration
    • Weighting samples
      • After each iteration, wrongly classified samples (difficult samples) get higher weights
Technical parts
  • Greedy -> second-order Taylor expansion of the loss in each iteration
  • This reduces each round to a weighted least-squares fit: choose the weak learner h_m to be optimized in this iteration so as to minimize sum_i w_i (z_i - h_m(v_i))^2, where w_i = exp(-z_i H(v_i)) are the sample weights and z_i the labels
  • Solved by (weighted) least squares; a small sketch follows
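A minimal sketch of one such round, assuming GentleBoost-style regression stumps on single feature dimensions (the function and variable names are mine, and the brute-force threshold search is only for illustration):

```python
import numpy as np

def fit_stump(X, z, w):
    """Fit a regression stump h(x) = a*[x_d > theta] + b by weighted least squares.

    X: (N, D) features, z: (N,) labels in {-1, +1}, w: (N,) positive weights.
    Returns (d, theta, a, b, err) for the best dimension/threshold pair.
    """
    best = None
    for d in range(X.shape[1]):
        for theta in np.unique(X[:, d]):
            mask = X[:, d] > theta
            w1, w0 = w[mask].sum(), w[~mask].sum()
            if w1 == 0 or w0 == 0:
                continue
            # Closed-form weighted least squares for a piecewise-constant fit:
            # b = weighted mean of z where x_d <= theta, a + b = weighted mean where x_d > theta.
            b = (w[~mask] * z[~mask]).sum() / w0
            a = (w[mask] * z[mask]).sum() / w1 - b
            err = (w * (z - (a * mask + b)) ** 2).sum()
            if best is None or err < best[-1]:
                best = (d, theta, a, b, err)
    return best

def boost_round(X, z, H):
    """One boosting round: reweight, fit a stump by weighted least squares, update H."""
    w = np.exp(-z * H)                      # wrongly classified samples get large weights
    d, theta, a, b, _ = fit_stump(X, z, w)
    return H + a * (X[:, d] > theta) + b, (d, theta, a, b)
```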

Outline
  • Motivation to choose this paper
  • Motivation of this paper
  • Basic ideas in boosting
  • Joint Boost
  • Feature used in this paper
  • My results in face recognition
Joint Boost—Multiclass
  • We can minimize a similar multiclass cost using a one-vs-all strategy
  • On its own this doesn’t work very well, since the cost is separable in the class index c
  • Put constraints across classes -> shared features!
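Written out (my reconstruction of the paper's notation, so treat the details as approximate), the multiclass cost sums the exponential loss over classes, and each shared weak learner is a stump common to a subset S(m) of classes plus a class-specific constant for the rest:

```latex
J \;=\; \sum_{c=1}^{C} \sum_{i} e^{-z_i^{c}\, H(v_i,\, c)},
\qquad
h_m(v, c) \;=\;
\begin{cases}
a\, \delta\!\left(v^{f} > \theta\right) + b, & c \in S(m),\\
k^{c}, & c \notin S(m).
\end{cases}
```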
Joint Boost (2)
  • In each iteration, choose
    • One common feature
    • A subset of classes that use this feature
  • So that the objective decreases most
Sharing Diagram

[Sharing diagram (figure): classes (#class) versus boosting iterations (#Iteration), with groups labeled I-V; the features selected across rounds were 1 3 4 5 2 1 4 6 2 7 3.]

Key insight
  • Each class may have its own favorite feature
  • A shared feature may not be the favorite of any single class; however, it simultaneously decreases the errors of many classes
Computational issue
  • Choosing the best subset of classes is prohibitive (2^C possibilities)
  • Use greedy approach
    • Choose one class and one feature so that the objective decreases the most
    • Iteratively add more classes until the objective increases again
      • Note the common feature may change
  • From O(2^C) to O(C^2)
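A rough sketch of this greedy forward selection (the `stump_error` helper is hypothetical: it is assumed to search all candidate features and return the best shared feature and the resulting cost for a given class subset):

```python
def greedy_class_subset(classes, stump_error):
    """Greedily grow the subset of classes that will share a feature this round.

    stump_error(subset) -> (best_feature, cost): assumed to fit the best shared
    stump over all candidate features for `subset` and return its cost.
    Requires O(C^2) calls instead of trying all 2^C subsets.
    """
    best_subset, best_feature, best_cost = None, None, float("inf")
    # Seed with the single class whose best feature lowers the cost the most.
    for c in classes:
        f, cost = stump_error(frozenset([c]))
        if cost < best_cost:
            best_subset, best_feature, best_cost = frozenset([c]), f, cost
    # Keep adding the class that helps most; stop when the cost would increase.
    while len(best_subset) < len(classes):
        candidate = None
        for c in classes:
            if c in best_subset:
                continue
            f, cost = stump_error(best_subset | {c})   # note: the shared feature may change
            if candidate is None or cost < candidate[2]:
                candidate = (best_subset | {c}, f, cost)
        if candidate[2] >= best_cost:
            break
        best_subset, best_feature, best_cost = candidate
    return best_subset, best_feature
```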

[Figure: results at 0.95 ROC area; 29 objects, averaged over 20 training sets.]

Outline
  • Motivation to choose this paper
  • Motivation of this paper
  • Basic ideas in boosting
  • Joint Boost
  • Feature used in this paper
  • My results in face recognition
Features they used in the paper
  • Dictionary
    • 2000 randomly sampled patches
      • Of sizes from 4x4 to 14x14 pixels
      • No clustering
    • Each patch is associated with a spatial mask
The candidate features

[Figure: each candidate feature is a template patch paired with a position mask.]

Dictionary of 2000 candidate patches and position masks, randomly sampled from the training images.

Features
  • Building feature vectors
    • Normalized correlation with each patch to get response
    • Raise the response to some power
      • Large values get even larger and dominate the response (approximating a max operation)
    • Use spatial mask to align the response to the object center (voting)
    • Extract response vector at object center
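A schematic version of this feature computation (a sketch under my own assumptions about array shapes and the exact power/voting details, not the authors' code):

```python
import numpy as np
from scipy.signal import correlate2d

def patch_feature(image, patch, mask, power=10):
    """One candidate feature: normalized correlation with a patch, raised to a power,
    then aggregated through the patch's spatial mask (assumed to be expressed in
    object-centered coordinates, same shape as the valid correlation map).
    """
    # Normalized cross-correlation: zero-mean, unit-norm patch over local image energy.
    p = patch - patch.mean()
    p = p / (np.linalg.norm(p) + 1e-8)
    resp = correlate2d(image, p, mode="valid")
    local_norm = np.sqrt(correlate2d(image ** 2, np.ones_like(patch), mode="valid")) + 1e-8
    ncc = resp / local_norm
    # Raise to a power so that large responses dominate (a soft max), then let the
    # spatial mask vote the response toward the object center and read it off there.
    return ((np.abs(ncc) ** power) * mask).sum()
```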
Results
  • Multiclass object recognition
    • Dataset: LabelMe
    • 21 objects, 50 samples per object
    • 500 rounds
  • Multiview car recognition
    • Train on LabelMe, test on PASCAL
    • 12 views, 50 samples per view
    • 300 rounds

[Figure: multiview car recognition results: 12 views, 50 samples per class, 300 features.]

Outline
  • Motivation to choose this paper
  • Motivation of this paper
  • Basic ideas in boosting
  • Joint Boost
  • Feature used in this paper
  • My results in face recognition
Simple Experiment
  • Main point of this paper
    • They claimed shared features help in the situation of
      • many categories, with only a few samples in each category
  • Test it!
    • Task: face recognition
    • “Faces in the Wild” dataset
      • Many famous public figures
Experiment configuration
  • Use a Gist-like feature, but
    • Only Gabor responses
    • Use a finer grid to gather the histograms
      • Faces are aligned in the dataset
  • Feature statistics
    • 8 orientations, 2 scales, 8x8 grid
    • 1024 dimensions
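A rough sketch of such a descriptor (the Gabor filter-bank parameters and mean pooling are my own choices; only the 8 orientations x 2 scales x 8x8 grid = 1024-dimension layout is taken from the slide):

```python
import numpy as np
from skimage.filters import gabor

def gist_like_descriptor(face, n_orient=8, frequencies=(0.2, 0.4), grid=8):
    """1024-d descriptor: Gabor energy averaged over the cells of an 8x8 grid.

    face: 2D grayscale array of an aligned face image.
    """
    h, w = face.shape
    feats = []
    for freq in frequencies:                   # 2 scales
        for k in range(n_orient):              # 8 orientations
            real, imag = gabor(face, frequency=freq, theta=np.pi * k / n_orient)
            energy = np.sqrt(real ** 2 + imag ** 2)
            for i in range(grid):              # 8x8 spatial grid
                for j in range(grid):
                    cell = energy[i * h // grid:(i + 1) * h // grid,
                                  j * w // grid:(j + 1) * w // grid]
                    feats.append(cell.mean())
    return np.asarray(feats)                   # length 2 * 8 * 8 * 8 = 1024
```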
Experiment
  • Training and testing
    • Find the 50 identities with the most images
    • For each identity, randomly select 3 images for training
    • The rest are used for testing
Result on More images
  • 50 people, 7 images each
  • Chance rate = 2%
  • Nearest neighbor accuracy:
    • L1 = 0.2856 (0.1868 in the 50-identity / 3-image setting)
    • L2 = 0.2022
    • Chi-square = 0.2596
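For reference, the nearest-neighbor baseline with these three distances can be sketched as follows (a minimal 1-NN version, assuming non-negative histogram-like descriptors for the chi-square case):

```python
import numpy as np

def nn_classify(train_X, train_y, test_X, metric="l1"):
    """1-nearest-neighbor classification under an L1, L2, or chi-square distance."""
    preds = []
    for x in test_X:
        diff = train_X - x
        if metric == "l1":
            d = np.abs(diff).sum(axis=1)
        elif metric == "l2":
            d = (diff ** 2).sum(axis=1)
        elif metric == "chisq":
            d = ((diff ** 2) / (train_X + x + 1e-10)).sum(axis=1)  # chi-square for histograms
        else:
            raise ValueError(metric)
        preds.append(train_y[np.argmin(d)])
    return np.asarray(preds)

# Example: accuracy = (nn_classify(Xtr, ytr, Xte, metric="chisq") == yte).mean()
```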


[Figure: Joint Boost doubles the accuracy of NN. More features are shared as sharing grows from single to pairwise to joint (single -> pairwise, pairwise -> joint), with gains of about 7%.]

Result on More Identities
  • 100 people, 3 images each
  • Chance rate = 1%
  • Nearest neighbor accuracy:
    • L1 = 0.1656 (0.1868 in the 50-identity / 3-image setting)
    • L2 = 0.1235
    • Chi-square = 0.1623


Joint Boost is still better than NN, yet the improvement (~60%) is smaller than in the previous case.

The performance of single-class boosting is the same as NN.

Conclusion
  • Joint Boosting indeed works
    • Especially when the number of images per class is not too small (otherwise NN does just as well)
  • Better performance in the presence of
    • Many classes, each with only a few samples
    • Introduces regularization that reduces overfitting
  • Disadvantages
    • Training is slow: O(C^2) in the number of classes
Thanks!
  • Any questions?