Sharing features for multi-class object detection

Presentation Transcript



Sharing features for multi-class object detection

Antonio Torralba, Kevin Murphy and Bill Freeman

MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)

Sept. 1, 2004



The lead author: Antonio Torralba



Goal

  • We want a machine to be able to identify thousands of different objects as it looks around the world.


Multi-class object detection

Desired detector outputs: label and localize each object in the image (bookshelf, screen, desk, …).

[Figure: from one image patch, local features vp are computed and passed to one classifier per class; each classifier outputs P(car | vp), P(person | vp), P(cow | vp), …, reporting either a detection or "no car", "no person", "no cow".]
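As a minimal sketch of this one-classifier-per-class pipeline (all names, weights and the threshold below are illustrative, not from the talk):

```python
import numpy as np

def detect_patch(v_p, classifiers, threshold=0.5):
    """Run one independent binary classifier per class on a patch's
    local-feature vector v_p; report a probability or None ("no <class>")."""
    results = {}
    for name, clf in classifiers.items():
        p = clf(v_p)                                  # P(class | v_p)
        results[name] = p if p > threshold else None  # None means "no <class>"
    return results

# Toy usage with hypothetical linear classifiers (random weights).
rng = np.random.default_rng(0)
weights = {name: rng.normal(size=16) for name in ("car", "person", "cow")}
classifiers = {name: (lambda v, w=w: 1.0 / (1.0 + np.exp(-v @ w)))
               for name, w in weights.items()}
print(detect_patch(rng.normal(size=16), classifiers))
```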


Need to detect as well as to recognize

Recognition: name each object. In recognition, the problem is to discriminate between objects.

Detection: localize screens, keyboards and bottles in the image. In detection, the big problem is to differentiate the sparse objects against the background.


There are efficient solutions for

  • detecting a single object category and view (Viola & Jones, 2001; Papageorgiou & Poggio, 2000; …)
  • detecting particular objects (Lowe, 1999)
  • detecting objects in isolation (Leibe & Schiele, 2003)

But the problem of multi-class and multi-view object detection is still largely unsolved.


Why multi-object detection is a hard problem

[Figure: example images spanning many object classes and viewpoints.]

Styles, lighting conditions, etc., etc., etc.…

Need to detect Nclasses × Nviews × Nstyles combinations.

Lots of variability within classes, and across viewpoints.


Existing approaches for multiclass object detection (vision community)

Using a set of independent binary classifiers is the dominant strategy:

  • Viola-Jones extension for dealing with rotations: two cascades for each view
  • Schneiderman-Kanade multiclass object detection: one detector for each class


Promising approaches…

  • Fei-Fei, Fergus, & Perona, 2003
  • Krempp, Geman, & Amit, 2002

Look for a vocabulary of edges that reduces the number of features.


Characteristics of one-vs-all multiclass approaches: cost

Computational cost grows linearly with Nclasses × Nviews × Nstyles…

Surely, this will not scale well to 30,000 object classes.


Characteristics of one-vs-all multiclass approaches: representation

What is the best representation to detect a traffic sign?

Very regular object: template matching will do the job.

[Figure: parts derived from training a binary classifier for the traffic sign; ~100% detection rate with 0 false alarms.]

Some of these parts can only be used for this object.



Meaningful parts, in one-vs-all approaches

Part-based object representation (looking for meaningful parts):

  • A. Agarwal and D. Roth

  • M. Weber, M. Welling and P. Perona

Ullman, Vidal-Naquet, and Sali, 2004: features of intermediate complexity are most informative for (single-object) classification.



Multi-class classifiers (machine learning community)

  • Error correcting output codes (Dietterich & Bakiri, 1995; …)
    • But only use classification decisions (+1/−1), not real values.
  • Reducing multi-class to binary (Allwein et al., 2000)
    • Showed that the best code matrix is problem-dependent; don’t address how to design the code matrix.
  • Bunching algorithm (Dekel and Singer, 2002)
    • Also learns the code matrix and classifies, but more complicated than our algorithm and not applied to object detection.
  • Multitask learning (Caruana, 1997; …)
    • Train tasks in parallel to improve generalization, share features. But not applied to object detection, nor in a boosting framework.



Our approach

  • Share features across objects, automatically selecting the best sharing pattern.

  • Benefits of shared features:

    • Efficiency

    • Accuracy

    • Generalization ability


Algorithm goals, for object recognition

We want to find the vocabulary of parts that can be shared.

We want to share generic knowledge about detecting objects across different objects (e.g., discriminating objects from the background).

We want to share computations across classes so that computational cost < O(number of classes).


Sharing features for multi-class object detection

Independent features

[Figure: four object classes (object class 1–4), each separated from the background by its own set of hyperplanes.]

Total number of hyperplanes (features): 4 × 6 = 24. Scales linearly with the number of classes.


Sharing features for multi-class object detection

Shared features

[Figure: the same four object classes, now separated using hyperplanes shared across classes.]

Total number of shared hyperplanes (features): 8.

May scale sub-linearly with number of classes, and may generalize better.


Note: sharing is a graph, not a tree

Objects: {R, b, 3}

[Figure: parts shared among the characters R, b and 3; each part can be shared by any subset of the objects.]

This defines a vocabulary of parts shared across objects.



At the algorithmic level

  • Our approach is a variation on boosting that allows for sharing features in a natural way.

  • So let’s review boosting (ada-boost demo)



Boosting demo
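
Since the original slide held an interactive demo, here is a minimal stand-in: gentle boosting with regression stumps on a 1-D toy problem (the data, names and number of rounds are all illustrative):

```python
import numpy as np

def fit_stump(x, z, w):
    """Weighted least-squares regression stump: h(x) = a if x > theta else b."""
    best = None
    for theta in np.unique(x):
        gt = x > theta
        a = np.average(z[gt], weights=w[gt]) if gt.any() else 0.0
        b = np.average(z[~gt], weights=w[~gt]) if (~gt).any() else 0.0
        err = np.sum(w * (z - np.where(gt, a, b)) ** 2)
        if best is None or err < best[0]:
            best = (err, theta, a, b)
    return best[1:]                      # (theta, a, b)

def gentle_boost(x, z, rounds=50):
    """Gentle boosting: H(x) = sum_m h_m(x); weights w_i = exp(-z_i * H(x_i))."""
    H = np.zeros_like(x, dtype=float)
    for _ in range(rounds):
        w = np.exp(-z * H)               # upweight examples the classifier gets wrong
        theta, a, b = fit_stump(x, z, w)
        H += np.where(x > theta, a, b)   # add the new weak learner
    return H

# Toy 1-D problem: class +1 near the origin, class -1 elsewhere.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
z = np.where(np.abs(x) < 1, 1.0, -1.0)
H = gentle_boost(x, z)
print("training accuracy:", np.mean(np.sign(H) == z))
```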


Joint boosting, outside of the context of images. Additive models for classification

H(v, c) = Σm hm(v, c)

where c = 1…C indexes the classes, v are the feature responses, and each training example carries a +1/−1 membership label for each class c.


Feature sharing in additive models

  • Simple to have sharing between additive models
  • Each term hm can be mapped to a single feature

Example: H1 = G1,2 + G1 and H2 = G1,2 + G2, where G1,2 groups the terms shared by both classifiers and G1, G2 group the class-specific terms.


Flavors of boosting

  • Different boosting algorithms use different loss functions or minimization procedures (Freund & Schapire, 1995; Friedman, Hastie, & Tibshirani, 1998).
  • We base our approach on gentle boosting: it learns faster than the others (Friedman, Hastie, & Tibshirani, 1998; Lienhart, Kuranov, & Pisarevsky, 2003).


Joint Boosting

At each boosting round, we add a function hm(v, c) to each classifier:

H(v, c) := H(v, c) + hm(v, c)

We use the exponential multi-class cost function

J = Σc Σi exp( −z_i^c H(v_i, c) )

where c runs over the classes, z_i^c ∈ {+1, −1} is the membership of example i in class c, and H(v_i, c) is the classifier output for class c.


Newton’s method

Treat hm as a perturbation, and expand the loss J to second order in hm. Evaluating J on the classifier with the perturbation, H + hm, and keeping terms to second order turns each boosting round into a weighted squared error problem:

Jwse = Σc Σi w_i^c ( z_i^c − hm(v_i, c) )²,  with reweighting w_i^c = exp( −z_i^c H(v_i, c) )


Joint Boosting

Weighted squared error over the training data:

Jwse = Σc Σi w_i^c ( z_i^c − hm(v_i, c) )²


For a trial sharing pattern, set weak learner parameters to optimize overall classification

Given a sharing pattern, the decision stump parameters are obtained analytically.

[Figure: the shared weak learner is a regression stump on one feature output v^f: for classes in the sharing set, hm(v, c) takes the value a + b when v^f > θ and b otherwise; classes outside the set receive a class-specific constant.]
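A minimal sketch of this analytic fit (variable names are mine; the stump form for the sharing classes and the class constants k^c for the others follow the slides and the CVPR 2004 paper cited below):

```python
import numpy as np

def fit_shared_stump(v_f, z, w, share, theta):
    """Closed-form weighted-least-squares parameters of a shared regression stump.

    v_f:   (N,) responses of one feature over N examples
    z:     (N, C) class membership labels in {+1, -1}
    w:     (N, C) boosting weights w_i^c
    share: list of class indices that share this feature
    theta: stump threshold on v_f
    """
    gt = v_f > theta
    ws, zs = w[:, share], z[:, share]
    # (a + b) = weighted mean of z over shared classes where v_f > theta;
    #  b      = weighted mean of z over shared classes where v_f <= theta.
    a_plus_b = np.sum(ws[gt] * zs[gt]) / (np.sum(ws[gt]) + 1e-12)
    b = np.sum(ws[~gt] * zs[~gt]) / (np.sum(ws[~gt]) + 1e-12)
    a = a_plus_b - b
    # Classes outside the sharing set get a constant k^c (their weighted mean label).
    rest = [c for c in range(z.shape[1]) if c not in share]
    k = {c: np.sum(w[:, c] * z[:, c]) / (np.sum(w[:, c]) + 1e-12) for c in rest}
    return a, b, k

def stump_error(v_f, z, w, share, theta):
    """Weighted squared error of the fitted stump; used to pick theta and the sharing set."""
    a, b, k = fit_shared_stump(v_f, z, w, share, theta)
    h = np.empty_like(z, dtype=float)
    for c in range(z.shape[1]):
        h[:, c] = a * (v_f > theta) + b if c in share else k[c]
    return np.sum(w * (z - h) ** 2)
```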


Joint Boosting: select sharing pattern and weak learner to minimize cost

[Figure: response histograms for background (blue) and class members (red), with the fitted stump hm(v, c) overlaid for the sharing classes and the class-specific constants k^c for the others.]

The constants k prevent sharing features just due to an asymmetry between positive and negative examples for each class. They only appear during training.

Algorithm details in CVPR 2004, Torralba, Murphy & Freeman.


Approximate best sharing

  • But this requires exploring 2^C − 1 possible sharing patterns.
  • Instead we use a first-best (greedy) search, sketched in code after this list:
    • S = []
    • 1) Fit stumps for each class independently.
    • 2) Take the best class ci and add it to the set: S = [S ci]. Then fit stumps for [S ci] for each class ci not yet in S.
    • Repeat step 2 until length(S) = Nclasses.
    • 3) Select the sharing pattern with the smallest WLS error.
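A minimal sketch of that first-best search, reusing the hypothetical stump_error helper from the earlier sketch (this is my illustration of the procedure, not the authors' code):

```python
import numpy as np

def best_sharing_pattern(v_f, z, w, theta, n_classes):
    """Greedy (first-best) search over sharing patterns for one feature and threshold.

    Instead of trying all 2^C - 1 subsets, grow the sharing set one class at a
    time, always adding the class that gives the lowest weighted squared error,
    and return the best subset seen along the way.
    """
    S, remaining = [], list(range(n_classes))
    best_subset, best_err = None, np.inf
    while remaining:
        # Try adding each remaining class to the current sharing set S.
        err, c_best = min((stump_error(v_f, z, w, S + [c], theta), c)
                          for c in remaining)
        S.append(c_best)
        remaining.remove(c_best)
        if err < best_err:                    # keep the best subset seen so far
            best_subset, best_err = list(S), err
    return best_subset, best_err
```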



Effect of pattern of feature sharing on number of features required (synthetic example)





2-d synthetic example

3 classes + 1 background class



No feature sharing

Three one-vs-all binary classifiers

This is with only 8 separation lines



With feature sharing

Some lines can be shared across classes.

This is with only 8 separation lines



The shared features



Comparison of the classifiers

Shared features: note better isolation of individual classes.

Non-shared features.


Now, apply this to images. Image features (weak learners)

[Figure: each weak learner is built from a 12x12 patch gf(x), cut from a 32x32 training image of an object and normalized to mean = 0 and energy = 1, plus a binary mask wf(x) recording the location of that patch within the 32x32 object. The feature output combines the response to the patch template with the position mask.]
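A rough sketch of computing one such feature for a 32x32 window, under my assumption that the window's response to the patch template is averaged inside the position mask (the helper name and the exact pooling are illustrative):

```python
import numpy as np
from scipy.signal import correlate2d

def patch_feature(window, template, mask):
    """Feature output for one (template, mask) pair on a 32x32 window.

    window:   (32, 32) grayscale image window
    template: (12, 12) patch, already normalized to mean 0 and unit energy
    mask:     (32, 32) binary position mask for where the patch may appear
    """
    # Response of the window to the patch template (same size as the window).
    response = np.abs(correlate2d(window, template, mode="same", boundary="symm"))
    # Pool (average) the response over the region allowed by the position mask.
    return response[mask].mean()

# Toy usage with random data (illustrative only).
rng = np.random.default_rng(0)
window = rng.normal(size=(32, 32))
template = rng.normal(size=(12, 12))
template -= template.mean()
template /= np.linalg.norm(template)
mask = np.zeros((32, 32), dtype=bool)
mask[8:20, 8:20] = True
print(patch_feature(window, template, mask))
```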


The candidate features

[Figure: examples of candidate features, each shown as a patch template and its position mask.]


The candidate features

[Figure: more candidate patch templates and position masks.]

Dictionary of 2000 candidate patches and position masks, randomly sampled from the training images.
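A minimal sketch of building such a dictionary (my own illustration; it assumes each mask marks where the sampled patch sat inside the 32x32 object window):

```python
import numpy as np

def sample_candidate_features(train_images, n_features=2000,
                              patch_size=12, obj_size=32, seed=0):
    """Randomly sample (patch template, position mask) pairs from 32x32 training images."""
    rng = np.random.default_rng(seed)
    dictionary = []
    for _ in range(n_features):
        img = train_images[rng.integers(len(train_images))]
        y = rng.integers(obj_size - patch_size + 1)
        x = rng.integers(obj_size - patch_size + 1)
        patch = img[y:y + patch_size, x:x + patch_size].astype(float)
        patch -= patch.mean()                            # mean = 0
        patch /= np.linalg.norm(patch) + 1e-8            # energy = 1
        mask = np.zeros((obj_size, obj_size), dtype=bool)
        mask[y:y + patch_size, x:x + patch_size] = True  # location of the patch
        dictionary.append((patch, mask))
    return dictionary
```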



Database of 2500 images

Annotated instances


Multiclass object detection

[Figure: the 21 object classes used.]

We use [20, 50] training samples per object, and about 20 times as many background examples as object examples.



Feature sharing at each boosting round during training





Example shared feature (weak classifier)

[Figure: response histograms for background (blue) and class members (red).]

At each round of running joint boosting on the training set we get a feature and a sharing pattern.


Sharing features for multi-class object detection

[Figure: an example shared feature and an example non-shared (class-specific) feature.]





How the features were shared across objects (features sorted left-to-right from generic to specific)


Performance evaluation

[Figure: ROC curve plotting correct detection rate against false alarm rate; performance is summarized by the area under the ROC (the curve shown has area 0.9).]
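For concreteness, a small sketch of computing this summary from detector scores; the use of scikit-learn here is my choice, not something the talk specifies:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical detector scores H(v, c) for one class: object windows vs. background.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1.0, 1.0, 100),     # object windows
                         rng.normal(-1.0, 1.0, 2000)])  # background windows
labels = np.concatenate([np.ones(100), np.zeros(2000)])

auc = roc_auc_score(labels, scores)        # area under the ROC
fpr, tpr, _ = roc_curve(labels, scores)    # false alarm rate vs. detection rate
print(f"area under ROC: {auc:.2f}")
```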



Performance improvement over training

Significant benefit to sharing features using joint boosting.



ROC curves for our 21 object database

  • How will this work under training-starved or feature-starved conditions?

  • Presumably, in the real world, we will always be starved for training data and for features.


70 features, 20 training examples (left)

[Figure: per-object ROC curves, shared features vs. non-shared features.]


15 features, 20 training examples (middle)

[Figure: per-object ROC curves, shared features vs. non-shared features, alongside the previous panel (70 features, 20 training examples).]


15 features, 2 training examples (right)

[Figure: per-object ROC curves, shared features vs. non-shared features, for all three conditions: 70 features / 20 examples (left), 15 features / 20 examples (middle), 15 features / 2 examples (right).]



Scaling

Joint Boosting shows sub-linear scaling of features with objects (for area under ROC = 0.9).

Results averaged over 8 training sets, and different combinations of objects. Error bars show variability.


Sharing features for multi-class object detection

[Figure: red: shared features; blue: independent features.]





Examples of correct detections



What makes good features?

  • Depends on whether we are doing single-class or multi-class detection…


Generic vs. specific features

[Figure: parts derived from training a binary classifier, and parts derived from training a joint classifier with 20 more objects.]

In both cases, ~100% detection rate with 0 false alarms.



Qualitative comparison of features, for single-class and multi-class detectors


Multi-view object detection: train for object and orientation

Sharing features is a natural approach to view-invariant object detection.

[Figure: some features are view-invariant (shared across views); others are view-specific.]



Multi-view object detection

Sharing is not a tree. Depends also on 3D symmetries.



Multi-view object detection



Multi-view object detection

Strong learner H response for car as function of assumed view angle



Visual summary…


Sharing features for multi-class object detection

[Figure: a layer of shared features connected to a layer of object units.]



Summary

  • Argued that feature sharing will be an essential part of scaling up object detection to hundreds or thousands of objects (and viewpoints).

  • We introduced joint boosting, a generalization of boosting that incorporates feature sharing in a natural way.

  • Initial results (up to 30 objects) show the desired scaling behavior for # features vs # objects.

  • The shared features are observed to generalize better, allowing learning from fewer examples, using fewer features.


Sharing features for multi-class object detection

end

