Shared Features and Joint Boosting
Sharing visual features for multiclass and multiview object detection. A. Torralba, K. P. Murphy and W. T. Freeman. PAMI, vol. 29, no. 5, pp. 854-869, May 2007.
Presenter: Yuandong Tian
Outline: motivation for choosing this paper; motivation of the paper itself.
Computer vision is hard.
Equally smart people are distributed evenly over time.
So if computer vision could not be solved in the past 30 years, will it ever be solved?
Yes, because we are standing on the shoulders of giants.
What I believe
Why does ML seem not to help much in CV (at least for now)?
My answer: CV and ML are only weakly coupled.
Why do we use feature A instead of feature B?
A1: Feature A gives better performance.
A2: Feature A has some special properties: the following step requires the feature to have a certain property that only A has.
A2 is a strongly-coupled answer.
A typical pipeline: preprocessing steps ("computer vision") feed into an ML black box.
The preprocessing steps carry domain-specific structure; the ML component is designed for generic structure.
Concept of feature sharing
A class-specific feature can reach 100% accuracy for a single object, but it is too specific to reuse.
A shared feature has weaker discriminative power on its own, but it can be used by many classes.
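To decide which classes should share a given weak learner, the paper uses a greedy forward search over class subsets rather than trying all 2^C possibilities. A minimal sketch of that search; the `cost` callback, standing in for the weighted error of fitting a shared stump, is an assumption of this example:

```python
def greedy_subset(classes, cost):
    """Greedy forward search over class subsets, as in joint boosting:
    instead of evaluating every one of the 2^C subsets that could
    share a weak learner, repeatedly add the class that most lowers
    the joint cost, stopping when no addition helps.
    This needs only O(C^2) cost evaluations."""
    chosen, rest = [], list(classes)
    while rest:
        # best class to add next, given what is already shared
        cand = min(rest, key=lambda c: cost(chosen + [c]))
        if chosen and cost(chosen + [cand]) >= cost(chosen):
            break  # no class improves the joint cost any further
        chosen.append(cand)
        rest.remove(cand)
    return chosen
```

With a cost that rewards exactly the subset {0, 1}, the search recovers it without enumerating all subsets.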
Joint boosting minimizes an exponential loss with respect to the strong classifier H.
Each weak learner is a decision stump: it makes its decision on a single feature dimension, and it is the quantity optimized in the current iteration.
The sum of the weak classifiers gives a strong classifier: H(v) = sum_m h_m(v).
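Per class, the boosting loop can be sketched in a GentleBoost style: sample weights follow the exponential loss, and each round fits a regression stump on a single feature dimension. This is a minimal single-class sketch; the function names and the brute-force stump search are mine, not the paper's:

```python
import numpy as np

def fit_stump(X, y, w):
    """Fit a regression stump h(x) = a*[x_f > t] + b on one feature
    dimension, minimizing the weighted squared error."""
    best = None
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            above = X[:, f] > t
            wa, wb = w[above].sum(), w[~above].sum()
            if wa == 0 or wb == 0:
                continue  # degenerate split
            b = (w[~above] * y[~above]).sum() / wb
            a = (w[above] * y[above]).sum() / wa - b
            err = (w * (y - (b + a * above)) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, f, t, a, b)
    return best[1:]

def gentle_boost(X, y, rounds=20):
    """y in {-1, +1}. Each round reweights samples by the exponential
    loss exp(-y * H(x)) and adds one stump to the strong classifier H."""
    H = np.zeros(len(y))
    stumps = []
    for _ in range(rounds):
        w = np.exp(-y * H)
        w /= w.sum()
        f, t, a, b = fit_stump(X, y, w)
        H += b + a * (X[:, f] > t)
        stumps.append((f, t, a, b))
    return stumps, np.sign(H)
```

In the joint version, one stump is fit jointly over the chosen subset of classes, with per-class constant offsets for the classes outside the subset.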
[Figure: sequence of feature indices selected across boosting rounds: 1 3 4 5 2 1 4 6 2 7 3.]
Experimental setup: 29 objects, results averaged over 20 training sets.
Dictionary of 2000 candidate patches and position masks, randomly sampled from the training images.
50 samples per class; chance rate = 0.02.
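Each dictionary entry pairs an image patch with a position mask, and the feature is, roughly, the patch's correlation response pooled over the mask. A rough sketch of one such feature using plain cross-correlation (the paper normalizes the correlation and applies a per-feature exponent, both omitted here):

```python
import numpy as np

def xcorr(image, patch):
    """Valid-mode 2-D cross-correlation of a patch with an image."""
    H, W = image.shape
    h, w = patch.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + h, j:j + w] * patch).sum()
    return out

def patch_feature(image, patch, mask):
    """One dictionary feature: the maximum patch response inside the
    position mask (cropping the mask to the valid correlation map
    is a simplifying assumption of this sketch)."""
    r = xcorr(image, patch)
    m = mask[:r.shape[0], :r.shape[1]]
    return (r * m).max()
```

A real implementation would use an FFT-based or library correlation rather than the explicit double loop.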
L1 = 0.2856 (0.1868 in 50/3)
L2 = 0.2022
Chi-square = 0.2596

The accuracy of NN under the same distance metrics:
L1 = 0.1656 (0.1868 in 50/3)
L2 = 0.1235
Chi-square = 0.1623
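The three distance metrics compared above can be sketched as follows; the chi-square form with a small epsilon guard is one common convention for histogram features, an assumption here:

```python
import numpy as np

def l1(a, b):
    """L1 (city-block) distance."""
    return np.abs(a - b).sum()

def l2(a, b):
    """L2 (Euclidean) distance."""
    return np.sqrt(((a - b) ** 2).sum())

def chi_square(a, b):
    """Chi-square histogram distance; epsilon guards empty bins."""
    return 0.5 * (((a - b) ** 2) / (a + b + 1e-12)).sum()

def nn_classify(x, train_X, train_y, dist):
    """1-nearest-neighbour: return the label of the closest sample."""
    d = [dist(x, t) for t in train_X]
    return train_y[int(np.argmin(d))]
```

Swapping `dist` between the three functions reproduces the kind of metric comparison reported above.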
Yet the improvement is smaller (~60%) compared to the previous cases.
The performance of single-class (non-shared) boosting is about the same as NN.