Efficient Discriminative Learning of Parts-based Models

Aim:To efficiently learn parts-based models which discriminate between positive and negative poses of the object category

ISVMs run for twice as long

Efficient Reformulation

Results - Sign Language

Exponential in |V|

wT(f-ij) + ≤ -1 + -i, for all j

100 training images, 95 test images

b

Linear in |V|

Miba(k) ≥ wbb(l), for all l

Parts-based Model

ISVM-1

a

G = (V, E)

Restricted to Tree

Miba(k) ≥ wbb(l) + wab,

for all (k,l) Lab

Linear in h

f : V Pose of V (h values)

Linear in |Lab|

Q(f) = ∑ Qa(f(a)) + ∑ Qab(f(a), f(b))

ISVM-2

waa(k) + ∑b Miba(k) + ≤ -1 + -i

Qa(f(a)) : Unary potential for f(a)

Computed using features

Miba(k) analogous to messages in Belief Propagation (BP)

Efficient BP using distance transform: Felzenszwalb and Huttenlocher, 2004

Our

Qab(f(a), f(b)): Pairwise potential

for validity of (f(a),f(b))

Solving the Dual

Restricted to Potts

b

b

max abT1 - abTKabab

max baT1 - baTKbaba

Our: 86.4% Buehler et al.,2008: 87.7%

The Learning Problem

s.t. abTy = 0, ab ≥ 0

s.t. baTy = 0, ba ≥ 0

Qa(f(a)) : waT(f(a))

Qa(f(a),f(b)) : wabT(f(a),f(b))

Q(f) : wT(f)

a

a

0 ≤ ∑ iab(k) + ∑ iab(k,l)

0 ≤ ∑ iba(k) + ∑ iba(k,l)

min ||w||+ C∑i

= 1, if (f(a),f(b)) Lab,

= 0, otherwise.

(f(a),f(b))

∑ iab(k) + ∑ iab(k,l) ≤ C

∑ iba(k) + ∑ iba(k,l) ≤ C

wT(f+i) + ≥ 1 - +i

Results - Buffy

Problem (1)

Problem (2)

Maximize margin, minimize hinge loss

wT(f-ij) + ≤ -1 + -i

Problem (1) learns the unary weight vector wa and pairwise weight wab

High energy for all positive examples

196 training images, 204 test images

For all j (exponential in |V|)

Low energy for all negative examples

Problem (2) learns the unary weight vector wb and pairwise weight wab

Related Work

Constraint (3)

∑kiab(k) = ∑ liba(l)

Results in a large minimal problem

ISVM-1

Dual Decomposition

min ∑ i gi(x), subject to x P

Local Iterative Support Vector Machine (ISVM-1)

- Start with a small subset of negative examples (1 per image)

max min ∑ gi(xi) + i(xi - x), s.t. xi P

min ∑ gi(xi), s.t. xi P, xi = x

ISVM-2

Project

Solve min ∑ gi(xi) + ixi

KKT Condition: ∑ i = 0

i = i +xi*

- Replace negative examples with current MAP estimates

- Converges to local optimum

Master

Update Lagrange multiplier of (3)

minimal

problem

size = 2

Global Iterative Support Vector Machine (ISVM-2)

Our

SVM-like

problems

- Start with a small subset of negative examples (1 per image)

Modified SVMLight

Problem(1)

Problem (2)

Our: 39.2% Ferrari et al.,2008: 41.0%

- Add current MAP estimates to set of negative examples

Implementation Details

- Converges to global optimum

Features

Shape: HOG Appearance: (x,x2), x = fraction of skin pixels

Data

Positive examples: Provided by user Negative examples: All other poses

Drawback: Requires obtaining MAP estimate of each image at each iteration (computationally expensive)

Occlusion

Each putative pose can be occluded (twice the number of labels)