
Tractable Higher Order Models in Computer Vision (Part II)






Presented by Xiaodan Liang

Slides from Carsten Rother, Sebastian Nowozin, Pushmeet Kohli

Microsoft Research Cambridge

• Submodularity

• Move making algorithms

• Higher-order model: Pⁿ Potts model

Example: feature selection for diagnosis. Which subset of observations should we select? The problem is inherently combinatorial!

Selection A = {}   vs.   Selection B = {X2, X3}

[Figure: Naïve Bayes model predicting Y = “Sick” from features X1 = “Fever”, X2 = “Rash”, X3 = “Male”.]

Theorem [Krause, Guestrin UAI ‘05]: Information gain F(A) in Naïve Bayes models is submodular!

Submodularity (diminishing returns): adding a new feature s (here X1) to a small selection A gives a large improvement; adding it to a larger selection B ⊇ A gives a small improvement:

F(A ∪ {s}) − F(A) ≥ F(B ∪ {s}) − F(B)

Why is submodularity useful?

Theorem [Nemhauser et al ’78]: for monotone submodular F, the greedy maximization algorithm returns A_greedy with

F(A_greedy) ≥ (1 − 1/e) · max{ F(A) : |A| ≤ k }

• Greedy algorithm gives a near-optimal solution!

• For info-gain: guarantees the best possible approximation unless P = NP! [Krause, Guestrin UAI ’05]
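As a concrete illustration, here is a minimal sketch of that greedy algorithm in Python; the coverage function and ground set are made-up examples, not from the slides.

```python
# Greedy maximization of a monotone submodular F under |A| <= k.
# By the Nemhauser et al. '78 theorem this returns A_greedy with
# F(A_greedy) >= (1 - 1/e) * max_{|A| <= k} F(A).

def greedy_maximize(F, V, k):
    A = set()
    for _ in range(k):
        # pick the element with the largest marginal gain F(A + {s}) - F(A)
        s = max(V - A, key=lambda s: F(A | {s}) - F(A))
        A.add(s)
    return A

# Illustrative coverage function (monotone submodular): how many items
# the chosen regions cover.
regions = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
F = lambda A: len(set().union(*[regions[i] for i in A]))
print(greedy_maximize(F, set(regions), 2))   # {1, 3}: covers 5 items
```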

• Many ML problems are submodular, i.e., involve a submodular F that we need to optimize:

• Minimization: A* = argmin F(A)

• Structure learning (A* = argmin I(X_A ; X_V\A))

• Clustering

• MAP inference in Markov Random Fields

• Maximization: A* = argmax F(A)

• Feature selection

• Active learning

• Ranking

Submodular set functions

• A set function F on V is called submodular if, for all A, B ⊆ V:

F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B)

• Equivalent diminishing returns characterization: for all A ⊆ B ⊆ V and s ∉ B,

F(A ∪ {s}) − F(A) ≥ F(B ∪ {s}) − F(B)

(adding s to the small set A gives a large improvement; adding s to the large set B gives a small improvement)
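To make the definition concrete, here is a brute-force check of the defining inequality over all pairs of subsets (only feasible for tiny ground sets); the helper names and the coverage example are mine, for illustration only.

```python
from itertools import chain, combinations

def subsets(V):
    """All subsets of V, as frozensets."""
    return [frozenset(c) for c in
            chain.from_iterable(combinations(V, r) for r in range(len(V) + 1))]

def is_submodular(F, V):
    """Check F(A) + F(B) >= F(A u B) + F(A n B) for all A, B."""
    S = subsets(V)
    return all(F(A) + F(B) >= F(A | B) + F(A & B) for A in S for B in S)

regions = {1: {"a", "b"}, 2: {"b", "c"}}
cover = lambda A: len(set().union(*[regions[i] for i in A]))
print(is_submodular(cover, {1, 2}))                   # True: coverage
print(is_submodular(lambda A: len(A) ** 2, {1, 2}))   # False: supermodular
```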

Submodularity and supermodularity

F1,…,Fm submodular functions on V and λ1,…,λm > 0.
Then F(A) = Σ_i λi Fi(A) is submodular!

Submodularity is closed under nonnegative linear combinations!

Extremely useful fact!!

• F_θ(A) submodular for each θ ⇒ Σ_θ P(θ) F_θ(A) submodular!

• Multicriterion optimization: F1,…,Fm submodular, λi ≥ 0 ⇒ Σ_i λi Fi(A) submodular
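Reusing the is_submodular checker and cover function from the sketch above, the closedness fact can be sanity-checked on a small example; the coefficients and F2 are illustrative choices.

```python
# Nonnegative combinations of submodular functions stay submodular.
F1 = cover                                   # coverage, submodular
F2 = lambda A: len(A)                        # modular, hence submodular
combo = lambda A: 3.0 * F1(A) + 0.5 * F2(A)  # lambda_1, lambda_2 > 0
print(is_submodular(combo, {1, 2}))          # True
```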

[Plot: a concave function g(|A|) against |A|; F(A) = g(|A|) is submodular when g is concave.]

Suppose F1(A) and F2(A) are submodular. Is F(A) = max(F1(A), F2(A)) submodular?

[Plot: F1(A) and F2(A) as functions of |A|; F(A) = max(F1(A), F2(A)) traces their upper envelope.]

max(F1,F2) not submodular in general!

Well, maybe F(A) = min(F1(A), F2(A)) instead?

Counterexample: F({b}) − F(∅) = 0 < F({a,b}) − F({a}) = 1, violating diminishing returns.

min(F1,F2) not submodular in general!
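The counterexample can be verified with the same checker; the specific F1, F2 below are an assumed reconstruction consistent with the inequality on the slide (both are modular, hence submodular).

```python
F1 = lambda A: len(A & {"a"})        # modular
F2 = lambda A: len(A & {"b"})        # modular
Fmin = lambda A: min(F1(A), F2(A))   # Fmin({b}) - Fmin({}) = 0, but
                                     # Fmin({a,b}) - Fmin({a}) = 1
print(is_submodular(Fmin, {"a", "b"}))   # False
```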

But stay tuned

The submodular polyhedron P_F

P_F = { x ∈ ℝⁿ : x(A) ≤ F(A) for all A ⊆ V }, where x(A) = Σ_{s ∈ A} x_s

Example: V = {a,b}

x({a}) ≤ F({a})
x({b}) ≤ F({b})
x({a,b}) ≤ F({a,b})

[Figure: P_F drawn in the (x_{a}, x_{b}) plane, axes from −2 to 2.]

Lovász extension

g(w) = max { wᵀx : x ∈ P_F }

Example (V = {a,b} as above):

[Figure: the (w_{a}, w_{b}) plane divided into regions {}, {a}, {b}, {a,b}; the maximizing corners of P_F are x = [−2, 2] (region {b}) and x = [−1, 1] (region {a,b}).]

g([0,1]) = [0,1]ᵀ[−2,2] = 2 = F({b})
g([1,1]) = [1,1]ᵀ[−1,1] = 0 = F({a,b})

Evaluating g(w) for w = [0,1] by the greedy ordering e1 = b, e2 = a (since w(e1) = 1 > w(e2) = 0):

x_w(e1) = F({b}) − F(∅) = 2
x_w(e2) = F({b,a}) − F({b}) = −2
⇒ x_w = [−2, 2], so g(w) = wᵀx_w = 2
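The greedy evaluation just walked through is Edmonds' greedy algorithm in general. A minimal sketch, assuming F is given as a Python function on sets; the F values below are inferred from the slide's corners.

```python
def lovasz_extension(F, V, w):
    """Evaluate g(w) = max { w^T x : x in P_F } via the greedy ordering:
    sort elements by decreasing w, pay each its marginal gain."""
    g, prefix = 0.0, set()
    for e in sorted(V, key=lambda e: -w[e]):   # w(e1) >= w(e2) >= ...
        gain = F(prefix | {e}) - F(prefix)     # x_w(e) = F(S + e) - F(S)
        g += w[e] * gain
        prefix.add(e)
    return g

# The running example: F({}) = 0, F({a}) = -1, F({b}) = 2, F({a,b}) = 0.
values = {frozenset(): 0, frozenset("a"): -1,
          frozenset("b"): 2, frozenset("ab"): 0}
F = lambda A: values[frozenset(A)]
print(lovasz_extension(F, "ab", {"a": 0, "b": 1}))   # 2.0 = F({b})
print(lovasz_extension(F, "ab", {"a": 1, "b": 1}))   # 0.0 = F({a,b})
```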

Theorem [Lovász ’83]: g(w) attains its minimum in [0,1]ⁿ at a corner!

If we can minimize g on [0,1]ⁿ, we can minimize F… (at corners, g and F take the same values)

F(A) submodular ⇒ g(w) convex (and efficient to evaluate)

Does the converse also hold? No: having some convex extension does not imply submodularity. Consider g(w1,w2,w3) = max(w1, w2+w3); it is convex, but the set function it induces on the corners over {a}, {b}, {c} satisfies

F({a,b}) − F({a}) = 0 < F({a,b,c}) − F({a,c}) = 1

violating diminishing returns.

Minimizing g is a convex program, solvable e.g. with the ellipsoid algorithm or interior point algorithms, which yields polynomial-time submodular minimization.

Example: Image denoising

[Figure: 3×3 grid MRF; hidden pixels Y1…Y9 connected in a 4-neighbourhood, each Yi also connected to its observation Xi.]

Pairwise Markov Random Field:

P(x1,…,xn, y1,…,yn) = Π_{i,j} ψ_{i,j}(y_i, y_j) · Π_i φ_i(x_i, y_i)

X_i: noisy pixels; Y_i: “true” pixels

Want: argmax_y P(y | x) = argmax_y log P(x, y) = argmin_y Σ_{i,j} E_{i,j}(y_i, y_j) + Σ_i E_i(y_i)

where E_{i,j}(y_i, y_j) = −log ψ_{i,j}(y_i, y_j)

When is this MAP inference efficiently solvable (in high-treewidth graphical models)?

MAP inference in Markov Random Fields [Kolmogorov et al., PAMI ’04; see also Hammer, Ops Res ’65]: for binary labels, the energy can be minimized exactly by st-mincut precisely when every pairwise term is submodular, i.e. E_{i,j}(0,0) + E_{i,j}(1,1) ≤ E_{i,j}(0,1) + E_{i,j}(1,0).
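A minimal sketch of that reduction for a binary pairwise energy, using networkx's min-cut; the Potts-style energy and the tiny 1-D denoising example are assumptions for illustration.

```python
# MAP inference by st-mincut for a binary pairwise energy
# E(y) = sum_i u_i(y_i) + sum_{ij} w_ij * [y_i != y_j], w_ij >= 0
# (submodular pairwise terms). Requires networkx.
import networkx as nx

def map_by_graph_cut(unary, edges):
    """unary[i] = (u_i(0), u_i(1)); edges[(i, j)] = w_ij >= 0."""
    G = nx.DiGraph()
    for i, (u0, u1) in unary.items():
        G.add_edge("s", i, capacity=u1)   # cut s->i  <=>  y_i = 1
        G.add_edge(i, "t", capacity=u0)   # cut i->t  <=>  y_i = 0
    for (i, j), w in edges.items():
        G.add_edge(i, j, capacity=w)      # pay w_ij iff labels disagree
        G.add_edge(j, i, capacity=w)
    cut_value, (S, _) = nx.minimum_cut(G, "s", "t")
    return {i: 0 if i in S else 1 for i in unary}, cut_value

# Tiny 1-D denoising: observations [0, 1, 0], unary cost |y_i - x_i|,
# smoothness weight 2 -> the lone flipped pixel gets smoothed away.
unary = {0: (0.0, 1.0), 1: (1.0, 0.0), 2: (0.0, 1.0)}
edges = {(0, 1): 2.0, (1, 2): 2.0}
print(map_by_graph_cut(unary, edges))   # ({0: 0, 1: 0, 2: 0}, 1.0)
```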

• Submodularity

• Move making algorithms

• Higher-order model: Pⁿ Potts model

Expansion and swap moves for this problem:

• If the pairwise potential functions define a metric, then the energy function in equation (8) can be approximately minimized using α-expansions.

• If the pairwise potential functions define a semi-metric, it can be approximately minimized using αβ-swaps.

• Each move is encoded by a binary vector t.

• A transformation function: x′ = T(x, t) maps the current labeling x and the move t to a new labeling.

• The energy of a move t: E_m(t) = E(T(x, t)).

• The optimal move: t* = argmin_t E_m(t).

Submodular set functions play an important role in energy minimization as they can be minimized in polynomial time
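To connect the pieces, here is a sketch of the move-making outer loop; the per-move binary subproblem is abstracted as a hypothetical best_move solver (in practice an st-mincut when the move energy is submodular).

```python
def expansion_transform(x, t, alpha):
    """Transformation function T(x, t): pixel i switches to label alpha
    where the binary move vector has t[i] == 1; otherwise keeps x[i]."""
    return [alpha if ti == 1 else xi for xi, ti in zip(x, t)]

def alpha_expansion(E, x, labels, best_move):
    """Iterate expansion moves until no move decreases the energy E.
    `best_move(E, x, alpha)` is a hypothetical solver returning the
    optimal binary move vector t* = argmin_t E(T(x, t))."""
    improved = True
    while improved:
        improved = False
        for alpha in labels:
            t = best_move(E, x, alpha)
            x_new = expansion_transform(x, t, alpha)
            if E(x_new) < E(x):
                x, improved = x_new, True
    return x
```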

• The class of higher-order clique potentials for which the expansion and swap moves can be computed in polynomial time. The clique potentials take the form given below.

Can my higher order potential be solved using α-expansions?

• Form of the higher-order potentials, over a clique c = {i, j, k, l, m, …}:

Sum form: ψ_c(x_c) = f ( Σ_{i,j ∈ c} φ(x_i, x_j) )

Max form: ψ_c(x_c) = f ( max_{i,j ∈ c} φ(x_i, x_j) )

f: clique inconsistency function; φ: pairwise potential.

• The move energy is always submodular if f is non-decreasing and concave.

Proof sketch (f a concave function):

• All projections onto two variables of any αβ-swap move energy are submodular.

• The cost of any configuration satisfies Constraints 1 and 2 (Lemma 1), so the theorem holds.


• Submodularity

• Move making algorithms

• Higher-order model: Pⁿ Potts model

Segmentation energy (n = number of pixels):

E: {0,1}ⁿ → ℝ, with 0 → fg, 1 → bg

E(x) = Σ_i c_i x_i + Σ_{i,j} d_{ij} |x_i − x_j|

[Figure: image, unary cost, segmentation]

[Boykov and Jolly ’01] [Blake et al. ’04] [Rother et al. ’04]

Pⁿ Potts potentials

Patch dictionary (tree):

h(X_p) = 0 if x_i = 0 for all i ∈ p; C_max otherwise

[Figure: patch p in the grid; cost C_max vs. 0]

• [slide credits: Kohli]

Pⁿ Potts potentials (n = number of pixels):

E: {0,1}ⁿ → ℝ, with 0 → fg, 1 → bg

E(x) = Σ_i c_i x_i + Σ_{i,j} d_{ij} |x_i − x_j| + Σ_p h_p(X_p)

h(X_p) = 0 if x_i = 0 for all i ∈ p; C_max otherwise

• [slide credits: Kohli]
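A direct transcription of this patch potential as code; the label and C_max values are illustrative assumptions.

```python
def pn_potts(x_patch, label=0, c_max=10.0):
    """h(X_p) = 0 if every pixel in the patch takes `label`, else C_max."""
    return 0.0 if all(xi == label for xi in x_patch) else c_max

print(pn_potts([0, 0, 0]))   # 0.0  -> consistent patch
print(pn_potts([0, 1, 0]))   # 10.0 -> inconsistent patch pays C_max
```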

• The move energy is always submodular if f is increasing and linear. See the paper for proofs.

Pⁿ Potts model

ψ_c(x_c) = γ_k if x_i = l_k for all i ∈ c; γ_max otherwise

[Figure: clique c with all pixels taking the same label, cost γ_k; mixed labels, cost γ_max]

Optimal moves for Pⁿ Potts

• Computing the optimal swap move over labels a (label 1) and b (label 2).

Case 1: not all variables in clique c are assigned label 1 or 2. The move energy is independent of t_c, and the clique can be ignored.

[Figure: clique c containing pixels with labels 3 and 4]

Case 2: all variables in clique c are assigned label 1 (a) or label 2 (b). The move energy now depends on t_c, and the optimal move can be found by solving an st-mincut problem.

[Figure: clique c with every pixel labeled a or b]

We can add a constant K to all possible values of the clique potential without changing the optimal move; this transformation does not affect the solution.

• Computing the optimal swap move: build a graph with a Source, a Sink, one node v_i per clique variable, and auxiliary nodes M_s and M_t.

v_i in the source set ⇔ t_i = 0; v_j in the sink set ⇔ t_j = 1.

[Figure: Source, M_s, nodes v_1, v_2, …, v_n, M_t, Sink]

• Computing the optimal swap move: the edge weights are chosen so that the cost of the cut matches the move energy in every case.

Case 1: all x_i = a (every v_i in the source set). [Figure: the corresponding cut and its cost.]

Case 2: all x_i = b (every v_i in the sink set). [Figure: the corresponding cut and its cost.]

Case 3: the x_i take both labels a and b (the v_i are split between the source and sink sets). [Figure: the corresponding cut and its cost.]

Recall that the cost of an st-mincut is the sum of the weights of the edges included in the cut which go from the source set to the sink set.
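The construction works because, within one clique, an a-b swap can produce only three distinct clique costs, which is what the auxiliary nodes M_s and M_t make it possible to encode. A brute-force check of that structure; the γ values are illustrative assumptions.

```python
from itertools import product

def pn_potts_clique(labels, gamma={"a": 1.0, "b": 2.0}, gamma_max=5.0):
    """gamma_l if every variable in the clique takes label l, else gamma_max."""
    if all(l == labels[0] for l in labels):
        return gamma.get(labels[0], gamma_max)
    return gamma_max

# Enumerate all a-b swap outcomes for a 3-variable clique: only the
# three costs gamma_a, gamma_b, gamma_max ever appear.
costs = {pn_potts_clique(t) for t in product("ab", repeat=3)}
print(sorted(costs))   # [1.0, 2.0, 5.0]
```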

Optimal moves for Pⁿ Potts

• The expansion move energy can be minimized with a similar graph construction.

• Texture segmentation: the energy combines unary (colour), pairwise (smoothness), and higher-order (texture) terms.

[Figure: the three energy terms illustrated on an example image.]

[Results: original image, pairwise-model segmentation, and higher-order-model segmentation.
Image 1: Swap (3.2 sec), Swap (4.2 sec); Expansion (2.5 sec), Expansion (3.0 sec).
Image 2: Swap (4.7 sec), Swap (5.0 sec); Expansion (3.7 sec), Expansion (4.4 sec).]