Focused Belief Propagation for Query-Specific Inference

Download Presentation

Focused Belief Propagation for Query-Specific Inference

Loading in 2 Seconds...

- 114 Views
- Uploaded on
- Presentation posted in: General

Focused Belief Propagation for Query-Specific Inference

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Focused Belief Propagation for Query-Specific Inference

Anton ChechetkaCarlos Guestrin

14 May 2010

query

known

not interesting

Using information about the queryto speed up convergence of belief propagation

for the query marginals

- Talk focus: pairwise Markov random fields
- Paper:arbitrary factor graphs (more general)
- Probability is a product of local potentials
- Graphical structure
- Compact representation
- Intractable inference
- Approximate inference often works well in practice

potential

k

r

h

i

variable

j

s

u

- Passing messages along edges
- Variable belief:
- Update rule:
- Result: all single-variable beliefs

k

r

h

i

j

s

u

- Update rule:
- Message dependencies are local:
- Round–robin schedule
- Fix message order
- Apply updates in that order until convergence

dependence

k

r

h

i

dep.

j

s

u

dependence

large change

small change

- Fixed update sequence is not the best option
- Dynamic update scheduling can speed up convergence
- Tree-Reweighted BP [Wainwright et. al., AISTATS 2003]
- Residual BP [Elidan et. al. UAI 2006]

- Residual BP apply the largest change first

large change

small change

1

2

large change

small change

informative update

wasted computation

- Update rule:
- Pick edge with largest residual
- Update

new

old

More effort on the difficult parts of the model

But no query

- Using weighted residuals to prioritize updates
- Define message weights reflecting the importance of the message to the query
- Computing importance weights efficiently
- Experiments: faster convergence on large relational models

query

residual < residualwhich to update??

- Residual BP updates
- no influence on the query
- wasted computation

- want to update
- influence on the queryin the future

Our work max approx. eventual effect on P(query)

Residual BP max immediate residual reduction

- Update rule:
- Pick edge with
- Update

new

old

edgeimportance

the only change!

Rest of the talk: defining and computing edge importance

- Pick edge with

approximate eventual update effect on P(Q)

query

Base case: edge directly connected to the queryAji=??

k

r

h

i

j

s

u

change in query belief

change in message

||P(NEW)(Q) P(OLD)(Q)||

||m(NEW) m(OLD)||

1

1

ji

ji

||m||

||P(Q)||

tight bound

ji

query

mji

mji

Edge one step away from the query: Arj=??

sup

sup

mrj

mrj

||P(Q)||

||m ||

k

r

over values ofall other messages

h

ji

i

change in query belief

j

s

u

||m||

rj

change in message

message importance

can compute in closed formlooking at only fji [Mooij, Kappen; 2007]

Base case: Aji=1

k

r

h

query

i

One step away:

Arj=

j

s

u

mji

sup

mrj

Generalization?

- expensive to compute
- bound may be infinite

||P(Q)||

||msh||

P(Q)

sup

msh

P(Q)

mhr

mrj

mji

sup

sup

sup

sup

msh

msh

mrj

mhr

sensitivity(): max impact along the path

2

1

query

k

r

h

sensitivity(): max impact along the path

i

P(Q)

mhr

mrj

mji

j

s

u

sup

sup

sup

sup

msh

- Ash= max all paths from to query sensitivity()

msh

mhr

mrj

h

There are a lot of paths in a graph,trying out every one is intractable

There are a lot of paths in a graph,trying out every one is intractable

always decreases as the path grows

Dijkstra’s(shortest paths) alg. will efficiently find max-sensitivity pathsfor every edge

always 1

always 1

always 1

sensitivity( hrji) =

decomposes into individual edge contributions

mhr

mrj

mji

sup

sup

sup

mhr

msh

mrj

- A= max all paths from to query sensitivity()

- Run Dijkstra’salgstarting at query to get edge weights
- Pick edge with largest weighted residual
- Update

Aji= max all paths from itoquery sensitivity()

More effort on the difficult parts of the model

and relevant

Takes into account not only graphical structure, but also strength of dependencies

- Using weighted residuals to prioritize updates
- Define message weights reflecting the importance of the message to the query
- Computing importance weights efficiently
- As an initialization step before residual BP

- Restore anytime behavior of residual BP
- Experiments: faster convergence on large relational models

Before

After

?

?

long initialization

all marginals

all marginals

anytimeproperty

Dijkstra

BP

query marginals

all marginals

This is broken!

- Dijkstra’sexpands the highest weight edges first
- Can pause it at any time and get the most relevant submodel

query

full model

Dijkstra

Dijkstra

When to stop BP and resume Dijkstra’s?

BP

BP

BP

Dijkstra

…

- Dijkstra’s expands the highest weight edges first

query

expanded onprevious iteration

not yet expanded

just expanded

minexpanded edgesA A

suppose

upper bound on priority

actual priority of

M minexpanded A

no need to expand further at this point

query

Query-specific BP:

Dijkstra’s alg.

BP updates

k

r

same BP update sequence!

i

j

s

Anytime QSBP:

- Relational model: semi-supervised people recognition in collection of images [with Denver Dash and MatthaiPhilipose @ Intel]
- One variable per person
- Agreement potentials for people who look alike
- Disagreement potentials for people in the same image
- 2K variables, 900K factors

- Error measure: KL from the fixed point of BP

agree

disagree

AnytimeQuery-Specific BP(with Dijkstra’s interleaving)

Residual BP

Query-Specific BP

better

- Prioritize updates by weighted residuals
- Takes query info into account

- Importance weights depend on both the graphical structure and strength of local dependencies
- Efficient computation of importance weights
- Much faster convergence on large relational models

Thank you!