Focused Belief Propagation for Query-Specific Inference

Anton Chechetka, Carlos Guestrin

14 May 2010



Query-Specific inference problem

[Figure: the model's variables split into the query, known (observed) variables, and variables that are not interesting]

Using information about the query to speed up convergence of belief propagation for the query marginals.



Graphical models

  • Talk focus: pairwise Markov random fields

  • Paper: arbitrary factor graphs (more general)

  • Probability is a product of local potentials

  • Graphical structure gives a compact representation
  • Exact inference is intractable in general
    • Approximate inference often works well in practice

[Figure: a pairwise Markov random field over variables h, i, j, k, r, s, u; nodes are variables and edges carry pairwise potentials]


(loopy) Belief Propagation

  • Messages are passed along the edges
  • Variable belief: b_i(x_i) ∝ φ_i(x_i) · ∏_{k ∈ Γ(i)} m_ki(x_i)
  • Update rule: m_ij(x_j) ← ∑_{x_i} φ_ij(x_i, x_j) · φ_i(x_i) · ∏_{k ∈ Γ(i) \ j} m_ki(x_i)
  • Result: (approximate) beliefs for all single variables




(loopy) Belief Propagation

  • Update rule: m_ij(x_j) ← ∑_{x_i} φ_ij(x_i, x_j) · φ_i(x_i) · ∏_{k ∈ Γ(i) \ j} m_ki(x_i)
  • Message dependencies are local: m_ij depends only on the messages arriving at i from neighbors other than j
  • Round-robin schedule:
    • Fix a message order
    • Apply updates in that order until convergence

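A minimal sketch of the round-robin schedule on a pairwise MRF (the data structures, function name, and normalization choices here are mine, not from the talk; all variables are assumed to share the same number of states):

```python
import numpy as np

def round_robin_bp(potentials, pair_potentials, edges, n_states, max_iters=100, tol=1e-6):
    """Round-robin loopy BP on a pairwise MRF.

    potentials[i]          : unary potential of variable i, shape (n_states,)
    pair_potentials[(i,j)] : pairwise potential, shape (n_states, n_states), indexed [x_i, x_j]
    edges                  : list of undirected edges (i, j)
    """
    # One message per directed edge, initialized uniform.
    msgs = {(i, j): np.ones(n_states) / n_states
            for a, b in edges for (i, j) in [(a, b), (b, a)]}
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    order = list(msgs)                         # fixed update order
    for _ in range(max_iters):
        max_change = 0.0
        for (i, j) in order:
            # Product of the unary potential and incoming messages, excluding the one from j.
            pre = potentials[i].copy()
            for k in neighbors[i]:
                if k != j:
                    pre *= msgs[(k, i)]
            if (i, j) in pair_potentials:
                phi = pair_potentials[(i, j)]      # phi[x_i, x_j]
            else:
                phi = pair_potentials[(j, i)].T    # stored the other way round
            new = phi.T @ pre                      # sum over x_i -> vector over x_j
            new /= new.sum()                       # normalize the message
            max_change = max(max_change, np.abs(new - msgs[(i, j)]).max())
            msgs[(i, j)] = new
        if max_change < tol:                       # converged
            break

    # Single-variable beliefs from the final messages.
    beliefs = {}
    for i in neighbors:
        b = potentials[i].copy()
        for k in neighbors[i]:
            b *= msgs[(k, i)]
        beliefs[i] = b / b.sum()
    return beliefs
```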



Dynamic update prioritization


  • A fixed update sequence is not the best option
  • Dynamic update scheduling can speed up convergence
    • Tree-Reweighted BP [Wainwright et al., AISTATS 2003]
    • Residual BP [Elidan et al., UAI 2006]
  • Residual BP: apply the largest change first

[Figure: with a fixed schedule, updates land on parts of the model that change little (wasted computation); applying the large change first is the informative update]



Residual BP [Elidan et al., UAI 2006]

  • Update rule: recompute m_ij from the current incoming messages
  • Residual: r_ij = || m_ij^(new) − m_ij^(old) ||, the distance between the new and old message
  • Pick the edge with the largest residual and apply its update

More effort goes to the difficult parts of the model.

But the query is not taken into account.
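A sketch of the residual BP scheduling loop (the `new_message` and `residual` callbacks are hypothetical placeholders for the BP update and a message distance; lazy priority-queue updates are one possible implementation choice):

```python
import heapq

def residual_bp(msgs, new_message, residual, max_updates=100000, tol=1e-6):
    """Residual BP scheduling: always apply the update with the largest residual.

    msgs        : dict mapping a directed edge (i, j) to its current message
    new_message : callable (edge, msgs) -> recomputed message for that edge
    residual    : callable (new, old) -> nonnegative scalar, e.g. max absolute difference
    """
    # Max-heap via negated priorities; stale entries are tolerated (lazy updates).
    heap = [(-residual(new_message(e, msgs), msgs[e]), e) for e in msgs]
    heapq.heapify(heap)

    for _ in range(max_updates):
        if not heap:
            break
        neg_r, (i, j) = heapq.heappop(heap)
        if -neg_r < tol:                 # largest remaining residual is tiny: converged
            break
        msgs[(i, j)] = new_message((i, j), msgs)
        # Only messages leaving j (other than the one back to i) depend on m_ij,
        # so only their residuals change; push them with fresh priorities.
        for e in msgs:
            if e[0] == j and e[1] != i:
                heapq.heappush(heap, (-residual(new_message(e, msgs), msgs[e]), e))
    return msgs
```

Query-specific BP, introduced below, keeps exactly this loop and only rescales each priority by an edge importance weight.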



Our contributions

  • Using weighted residuals to prioritize updates

  • Define message weights reflecting the importance of the message to the query

  • Computing importance weights efficiently

  • Experiments: faster convergence on large relational models



Why edge importance weights?

[Figure: an edge far from the query has a larger residual than an edge adjacent to the query. Which one to update?]

  • Residual BP updates the edge with the larger residual, which has no influence on the query: wasted computation
  • We would rather update the edge that will influence the query in the future

Our work: maximize the approximate eventual effect on P(query)
Residual BP: maximize the immediate residual reduction



Query-Specific BP

  • Update rule: the same BP message update as before
  • Pick the edge with the largest weighted residual A_ji · || m_ji^(new) − m_ji^(old) ||, where the edge importance A_ji is the only change!
  • Apply the update to that edge

Rest of the talk: defining and computing edge importance



Edge importance base case

  • Pick the edge with the largest A_ji · || m_ji^(new) − m_ji^(old) ||, where A_ji approximates the eventual effect of the update on P(Q)

Base case: an edge directly connected to the query. A_ji = ?


The change in the query belief is bounded by the change in the message:

|| P^(new)(Q) − P^(old)(Q) || ≤ 1 · || m_ji^(new) − m_ji^(old) ||

and the bound is tight, so A_ji = 1.



Edge importance one step away

Edge one step away from the query: A_rj = ?

[Figure: the message m_rj feeds into m_ji, which feeds into the query belief]

The change in the query belief is bounded via the intermediate message:

|| ΔP(Q) || ≤ || Δm_ji || ≤ ( sup || ∂m_ji / ∂m_rj || ) · || Δm_rj ||

where the sup is taken over the values of all other messages. The message importance is therefore A_rj = sup || ∂m_ji / ∂m_rj ||.

The sup can be computed in closed form, looking only at the factor f_ji [Mooij, Kappen, 2007].



Edge importance general case

Base case: Aji=1

k

r

h

query

i

One step away:

Arj=

j

s

u

mji

sup

mrj

Generalization?

  • expensive to compute

  • bound may be infinite

||P(Q)||

||msh||

P(Q)

sup

msh

P(Q)

mhr

mrj

mji

sup

sup

sup

sup

msh

msh

mrj

mhr

sensitivity(): max impact along the path 



Edge importance general case

2

1

query

k

r

h

sensitivity(): max impact along the path 

i

P(Q)

mhr

mrj

mji

j

s

u

sup

sup

sup

sup

msh

  • Ash= max all paths from to query sensitivity()

msh

mhr

mrj

h

There are a lot of paths in a graph; trying out every one is intractable.



Efficient edge importance computation

There are a lot of paths in a graph; trying out every one is intractable.

sensitivity(s → h → r → j → i) = sup || ∂m_hr / ∂m_sh || · sup || ∂m_rj / ∂m_hr || · sup || ∂m_ji / ∂m_rj ||

  • The sensitivity decomposes into individual edge contributions, each always ≤ 1, so it only decreases as the path grows
  • A_sh = max over all paths π from h to the query of sensitivity(π)
  • Dijkstra's (shortest-paths) algorithm will therefore efficiently find the max-sensitivity paths, and hence the weights A, for every edge
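A sketch of this weight computation as a max-product variant of Dijkstra's algorithm (the `edge_sensitivity` callback stands in for the closed-form sensitivity of Mooij and Kappen, which is not reproduced here; all names are illustrative):

```python
import heapq

def edge_importance_weights(neighbors, edge_sensitivity, query):
    """Max-sensitivity paths from the query out to every directed edge.

    neighbors[i]              : iterable of neighbors of variable i
    edge_sensitivity(a, b, c) : sup || d m_bc / d m_ab ||, assumed to lie in [0, 1]
    query                     : the query variable

    Returns A[(i, j)]: importance of message m_ij for the query.
    """
    A = {}
    # Edges pointing directly into the query get the base-case weight 1.
    heap = [(-1.0, (j, query)) for j in neighbors[query]]
    while heap:
        neg_w, (i, j) = heapq.heappop(heap)
        if (i, j) in A:                      # already settled with a larger weight
            continue
        A[(i, j)] = -neg_w
        # Extend the path one step further away from the query:
        # the messages m_ki (k != j) feed into m_ij.
        for k in neighbors[i]:
            if k != j and (k, i) not in A:
                w = A[(i, j)] * edge_sensitivity(k, i, j)
                heapq.heappush(heap, (-w, (k, i)))
    return A
```

Because every sensitivity term is at most 1, path weights never increase as paths grow, which is exactly the property Dijkstra's greedy expansion needs.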



Query-Specific BP

  • Run Dijkstra's algorithm starting at the query to get the edge weights A_ji = max over all paths π from i to the query of sensitivity(π)
  • Pick the edge with the largest weighted residual A_ji · || m_ji^(new) − m_ji^(old) ||
  • Apply the update to that edge

More effort on the difficult and relevant parts of the model.

Takes into account not only the graphical structure, but also the strength of local dependencies.
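Tying the two sketches above together, the scheduling priority becomes (hypothetical helper names from the earlier blocks):

```python
# Weighted residual: the only change relative to plain residual BP is the A factor.
def qsbp_priority(edge, msgs, A, new_message, residual):
    return A.get(edge, 0.0) * residual(new_message(edge, msgs), msgs[edge])
```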



Outline

  • Using weighted residuals to prioritize updates

  • Define message weights reflecting the importance of the message to the query

  • Computing importance weights efficiently

    • As an initialization step before residual BP

  • Restore anytime behavior of residual BP

  • Experiments: faster convergence on large relational models



Big picture

[Figure: before vs. after. Before: residual BP starts improving all marginals right away, so it is anytime. After: query-specific BP spends a long initialization on Dijkstra's algorithm before any BP updates on the query marginals. The anytime property is broken!]



Interleaving BP and Dijkstra’s

  • Dijkstra's expands the highest-weight edges first
    • Can pause it at any time and get the most relevant submodel

[Figure: alternating Dijkstra's expansions and BP runs on a growing submodel around the query, up to the full model]

When to stop BP and resume Dijkstra's?



Interleaving

  • Dijkstra’s expands the highest weight edges first

[Figure: edges around the query, partitioned into expanded on the previous iteration, just expanded, and not yet expanded]

Suppose Ā = min over expanded edges of A. Dijkstra's expands edges in order of decreasing weight, so every not-yet-expanded edge has weight at most Ā; with residuals bounded by a constant M, the priority of any unexpanded edge is at most M · Ā. If the actual priority of some expanded edge is ≥ M · Ā, there is no need to expand further at this point.
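A sketch of that check (illustrative names; `max_residual_bound` is the assumed upper bound M on any single residual):

```python
def need_more_expansion(expanded_priorities, min_expanded_weight, max_residual_bound):
    """Decide whether Dijkstra's expansion has to resume, per the criterion above.

    expanded_priorities : weighted residuals A_e * r_e of already-expanded edges
    min_expanded_weight : smallest importance weight expanded so far (A-bar)
    max_residual_bound  : upper bound M on any single message residual
    """
    # Any not-yet-expanded edge has weight <= A-bar, hence priority <= M * A-bar.
    unexpanded_bound = max_residual_bound * min_expanded_weight
    # If some expanded edge already beats that bound, the scheduler's next pick
    # is guaranteed to be an expanded edge: no need to expand further right now.
    return max(expanded_priorities, default=0.0) < unexpanded_bound
```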



Any-Time query-specific BP

[Figure: Query-specific BP runs Dijkstra's algorithm to completion and only then starts BP updates; Anytime QSBP interleaves Dijkstra's expansion with BP updates and produces the same BP update sequence!]



Experiments – the setup

  • Relational model: semi-supervised people recognition in a collection of images [with Denver Dash and Matthai Philipose @ Intel]

    • One variable per person

    • Agreement potentials for people who look alike

    • Disagreement potentials for people in the same image

    • 2K variables, 900K factors

  • Error measure: KL divergence from the fixed point of BP




Experiments – convergence speed

[Plot: error vs. computation, lower is better; curves for Residual BP, Query-Specific BP, and Anytime Query-Specific BP (with Dijkstra's interleaving)]



Conclusions

  • Prioritize updates by weighted residuals

    • Takes query info into account

  • Importance weights depend on both the graphical structure and strength of local dependencies

  • Efficient computation of importance weights

  • Much faster convergence on large relational models

Thank you!

