Performing Bayesian Inference by Weighted Model Counting

Tian Sang, Paul Beame, and Henry Kautz

Department of Computer Science & Engineering

University of Washington

Seattle, WA

- Extend success of “compilation to SAT” work for NP-complete problems to “compilation to #SAT” for #P-complete problems
- Leverage rapid advances in SAT technology
- Example: Computing permanent of a 0/1 matrix
- Inference in Bayesian networks (Roth 1996, Dechter 1999)

- Provide practical reasoning tool
- Demonstrate relationship between #SAT and conditioning algorithms
- In particular: compilation to DNNF (Darwiche 2002, 2004)

- Simple encoding of Bayesian networks into weighted model counting
- Techniques for extending state-of-the-art SAT algorithms for efficient weighted model counting
- Evaluation on computationally challenging domains
- Outperforms join-tree methods on problems with high tree-width
- Competitive with best conditioning methods on problems with high degree of determinism

- Model counting
- Encoding Bayesian networks
- Related Bayesian inference algorithms
- Experiments
- Grid networks
- Plan recognition

- Conclusion

- Given a CNF formula,
- SAT: find a satisfying assignment
- #SAT: count satisfying assignments

- Example: (x ∨ y) ∧ (y ∨ ¬z)
- 5 models:
(0,1,0), (0,1,1), (1,1,0), (1,1,1), (1,0,0)

- Equivalently: satisfying probability = 5/2^3
- Probability that formula is satisfied by a random truth assignment

- Can modify Davis-Putnam-Logemann-Loveland to calculate this value
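The counting example can be checked by brute force. The sketch below assumes the example formula reads (x ∨ y) ∧ (y ∨ ¬z), which is the reading consistent with the five listed models; the DIMACS-style integer encoding and the function names are illustrative choices, not part of the slides.

```python
from itertools import product

# DIMACS-style literals: 1 = x, 2 = y, 3 = z; a negative number is a negated variable.
CNF = [(1, 2), (2, -3)]  # (x OR y) AND (y OR NOT z), the assumed example formula


def satisfies(assignment, cnf):
    """assignment maps variable number -> bool; every clause needs one true literal."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause) for clause in cnf)


def count_models(cnf, num_vars):
    """Enumerate all 2^n truth assignments and count the satisfying ones."""
    count = 0
    for values in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))
        if satisfies(assignment, cnf):
            count += 1
    return count


models = count_models(CNF, 3)
print(models, models / 2**3)  # 5 0.625
```

Enumeration takes time exponential in the number of variables, which is why the slides turn to a DPLL-style search next.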

DPLL for SAT

DPLL(F)
    if F is empty, return 1
    if F contains an empty clause, return 0
    else choose a variable x to branch
    return (DPLL(F|x=1) ∨ DPLL(F|x=0))

#DPLL for #SAT

#DPLL(F)    // computes satisfying probability of F
    if F is empty, return 1
    if F contains an empty clause, return 0
    else choose a variable x to branch
    return 0.5 * #DPLL(F|x=1) + 0.5 * #DPLL(F|x=0)
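A minimal Python sketch of the #DPLL recursion above, assuming clauses are tuples of DIMACS-style integers and using a naive branching rule (first variable of the first clause). The helper name `condition` is hypothetical; it computes F|x=1 or F|x=0.

```python
def condition(cnf, lit):
    """Return F with literal `lit` set true: drop satisfied clauses,
    delete the falsified literal -lit from the remaining ones."""
    result = []
    for clause in cnf:
        if lit in clause:
            continue  # clause is satisfied
        result.append(tuple(l for l in clause if l != -lit))
    return result


def sharp_dpll(cnf):
    """Probability that a uniformly random truth assignment satisfies cnf."""
    if not cnf:
        return 1.0  # no clauses left: every extension satisfies F
    if any(len(clause) == 0 for clause in cnf):
        return 0.0  # empty clause: no extension satisfies F
    x = abs(cnf[0][0])  # naive branching choice (assumption of this sketch)
    return 0.5 * sharp_dpll(condition(cnf, x)) + 0.5 * sharp_dpll(condition(cnf, -x))


# (x OR y) AND (y OR NOT z) with x=1, y=2, z=3
print(sharp_dpll([(1, 2), (2, -3)]))  # 0.625, i.e. 5 models out of 2^3
```

Working with probabilities rather than raw counts means variables that never get branched on need no special handling: each one contributes a factor of 0.5 + 0.5 = 1.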

- Each literal has a weight
- Weight of a model = product of the weights of its literals
- Weight of a formula = sum of the weights of its models

WMC(F)
    if F is empty, return 1
    if F contains an empty clause, return 0
    else choose a variable x to branch
    return weight(x) * WMC(F|x=1) +
           weight(¬x) * WMC(F|x=0)
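The weighted recursion can be sketched the same way. This version assumes the false branch is multiplied by weight(¬x), and that weight(x) + weight(¬x) = 1 for every variable, so that returning 1 for an empty formula is correct even when some variables were never branched on. The weight table is made-up illustrative data.

```python
def condition(cnf, lit):
    """Return F with literal `lit` set true."""
    return [tuple(l for l in clause if l != -lit) for clause in cnf if lit not in clause]


def wmc(cnf, weight):
    """Weighted model count; `weight` maps each literal (signed int) to its weight.
    Assumes weight[v] + weight[-v] == 1 for every variable v."""
    if not cnf:
        return 1.0
    if any(len(clause) == 0 for clause in cnf):
        return 0.0
    x = abs(cnf[0][0])  # naive branching choice
    return (weight[x] * wmc(condition(cnf, x), weight)
            + weight[-x] * wmc(condition(cnf, -x), weight))


# Hypothetical weights for x=1, y=2, z=3
weights = {1: 0.3, -1: 0.7, 2: 0.5, -2: 0.5, 3: 0.9, -3: 0.1}
print(wmc([(1, 2), (2, -3)], weights))  # approximately 0.515
```

With all weights set to 0.5 the function reduces to the satisfying probability computed by #DPLL.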

- Cachet: state-of-the-art model counting program (Sang, Bacchus, Beame, Kautz, & Pitassi 2004)
- Key innovation: sound integration of component caching and clause learning
- Component analysis (Bayardo & Pehoushek 2000): if formulas C1 and C2 share no variables,
WMC(C1 ∧ C2) = WMC(C1) * WMC(C2)

- Caching (Majercik & Littman 1998; Darwiche 2002; Bacchus, Dalmao, & Pitassi 2003; Beame, Impagliazzo, Pitassi, & Segerlind 2003): save and reuse values of internal nodes of search tree
- Clause learning (Marques-Silva 1996; Bayardo & Schrag 1997; Zhang, Madigan, Moskewicz, & Malik 2001): analyze reason for backtracking, store as a new clause
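To make component analysis and caching concrete, here is a small sketch that splits a formula into variable-disjoint components, multiplies their weighted counts, and memoizes results by clause set. It is an illustration under simplifying assumptions, not Cachet's implementation: there is no clause learning, the cache key is simply the set of residual clauses, and all function names are hypothetical.

```python
from math import prod


def condition(cnf, lit):
    """Return F with literal `lit` set true."""
    return [tuple(l for l in clause if l != -lit) for clause in cnf if lit not in clause]


def split_components(cnf):
    """Group clauses so that different groups share no variables (union-find)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for clause in cnf:
        first = abs(clause[0])
        for lit in clause[1:]:
            parent[find(abs(lit))] = find(first)
    groups = {}
    for clause in cnf:
        groups.setdefault(find(abs(clause[0])), []).append(clause)
    return list(groups.values())


def wmc(cnf, weight, cache):
    """Weighted model count with component splitting and caching.
    Assumes weight[v] + weight[-v] == 1 for every variable v."""
    if not cnf:
        return 1.0
    if any(len(clause) == 0 for clause in cnf):
        return 0.0
    key = frozenset(cnf)
    if key in cache:
        return cache[key]
    components = split_components(cnf)
    if len(components) > 1:
        # Disjoint components are independent: multiply their counts.
        value = prod(wmc(c, weight, cache) for c in components)
    else:
        x = abs(cnf[0][0])
        value = (weight[x] * wmc(condition(cnf, x), weight, cache)
                 + weight[-x] * wmc(condition(cnf, -x), weight, cache))
    cache[key] = value
    return value


uniform = {l: 0.5 for v in range(1, 5) for l in (v, -v)}
print(wmc([(1, 2), (3, 4)], uniform, {}))  # 0.5625 = 0.75 * 0.75
```

Because this sketch never adds learned clauses, it sidesteps the soundness issue that arises when caching, component analysis, and clause learning are combined.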

- Naïve combination of all three techniques is unsound
- Can resolve by careful cache management (Sang, Bacchus, Beame, Kautz, & Pitassi 2004)
- New branching strategy (VSADS) optimized for counting (Sang, Beame, & Kautz SAT-2005)

- Task: In one counting pass,
- Compute number of models in which each literal is true
- Equivalently: compute marginal satisfying probabilities

- Approach
- Each recursion computes a vector of marginals
- At branch point: compute left and right vectors, combine with vector sum
- Cache vectors, not just counts

- Reasonable overhead: 10% - 40% slower than counting
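The vector-of-marginals idea can be sketched on top of plain #DPLL. In this illustrative version (no caching, naive branching, hypothetical function names), each call returns the satisfying probability together with, for every still-unassigned variable v, the probability that the formula is satisfied and v is true. The two branch vectors are combined by a weighted vector sum.

```python
def condition(cnf, lit):
    """Return F with literal `lit` set true."""
    return [tuple(l for l in clause if l != -lit) for clause in cnf if lit not in clause]


def marginals(cnf, variables):
    """Return (p, m) where p = Pr[F satisfied] and
    m[v] = Pr[F satisfied and v is true] for each unassigned v in `variables`."""
    if any(len(clause) == 0 for clause in cnf):
        return 0.0, {v: 0.0 for v in variables}
    if not cnf:
        # Every remaining variable is free, so it is true in half the models.
        return 1.0, {v: 0.5 for v in variables}
    x = abs(cnf[0][0])
    rest = variables - {x}
    p1, m1 = marginals(condition(cnf, x), rest)   # branch x = 1
    p0, m0 = marginals(condition(cnf, -x), rest)  # branch x = 0
    p = 0.5 * p1 + 0.5 * p0
    m = {v: 0.5 * m1[v] + 0.5 * m0[v] for v in rest}  # vector sum of both branches
    m[x] = 0.5 * p1  # x is true only in the x = 1 branch
    return p, m


# (x OR y) AND (y OR NOT z) with x=1, y=2, z=3
p, m = marginals([(1, 2), (2, -3)], {1, 2, 3})
print(p)  # 0.625
print({v: m[v] * 2**3 for v in sorted(m)})  # {1: 3.0, 2: 4.0, 3: 2.0} models with v true
```

Dividing m[v] by p gives the marginal of v conditioned on the formula being satisfied, for example 3/5 for x here.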

[Figure: encoding example for a two-node Bayesian network A → B, with P(A) = 0.1]

CPT for B:

        B     ¬B
  A     0.2   0.8
  ¬A    0.6   0.4

- Chance variable P added with weight(P) = 0.2 and weight(¬P) = 0.8
- Chance variable Q added with weight(Q) = 0.6 and weight(¬Q) = 0.4
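The encoding can be made concrete with a brute-force weighted count. The sketch below rests on several assumptions that the figure residue does not fully confirm: the network is A → B with P(A) = 0.1, P(B | A) = 0.2 and P(B | ¬A) = 0.6; chance variable P selects B when A holds and Q selects B when A does not; and the state variable B gets weight 1 for both literals so it does not affect the product. The clause set and function name are illustrative.

```python
from itertools import product

A, B, P, Q = 1, 2, 3, 4  # variable numbers (negative = negated literal)

# Assumed clauses: A∧P → B, A∧¬P → ¬B, ¬A∧Q → B, ¬A∧¬Q → ¬B
F = [(-A, -P, B), (-A, P, -B), (A, -Q, B), (A, Q, -B)]

WEIGHT = {A: 0.1, -A: 0.9, P: 0.2, -P: 0.8, Q: 0.6, -Q: 0.4,
          B: 1.0, -B: 1.0}  # B is determined by the clauses; weight 1 is an assumption


def weighted_count(cnf, weight, num_vars):
    """Sum, over all models of cnf, the product of the weights of the model's literals."""
    total = 0.0
    for values in product([False, True], repeat=num_vars):
        lits = [v if val else -v for v, val in enumerate(values, start=1)]
        if all(any(l in lits for l in clause) for clause in cnf):
            w = 1.0
            for l in lits:
                w *= weight[l]
            total += w
    return total


print(weighted_count(F, WEIGHT, 4))           # approximately 1.0
print(weighted_count(F + [(B,)], WEIGHT, 4))  # approximately 0.56 = 0.1*0.2 + 0.9*0.6
```

The first value confirms the weights form a probability distribution over models; the second is the marginal P(B) of the assumed network.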

- Let:
- F = a weighted CNF encoding of a Bayes net
- E = an arbitrary CNF formula, the evidence
- Q = an arbitrary CNF formula, the query

- Then: P(Q | E) = WMC(F ∧ Q ∧ E) / WMC(F ∧ E)
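A query with evidence is a ratio of two weighted counts, P(Q | E) = WMC(F ∧ Q ∧ E) / WMC(F ∧ E), which is just the definition of conditional probability. The sketch below evaluates it by brute force on an assumed two-node network A → B (P(A) = 0.1, P(B | A) = 0.2, P(B | ¬A) = 0.6). The clause encoding, the weight of 1 on the state variable B, and the names `posterior`, `CHANCE_P`, and `CHANCE_Q` are all illustrative assumptions.

```python
from itertools import product

A, B, CHANCE_P, CHANCE_Q = 1, 2, 3, 4

# Assumed weighted CNF encoding of the network
F = [(-A, -CHANCE_P, B), (-A, CHANCE_P, -B), (A, -CHANCE_Q, B), (A, CHANCE_Q, -B)]
WEIGHT = {A: 0.1, -A: 0.9, B: 1.0, -B: 1.0,
          CHANCE_P: 0.2, -CHANCE_P: 0.8, CHANCE_Q: 0.6, -CHANCE_Q: 0.4}


def weighted_count(cnf):
    """Brute-force weighted model count over the four variables."""
    total = 0.0
    for values in product([False, True], repeat=4):
        lits = [v if val else -v for v, val in enumerate(values, start=1)]
        if all(any(l in lits for l in clause) for clause in cnf):
            w = 1.0
            for l in lits:
                w *= WEIGHT[l]
            total += w
    return total


def posterior(query, evidence):
    """P(query | evidence); both arguments are CNF formulas (lists of clauses)."""
    return weighted_count(F + query + evidence) / weighted_count(F + evidence)


print(posterior([(A,)], [(B,)]))   # approximately 0.0357 = 0.02 / 0.56
print(posterior([(A,)], [(-B,)]))  # approximately 0.1818 = 0.08 / 0.44
```

Since query and evidence are ordinary CNF formulas conjoined to F, they can be arbitrary clauses rather than single literals.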

- Junction tree algorithm (Shenoy & Shafer 1990)
- Most widely used approach
- Data structure grows exponentially large in tree-width of underlying graph

- To handle high tree-width, researchers developed conditioning algorithms, e.g.:
- Recursive conditioning (Darwiche 2001)
- Value elimination (Bacchus, Dalmao, Pitassi 2003)
- Compilation to d-DNNF (Darwiche 2002; Chavira, Darwiche, Jaeger 2004; Darwiche 2004)

- These algorithms become similar to DPLL...

- Our benchmarks: Grid, Plan Recognition
- Junction tree: Netica
- Recursive conditioning: SamIam
- Value elimination: Valelim
- Weighted model counting: Cachet

- ISCAS-85 and SATLIB benchmarks
- Compilation to d-DNNF: timings from (Darwiche 2004)
- Weighted model counting: Cachet

[Figure: grid network with nodes S and T marked]

- CPTs are set randomly.
- A fraction of the nodes are deterministic; the fraction is specified by the parameter ratio.
- T is the query node

10 problems of each size, X=memory out or time out

- Task:
- Given a planning domain described by STRIPS operators, initial and goal states, and time horizon
- Infer the marginal probabilities of each action

- Abstraction of strategic plan recognition: we know the enemy’s capabilities and goals; what will it do?
- Modified Blackbox planning system (Kautz & Selman 1999) to create instances

- Bayesian inference by translation to model counting is competitive with best known algorithms for problems with
- High tree-width
- High degree of determinism

- Recent conditioning algorithms already make use of important SAT techniques
- Most striking: compilation to d-DNNF

- Translation approach makes it possible to quickly exploit future SAT algorithms and implementations