
Compiling Graphical Models

Compiling Graphical Models. Adnan Darwiche University of California, Los Angeles UAI’06 Tutorial. Compilation: Historical Motivation. Separate inference into two phases: Offline : Compile model into a structure Online : Use structure to answer queries


Presentation Transcript


  1. Compiling Graphical Models Adnan Darwiche University of California, Los Angeles UAI’06 Tutorial

  2. Compilation: Historical Motivation
     • Separate inference into two phases:
       • Offline: Compile model into a structure
       • Online: Use structure to answer queries
     • Goal: Push as much work as possible into the offline phase to optimize online inference time
     • Best initial example:
       • Offline: Compile a Bayesian network into a jointree
       • Online: Use jointree to answer multiple queries efficiently

  3. Compilation: Modern Motivation
     • Exploit model structure in inference:
       • Global structure:
         • Exhibited in model topology
         • Measured by treewidth
         • Exploited by most (non-compilation) algorithms
       • Local structure:
         • Exhibited in model parameters
         • Type 1: Determinism
         • Type 2: Context-specific independence
     • Local structure is best exploited in the context of compilation: main theme

  4. Compilation: Theoretical Implications
     • Unifies inference paradigms
       • Variable elimination
       • Jointree (Tree clustering)
       • Conditioning
     • Compilation as a trace of classical inference

  5. Bayesian Networks
     [Figure: Bayesian network, labeled "Local Knowledge", over the nodes Battery Age, Alternator, Fan Belt, Charge Delivered, Battery, Fuel Pump, Fuel Line, Starter, Distributor, Gas, Battery Power, Spark Plugs, Gas Gauge, Engine Start, Lights, Engine Turn Over, Radio]

  6. Bayesian Networks
     [Figure: the network of slide 5, with the table for Lights given Battery Power]
     Battery Power   Lights = ON   Lights = OFF
     OK              .99           .01
     WEAK            .80           .20
     DEAD            0             1
     If Battery Power = OK, then Lights = ON (99%) ….

  7. Bayesian Networks
     [Figure: the network of slide 5]

  8. Global Structure: Treewidth w

  9. Local Structure: CSI and Determinism
     [Figure: the network of slide 5]

  10. Local Structure: CSI and Determinism
      [Figure: the network of slide 5, annotated "Context Specific Independence (CSI)"]

  11. Local Structure: CSI and Determinism
      [Figure: the network of slide 5, with the table for Lights given Battery Power, annotated "Determinism"]
      Battery Power   Lights = ON   Lights = OFF
      OK              .99           .01
      WEAK            .80           .20
      DEAD            0             1
      If Battery Power = Dead, then Lights = OFF

  12. Today’s Models …
      • Characterized by:
        • Richness in local structure (determinism, CSI)
        • Massiveness in size (100,000s of variables not uncommon)
        • High connectivity (treewidth > 50, > 100)
      • Enabled by:
        • High level modeling tools: relational, first order
        • New application areas (synthesis):
          • Bioinformatics (e.g. linkage analysis)
          • Sensor networks
      • Exploiting local structure is a must!

  13. High Order Specifications: Relational Models… (Primula)
      burglary(v) = 0.005;
      alarm(v) = (burglary(v): 0.95, 0.01);
      calls(v,w) = (neighbor(v,w): (prankster(v): (alarm(w): 0.9, 0.05), (alarm(w): 0.9, 0)), 0);
      alarmed(v) = n-or{calls(w,v) | w: neighbor(w,v)}

  14. Friends and Smokers (Richardson & Domingos, 2004)
      • M individuals
      • Relations such as smokes(p), cancer(p), friend(p1,p2)
      • Logical constraints such as: if one of p's friends smokes, then p smokes.
      • Sample Query: probability that a given person has cancer
      M    Network params   Treewidth* w   CNF Vars   CNF Clauses   AC Edges   Online Time (sec)   Offline Time (min)
      1    34               3              12         31            18         0.02                0
      4    1,552            13             414        1,390         293        0.03                0.01
      7    7,714            36             1,995      6,916         1,295      0.08                0.01
      10   21,760           70             5,565      19,525        3,512      0.34                0.02
      13   46,930           118            11,934     42,133        7,430      1.07                0.03
      16   86,464           172            21,912     77,656        13,535     3.21                0.04
      19   143,602          244            36,309     129,010       22,313     9.04                0.05
      22   221,584          316            55,935     199,111       34,250     23.56               0.07
      25   323,650          412            81,600     290,875       49,832     48.32               0.09
      28   453,040          528            114,114    407,218       69,545     105.74              0.13
      29   502,802          560            126,614    451,965       77,118     130                 0.14

  15. Students (Pasula & Russell, 2001)
      • P professors
      • S students
      • Various relations, such as famous(p), well-funded(p), success(s), advises(p,s)
      • Sample Query: probability a professor is well-funded given success of advised students
      Students-Profs   Network params   Treewidth w   CNF Vars   CNF Clauses   AC Edges     Online Time (sec)   Offline Time (min)
      04-08            11,566           72            3,099      11,099        445,410      0.0530              2
      04-16            21,070           101           5,859      21,115        815,461      0.0930              3
      05-10            20,688           128           5,624      20,279        2,531,230    0.2885              3
      05-20            38,168           148           10,734     38,889        5,236,257    1.8439              7
      06-12            33,454           176           9,209      33,353        16,936,504   3.2120              14
      06-24            62,302           233           17,693     64,325        36,450,231   12.9663             33

  16. Genetic Linkage Analysis
      • Ordering genes on a chromosome and determining distance between them
      • Useful for predicting and detecting diseases
      • Associating functionality of genes with their location on the chromosome
      [Figure: chromosome with Gene 1, Gene 2, Gene 3]

  17. Pedigrees + Phenotype + Genotype

  18. DBNs from Speech Applications

  19. Coding Networks

  20. Tutorial Outline
      • Theoretical foundations
      • Online query answering algorithms
      • Offline compilation algorithms
      • Applications
      • Concluding remarks

  21. Theoretical Foundations
      • Graphical Model (Bayesian, Markov Networks):
        • Is a Multi-Linear Function (MLF)
      • Compiled Model:
        • Is an Arithmetic Circuit (AC)
      • Compilation process:
        • Factoring MLF into AC

  22. Factoring: Multi-Linear Functions into Arithmetic Circuits
      [Figure: network over A and B, and an arithmetic circuit of + and * nodes]
      A Differential Approach to Inference in Bayesian Networks, JACM-03 (Darwiche)

  23. Factoring Multi-linear Functions (MLFs)
      MLF: a + ad + abd + abcd
      [Figure: Arithmetic Circuit (AC) of + and * nodes over the leaves a, b, c, d, 1]
      Circuit Complexity: Size of smallest AC that computes the MLF
      • A graphical model defines an MLF
        An MLF has an exponential number of terms, yet it may be represented by an AC with polynomial size!
      • Evaluating the MLF for a given evidence gives the probability of evidence
      • The inference problem can be formulated as factoring the MLF of a graphical model
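As a small worked illustration of factoring, the MLF a + ad + abd + abcd from slide 23 can be rewritten in nested form. The sketch below shows one possible factoring, which is not necessarily the circuit drawn on the slide; the function names are illustrative only.

```python
from itertools import product

def mlf_expanded(a, b, c, d):
    # Term-by-term form: 3 additions and 6 multiplications.
    return a + a * d + a * b * d + a * b * c * d

def mlf_factored(a, b, c, d):
    # Nested form a(1 + d(1 + b(1 + c))): 3 additions and 3 multiplications.
    return a * (1 + d * (1 + b * (1 + c)))

# Compare the two forms on every 0/1 input and on one arbitrary point.
points = list(product([0, 1], repeat=4)) + [(0.5, 0.25, 2.0, 3.0)]
agree = all(abs(mlf_expanded(*p) - mlf_factored(*p)) < 1e-12 for p in points)
print(agree)
```

The saving is small here, but the gap between the two forms is what widens as the number of terms grows.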

  24. Graphical Models as MLFs
      A       B       Pr(.)
      true    true    .03
      true    false   .27
      false   true    .56
      false   false   .14
      Pr(a) = .03 + .27 = .3

  25. Graphical Models as MLFs
      A       B       Pr(.)
      true    true    .03
      true    false   .27
      false   true    .56
      false   false   .14
      Pr(~b) = .27 + .14 = .41

  26. Graphical Models as MLFs
      A       B       Pr(.)   Term
      true    true    .03     λa*λb * .03
      true    false   .27     λa*λ~b * .27
      false   true    .56     λ~a*λb * .56
      false   false   .14     λ~a*λ~b * .14
      F(λ~a, λ~b, λa, λb) = .03λaλb + .27λaλ~b + .56λ~aλb + .14λ~aλ~b

  27. F(λ~a, λ~b, λa, λb) = .03λaλb + .27λaλ~b + .56λ~aλb + .14λ~aλ~b
      Pr(a,~b) = F(λ~a:0, λ~b:1, λa:1, λb:0) = .27
      Pr(a) = F(λ~a:0, λ~b:1, λa:1, λb:1) = .03 + .27
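The evaluations on slide 27 can be reproduced with a few lines of code. This is a minimal sketch: the dictionary keys and the helper `indicators` are naming choices made here, not part of the tutorial. An indicator is set to 0 when it contradicts the evidence and to 1 otherwise.

```python
def F(lam):
    # Multi-linear function of the joint table over A and B (slide 26).
    return (0.03 * lam["a"] * lam["b"]
            + 0.27 * lam["a"] * lam["~b"]
            + 0.56 * lam["~a"] * lam["b"]
            + 0.14 * lam["~a"] * lam["~b"])

def indicators(evidence):
    # evidence maps a variable name ("a" or "b") to True/False;
    # variables that are not observed keep both indicators at 1.
    lam = {"a": 1, "~a": 1, "b": 1, "~b": 1}
    for var, val in evidence.items():
        lam[var] = 1 if val else 0
        lam["~" + var] = 0 if val else 1
    return lam

pr_a_not_b = F(indicators({"a": True, "b": False}))  # Pr(a,~b)
pr_a = F(indicators({"a": True}))                    # Pr(a)
pr_empty = F(indicators({}))                         # no evidence
print(pr_a_not_b, pr_a, pr_empty)
```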

  28. [Figure: network over A, B, C with parameters θa, θb|a, θc|a]

  29. [Figure: network over A, B, C with parameters θa, θb|a, θc|a]

  30. [Figure: network over A, B, C]
      F = λa λb λc θa θb|a θc|a
        + λa λb λ~c θa θb|a θ~c|a
        + λa λ~b λc θa θ~b|a θc|a
        + λa λ~b λ~c θa θ~b|a θ~c|a
        ….

  31. [Figure: network over A, B, C, D with parameters θa, θb|a, θc|a, θd|bc]
      F = λa λb λc λd θa θb|a θc|a θd|bc
        + λa λb λc λ~d θa θb|a θc|a θ~d|bc
        + ….
      Each term has 2n variables (n indicators, n parameters)
      Each variable has degree one (multi-linear function)
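The structure of this polynomial can be made concrete by enumerating its terms directly. The sketch below assumes the parent sets suggested by the parameters on slide 31 (A has no parents, B and C depend on A, D depends on B and C). The CPT numbers are hypothetical, since the slides give none for this network, and the brute-force enumeration is exactly the exponential form that compilation is meant to avoid.

```python
from itertools import product

# Assumed parent sets, read off the parameters θa, θb|a, θc|a, θd|bc.
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C")}
# Hypothetical values: p_true[var][parent values] = Pr(var = true | parents).
p_true = {
    "A": {(): 0.3},
    "B": {(True,): 0.1, (False,): 0.8},
    "C": {(True,): 0.6, (False,): 0.5},
    "D": {(True, True): 0.9, (True, False): 0.4,
          (False, True): 0.7, (False, False): 0.0},
}
order = ["A", "B", "C", "D"]

def theta(var, world):
    # Parameter for var taking its value in `world`, given its parents there.
    p = p_true[var][tuple(world[q] for q in parents[var])]
    return p if world[var] else 1.0 - p

def network_polynomial(evidence):
    # Sum one term per instantiation: n indicators times n parameters.
    total, terms = 0.0, 0
    for values in product([True, False], repeat=len(order)):
        world = dict(zip(order, values))
        term = 1.0
        for var in order:
            # Indicator is 0 only when evidence contradicts this instantiation.
            lam = 1.0 if evidence.get(var, world[var]) == world[var] else 0.0
            term *= lam * theta(var, world)
        total += term
        terms += 1
    return total, terms

value_no_evidence, n_terms = network_polynomial({})
value_a, _ = network_polynomial({"A": True})
print(value_no_evidence, n_terms, value_a)
```

With n = 4 binary variables the polynomial has 2^4 = 16 terms, each a product of 8 variables.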

  32. Factoring: Multi-Linear Functions into Arithmetic Circuits
      [Figure: network over A and B, and an arithmetic circuit of + and * nodes]

  33. Online Query Answering
      • Complexity:
        • Time and space linear in the AC size
      • Queries:
        • Probability of evidence, with evidence flipping/fast retraction
        • Variable and family marginals
        • MPE: most probable explanation
        • Sensitivity analysis (derivatives)

  34. Evaluating the Polynomial

  35. PR: Probability of Evidence
      [Figure: the network of slide 5]
      Pr(e)

  36. The Partial Derivatives

  37. PR: Probability of Evidence Flips
      [Figure: the network of slide 5, with a node marked X]
      Pr(e)

  38. PR: Probability of Evidence Flips
      [Figure: the network of slide 5, with a node marked X]
      Pr(e-X,x)

  39. The Partial Derivatives

  40. PR: Family Marginals
      [Figure: the network of slide 5, with X and U marked]
      Pr(e,x,u)

  41. Factoring: Multi-Linear Functions into Arithmetic Circuits
      [Figure: network over A and B, and an arithmetic circuit of + and * nodes]

  42. Circuit Evaluation and Differentiation: Marginals
      [Figure: arithmetic circuit annotated with node values and partial derivatives]
      • Two passes only:
        • probability of evidence (with evidence flipping)
        • Node marginals
        • Family marginals
        • Sensitivity
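The two passes can be sketched on a small circuit. The example below assumes a two-variable network where A is the parent of B, with parameters θa = .3, θb|a = .1, θb|~a = .8 chosen here so that the joint table of slide 24 is reproduced. The `Node` class and leaf names are illustrative, and none of the register-saving schemes from the tutorial are attempted. The upward pass computes node values; the downward pass computes the partial derivative of the root with respect to every node.

```python
class Node:
    def __init__(self, op, children=(), name=None):
        self.op, self.children, self.name = op, list(children), name
        self.value = 0.0   # filled in by the upward pass
        self.deriv = 0.0   # filled in by the downward pass

def build_circuit():
    nodes = []  # creation order, so children always precede parents
    def mk(op, children=(), name=None):
        n = Node(op, children, name)
        nodes.append(n)
        return n
    names = ["la", "l~a", "lb", "l~b", "ta", "t~a",
             "tb|a", "t~b|a", "tb|~a", "t~b|~a"]
    lf = {x: mk("leaf", name=x) for x in names}
    s_a = mk("+", [mk("*", [lf["lb"], lf["tb|a"]]),
                   mk("*", [lf["l~b"], lf["t~b|a"]])])
    s_na = mk("+", [mk("*", [lf["lb"], lf["tb|~a"]]),
                    mk("*", [lf["l~b"], lf["t~b|~a"]])])
    mk("+", [mk("*", [lf["la"], lf["ta"], s_a]),
             mk("*", [lf["l~a"], lf["t~a"], s_na])])  # root is nodes[-1]
    return nodes, lf

def evaluate(nodes, leaf_values):
    for n in nodes:  # upward pass
        if n.op == "leaf":
            n.value = leaf_values[n.name]
        elif n.op == "+":
            n.value = sum(c.value for c in n.children)
        else:
            n.value = 1.0
            for c in n.children:
                n.value *= c.value
    return nodes[-1].value

def differentiate(nodes):
    for n in nodes:
        n.deriv = 0.0
    nodes[-1].deriv = 1.0
    for n in reversed(nodes):  # downward pass, parents before children
        for i, c in enumerate(n.children):
            if n.op == "+":
                c.deriv += n.deriv
            else:  # product of the sibling values, avoiding division by zero
                others = 1.0
                for j, o in enumerate(n.children):
                    if j != i:
                        others *= o.value
                c.deriv += n.deriv * others

nodes, lf = build_circuit()
values = {"la": 1, "l~a": 0, "lb": 1, "l~b": 1,  # evidence: A = true
          "ta": 0.3, "t~a": 0.7, "tb|a": 0.1, "t~b|a": 0.9,
          "tb|~a": 0.8, "t~b|~a": 0.2}
pr_e = evaluate(nodes, values)
differentiate(nodes)
d_lb, d_lnb, d_lna = lf["lb"].deriv, lf["l~b"].deriv, lf["l~a"].deriv
print(pr_e, d_lb, d_lnb, d_lna)
```

Under evidence A = true, the derivatives with respect to λb and λ~b are Pr(a,b) and Pr(a,~b), and the derivative with respect to λ~a is the probability after flipping the evidence to A = false, all obtained from the same two passes.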

  43. Efficient Eval/Diff Schemes
      Assume alternating levels of +/* nodes, with one parent per * node
      • Method A: Two registers per + node (no registers for * nodes)
      • Method B: One register per node (use for values in upward pass, then override with derivatives in downward pass)
      • Method C: One register per node, one bit per * node

  44. Circuit Optimization: MPE
      [Figure: circuit of m and * nodes annotated with node values]

  45. Circuit Optimization: MPE
      [Figure: circuit of m and * nodes]
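One way to read the MPE slides is that the + nodes of the circuit are replaced by maximization nodes. The sketch below makes that assumption explicit and applies it to a two-variable network where A is the parent of B, with hypothetical parameters θa = .3, θb|a = .1, θb|~a = .8. The function name and the way the maximizing instantiation is tracked are choices made here, not taken from the tutorial.

```python
def mpe(lam, theta):
    # Maximizer circuit: each + node of the arithmetic circuit becomes a max.
    # Values are carried as (value, label) pairs so the argmax is recoverable.
    def best_b(ctx):
        return max((lam["b"] * theta["b|" + ctx], "b"),
                   (lam["~b"] * theta["~b|" + ctx], "~b"))
    candidates = []
    for ctx in ("a", "~a"):
        vb, b = best_b(ctx)
        candidates.append((lam[ctx] * theta[ctx] * vb, (ctx, b)))
    return max(candidates)

theta = {"a": 0.3, "~a": 0.7, "b|a": 0.1, "~b|a": 0.9,
         "b|~a": 0.8, "~b|~a": 0.2}
# Evidence A = true: indicators contradicting it are set to 0.
value_a, inst_a = mpe({"a": 1, "~a": 0, "b": 1, "~b": 1}, theta)
# No evidence: all indicators are 1.
value_all, inst_all = mpe({"a": 1, "~a": 1, "b": 1, "~b": 1}, theta)
print(value_a, inst_a, value_all, inst_all)
```

The cost is the same single upward pass as for probability of evidence, with max in place of addition.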

  46. Custom Hardware for Evaluating ACs Adharapurapu, Ercegovac (2004)

  47. Offline Compilation
      • Factoring MLFs into ACs:
        • Jointree: Embeds AC
        • Variable Elimination: Trace is an AC
        • Recursive Conditioning: Trace is an AC
        • Reduction to Logic: CNF to d-DNNF compilation
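The remark that the trace of variable elimination is an AC can be illustrated by running elimination on factors whose entries are circuit nodes instead of numbers. This is a minimal sketch under assumptions made here: a two-variable network where A is the parent of B, binary variables, tuple-based circuit nodes, and hypothetical leaf names such as `lam_a` and `th_b|a`. Each factor multiplication records a * node and each summing-out records a + node.

```python
from itertools import product

# Circuit nodes are tuples: ("leaf", name), ("+", children), ("*", children).
class Factor:
    def __init__(self, vars, table):
        self.vars = tuple(vars)  # variable names
        self.table = table       # tuple of bools -> circuit node

def multiply(f, g):
    vars = f.vars + tuple(v for v in g.vars if v not in f.vars)
    table = {}
    for vals in product([True, False], repeat=len(vars)):
        w = dict(zip(vars, vals))
        table[vals] = ("*", (f.table[tuple(w[v] for v in f.vars)],
                             g.table[tuple(w[v] for v in g.vars)]))
    return Factor(vars, table)

def sum_out(f, var):
    i = f.vars.index(var)
    vars = f.vars[:i] + f.vars[i + 1:]
    table = {}
    for vals in product([True, False], repeat=len(vars)):
        t = vals[:i] + (True,) + vals[i:]
        u = vals[:i] + (False,) + vals[i:]
        table[vals] = ("+", (f.table[t], f.table[u]))
    return Factor(vars, table)

def evaluate(node, leaf_values):
    kind, payload = node
    if kind == "leaf":
        return leaf_values[payload]
    vals = [evaluate(c, leaf_values) for c in payload]
    if kind == "+":
        return sum(vals)
    out = 1.0
    for v in vals:
        out *= v
    return out

def lit(var, val):
    return var.lower() if val else "~" + var.lower()

def indicator(var):
    return Factor([var], {(x,): ("leaf", "lam_" + lit(var, x))
                          for x in (True, False)})

cpt_a = Factor(["A"], {(x,): ("leaf", "th_" + lit("A", x))
                       for x in (True, False)})
cpt_b = Factor(["A", "B"],
               {(a, b): ("leaf", "th_" + lit("B", b) + "|" + lit("A", a))
                for a in (True, False) for b in (True, False)})

# Eliminate B, then A; the single remaining entry is the circuit root.
f = sum_out(multiply(indicator("B"), cpt_b), "B")
f = sum_out(multiply(multiply(indicator("A"), cpt_a), f), "A")
root = f.table[()]

leaf_values = {"lam_a": 1, "lam_~a": 0, "lam_b": 1, "lam_~b": 1,  # A = true
               "th_a": 0.3, "th_~a": 0.7, "th_b|a": 0.1, "th_~b|a": 0.9,
               "th_b|~a": 0.8, "th_~b|~a": 0.2}
pr_a = evaluate(root, leaf_values)
print(pr_a)
```

The elimination runs once, offline; afterwards the recorded circuit can be re-evaluated for any evidence by changing only the indicator values.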

  48. Compiling using Jointrees
      • Classical Jointree Algorithm:
        • Convert model into jointree
        • Jointree propagation (two-passes)
      • Modern interpretation:
        • Jointree embeds an AC that factors MLF
        • Jointree propagation is evaluating/differentiating embedded AC
