
Knowledge Repn. & Reasoning Lec #24: Approximate Inference in DBNs



Presentation Transcript


  1. Knowledge Repn. & Reasoning Lec #24: Approximate Inference in DBNs UIUC CS 498: Section EA Professor: Eyal Amir Fall Semester 2004 (Some slides by X. Boyen & D. Koller, and by S. H. Lim; some slides by Doucet, de Freitas, Murphy, Russell, and H. Zhou)

  2. Dynamic Systems • Filtering in stochastic, dynamic systems: • Monitoring freeway traffic (from an autonomous driver or for traffic analysis) • Monitoring a patient's symptoms • Models that deal with uncertainty and/or partial observability in dynamic systems: • Hidden Markov Models (HMMs), Kalman filters, etc. • All are special cases of Dynamic Bayesian Networks (DBNs)

  3. Previously • Exact DBN inference • Filtering • Smoothing • Projection • Explanation

  4. DBN Myth • Bayesian Network: a decomposed structure to represent the full joint distribution • Does it imply easy decomposition for the belief state? • No!

  5. Tractable, approximate representation • Exact inference in DBNs is intractable • Need approximation • Maintain an approximate belief state • E.g., assume Gaussian processes • Today: • Factored belief-state approximation [Boyen & Koller '98] • Particle filtering (if time permits)

  6. Idea • Use a decomposable representation for the belief state (pre-assume some independence)

  7. Problem • What about the approximation errors? • They might accumulate and grow without bound…

  8. Contraction property • Main result: • If the process is mixing, then every state transition results in a contraction of the distance between the two distributions by a constant factor • Since approximation errors from previous steps decrease exponentially, the overall error remains bounded indefinitely

  9. Basic framework • Definition 1: • Prior belief state: • Posterior belief state: • Monitoring task:
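The slide's formulas are not reproduced in the transcript. A standard way to write them, following the Boyen–Koller setup (the notation below is a reconstruction, not copied from the slide):

\[
\sigma^{(\bullet t)}(s) = P\big(S^{(t)} = s \mid r^{(1)},\dots,r^{(t-1)}\big) \quad \text{(prior belief state)}
\]
\[
\sigma^{(t)}(s) = P\big(S^{(t)} = s \mid r^{(1)},\dots,r^{(t)}\big) \quad \text{(posterior belief state)}
\]

Monitoring task: given \(\sigma^{(t)}\) and the new observation \(r^{(t+1)}\), compute \(\sigma^{(t+1)}\).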

  10. Simple contraction • Distance measure: • Relative entropy (KL-divergence) between the actual and the approximate belief state • Contraction due to O: • Contraction due to T (can we do better?):
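The two contraction facts referenced above can be written as follows (a reconstruction from standard information-theoretic results, not copied from the slide). Let φ be the exact and ψ the approximate belief state:

\[
\mathbb{E}_{O}\big[\, D\big(\phi \mid O \;\|\; \psi \mid O\big) \,\big] \;\le\; D(\phi \,\|\, \psi)
\quad \text{(conditioning on an observation does not increase the expected KL divergence)}
\]
\[
D(\phi T \,\|\, \psi T) \;\le\; D(\phi \,\|\, \psi)
\quad \text{(passing both distributions through the transition model } T \text{ never increases KL divergence)}
\]

The "can we do better?" question is answered by the mixing-rate bound on the next slide, which turns the second inequality into a strict contraction.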

  11. Simple contraction (cont) • Definition: • Minimal mixing rate: • Theorem 3 (the single process contraction theorem): • For process Q, anterior distributions φ and ψ, ulterior distributions φ’ and ψ’,
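The transcript omits the formulas; the mixing rate and the contraction bound are usually stated as follows (a reconstruction, so treat the exact form as an approximation of the slide):

\[
\gamma_Q \;=\; \min_{i_1, i_2} \sum_{j} \min\big\{ Q(i_1 \to j),\; Q(i_2 \to j) \big\}
\]
\[
D(\phi' \,\|\, \psi') \;\le\; (1 - \gamma_Q)\, D(\phi \,\|\, \psi)
\]

That is, a single pass through a transition model with mixing rate \(\gamma_Q\) shrinks the KL divergence by at least a factor of \(1-\gamma_Q\).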

  12. Simple contraction (cont) • Proof Intuition:

  13. Compound processes • Mixing rate could be very small for large processes • The trick is to assume some independence among subprocesses and factor the DBN along them • Fully independent subprocesses: • Theorem 5: • For L independent subprocesses T_1, …, T_L, let γ_l be the mixing rate of T_l and let γ = min_l γ_l. Let φ and ψ be distributions over S_1^(t), …, S_L^(t), and assume that ψ renders the S_l^(t) marginally independent. Then:

  14. Compound processes (cont) • Conditionally independent subprocesses • Theorem 6 (the main theorem): • For L conditionally independent subprocesses T_1, …, T_L, assume each process depends on at most r others and influences at most q others. Let γ_l be the mixing rate of T_l and let γ = min_l γ_l. Let φ and ψ be distributions over S_1^(t), …, S_L^(t), and assume that ψ renders the S_l^(t) marginally independent. Then:

  15. Efficient, approximate monitoring • If each approximation incurs an error bounded by ε, then • Total error: • => the error remains bounded • Conditioning on observations might introduce momentary errors, but the expected error will contract
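If each projection step adds at most ε of new error and each transition contracts the old error by a factor (1 − γ), the accumulated error forms a geometric series (a sketch of the argument, not the slide's exact statement):

\[
\text{total error at time } t \;\le\; \sum_{k=0}^{t-1} (1-\gamma)^{k}\,\varepsilon \;\le\; \frac{\varepsilon}{\gamma},
\]

which stays bounded for all t as long as γ > 0.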

  16. Approximate DBN monitoring • Algorithm (based on standard clique-tree inference): • Construct a clique tree from the 2-TBN • Initialize the clique tree with conditional probabilities from the CPTs of the DBN • For each time step: • Create a working copy Y of the tree. Create σ^(t+1). • For each subprocess l, incorporate the marginal σ^(t)[X_l^(t)] into the appropriate factor in Y. • Incorporate evidence r^(t+1) in Y. • Calibrate the potentials in Y. • For each l, query Y for the marginal over X_l^(t+1) and store it in σ^(t+1).
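A minimal runnable sketch of the same predict/condition/project cycle (not the clique-tree algorithm above) for a toy two-variable DBN; all CPT numbers are made up for illustration, and the belief state is kept factored as P(A)P(B):

```python
import numpy as np

# Toy 2-variable DBN: state (A, B), each binary.
# Transition: A' depends on (A, B); B' depends on B only.
# Observation: a noisy reading of A and a noisy reading of B.
P_A = np.zeros((2, 2, 2))                      # P_A[a, b, a'] = P(A'=a' | A=a, B=b)
P_A[:, :, 1] = [[0.1, 0.4], [0.7, 0.9]]
P_A[:, :, 0] = 1 - P_A[:, :, 1]
P_B = np.array([[0.8, 0.2], [0.3, 0.7]])       # P_B[b, b'] = P(B'=b' | B=b)
P_OA = np.array([[0.9, 0.1], [0.2, 0.8]])      # P_OA[a, o] = P(o_a=o | A=a)
P_OB = np.array([[0.85, 0.15], [0.25, 0.75]])  # P_OB[b, o] = P(o_b=o | B=b)

def bk_step(mA, mB, o_a, o_b):
    """One Boyen-Koller style step with the belief state factored as P(A)P(B)."""
    # Predict: exact one-step-ahead joint from the factored prior
    joint = np.einsum('a,b,abc,bd->cd', mA, mB, P_A, P_B)   # joint[a', b']
    # Condition on the observations
    joint = joint * P_OA[:, o_a][:, None] * P_OB[:, o_b][None, :]
    joint /= joint.sum()
    # Project: keep only the marginals (this is the approximation step)
    return joint.sum(axis=1), joint.sum(axis=0)

mA, mB = np.array([0.5, 0.5]), np.array([0.5, 0.5])
for o_a, o_b in [(1, 0), (1, 1), (0, 1)]:       # a short observation sequence
    mA, mB = bk_step(mA, mB, o_a, o_b)
    print(mA, mB)
```

The joint here is only 2x2, so the prediction step is exact; in a real DBN the clique tree plays that role and only the final projection onto per-subprocess marginals introduces error.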

  17. Conclusion of Factored DBNs • Accuracy-efficiency tradeoff: • Small partition => • Faster inference • Better contraction • Worse approximation • Key to good approximation: • Discover weak/sparse interactions among subprocesses and factor the DBN along these lines • Domain knowledge helps

  18. Agenda • Factored inference in DBNs • Sampling: Particle Filtering

  19. A sneak peek at particle filtering

  20. Introduction • Analytical methods • Kalman filter: linear-Gaussian models • HMM: models with finite state spaces • Statistical approximation methods for non-parametric distributions and large discrete DBNs • Different names: • Sequential Monte Carlo (Handschin and Mayne 1969, Akashi and Kumamoto 1975) • Particle filtering (Doucet et al. 1997) • Survival of the fittest (Kanazawa, Koller and Russell 1995) • Condensation in computer vision (Isard and Blake 1996)

  21. Outline • Importance Sampling (IS) revisited • Sequential IS (SIS) • Particle Filtering = SIS + Resampling • Dynamic Bayesian Networks • A Simple example: ABC network • Inference in DBN: • Exact inference • Pure Particle Filtering • Rao-Blackwellised PF • Demonstration in ABC network • Discussions

  22. Importance Sampling Revisited • Goal: evaluate the following functional • Importance Sampling (batch mode): • Sample from • Assign as weight of each sample • The posterior estimation of is:
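The slide's formulas are missing from the transcript; the standard importance-sampling identities the bullets refer to look like this (a reconstruction):

\[
I = \mathbb{E}_{p}\!\left[f(x)\right] = \int f(x)\,\frac{p(x)}{q(x)}\,q(x)\,dx,
\qquad x^{(i)} \sim q(\cdot),\qquad
w^{(i)} = \frac{p(x^{(i)})}{q(x^{(i)})},
\]
\[
\hat{I} = \frac{\sum_{i=1}^{N} w^{(i)} f(x^{(i)})}{\sum_{i=1}^{N} w^{(i)}}.
\]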

  23. Sequential Importance Sampling • How to make it sequential? • Choose the importance function to be: • We get the SIS filter • Benefit of SIS: • Observations y_k don't have to be given in batch
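Written out (again a reconstruction of the omitted formulas), choosing an importance function that factorizes over time gives the recursive weight update:

\[
q(x_{0:k} \mid y_{0:k}) = q(x_{0:k-1} \mid y_{0:k-1})\; q(x_k \mid x_{0:k-1}, y_{0:k}),
\]
\[
w_k^{(i)} \;\propto\; w_{k-1}^{(i)}\;
\frac{p(y_k \mid x_k^{(i)})\; p(x_k^{(i)} \mid x_{k-1}^{(i)})}{q(x_k^{(i)} \mid x_{0:k-1}^{(i)}, y_{0:k})},
\]

so each new observation \(y_k\) is absorbed with one multiplicative update per particle rather than a batch recomputation.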

  24. Sequential Importance Sampling

  25. Resampling • Why do we need to resample? • Degeneracy of SIS • The variance of the importance weights (y_{0:k} being random) increases at each recursion step • Optimal importance function • Need to sample from and evaluate • Resampling: eliminate small weights and concentrate on large weights

  26. Resampling • Measure of degeneracy: effective sample size
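The usual estimate of the effective sample size from the normalized weights (most likely the formula shown on the slide):

\[
\widehat{N}_{\mathrm{eff}} = \frac{1}{\sum_{i=1}^{N} \big(\tilde{w}_k^{(i)}\big)^2},
\]

and resampling is typically triggered when \(\widehat{N}_{\mathrm{eff}}\) falls below a threshold such as \(N/2\) (the threshold is a common convention, not stated on the slide).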

  27. Resampling Step Particle filtering = SIS + Resampling
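A minimal bootstrap particle filter in Python for a one-dimensional nonlinear model, just to make the SIS + resampling loop concrete; the model and all constants below are illustrative assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500                                        # number of particles

def transition(x):                             # x_k = 0.5*x + 25*x/(1+x^2) + noise
    return 0.5 * x + 25 * x / (1 + x**2) + rng.normal(0, np.sqrt(10), size=x.shape)

def likelihood(y, x):                          # y_k = x_k^2 / 20 + unit-variance noise
    return np.exp(-0.5 * (y - x**2 / 20) ** 2)

def particle_filter(ys):
    x = rng.normal(0, 2, size=N)               # initial particle cloud
    w = np.full(N, 1.0 / N)
    estimates = []
    for y in ys:
        x = transition(x)                      # SIS with the prior as proposal
        w *= likelihood(y, x)                  # weight update
        w /= w.sum()
        estimates.append(np.sum(w * x))        # posterior-mean estimate
        n_eff = 1.0 / np.sum(w ** 2)           # effective sample size
        if n_eff < N / 2:                      # resample: kill small weights,
            idx = rng.choice(N, size=N, p=w)   # duplicate large ones
            x, w = x[idx], np.full(N, 1.0 / N)
    return np.array(estimates)

# Example usage: filter a short simulated observation sequence
true_x, ys = 0.0, []
for _ in range(30):
    true_x = 0.5 * true_x + 25 * true_x / (1 + true_x**2) + rng.normal(0, np.sqrt(10))
    ys.append(true_x**2 / 20 + rng.normal(0, 1))
print(particle_filter(np.array(ys))[:5])
```

Using the transition prior as the proposal is the simplest (bootstrap) choice; a better proposal that looks at y_k reduces weight variance, which is exactly the degeneracy issue discussed above.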

  28. Rao-Blackwellisation for SIS • A method to reduce the variance of the final posterior estimate • Useful when the state can be partitioned so that one part can be analytically marginalized. • Assuming that part can be evaluated analytically given the rest, one can rewrite the posterior estimate as
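In symbols (a reconstruction, with the partition written explicitly since the slide's symbols are missing from the transcript): suppose the state splits as x_k = (a_k, b_k), where p(b_{0:k} | a_{0:k}, y_{0:k}) can be computed exactly, e.g. by a Kalman filter or an exact discrete step. Then

\[
p(a_{0:k}, b_{0:k} \mid y_{0:k}) = p(b_{0:k} \mid a_{0:k}, y_{0:k})\; p(a_{0:k} \mid y_{0:k}),
\]

so particles are needed only for \(a_{0:k}\), and the estimator

\[
\hat{I} = \sum_{i} \tilde{w}^{(i)}\, \mathbb{E}\big[f(a_{0:k}^{(i)}, b_{0:k}) \mid a_{0:k}^{(i)}, y_{0:k}\big]
\]

has lower variance than the plain particle estimate (by the Rao–Blackwell theorem).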

  29. Example: ABC network

  30. Inference in DBN

  31. Exact inference in ABC network

  32. Particle filtering

  33. Rao-Blackwellised PF

  34. Rao-Blackwellised PF (2)

  35. Rao-Blackwellised PF (3)

  36. Rao-Blackwellised PF (4)

  37. Discussions • Structure of the network: • A and C are dependent on B • y_t can also be separated into 3 independent parts
