

Presentation Transcript


  1. Introduction into Bayesian Probability Theory Udo v. Toussaint

  2. Outline Bayesian Probability Theory: Basics Parameter Estimation Model Comparison Outlook Acknowledgment (T)

  3. Basics Likelihood optimization P(D|θ,I). Confusion of direct and indirect probabilities (see later). (F,T)

  4. Example What is the surface temperature? Available information and measurements: (1) A coworker tells you: the temperature is known to be between 1000 and 1100 K. (2) Photon counter: 2 measurements (t = 1 s) with c1 = 9 and c2 = 8 photons; number of counts = a·T^4·t, a = 10^-11 cts/(K^4 s). (3) 2 temperature measurements: T1 = 1030 K and T2 = 997 K, uncertainty σ = 15 K. What is your best temperature estimate? And how reliable is your estimate (uncertainty)? (T)

  5. Basics: The origin (F)

  6. Basics Sometimes the notation p(A|B,I) is also used: the “I” is a kind of placeholder for the continuum of other possible, but not yet specified, conditioning information. (F,T)

  7. Basics: How to use sum and product rules (F)

  8. Basics: How to use sum and product rules (F)
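Slides 7 and 8 apply the two basic rules of probability theory. The extracted text does not show them, so for reference (standard results, written in the notation of the slides):

  p(A|I) + p(\bar{A}|I) = 1                                   (sum rule)
  p(A,B|I) = p(A|B,I)\,p(B|I) = p(B|A,I)\,p(A|I)              (product rule)

All manipulations on the following slides, including Bayes' theorem itself, follow from these two rules.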

  9. Basics: The key message Bayesian probability theory processes information: it updates probability density distributions (the prior) for the parameters θ, taking into account new information (the measured data D). Bayes' theorem relates direct and inverse probabilities: Posterior = Likelihood × Prior / Evidence. Posterior maximization returns the parameters which are most probable given the data, whereas a Chi2 fit (likelihood maximization) returns the parameters which make the data most probable. Important: direct (Chi2 fit) and inverse (posterior) probabilities are nearly always different! You need the posterior distribution, since you condition on the data! (T)
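In formula form (the standard statement that the slide's labels refer to):

  p(\theta|D,I) = \frac{p(D|\theta,I)\,p(\theta|I)}{p(D|I)}        (Posterior = Likelihood × Prior / Evidence)

The denominator p(D|I), the evidence, does not depend on \theta and acts only as a normalization constant in parameter estimation.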

  10. Basics A Chi2 fit (likelihood maximization) returns the parameters which make the data most probable. But posterior maximization returns the parameters which are most probable given the data. Important: direct and inverse probabilities are nearly always different! Example: Q: What is the probability of the street being wet when it rains? A: p(wet|rain) = 100%. Q: What is the probability of rain when the street is wet, p(rain|wet) = ? A: There are many other possible reasons for a wet street (street cleaning, somebody washing their car, floods, …). Therefore p(rain|wet) < p(wet|rain)! Conclusion: specify likelihood and prior and use Bayes' theorem to compute the inverse probability instead of relying on the Chi2 fit alone! (T)
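A small worked illustration of this asymmetry (the numbers are invented for this transcript, not taken from the slides): assume p(wet|rain) = 1, p(wet|no rain) = 0.2 and p(rain) = 0.1. Then

  p(rain|wet) = \frac{p(wet|rain)\,p(rain)}{p(wet|rain)\,p(rain) + p(wet|no\ rain)\,p(no\ rain)}
              = \frac{1 \cdot 0.1}{1 \cdot 0.1 + 0.2 \cdot 0.9} \approx 0.36,

far smaller than p(wet|rain) = 1.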

  11. Basics: Recipe In parameter estimation problems the normalization constant p(D) is often ignored, to avoid the integration over the parameter space. (F,T)
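A minimal numerical sketch of this recipe (a toy example with a single Gaussian measurement and a flat prior; the numbers and names are illustrative, not from the slides):

  import numpy as np

  # toy data: one measurement d with Gaussian uncertainty sigma
  d, sigma = 3.0, 2.0

  theta = np.linspace(-10.0, 15.0, 1001)             # parameter grid
  prior = np.ones_like(theta)                        # flat (unnormalized) prior
  likelihood = np.exp(-0.5 * ((d - theta) / sigma) ** 2)

  posterior_unnorm = likelihood * prior              # Bayes' theorem without p(D)
  # normalize numerically only when the full pdf is actually needed (e.g. for plotting)
  posterior = posterior_unnorm / np.trapz(posterior_unnorm, theta)

  print("maximum of the posterior at theta =", theta[np.argmax(posterior)])

For locating the maximum of the posterior the normalization is irrelevant, which is exactly why p(D) can be dropped in parameter estimation.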

  12. Basics: Likelihood Remark: Given only the first two moments of a distribution the Maximum Entropy Principle yields the Gaussian distribution (F,T)
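Explicitly (a standard result, added here for completeness): for a fixed mean \mu and variance \sigma^2 the maximum entropy distribution is the Gaussian

  p(x|\mu,\sigma,I) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),

which is why a Gaussian likelihood is the natural default when a measurement is reported only as a value with an error bar.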

  13. Basics: Priors A constant prior is most likely not what you want, or it can have unexpected implications, even for “simple” tasks like straight-line fits; see e.g. “hyperplane priors”. Therefore use your physical knowledge to derive an informative prior! (F,T)

  14. Basics: Priors Principles to derive prior distributions for subsequent use in a Bayesian computation, e.g. transformation invariance or maximum entropy distributions. (F,T)
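A standard example of the transformation-invariance argument (not spelled out in the extracted slide text): for a scale parameter \sigma > 0, demanding that the prior keep its form under a rescaling \sigma \to a\sigma leads to Jeffreys' prior

  p(\sigma|I) \propto \frac{1}{\sigma},

i.e. a prior that is flat in \log\sigma.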

  15. Basics: Display of Results Display full probability distribution whenever possible (no ambiguities) (F,T)

  16. Parameter Estimation Example 1: single data point + prior information. Prior information (e.g. from previous measurements): parameter m = 7 ± 3 (black line). Measurement: d = 3 ± 2 (green line). Posterior (without normalization): red line. (T)
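For Gaussian prior and Gaussian likelihood this product can be written down in closed form (standard result; the numbers m = 7 ± 3 and d = 3 ± 2 are the reconstructed slide values and should be checked against the original figure). With prior width \sigma_0 = 3 and measurement error \sigma_d = 2,

  p(\mu|d,I) \propto \exp\left(-\frac{(\mu-m)^2}{2\sigma_0^2}\right)\exp\left(-\frac{(d-\mu)^2}{2\sigma_d^2}\right)

is again a Gaussian with

  \mu_{post} = \frac{m/\sigma_0^2 + d/\sigma_d^2}{1/\sigma_0^2 + 1/\sigma_d^2} \approx 4.2, \qquad
  \sigma_{post} = \left(\frac{1}{\sigma_0^2}+\frac{1}{\sigma_d^2}\right)^{-1/2} \approx 1.7 .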

  17. Parameter Estimation Example 2: estimate the pdf for the count rate r (counts per measurement time t). Data: two photon counts, c1 = 5 and c2 = 7 (counting experiment -> Poisson distribution). Prior: 1) flat; 2) exponential (incorporating the prior knowledge that the expected mean number of counts is around 1). Posterior for the 2nd case: shown on the slide. (T)
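Written out with the slide's data (c1 = 5, c2 = 7, unit measurement time; the exponential prior with mean 1 is taken as p(r|I) \propto e^{-r}, an assumption consistent with the slide text):

  flat prior:         p(r|c_1,c_2,I) \propto r^{c_1} e^{-r}\, r^{c_2} e^{-r} = r^{12} e^{-2r}
  exponential prior:  p(r|c_1,c_2,I) \propto r^{12} e^{-2r}\, e^{-r} = r^{12} e^{-3r}

Both are Gamma distributions; the informative prior pulls the posterior toward smaller rates.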

  18. Parameter Estimation: Example from slide 4 Answer to the questions of slide 4: the most likely value is T ≈ 1012 ± 10 K. This value (the untruncated, unnormalized posterior pdf p(T|c1,c2,…) is displayed as the blue line) is mostly determined by the direct temperature measurements (pink dashed line), but is slightly shifted to lower temperatures by the (weak) influence of the photon measurements (green line). The prior term p(T|I) truncates the posterior outside of the interval 1000-1100 K. Data: c1 = 9 cts, c2 = 8 cts, T1 = 997 K, T2 = 1030 K, σ = 15 K, a = 10^-11 cts/(K^4 s). p(T|c1,c2,a,T1,T2,σ,I) ∝ pPoisson(c1|a,T,I) · pPoisson(c2|a,T,I) · pGauss(T1|σ,T,I) · pGauss(T2|σ,T,I) · p(T|I) (T)
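A minimal grid evaluation of this posterior (a direct transcription of the formula above into Python; plotting and the T-independent factorial terms of the Poisson factors are omitted):

  import numpy as np

  # data from slides 4 and 18
  c1, c2 = 9, 8                         # photon counts, t = 1 s each
  T1, T2, sigma = 997.0, 1030.0, 15.0   # direct temperature measurements [K]
  a = 1e-11                             # cts / (K^4 s)

  T = np.linspace(1000.0, 1100.0, 2001)  # flat prior on [1000, 1100] K
  lam = a * T**4                         # expected number of counts

  def log_poisson(c, lam):
      # log of the Poisson likelihood, up to a T-independent constant
      return c * np.log(lam) - lam

  def log_gauss(x, mu, sigma):
      return -0.5 * ((x - mu) / sigma) ** 2

  log_post = (log_poisson(c1, lam) + log_poisson(c2, lam)
              + log_gauss(T1, T, sigma) + log_gauss(T2, T, sigma))
  post = np.exp(log_post - log_post.max())
  post /= np.trapz(post, T)

  print("most probable temperature: %.0f K" % T[np.argmax(post)])

This should reproduce, to within the grid spacing, the T ≈ 1012 K quoted above.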

  19. Model Comparison Now (multi-parametric) models instead of single parameters: (L)

  20. Model Comparison (L)
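Slides 19 and 20 introduce model comparison; the formulas are not contained in the extracted text, so the standard expressions they refer to are quoted here for completeness. The posterior probability of a model M_i uses the evidence (the normalization constant of parameter estimation) as its likelihood:

  p(M_i|D,I) = \frac{p(D|M_i,I)\,p(M_i|I)}{p(D|I)}, \qquad
  p(D|M_i,I) = \int p(D|\theta,M_i,I)\,p(\theta|M_i,I)\,d\theta

and two models are compared via the odds ratio

  \frac{p(M_1|D,I)}{p(M_2|D,I)} = \frac{p(M_1|I)}{p(M_2|I)} \cdot \frac{p(D|M_1,I)}{p(D|M_2,I)} .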

  21. Outlook: Outliers (F)

  22. Outlook: Outliers (F)
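Slides 21 and 22 are figures; a common Bayesian treatment of outliers (given here as a standard example, not necessarily the one shown on the slides) replaces the single Gaussian likelihood by a two-component mixture,

  p(d_i|\theta,\beta,I) = (1-\beta)\, N(d_i;\, f_i(\theta), \sigma_i) + \beta\, N(d_i;\, f_i(\theta), \gamma\sigma_i), \qquad \gamma \gg 1,

so that data points far from the model are explained by the broad component instead of distorting the fit.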

  23. Outlook: Markov Chain Monte Carlo: How to do the integrals (P)

  24. Outlook: Markov Chain Monte Carlo (P)

  25. Outlook: Markov Chain Monte Carlo (P)
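Slides 23 to 25 are figures; as a minimal illustration of how the required integrals are handled in practice, here is a generic random-walk Metropolis sampler (a sketch, not the implementation shown on the slides):

  import numpy as np

  def metropolis(log_post, theta0, n_steps=10000, step=0.5, seed=0):
      # draws samples whose density is proportional to exp(log_post(theta))
      rng = np.random.default_rng(seed)
      theta, lp = float(theta0), log_post(theta0)
      samples = np.empty(n_steps)
      for i in range(n_steps):
          prop = theta + step * rng.normal()           # symmetric proposal
          lp_prop = log_post(prop)
          if np.log(rng.uniform()) < lp_prop - lp:     # accept/reject step
              theta, lp = prop, lp_prop
          samples[i] = theta
      return samples

  # usage: sample a toy Gaussian posterior and estimate its mean
  samples = metropolis(lambda t: -0.5 * (t - 3.0)**2, theta0=0.0)
  print(samples[1000:].mean())   # close to 3 after discarding burn-in

Posterior expectation values (means, error bars, marginal distributions) are then simple averages over the samples, which avoids explicit high-dimensional integration.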

  26. Outlook: Resources Websites: http://www.ipp-mpg.info/bda -> Publications (>100 accessible publications from various areas of physics); http://bayes.wustl.edu (by L. Bretthorst, including tutorials by T. Loredo). Books: D.S. Sivia, Data Analysis: A Bayesian Tutorial (good!) (2006); E.T. Jaynes, Probability Theory: The Logic of Science (2003); Proceedings of the Maximum Entropy and Bayesian Methods conferences (yearly series since 1986?). Conferences: MaxEnt 2007 in Albany/NY (USA), http://www.maxent2007.org; ISBA conference (mostly for statisticians). (T)

  27. Acknowledgement This presentation relied heavily on slides from T. Loredo, Cornell (see http://bayes.wustl.edu), R. Preuss, IPP, and R. Fischer, IPP, and on some of my own contributions. The origin of the slides is indicated at the bottom of each slide with (L, P, F, T). Many thanks to all of them! (T)
