
Bayesian Inference and Networks: Why should a biologist care?


Presentation Transcript


  1. Bayesian Inference and Networks: Why should a biologist care? Paul E. Anderson, Ph.D.

  2. Why should we care? • It lets us answer the questions we really want to know! http://www.sciencemag.org/content/294/5550/2310.full.pdf P. Anderson, College of Charleston

  3. Introduction • Suppose you are trying to determine if a patient has pneumonia. You observe the following symptoms: • The patient has a cough • The patient has a fever • The patient has difficulty breathing

  4. Introduction You would like to determine how likely it is that the patient has pneumonia given that the patient has a cough, a fever, and difficulty breathing. We are not 100% certain that the patient has pneumonia given these symptoms. We are dealing with uncertainty!

  5. Introduction Now suppose you order a chest x-ray and the results are positive. Your belief that the patient has pneumonia is now much higher.

  6. Introduction • In the previous slides, what you observed affected your belief that the patient has pneumonia • This is called reasoning with uncertainty • Wouldn’t it be nice if we had some methodology for reasoning with uncertainty? Why, in fact, we do...

  7. Bayesian Networks • Bayesian networks help us reason with uncertainty • In the opinion of many AI researchers, Bayesian networks are the most significant contribution in AI in the last 10 years • They are used in many applications, e.g.: • Spam filtering / Text mining • Speech recognition • Robotics • Diagnostic systems • Syndromic surveillance

  8. Bayesian Networks (An Example) From: Aronsky, D. and Haug, P.J., Diagnosing community-acquired pneumonia with a Bayesian network, In: Proceedings of the Fall Symposium of the American Medical Informatics Association, (1998) 632-636.

  9. The intuition behind the statistics Rephrase the questions in ways we can answer! P. Anderson, College of Charleston

  10. Answering questions about _____ • Fruit on an assembly line • Oranges, grapefruit, lemons, cherries, apples • Sensors measure: • Red intensity • Yellow intensity • Mass (kg) • Approximate volume • At the end of the line, a gate switches to deposit the fruit into the correct bin

  11. Training the algorithm Sensors, scales, etc… Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21 → Apple

  12. Training (2) • Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21, label: Apple → Classifier M. Raymer – WSU, FBS

  13. Testing • Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21, label: ?? → Classifier → prediction M. Raymer – WSU, FBS

  14. Pattern Matrix M. Raymer – WSU, FBS
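
As a concrete illustration of the pattern matrix idea (a hedged sketch, not reproduced from the slide): each row is one fruit that passed the sensors, each column is one measured feature, and a separate vector holds the class labels. The first row reuses the apple reading from the training slide; the other rows and their labels are invented for illustration.

```python
import numpy as np

# Pattern matrix: one row per training sample, one column per feature
# (red intensity, yellow intensity, mass, volume).
X = np.array([
    [2.125, 6.143, 134.32, 24.21],   # the apple from the training slide
    [8.100, 2.300, 180.50, 30.10],   # hypothetical orange
    [7.900, 5.600, 250.00, 41.70],   # hypothetical grapefruit
])

# Class labels, one per row of the pattern matrix.
y = np.array(["apple", "orange", "grapefruit"])

print(X.shape)  # (3 samples, 4 features)
```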

  15. Distributions • Bayesian classifiers start with an estimate of the distribution of the features: Gaussian distribution (continuous), binomial distribution (discrete) M. Raymer – WSU, FBS

  16. Density Estimation • Parametric • Assume a distribution (e.g., Gaussian) • Estimate the parameters (μ, σ) • Non-parametric • Histogram sampling • Bin size is critical • Gaussian smoothing can help M. Raymer – WSU, FBS
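
A minimal sketch of the two approaches (illustrative data, not from the slides): fit a Gaussian by estimating μ and σ for the parametric case, and build a normalized histogram for the non-parametric case, where the number (and hence size) of bins controls how smooth the estimate is.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical apple diameters, in inches.
diameters = rng.normal(loc=3.0, scale=0.4, size=200)

# Parametric: assume a Gaussian and estimate its parameters.
mu, sigma = diameters.mean(), diameters.std(ddof=1)

# Non-parametric: normalized histogram; the bin size is critical.
density, bin_edges = np.histogram(diameters, bins=15, density=True)

print(f"parametric fit: mu={mu:.2f}, sigma={sigma:.2f}")
print(f"histogram estimate over {len(density)} bins")
```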

  17. The Gaussian distribution Multivariate (d-dimensional): p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)) Univariate: p(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²)) A parametric Bayesian classifier must estimate μ and σ (or μ and Σ) from the training samples. M. Raymer – WSU, FBS
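
A sketch of the univariate density in code (the multivariate case replaces σ² with the covariance matrix Σ); the μ and σ passed in are whatever was estimated from the training samples, and the example values are hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# e.g., likelihood of a 4-inch diameter under a hypothetical apple model.
print(gaussian_pdf(4.0, mu=3.0, sigma=0.4))
```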

  18. Making decisions • Once you have the distributions for • Each feature and • Each class • You can ask questions like… If I have an apple, what is the probability that the diameter will be between 3.2 and 3.5 inches? M. Raymer – WSU, FBS
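
With a fitted Gaussian, that question is just a difference of two CDF values. A hedged sketch with hypothetical apple parameters (μ = 3.0″, σ = 0.4″):

```python
import math

def gaussian_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# P(3.2" <= diameter <= 3.5") under the hypothetical apple parameters.
p = gaussian_cdf(3.5, 3.0, 0.4) - gaussian_cdf(3.2, 3.0, 0.4)
print(f"{p:.3f}")
```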

  19. More decisions… [Plot: non-parametric (histogram of count vs. diameter) and parametric (fitted curve) density estimates] M. Raymer – WSU, FBS

  20. A Simple Example • You are given a fruit with a diameter of 4” – is it a pear or an apple? • To begin, we need to know the distributions of diameters for pears and apples. M. Raymer – WSU, FBS

  21. Maximum Likelihood Class-Conditional Distributions [Plot: class-conditional densities P(x) over diameters 1”–6”] M. Raymer – WSU, FBS

  22. What are we asking? • If the fruit is an apple, how likely is it to have a diameter of 4”? • If the fruit is a xenofruit from planet Xircon, how likely is it to have a diameter of 4”? Is this the right question to ask? M. Raymer – WSU, FBS

  23. A Key Problem • We based this decision on P(4” diameter | apple), the class-conditional probability • What we really want to use is P(apple | 4” diameter), the posterior probability • What if we found the fruit in a pear orchard? • We need to know the prior probability of finding an apple or a pear! M. Raymer – WSU, FBS

  24. Statistical decisions… • If a fruit has a diameter of 4”, how likely is it to be an apple? [Diagram: 4” fruit vs. apples] M. Raymer – WSU, FBS

  25. “Inverting” the question Given an apple, what is the probability that it will have a diameter of 4”? Given a 4” diameter fruit, what is the probability that it is an apple? M. Raymer – WSU, FBS

  26. Prior Probabilities • Prior probability + Evidence → Posterior probability • Without evidence, what is the “prior probability” that a fruit is an apple? M. Raymer – WSU, FBS

  27. The heart of it all • Bayes Rule: P(class | evidence) = P(evidence | class) P(class) / P(evidence) M. Raymer – WSU, FBS

  28. Bayes Rule P(A | B) = P(B | A) P(A) / P(B), or, expanding the denominator, P(A | B) = P(B | A) P(A) / Σⱼ P(B | Aⱼ) P(Aⱼ) M. Raymer – WSU, FBS
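
A small sketch of the rule in code (the numbers are illustrative only): each class's posterior is its likelihood times its prior, normalized by the total probability of the evidence.

```python
def posteriors(likelihoods, priors):
    """Bayes rule: P(class | x) = P(x | class) P(class) / P(x),
    where P(x) = sum over classes of P(x | class) P(class)."""
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}

# Hypothetical numbers: likelihood of a 4-inch diameter under each class.
print(posteriors(likelihoods={"apple": 0.8, "pear": 0.3},
                 priors={"apple": 0.9, "pear": 0.1}))
# -> apple 0.96, pear 0.04
```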

  29. Example Revisited • Is it an ordinary apple or an uncommon pear? M. Raymer – WSU, FBS

  30. Bayes Rule Example M. Raymer – WSU, FBS

  31. Bayes Rule Example M. Raymer – WSU, FBS

  32. Solution M. Raymer – WSU, FBS

  33. Marginal Distributions M. Raymer – WSU, FBS

  34. Combining Marginals • Assuming independent features: P(x1, …, xd | class) = P(x1 | class) × ⋯ × P(xd | class) • If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier). M. Raymer – WSU, FBS
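
Putting the pieces together, a minimal naïve Bayes sketch (everything here, including the per-class means and standard deviations, is hypothetical): multiply the Gaussian marginal of each feature, multiply by the class prior, and pick the class with the larger score.

```python
import math

def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical per-class, per-feature Gaussian parameters: (mu, sigma).
params = {
    "apple": {"diameter": (3.0, 0.4), "mass": (150.0, 20.0)},
    "pear":  {"diameter": (2.5, 0.3), "mass": (180.0, 25.0)},
}
priors = {"apple": 0.9, "pear": 0.1}

def naive_bayes_score(x, cls):
    """Unnormalized posterior: prior times the product of feature marginals."""
    score = priors[cls]
    for feature, value in x.items():
        mu, sigma = params[cls][feature]
        score *= gaussian_pdf(value, mu, sigma)
    return score

x = {"diameter": 4.0, "mass": 160.0}
print(max(priors, key=lambda cls: naive_bayes_score(x, cls)))
```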

  35. Bayes Decision Rule • Provably optimal when the features (evidence) follow Gaussian distributions, and are independent. M. Raymer – WSU, FBS

  36. Likelihood Ratios • When deciding between two possibilities, we don’t need the exact probabilities. We only need to know which one is greater. • The denominator for all the classes is always equal. • Can be eliminated • Useful when there are many possible classes M. Raymer – WSU, FBS
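
A hedged sketch of the ratio idea: since the denominator P(x) is shared, comparing two classes only needs the ratio of the numerators, and the decision is whichever side of 1 the ratio falls on.

```python
def likelihood_ratio(lik_a, prior_a, lik_b, prior_b):
    """Ratio of unnormalized posteriors for class A vs. class B.
    The shared denominator P(x) cancels, so it never has to be computed."""
    return (lik_a * prior_a) / (lik_b * prior_b)

# Hypothetical numbers: choose class A if the ratio exceeds 1.
ratio = likelihood_ratio(lik_a=0.8, prior_a=0.9, lik_b=0.3, prior_b=0.1)
print("A" if ratio > 1 else "B")
```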

  37. Likelihood Ratio Example  M. Raymer – WSU, FBS

  38. Likelihood Ratio Example M. Raymer – WSU, FBS

  39. In-class example: Oranges Grapefruit M. Raymer – WSU, FBS

  40. Example (cont’d) • After observing several hundred fruit pass down the assembly line, we observe that • 72% are oranges • 28% are grapefruit • Fruit ‘x’ • Red intensity = 8.2 • Mass = 7.6 What shall we predict for the class of fruit ‘x’? M. Raymer – WSU, FBS

  41. The whole enchilada P(orange | red, mass) = P(red, mass | orange) P(orange) / P(red, mass) and, under the naïve assumption, P(red, mass | orange) = P(red | orange) P(mass | orange). Repeat for grapefruit and predict the more probable class. M. Raymer – WSU, FBS

  42. The whole enchilada (2) M. Raymer – WSU, FBS

  43. The whole enchilada (3) M. Raymer – WSU, FBS

  44. Conclusion Predict that fruit ‘x’ is a grapefruit, despite the relative scarcity of grapefruits on the conveyor belt. M. Raymer – WSU, FBS

  45. Abbreviated • Since the denominator is the same for all classes, we can just compare P(red | orange) P(mass | orange) P(orange) and P(red | grapefruit) P(mass | grapefruit) P(grapefruit) M. Raymer – WSU, FBS

  46. Likelihood comparison M. Raymer – WSU, FBS
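
To make the comparison concrete, here is a hedged version of the conveyor-belt example in code. The priors (0.72 / 0.28) and the observation (red intensity 8.2, mass 7.6) come from the slides; the per-class Gaussian parameters are invented for illustration and chosen so that the grapefruit likelihoods dominate, which is how the scarcer class can still win.

```python
import math

def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

priors = {"orange": 0.72, "grapefruit": 0.28}   # from the slides
x = {"red": 8.2, "mass": 7.6}                   # fruit 'x' from the slides

# Hypothetical class-conditional parameters: (mu, sigma) per feature.
params = {
    "orange":     {"red": (9.5, 0.5), "mass": (8.5, 0.5)},
    "grapefruit": {"red": (8.0, 0.5), "mass": (7.5, 0.5)},
}

scores = {}
for cls in priors:
    score = priors[cls]
    for feature, value in x.items():
        mu, sigma = params[cls][feature]
        score *= gaussian_pdf(value, mu, sigma)
    scores[cls] = score

print(scores)                       # unnormalized posteriors (denominator dropped)
print(max(scores, key=scores.get))  # grapefruit, despite the smaller prior
```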

  47. What if we want more complexity? Bayesian Networks P. Anderson, College of Charleston

  48. Bayesian Networks are built upon Independence Variables A and B are independent if any of the following hold: • P(A,B) = P(A)P(B) • P(A | B) = P(A) • P(B | A) = P(B) This says that knowing the outcome of A does not tell me anything new about the outcome of B.

  49. Independence How is independence useful? • Suppose you have n coin flips and you want to calculate the joint distribution P(C1, …, Cn) • If the coin flips are not independent, you need 2^n values in the table • If the coin flips are independent, then P(C1, …, Cn) = P(C1) × ⋯ × P(Cn). Each P(Ci) table has 2 entries and there are n of them, for a total of 2n values. For n = 10 flips, that is 1,024 joint entries versus only 20.

  50. Conditional Independence Variables A and B are conditionally independent given C if any of the following hold: • P(A, B | C) = P(A | C)P(B | C) • P(A | B, C) = P(A | C) • P(B | A, C) = P(B | C) Knowing C tells me everything about B. I don’t gain anything by knowing A (either because A doesn’t influence B or because knowing C provides all the information knowing A would give). For example, cough and fever may be conditionally independent given pneumonia: once the diagnosis is known, learning that the patient has a cough tells us nothing more about whether they have a fever.
