
Bayesian Networks



Presentation Transcript


  1. Bayesian Networks CPSC 386 Artificial Intelligence Ellen Walker Hiram College

  2. Bayes’ Rule • P(A^B) = P(A|B) * P(B) • = P(B|A) * P(A) • So • P(A|B) = P(B|A) * P(A) / P(B) • This allows us to compute diagnostic probabilities from causal probabilities and prior probabilities!
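Slide 2's rule can be sketched directly in code. A minimal example, using the flu/fever figures that appear later in the deck (slides 6–7); the causal probability P(fever | flu) = 0.8 is an assumed number chosen to be consistent with those slides:

```python
# Bayes' rule: compute a diagnostic probability P(A|B) from the
# causal probability P(B|A) and the priors P(A), P(B).
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_fever_given_flu = 0.8   # causal probability (assumed for illustration)
p_flu = 0.25              # prior, matching slide 7
p_fever = 0.45            # prior, matching slide 7

p_flu_given_fever = bayes(p_fever_given_flu, p_flu, p_fever)
print(round(p_flu_given_fever, 4))  # 0.8 * 0.25 / 0.45 = 4/9 -> 0.4444
```

Note this recovers the same 4/9 that slide 7 derives from the joint distribution.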

  3. Joint Probability Distribution • Consider all possible assignments to a set of propositions • E.g. picking 2 cards from a deck: • P(card1 is red ^ card2 is red) • P(card1 is red ^ card2 is black) • P(card1 is black ^ card2 is red) • P(card1 is black ^ card2 is black) • The sum over all combinations should be 1 • The sum over all “card1 is red” combinations is P(card1 is red)

  4. Joint P.D. table

  5. Operations on Joint P.D. • Marginalization (summing out) • Add up the entries in a row or column that cover all possibilities for a given variable, removing that variable from the distribution • P(1st card is red) = 0.245 + 0.255 = 0.5 • Conditioning • Get a distribution over one variable given evidence by dividing the joint probability by the probability of the evidence: P(A | B) = P(A ^ B) / P(B) • Normalization • Multiply the values in the distribution by a constant (alpha) so that they add up to 1.
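Marginalization on the two-card joint P.D. can be sketched as follows. The exact entries are 25/102 ≈ 0.245 and 26/102 ≈ 0.255 for drawing without replacement from a standard deck, matching the rounded figures on this slide:

```python
# Joint P.D. over the colors of two cards drawn without replacement.
joint = {
    ('red', 'red'):     25/102,  # ~0.245
    ('red', 'black'):   26/102,  # ~0.255
    ('black', 'red'):   26/102,
    ('black', 'black'): 25/102,
}

# Marginalization: sum out card 2 to get the distribution over card 1.
p_card1 = {}
for (c1, c2), p in joint.items():
    p_card1[c1] = p_card1.get(c1, 0.0) + p

print(round(p_card1['red'], 3), round(p_card1['black'], 3))  # 0.5 0.5
```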

  6. A bigger joint PD

  7. Based on that PD… • Summing out… • P(flu) = 0.25 • P(fever) = 0.45 • P(flu ^ fever) = 0.2 • P(~flu ^ fever) = 0.25 • Conditioning • P(flu | fever) = P(flu ^ fever) / P(fever) = 0.2 / 0.45 = 4/9 • Normalizing • <P(flu | fever), P(~flu | fever)> = α <0.2, 0.25> = <4/9, 5/9>
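The conditioning and normalization steps above can be reproduced in a few lines, starting from the joint entries on this slide:

```python
# Joint entries from the slide.
p_flu_and_fever = 0.2
p_notflu_and_fever = 0.25

# Summing out flu gives P(fever).
p_fever = p_flu_and_fever + p_notflu_and_fever   # 0.45

# Conditioning: P(flu | fever) = P(flu ^ fever) / P(fever).
p_flu_given_fever = p_flu_and_fever / p_fever    # 4/9

# Normalization: scale <0.2, 0.25> by alpha = 1 / 0.45 so entries sum to 1.
unnormalized = [p_flu_and_fever, p_notflu_and_fever]
alpha = 1.0 / sum(unnormalized)
posterior = [alpha * v for v in unnormalized]    # <4/9, 5/9>

print([round(v, 4) for v in posterior])  # [0.4444, 0.5556]
```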

  8. Evaluating Full Joint PD’s • Advantage • All combinations are available • Any joint or conditional probability can be computed • Disadvantage • Combinatorial explosion! For N Boolean variables, we need 2^N individual probabilities • Difficult to obtain probabilities for all combinations

  9. Independence • Absolute independence: • P(A|B) = P(A) or P(A,B) = P(A)*P(B) • No need for joint table

  10. Conditional independence • P(A|B,C)= P(A|C) or P(A,B|C) = P(A|C)*P(B|C) • If we know the truth of C, then A and B become independent • (e.g. ache and fever are independent given flu) • We can say C “separates” A and B

  11. Naïve Bayes model • Assume that all possible effects (symptoms) are conditionally independent given the cause, i.e. the cause “separates” them • Then: • P(cause, effect1, effect2, effect3, …) = P(cause) * P(effect1 | cause) * P(effect2 | cause) * … • Can work surprisingly well in many cases • The necessary conditional probabilities can be learned
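The naïve Bayes product can be sketched as a tiny diagnosis routine. All of the prior and conditional probability numbers below are assumptions invented for illustration, loosely based on the flu/measles/symptom variables used elsewhere in the deck:

```python
# Naive Bayes: score each cause by P(cause) * product of P(symptom | cause).
priors = {'flu': 0.25, 'measles': 0.05}          # P(cause), assumed
likelihoods = {                                   # P(symptom | cause), assumed
    'flu':     {'fever': 0.8, 'ache': 0.7, 'spots': 0.05},
    'measles': {'fever': 0.9, 'ache': 0.3, 'spots': 0.9},
}

def naive_bayes_score(cause, symptoms):
    """Unnormalized P(cause, symptoms) under the naive Bayes assumption."""
    score = priors[cause]
    for s in symptoms:
        score *= likelihoods[cause][s]
    return score

symptoms = ['fever', 'spots']
scores = {c: naive_bayes_score(c, symptoms) for c in priors}

# Normalize the scores to get a posterior over causes.
alpha = 1.0 / sum(scores.values())
posterior = {c: alpha * v for c, v in scores.items()}
best = max(posterior, key=posterior.get)
print(best)  # 'measles': 0.05*0.9*0.9 = 0.0405 beats flu's 0.25*0.8*0.05 = 0.01
```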

  12. Bayesian Network • Data structure that represents • Dependencies (and conditional independencies) among variables • Necessary information to compute a full joint probability distribution

  13. Structure of Bayesian Network • Nodes represent random variables • (e.g. flu, fever, ache) • Directed links (arrows) connect pairs of nodes, from parent to child • Each node has a conditional P.D. P(child | parents) • More parents, bigger P.D. • Graph has no directed cycles • No node is its own (great… grand) parent!

  14. Example Bayesian Network • [Network diagram with nodes: damp weather, flu, measles, ache, fever, spots, thermometer >100 F]

  15. Probabilities in the network • The probability of a complete set of variable assignments is the product of each variable’s conditional probability given its parents • P(x1, x2, …) = P(x1 | parents(X1)) * P(x2 | parents(X2)) * … • Example • P(~therm ^ damp ^ ache ^ ~fever ^ flu) = P(~therm) * P(damp) * P(ache | damp) * P(~fever | ~therm) * P(flu | ache ^ ~fever)
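The product rule above can be sketched with a toy network. Both the structure (flu → fever → thermometer) and every CPT number here are assumptions for illustration, not the network on slide 14:

```python
# Full joint probability from a Bayesian network: multiply each node's
# conditional probability given its parents' assigned values.
# Structure and numbers are assumed: flu is a root, fever depends on flu,
# therm (thermometer reads >100 F) depends on fever.
cpts = {
    'flu':   lambda a: 0.25 if a['flu'] else 0.75,                 # P(flu)
    'fever': lambda a: ((0.8 if a['fever'] else 0.2) if a['flu']   # P(fever|flu)
                       else (0.1 if a['fever'] else 0.9)),
    'therm': lambda a: ((0.9 if a['therm'] else 0.1) if a['fever'] # P(therm|fever)
                       else (0.05 if a['therm'] else 0.95)),
}
order = ['flu', 'fever', 'therm']  # parents listed before children

def joint_probability(assignment):
    """P(x1, x2, ...) = product of P(xi | parents(Xi))."""
    p = 1.0
    for var in order:
        p *= cpts[var](assignment)
    return p

p = joint_probability({'flu': True, 'fever': True, 'therm': False})
print(round(p, 4))  # 0.25 * 0.8 * 0.1 = 0.02
```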

  16. Constructing a Bayesian Network • Start with root causes • Add their direct consequences next (connected by arcs), and so on… • E.g. an arc damp weather -> ache, but no arc damp weather -> flu • Each node should be directly connected to (influenced by) only a few others • If we choose a poor variable ordering, we’ll get conditional probability tables that are too big!
