CPSC 7373: Artificial Intelligence Lecture 4: Uncertainty

Presentation Transcript


  1. CPSC 7373: Artificial Intelligence, Lecture 4: Uncertainty. Jiang Bian, Fall 2012, University of Arkansas at Little Rock

  2. Chapter 13: Uncertainty • Outline • Uncertainty • Probability • Syntax and Semantics • Inference • Independence and Bayes' Rule

  3. Uncertainty Let action At = leave for airport t minutes before flight. Will At get me there on time? Problems: • partial observability (road state, other drivers' plans, etc.) • noisy sensors (traffic reports) • uncertainty in action outcomes (flat tire, etc.) • immense complexity of modeling and predicting traffic Hence a purely logical approach either • risks falsehood: "A25 will get me there on time", or • leads to conclusions that are too weak for decision making: "A25 will get me there on time, if there's no accident on the bridge and it doesn't rain and my tires remain intact, etc., etc." (A1440 might reasonably be said to get me there on time, but I'd have to stay overnight at the airport.)

  4. Methods for handling uncertainty • Default or nonmonotonic logic: • Assume my car does not have a flat tire • Assume A25 works unless contradicted by evidence • Issues: What assumptions are reasonable? How to handle contradiction? • Rules with fudge factors: • A25 |→0.3 get there on time • Sprinkler |→0.99 WetGrass • WetGrass |→0.7 Rain • Issues: Problems with combination, e.g., does Sprinkler cause Rain? • Probability • Model the agent's degree of belief: given the available evidence, A25 will get me there on time with probability 0.04

  5. Probability Probabilistic assertions summarize effects of • laziness: failure to enumerate exceptions, qualifications, etc. • ignorance: lack of relevant facts, initial conditions, etc. Subjective probability: • Probabilities relate propositions to the agent's own state of knowledge, e.g., P(A25 | no reported accidents) = 0.06 • These are not assertions about the world. • Probabilities of propositions change with new evidence, e.g., P(A25 | no reported accidents, 5 a.m.) = 0.15

  6. Bayes Network: Example (car diagnosis). Nodes: BATTERY AGE, FAN BELT BROKEN, ALTERNATOR BROKEN, BATTERY DEAD, NOT CHARGING, BATTERY METER, BATTERY FLAT, NO OIL, NO GAS, FUEL LINE BLOCKED, STARTER BROKEN, LIGHTS, OIL LIGHT, GAS GAUGE, DIP STICK, CAR WON'T START

  7. Bayes Network: Example (the same car-diagnosis network, shown again)

  8. Probabilities: Coin Flip • Suppose the probability for heads is 0.5. What's the probability for it coming up tails? • P(H) = 1/2 • P(T) = __?__

  9. Probabilities: Coin Flip • Suppose the probability for heads is 1/4. What's the probability for it coming up tails? • P(H) = 1/4 • P(T) = __?__

  10. Probabilities: Coin Flip • Suppose the probability for heads is 1/2. Each coin flip is independent. What's the probability of it coming up heads three times in a row? • P(H) = 1/2 • P(H, H, H) = __?__

  11. Probabilities: Coin Flip • Xi = result of the i-th coin flip • Xi ∈ {H, T} • Pi(H) = 1/2 ∀i • P(X1=X2=X3=X4) = __?__

  12. Probabilities: Coin Flip • Xi = result of the i-th coin flip • Xi ∈ {H, T} • Pi(H) = 1/2 ∀i • P(X1=X2=X3=X4) = P(X1=X2=X3=X4=H) + P(X1=X2=X3=X4=T) = 1/16 + 1/16 = 1/8

  13. Probabilities: Coin Flip • Xi = result of the i-th coin flip • Xi ∈ {H, T} • Pi(H) = 1/2 ∀i • P({X1,X2,X3,X4} contains at least 3 H) = __?__

  14. Probabilities: Coin Flip • Xi = result of the i-th coin flip • Xi ∈ {H, T} • Pi(H) = 1/2 ∀i • P({X1,X2,X3,X4} contains at least 3 H) = P(HHHH) + P(HHHT) + P(HHTH) + P(HTHH) + P(THHH) = 5 × (1/16) = 5/16
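To double-check this count, here is a minimal Python sketch (not part of the slides) that enumerates all 2^4 equally likely sequences of a fair coin:

    from itertools import product

    # Enumerate every length-4 sequence of H/T and count those with >= 3 heads.
    outcomes = list(product("HT", repeat=4))
    favorable = [o for o in outcomes if o.count("H") >= 3]
    print(len(favorable), "/", len(outcomes))   # prints: 5 / 16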

  15. Probabilities: Summary • Complementary probability: if P(A) = p, then P(¬A) = 1 - p • Independence: if X⊥Y, then P(X, Y) = P(X) P(Y) (the joint probability equals the product of the marginals)

  16. Dependence • Given P(X1=H) = 1/2 • If X1=H: P(X2=H|X1=H) = 0.9 • If X1=T: P(X2=T|X1=T) = 0.8 • P(X2=H) = __?__

  17. Dependence • Given P(X1=H) = 1/2 • If X1=H: P(X2=H|X1=H) = 0.9 • If X1=T: P(X2=T|X1=T) = 0.8 • P(X2=H) = P(X2=H|X1=H) P(X1=H) + P(X2=H|X1=T) P(X1=T) = 0.9 × 1/2 + (1 - 0.8) × 1/2 = 0.55
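The same total-probability computation as a short Python sketch (variable names are my own, numbers from the slide):

    # P(X2=H) = P(X2=H|X1=H) P(X1=H) + P(X2=H|X1=T) P(X1=T)
    p_x1_h = 0.5
    p_x2h_given_x1h = 0.9
    p_x2t_given_x1t = 0.8
    p_x2_h = p_x2h_given_x1h * p_x1_h + (1 - p_x2t_given_x1t) * (1 - p_x1_h)
    print(p_x2_h)   # ~ 0.55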

  18. What have we learned? • Total probability: P(Y) = Σx P(Y|X=x) P(X=x) • Negation of probabilities: P(¬X|Y) = 1 - P(X|Y) • What about P(X|¬Y)?

  19. What have we learned? • Negation of probabilities: P(¬X|Y) = 1 - P(X|Y) • What about P(X|¬Y)? • You can negate the event (X), but you can never negate the conditioning variable (Y): in general, P(X|¬Y) ≠ 1 - P(X|Y).
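A tiny numeric illustration of this point, reusing the numbers from the dependence example above (a sketch, not from the slides):

    # Negating the event is fine: P(X2=T | X1=H) = 1 - P(X2=H | X1=H).
    # Negating the condition is not: P(X2=H | X1=T) != 1 - P(X2=H | X1=H).
    p_x2h_given_x1h = 0.9   # from the dependence example (slide 17)
    p_x2h_given_x1t = 0.2   # = 1 - P(X2=T | X1=T)
    print(1 - p_x2h_given_x1h)   # ~ 0.1, which is P(X2=T | X1=H)
    print(p_x2h_given_x1t)       # 0.2, not 0.1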

  20. Example: Weather • Given: • P(D1=Sunny) = 0.9 • P(D2=Sunny|D1=Sunny) = 0.8 • P(D2=Rainy|D1=Sunny) = ??

  21. Example: Weather • Given: • P(D1=Sunny) = 0.9 • P(D2=Sunny|D1=Sunny) = 0.8 • P(D2=Rainy|D1=Sunny) = ?? • 1 - P(D2=Sunny|D1=Sunny) = 0.2 • P(D2=Sunny|D1=Rainy) = 0.6 • P(D2=Rainy|D1=Rainy) = ?? • 1 - P(D2=Sunny|D1=Rainy) = 0.4 • Assume the transition probabilities from D2 to D3 are the same as from D1 to D2: • P(D2=Sunny) = ??; and P(D3=Sunny) = ??

  22. Example: Weather • Given: • P(D1=Sunny) = 0.9 • P(D2=Sunny|D1=Sunny) = 0.8 • P(D2=Sunny|D1=Rainy) = 0.6 • Assume the transition probabilities from D2 to D3 are the same as from D1 to D2: • P(D2=Sunny) = 0.78 • P(D2=Sunny|D1=Sunny) * P(D1=Sunny) + P(D2=Sunny|D1=Rainy) * P(D1=Rainy) = 0.8*0.9 + 0.6*0.1 = 0.78 • P(D3=Sunny) = 0.756 • P(D3=Sunny|D2=Sunny) * P(D2=Sunny) + P(D3=Sunny|D2=Rainy) * P(D2=Rainy) = 0.8*0.78 + 0.6*0.22 = 0.756
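The day-by-day propagation can be written as a small loop; a Python sketch with the slide's transition probabilities:

    # Propagate the sunny probability one day at a time using total probability.
    p_sunny = 0.9          # P(D1 = Sunny)
    p_s_given_s = 0.8      # P(next day Sunny | today Sunny)
    p_s_given_r = 0.6      # P(next day Sunny | today Rainy)
    for day in (2, 3):
        p_sunny = p_s_given_s * p_sunny + p_s_given_r * (1 - p_sunny)
        print(f"P(D{day}=Sunny) ~ {p_sunny:.3f}")   # 0.780, then 0.756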

  23. Example: Cancer • There exists a type of cancer that 1% of the population carries. • P(C) = 0.01; P(¬C) = 1 - 0.01 = 0.99 • There exists a test for this cancer: • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|+) = ?? • Joint probabilities: • P(+, C) = ??; P(-, C) = ?? • P(+, ¬C) = ??; P(-, ¬C) = ??

  24. Example: Cancer • There exists a type of cancer that 1% of the population carries. • P(C) = 0.01; P(¬C) = 1 - 0.01 = 0.99 • There exists a test for this cancer: • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|+) = ?? • Joint probabilities: • P(+, C) = 0.009; P(-, C) = 0.001 • P(+, ¬C) = 0.198; P(-, ¬C) = 0.792

  25. Example: Cancer • There exists a type of cancer that 1% of the population carries. • P(C) = 0.01; P(¬C) = 1 - 0.01 = 0.99 • There exists a test for this cancer: • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|+) = P(+, C) / (P(+, C) + P(+, ¬C)) = 0.009 / (0.009 + 0.198) = 0.043 • Joint probabilities: • P(+, C) = 0.009; P(-, C) = 0.001 • P(+, ¬C) = 0.198; P(-, ¬C) = 0.792
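The same Bayes-rule computation as a short Python sketch (variable names are illustrative):

    # Bayes rule via the joint probabilities on this slide.
    p_c = 0.01
    p_pos_given_c, p_pos_given_not_c = 0.9, 0.2
    p_pos_and_c = p_pos_given_c * p_c                  # P(+, C)  = 0.009
    p_pos_and_not_c = p_pos_given_not_c * (1 - p_c)    # P(+, ¬C) = 0.198
    p_c_given_pos = p_pos_and_c / (p_pos_and_c + p_pos_and_not_c)
    print(round(p_c_given_pos, 3))                     # 0.043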

  26. Bayes Rule • P(A|B) = P(B|A) P(A) / P(B) • posterior = (likelihood × prior) / marginal likelihood • The marginal likelihood comes from total probability: P(B) = Σa P(B|A=a) P(A=a)

  27. Bayes Rule: Cancer Example • P(C|+) = P(+|C) P(C) / P(+) • likelihood P(+|C) = 0.9; prior P(C) = 0.01; marginal likelihood P(+) = P(+|C)P(C) + P(+|¬C)P(¬C) = 0.207 • posterior P(C|+) = 0.9 × 0.01 / 0.207 ≈ 0.043

  28. Bayes Network • Graphically: a two-node network A → B, where A is not observable and B is observable. • The network is specified by P(A), P(B|A), and P(B|¬A): three parameters. • Diagnostic reasoning runs against the arrow: compute P(A|B) or P(A|¬B).
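A minimal sketch of diagnostic reasoning on this two-node network, reusing the cancer-test numbers from the earlier slides as illustrative values:

    # A -> B with 3 parameters: P(A), P(B|A), P(B|¬A).
    # Diagnostic reasoning inverts the arrow with Bayes rule.
    def diagnose(p_a, p_b_given_a, p_b_given_not_a):
        p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)   # total probability
        return p_b_given_a * p_a / p_b                          # P(A | B)

    print(round(diagnose(0.01, 0.9, 0.2), 3))   # 0.043 with the cancer-test numbers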

  29. Two-test cancer example • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|T1=+, T2=+) = P(C|++) = ??

  30. Two-test cancer example • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|T1=+, T2=+) = P(C|++) = 0.1698
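The posterior after two positive tests follows from the conditional independence of T1 and T2 given C (introduced on a later slide); a short Python sketch of the arithmetic:

    # P(C | +, +): multiply the two likelihoods, then normalize.
    p_c = 0.01
    p_pos_given_c, p_pos_given_not_c = 0.9, 0.2
    num = p_pos_given_c ** 2 * p_c                    # P(+, +, C)
    den = num + p_pos_given_not_c ** 2 * (1 - p_c)    # + P(+, +, ¬C)
    print(round(num / den, 4))                        # 0.1698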

  31. Bayes Rule: Compute

  32. Two-test cancer example • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|++) = ??

  33. Two-test cancer example • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|+-) = ??

  34. Two-test cancer example • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(C|+-) = 0.0056

  35. Conditional Independence • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • In the two-test cancer example, we assume not only that T1 and T2 are identically distributed, but also that they are conditionally independent given C: • P(T2|C,T1) = P(T2|C)

  36. Conditional Independence • Network: A → B, A → C; given A, B and C are conditionally independent (B⊥C | A). • Does B⊥C | A imply B⊥C?

  37. Conditional Independence • Does B⊥C | A imply B⊥C? No. • Intuitively, getting a positive test result for cancer gives us information about whether you have cancer or not. • So if you get a positive test result, you raise the probability of having cancer relative to the prior probability. • With that increased probability, we predict that another test will give a positive response with higher likelihood than if we hadn't taken the previous test.

  38. Two-test cancer example • Network: C → T1, C → T2 • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • P(T2=+|T1=+) = ??

  39. Conditional independence: cancer example • Conditional independence: given that I know C, knowledge of the first test gives me no more information about the second test; it only gives me information when C is unknown. • P(C) = 0.01; P(¬C) = 0.99 • P(+|C) = 0.9; P(-|C) = 0.1 • P(+|¬C) = 0.2; P(-|¬C) = 0.8 • Using P(+2|+1,C) = P(+2|C): • P(T2=+|T1=+) = P(+2|+1,C)P(C|+1) + P(+2|+1,¬C)P(¬C|+1) = P(+2|C)·0.043 + P(+2|¬C)·(1-0.043) = 0.9 × 0.043 + 0.2 × 0.957 = 0.2301
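The same computation in a few lines of Python (using the rounded posterior 0.043 from the earlier slide):

    # P(T2=+ | T1=+) = P(+2|C) P(C|+1) + P(+2|¬C) P(¬C|+1)
    p_c_given_pos1 = 0.043                  # posterior after one positive test
    p_pos_given_c, p_pos_given_not_c = 0.9, 0.2
    p_pos2_given_pos1 = (p_pos_given_c * p_c_given_pos1
                         + p_pos_given_not_c * (1 - p_c_given_pos1))
    print(round(p_pos2_given_pos1, 4))      # 0.2301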

  40. Absolute and Conditional • Absolute independence: A⊥B (A and B are unconnected nodes). • Conditional independence: A⊥B | C (A and B linked only through C). • Does A⊥B imply A⊥B | C? • Does A⊥B | C imply A⊥B?

  41. Absolute and Conditional • Does A⊥B imply A⊥B | C? No (see the explaining-away example below). • Does A⊥B | C imply A⊥B? No (see the two-test cancer example above).

  42. Confounding Cause • Network: S → H ← R, where sunny weather (S) and a raise (R) both cause happiness (H); S and R are independent. • P(S) = 0.7; P(R) = 0.01 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|S) = ??

  43. Confounding Cause • Network: S → H ← R • P(S) = 0.7; P(R) = 0.01 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|S) = 0.01, since R⊥S means P(R|S) = P(R)

  44. Explaining Away • P(S) = 0.7; P(R) = 0.01 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|H,S) = ?? • Explaining away means that if we know that we are happy, then sunny weather can explain away the cause of the happiness. • If I know that it's sunny, it becomes less likely that I received a raise. • If we see an effect that could be caused by multiple causes, seeing one of those causes can explain away any other potential cause of the effect.

  45. Explaining Away • P(S) = 0.7; P(R) = 0.01 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|H,S) = P(H|R,S)P(R|S) / P(H|S) = P(H|R,S)P(R) / (P(H|R,S)P(R) + P(H|¬R,S)P(¬R)) = (1 × 0.01) / (1 × 0.01 + 0.7 × 0.99) = 0.0142 • P(R|H) = ??

  46. Explaining Away • P(S) = 0.7; P(R) = 0.01; P(H) = 0.5245 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|H,S) = 0.0142 • P(R|H) = P(H|R)P(R) / P(H) = (P(H|R,S)P(S) + P(H|R,¬S)P(¬S)) P(R) / P(H) = 0.97 × 0.01 / 0.5245 = 0.01849 • P(R|H,¬S) = ??

  47. Explaining Away • P(S) = 0.7; P(R) = 0.01; P(H) = 0.5245 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|H,S) = 0.0142; P(R|H) = 0.01849 • P(R|H,¬S) = P(H|R,¬S)P(R|¬S) / P(H|¬S) = P(H|R,¬S)P(R) / (P(H|R,¬S)P(R) + P(H|¬R,¬S)P(¬R)) = (0.9 × 0.01) / (0.9 × 0.01 + 0.1 × 0.99) = 0.0833
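The explaining-away numbers on slides 45 to 47 can all be reproduced by enumerating the hidden variables; a Python sketch (the function and variable names are my own):

    # Happiness network: independent causes S (sunny) and R (raise) of H.
    p_s, p_r = 0.7, 0.01
    p_h_given = {(True, True): 1.0, (False, True): 0.9,    # P(H | S, R)
                 (True, False): 0.7, (False, False): 0.1}

    def p_r_given_h(s=None):
        # P(R=true | H=true [, S=s]); if s is None, sum S out.
        s_values = [True, False] if s is None else [s]
        num = sum(p_h_given[(sv, True)] * p_r * (p_s if sv else 1 - p_s)
                  for sv in s_values)
        den = num + sum(p_h_given[(sv, False)] * (1 - p_r) * (p_s if sv else 1 - p_s)
                        for sv in s_values)
        return num / den

    print(round(p_r_given_h(True), 4))    # 0.0142  = P(R|H,S)
    print(round(p_r_given_h(), 5))        # 0.01849 = P(R|H)
    print(round(p_r_given_h(False), 4))   # 0.0833  = P(R|H,¬S)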

  48. Conditional Dependence • P(S) = 0.7; P(R) = 0.01 • P(H|S,R) = 1; P(H|¬S,R) = 0.9 • P(H|S,¬R) = 0.7; P(H|¬S,¬R) = 0.1 • P(R|S) = 0.01 = P(R), since R⊥S • But P(R|H,S) = 1.42% ≠ P(R|H) = 1.849%, and P(R|H,¬S) = 8.33%, so R and S are NOT independent given H, even though R⊥S. • Independence does not imply conditional independence!

  49. Absolute and Conditional • Summary: • A⊥B does not imply A⊥B | C (conditioning on a common effect creates dependence, as in explaining away). • A⊥B | C does not imply A⊥B (as in the two-test cancer example).

  50. Bayes Networks • Bayes networks define probability distributions over graphs of random variables. • Instead of enumerating all possible combinations of the random variables, the Bayes network is defined by probability distributions that are local to each individual node. • The joint probability represented by a Bayes network is the product of the probabilities defined over the individual nodes, where each node's probability is conditioned only on its incoming arcs. • Example network: A → C ← B, C → D, C → E • P(A,B,C,D,E) = P(A) P(B) P(C|A,B) P(D|C) P(E|C) • This requires 10 probability values: 1 for P(A), 1 for P(B), 4 for P(C|A,B), 2 for P(D|C), 2 for P(E|C), versus 2^5 - 1 = 31 values for an unstructured joint distribution over five binary variables.
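A sketch of this factorization in Python; the network structure comes from the slide, while the conditional probability tables below are hypothetical, just to show the product of local factors:

    # Joint for the 5-node network A -> C <- B, C -> D, C -> E,
    # factored as P(A) P(B) P(C|A,B) P(D|C) P(E|C): 1+1+4+2+2 = 10 parameters.
    def joint(a, b, c, d, e, p_a, p_b, p_c_ab, p_d_c, p_e_c):
        pa = p_a if a else 1 - p_a
        pb = p_b if b else 1 - p_b
        pc = p_c_ab[(a, b)] if c else 1 - p_c_ab[(a, b)]
        pd = p_d_c[c] if d else 1 - p_d_c[c]
        pe = p_e_c[c] if e else 1 - p_e_c[c]
        return pa * pb * pc * pd * pe

    # Hypothetical tables (not from the lecture), one number per parent setting.
    p_c_ab = {(True, True): 0.9, (True, False): 0.5,
              (False, True): 0.5, (False, False): 0.1}
    p_d_c = {True: 0.8, False: 0.2}
    p_e_c = {True: 0.7, False: 0.3}
    print(joint(True, False, True, True, False, 0.3, 0.6, p_c_ab, p_d_c, p_e_c))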
