
Presentation Transcript


  1. Announcements • Homework 8 due today, November 13 • ½ to 1 page description of final project due Thursday, November 15 • Current Events • Christian - now • Jeff - Thursday • Research Paper due Tuesday, November 20

  2. Probabilistic Reasoning Lecture 15

  3. Probabilistic Reasoning • Logic deals with certainties • A → B • Probabilities are expressed in a notation similar to that of predicates in First Order Predicate Calculus: • P(R) = 0.7 • P(S) = 0.1 • P(¬(A Λ B) V C) = 0.2 • 1 = certain; 0 = certainly not

  4. What's the probability that either A is true or B is true? • [Venn diagram: overlapping circles A and B, with intersection A Λ B] • P(A V B) = P(A) + P(B) – P(A Λ B)

  5. Conditional Probability • Conditional probability refers to the probability of one thing given that we already know another to be true: • P(B|A) = P(A Λ B) / P(A) • This states the probability of B, given A. • [Venn diagram: overlapping circles A and B, with intersection A Λ B]

  6. Calculate • P(R|S), given that the probability of rain is 0.7, the probability of sun is 0.1, and the probability of rain and sun is 0.01 • P(R|S) = P(R Λ S) / P(S) = 0.01 / 0.1 = 0.1 • Note: in general, P(A|B) ≠ P(B|A)
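
A minimal Python sketch (not part of the original slides) of the slide 6 calculation, showing that the two directions of conditioning give different answers:

```python
# Conditional probability from the figures on slide 6:
# P(R) = 0.7, P(S) = 0.1, P(R Λ S) = 0.01.
p_r, p_s, p_r_and_s = 0.7, 0.1, 0.01

p_r_given_s = p_r_and_s / p_s   # P(R|S) = 0.01 / 0.1  = 0.1
p_s_given_r = p_r_and_s / p_r   # P(S|R) = 0.01 / 0.7 ≈ 0.014

print(p_r_given_s, p_s_given_r)  # illustrates that P(A|B) ≠ P(B|A)
```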

  7. Joint Probability Distributions • A joint probability distribution represents the combined probabilities of two or more variables. • Table of joint probabilities: P(A Λ B) = 0.11, P(A Λ ¬B) = 0.63, P(¬A Λ B) = 0.09, P(¬A Λ ¬B) = 0.17 • This table shows, for example, that P(A Λ B) = 0.11 and P(¬A Λ B) = 0.09 • Using this, we can calculate P(A): P(A) = P(A Λ B) + P(A Λ ¬B) = 0.11 + 0.63 = 0.74
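
A small Python sketch (not in the original slides) of storing the slide 7 joint distribution and marginalising out B to recover P(A):

```python
# The joint distribution from slide 7, keyed by (value of A, value of B).
joint = {
    (True,  True):  0.11,   # P(A Λ B)
    (True,  False): 0.63,   # P(A Λ ¬B)
    (False, True):  0.09,   # P(¬A Λ B)
    (False, False): 0.17,   # P(¬A Λ ¬B)
}

# Marginalise: sum over every entry where A is true.
p_a = sum(p for (a, _b), p in joint.items() if a)
print(p_a)  # ≈ 0.74, matching P(A) = 0.11 + 0.63 on the slide
```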

  8. Bayes’ Theorem • Bayes’ theorem lets us calculate a conditional probability: • P(B|A) = P(A|B) P(B) / P(A) • P(B) is the prior probability of B. • P(B|A) is the posterior probability of B.
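
As a sketch (not from the slides), Bayes’ theorem can be wrapped in a small helper function; the function and argument names here are my own:

```python
def bayes(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Posterior P(B|A) from the likelihood P(A|B), the prior P(B), and the evidence P(A)."""
    return p_a_given_b * p_b / p_a
```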

  9. Bayes' Theorem Deduction • Recall: P(A Λ B) = P(B|A) P(A) and, symmetrically, P(A Λ B) = P(A|B) P(B) • Setting the two right-hand sides equal and dividing by P(A) gives Bayes’ theorem: P(B|A) = P(A|B) P(B) / P(A)

  10. Medical Diagnosis • Data • 80% of the time you have a cold, you also have a high temperature. • At any one time, 1 in every 10,000 people has a cold • 1 in every 1000 people has a high temperature • Suppose you have a high temperature. What is the likelihood that you have a cold?
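
One way to work the numbers, as a Python sketch rather than anything shown on the slide:

```python
# Slide 10 worked with Bayes' theorem. C = cold, HT = high temperature.
p_ht_given_c = 0.8        # 80% of colds come with a high temperature
p_c = 1 / 10_000          # prior probability of a cold
p_ht = 1 / 1_000          # prior probability of a high temperature

p_c_given_ht = p_ht_given_c * p_c / p_ht
print(p_c_given_ht)       # ≈ 0.08: given a high temperature, a cold is about 8% likely
```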

  11. Witness Reliability • A hit-and-run incident has been reported, and an eyewitness has stated she is certain that the car was a white taxi. • How likely is it that she is right? • Facts: • Yellow taxi company has 90 cars • White taxi company has 10 cars • Expert says that given the foggy weather, the witness has a 75% chance of correctly identifying the taxi

  12. Witness Reliability – Prior Probability • Imagine the witness is shown a sequence of 1000 cars • Expect 900 to be yellow and 100 to be white • Given 75% accuracy, how many will she say are white and how many yellow? • Of the 900 yellow cars, she says yellow for 675 and white for 225 • Of the 100 white cars, she says white for 75 and yellow for 25 • What is the probability the witness says white? 300 / 1000 = 0.3 • How likely is she right? 75 / 300 = 0.25
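
The same answer can be reached with Bayes’ theorem directly; the following Python sketch is illustrative and not part of the slides:

```python
# Taxi problem via Bayes' theorem. W = car is white, SW = witness says white.
p_w = 100 / 1000                      # prior: 10% of the taxis are white
p_sw_given_w = 0.75                   # she is right 75% of the time
p_sw_given_not_w = 0.25               # she mistakes a yellow car for white 25% of the time

# Total probability of the witness saying "white", then the posterior.
p_sw = p_sw_given_w * p_w + p_sw_given_not_w * (1 - p_w)   # ≈ 0.3
p_w_given_sw = p_sw_given_w * p_w / p_sw                    # ≈ 0.25
print(p_sw, p_w_given_sw)
```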

  13. Comparing Conditional Probabilities • Medical diagnosis • Probability of cold (C) is 0.0001 • P(HT|C) = 0.8 • Probability of plague (P) is 0.000000001 • P(HT|P) = 0.99 • Relative likelihood of cold and plague: P(C|HT) / P(P|HT) = (P(HT|C) P(C)) / (P(HT|P) P(P)) = (0.8 × 0.0001) / (0.99 × 10⁻⁹) ≈ 80,800 – given a high temperature, a cold is roughly 80,000 times more likely than plague

  14. Simple Bayesian Concept Learning (1) • P(H|E) is used to represent the probability that some hypothesis, H, is true, given evidence E. • Let us suppose we have a set of hypotheses H1…Hn. • For each Hi: P(Hi|E) = P(E|Hi) P(Hi) / P(E) • Hence, given a piece of evidence, a learner can determine which is the most likely explanation by finding the hypothesis that has the highest posterior probability.

  15. Simple Bayesian Concept Learning (2) • In fact, this can be simplified. • Since P(E) does not depend on Hi, it will have the same value for each hypothesis. • Hence, it can be ignored, and we can find the hypothesis with the highest value of: P(E|Hi) P(Hi) • We can simplify this further if all the hypotheses are equally likely, in which case we simply seek the hypothesis with the highest value of P(E|Hi). • This is the likelihood of E given Hi.
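
A short Python sketch (not from the slides) of choosing the hypothesis with the highest P(E|Hi) · P(Hi); the priors and likelihoods are made-up illustration values:

```python
priors = {"H1": 0.5, "H2": 0.3, "H3": 0.2}           # P(Hi)
likelihoods = {"H1": 0.1, "H2": 0.4, "H3": 0.3}      # P(E|Hi)

# P(E) is the same for every hypothesis, so comparing P(E|Hi) * P(Hi) is enough.
best = max(priors, key=lambda h: likelihoods[h] * priors[h])
print(best)  # "H2": 0.4 * 0.3 = 0.12 beats 0.05 and 0.06
```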

  16. Bayesian Belief Networks (1) • A belief network shows the dependencies between a group of variables. • Two variables A and B are independent if the likelihood that A will occur has nothing to do with whether B occurs. • C and D are dependent on A; D and E are dependent on B. • The Bayesian belief network has probabilities associated with each link. E.g., P(C|A) = 0.2, P(C|¬A) = 0.4

  17. Bayesian Belief Networks (2) • A complete set of probabilities for this belief network might be: • P(A) = 0.1 • P(B) = 0.7 • P(C|A) = 0.2 • P(C|¬A) = 0.4 • P(D|A Λ B) = 0.5 • P(D|A Λ ¬B) = 0.4 • P(D|¬A Λ B) = 0.2 • P(D|¬A Λ ¬B) = 0.0001 • P(E|B) = 0.2 • P(E|¬B) = 0.1

  18. Bayesian Belief Networks (3) • We can now calculate joint probabilities from the conditional probabilities, using the chain rule: P(A Λ B Λ C Λ D Λ E) = P(E|A Λ B Λ C Λ D) P(D|A Λ B Λ C) P(C|A Λ B) P(B|A) P(A) • In fact, we can simplify this, since there are no dependencies between certain pairs of variables – between E and A, for example. Hence: P(A Λ B Λ C Λ D Λ E) = P(E|B) P(D|A Λ B) P(C|A) P(B) P(A)
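
A Python sketch (not from the slides) of the factorised joint distribution, using the probabilities listed on slide 17:

```python
# Network from slides 16-17: P(A,B,C,D,E) = P(A) P(B) P(C|A) P(D|A,B) P(E|B).
p_a = 0.1
p_b = 0.7
p_c_given_a = {True: 0.2, False: 0.4}                      # keyed by the value of A
p_d_given_ab = {(True, True): 0.5, (True, False): 0.4,
                (False, True): 0.2, (False, False): 0.0001}
p_e_given_b = {True: 0.2, False: 0.1}

def joint(a, b, c, d, e):
    """Probability of one full assignment of (A, B, C, D, E)."""
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    pc = p_c_given_a[a] if c else 1 - p_c_given_a[a]
    pd = p_d_given_ab[(a, b)] if d else 1 - p_d_given_ab[(a, b)]
    pe = p_e_given_b[b] if e else 1 - p_e_given_b[b]
    return pa * pb * pc * pd * pe

print(joint(True, True, True, True, True))  # 0.1 * 0.7 * 0.2 * 0.5 * 0.2 ≈ 0.0014
```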

  19. College Life Example • C = that you will go to college • S = that you will study • P = that you will party • E = that you will be successful in your exams • F = that you will have fun • [Belief network diagram over the nodes C, S, P, E, F]

  20. College Life Example • [Conditional probability tables for the nodes C, S, P, E, F]

  21. College Example • Using the tables to solve problems such as P(C = true, S = true, P = false, E = true, F = false) = P(C Λ S Λ ¬P Λ E Λ ¬F) • General solution: P(x1, …, xn) = Πi P(xi | Parents(Xi)) – the product, over every node, of the probability of its value given the values of its parents

  22. Noisy-V Function • We would like to assume that we know all the causes of a possible event • E.g. Medical Diagnosis System • P(HT|C) = 0.8 • P(HT|P) = 0.99 • Assume P(HT|C V P) = 1 (?) • Assumption clearly not true • Leak node – represents all other causes • P(HT|O) = 0.9 • Define noise parameters – conditional probabilities for ¬HT • P(¬HT|C) = 1 – P(HT|C) = 0.2 • P(¬HT|P) = 1 – P(HT|P) = 0.01 • P(¬HT|O) = 1 – P(HT|O) = 0.1 • Further assumption – the causes of a high temperature are independent of each other, and the noise parameters are independent

  23. Noisy-V Function • Benefit of the Noisy-V Function • If cold, plague, and other are all false, P(¬HT) = 1 • Otherwise, P(¬HT) is equal to the product of the noise parameters for all the variables that are true • E.g. if plague and other are true and cold is false, P(HT) = 1 – (0.01 * 0.1) = 0.999 • Benefit – we don't need to store as many values as the full Bayesian belief network
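
A Python sketch (not from the slides) of the noisy-V calculation, using the noise parameters defined on slide 22:

```python
# Noisy-V model for a high temperature (HT) with causes cold, plague, other.
noise = {"cold": 0.2, "plague": 0.01, "other": 0.1}   # P(¬HT | that cause alone)

def p_high_temp(cold: bool, plague: bool, other: bool) -> float:
    """P(HT) = 1 minus the product of the noise parameters of the causes that are true."""
    active = [name for name, flag in (("cold", cold), ("plague", plague), ("other", other)) if flag]
    p_not_ht = 1.0                     # stays at 1 when no cause is active, so P(HT) = 0
    for name in active:
        p_not_ht *= noise[name]
    return 1 - p_not_ht

print(p_high_temp(False, True, True))   # 1 - (0.01 * 0.1) ≈ 0.999, as on slide 23
```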

  24. Bayes’ Optimal Classifier • A system that uses Bayes’ theory to classify data. • We have a piece of data y, and are seeking the correct hypothesis from H1 … H5, each of which assigns a classification to y. • The probability that y should be classified as cj is: P(cj|x1, …, xn) = Σi=1..m P(cj|Hi) P(Hi|x1, …, xn) • x1 to xn are the training data, and m is the number of hypotheses. • This method provides the best possible classification for a piece of data. • Example: given some data we will classify it as true or false • P(true|x1, …, xn) = Σi P(true|Hi) P(Hi|x1, …, xn) • P(false|x1, …, xn) = Σi P(false|Hi) P(Hi|x1, …, xn)
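
A Python sketch (not from the slides) of the Bayes’ optimal combination; the posteriors and per-hypothesis classifications below are made-up illustration values:

```python
posteriors = {"H1": 0.4, "H2": 0.3, "H3": 0.3}          # P(Hi | x1..xn)
votes = {"H1": "true", "H2": "false", "H3": "false"}    # classification each Hi assigns to y
                                                        # (here P(cj|Hi) is simply 0 or 1)

p_true = sum(p for h, p in posteriors.items() if votes[h] == "true")    # 0.4
p_false = sum(p for h, p in posteriors.items() if votes[h] == "false")  # 0.6

print("true" if p_true > p_false else "false")  # "false": the weighted vote beats the single best hypothesis
```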

  25. The Naïve Bayes Classifier (1) • A vector of data is classified as a single classification: we want P(ci | d1, …, dn) • The classification with the highest posterior probability is chosen. • The hypothesis which has the highest posterior probability is the maximum a posteriori, or MAP, hypothesis. • In this case, we are looking for the MAP classification. • Bayes’ theorem is used to find the posterior probability: P(ci | d1, …, dn) = P(d1, …, dn | ci) P(ci) / P(d1, …, dn)

  26. The Naïve Bayes Classifier (2) • Since P(d1, …, dn) is a constant, independent of ci, we can eliminate it, and simply aim to find the classification ci for which the following is maximised: P(d1, …, dn | ci) P(ci) • We now assume that all the attributes d1, …, dn are independent given the classification • So P(d1, …, dn | ci) can be rewritten as: P(d1 | ci) × P(d2 | ci) × … × P(dn | ci) • The classification for which P(ci) Πj P(dj | ci) is highest is chosen to classify the data.
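
A Python sketch (not from the slides) of the naïve Bayes decision rule; the priors and conditional probabilities are made-up, since the slide 27 training table is not reproduced here:

```python
priors = {"A": 0.5, "B": 0.5}                               # P(ci)
cond = {                                                    # P(attribute = value | ci)
    "A": {("x", 2): 0.25, ("y", 3): 0.50, ("z", 4): 0.25},
    "B": {("x", 2): 0.50, ("y", 3): 0.25, ("z", 4): 0.50},
}

def score(c, sample):
    """Unnormalised posterior: P(c) times the product of P(dj | c)."""
    s = priors[c]
    for attr_value in sample:
        s *= cond[c][attr_value]
    return s

sample = [("x", 2), ("y", 3), ("z", 4)]
best = max(priors, key=lambda c: score(c, sample))
print(best)   # the MAP classification for the sample ("B" with these numbers)
```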

  27. Classifier Example • Training data: [table not reproduced] • New piece of data to classify: (x = 2, y = 3, z = 4) • Want P(ci | x=2, y=3, z=4) • Compare P(A) * P(x=2|A) * P(y=3|A) * P(z=4|A) • against P(B) * P(x=2|B) * P(y=3|B) * P(z=4|B)

  28. M-estimate • Problem with too little training data: a single zero probability wipes out the whole product • (x=1, y=2, z=2) • P(x=1 | B) = 1/4 • P(y=2 | B) = 2/4 • P(z=2 | B) = 0 • Avoid the problem by using the M-estimate, which pads the computation with additional virtual samples • Conditional probability = (a + mp) / (b + m) • m = 5 (equivalent sample size) • p = 1/num_values_for_category (1/4 for x) • a = number of training examples with the category value and the classification (x=1 and B: 1) • b = number of training examples with the classification (B: 4)
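
A Python sketch (not from the slides) of the M-estimate applied to the zero-count case; the number of possible values for z is assumed to be 4:

```python
def m_estimate(a: int, b: int, p: float, m: int = 5) -> float:
    """Smoothed conditional probability: a matching examples out of b, padded with m virtual samples."""
    return (a + m * p) / (b + m)

# z = 2 never occurs with class B in the training data (a = 0, b = 4), and
# z is assumed to take 4 possible values, so p = 1/4.
print(m_estimate(a=0, b=4, p=1/4))   # ≈ 0.139 instead of 0, so the product is no longer wiped out
```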

  29. Collaborative Filtering • A method that uses Bayesian reasoning to suggest items that a person might be interested in, based on their known interests. • If we know that Anne and Bob both like A, B, and C, and that Anne likes D, then we guess that Bob would also like D. • P(Bob likes Z | Bob likes A, Bob likes B, …, Bob likes Y) • Can be calculated using decision trees. [Decision tree diagram]
