
Ch5 Stochastic Methods



  1. Ch5 Stochastic Methods Dr. Bernard Chen Ph.D. University of Central Arkansas Spring 2011

  2. Outline • Introduction • Intro to Probability • Bayes’ Theorem • Naïve Bayes • Applications of the Stochastic Methods

  3. Introduction • Chapter 4 introduced heuristic search as an approach to problem solving in domains where • A problem does not have an exact solution, or • The full state space may be too costly to calculate

  4. Introduction • Important application domains for the use of stochastic methods include • Diagnostic reasoning, where cause/effect relationships are not always captured in a purely deterministic fashion • Gambling

  5. Outline • Introduction • Intro to Probability • Bayes’ Theorem • Naïve Bayes • Applications of the Stochastic Methods

  6. Elements of Probability Theory • Elementary Event • An elementary or atomic event is a happening or occurrence that cannot be made up of other events • Event E • An event is a set of elementary events • Sample Space, S • The set of all possible outcomes of an event E • Probability, p • The probability of an event E in a sample space is the ratio of the number of elements in E to the total number of possible outcomes

  7. Elements of Probability Theory • For example, what is the probability that a 7 or an 11 is the result of the roll of two fair dice? • Elementary event: a single outcome of rolling the two dice • Event: rolling a 7 or an 11 • Sample space: each die has 6 outcomes, so the total set of outcomes of the two dice is 36

  8. Elements of Probability Theory • The number of combinations of the two dice that can give a 7 is 6: 1,6; 2,5; 3,4; 4,3; 5,2 and 6,1 • So the probability of rolling a 7 is 6/36 • The number of combinations of the two dice that can give an 11 is 2: 5,6; 6,5 • So the probability of rolling an 11 is 2/36 • Thus, the probability of rolling a 7 or an 11 is 6/36 + 2/36 = 8/36 = 2/9
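
To make the counting concrete, here is a minimal Python sketch (not part of the original slides) that enumerates the 36 outcomes and confirms the 2/9 result:

```python
from itertools import product

# Enumerate the 36 equally likely outcomes of rolling two fair dice
outcomes = list(product(range(1, 7), repeat=2))

# Event: the two dice sum to 7 or to 11
event = [o for o in outcomes if sum(o) in (7, 11)]

# Probability = |E| / |S|
print(len(event), "/", len(outcomes), "=", len(event) / len(outcomes))  # 8 / 36 ≈ 0.222 (= 2/9)
```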

  9. Probability Reasoning • Suppose you are driving on the interstate highway and realize you are gradually slowing down because of increased traffic congestion • You then access the state highway statistics and download the relevant statistical information

  10. Probability Reasoning • In this situation, we have 3 parameters • Slowing down (S): T or F • Whether or not there’s an accident (A): T or F • Whether or not there’s road construction (C): T or F

  11. Probability Reasoning • We may also present it in a traditional Venn diagram

  12. Elements of Probability Theory • Two events A and B are independent if and only if the probability of their both occurring is equal to the product of their occurring individually • P(A ∩ B) = P(A) * P(B)

  13. Elements of Probability Theory • Consider the situation where bit strings of length 4 are randomly generated • We want to know whether the event of the bit string containing an even number of 1s is independent of the event where the bit string ends with 0 • We know the total space is 2^4 = 16

  14. Elements of Probability Theory • There are 8 bit strings of length four that end with 0 • There are 8 bit strings of length four that have an even number of 1s • The number of bit strings that have both an even number of 1s and end with 0 is 4: {1100, 1010, 0110, 0000}

  15. Elements of Probability Theory • P({even number of 1s} ∩ {end with 0}) = p({even number of 1s}) * p({end with 0}) • 4/16 = 8/16 * 8/16
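
A short Python sketch, assuming we simply enumerate all 16 strings, can verify this independence check:

```python
from itertools import product

# Sample space: all bit strings of length 4 (16 in total)
strings = ["".join(bits) for bits in product("01", repeat=4)]

even_ones = {s for s in strings if s.count("1") % 2 == 0}   # even number of 1s
ends_zero = {s for s in strings if s.endswith("0")}          # ends with 0

p_even = len(even_ones) / len(strings)              # 8/16
p_end0 = len(ends_zero) / len(strings)              # 8/16
p_both = len(even_ones & ends_zero) / len(strings)  # 4/16

print(p_both == p_even * p_end0)   # True -> the two events are independent
```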

  16. Probability Reasoning • Finally, the conditional probability • p(d|s) = |d ∩ s| / |s|
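
For instance, reusing the bit-string events from the previous sketch (rebuilt here so the snippet stands alone), the definition can be evaluated directly from set sizes:

```python
from itertools import product

strings = ["".join(bits) for bits in product("01", repeat=4)]
d = {x for x in strings if x.count("1") % 2 == 0}   # event d: even number of 1s
s = {x for x in strings if x.endswith("0")}          # event s: ends with 0

# p(d|s) = |d ∩ s| / |s|
print(len(d & s) / len(s))   # 4/8 = 0.5
```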

  17. Probability Reasoning

  18. Outline • Introduction • Intro to Probability • Bayes’ Theorem • Naïve Bayes • Applications of the Stochastic Methods

  19. Bayes’ Theorem • P(A|B) = P(B|A) * P(A) / P(B) • P(A) and P(B) are the prior probabilities • P(A|B) is the conditional probability of A, given B • P(B|A) is the conditional probability of B, given A

  20. Bayes’ Theorem • Suppose there is a school with 60% boys and 40% girls as its students. • The female students wear trousers or skirts in equal numbers (50% each); the boys all wear trousers. • An observer sees a (random) student from a distance and can only tell that this student is wearing trousers. • What is the probability this student is a girl? The correct answer can be computed using Bayes’ theorem

  21. Bayes’ Theorem • P(B|A), or the probability of the student wearing trousers given that the student is a girl. Since girls are as likely to wear skirts as trousers, this is 0.5. • P(A), or the probability that the student is a girl regardless of any other information. This is 0.4. • P(B), or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since half of the girls and all of the boys wear trousers, this is 0.5×0.4 + 1.0×0.6 = 0.8.

  22. Bayes’ Theorem • Putting the pieces together: P(A|B) = P(B|A) * P(A) / P(B) = 0.5 × 0.4 / 0.8 = 0.25 • So the probability that the observed student is a girl is 0.25
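
The same arithmetic as a small Python sketch, using the numbers from the slides above:

```python
# Bayes' theorem for the trousers example
p_girl = 0.4                          # P(A): prior probability that the student is a girl
p_trousers_given_girl = 0.5           # P(B|A)
p_trousers = 0.5 * 0.4 + 1.0 * 0.6    # P(B), by total probability = 0.8

p_girl_given_trousers = p_trousers_given_girl * p_girl / p_trousers
print(p_girl_given_trousers)          # 0.25
```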

  23. Outline • Introduction • Intro to Probability • Bayes’ Theorem • Naïve Bayes • Applications of the Stochastic Methods

  24. Naïve Bayesian Classifier: Training Dataset • Class: C1: buys_computer = ‘yes’, C2: buys_computer = ‘no’ • Data sample X = (age <= 30, Income = medium, Student = yes, Credit_rating = Fair)

  25. Bayesian Theorem: Basics • Let X be a data sample • Let H be a hypothesis (our prediction) that X belongs to class C • Classification is to determine P(H|X), the probability that the hypothesis holds given the observed data sample X • Example: customer X will buy a computer given that we know the customer’s age and income

  26. Naïve Bayesian Classifier: Training Dataset • Class: C1: buys_computer = ‘yes’, C2: buys_computer = ‘no’ • Data sample X = (age <= 30, Income = medium, Student = yes, Credit_rating = Fair)

  27. Naïve Bayesian Classifier: An Example • P(Ci): P(buys_computer = “yes”) = 9/14 = 0.643, P(buys_computer = “no”) = 5/14 = 0.357 • Compute P(X|Ci) for each class: P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222, P(age = “<=30” | buys_computer = “no”) = 3/5 = 0.6, P(income = “medium” | buys_computer = “yes”) = 4/9 = 0.444, P(income = “medium” | buys_computer = “no”) = 2/5 = 0.4, P(student = “yes” | buys_computer = “yes”) = 6/9 = 0.667, P(student = “yes” | buys_computer = “no”) = 1/5 = 0.2, P(credit_rating = “fair” | buys_computer = “yes”) = 6/9 = 0.667, P(credit_rating = “fair” | buys_computer = “no”) = 2/5 = 0.4

  28. Naïve Bayesian Classifier: An Example • X = (age <= 30, income = medium, student = yes, credit_rating = fair) • P(X|Ci): P(X|buys_computer = “yes”) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044, P(X|buys_computer = “no”) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019 • P(X|Ci)*P(Ci): P(X|buys_computer = “yes”) * P(buys_computer = “yes”) = 0.028, P(X|buys_computer = “no”) * P(buys_computer = “no”) = 0.007 • Therefore, X belongs to class (“buys_computer = yes”)
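
A minimal sketch of the same computation: the slide-27 probabilities are hard-coded here rather than counted from a dataset, and the attribute keys are illustrative names, not part of the original slides.

```python
# Naive Bayes scoring for X = (age<=30, income=medium, student=yes, credit_rating=fair)
priors = {"yes": 9/14, "no": 5/14}
cond = {
    "yes": {"age<=30": 2/9, "income=medium": 4/9, "student=yes": 6/9, "credit=fair": 6/9},
    "no":  {"age<=30": 3/5, "income=medium": 2/5, "student=yes": 1/5, "credit=fair": 2/5},
}

x = ["age<=30", "income=medium", "student=yes", "credit=fair"]

scores = {}
for c in priors:
    score = priors[c]
    for attr in x:
        score *= cond[c][attr]     # naive assumption: attributes independent given the class
    scores[c] = score

print(scores)                       # {'yes': ~0.028, 'no': ~0.007}
print(max(scores, key=scores.get))  # 'yes' -> X is classified as buys_computer = yes
```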

  29. Towards Naïve Bayesian Classifier • The classifier assigns X to the class Ci that maximizes P(Ci|X) = P(X|Ci) P(Ci) / P(X) • This can be derived from Bayes’ theorem • Since P(X) is constant for all classes, only P(X|Ci) P(Ci) needs to be maximized

  30. Naïve Bayesian Classifier: An Example • Test on the following example: X = (age > 30, Income = Low, Student = yes, Credit_rating = Excellent)

  31. Outline • Introduction • Intro to Probability • Bayes’ Theorem • Naïve Bayes • Applications of the Stochastic Methods

  32. Tomato • You say [t ow m ey t ow] and I say [t ow m aa t ow] • Probabilistic finite state machine • A finite state machine where the next state function is a probability distribution over the full set of states of the machine • Probabilistic finite state acceptor • An acceptor where one or more states are indicated as the start state and one or more as the accept states

  33. So how is “Tomato” pronounced? • A probabilistic finite state acceptor for the pronunciation of “tomato”, adapted from Jurafsky and Martin (2000).
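
The figure itself is not reproduced here; the toy Python sketch below only illustrates the idea of a next-state function that is a probability distribution over states, using placeholder transition probabilities rather than the values from Jurafsky and Martin (2000).

```python
import random

# Toy probabilistic finite state acceptor for two pronunciations of "tomato".
# Branch probabilities are placeholders, not corpus-derived values.
transitions = {
    "start": [("t", 1.0)],
    "t":     [("ow", 1.0)],
    "ow":    [("m", 1.0)],
    "m":     [("ey", 0.5), ("aa", 0.5)],   # the [ey] / [aa] split
    "ey":    [("t2", 1.0)],
    "aa":    [("t2", 1.0)],
    "t2":    [("ow2", 1.0)],
    "ow2":   [("accept", 1.0)],
}

def sample_pronunciation():
    state, phones = "start", []
    while state != "accept":
        next_states, weights = zip(*transitions[state])
        state = random.choices(next_states, weights=weights)[0]
        if state != "accept":
            phones.append(state.rstrip("2"))   # strip the suffix used to keep states unique
    return phones

print(sample_pronunciation())   # e.g. ['t', 'ow', 'm', 'ey', 't', 'ow']
```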

  34. Natural Language Processing • In the second example, we consider the phoneme recognition problem, often called decoding • Suppose a phoneme recognition algorithm has identified the phone ni (as in “knee”) that occurs just after the recognized word I

  35. Natural Language Processing • We want to associate ni with either a word or the first part of a word • We then use the Switchboard corpus, a 1.4-million-word collection of telephone conversations, to assist us

  36. Natural Language Processing

  37. Natural Language Processing • We next apply a form of Naïve Bayes to analyze the phone ni following the word I
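
The idea is to rank each candidate word w by p(ni | w) * p(w), with both probabilities estimated from the Switchboard corpus. The sketch below uses hypothetical placeholder numbers, not the actual corpus estimates from Jurafsky and Martin.

```python
# Decoding the phone [ni] heard after the word "I":
# rank candidate words w by p(ni | w) * p(w).
# All probabilities below are hypothetical placeholders for illustration only.
candidates = {
    #  word:   (p(ni | word), p(word))
    "knee": (1.00,    0.000024),
    "the":  (0.00012, 0.046),
    "neat": (0.52,    0.00013),
    "need": (0.11,    0.00056),
    "new":  (0.36,    0.001),
}

scores = {w: p_ni_given_w * p_w for w, (p_ni_given_w, p_w) in candidates.items()}

print(sorted(scores.items(), key=lambda kv: -kv[1]))   # ranked candidates
print("most likely word:", max(scores, key=scores.get))
```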
