Probability Distributions Continued

  1. Probability Distributions Continued, Module 2b (K. McAuley)

  2. Hypergeometric Distribution
  Sampling Without Replacement: Consider a batch of 20 microwave modules, of which 3 are defective. We sample and test one module first, and find it is defective.
  • at that point, there was a 3/20 chance of obtaining a defective module
  We sample a second time, without replacing the first module:
  • this time, there is a 2/19 chance of obtaining a defective module
  • the outcome of the second trial is no longer independent of the first trial
  • the probability of success/failure changes with each trial

  3. Hypergeometric Distribution
  Let's figure out how to solve problems where trials aren't independent. We'll use a counting approach instead of looking at the probability of a success on each trial.
  • Suppose we have a total of 20 objects in the batch, of which 3 are defective (so 17 are good)
  • We take out 8 objects, all at once, and we want to know the probability that 2 are defective (and 6 are not)
  • How many ways could we select 8 objects from the batch of 20?
  • How many possible ways could we select 2 defective items when there are 3 in the batch?
  • How many ways could we select 6 good items when there are 17 good items in the batch?
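
As a rough illustration of the counting argument above (not part of the original slides), the three counts and the resulting probability can be checked with Python's math.comb:

```python
from math import comb

# Batch of 20 objects: 3 defective, 17 good; sample 8, want exactly 2 defective
ways_total = comb(20, 8)      # ways to choose any 8 of the 20 objects
ways_defective = comb(3, 2)   # ways to choose 2 of the 3 defectives
ways_good = comb(17, 6)       # ways to choose 6 of the 17 good items

prob = ways_defective * ways_good / ways_total
print(ways_total, ways_defective, ways_good)  # 125970 3 12376
print(round(prob, 4))                         # about 0.2947
```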

  4. Hypergeometric Distribution
  Let's do it in general:
  • Suppose we have a total of N objects in the batch, of which d are defective
  • We randomly take out n objects, and we want to know the probability of x of them being defective
  • There are $\binom{N}{n}$ ways of taking the sample of n objects
  • There are $\binom{d}{x}$ ways of selecting the x defective objects
  • There are $\binom{N-d}{n-x}$ ways of selecting the n-x good objects
  Putting these together gives the hypergeometric probability:
  $P(X = x) = \dfrac{\binom{d}{x}\binom{N-d}{n-x}}{\binom{N}{n}}$

  5. Hypergeometric Distribution
  Example:
  • Given a batch of 200 dashboard components, of which 10% are typically defective
  • We take a sample of 10 components and test without replacement
  • What is the probability of finding 3 defective components?
  • What is the probability of finding 0 defective components?
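
A sketch of how these two probabilities could be computed, assuming "10% of 200" means exactly 20 defective components in the batch (the slide does not state this explicitly):

```python
from math import comb

N, d, n = 200, 20, 10   # batch size, defectives in the batch (assumed), sample size

def hypergeom_pmf(x, N, d, n):
    """P(exactly x defectives in a sample of n drawn without replacement)."""
    return comb(d, x) * comb(N - d, n - x) / comb(N, n)

print(round(hypergeom_pmf(3, N, d, n), 4))  # P(3 defective components)
print(round(hypergeom_pmf(0, N, d, n), 4))  # P(0 defective components)
```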

  6. Poisson Distribution
  Example: Consider a 100 km section of the 401, in which car accidents occur randomly and independently. The average number of accidents in the 100 km section (per month) is 15. Let's make some predictions about what might happen next month. What is the probability of
  a) 0 accidents occurring
  b) 10 accidents occurring

  7. Poisson Distribution
  • Used when considering discrete occurrences in a continuous interval
  • e.g., # of breakages in 500 m of yarn
  • e.g., # of times a photocopier will jam during 1 year
  • Derived from a Binomial distribution in which the number of trials is very large
  • To use the Poisson distribution, we must be willing to assume that occurrences in different segments of the interval are independent. Why?

  8. Poisson Distribution
  • Consider the time or space interval of interest (the overall interval) and divide it into small sub-intervals
  • Assume that what happens in each sub-interval is a Bernoulli trial, with a probability p of success
  • To get the Poisson distribution, make the size of the sub-intervals very small. Why?

  9. Poisson Distribution
  • Remember the Binomial distribution:
  $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
  • If we take the limit as the size of the sub-intervals goes to 0 and the number of sub-intervals $n \to \infty$, but keep the average number of successes $\mu = np$ constant for the total interval, we get the Poisson distribution:
  $P(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}, \quad x = 0, 1, 2, \ldots$
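
A small numerical check of this limiting argument (an added illustration, not from the slides): hold the average number of successes fixed at np = 15 and let n grow; the Binomial pmf approaches the Poisson pmf.

```python
from math import comb, exp, factorial

mu, x = 15.0, 10   # fixed average number of successes; count of interest

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

for n in (50, 500, 5000, 50000):
    p = mu / n   # shrink p so that np stays equal to mu
    print(n, round(binomial_pmf(x, n, p), 6))

print("Poisson limit:", round(poisson_pmf(x, mu), 6))
```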

  10. Poisson Distribution
  • We can also define $\lambda$, the average number of occurrences per unit time (or length), so that $\mu = \lambda t$, where t is the length of the interval of interest

  11. Poisson Distribution
  Example: Consider a 100 km section of the 401, in which car accidents occur randomly and independently. The average number of accidents in the 100 km section (per month) is 15. Let's make some predictions about what might happen next month. What is the probability of
  a) 0 accidents occurring
  b) 10 accidents occurring

  12. Poisson Distribution - Example
  $\mu = 15$ occurrences on average over the interval of interest (or $\lambda = 15$ occurrences per month and $t = 1$ month)
  $P(X = 0) = e^{-15} \approx 3.1 \times 10^{-7}$
  $P(X = 10) = \dfrac{e^{-15}\,15^{10}}{10!} \approx 0.049$
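
The same two probabilities, computed directly from the Poisson pmf with Python's standard library (an added sketch):

```python
from math import exp, factorial

mu = 15  # average number of accidents per month on the 100 km section

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

print(f"P(0 accidents)  = {poisson_pmf(0, mu):.2e}")   # about 3.1e-07
print(f"P(10 accidents) = {poisson_pmf(10, mu):.4f}")  # about 0.0486
```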

  13. Poisson Distribution
  Mean: $E(X) = \mu$
  • we identified $\mu$ as the average number of occurrences in the interval, so this makes sense.
  Variance: $\text{Var}(X) = \mu$
  • What do we think about this?

  14. Poisson Distribution
  Additional Notes:
  • The Poisson distribution can be used to approximate the Binomial distribution when the number of independent trials is very large and p is small
  • use $\mu = np$
  • Why is the approximation helpful?
  • if n = 1000 trials, we must calculate 1000!, which is very, very large
  • e.g., for n > 20 and p < 0.05 the approximation is good; for n > 100 and p < 0.01 the approximation is even better
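
A quick comparison illustrating the rule of thumb (the values n = 1000 and p = 0.005 are illustrative, not from the slide):

```python
from math import comb, exp, factorial

n, p = 1000, 0.005   # many trials, small success probability
mu = n * p           # Poisson parameter for the approximation

for x in range(6):
    exact = comb(n, x) * p**x * (1 - p)**(n - x)   # Binomial pmf
    approx = exp(-mu) * mu**x / factorial(x)       # Poisson approximation
    print(x, round(exact, 5), round(approx, 5))
```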

  15. Continuous Random Variables

  16. Continuous Random Variables
  Outcomes are values along the real number line.
  Examples: temperature or pressure measurements reported to a large number of decimals
  Problem: There are infinitely many possible values of X, so the probability of obtaining any particular value is vanishingly small. We need to think about the probability that X will be in a particular interval when defining probability distributions for continuous variables.

  17. Probability Density Functions
  Consider a probability density function $f_X(x)$. We get probabilities using areas under this curve:
  • $f_X(x)$ is like a "continuous histogram" with the total area under the curve equal to 1
  • Integrate $f_X(x)$ between particular values of x to get the probability that X will be in the range of interest:
  $P(a < X < b) = \int_a^b f_X(x)\,dx$

  18. Probability Density Function
  Example - the Normal probability density function, the familiar "bell-shaped" curve:
  $f_X(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$
  What is the probability that 1.0 < X < 2.5?

  19. Cumulative Distribution Function
  What is $P(X \le x)$?
  • e.g., P(Temperature < 350 °C)
  Cumulative distribution function:
  $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(u)\,du$

  20. Expected Value (or Mean)
  We can also define the expected value operation in a manner analogous to the discrete case
  • Use an integral instead of a summation:
  $E(X) = \mu_X = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$

  21. Variance
  ... is the expected squared deviation from the mean:
  $\text{Var}(X) = \sigma_X^2 = E\!\left[(X-\mu_X)^2\right] = \int_{-\infty}^{\infty} (x-\mu_X)^2 f_X(x)\,dx$
  Standard deviation is the square root of variance: $\sigma_X = \sqrt{\text{Var}(X)}$

  22. Expected Values
  Just like the discrete case, we can find the expected value of any function of the random variable, if we know the probability density function:
  $E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$

  23. Important Continuous Distributions
  We will learn about these now:
  • Uniform distribution
  • Exponential distribution
  • Normal distribution
  and these later, when we need them:
  • Student's t-distribution
  • Chi-squared distribution
  • F-distribution

  24. Uniform Distribution
  • We have values that occur in an interval
  • e.g., composition is between 2.5 and 3.5 g/L and the probability is equal (uniform) across the interval, but zero elsewhere
  [Figure: $f_X(x)$ versus x, constant between a and b and zero outside that interval]

  25. Uniform Distribution
  What is the probability density function?
  • Area of the rectangle must equal 1. Why? So the height is 1/(b-a):
  $f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$

  26. Uniform Distribution
  Mean - $E(X) = \dfrac{a+b}{2}$
  • Does this match our intuition?
  Variance - $\text{Var}(X) = \dfrac{(b-a)^2}{12}$
  • How could we prove this?
  • What happens to the variance as b and a get further apart?
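
One way to convince ourselves of these formulas is a quick Monte Carlo check (an added sketch; the endpoints 2.5 and 3.5 echo the composition example from the earlier slide):

```python
import random

a, b = 2.5, 3.5
samples = [random.uniform(a, b) for _ in range(200_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

print(round(mean, 3), "vs theoretical", (a + b) / 2)                  # about 3.0
print(round(var, 4), "vs theoretical", round((b - a) ** 2 / 12, 4))   # about 0.0833
```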

  27. Uniform Distribution
  When would we use a uniform distribution? Examples:
  • readout from a pressure gauge
  • if we are provided only with the pressure in Pa to the nearest integer, the true pressure could be anywhere from 0.5 Pa below the reading to 0.5 Pa above
  • in the absence of any additional information, we assume that values are distributed uniformly between these two limits
  • another example - numerical truncation and round-off in computations

  28. Normal Distribution
  • One of the most important distributions. Why?
  [Figure: Normal probability density function with $\mu = 1$ and $\sigma^2 = 1$]

  29. Normal Distribution
  • is symmetric
  • centre is at the mean
  • variance and standard deviation are measures of the width of the distribution
  Cumulative distribution function:
  $F_X(x) = \int_{-\infty}^{x} \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(u-\mu)^2}{2\sigma^2}\right) du$
  • Unfortunately, this integral has no analytical solution, so we rely on numerical integration results in tables
  • values in tables for $\mu = 0$ and $\sigma = 1$ are in the Appendix of the text

  30. Standard Normal Distribution
  Problem
  • We don't have a table for each possible mean and standard deviation
  Solution
  • Apply a transformation and use standard normal distribution tables
  If X is the original normally distributed random variable with mean $\mu_X$ and standard deviation $\sigma_X$, then
  $Z = \dfrac{X - \mu_X}{\sigma_X}$
  has a mean of zero and a standard deviation of 1.

  31. Standard Normal Distribution
  • mean of Z: $E(Z) = E\!\left[\dfrac{X - \mu_X}{\sigma_X}\right] = \dfrac{E(X) - \mu_X}{\sigma_X} = 0$
  • variance of Z: $\text{Var}(Z) = \dfrac{\text{Var}(X)}{\sigma_X^2} = 1$
  What rules about expectations were used to show this? Which things are random and which aren't?

  32. Using the Standard Normal Tables
  • What is P(Z < 1.96)?
  • What is P(Z < -1.96)?
  • What is P(-1.96 < Z < 1.96)?
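
If a table is not at hand, the same three probabilities can be computed with scipy (an added sketch; scipy.stats.norm defaults to the standard Normal with mean 0 and standard deviation 1):

```python
from scipy.stats import norm

p_lower = norm.cdf(1.96)                        # P(Z < 1.96), about 0.975
p_upper = norm.cdf(-1.96)                       # P(Z < -1.96), about 0.025
p_between = norm.cdf(1.96) - norm.cdf(-1.96)    # P(-1.96 < Z < 1.96), about 0.95

print(round(p_lower, 4), round(p_upper, 4), round(p_between, 4))
```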

  33. Central Limit Theorem
  • Why is the Normal distribution so important?
  • Because the sum or average of N random variables follows a Normal distribution (approximately) if N is a large number
  • Imagine N independent random variables, each having the same distribution (any type of distribution at all) with mean $\mu$ and variance $\sigma^2$; then the average $\bar{X}$ becomes normally distributed as N becomes large:
  $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{N}}$
  where Z follows the standard Normal distribution.
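
A simple simulation of the theorem (an added illustration): average N skewed Exponential(1) variables, which have mean 1 and variance 1, then standardize. If the CLT holds, about 95% of the standardized averages should fall between -1.96 and 1.96.

```python
import random

def standardized_average(N, mu=1.0, sigma=1.0):
    """Average N iid Exponential(1) variables, then standardize the average."""
    xbar = sum(random.expovariate(1.0) for _ in range(N)) / N
    return (xbar - mu) / (sigma / N ** 0.5)

N, reps = 50, 20_000
z_values = [standardized_average(N) for _ in range(reps)]

fraction_within = sum(-1.96 < z < 1.96 for z in z_values) / reps
print(round(fraction_within, 3))   # close to 0.95 for large N
```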

  34. Central Limit Theorem
  • In many instances, the Normal distribution provides a reasonable approximation for quantities that are a sum or average of independent random variables
  • Course marks are sometimes normally distributed. Why? Why not?
  • Repeated measurements of the same variable often tend to a Normal distribution. Why?

  35. New Topic - Failures in Time
  Example problem:
  • We have an important pump on a recycle line
  • The packing fails on average 0.6 times/year
  • What is the probability of the pump packing failing before 1 year?
  • We could also say, what is the probability that the "time to failure" is less than 1 year?

  36. Exponential Distribution
  Assume that events occur in time at an average rate of $\lambda$ occurrences per unit time.
  What is the probability that the first occurrence of the event happens before time t?
  Approach -
  • Similar to a Poisson problem, but what is different?
  • P(event occurs before a given time) = 1 - P(event doesn't occur during the entire time interval)

  37. Exponential Distribution
  • Event doesn't occur in a given time means 0 occurrences
  • Poisson - with occurrence rate $\mu = \lambda t$ in the time interval t:
  $P(0 \text{ occurrences}) = \dfrac{e^{-\lambda t}(\lambda t)^0}{0!} = e^{-\lambda t}$
  • P(event occurs before this time) $= 1 - e^{-\lambda t}$

  38. Exponential Distribution
  Denote the continuous random variable X as the time to occurrence.
  Cumulative distribution function:
  $F_X(t) = P(X \le t) = 1 - e^{-\lambda t}, \quad t \ge 0$
  • Probability density function is
  $f_X(t) = \dfrac{dF_X(t)}{dt} = \lambda e^{-\lambda t}$
  Why?
  • Sometimes we know $\lambda$, the average number of failures per unit time, and sometimes we know the mean time to failure, which equals $1/\lambda$

  39. Pump Failure Problem
  • If the packing fails on average 0.6 times/year, what is the chance of failure within the first year?
  • P(pump fails within a year) $= 1 - e^{-\lambda t} = 1 - e^{-0.6 \times 1} \approx 0.45$
  • 45% chance of failure within the year
  • We derived the Exponential distribution from the Poisson distribution, which came from the Binomial distribution. What troubling assumptions are we making?
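
The pump calculation, spelled out as a short sketch (0.6 packing failures per year, a one-year horizon):

```python
from math import exp

rate = 0.6   # average packing failures per year (lambda)
t = 1.0      # time horizon in years

p_fail = 1 - exp(-rate * t)   # exponential CDF evaluated at t
print(round(p_fail, 3))       # about 0.451, i.e. roughly a 45% chance of failure within a year
```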

  40. Exponential Distribution - Notes
  • The time to failure is a continuous random variable
  • We assume that the expected failure rate is constant, and that it doesn't increase as equipment wears out
  • We assume that failures are independent, and that each time increment is an independent trial
  • mean and variance: $E(X) = 1/\lambda$, $\text{Var}(X) = 1/\lambda^2$

  41. Exponential Distribution
  Problem Variations -
  • given mean time to failure, determine probability that time to failure is less than a given value
  • given fraction of components failing in a specified time, what is probability that time to failure is less than a given value?
  • what is probability that a component lasts at least a given time?
  Let's make one up and do it.

  42. Exponential Distribution
  Example: If a component has operated for 100 hours already, what is the probability that it will operate for at least 200 hours altogether before failing, i.e., P(X > 200 | X > 100) = ?
  • Let A = {X > 100}, B = {X > 200}
  • Remember conditional probability: $P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$
  • for our events, we have $A \cap B = B$. Why?
  • so $P(B \mid A) = \dfrac{P(B)}{P(A)} = \dfrac{e^{-200\lambda}}{e^{-100\lambda}} = e^{-100\lambda} = P(X > 100)$
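
The conclusion can be checked numerically for any failure rate; here lambda = 0.01 failures per hour is an arbitrary illustrative value:

```python
from math import exp

lam = 0.01  # failures per hour (illustrative; any positive value gives the same conclusion)

def survival(t, lam):
    """P(X > t) for an exponential time to failure."""
    return exp(-lam * t)

p_conditional = survival(200, lam) / survival(100, lam)       # P(X > 200 | X > 100)
print(round(p_conditional, 4), round(survival(100, lam), 4))  # both equal e^(-100*lam)
```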

  43. Exponential Distribution
  What does this tell us?

  44. Exponential Distribution
  Memoryless property of the Exponential Distribution
  • The probability of the component lasting for another 100 hours, given that it has functioned for 100 hours, is simply the probability of it lasting 100 hours
  • Prior history has no influence on the probability of failure when the exponential distribution is used
  • Is this how life works?
  • When are we justified in using the exponential distribution, and when should we avoid it?
