Probability Distributions Continued

  1. Probability Distributions Continued, Module 2b (K. McAuley)

  2. Hypergeometric Distribution
  Sampling Without Replacement: Consider a batch of 20 microwave modules, of which 3 are defective. We sample and test one module first, and find it is defective.
  • at that point, there was a 3/20 chance of obtaining a defective module
  We sample a second time, without replacing the first module:
  • this time, there is a 2/19 chance of obtaining a defective module
  • the outcome of the second trial is no longer independent of the first trial
  • the probability of success/failure changes with each trial

  3. Hypergeometric Distribution
  Let's figure out how to solve problems where trials aren't independent. We'll use a counting approach instead of looking at the probability of a success on each trial.
  • Suppose we have a total of 20 objects in the batch, of which 3 are defective (so 17 are good)
  • We take out 8 objects, all at once, and we want to know the probability that 2 are defective (and 6 are not)
  • How many ways could we select 8 objects from the batch of 20?
  • How many possible ways could we select 2 defective items when there are 3 in the batch?
  • How many ways could we select 6 good items when there are 17 good items in the batch?
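
As a rough illustration of the counting argument above (not part of the original slides), the three counts and the resulting probability can be checked with Python's math.comb:

```python
from math import comb

# Batch of 20 objects: 3 defective, 17 good; sample 8, want exactly 2 defective
ways_total = comb(20, 8)      # ways to choose any 8 of the 20 objects
ways_defective = comb(3, 2)   # ways to choose 2 of the 3 defectives
ways_good = comb(17, 6)       # ways to choose 6 of the 17 good items

prob = ways_defective * ways_good / ways_total
print(ways_total, ways_defective, ways_good)  # 125970 3 12376
print(round(prob, 4))                         # about 0.2947
```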

  4. Hypergeometric Distribution
  Let's do it in general:
  • Suppose we have a total of N objects in the batch, of which d are defective
  • We randomly take out n objects, and we want to know the probability of x of them being defective
  • There are $\binom{N}{n}$ ways of taking the sample of n objects
  • There are $\binom{d}{x}$ ways of selecting the x defective objects
  • There are $\binom{N-d}{n-x}$ ways of selecting the n-x good objects
  Putting these together gives the hypergeometric probability:
  $P(X = x) = \dfrac{\binom{d}{x}\binom{N-d}{n-x}}{\binom{N}{n}}$

  5. Hypergeometric Distribution
  Example:
  • Given a batch of 200 dashboard components, of which 10% are typically defective
  • We take a sample of 10 components and test without replacement
  • What is the probability of finding 3 defective components?
  • What is the probability of finding 0 defective components?
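
A sketch of how these two probabilities could be computed, assuming "10% of 200" means exactly 20 defective components in the batch (the slide does not state this explicitly):

```python
from math import comb

N, d, n = 200, 20, 10   # batch size, defectives in the batch (assumed), sample size

def hypergeom_pmf(x, N, d, n):
    """P(exactly x defectives in a sample of n drawn without replacement)."""
    return comb(d, x) * comb(N - d, n - x) / comb(N, n)

print(round(hypergeom_pmf(3, N, d, n), 4))  # P(3 defective components)
print(round(hypergeom_pmf(0, N, d, n), 4))  # P(0 defective components)
```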

  6. Poisson Distribution
  Example: Consider a 100 km section of the 401, in which car accidents occur randomly and independently. The average number of accidents in the 100 km section (per month) is 15. Let's make some predictions about what might happen next month. What is the probability of
  a) 0 accidents occurring
  b) 10 accidents occurring

  7. Poisson Distribution
  • Used when considering discrete occurrences in a continuous interval
  • e.g., # of breakages in 500 m of yarn
  • e.g., # of times a photocopier will jam during 1 year
  • Derived from a Binomial distribution in which the number of trials is very large
  • To use the Poisson distribution, we must be willing to assume that occurrences in different segments of the interval are independent. Why?

  8. Poisson Distribution
  • Consider the time or space interval of interest (the overall interval) and divide it into small sub-intervals
  • Assume that what happens in each sub-interval is a Bernoulli trial, with a probability p of success
  • To get the Poisson distribution, make the size of the sub-intervals very small. Why?

  9. Poisson Distribution
  • Remember the Binomial distribution:
  $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
  • If we take the limit as the size of the sub-intervals goes to 0 and the number of sub-intervals $n \to \infty$, but keep the average number of successes $\mu = np$ constant for the total interval, we get the Poisson distribution:
  $P(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}, \quad x = 0, 1, 2, \ldots$
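
A small numerical check of this limiting argument (an added illustration, not from the slides): hold the average number of successes fixed at np = 15 and let n grow; the Binomial pmf approaches the Poisson pmf.

```python
from math import comb, exp, factorial

mu, x = 15.0, 10   # fixed average number of successes; count of interest

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

for n in (50, 500, 5000, 50000):
    p = mu / n   # shrink p so that np stays equal to mu
    print(n, round(binomial_pmf(x, n, p), 6))

print("Poisson limit:", round(poisson_pmf(x, mu), 6))
```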

  10. Poisson Distribution
  • We can also define $\lambda$, the average number of occurrences per unit time (or length), so that $\mu = \lambda t$, where t is the length of the interval of interest

  11. Poisson Distribution
  Example: Consider a 100 km section of the 401, in which car accidents occur randomly and independently. The average number of accidents in the 100 km section (per month) is 15. Let's make some predictions about what might happen next month. What is the probability of
  a) 0 accidents occurring
  b) 10 accidents occurring

  12. Poisson Distribution - Example
  $\mu = 15$ occurrences on average over the interval of interest (or $\lambda = 15$ occurrences per month and $t = 1$ month)
  $P(X = 0) = e^{-15} \approx 3.1 \times 10^{-7}$
  $P(X = 10) = \dfrac{e^{-15}\,15^{10}}{10!} \approx 0.049$
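
The same two probabilities, computed directly from the Poisson pmf with Python's standard library (an added sketch):

```python
from math import exp, factorial

mu = 15  # average number of accidents per month on the 100 km section

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

print(f"P(0 accidents)  = {poisson_pmf(0, mu):.2e}")   # about 3.1e-07
print(f"P(10 accidents) = {poisson_pmf(10, mu):.4f}")  # about 0.0486
```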

  13. Poisson Distribution
  Mean: $E(X) = \mu$
  • we identified $\mu$ as the average number of occurrences in the interval, so this makes sense.
  Variance: $\text{Var}(X) = \mu$
  • What do we think about this?

  14. Poisson Distribution
  Additional Notes:
  • The Poisson distribution can be used to approximate the Binomial distribution when the number of independent trials is very large and p is small
  • use $\mu = np$
  • Why is the approximation helpful?
  • if n = 1000 trials, we must calculate 1000!, which is very, very large
  • e.g., for n > 20 and p < 0.05 the approximation is good; for n > 100 and p < 0.01 the approximation is even better
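
A quick comparison illustrating the rule of thumb (the values n = 1000 and p = 0.005 are illustrative, not from the slide):

```python
from math import comb, exp, factorial

n, p = 1000, 0.005   # many trials, small success probability
mu = n * p           # Poisson parameter for the approximation

for x in range(6):
    exact = comb(n, x) * p**x * (1 - p)**(n - x)   # Binomial pmf
    approx = exp(-mu) * mu**x / factorial(x)       # Poisson approximation
    print(x, round(exact, 5), round(approx, 5))
```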

  15. Continuous Random Variables

  16. Continuous Random Variables
  Outcomes are values along the real number line.
  Examples: temperature or pressure measurements reported to a large number of decimals
  Problem: There are infinitely many possible values of X, so the probability of obtaining any particular value is vanishingly small. We need to think about the probability that X will be in a particular interval when defining probability distributions for continuous variables.

  17. Probability Density Functions
  Consider a probability density function $f_X(x)$. We get probabilities using areas under this curve:
  • $f_X(x)$ is like a "continuous histogram" with the total area under the curve equal to 1
  • Integrate $f_X(x)$ between particular values of x to get the probability that X will be in the range of interest:
  $P(a < X < b) = \int_a^b f_X(x)\,dx$

  18. Probability Density Function
  Example - the Normal probability density function, the familiar "bell-shaped" curve:
  $f_X(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$
  What is the probability that 1.0 < X < 2.5?

  19. Cumulative Distribution Function
  What is $P(X \le x)$?
  • e.g., P(Temperature < 350 °C)
  Cumulative distribution function:
  $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(u)\,du$

  20. Expected Value (or Mean)
  We can also define the expected value operation in a manner analogous to the discrete case
  • Use an integral instead of a summation:
  $E(X) = \mu_X = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$

  21. Variance
  ... is the expected squared deviation from the mean:
  $\text{Var}(X) = \sigma_X^2 = E\!\left[(X-\mu_X)^2\right] = \int_{-\infty}^{\infty} (x-\mu_X)^2 f_X(x)\,dx$
  Standard deviation is the square root of variance: $\sigma_X = \sqrt{\text{Var}(X)}$

  22. Expected Values
  Just like the discrete case, we can find the expected value of any function of the random variable, if we know the probability density function:
  $E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$

  23. Important Continuous Distributions
  We will learn about these now:
  • Uniform distribution
  • Exponential distribution
  • Normal distribution
  and these later, when we need them:
  • Student's t-distribution
  • Chi-squared distribution
  • F-distribution

  24. Uniform Distribution
  • We have values that occur in an interval
  • e.g., composition is between 2.5 and 3.5 g/L and the probability is equal (uniform) across the interval, but zero elsewhere
  [Figure: $f_X(x)$ versus x, constant between a and b and zero outside that interval]

  25. Uniform Distribution
  What is the probability density function?
  • Area of the rectangle must equal 1. Why? So the height is 1/(b-a):
  $f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$

  26. Uniform Distribution
  Mean - $E(X) = \dfrac{a+b}{2}$
  • Does this match our intuition?
  Variance - $\text{Var}(X) = \dfrac{(b-a)^2}{12}$
  • How could we prove this?
  • What happens to the variance as b and a get further apart?
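
One way to convince ourselves of these formulas is a quick Monte Carlo check (an added sketch; the endpoints 2.5 and 3.5 echo the composition example from the earlier slide):

```python
import random

a, b = 2.5, 3.5
samples = [random.uniform(a, b) for _ in range(200_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

print(round(mean, 3), "vs theoretical", (a + b) / 2)                  # about 3.0
print(round(var, 4), "vs theoretical", round((b - a) ** 2 / 12, 4))   # about 0.0833
```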

  27. Uniform Distribution
  When would we use a uniform distribution? Examples:
  • readout from a pressure gauge
  • if we are provided only with the pressure in Pa to the nearest integer, the true pressure could be anywhere from 0.5 Pa below the reading to 0.5 Pa above
  • in the absence of any additional information, we assume that values are distributed uniformly between these two limits
  • another example - numerical truncation and round-off in computations

  28. Normal Distribution
  • One of the most important distributions. Why?
  [Figure: Normal probability density function with $\mu = 1$ and $\sigma^2 = 1$]

  29. Normal Distribution
  • is symmetric
  • centre is at the mean
  • variance and standard deviation are measures of the width of the distribution
  Cumulative distribution function:
  $F_X(x) = \int_{-\infty}^{x} \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(u-\mu)^2}{2\sigma^2}\right) du$
  • Unfortunately, this integral has no analytical solution, so we rely on numerical integration results in tables
  • values in tables for $\mu = 0$ and $\sigma = 1$ are in the Appendix of the text

  30. Standard Normal Distribution
  Problem
  • We don't have a table for each possible mean and standard deviation
  Solution
  • Apply a transformation and use standard normal distribution tables
  If X is the original normally distributed random variable with mean $\mu_X$ and standard deviation $\sigma_X$, then
  $Z = \dfrac{X - \mu_X}{\sigma_X}$
  has a mean of zero and a standard deviation of 1.

  31. Standard Normal Distribution
  • mean of Z: $E(Z) = E\!\left[\dfrac{X - \mu_X}{\sigma_X}\right] = \dfrac{E(X) - \mu_X}{\sigma_X} = 0$
  • variance of Z: $\text{Var}(Z) = \dfrac{\text{Var}(X)}{\sigma_X^2} = 1$
  What rules about expectations were used to show this? Which things are random and which aren't?

  32. Using the Standard Normal Tables
  • What is P(Z < 1.96)?
  • What is P(Z < -1.96)?
  • What is P(-1.96 < Z < 1.96)?
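
If a table is not at hand, the same three probabilities can be computed with scipy (an added sketch; scipy.stats.norm defaults to the standard Normal with mean 0 and standard deviation 1):

```python
from scipy.stats import norm

p_lower = norm.cdf(1.96)                        # P(Z < 1.96), about 0.975
p_upper = norm.cdf(-1.96)                       # P(Z < -1.96), about 0.025
p_between = norm.cdf(1.96) - norm.cdf(-1.96)    # P(-1.96 < Z < 1.96), about 0.95

print(round(p_lower, 4), round(p_upper, 4), round(p_between, 4))
```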

  33. Central Limit Theorem
  • Why is the Normal distribution so important?
  • Because the sum or average of N random variables follows a Normal distribution (approximately) if N is a large number
  • Imagine N independent random variables, each having the same distribution (any type of distribution at all) with mean $\mu$ and variance $\sigma^2$; then the average $\bar{X}$ becomes normally distributed as N becomes large:
  $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{N}}$
  where Z follows the standard Normal distribution.
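
A simple simulation of the theorem (an added illustration): average N skewed Exponential(1) variables, which have mean 1 and variance 1, then standardize. If the CLT holds, about 95% of the standardized averages should fall between -1.96 and 1.96.

```python
import random

def standardized_average(N, mu=1.0, sigma=1.0):
    """Average N iid Exponential(1) variables, then standardize the average."""
    xbar = sum(random.expovariate(1.0) for _ in range(N)) / N
    return (xbar - mu) / (sigma / N ** 0.5)

N, reps = 50, 20_000
z_values = [standardized_average(N) for _ in range(reps)]

fraction_within = sum(-1.96 < z < 1.96 for z in z_values) / reps
print(round(fraction_within, 3))   # close to 0.95 for large N
```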

  34. Central Limit Theorem
  • In many instances, the Normal distribution provides a reasonable approximation for quantities that are a sum or average of independent random variables
  • Course marks are sometimes normally distributed. Why? Why not?
  • Repeated measurements of the same variable often tend to a Normal distribution. Why?

  35. New Topic - Failures in Time
  Example problem:
  • We have an important pump on a recycle line
  • The packing fails on average 0.6 times/year
  • What is the probability of the pump packing failing before 1 year?
  • We could also say, what is the probability that the "time to failure" is less than 1 year?

  36. Exponential Distribution
  Assume that events occur in time at an average rate of $\lambda$ occurrences per unit time.
  What is the probability that the first occurrence of the event happens before time t?
  Approach -
  • Similar to a Poisson problem, but what is different?
  • P(event occurs before a given time) = 1 - P(event doesn't occur during the entire time interval)

  37. Exponential Distribution
  • Event doesn't occur in a given time means 0 occurrences
  • Poisson - with occurrence rate $\mu = \lambda t$ in the time interval t:
  $P(0 \text{ occurrences}) = \dfrac{e^{-\lambda t}(\lambda t)^0}{0!} = e^{-\lambda t}$
  • P(event occurs before this time) $= 1 - e^{-\lambda t}$

  38. Exponential Distribution
  Denote the continuous random variable X as the time to occurrence.
  Cumulative distribution function:
  $F_X(t) = P(X \le t) = 1 - e^{-\lambda t}, \quad t \ge 0$
  • Probability density function is
  $f_X(t) = \dfrac{dF_X(t)}{dt} = \lambda e^{-\lambda t}$
  Why?
  • Sometimes we know $\lambda$, the average number of failures per unit time, and sometimes we know the mean time to failure, which equals $1/\lambda$

  39. Pump Failure Problem
  • If the packing fails on average 0.6 times/year, what is the chance of failure within the first year?
  • P(pump fails within a year) $= 1 - e^{-\lambda t} = 1 - e^{-0.6 \times 1} \approx 0.45$
  • 45% chance of failure within the year
  • We derived the Exponential distribution from the Poisson distribution, which came from the Binomial distribution. What troubling assumptions are we making?
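
The pump calculation, spelled out as a short sketch (0.6 packing failures per year, a one-year horizon):

```python
from math import exp

rate = 0.6   # average packing failures per year (lambda)
t = 1.0      # time horizon in years

p_fail = 1 - exp(-rate * t)   # exponential CDF evaluated at t
print(round(p_fail, 3))       # about 0.451, i.e. roughly a 45% chance of failure within a year
```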

  40. Exponential Distribution - Notes
  • The time to failure is a continuous random variable
  • We assume that the expected failure rate is constant, and that it doesn't increase as equipment wears out
  • We assume that failures are independent, and that each time increment is an independent trial
  • mean and variance: $E(X) = 1/\lambda$, $\text{Var}(X) = 1/\lambda^2$

  41. Exponential Distribution
  Problem Variations -
  • given mean time to failure, determine probability that time to failure is less than a given value
  • given fraction of components failing in a specified time, what is probability that time to failure is less than a given value?
  • what is probability that a component lasts at least a given time?
  Let's make one up and do it.

  42. Exponential Distribution
  Example: If a component has operated for 100 hours already, what is the probability that it will operate for at least 200 hours altogether before failing, i.e., P(X > 200 | X > 100) = ?
  • Let A = {X > 100}, B = {X > 200}
  • Remember conditional probability: $P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$
  • for our events, we have $A \cap B = B$. Why?
  • so $P(B \mid A) = \dfrac{P(B)}{P(A)} = \dfrac{e^{-200\lambda}}{e^{-100\lambda}} = e^{-100\lambda} = P(X > 100)$
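
The conclusion can be checked numerically for any failure rate; here lambda = 0.01 failures per hour is an arbitrary illustrative value:

```python
from math import exp

lam = 0.01  # failures per hour (illustrative; any positive value gives the same conclusion)

def survival(t, lam):
    """P(X > t) for an exponential time to failure."""
    return exp(-lam * t)

p_conditional = survival(200, lam) / survival(100, lam)       # P(X > 200 | X > 100)
print(round(p_conditional, 4), round(survival(100, lam), 4))  # both equal e^(-100*lam)
```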

  43. Exponential Distribution
  What does this tell us?

  44. Exponential Distribution
  Memoryless property of the Exponential Distribution
  • The probability of the component lasting for another 100 hours, given that it has functioned for 100 hours, is simply the probability of it lasting 100 hours
  • Prior history has no influence on the probability of failure when the exponential distribution is used
  • Is this how life works?
  • When are we justified in using the exponential distribution, and when should we avoid it?
