Econ 240A Power Four
Last Time • Probability
Problem 6.61 • A survey of middle aged men reveals that 28% of them are balding at the crown of their head. Moreover, it is known that such men have an 18% probability of suffering a heart attack in the next ten years. Men who are not balding in this way have an 11% probability of a heart attack. Find the probability that a middle aged man will suffer a heart attack in the next ten years.
[Diagram: middle aged men (MA), split into Bald and Not Bald] • P(Bald and MA) = 0.28 • P(HA/Bald and MA) = 0.18 • P(HA/Not Bald and MA) = 0.11
Probability of a heart attack in the next ten years • P(HA) = P(HA and Bald and MA) + P(HA and Not Bald and MA) • P(HA) = P(HA/Bald and MA)*P(Bald and MA) + P(HA/Not Bald and MA)*P(Not Bald and MA) • P(HA) = 0.18*0.28 + 0.11*0.72 = 0.0504 + 0.0792 = 0.1296
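The calculation for Problem 6.61 can be checked with a few lines of Python. This is only a sketch: the function name `total_probability` and the variable names are illustrative choices, not part of the problem.

```python
# Law of total probability for Problem 6.61 (everyone here is a middle aged man).
p_bald = 0.28               # P(Bald and MA)
p_ha_given_bald = 0.18      # P(HA/Bald and MA)
p_ha_given_not_bald = 0.11  # P(HA/Not Bald and MA)


def total_probability(p_b, p_a_given_b, p_a_given_not_b):
    """P(A) = P(A/B)*P(B) + P(A/not B)*P(not B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)


p_ha = total_probability(p_bald, p_ha_given_bald, p_ha_given_not_bald)
print(round(p_ha, 4))
```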
Random Variables • There is a natural transition or easy segue from our discussion of probability and Bernoulli trials last time to random variables • Define k to be the random variable # of heads in 1 flip, 2 flips or n flips of a coin • We can find the probability that k=0, or k=n by brute force using probability trees. We can find the histogram for k, its central tendency and its dispersion
Outline • Random Variables & Bernoulli Trials • example: one flip of a coin • expected value of the number of heads • variance in the number of heads • example: two flips of a coin • a fair coin: frequency distribution of the number of heads • one flip • two flips
Outline (Cont.) • Three flips of a fair coin, the number of combinations of the number of heads • The binomial distribution • frequency distributions for the binomial • The expected value of a discrete random variable • the variance of a discrete random variable
Concept • Bernoulli Trial • two outcomes, e.g. success or failure • successive independent trials • probability of success is the same in each trial • Example: flipping a coin multiple times
Flipping a Coin Once • The random variable k is the number of heads • It is variable because k can equal one or zero • It is random because the value of k depends on the probabilities of occurrence, p and 1-p • [Probability tree: Heads, k=1, Prob. = p; Tails, k=0, Prob. = 1-p]
Flipping a coin once • Expected value of the number of heads is the value of k weighted by the probability that value of k occurs • $E(k) = 1 \cdot p + 0 \cdot (1-p) = p$ • Variance of k is the value of k minus its expected value, squared, weighted by the probability that value of k occurs • $\mathrm{VAR}(k) = (1-p)^2 p + (0-p)^2 (1-p) = (1-p)\,p\,[(1-p) + p] = (1-p)\,p$
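As a numerical check of the one-flip results, the sketch below computes the probability-weighted mean and variance directly from the two outcomes. The helper name `bernoulli_moments` and the value p = 0.3 are illustrative assumptions.

```python
def bernoulli_moments(p):
    """Mean and variance of k (number of heads) in one flip with P(heads) = p."""
    outcomes = {1: p, 0: 1.0 - p}  # value of k -> probability
    mean = sum(k * prob for k, prob in outcomes.items())
    var = sum((k - mean) ** 2 * prob for k, prob in outcomes.items())
    return mean, var


p = 0.3  # assumed value, any probability between 0 and 1 works
mean, var = bernoulli_moments(p)
print(mean, p)            # mean should match p
print(var, p * (1 - p))   # variance should match p*(1-p)
```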
Flipping a coin twice: 4 elementary outcomes • [Probability tree: each flip branches to heads with Prob = p or tails with Prob = 1-p] • h, h; k=2 • h, t; k=1 • t, h; k=1 • t, t; k=0
Flipping a Coin Twice • Expected number of heads • $E(k) = 2p^2 + 1 \cdot p(1-p) + 1 \cdot (1-p)p + 0 \cdot (1-p)^2 = 2p^2 + p - p^2 + p - p^2 = 2p$ • So we might expect that the expected value of k in n independent flips is $np$ • Variance in k • $\mathrm{VAR}(k) = (2-2p)^2 p^2 + 2(1-2p)^2 p(1-p) + (0-2p)^2 (1-p)^2$
Continuing with the variance in k • $\mathrm{VAR}(k) = (2-2p)^2 p^2 + 2(1-2p)^2 p(1-p) + (0-2p)^2 (1-p)^2$ • $\mathrm{VAR}(k) = 4(1-p)^2 p^2 + 2(1 - 4p + 4p^2)\,p(1-p) + 4p^2 (1-p)^2$ • Adding the first and last terms: $8p^2(1-p)^2 + 2(1 - 4p + 4p^2)\,p(1-p)$ • Expanding this last term: $2p(1-p) - 8p^2(1-p) + 8p^3(1-p)$, where $-8p^2(1-p) + 8p^3(1-p) = -8p^2(1-p)(1-p)$ • $\mathrm{VAR}(k) = 8p^2(1-p)^2 + 2p(1-p) - 8p^2(1-p)(1-p)$ • So $\mathrm{VAR}(k) = 2p(1-p)$, or twice VAR(k) for one flip
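The two-flip algebra can be checked by brute force: enumerate the elementary outcomes, weight each by the product of its branch probabilities, and compute the mean and variance of k. The function name `enumerate_heads` and the value p = 0.3 are assumptions made for this sketch.

```python
from itertools import product


def enumerate_heads(n, p):
    """Return (mean, variance) of the number of heads k in n independent flips."""
    outcomes = []
    for flips in product("ht", repeat=n):       # e.g. ('h', 't')
        k = flips.count("h")
        prob = p ** k * (1.0 - p) ** (n - k)    # product of branch probabilities
        outcomes.append((k, prob))
    mean = sum(k * prob for k, prob in outcomes)
    var = sum((k - mean) ** 2 * prob for k, prob in outcomes)
    return mean, var


p = 0.3  # assumed value for illustration
mean, var = enumerate_heads(2, p)
print(mean, 2 * p)             # expected number of heads vs 2p
print(var, 2 * p * (1 - p))    # variance vs 2p(1-p)
```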
Frequency Distribution for the Number of Heads • A fair coin
One Flip of the Coin • [Bar chart: probability vs. # of heads] • 0 heads: 1/2 • 1 head: 1/2
Two Flips of a Fair Coin • [Bar chart: probability vs. # of heads] • 0 heads: 1/4 • 1 head: 1/2 • 2 heads: 1/4
Three Flips of a Fair Coin • It is not so hard to see what the value of the number of heads, k, might be for three flips of a coin: zero, one, two, or three • But one head can occur three ways, as can two heads • Hence we need to consider the number of ways k can occur, i.e. the combinations of branching probabilities where order does not count
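Three flips of a coin; 8 elementary outcomes • [Probability tree, branches listed top to bottom] • h, h, h: 3 heads • h, h, t: 2 heads • h, t, h: 2 heads • h, t, t: 1 head • t, h, h: 2 heads • t, h, t: 1 head • t, t, h: 1 head • t, t, t: 0 heads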
Three Flips of a Coin • There is only one way of getting three heads or of getting zero heads • But there are three ways of getting two heads or of getting one head • One way of calculating the number of combinations is $C_n(k) = \frac{n!}{k!\,(n-k)!}$ • Another way of calculating the number of combinations is Pascal's triangle
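The three ways of counting can be compared side by side: enumerating the branches, applying the factorial formula, and building a row of Pascal's triangle. The function names below are illustrative, and this is a sketch rather than the method used in the lecture.

```python
from itertools import product
from math import factorial


def ways_by_enumeration(n):
    """Count the branches with k heads, k = 0..n, by listing all 2**n outcomes."""
    counts = [0] * (n + 1)
    for flips in product("ht", repeat=n):
        counts[flips.count("h")] += 1
    return counts


def combinations_formula(n, k):
    """Number of combinations n! / (k! * (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


def pascal_row(n):
    """Row n of Pascal's triangle: each entry is the sum of the two above it."""
    row = [1]
    for _ in range(n):
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row


print(ways_by_enumeration(3))
print([combinations_formula(3, k) for k in range(4)])
print(pascal_row(3))
```

All three lines print the same counts for three flips: one way each for zero and three heads, three ways each for one and two heads.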
Three Flips of a Coin • [Bar chart: probability vs. # of heads, fair coin] • 0 heads: 1/8 • 1 head: 3/8 • 2 heads: 3/8 • 3 heads: 1/8
The Probability of Getting k Heads • The probability of getting k heads (along a given branch) in n trials is $p^k (1-p)^{n-k}$ • The number of branches with k heads in n trials is given by $C_n(k)$ • So the probability of k heads in n trials is $\mathrm{Prob}(k) = C_n(k)\, p^k (1-p)^{n-k}$ • This is the discrete binomial distribution, where k can only take on the discrete values 0, 1, …, n
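A minimal sketch of the binomial probability, assuming Python 3.8 or later for `math.comb`. The function name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Prob(k) = C_n(k) * p**k * (1-p)**(n-k)."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


# Three flips of a fair coin: probabilities for k = 0, 1, 2, 3 heads.
fair = [binomial_pmf(k, 3, 0.5) for k in range(4)]
print(fair)  # [0.125, 0.375, 0.375, 0.125]

# The probabilities over k = 0..n should sum to one for any p.
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
```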
Expected Value of a discrete random variable • $E(x) = \sum_i x_i \, p(x_i)$ • The expected value of a discrete random variable is the weighted average of the observations, where the weight is the frequency of that observation
Expected Value of the sum of random variables • E(x + y) = E(x) + E(y)
Expected Number of Heads After Two Flips • Flip One: $k_i^{I}$ heads • Flip Two: $k_j^{II}$ heads • Because of independence, $p(k_i^{I} \text{ and } k_j^{II}) = p(k_i^{I})\,p(k_j^{II})$ • Expected number of heads after two flips: $E(k_i^{I} + k_j^{II}) = \sum_i \sum_j (k_i^{I} + k_j^{II})\, p(k_i^{I})\, p(k_j^{II})$
Cont. • $E(k_i^{I} + k_j^{II}) = \sum_i \sum_j k_i^{I}\, p(k_i^{I})\, p(k_j^{II}) + \sum_i \sum_j k_j^{II}\, p(k_j^{II})\, p(k_i^{I})$ • Since $\sum_j p(k_j^{II}) = 1$ and $\sum_i p(k_i^{I}) = 1$, $E(k_i^{I} + k_j^{II}) = E(k_i^{I}) + E(k_j^{II}) = p \cdot 1 + p \cdot 1 = 2p$ • So the mean after n flips is $np$
Variance of a discrete random variable • $\mathrm{VAR}(x_i) = \sum_i [x_i - E(x)]^2 \, p(x_i)$ • The variance of a discrete random variable is the weighted sum of each observation minus its expected value, squared, where the weight is the frequency of that observation
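Cont. • $\mathrm{VAR}(x_i) = \sum_i \left[ x_i^2 - 2 x_i E(x) + [E(x)]^2 \right] p(x_i)$ • $\mathrm{VAR}(x_i) = \sum_i x_i^2 \, p(x_i) - 2 E(x) \sum_i x_i \, p(x_i) + [E(x)]^2 \sum_i p(x_i)$ • $\mathrm{VAR}(x_i) = E(x^2) - 2[E(x)]^2 + [E(x)]^2 = E(x^2) - [E(x)]^2$ • So the variance equals the second moment minus the first moment squared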
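The statement that the variance equals the second moment minus the first moment squared can be checked numerically. The sketch below uses the number of heads in three flips of a fair coin as the example distribution. The function names are illustrative assumptions.

```python
def expected_value(values, probs):
    """Probability-weighted average of the values."""
    return sum(x * q for x, q in zip(values, probs))


def variance_definition(values, probs):
    """Weighted sum of squared deviations from the expected value."""
    mean = expected_value(values, probs)
    return sum((x - mean) ** 2 * q for x, q in zip(values, probs))


def variance_moments(values, probs):
    """Second moment minus the first moment squared."""
    second = expected_value([x * x for x in values], probs)
    return second - expected_value(values, probs) ** 2


values = [0, 1, 2, 3]            # number of heads in three flips
probs = [1/8, 3/8, 3/8, 1/8]     # fair coin
print(variance_definition(values, probs))
print(variance_moments(values, probs))
```

Both functions should agree up to floating-point rounding.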
The variance of the sum of discrete random variables • $\mathrm{VAR}[x_i + y_j] = E[x_i + y_j - E(x_i + y_j)]^2$ • $\mathrm{VAR}[x_i + y_j] = E[(x_i - Ex_i) + (y_j - Ey_j)]^2$ • $\mathrm{VAR}[x_i + y_j] = E[(x_i - Ex_i)^2 + 2(x_i - Ex_i)(y_j - Ey_j) + (y_j - Ey_j)^2]$ • $\mathrm{VAR}[x_i + y_j] = \mathrm{VAR}[x_i] + 2\,\mathrm{COV}[x_i, y_j] + \mathrm{VAR}[y_j]$
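The variance of the sum if x and y are independent • $\mathrm{COV}[x_i, y_j] = E[(x_i - Ex_i)(y_j - Ey_j)]$ • $\mathrm{COV}[x_i, y_j] = \sum_i \sum_j (x_i - Ex_i)(y_j - Ey_j)\, p(x_i \text{ and } y_j)$ • By independence, $\mathrm{COV}[x_i, y_j] = \sum_i (x_i - Ex_i)\, p(x_i) \cdot \sum_j (y_j - Ey_j)\, p(y_j)$ • Each sum is zero, so $\mathrm{COV}[x_i, y_j] = 0$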
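A small numerical sketch of the independence argument: when the joint probability is the product of the marginals, the covariance sums to zero and the variance of the sum is the sum of the variances. The function names and the value p = 0.3 are assumptions for illustration.

```python
def moments(values, probs):
    """Return (mean, variance) of a discrete random variable."""
    mean = sum(v * q for v, q in zip(values, probs))
    var = sum((v - mean) ** 2 * q for v, q in zip(values, probs))
    return mean, var


def covariance_if_independent(xs, px, ys, py):
    """COV[x, y] with joint probability p(x)*p(y), i.e. assuming independence."""
    ex, _ = moments(xs, px)
    ey, _ = moments(ys, py)
    return sum((x - ex) * (y - ey) * qx * qy
               for x, qx in zip(xs, px)
               for y, qy in zip(ys, py))


def variance_of_sum_if_independent(xs, px, ys, py):
    """VAR[x + y] computed directly from the joint distribution p(x)*p(y)."""
    ex, _ = moments(xs, px)
    ey, _ = moments(ys, py)
    return sum((x + y - ex - ey) ** 2 * qx * qy
               for x, qx in zip(xs, px)
               for y, qy in zip(ys, py))


p = 0.3                      # assumed probability of heads
flip = [0, 1]                # number of heads on one flip
probs = [1.0 - p, p]
cov = covariance_if_independent(flip, probs, flip, probs)
var_sum = variance_of_sum_if_independent(flip, probs, flip, probs)
print(cov)                        # zero up to rounding
print(var_sum, 2 * p * (1 - p))   # variance of the sum vs 2p(1-p)
```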
Variance of the number of heads after two flips • Since we know the variance of the number of heads on the first flip is p*(1-p) • and ditto for the variance in the number of heads for the second flip • then the variance in the number of heads after two flips is the sum, 2p(1-p) • and the variance after n flips is np(1-p)
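The step from two flips to n flips can be illustrated by adding one independent flip at a time to the distribution of the number of heads, then computing the mean and variance. This is a sketch under the assumption p = 0.3, and the function names are hypothetical.

```python
def add_flip(dist, p):
    """Given dist[k] = Prob(k heads so far), add one more independent flip."""
    new = [0.0] * (len(dist) + 1)
    for k, prob in enumerate(dist):
        new[k] += prob * (1.0 - p)   # next flip is tails: k unchanged
        new[k + 1] += prob * p       # next flip is heads: k goes up by one
    return new


def heads_distribution(n, p):
    """Distribution of the number of heads after n independent flips."""
    dist = [1.0]                     # before any flip, k = 0 with certainty
    for _ in range(n):
        dist = add_flip(dist, p)
    return dist


def mean_and_variance(dist):
    mean = sum(k * prob for k, prob in enumerate(dist))
    var = sum((k - mean) ** 2 * prob for k, prob in enumerate(dist))
    return mean, var


p = 0.3  # assumed value for illustration
for n in (1, 2, 5, 10):
    mean, var = mean_and_variance(heads_distribution(n, p))
    print(n, mean, n * p, var, n * p * (1 - p))
```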
The Field Poll • In a sample of 731 people, 33% indicate they will vote for Bill Jones as Senator. The others are for Boxer (50%) or otherwise inclined (17% undecided or for others) • If the poll is an accurate reflection or subset of the population of voters on Nov. 2, what is the expected proportion that will vote for Jones? • How much uncertainty is in that expectation?
Field Poll • The estimated proportion, from the sample, that will vote for Jones is $\hat{p} = k/n$ • where $\hat{p}$ is 0.33 or 33% • k is the number of "successes", the number of people sampled who are for Jones, approximately 241 • n is the size of the sample, 731
Field Poll • What is the expected proportion of voters on Nov. 2 that will vote for Jones? • $E(\hat{p}) = E(k/n) = E(k)/n = np/n = p$, where from the binomial distribution, $E(k) = np$ • So if the sample is representative of voters and their preferences, 33% should vote for Jones on Nov. 2
Field Poll • How much dispersion is in this estimate, i.e. as reported in newspapers, what is the margin of sampling error? • The margin of sampling error is calculated as the standard deviation, or square root of the variance, in $\hat{p} = k/n$ • $\mathrm{VAR}(\hat{p}) = \mathrm{VAR}(k)/n^2 = np(1-p)/n^2 = p(1-p)/n$ • and using 0.33 as an estimate of p, • $\mathrm{VAR}(\hat{p}) = 0.33 \times 0.67/731 = 0.00030$
Field Poll • So the sampling error should be 0.017 or 1.7%, i.e. the square root of 0.00030 • The Field Poll reports a 95% confidence interval, or about two standard errors, i.e. 2*1.7% = 3.4%
Field Poll • Is it possible that Bill Jones could win? The estimate of 0.33, plus or minus twice the sampling error of 0.017, creates an interval of about 0.30 to 0.36. • Based on a normal approximation to the binomial, the true proportion voting for Jones should fall in this interval with probability of about 95%, unless sentiments change.
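The Field Poll arithmetic can be reproduced with a short sketch. The function name `proportion_interval` is hypothetical, k = 241 is the approximate count given above, and a multiplier of two standard errors is used for the interval, as in the slides.

```python
from math import sqrt


def proportion_interval(k, n, z=2.0):
    """Sample proportion, its standard error, and an interval of +/- z standard errors."""
    p_hat = k / n                              # estimated proportion
    se = sqrt(p_hat * (1.0 - p_hat) / n)       # square root of p(1-p)/n
    return p_hat, se, (p_hat - z * se, p_hat + z * se)


# k = 241 supporters of Jones out of n = 731 people sampled.
p_hat, se, (low, high) = proportion_interval(241, 731)
print(round(p_hat, 3), round(se, 3))
print(round(low, 3), round(high, 3))
```

Since the whole interval lies well below 0.5, the sample gives little support for a Jones win, subject to the caveat that sentiments may change.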