Estimation Procedures and Hypothesis Testing: Clint's Dilemma

Lecture 4 Preview: Estimation Procedures, Estimates, and Hypothesis Testing
Review: Clint’s Dilemma and Estimation Procedures
  Clint’s Opinion Poll and His Dilemma
  Clint’s Estimation Procedure: The General and the Specific
  Taking Stock and Our Strategy to Assess the Reliability of Clint’s Poll Results: Use the General Properties of the Estimation Procedure to Assess the Reliability of the One Specific Application
  Importance of the Mean (Center) of the Estimate’s Probability Distribution
  Importance of the Variance (Spread) of the Estimate’s Probability Distribution for an Unbiased Estimation Procedure
Hypothesis Testing
  Motivating Hypothesis Testing – Evidence and the Cynic
  Formalize Hypothesis Testing – Five Steps
  Significance Levels and Standards of Proof
  Type I and Type II Errors: The Tradeoffs

Clint’s Dilemma Revisited
On the eve of the election, Clint must decide whether or not to hold a pre-election party:
  If he is comfortably ahead, he will not hold the party; he will save his campaign funds for a future political endeavor (or a vacation to the Caribbean in January).
  If he is not comfortably ahead, he will hold the party to try to capture more votes.
There is not enough time to canvass everyone, however. What should he do?
Econometrician’s Philosophy: If you lack the information to determine the value directly, estimate the value to the best of your ability using the information you do have.
Clint’s Estimation Procedure
  Questionnaire: Are you voting for Clint?
  Procedure: Clint selects 16 students at random.
  Results: 12 students report that they will vote for Clint and 4 against Clint.
  Estimated fraction of the population supporting Clint: EstFrac = 12/16 = .75
Clint uses the information collected from the sample to draw inferences about the entire population. Seventy-five percent, .75, of the sample support Clint. This poll suggests that Clint leads.
Question: Should Clint be confident that he has the election in hand, or should he fund the party?

Clint’s Estimation Procedure: General and Specific
General Properties versus One Specific Application
Clint’s Estimation Procedure: Calculate the fraction of the 16 randomly selected students supporting Clint.
  vi = 1 if individual i answers Yes
     = 0 if individual i answers No
  EstFrac = (v1 + v2 + ... + v16)/16
Apply the estimation procedure once to Clint’s sample of the 16 randomly selected students:
  Before poll: EstFrac is a random variable, described by its probability distribution.
  After poll: EstFrac is an estimate, a numerical value. EstFrac = .75
How reliable is EstFrac?
  Mean[EstFrac] = p = ActFrac = Actual fraction of the population supporting Clint
  Var[EstFrac] = p(1 - p)/T, where T = Sample Size
Strategy: Use the general properties of the estimation procedure to assess the reliability of Clint’s estimate.
Mean and variance describe the center and spread of the estimate’s probability distribution.
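The mean and variance of the estimate’s probability distribution can be computed directly for any assumed population fraction. The sketch below is only an illustration: the function names are hypothetical, and it assumes the standard variance of a sample fraction, p(1 - p)/T, which agrees with the SD of .125 used later for p = .50 and T = 16.

```python
import math


def est_frac_mean(p: float) -> float:
    """Mean of EstFrac; equals the actual population fraction p."""
    return p


def est_frac_variance(p: float, sample_size: int) -> float:
    """Variance of EstFrac, assuming Var[EstFrac] = p(1 - p)/T."""
    return p * (1 - p) / sample_size


def est_frac_sd(p: float, sample_size: int) -> float:
    """Standard deviation of EstFrac: square root of the variance."""
    return math.sqrt(est_frac_variance(p, sample_size))


if __name__ == "__main__":
    T = 16  # Clint's sample size
    # p is unknown in practice; these values are hypothetical inputs.
    for p in (0.50, 0.75):
        print(p, est_frac_mean(p), est_frac_variance(p, T), est_frac_sd(p, T))
```

The variance falls as the sample size T rises, so a larger poll gives an estimate whose distribution is more tightly concentrated around p.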

Question: Why is the mean of the estimate’s probability distribution important?
A mean describes the center of its probability distribution.
Conceptually, an estimation procedure is unbiased whenever it does not systematically underestimate or overestimate the actual population fraction.
Formally, an estimation procedure is unbiased whenever the mean of the estimated fraction’s probability distribution equals the actual population fraction.
[Figure: Probability Distribution of EstFrac, centered at Mean[EstFrac] = ActFrac; horizontal axis: EstFrac]
Relative Frequency Interpretation of Probability: Average of the estimate’s numerical values after many, many repetitions = Mean[EstFrac]
Unbiased Estimation Procedure: Mean[EstFrac] = ActFrac
Hence: Average of the estimate’s numerical values after many, many repetitions = ActFrac
This provides us with some intuition. If the probability distribution is symmetric, we have even more intuition: in one poll, the chances that the estimated fraction is too low equal the chances that the estimated fraction is too high.
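The relative frequency interpretation can be illustrated with a small simulation: repeat the poll many times and average the estimates. This is a minimal sketch, not the course’s simulation software. The actual fraction of .60, the number of repetitions, and the function names are all assumptions chosen for illustration.

```python
import random


def one_poll(act_frac: float, sample_size: int, rng: random.Random) -> float:
    """Poll sample_size random individuals; return the estimated fraction."""
    yes_votes = sum(1 for _ in range(sample_size) if rng.random() < act_frac)
    return yes_votes / sample_size


def average_est_frac(act_frac: float, sample_size: int,
                     repetitions: int, seed: int = 0) -> float:
    """Average of EstFrac over many repetitions of the poll."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(one_poll(act_frac, sample_size, rng) for _ in range(repetitions))
    return total / repetitions


if __name__ == "__main__":
    # ActFrac = .60 is a hypothetical value; Clint does not know the real one.
    avg = average_est_frac(act_frac=0.60, sample_size=16, repetitions=20000)
    print("Average of EstFrac over many repetitions:", avg)
```

With an unbiased procedure, the printed average should sit very close to the assumed ActFrac, even though any single poll can miss by a wide margin.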

Question: Why is the variance of the estimate’s probability distribution important?
How confident should Clint be that his estimate is close to the actual population fraction? Since the estimation procedure is unbiased, the answer to this question depends on the variance of the estimate’s probability distribution.
Variance large: Small probability that the numerical value of the estimated fraction, EstFrac, from one repetition of the experiment will be close to the actual population fraction, ActFrac. Estimate is unreliable.
Variance small: Large probability that the numerical value of the estimated fraction, EstFrac, from one repetition of the experiment will be close to the actual population fraction, ActFrac. Estimate is reliable.
[Figure: Probability Distributions of EstFrac, two panels (Variance large, Variance small), each centered at ActFrac; horizontal axis: EstFrac]
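The link between variance and reliability can be made concrete by computing the probability that one poll lands close to the actual fraction. The sketch below assumes each response is an independent draw, so the number of supporters follows a binomial distribution. The tolerance of .10, the comparison sample size of 100, and the function name are hypothetical choices for illustration.

```python
from math import comb


def prob_close(act_frac: float, sample_size: int, tolerance: float) -> float:
    """Probability that EstFrac falls within tolerance of ActFrac in one poll."""
    total = 0.0
    for yes_votes in range(sample_size + 1):
        # Compare in vote counts; the tiny epsilon guards against rounding.
        distance = abs(yes_votes - act_frac * sample_size)
        if distance <= tolerance * sample_size + 1e-9:
            total += (comb(sample_size, yes_votes)
                      * act_frac ** yes_votes
                      * (1 - act_frac) ** (sample_size - yes_votes))
    return total


if __name__ == "__main__":
    # A larger sample has a smaller variance, p(1 - p)/T.
    for T in (16, 100):
        print(T, prob_close(act_frac=0.50, sample_size=T, tolerance=0.10))
```

The smaller-variance case (the larger sample) yields the higher probability of being close to ActFrac, which is what is meant by a reliable estimate.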

Motivating Hypothesis Testing
Thomas Jefferson and Sally Hemings: From American Sphinx: The Character of Thomas Jefferson by Joseph J. Ellis (p. 21): “The results, published in the prestigious scientific magazine Nature … showed a match between Jefferson and Eston Hemings, Sally’s last child. The chances of such a match occurring randomly are less than one in a thousand.”
Clint’s Evidence: 12 of the 16 individuals polled support Clint, suggesting that he is leading: seventy-five percent, .75, of the population polled support Clint.
Cynic’s View: Despite the evidence, the election is actually a tossup. “Sure, seventy-five percent of those polled supported Clint, but the election is actually a tossup. The fact that seventy-five percent of those polled supported Clint was just the luck of the draw.”
Lab 4.1
Question: Could the cynic be correct?
Yes, our Opinion Poll simulation shows us that it is indeed possible for the cynic to be correct.
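The cynic’s claim can be explored with a simulation in the spirit of the lab: assume the election really is a tossup, repeat the poll of 16 many times, and count how often at least 75 percent of the sample supports Clint. This is a rough sketch and not the lab’s Opinion Poll simulation itself; the function name, seed, and number of repetitions are assumptions.

```python
import random


def share_of_polls_at_least(threshold: float, act_frac: float,
                            sample_size: int, repetitions: int,
                            seed: int = 1) -> float:
    """Fraction of simulated polls whose EstFrac is at least threshold."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repetitions):
        yes_votes = sum(rng.random() < act_frac for _ in range(sample_size))
        if yes_votes / sample_size >= threshold:
            hits += 1
    return hits / repetitions


if __name__ == "__main__":
    # Cynic's world: ActFrac = .50, yet some polls still show .75 or more.
    share = share_of_polls_at_least(0.75, act_frac=0.50,
                                    sample_size=16, repetitions=20000)
    print("Share of tossup polls with EstFrac >= .75:", share)
```

A share above zero confirms that the luck of the draw can produce Clint’s result even in a tossup; how small that share is matters for judging the cynic’s view.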

Assessing the Cynic’s View: Despite the evidence, the election is actually a tossup.
Question for the Cynic: What is the probability that the fraction supporting Clint would equal .75 or more in one poll of 16 individuals, if the cynic is correct (that is, if the election is actually a tossup and the actual fraction of the population supporting Clint equals .50)?
Answer: Prob[Results IF Cynic Correct]
  Prob[Results IF Cynic Correct] small -> Unlikely the cynic is correct -> Unlikely that the election is a tossup -> Suggestion: Clint leads
  Prob[Results IF Cynic Correct] large -> Likely the cynic is correct -> Likely that the election is a tossup -> Suggestion: Election a tossup

Estimating Prob[Results IF Cynic Correct]: Using the Normal Distribution
Question for the Cynic: What is the probability that the fraction supporting Clint would equal .75 or more in one poll of 16 individuals, if the cynic were correct (that is, if the election is actually a tossup and the actual fraction of the population supporting Clint, ActFrac or p, is .50)?
General Properties of the Estimation Procedure: Estimate’s Probability Distribution
IF Cynic Correct: p = .50
  Mean[EstFrac] = ActFrac = p = .50
  Var[EstFrac] = p(1 - p)/T = (.50)(.50)/16 = 1/64
  SD[EstFrac] = 1/8 = .125
The normal distribution’s rules of thumb:
  Standard Deviations from Mean    Probability of Random Variable Being Within
  1                                .68
  2                                .95
  3                                >.99
[Figure: Normal distribution of EstFrac with Mean = .50 and SD = .125; the area above .75, which lies 2 SD’s above the mean, is .025]
The result of Clint’s poll, .75, is 2 standard deviations above the mean. The probability of being within 2 standard deviations of the mean is .95. Since the normal distribution is symmetric, the probability of being 2 or more standard deviations above the mean is (1 - .95)/2 = .025.
Prob[Results IF Cynic Correct] ≈ .025, or 1 chance in 40.
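The rule-of-thumb figure can be compared with the normal distribution’s tail area computed directly. The sketch below uses Python’s standard-library NormalDist; the variable names are hypothetical. The rule of thumb rounds the two-standard-deviation probability to .95, so the directly computed tail comes out slightly below .025 (about .023).

```python
from statistics import NormalDist

# If the cynic is correct, p = .50 and the poll has T = 16 respondents.
p_if_cynic_correct = 0.50
sample_size = 16

mean_est_frac = p_if_cynic_correct
sd_est_frac = (p_if_cynic_correct * (1 - p_if_cynic_correct) / sample_size) ** 0.5

# Approximate EstFrac's probability distribution with a normal distribution.
est_frac_dist = NormalDist(mu=mean_est_frac, sigma=sd_est_frac)

# Probability that EstFrac is .75 or more, i.e. 2 or more SDs above the mean.
prob_results_if_cynic_correct = 1 - est_frac_dist.cdf(0.75)

print("SD[EstFrac]:", sd_est_frac)
print("Prob[Results IF Cynic Correct]:", round(prob_results_if_cynic_correct, 4))
```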

Formalizing Hypothesis Testing
Step 1: Collect Evidence – Conduct the Poll
  Clint polls 16 students chosen randomly and finds that 12 of them support him; that is, the fraction of the sample supporting Clint is .75 or seventy-five percent: Estimated Fraction = .75
  The evidence suggests that Clint is ahead.
Step 2: Play the cynic and challenge the evidence; construct the null and alternative hypotheses.
  Cynic’s view: Despite the evidence, the election is actually a tossup.
  p = ActFrac
  H0: p = .50  Election is a tossup; the cynic is correct.
  H1: p > .50  Clint leads; the evidence accurately portrays reality.
  H0 is called the null hypothesis; the null hypothesis challenges the evidence.
  H1 is called the alternative hypothesis; the alternative hypothesis is consistent with the evidence.
Step 3: Formulate the question to assess the null hypothesis and the cynic’s view.
  Generic Question: What is the probability that the result would be like the one obtained (or even stronger), if H0 is true (if the cynic is correct and the election is actually a tossup)?
  Specific Question: The estimated fraction was .75 in the poll of 16 individuals: What is the probability that .75 or more of the 16 individuals polled would support Clint if H0 is true (if the cynic is correct and the actual population fraction equaled .50)?
  Answer: Prob[Results IF Cynic Correct] or Prob[Results IF H0 True]
    Prob[Results IF H0 True] small -> Unlikely that H0 is true -> Reject H0
    Prob[Results IF H0 True] large -> Likely that H0 is true -> Do not reject H0

p = ActFrac
H0: p = .50  Election is a tossup; cynic is correct
H1: p > .50  Clint leads; cynic is incorrect
Step 4: Use general properties of the estimation procedure to calculate Prob[Results IF H0 True].
  Numerical value of EstFrac was .75: What is the probability that .75 or more of the 16 individuals polled would support Clint, if H0 were true: if ActFrac, p, equaled .50?
  Estimation procedure is unbiased; if H0 is true: Mean[EstFrac] = ActFrac = p = .50
  Equation for variance; if H0 is true: Var[EstFrac] = p(1 - p)/T = (.50)(.50)/16 = 1/64
  SD[EstFrac] = 1/8 = .125
  [Figure: Normal distribution of EstFrac with Mean = .50 and SD = .125; the area above .75 is .025]
  Prob[Results IF H0 True] ≈ .025, or 1 chance in 40
Question: What should Clint do?
Answer: He must decide whether he considers this probability to be large or small. Step 5 formalizes this.
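As a cross-check on the normal approximation, the same probability can be computed exactly by counting outcomes, since a poll of 16 yes/no answers has only 17 possible results. This is an added sketch rather than part of the lecture’s procedure; it assumes independent responses (a binomial distribution), and the function name is hypothetical. The exact value, 2517/65536 or roughly .038, is somewhat larger than the normal approximation but still falls between .01 and .05.

```python
from math import comb


def prob_at_least(yes_votes: int, sample_size: int, p: float) -> float:
    """Exact probability of yes_votes or more supporters in the sample."""
    return sum(comb(sample_size, k) * p ** k * (1 - p) ** (sample_size - k)
               for k in range(yes_votes, sample_size + 1))


if __name__ == "__main__":
    # EstFrac of .75 or more in a poll of 16 means 12 or more supporters.
    exact_prob = prob_at_least(12, sample_size=16, p=0.50)
    print("Exact Prob[Results IF H0 True]:", exact_prob)
```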

p = ActFrac
H0: p = .50  Cynic is correct: Election is a tossup.
H1: p > .50  Cynic is incorrect: Clint is leading.
Prob[Results IF H0 True] ≈ .025
Step 5: Decide on the standard of proof, a significance level.
  The significance level is the dividing line between the probability being small and the probability being large.
  Significance Level = 5 or 10 percent (.05 or .10):
    Prob[Results IF H0 True] Less Than Significance Level -> Prob[Results IF H0 True] small -> Unlikely that H0 is true -> Reject H0 -> Suggestion: Clint leads
  Significance Level = 1 percent (.01):
    Prob[Results IF H0 True] Greater Than Significance Level -> Prob[Results IF H0 True] large -> Likely that H0 is true -> Do not reject H0 -> Suggestion: Election a tossup
In academia, the traditional, most frequently used significance levels are 1, 5, and 10 percent.
  5 or 10 percent, .05 or .10, significance level: Reject H0, suggesting that Clint leads.
  1 percent, .01, significance level: Do not reject H0, suggesting that the election is a tossup.
NB: While 1, 5, and 10 percent are the traditional choices, there is nothing sacred about them.
Claim: Our choice of a significance level implicitly reveals our “standard of proof.”
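The Step 5 decision rule amounts to a single comparison. A minimal sketch follows, with a hypothetical function name, applying the rule to Clint’s probability of .025 at the three traditional significance levels.

```python
def decide(prob_results_if_h0_true: float, significance_level: float) -> str:
    """Reject H0 when the probability is below the significance level."""
    if prob_results_if_h0_true < significance_level:
        return "Reject H0"
    return "Do not reject H0"


if __name__ == "__main__":
    prob = 0.025  # Prob[Results IF H0 True] from Step 4
    for level in (0.10, 0.05, 0.01):
        print(f"Significance level {level}: {decide(prob, level)}")
```

The same evidence leads to different conclusions at different significance levels, which is why the choice of level reveals the standard of proof being applied.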

Standard of Proof and Significance Levels
H0: p = .50  Cynic is correct; Election tossup
H1: p > .50  Cynic is incorrect; Clint leads
Prob[Results IF H0 True] ≈ .025
Significance Level: Dividing line between small and large.
  Significance Level = 5 Percent = .05:
    Prob[Results IF H0 True] Less Than Significance Level -> Prob[Results IF H0 True] small -> Unlikely that H0 is true -> Reject H0 -> Suggestion: Clint leads
  Significance Level = 1 Percent = .01:
    Prob[Results IF H0 True] Greater Than Significance Level -> Prob[Results IF H0 True] large -> Likely that H0 is true -> Do not reject H0 -> Suggestion: Election a tossup
Key Point: Lower Significance Level -> More Difficult To Reject H0 -> More Difficult To Conclude Clint Leads -> Higher Standard of Proof

How Should We Choose the Significance Level, the Standard of Proof?
Police charge a 17-year-old male with a serious crime. They have strong, but not irrefutable, evidence against him. Suppose that you are a member of the jury.
Cynic’s View: Young man is innocent.
Translate this into hypothesis testing language:
  H0: Cynic is correct; Defendant innocent
  H1: Cynic is incorrect; Defendant guilty
  Jury Finds Guilty: Prob[Results IF H0 True] Less Than Significance Level -> Reject H0
  Jury Finds Innocent: Prob[Results IF H0 True] Greater Than Significance Level -> Do not reject H0
Four Possible Scenarios:
                                   Jury Finds Guilty          Jury Finds Innocent
                                   (Reject H0)                (Do not reject H0)
  Defendant Actually Innocent      Type I Error:              Correct:
  (H0 Actually True)               Imprison innocent man      Free innocent man
  Defendant Actually Guilty        Correct:                   Type II Error:
  (H0 Actually False)              Imprison guilty man        Free guilty man
Question: How do we decide to reject H0 or not?
Answer: Compare Prob[Results IF H0 True] and the Significance Level.

How Should We Choose the Significance Level, the Standard of Proof?
H0: Cynic is correct; Defendant innocent
H1: Cynic is incorrect; Defendant guilty
Four Possible Scenarios:
                                   Jury’s Verdict: Guilty     Jury’s Verdict: Innocent
                                   (Reject H0)                (Do not reject H0)
  Defendant Actually Innocent      Type I Error:              Correct:
  (H0 Actually True)               Imprison innocent man      Free innocent man
  Defendant Actually Guilty        Correct:                   Type II Error:
  (H0 Actually False)              Imprison guilty man        Free guilty man
Choice of significance level: Relative costs of Type I and Type II error are crucial.
  Costs of Type I Error: Unfairly imprison innocent man
  Costs of Type II Error: Free criminal who can commit more crimes
Penalties for finding a defendant guilty as an adult are harsher.
  Try defendant as an adult rather than a juvenile -> Costs of Incarcerating an Innocent Man Become Greater -> More Difficult to Find the Defendant Guilty -> Higher Standard of Proof
Translate this into hypothesis testing language:
  Try defendant as an adult rather than a juvenile -> Costs of Type I Relative to Type II Error Higher -> More Difficult to Reject H0 -> Lower Significance Level
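The tradeoff between the two kinds of error can be illustrated numerically with Clint’s poll in place of the trial. The sketch below rests on several assumptions that are not in the lecture: responses follow a binomial distribution, H0 is rejected whenever the number of supporters reaches a chosen cutoff, and the Type II error probability is evaluated at a hypothetical actual fraction of .75. The function names are hypothetical as well.

```python
from math import comb


def prob_at_least(cutoff: int, sample_size: int, p: float) -> float:
    """Exact probability of cutoff or more supporters in the sample."""
    return sum(comb(sample_size, k) * p ** k * (1 - p) ** (sample_size - k)
               for k in range(cutoff, sample_size + 1))


def error_probabilities(cutoff: int, sample_size: int = 16,
                        p_null: float = 0.50, p_alt: float = 0.75):
    """Return (Type I, Type II) error probabilities for a rejection cutoff.

    Type I: reject H0 although H0 is true (p = p_null).
    Type II: do not reject H0 although H0 is false (here, p = p_alt).
    """
    type_1 = prob_at_least(cutoff, sample_size, p_null)
    type_2 = 1 - prob_at_least(cutoff, sample_size, p_alt)
    return type_1, type_2


if __name__ == "__main__":
    # A higher cutoff corresponds to a lower significance level.
    for cutoff in range(10, 15):
        type_1, type_2 = error_probabilities(cutoff)
        print(f"Reject H0 at {cutoff}+ supporters: "
              f"Type I = {type_1:.4f}, Type II = {type_2:.4f}")
```

Raising the cutoff makes it harder to reject H0, so the chance of a Type I error falls while the chance of a Type II error rises. Which balance is appropriate depends on the relative costs of the two errors.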
