Hypothesis Testing

Hypothesis Testing After 2 hours of frustration trying to fill out an IRS form, you are skeptical about the IRS claim that the form takes 15 minutes on average to complete. How would you challenge the IRS claim?

Methods to Test a Claim • One method would be to take a random sample, find a confidence interval for the average amount of time to fill out the form, and then determine whether the interval suggests an average different from 15 minutes. • Another method would be to conduct a hypothesis test.

The Null Hypothesis • Is denoted by H0 • It is the statement that is under investigation or being tested. It asserts there is no change. • For the IRS form, the null hypothesis is: H0: μ = 15 minutes

The Alternative Hypothesis • Is denoted by H1 • This is the statement you will adopt when the evidence (data) is so strong that you reject the null hypothesis. • For the IRS form, the alternative hypothesis could be: H1: μ > 15 minutes

Three Types of Tests • Left-tailed Tests: H0: μ = k; H1: μ < k • Right-tailed Tests: H0: μ = k; H1: μ > k • Two-tailed Tests: H0: μ = k; H1: μ ≠ k

Type of Test to Use • This depends on what you suspect. For the IRS form, you suspected the mean time was greater than claimed, so you would lean toward a right-tailed test. • If you suspect that the average length of time you get from your phone battery is less than claimed, you would use a left-tailed test. • Or perhaps you wish to test the average chlorine level in a pool (too high is harmful, and too low is not sanitary), so you might test whether it is different from the target (two-tailed test).

Data to Collect • You will collect information similar to that collected for confidence intervals. • If the distribution is not normal, you will need a sample size of at least 30 to test the mean. • If the population standard deviation is known, you will use z; if it is not known, you may use t, especially if the sample is not very large.

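Test Statistic • If the distribution is normal (or the sample size is at least 30) and the population standard deviation σ is known, then the test statistic is z = (x̄ − μ) / (σ / √n), where x̄ is the sample mean, μ is the mean stated in H0, and n is the sample size.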
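As an illustration, the usual z statistic for a mean, z = (x̄ − μ) / (σ / √n), can be computed in a few lines of Python. This is only a sketch: the function name `z_statistic` and the IRS-form numbers (36 filers, a sample mean of 17 minutes, σ = 6 minutes) are made up for the example and do not come from the slides.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors the sample mean lies from the hypothesized mean mu0."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - mu0) / standard_error

# Hypothetical data (not from the slides): 36 filers, mean 17 minutes, sigma = 6.
z = z_statistic(sample_mean=17.0, mu0=15.0, sigma=6.0, n=36)
print(round(z, 2))
```

With these invented numbers the standard error is 1 minute, so the sample mean sits 2 standard errors above the claimed 15 minutes.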

P-values • Assuming H0 is true, the probability that the test statistic will take on values as or more extreme than the observed test statistic (computed from the sample data) is called the P-value of the test. • The smaller the P-value, the stronger the evidence against H0.
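The definition above can be turned into a small calculation once the test statistic is a z value. The sketch below uses `NormalDist` from Python's standard library. The helper name `p_value_from_z` and its `tail` argument are assumptions made for this example: "as or more extreme" means the left tail for a left-tailed test, the right tail for a right-tailed test, and both tails for a two-tailed test.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """P-value of an observed z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)                 # P(Z <= z)
    if tail == "right":
        return 1 - std_normal.cdf(z)             # P(Z >= z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")

# The same observed statistic gives a different P-value for each type of test.
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(2.0, tail), 4))
```

For z = 2.0 the right-tailed P-value is about 0.023 and the two-tailed P-value is twice that, which is why the choice of H1 has to be made before looking at the data.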

Types of Errors • A Type I error occurs if we reject the null hypothesis when it is true. • A Type II error occurs if we fail to reject the null hypothesis when it is false. • A Type I error is analogous to convicting an innocent person of a crime they didn’t commit. • A Type II error is analogous to failing to convict a guilty person.

Level of Significance • The level of significance α is the probability of rejecting the null hypothesis when it is true, that is, the probability of a Type I error. • A common level of significance is .05 (that means that if the null hypothesis is true, there is only a 5% chance that we will wrongly reject it). • We will reject the null hypothesis if P-value ≤ α. • If P-value > α, we do not reject the null hypothesis. (Courts do not prove people innocent, they fail to convict them, so failing to reject the null hypothesis doesn’t mean it is true.)
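One way to see what the level of significance means is to simulate many samples from a population in which H0 really is true and count how often the test rejects it. The following is a rough sketch, not a proof: the population (mean 15, standard deviation 4), the sample size, the number of trials, and the function name `simulate_type_i_rate` are all invented for the illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def simulate_type_i_rate(alpha=0.05, n=25, trials=20000, seed=1):
    """Fraction of right-tailed z tests that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    mu0, sigma = 15.0, 4.0  # hypothetical population; H0: mu = 15 holds here
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        p_value = 1 - std_normal.cdf(z)  # right-tailed P-value
        if p_value <= alpha:
            rejections += 1              # a Type I error
    return rejections / trials

rate = simulate_type_i_rate()
print(rate)
```

The printed fraction should land close to 0.05: about 5% of the samples lead to a wrongful rejection, which is the Type I error rate the level of significance controls.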

Summary of Hypothesis Tests • Determine the null and alternative hypotheses and set the level of significance α. • Collect the data and compute the test statistic. • Compute the P-value. • If P-value ≤ α, then reject H0. If P-value > α, then do not reject H0.

Further Remarks • Roughly, the P-value measures how rare your test statistic would be if the hypothesized mean were correct. A small P-value indicates a rare event under the hypothesis. Rather than concluding that a rare event occurred, the more plausible conclusion is that the hypothesis is not correct. • The basic principle for rareness is that measurements many standard deviations away from the mean are rare.

9.1#12: P/E of Bank Stocks • Is the price-to-earnings ratio of US bank stocks less than the S&P 500 mean of 19? A random sample of 14 such bank stocks had a sample mean of 17.1. Assume the P/E ratios of US bank stocks are normally distributed with a standard deviation of 4.5. Conduct a left-tailed test at a 5% level of significance to determine if the mean P/E of US bank stocks is less than 19.
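A sketch of how this exercise could be worked numerically, using only the figures stated in the problem (H0: μ = 19, H1: μ < 19, σ = 4.5, n = 14, sample mean 17.1, level of significance .05). The variable names are chosen for this example.

```python
import math
from statistics import NormalDist

# Values taken from the exercise statement.
x_bar, mu0, sigma, n, alpha = 17.1, 19.0, 4.5, 14, 0.05

z = (x_bar - mu0) / (sigma / math.sqrt(n))  # test statistic
p_value = NormalDist().cdf(z)               # left-tailed test: P(Z <= z)
reject = p_value <= alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Here z is about −1.58 and the P-value is about 0.057, slightly above .05, so H0 is not rejected: at the 5% level the sample does not give sufficient evidence that the mean P/E of bank stocks is below 19.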

9.1#13: Hail Damage • Nationally, approximately 11% of the total U.S. wheat crop is destroyed by hail each year. A random sample of 16 such claims in Weld County, Colorado, found an average of 12.5% of crops destroyed by hail. Assume the percentage of crops destroyed by hail in Weld County is normally distributed with a standard deviation of 5.0%. Do these data indicate that the Weld County average is different from the national average (in either direction)? Test at α = .01.
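This exercise can be sketched the same way, again using only the numbers given in the problem (H0: μ = 11, H1: μ ≠ 11, σ = 5.0, n = 16, sample mean 12.5, level of significance .01). Because the test is two-tailed, the tail area is doubled.

```python
import math
from statistics import NormalDist

# Values taken from the exercise statement (all in percent).
x_bar, mu0, sigma, n, alpha = 12.5, 11.0, 5.0, 16, 0.01

z = (x_bar - mu0) / (sigma / math.sqrt(n))     # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed: both tails beyond |z|
reject = p_value <= alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

The statistic is z = 1.20 with a P-value of about 0.23, far above .01, so H0 is not rejected: these data do not indicate that the Weld County average differs from the national 11%.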

Hypothesis Tests when σ is unknown • Follow the same procedures as before. If the distribution is not approximately normal, then the sample size must be at least 30. • Except use the t-distribution with d.f. = n − 1, and the test statistic will be t = (x̄ − μ) / (s / √n), where s is the sample standard deviation.
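A minimal sketch of a t test for the IRS example, assuming the usual statistic t = (x̄ − μ) / (s / √n) with d.f. = n − 1. The ten completion times are invented for illustration. Because Python's standard library has no t-distribution, the helper `t_cdf` below approximates the area under the t density with Simpson's rule. That helper is a stand-in written for this example. In practice a statistics package or a t table would be used.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Hypothetical completion times in minutes (not from the slides).
minutes = [18, 22, 14, 25, 19, 16, 21, 17, 23, 20]
mu0, alpha = 15.0, 0.05

n = len(minutes)
df = n - 1
t_stat = (mean(minutes) - mu0) / (stdev(minutes) / math.sqrt(n))
p_value = 1 - t_cdf(t_stat, df)  # right-tailed test, H1: mu > 15

print(f"t = {t_stat:.2f}, d.f. = {df}, P-value = {p_value:.4f}")
```

For these made-up times the statistic is roughly 4.2 on 9 degrees of freedom and the P-value is well under .05, so H0: μ = 15 would be rejected for this invented sample.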

Critical Region Method • As with the previous method for hypothesis tests, determine H0, H1 and α. • Instead of waiting to compute the P-value and compare it to α, you predetermine the critical region, that is, the values of the test statistic at which you will reject H0. • Then compute the test statistic; if it is in the critical region, reject H0, otherwise do not reject H0.
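For a z test, the critical region can be found from the inverse of the standard normal distribution. The sketch below is one possible way to code it, with a hypothetical helper `in_critical_region`. The z values −1.58 and 1.20 are the rounded statistics obtained from the data in the bank-stock and hail-damage exercises.

```python
from statistics import NormalDist

def in_critical_region(z, alpha, tail):
    """True if z falls in the rejection region of a z test at level alpha."""
    std_normal = NormalDist()
    if tail == "left":
        return z <= std_normal.inv_cdf(alpha)               # e.g. z <= -1.645
    if tail == "right":
        return z >= std_normal.inv_cdf(1 - alpha)           # e.g. z >= 1.645
    if tail == "two":
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)  # e.g. |z| >= 1.96
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Bank stocks: left-tailed test at alpha = .05, z about -1.58.
print(in_critical_region(-1.58, 0.05, "left"))
# Hail damage: two-tailed test at alpha = .01, z = 1.20.
print(in_critical_region(1.20, 0.01, "two"))
```

Both calls return False, so neither statistic falls in its critical region. This matches the P-value approach, since the two methods always lead to the same decision for the same α.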
