SESSION 41 & 42: Hypothesis Testing
Last update: 12th May 2011

Learning Objectives
- Null and Alternative Hypothesis
- Type I and Type II Error
- Rejection Region Method
- Hypothesis Testing: Student-t
- Goodness-of-Fit: Chi-Squared Statistic

Stating Hypotheses
Two hypotheses must be stated to conduct a hypothesis test. The first is called the null hypothesis and is assumed to be the default statement. The opposing hypothesis is known as the alternative hypothesis. The notations are H0 and H1. A typical example involves criminal trials:
H0 : The defendant is innocent.
H1 : The defendant is guilty.
A jury must make a decision based on the evidence presented by the prosecution and the defense. In a similar vein, evidence from a sample statistic may be used to reject, or fail to reject, a particular hypothesis in statistical problems.
In statistical parlance:
Convicting a defendant is equivalent to rejecting the null hypothesis in favor of the alternative hypothesis.
Acquitting a defendant is equivalent to not rejecting the null hypothesis in favor of the alternative hypothesis.
A Type I error occurs when we reject a true null hypothesis (e.g. convicting an innocent defendant). Its probability is denoted α.
A Type II error occurs when we do not reject a false null hypothesis (e.g. acquitting a guilty defendant). Its probability is denoted β.
The error probabilities are inversely related: for a fixed sample size, any attempt to reduce one will increase the other.
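The trade-off between α and β can be illustrated numerically. The following is a minimal sketch, not part of the original notes: it assumes a one-sided (upper-tail) z-test, and the hypothesized mean of 60, true mean of 62, σ = 10 and n = 100 are illustrative assumptions, as is the helper name `type_ii_error`.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Return beta for an upper-tail z-test of H0: mu = mu0 (sigma known)."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # Smallest sample mean that leads to rejecting H0 at level alpha.
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(sample mean falls below the cutoff | true mean is mu_true).
    return NormalDist(mu_true, se).cdf(x_crit)


# Assumed illustrative values: mu0 = 60, true mean = 62, sigma = 10, n = 100.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=60, mu_true=62, sigma=10, n=100)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

With the sample size held fixed, lowering α pushes the cutoff further from the hypothesized mean, so a false null hypothesis goes undetected more often and β rises.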
Example computer assembly:
A random sample of n = 100 computers is selected. The average assembly time is X-bar = 63 minutes. The population standard deviation is assumed to be known and equal to σ = 10. Is there sufficient evidence to infer that the mean assembly time of the entire population is more than 60 minutes?
H0 : μ = 60
H1 : μ > 60
Note that some textbooks allow for weak inequalities (≥, ≤) in the null hypothesis (here, H0 : μ ≤ 60).
The rejection region is a range of values such that if the test statistic falls into that range, the null hypothesis is rejected in favor of the alternative hypothesis. For a one-sided (upper-tail) test such as this one, the rejection region is:
$$ z > z_{\alpha} $$
Thus, in order to test whether a sample statistic falls into the rejection region, it has to be converted into a Z-score first.
The conversion is defined as
$$ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} $$
Note that μ is the hypothesized population mean.
We also need the critical value zα. Normally, a 95% confidence level is used. Thus, α = 1 – CL = 0.05. Using the normal probability table, this corresponds to z0.05 = 1.645.
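The table lookup can be cross-checked in code. This is a small sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

confidence_level = 0.95
alpha = 1 - confidence_level

# Upper-tail critical value: the point with area alpha to its right
# under the standard normal curve.
z_alpha = NormalDist().inv_cdf(1 - alpha)
print(round(z_alpha, 3))  # 1.645
```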
H0 is rejected if z > z0.05 = 1.645. Here,
$$ z = \frac{63 - 60}{10 / \sqrt{100}} = 3 > 1.645 $$
Consequently, H0 is rejected.
There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. The true population mean can be assumed to be larger than 60 minutes.
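The computer assembly example can be reproduced in a few lines. This is a minimal sketch assuming an upper-tail test with known σ; the function name `one_sided_z_test` is illustrative and not taken from the course material.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Upper-tail z-test of H0: mu = mu0 against H1: mu > mu0 (sigma known)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))     # convert the sample mean to a Z-score
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical value z_alpha
    return z, z_crit, z > z_crit


# Values from the example: X-bar = 63, hypothesized mean 60, sigma = 10, n = 100.
z, z_crit, reject = one_sided_z_test(x_bar=63, mu0=60, sigma=10, n=100)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The test statistic of 3 lies well inside the rejection region, which matches the conclusion reached by hand.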
Use a 95% confidence level for the following exercises and σ = 10 (assumed to be known):
We established that the normal distribution can be used to approximate the binomial distribution. The test statistic for p:
$$ z = \frac{\hat{p} - p}{\sqrt{p(1 - p) / n}} $$
Where p-hat is the sample proportion, p is the hypothesized population proportion and n is the sample size.
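A proportion test follows the same pattern as the test for the mean. The sketch below uses explicit names (`p_sample`, `p_hypothesized`) to keep the two proportions apart. The data (116 successes in 200 trials, hypothesized proportion 0.5) are made up for illustration, and an upper-tail alternative is assumed.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_statistic(p_sample, p_hypothesized, n):
    """Z statistic for a proportion, using the normal approximation."""
    # The standard error is computed under the hypothesized proportion.
    se = sqrt(p_hypothesized * (1 - p_hypothesized) / n)
    return (p_sample - p_hypothesized) / se


# Hypothetical data: 116 successes out of n = 200, testing H1: p > 0.5.
n = 200
p_sample = 116 / n
z = proportion_z_statistic(p_sample, p_hypothesized=0.5, n=n)
z_crit = NormalDist().inv_cdf(0.95)  # alpha = 0.05, upper tail
print(f"z = {z:.3f}, reject H0: {z > z_crit}")
```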
So far it has been assumed that the population variance is known. That is not a realistic assumption. We can use the sample variance as an estimator of the population variance. It can be shown, however, that in small samples the estimator is biased. The Student-t distribution and the associated t statistic come about as a result of estimating the (unknown) population variance from the sample. The Student-t distribution differs from the normal distribution for samples as large as n = 200 (note that your student manual refers to n = 30) but approximates the normal distribution for larger n. You will be provided with the critical values in the exam!
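When σ is unknown, the sample standard deviation s takes its place and the statistic t = (X-bar − μ) / (s / √n) is compared with a Student-t critical value with n − 1 degrees of freedom. The sketch below is illustrative only: the ten assembly times are invented, and the critical value 1.833 (upper tail, α = 0.05, 9 degrees of freedom) is typed in from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t statistic for H0: mu = mu0 when the population variance is unknown."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    return (mean(sample) - mu0) / (s / sqrt(n))


# Hypothetical sample of 10 assembly times in minutes.
times = [62, 65, 58, 70, 61, 66, 64, 59, 68, 67]
t = t_statistic(times, mu0=60)

t_crit = 1.833  # from a t table: alpha = 0.05, one tail, 9 degrees of freedom
print(f"t = {t:.3f}, reject H0: {t > t_crit}")
```

The t critical value (1.833) is larger than the corresponding z value (1.645), reflecting the extra uncertainty from estimating the variance.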
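The Chi-Squared statistic is frequently used for goodness-of-fit tests. It can be shown that for grouped data, the sum of the squared differences between observed and expected frequencies, each divided by its expected frequency, is approximately Chi-Squared distributed. This notion is expressed in:
$$ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} $$
where f_i is the observed frequency and e_i the expected frequency in group i, with k − 1 degrees of freedom.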
Much like for the normal and Student-t distributions, critical values can be gleaned from standardized tables in statistics textbooks. You will be provided with the critical values in the exam!
A sample of 90 potential buyers of motor cars was asked to select their preferred car colour. The results were: white = 38, red = 32 and blue = 20. Do these findings indicate significant differences in colour preferences? Test at the 5% level of significance.
[Hint: use 5.991 as the critical value]
H0: All frequencies are the same (expected frequency 90 / 3 = 30 for each colour)
H1: At least one frequency differs
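Test statistic
$$ \chi^2 = \frac{(38 - 30)^2}{30} + \frac{(32 - 30)^2}{30} + \frac{(20 - 30)^2}{30} = \frac{168}{30} = 5.6 $$
and rejection region
$$ \chi^2 > 5.991 $$
Since 5.6 < 5.991, H0 is not rejected. At the 5% level of significance there is insufficient evidence to infer that there are significant differences in colour preference.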
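The car colour exercise can be checked with a short script. This is a minimal sketch: the function name `chi_squared_statistic` is illustrative, and the critical value 5.991 is taken from the hint rather than computed.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = {"white": 38, "red": 32, "blue": 20}
n = sum(observed.values())  # 90 respondents
# Under H0 every colour is equally popular, so each expected frequency is n / 3.
expected = [n / len(observed)] * len(observed)

chi2 = chi_squared_statistic(observed.values(), expected)
critical_value = 5.991  # from the hint (5% level, 2 degrees of freedom)
print(f"chi-squared = {chi2:.2f}, reject H0: {chi2 > critical_value}")
```

The statistic of about 5.6 falls just short of the critical value, so the null hypothesis of equal preferences is not rejected.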