Understanding Hypothesis Testing: Null and Alternative Hypotheses, Errors, and Student-t Distribution

SESSION 41 & 42 Last Update 12th May 2011 Hypothesis Testing

Learning Objectives • Null and Alternative Hypothesis • Type I and Type II Error • Rejection Region Method • Hypothesis Testing • Student-t • Goodness-of-Fit: Chi-Squared Statistic

Stating Hypotheses Two hypotheses need to be stated to conduct hypothesis testing: the first is called the null hypothesis and is assumed to be the default statement. The opposing hypothesis is known as the alternative hypothesis. The notations are H0 and H1. A typical example involves criminal trials: H0 : The defendant is innocent. H1 : The defendant is guilty. A jury must make a decision based on the evidence presented by the prosecution and the defense. In a similar vein, evidence from a sample statistic may be used to reject, or not reject, the null hypothesis in statistical problems.

Type I and Type II Error In statistical parlance: Convicting a defendant is equivalent to rejecting the null hypothesis in favor of the alternative hypothesis. Acquitting a defendant is equivalent to not rejecting the null hypothesis. A Type I error occurs when we reject a true null hypothesis (e.g. convicting an innocent defendant); its probability is denoted α. A Type II error occurs when we do not reject a false null hypothesis (e.g. acquitting a guilty defendant); its probability is denoted β. For a fixed sample size, the error probabilities are inversely related, meaning that any attempt to reduce one will increase the other.
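The trade-off between α and β can be illustrated numerically. The following is a minimal sketch for an upper-tail z-test, assuming purely illustrative values (hypothesized mean 60, true mean 62, σ = 10, n = 100). The function name `type_ii_error` and all numbers are assumptions made for this illustration, not part of the session material.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Return beta for an upper-tail z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)      # critical value for the chosen alpha
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # true offset in standard-error units
    # H0 is not rejected when Z <= z_alpha; under the true mean, Z ~ N(shift, 1)
    return std_normal.cdf(z_alpha - shift)

# Illustrative (assumed) values, not taken from the slides
betas = {alpha: type_ii_error(alpha, mu0=60, mu_true=62, sigma=10, n=100)
         for alpha in (0.10, 0.05, 0.01)}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

Under these assumptions, lowering α pushes the critical value further out, so a false null hypothesis is rejected less often and β grows.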

Concepts in Hypothesis Testing • There are two hypotheses. One is called the null hypothesis and the other the alternative hypothesis. • The testing procedure begins with the assumption that the null hypothesis is true. • The goal of the process is to determine whether there is enough evidence to infer that H1 is true. • There are two possible decisions: • Conclude that there is enough evidence to support H1. • Conclude that there is insufficient evidence to support H1. • Two possible errors can be made: Type I error and Type II error. The associated probabilities are α and β.

Expressing H0 and H1 Example computer assembly: A random sample of n = 100 computers is selected. The average assembly time is X-bar = 63 minutes. The population standard deviation is assumed to be known and equal to σ = 10. Is there sufficient evidence to infer that the assembly time of the entire population is more than 60 minutes?

Solution (1/4) • State H0 and H1. The null hypothesis always contains an equality sign (=) whereas the alternative hypothesis contains an inequality sign: • (>, <) for one-sided tests: e.g. Is there sufficient evidence to infer that the assembly time is more than / less than 60 minutes? • (≠) for two-sided tests: e.g. Is there sufficient evidence to infer that the assembly time is different from 60 minutes? Note that some textbooks allow for weak inequalities (≥, ≤) in the null hypothesis. For this example: H0: μ = 60 and H1: μ > 60.

Solution (2/4) • Define the rejection region: The rejection region is a range of values such that if the test statistic falls into that range, the null hypothesis is rejected in favor of the alternative hypothesis. For this one-sided (upper-tail) test the rejection region is: $z > z_{\alpha}$. Thus, in order to test whether a sample statistic falls into the rejection region, it has to be converted into a Z-score first.

Rejection Region [Figure: rejection regions in the tails of the normal curve, with tail areas labelled 0.05 and 0.025.]

Solution (3/4) • Converting to Z-score: The conversion is defined as $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. Note that μ is the hypothesized population mean. • Determining zα (z critical): Since the rejection region is $z > z_{\alpha}$, we need zα. Normally, a 95% confidence level (CL) is used. Thus, α = 1 – CL = 0.05. Using the normal probability table this corresponds to z0.05 = 1.645.
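The table lookup for z critical can be reproduced with the Python standard library. This is a small sketch; the helper name `z_critical` is an assumption made for illustration. It also shows the two-sided case, where α is split equally between both tails.

```python
from statistics import NormalDist

def z_critical(alpha, two_sided=False):
    """Return the standard normal critical value for significance level alpha."""
    # A two-sided test places alpha / 2 in each tail
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)

z_one_sided = z_critical(0.05)                   # upper-tail test at 5%
z_two_sided = z_critical(0.05, two_sided=True)   # two-sided test at 5%
print(round(z_one_sided, 3), round(z_two_sided, 3))  # prints: 1.645 1.96
```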

Solution (4/4) • Compare test statistic to z critical: H0 is rejected if $z > z_{\alpha}$. Here: $z = \frac{63 - 60}{10 / \sqrt{100}} = 3 > 1.645$. Consequently, H0 is rejected. • Formally express the results: There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. The true population mean can be assumed to be larger than 60 minutes.
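The four solution steps for the computer assembly example can be sketched in a few lines of Python. The function name `z_test_upper` is an assumption made for illustration; the inputs (sample mean 63, hypothesized mean 60, σ = 10, n = 100, α = 0.05) are those of the example.

```python
from math import sqrt
from statistics import NormalDist

def z_test_upper(x_bar, mu0, sigma, n, alpha=0.05):
    """Upper-tail z-test of H0: mu = mu0 against H1: mu > mu0 with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))       # convert the sample mean to a Z-score
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # z critical for the chosen alpha
    return z, z_alpha, z > z_alpha              # reject H0 if z falls in the rejection region

z, z_alpha, reject_h0 = z_test_upper(x_bar=63, mu0=60, sigma=10, n=100)
print(f"z = {z:.3f}, z critical = {z_alpha:.3f}, reject H0: {reject_h0}")
```

For a lower-tail or two-sided test the comparison in the last line of the function would need to change accordingly (z < -zα, or |z| > zα/2).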

Exercises Use a 95% confidence level for the following exercises and σ = 10 (assumed to be known): • Assume now that the average assembly time across n = 100 computers is 59 minutes. Is there enough evidence to infer that the actual assembly time is less than 60 minutes across the population? • Assume now that the average assembly time across n = 25 computers is 63 minutes. Is there enough evidence to infer that the actual assembly time is different from 60 minutes across the population?

Test Statistic Binomial Distribution We established that the normal distribution can be used to approximate the binomial distribution. The test statistic for p: $z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$, where p-hat is the sample proportion, p is the hypothesized population proportion and n is the sample size.
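A minimal sketch of the proportion test statistic follows. The function name `proportion_z` and the numbers (56 successes in a sample of 100, tested against a hypothesized proportion of 0.5) are hypothetical values chosen for illustration; they do not come from the session material.

```python
from math import sqrt

def proportion_z(successes, n, p_hypothesized):
    """Z statistic for a sample proportion under the normal approximation to the binomial."""
    p_sample = successes / n
    # The standard error uses the hypothesized proportion, because H0 is assumed true
    standard_error = sqrt(p_hypothesized * (1 - p_hypothesized) / n)
    return (p_sample - p_hypothesized) / standard_error

# Hypothetical data: 56 successes out of 100 trials, H0: p = 0.5
z_prop = proportion_z(successes=56, n=100, p_hypothesized=0.5)
print(f"z = {z_prop:.2f}")
```

The resulting Z-score is compared to z critical in the same way as for the mean.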

Student-t Distribution So far it has been assumed that the population variance is known. That is not a realistic assumption. We can use the sample variance as an estimator of the population variance. In small samples, however, this estimate is unreliable, so the standardized test statistic no longer follows the normal distribution. The Student-t distribution and the associated t statistic come about as a result of estimating the (unknown) population variance from the sample. The Student-t distribution differs noticeably from the normal distribution in small samples; some sources treat it as distinct for samples as large as n = 200 (note that your student manual refers to n = 30), but it approximates the normal distribution for larger n. You will be provided with the critical values in the exam!
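The t statistic has the same form as the Z-score, with the sample standard deviation s in place of σ: t = (x-bar − μ) / (s / √n), with n − 1 degrees of freedom. The sketch below computes it for a hypothetical sample of ten assembly times; the data and the function name `t_statistic` are assumptions made for illustration. The critical value 1.833 is the tabulated one-sided 5% value for 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    s = stdev(sample)   # sample standard deviation (divides by n - 1)
    return (mean(sample) - mu0) / (s / sqrt(n)), n - 1

# Hypothetical assembly times in minutes (not from the slides)
times = [62, 65, 58, 67, 61, 64, 66, 59, 63, 65]
t_value, dof = t_statistic(times, mu0=60)

T_CRITICAL = 1.833   # table value: one-sided, alpha = 0.05, 9 degrees of freedom
reject_h0_t = t_value > T_CRITICAL
print(f"t = {t_value:.3f}, df = {dof}, reject H0: {reject_h0_t}")
```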

Chi-Squared Density Function

Chi-Squared Distribution The Chi-Squared distribution is frequently used for goodness-of-fit tests. It can be shown that for grouped data, the sum of the squared differences between observed and expected frequencies, each divided by the expected frequency, is approximately Chi-Squared distributed. This notion is expressed in: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$, where $f_i$ and $e_i$ are the observed and expected frequencies of group i and k is the number of groups. Much like for the normal as well as Student-t distributions, critical values can be gleaned from standardized tables in statistics textbooks. You will be provided with the critical values in the exam!

Example A sample of 90 potential buyers of motor cars was asked to select their preferred car colour. The results were: white = 38, red = 32 and blue = 20. Do these findings indicate significant differences in colour preferences? Test at the 5% level of significance. [Hint: use 5.991 as the critical value]

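Solution (1/2) • Postulate Hypotheses: H0: All frequencies are the same ($p_1 = p_2 = p_3 = 1/3$) H1: At least one frequency differs • Rejection Region: $\chi^2 > \chi^2_{\alpha, k-1}$. Here: α = 5% and, with k = 3 colour categories, k − 1 = 2 degrees of freedom, so $\chi^2_{0.05, 2} = 5.991$.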

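Solution (2/2) • Calculate test statistic: with an expected frequency of 90/3 = 30 per colour, $\chi^2 = \frac{(38 - 30)^2}{30} + \frac{(32 - 30)^2}{30} + \frac{(20 - 30)^2}{30} = 5.6$. • Since $\chi^2 = 5.6 < 5.991$, the test statistic does not fall into the rejection region. There is insufficient evidence to infer that there are significant differences in colour preference.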
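The goodness-of-fit calculation for the car colour example can be checked with a short Python sketch. The variable names are assumptions made for illustration. The critical value is computed with a closed form that holds only for 2 degrees of freedom, where the upper-tail probability of the Chi-Squared distribution is exp(−x/2); for other degrees of freedom a table is needed.

```python
from math import log

# Observed frequencies from the example
observed = {"white": 38, "red": 32, "blue": 20}

total = sum(observed.values())       # 90 buyers
expected = total / len(observed)     # equal preference under H0: 30 per colour

# Sum of squared differences, each divided by the expected frequency
chi_squared = sum((f - expected) ** 2 / expected for f in observed.values())

# Critical value for alpha = 0.05; this closed form is valid for 2 degrees of freedom only
alpha = 0.05
critical_value = -2 * log(alpha)

reject_h0_chi = chi_squared > critical_value
print(f"chi-squared = {chi_squared:.3f}, critical = {critical_value:.3f}, "
      f"reject H0: {reject_h0_chi}")
```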
