Hypothesis Testing


Two siblings, Arlen and Robin, agree to resolve their disputed ownership of an Erté painting by tossing a penny. Arlen produces a penny and, just as Robin is about to toss it in the air, Arlen smoothly suggests that spinning the penny on a table might ensure better randomization. Robin assents and spins the penny. As it spins, Arlen calls “Tails!” The penny comes to rest with Tails facing up and Arlen takes possession of the Erté. Robin is left with the penny.

That evening, Robin wonders if she has been had. She decides to perform an experiment. She spins the same penny on the same table 100 times and observes 68 Tails. It occurs to Robin that perhaps spinning this penny was not entirely fair, but she is reluctant to accuse her brother of impropriety until she is convinced that the results of her experiment cannot be dismissed as coincidence. How should she proceed?

Let p denote the probability that this penny lands Tails when it is spun on this table. Robin's experiment raises three questions:
• What is the true value of p? More precisely, what is a reasonable guess as to the true value of p?
• Is p = 0.5? Specifically, is the evidence that p ≠ 0.5 so compelling that Robin can comfortably accuse Arlen of impropriety?
• What are plausible values of p? In particular, is there a subset of [0, 1] that Robin can confidently claim contains the true value of p?
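Robin's three questions can be explored numerically. The sketch below assumes that p is the probability that the spun penny lands Tails and that the 100 spins are independent. The function names are illustrative, and the Wald interval is only one common approximation for a range of plausible values, not necessarily the method intended here.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p_value(k, n, p0=0.5):
    """Exact two-sided p-value: the total probability, under p = p0,
    of every outcome that is no more likely than the observed count k."""
    observed = binomial_pmf(k, n, p0)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        binomial_pmf(j, n, p0)
        for j in range(n + 1)
        if binomial_pmf(j, n, p0) <= observed * (1 + 1e-9)
    )

def wald_interval(k, n, z=1.96):
    """Approximate 95% interval for p (normal approximation, an assumption)."""
    p_hat = k / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

n_spins, n_tails = 100, 68                    # Robin's experiment
p_hat = n_tails / n_spins                     # question 1: a reasonable guess for p
p_value = two_sided_p_value(n_tails, n_spins) # question 2: is p = 0.5 tenable?
low, high = wald_interval(n_tails, n_spins)   # question 3: plausible values of p

print(f"estimate of p: {p_hat:.2f}")
print(f"two-sided p-value under p = 0.5: {p_value:.6f}")
print(f"approximate 95% interval: ({low:.3f}, {high:.3f})")
```

For 68 Tails in 100 spins, the p-value comes out far below 0.05 and the approximate interval lies entirely above 0.5, which is the kind of evidence Robin is looking for.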

Another Example

A sociologist has been hired to assess the effectiveness of a rehabilitation program for alcoholics in her city. The program serves a large area, and she does not have the resources to test every single client. Instead, she draws a random sample of 127 people from the list of all clients and questions them on a variety of issues. She notices that, on average, the people in her sample miss fewer days of work each year than workers in the city as a whole. Are alcoholics treated by the program more reliable than workers in general?

We can see that there is a difference in rates of absenteeism and that the average rate of absenteeism for the sample is lower than the rate for the community.
• Although it is tempting, we cannot draw any conclusions yet, because we are working with a random sample of the population we are interested in, not the population itself (all people treated in the program).

Explanation 1
• The first explanation, which we will call explanation A, is that the difference between the community mean of 7.2 days and the sample mean of 6.8 days reflects a real difference in absentee rates between the population of all treated alcoholics and the community.
• The difference is statistically significant in the sense that it is very unlikely to have occurred by random chance alone.
• If explanation A is true, the population of all treated alcoholics is different from the community and the sample did not come from a population with a mean absentee rate of 7.2 days.

Explanation 2
• The second explanation, or explanation B, is that the observed difference between sample and community means was caused by mere random chance.
• In other words, there is no important difference between treated alcoholics and the community as a whole, and the difference between the sample mean of 6.8 days and the mean of 7.2 days of absenteeism for the community is trivial and due to random chance.
• If explanation B is true, the population of treated alcoholics is just like everyone else and has a mean absentee rate of 7.2 days.
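One way to see what "random chance" means under explanation B is to simulate it. The sketch below repeatedly draws samples of 127 people from a population whose mean is 7.2 days and records each sample mean. It assumes absences are roughly normally distributed with a standard deviation of 1.43 days. That standard deviation is not given in the text above; it is a hypothetical value chosen only for illustration, as are all names in the code.

```python
import random
from statistics import mean, stdev

COMMUNITY_MEAN = 7.2   # from the example
SAMPLE_SIZE = 127      # from the example
OBSERVED_MEAN = 6.8    # from the example
ASSUMED_SD = 1.43      # ASSUMPTION: not stated in the text, illustration only

def simulate_sample_means(n_samples, seed=0):
    """Draw n_samples random samples under explanation B and
    return the mean of each one."""
    rng = random.Random(seed)
    return [
        mean(rng.gauss(COMMUNITY_MEAN, ASSUMED_SD) for _ in range(SAMPLE_SIZE))
        for _ in range(n_samples)
    ]

sample_means = simulate_sample_means(5000)

# Share of simulated samples whose mean is at least as low as the observed one.
share_as_low = sum(m <= OBSERVED_MEAN for m in sample_means) / len(sample_means)

print(f"average of the sample means: {mean(sample_means):.3f}")
print(f"spread (standard deviation) of the sample means: {stdev(sample_means):.3f}")
print(f"share of samples with mean <= {OBSERVED_MEAN}: {share_as_low:.4f}")
```

The sample means cluster around 7.2 but still vary from sample to sample. Under the assumed standard deviation, a mean as low as 6.8 turns up only rarely, and a different assumed value would change how rare it is.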

[Figure: sampling distribution; horizontal axis from 5 to 10, vertical axis from 0.0 to 0.4]

Remember that this is the distribution of s. From sample to sample, measured values deviate. Also remember that the deviation is

Which explanation is correct? As long as we are working with a sample rather than the entire group, we cannot be absolutely (100%) sure about the answer to this question.
• However, we can set up a decision-making procedure so conservative that one of the two explanations can be chosen knowing that the probability of choosing the incorrect explanation is very low.

This decision-making process begins with the assumption that explanation B is correct. Symbolically, the assumption that the mean absentee rate for all treated alcoholics is the same as the rate for the community as a whole can be stated as

μ = 7.2

If explanation B (the population of treated alcoholics is not different from the community as a whole and has a μ of 7.2) is true, then the probability of getting the observed sample outcome (X̄ = 6.8) can be found.
• Let us fix an objective decision rule in advance: if the probability of getting the observed difference is less than 0.05, we will reject explanation B.
• The reasoning is that, if explanation B were true and this probability were below 0.05, a difference of this size (7.2 days vs. 6.8 days) would be a very rare event, and in hypothesis testing we always bet against rare events.
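The decision rule can be sketched as a one-sample Z test. This is a minimal illustration, not necessarily the exact procedure intended here: it assumes the standard deviation of the community's absentee rate is known and uses a hypothetical value of 1.43 days, which does not appear in the text above. The function name and the two-sided form of the test are also assumptions.

```python
from math import erfc, sqrt

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return the Z statistic and the two-sided p-value for a sample mean,
    assuming the population standard deviation is known."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided tail probability of a standard normal variable.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

ALPHA = 0.05        # decision rule stated in advance
ASSUMED_SD = 1.43   # ASSUMPTION: not stated in the text, illustration only

z, p_value = one_sample_z_test(
    sample_mean=6.8, pop_mean=7.2, pop_sd=ASSUMED_SD, n=127
)
reject_b = p_value < ALPHA

print(f"Z = {z:.2f}, two-sided p-value = {p_value:.4f}")
print("reject explanation B" if reject_b else "do not reject explanation B")
```

Under this assumed standard deviation the probability falls below 0.05, so the rule rejects explanation B. With a larger standard deviation or a smaller sample, the same 0.4-day difference might not be rare enough to reject it.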


• Read Chapters 7 to 9 from Essentials of Statistics by Joseph F. Healey
• Read Chapters 8, 9, and 10 from Research Methods by Charles Stangor
