Testing Hypotheses and The Standard Error
The standard error, as an estimate of chance fluctuation, is the measure against which the outcomes of experiments are checked. Is a difference a “real” difference or merely a consequence of the many relatively small differences that could have arisen by chance?
To answer this question, the standard error of the differences between means is calculated and the obtained difference is compared to this standard error.
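The comparison just described can be sketched in a few lines. This is only an illustration: the slides do not give a formula, so the sketch assumes two independent groups and the common unpooled estimate sqrt(s1²/n1 + s2²/n2). The function names and the group data are hypothetical.

```python
import math


def se_of_difference(sd_a, n_a, sd_b, n_b):
    """Standard error of the difference between two independent sample means.

    Assumption (not from the slides): unpooled formula sqrt(s1^2/n1 + s2^2/n2).
    """
    return math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)


def difference_in_se_units(mean_a, mean_b, sd_a, n_a, sd_b, n_b):
    """Express the obtained difference as a multiple of its standard error."""
    return (mean_a - mean_b) / se_of_difference(sd_a, n_a, sd_b, n_b)


# Hypothetical data: two groups of 50 pupils, both with SD = 10.
se = se_of_difference(10.0, 50, 10.0, 50)
ratio = difference_in_se_units(105.0, 100.0, 10.0, 50, 10.0, 50)
print(f"SE of difference = {se:.2f}, difference / SE = {ratio:.2f}")
```

Here the obtained difference of 5 points is 2.5 times its standard error of 2. The larger this ratio, the less plausible it is that the difference arose by chance alone.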
How low is low? At what point is a correlation coefficient too low to warrant treating it seriously?
The problem is complex. In basic research, low correlations—of course, they should be statistically significant—may enrich theory and research. It is in applied research where prediction is important. It is here where value judgments about low correlation and the trivial amounts of variance shared have grown. In basic research, however, the picture is more complicated. One conclusion is fairly sure: correlation coefficients, like other statistics, must be tested for statistical significance.
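A small sketch may help separate the two questions raised here: how much variance a correlation accounts for, and whether it is statistically significant. It assumes the usual statistic t = r·sqrt(n − 2)/sqrt(1 − r²), and it uses a normal approximation for the p-value, which is rough for small samples. The function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def correlation_test(r, n):
    """Return (shared variance, t statistic, approximate two-sided p-value).

    Assumption: the p-value uses a normal approximation to the t
    distribution, which is only reasonable when n is fairly large.
    """
    shared_variance = r ** 2
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return shared_variance, t, p


# Hypothetical values: the same low correlation at two sample sizes.
for n in (30, 1000):
    shared, t, p = correlation_test(0.10, n)
    print(f"n={n}: r^2={shared:.3f}, t={t:.2f}, approx p={p:.4f}")
```

With r = 0.10 the shared variance is only 1% at either sample size, yet the approximate test is far from significant at n = 30 and clearly significant at n = 1000. Significance and size of relation are separate judgments.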
The main research purpose of inferential statistics is to test research hypotheses by testing statistical hypotheses.
Broadly speaking, scientists use two types of hypotheses: substantive and statistical. A substantive hypothesis is the usual type of hypothesis discussed in Chapter 2, where a conjectural statement of the relation between two or more variables is expressed.
Statistical hypotheses must be tested against something, however. It is not possible to test a stand-alone statistical hypothesis; that is, we do not directly test the statistical proposition in and of itself. We test it against an alternative proposition. Naturally, there can be several alternatives to the statistical hypothesis. The alternative usually selected is the null hypothesis, which was invented by Sir Ronald Fisher.
The null hypothesis is a statistical proposition that states, essentially, that there is no relation between the variables. The null hypothesis says, “You’re wrong, there is no relation; disprove me if you can.”
Researchers sometimes unwittingly use null hypotheses as substantive hypotheses. The trouble with this is that it places the investigator in a difficult position logically, because it is extremely difficult to demonstrate the empirical “validity” of a null hypothesis. After all, if the null hypothesis is supported, the result could well be one of the many chance results that are possible, rather than a meaningful nondifference!
Although as researchers we want to demonstrate that H1 is true, this cannot easily be done directly. To test this hypothesis directly, we would need to test an infinite number of values; that is, every situation in which the quantity under test is not equal to zero.
In hypothesis testing, the procedure dictates that we test the null hypothesis. The null hypothesis is written as H0. Note that it points directly to a value, namely zero. What we need is to gather enough empirical data to show that the null hypothesis is not tenable.
In statistical terms, we would “reject H0.” Rejecting H0 would indicate to us that we have a significant result. Rejecting H0 leads us toward supporting H1. Supporting H1, in turn, leads to support for our substantive hypothesis.
If there are not enough empirical data to refute the null hypothesis, we cannot reject it. Statistically, we would say that we “failed to reject H0” or “do not reject H0”; one can never “accept” H0. To “accept” H0 would require repeating the study an infinite number of times and getting exactly zero each time. On the other hand, we can “fail to reject” H0 because the results are not sufficiently different from what one would predict (under the assumption that H0 is true) to warrant the conclusion that it is false.
The status of H0 is akin to that of the defendant in a trial who is deemed to be “innocent” until proved “guilty.” If the trial results in a verdict of “not guilty,” this does not mean the defendant is “innocent.” It merely means that guilt could not be demonstrated beyond a reasonable doubt.
When the investigator fails to reject H0 it does not mean H0 is true, merely that H0 cannot be shown to be false beyond a “reasonable” doubt.
Suppose we draw a random sample of 100 children from eighth-grade classes in such-and-such a school system, and we find the mean=110 and SD=10. How accurate is this mean?
What we do is set up a hypothetical distribution of sample means, all calculated from samples of 100 pupils, each drawn from the parent population of eighth-grade pupils. If we knew the mean of this population of means, everything would be simple. In fact, we cannot obtain it. The best we can do is to estimate it with our sample value, or sample mean. We simply say, in this case, “Let the sample mean equal the mean of the population of means,” and hope we are right. Then we must test our equation. We do this with the standard error.
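For the sample just described (N = 100, mean = 110, SD = 10), the standard error of the mean can be worked out directly. The sketch below assumes the usual estimate SD / sqrt(N), which the slides do not spell out, and adds an approximate 95% band of ±1.96 standard errors as an illustration. The function name is hypothetical.

```python
import math


def standard_error_of_mean(sd, n):
    """Estimated standard error of the mean: SD / sqrt(N) (assumed formula)."""
    return sd / math.sqrt(n)


sample_mean, sample_sd, n = 110.0, 10.0, 100
se_mean = standard_error_of_mean(sample_sd, n)

# Approximate 95% band, assuming the sample means are normally distributed.
lower = sample_mean - 1.96 * se_mean
upper = sample_mean + 1.96 * se_mean
print(f"SE = {se_mean:.2f}; approximate 95% band = [{lower:.2f}, {upper:.2f}]")
```

With a standard error of 1, sample means from this population would be expected to fall within about two points of the population mean most of the time, which is one way of answering "How accurate is this mean?"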
If samples are drawn from a population at random, the means of the samples will tend to be normally distributed. The larger the Ns, the more this is so, and the shape and kind of distribution of the original population make little difference.
Why is it important to show that distributions of means approximate normality? We work with means a great deal in data analysis, and if they are normally distributed then one can use the known properties of the normal curve to interpret obtained research data.
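A quick simulation can illustrate this tendency. The sketch below is an assumption-laden toy, not part of the original slides: it draws from a strongly skewed (exponential) population, computes many sample means at several sample sizes, and reports the skewness of those means, which should shrink toward zero as N grows. The helper names, the seed, and the choice of population are all hypothetical.

```python
import random
import statistics


def sample_means(draw, n, reps, rng):
    """Means of `reps` random samples of size `n`, each value from draw(rng)."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


def skewness(values):
    """Population skewness: 0 for a symmetric (e.g. normal) distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def draw_exponential(rng):
    """Hypothetical skewed parent population with mean 1."""
    return rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {
    n: skewness(sample_means(draw_exponential, n, 2000, rng))
    for n in (1, 10, 100)
}
for n, skew in skew_by_n.items():
    print(f"N = {n:>3}: skewness of sample means = {skew:.2f}")
```

The skewness at N = 1 is simply that of the parent population. As N increases, the distribution of means becomes more symmetric even though the parent population does not change.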
To infer is to derive a conclusion from premises or from evidence. To infer statistically is to derive probabilistic conclusions from probabilistic premises. We conclude probabilistically; that is, at a specified level of significance.
Another form of inference, discussed at length in the chapter on sampling, is that from a sample to a population.
One of the gravest dangers of research is the inferential leap from sample data to population fact.
It can be said, in sum, that statistics enable scientists to test substantive hypotheses indirectly by enabling them to test statistical hypotheses directly. They test the “truth” of substantive hypotheses by subjecting null hypotheses to statistical tests on the basis of probabilistic reasoning.
H0 is rejected with the awareness that an error might have been made, but if H0 were true, the chance of obtaining a result this extreme would be less than 0.05. In the long run, a true H0 is wrongly rejected no more than 5% of the time.
As a rule, in selecting a significance level one must decide which type of error is more important to avoid or minimize. To be certain that an event of some importance has been identified before reporting it, use a fairly stringent criterion of significance, such as 0.01. On the other hand, if there is greater concern not to miss something, use a less stringent level, such as 0.05.
The size of the sample is related to both types of errors. With a fixed probability of Type I error and a fixed sample size n, the probability of Type II error for a given true effect is predetermined. If the probability of Type II error is too large, it can be reduced either by raising the level of Type I error for fixed n, or by increasing n for a fixed level of Type I error. Although the probability of Type II error is seldom determined in an experiment, researchers can make it reasonably small by collecting a large sample.
We can restate our substantive hypothesis statistically. Even though we have referred to it as our statistical hypothesis, many statisticians refer to it as the research, experimental, or alternative hypothesis.
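The trade-off described above can be made concrete with a minimal sketch. It assumes a one-sided z test of H0: mean = 0 with a known standard deviation, which is a simplification chosen for illustration and not a procedure taken from the slides. The function name, effect size, and SD are hypothetical.

```python
import math
from statistics import NormalDist


def type_two_error(effect, sd, n, alpha):
    """Probability of a Type II error for a one-sided z test of H0: mean = 0.

    Assumptions: known SD, normally distributed sample mean, and a true
    population mean equal to `effect` (> 0).
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)        # rejection cutoff in SE units
    shift = effect / (sd / math.sqrt(n))   # true effect in SE units
    return z.cdf(critical - shift)


# Hypothetical true effect of 3 points with SD = 10.
for n, alpha in ((25, 0.01), (25, 0.05), (100, 0.05)):
    beta = type_two_error(3.0, 10.0, n, alpha)
    print(f"n={n:>3}, alpha={alpha:.2f}: Type II error probability = {beta:.2f}")
```

For fixed n = 25, relaxing the Type I level from 0.01 to 0.05 lowers the Type II error probability. For a fixed level of 0.05, raising n from 25 to 100 lowers it much further.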
A sample that is too large is a waste of resources. A sample that is too small is also a wasted effort since it will not be large enough to detect a significant effect (difference).
Increasing the sample size makes the sampling distribution narrower and the standard error smaller. As a result, a large sample increases the likelihood of detecting a difference. However, too large a sample will make a very small difference statistically significant, though not necessarily of practical significance.
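As a rough illustration of this last point, the sketch below holds a very small difference fixed and lets the sample size grow. It assumes two equal-sized independent groups with a common known SD and a two-sided z test. These are simplifying assumptions for the example, and the function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def two_sided_p(difference, sd, n_per_group):
    """Approximate two-sided p-value for a difference between two group means.

    Assumptions: equal group sizes, common known SD, normal sampling
    distribution (z test).
    """
    se = sd * math.sqrt(2.0 / n_per_group)   # SE of the difference
    z = difference / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical: a half-point difference on a scale with SD = 10.
for n in (100, 1000, 10000):
    print(f"n per group = {n:>5}: p = {two_sided_p(0.5, 10.0, n):.4f}")
```

The difference stays at half a point throughout. Only the standard error changes, so the same trivial difference moves from clearly nonsignificant to clearly significant as n grows.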