
Lecture 2: Null Hypothesis Significance Testing Continued


Presentation Transcript


  1. Lecture 2: Null Hypothesis Significance Testing Continued Laura McAvinue School of Psychology Trinity College Dublin

  2. Null Hypothesis Significance Testing • Previous lecture, Steps of NHST • Specify the alternative/research hypothesis • Set up the null hypothesis • Collect data • Run the appropriate statistical test • Obtain the test statistic and associated p value • Decide whether to reject or fail to reject the null hypothesis on the basis of p value
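The steps above can be sketched in code. The anxiety scores below are made-up illustrative numbers, and a plain z test against the standard normal is used instead of the t test the slides describe, purely to keep the sketch within the Python standard library:

```python
from statistics import NormalDist, mean, stdev
from math import sqrt

# Hypothetical anxiety scores for a sample of students (illustrative numbers)
sample = [54, 61, 48, 57, 52, 63, 50, 59, 55, 62, 49, 58]
mu0 = 50                                  # population mean under Ho
xbar = mean(sample)
se = stdev(sample) / sqrt(len(sample))    # estimated standard error of the mean

# Test statistic: how many standard errors the sample mean lies from mu0
z = (xbar - mu0) / se

# Two-tailed p value from the standard normal (a t distribution would be
# more exact for n = 12; the normal keeps this sketch stdlib-only)
p = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p = {p:.4f}")
print("Reject Ho" if p < .05 else "Fail to reject Ho")
```

In practice you would run the appropriate statistical test in a package (SPSS in this module), which handles the t distribution and degrees of freedom for you.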

  3. Null Hypothesis Significance Testing • Decision to reject or fail to reject Ho • p value • Probability of obtaining the observed results, or results more extreme, if Ho is true • By convention, use the significance level of p < .05 • Conclude that it is highly unlikely that we would obtain these results by chance, so we reject Ho • Caveat! The fact that there is a significance level does not mean that there is a simple ‘yes’ or ‘no’ answer to your research question

  4. Null Hypothesis Significance Testing • If you obtain results that are not statistically significant (p>.05), this does not necessarily mean that the relationship you are interested in does not exist • There are a number of factors that affect whether your results come out as statistically significant • One and two-tailed tests • Type I and Type II errors • Power

  5. One and Two-tailed Tests • One-tailed / Directional Test • Run this when you have a prediction about the direction of the results • Two-tailed / Non-Directional Test • Run this when you don’t have a prediction about the direction of the results

  6. Recall previous example… • Research Question • Do anxiety levels of students differ from anxiety levels of young people in general? • Prediction • Due to the pressure of exams and essays, students are more stressed than young people in general • Method • You know the mean score for the normal young population on the anxiety measure = 50 • You predict that your sample will have mean > 50 • Run a one-tailed one-sample t test at p < .05 level

  7. One-tailed Test • Compare the mean of your sample to the sampling distribution for the population mean • Decide to reject Ho if your sample mean falls into the highest 5% of the sampling distribution
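The "highest 5%" cut-off can be computed directly. On the standard normal scale (an assumption made here to stay stdlib-only; a t distribution would give a slightly larger cut-off for small samples), the top 5% begins about 1.645 standard errors above the hypothesised mean:

```python
from statistics import NormalDist

# One-tailed test at alpha = .05: reject Ho if the sample mean falls
# in the top 5% of the sampling distribution, i.e. more than ~1.645
# standard errors above the hypothesised population mean.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)
print(f"one-tailed critical z: {z_crit:.3f}")   # ~1.645
```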

  8. Dilemma • But! What if your prediction is wrong? • Perhaps students are less stressed than the general young population • Their own bosses, summers off, no mortgages • With previous one-tailed test, you could only reject Ho if you got an extremely high sample mean • What if you get an extremely low sample mean? • Run a two-tailed test • Hedge your bets • Reject Ho if you obtain scores at either extreme of the distribution, very high or very low sample mean

  9. Two-tailed Test • You will reject Ho when a score appears in the highest 2.5% of the distribution or the lowest 2.5% • Note that it’s not the highest 5% and the lowest 5% as then you’d be operating at p = .1 level, rejecting Ho for 10% of the distribution • So, we gain ability to reject Ho for extreme values at either end but values must be more extreme
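The trade-off on this slide is easy to see numerically. Splitting the 5% across two tails pushes each cut-off further out (again computed on the standard normal scale as an illustrative assumption):

```python
from statistics import NormalDist

alpha = 0.05
# The 5% rejection region is split across both tails: 2.5% in each.
z_two = NormalDist().inv_cdf(1 - alpha / 2)    # ~1.960
z_one = NormalDist().inv_cdf(1 - alpha)        # ~1.645
print(f"two-tailed cut-off: +/-{z_two:.3f}")
print(f"one-tailed cut-off:   {z_one:.3f}")
# A result must be more extreme (|z| > 1.96 rather than z > 1.645)
# to reach significance under the two-tailed test.
```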

  10. Errors in NHST • Howell (2008) p. 157 • “Whenever we reach a decision with a statistical test, there is always a chance that our decision is the wrong one” • Misleading nature of NHST • Because there is a significance level (p = .05), people interpret NHST as a definitive exercise • Results are statistically significant or not • We reject Ho or we don’t • The Ho is wrong or right

  11. Errors in NHST • Remember we are dealing with probabilities • We make our decision on the basis of the likelihood of obtaining the results if Ho is true • There is always the chance that we are making an error • Two kinds of Error • We reject Ho when it is true (Type I error) • We say there’s a significant difference when there’s not • We accept Ho when it is false (Type II error) • We say there is no significant difference when there is

  12. Type I Error • Our anxiety example • Predict that students will have greater anxiety score than young people in general • Test Ho that students’ anxiety levels do not differ from young people • One-tailed one sample t-test at p < .05 • Compare sample mean with sampling distribution of mean for the population (Ho)

  13. Type I Error • Decide to reject Ho if your sample mean falls in the top 5% of the distribution • But! • This 5%, even though at the extreme end, still belongs to the distribution • If your sample mean falls within this top 5%, there is still a chance that your sample came from the Ho population

  14. Type I Error • For example, if p = .04, this means that there is a very small chance that your sample mean came from that population • But this is still a chance; you could be rejecting Ho when it is in fact true • Researchers are willing to accept this small risk (5%) of making a Type I error, of rejecting Ho when it is in fact true • Probability of making a Type I error = alpha (α) = the significance level that you chose • .05, .01
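A small simulation makes the point concrete: if Ho is true and you test at α = .05, you should reject it (wrongly) on roughly 5% of repeated experiments. The population parameters below are illustrative assumptions, and a z-style statistic is used in place of the full t test to stay stdlib-only:

```python
import random
from statistics import NormalDist, mean, stdev
from math import sqrt

random.seed(1)
mu0, sigma, n, alpha = 50, 10, 30, 0.05     # illustrative Ho population
z_crit = NormalDist().inv_cdf(1 - alpha)    # one-tailed cut-off (~1.645)

n_sims = 5000
false_positives = 0
for _ in range(n_sims):
    # Ho is true by construction: every sample is drawn from the Ho population
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    if z > z_crit:                          # a "significant" result
        false_positives += 1

rate = false_positives / n_sims
print(f"Type I error rate: {rate:.3f}")     # close to alpha = .05
```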

  15. Type II Error • So why not set a very low significance level to minimise your risk of making a Type I error? • Set p < .01 rather than p < .05 • As you decrease the probability of making a Type I error you increase the probability of making a Type II error • Type II Error • Fail to reject Ho when it is false • Fail to detect a significant relationship in your data when a true relationship exists

  16. For argument’s sake, imagine that H1 is correct • Sampling Distribution under Ho • Sampling Distribution under H1 • Reject Ho if sample mean equals any value to the right of the critical value (red region) • Correct Decision • Accept Ho if sample mean equals any value to the left of the critical value • Type II Error

  17. Four Outcomes of Decision Making

  18. Power • You should minimise both Type I and Type II errors • In reality, people are often very careful about Type I (i.e. strict about α) but ignore Type II altogether • If you ignore Type II error, your experiment could be doomed before it begins • even if a true effect exists (i.e. H1 is correct), if β is high, the results may not show a statistically significant effect • How do you reduce the probability of a Type II error? • Increase the power of the experiment

  19. Power • Power • The probability of correctly rejecting a false Ho • A measure of the ability of your experiment to detect a significant effect when one truly exists • 1 - β
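For a one-tailed z test, power can be computed directly from the two sampling distributions: find the rejection cut-off under Ho, then ask how much of the H1 distribution lies beyond it. The numbers below extend the anxiety example with an assumed true student mean of 55 and σ = 10, both illustrative:

```python
from statistics import NormalDist
from math import sqrt

def power_one_tailed(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a one-tailed z test: P(reject Ho | H1 is true)."""
    se = sigma / sqrt(n)                                 # standard error of the mean
    crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se    # rejection cut-off under Ho
    beta = NormalDist(mu1, se).cdf(crit)                 # Type II error probability
    return 1 - beta                                      # power = 1 - beta

# Anxiety example: mu0 = 50, and suppose students truly average 55
print(f"power: {power_one_tailed(50, 55, 10, n=25):.2f}")   # ~0.80
```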

  20. How do we increase the power of our experiment? • Factors affecting power • The significance level (α) • One-tailed v two-tailed test • The true difference between Ho and H1 (μo - μ1) • Sample Size (n)
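All four factors can be demonstrated with one power function by varying each in turn. As on the slides that follow, a stricter α and a two-tailed test reduce power, while a bigger true difference and a bigger sample increase it. The population values are illustrative assumptions, and the calculation uses a z test for simplicity:

```python
from statistics import NormalDist
from math import sqrt

def power(mu0, mu1, sigma, n, alpha=0.05, tails=1):
    """Power of a z test for a mean, assuming H1: mu1 > mu0.
    For tails=2 the alpha is split across both tails."""
    se = sigma / sqrt(n)
    crit = mu0 + NormalDist().inv_cdf(1 - alpha / tails) * se
    return 1 - NormalDist(mu1, se).cdf(crit)

base = dict(mu0=50, mu1=55, sigma=10, n=25)            # illustrative baseline
print(f"baseline              : {power(**base):.2f}")
print(f"stricter alpha (.01)  : {power(**base, alpha=0.01):.2f}")
print(f"two-tailed            : {power(**base, tails=2):.2f}")
print(f"bigger true difference: {power(50, 58, 10, 25):.2f}")
print(f"bigger sample (n=100) : {power(50, 55, 10, 100):.2f}")
```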

  21. The Influence of α on Power • Reduce the significance level (α)… • Reduce the probability of making a Type I error • Rejecting the Ho when it is true • Increase the probability of making a Type II error • Accepting the Ho when it is false • Reduce the power of the experiment to detect a true effect as statistically significant

  22. Reduce α and reduce power

  23. Increase α and increase power But! You increase the probability of a Type I error!

  24. The Influence of One v Two-tailed Tests on Power • We lose power with a two-tailed test • power is divided across the two tails of the experiment • Values must be more extreme to be statistically significant

  25. The Influence of the True Difference between Ho and H1 • The bigger the difference between μo and μ1, the easier it is to detect it

  26. The Influence of Sample Size on Power • The bigger the sample size, the more power you have • A big sample provides a better estimate of the population mean • With bigger sample sizes, the sampling distribution for the mean clusters more tightly around the population mean • The standard deviation of the sampling distribution, known as the standard error of the mean, is reduced • There is less overlap between the sampling distributions under Ho and H1 • The power to detect a significant difference increases
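The tightening of the sampling distribution follows directly from the standard error formula, SE = σ/√n. With an illustrative σ of 10, quadrupling the sample size halves the standard error:

```python
from math import sqrt

sigma = 10   # population standard deviation (illustrative)
# Standard error of the mean: sigma / sqrt(n).  Each quadrupling of n
# halves the SE, so the sampling distribution clusters ever more
# tightly around the population mean.
for n in (25, 100, 400):
    print(f"n = {n:4d}  SE = {sigma / sqrt(n):.2f}")
```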

  27. The Influence of Sample Size on Power

  28. Sample Size Exercise • Open the following dataset • Software / Kevin Thomas / Power dataset (revised) • Explores the effects of Therapy on Depression • Perform two Independent Samples t-tests • Analyse / Compare means / Independent Samples t test • Group represents Therapy v Control • Score represents post-treatment depression • 1. Group 1 & Score 1 • 2. Group 2 & Score 2

  29. Complete the following table

  30. What explains these results?

  31. So, how do I increase the power of my study? • You can’t manipulate the true difference between Ho and H1 • You could increase your significance level (α) but then you would increase the risk of a Type I error • If you have a strong prediction about the direction of the results, you should run a one-tailed test • The factor that is most under your control is sample size • Increase it!
