
Lesson 9


Presentation Transcript


  1. Lesson 9 One-Way Analysis of Variance

  2. What is Analysis of Variance? So far, you’ve learned about the three different types of t-tests. These tests can be used when
  • There is one sample to be compared to a population mean
  • There are two independent samples to be compared, or
  • There are two related samples to be compared

  3. What is Analysis of Variance? But we’re restricted to no more than 2 samples. Sometimes we need to have more than 2 treatments. Analysis of Variance (ANOVA) is used to test hypotheses about means when
  • There is one dependent variable
  • There are one or more independent variables

  4. What is Analysis of Variance?
  • When there is a single independent variable, an ANOVA can be used when there are three or more treatments.
  • To be quite precise, it’s entirely possible to use an ANOVA when there are only two treatments. The results end up being exactly the same as if we had used a t-test.

  5. ANOVA
  • As the researcher, we get to determine the values of the independent variable. Usually, these are the levels of the treatment provided. In ANOVA, there are usually three or more treatments that are being tested.
  • Then we compare the means of each treatment group to see if the different levels of the independent variable appear to change the values of the dependent variable.

  6. ANOVA If we observe a change in the dependent variable that is greater than what we would have expected to occur due to chance differences, then we conclude that it is the levels of the independent variable causing the difference.

  7. We’re All Different • Look at this dataset.

  8. Descriptives Notice that the means for each group are different, but so are the standard deviations. The question is, did we see enough variation in the means, taking into account the different distribution “spreads,” to say that grade level is the cause of the difference and not just random chance?

  9. Why Not t-Tests?
  • If we want to compare means, couldn’t we just do several t-tests to compare each pair of means?
  • NO! Every time you do a t-test, you risk making a Type I error.

  10. Why Not t-tests? If you have only 4 means to compare (A, B, C, and D), then you have to make 6 comparisons
  • A vs. B
  • A vs. C
  • A vs. D
  • B vs. C
  • B vs. D
  • C vs. D

  11. Why Not t-tests? If we set α = .05 for each of these comparisons, then the probability that we will make at least one Type I error across all 6 of these comparisons is 1 – (1 – .05)^6 = 1 – (.95)^6 ≈ .265.
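The arithmetic on this slide can be checked with a short Python sketch. The 4 means, 6 comparisons, and α = .05 come from the slides; everything else is standard library:

```python
from math import comb

alpha = 0.05
k = 4                         # number of group means being compared
m = comb(k, 2)                # number of pairwise t-tests: C(4, 2) = 6

# Familywise Type I error rate when each test is run at alpha:
# P(at least one Type I error) = 1 - P(no error in any of the m tests)
fwer = 1 - (1 - alpha) ** m
print(m, round(fwer, 3))      # prints: 6 0.265
```

Strictly speaking, the six t-tests are not independent, so 1 – (1 – α)^m is only an approximation, but it makes the point: the familywise error rate balloons far past .05.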

  12. Why not t-tests? [Figure: two-tailed rejection regions. With α = .05, 2.5% of the total area falls in each “Reject H0” tail; with α = .01, only 0.5% of the total area falls in each tail.]

  13. Where’s the Variance?
  • Look at our dataset again. The variability can be split into two parts: the spread of the scores within each group (Within Groups) and the spread of the group means around one another (Between Groups).

  14. ANOVA Philosophy In an Analysis of Variance, we
  • Partition the Total Variance in the scores into the variability accounted for by treatment differences (Between groups) and the variability not accounted for by treatment differences (Within groups).
  • If the treatment applied results in a change in the dependent variable, then there should be more variability between the groups than within the groups.

  15. Step-by-Step
  • The very first step in conducting a one-way ANOVA is to state your research question. That is,
  • Is at least one treatment group mean significantly different from any other treatment group mean?

  16. Hypothesize This
  • Next, we generate our hypotheses.
  • Remember, we have to have two hypotheses for each research question—a null hypothesis and an alternative.
  • H0: μ1 = μ2 = . . . = μk (k is the number of treatments)
  • H1: At least one treatment group mean is different from at least one other treatment group mean. (Don’t even try to put this into symbols. It just can’t be done.)

  17. Rejection Isn’t Always Bad
  • Now we need to establish our rejection rule.
  • That is, under what circumstances will we reject our null hypothesis and accept our alternative hypothesis?
  • With ANOVA, we use the F-statistic.
  • Reject H0 if the F that we calculate is greater than the F from the table using ? and ? degrees of freedom.

  18. Huh?
  • Well that wasn’t very helpful, was it?
  • Reject H0 if the SPSS significance level is < .05.
  • But here’s an important point—in ANOVA, we ONLY use one-tailed tests. Yep, every ANOVA is done testing to see if the obtained F-value is larger than the critical F-value.

  19. Source Table The way to begin any ANOVA is by generating a table that lists all of the possible sources of variability. For a one-way ANOVA, that would be

  Source
  Between Groups
  Within Groups
  Total

  20. Source Table Next, let’s determine how much variability is due to each of these sources. We start measuring variability by calculating a Sum of Squares (SS). And you know how to calculate a Sum of Squares, right?? But we’ll go through each one step-by-step in a few minutes.

  Source            SS
  Between Groups
  Within Groups
  Total

  21. Source Table
  • Also, we’ll need to know how many degrees of freedom are associated with each source of variability in our model.

  Source            SS    df
  Between Groups
  Within Groups
  Total

  22. Getting Started
  • Actually, the easiest place to begin is with the degrees of freedom.
  • For our example, our dataset with five children in each of 3 grades has 15 scores.
  • Therefore, we have 15 – 1 = 14 total degrees of freedom.

  23. More Degrees
  • There are 3 treatment groups involved (3rd, 4th, and 5th graders). So we have 3 – 1 = 2 degrees of freedom between the groups.
  • And since there were a total of 14 degrees of freedom and we just accounted for 2 of them, there must be 14 – 2 = 12 degrees of freedom within the groups.
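The degrees-of-freedom bookkeeping above is simple enough to sketch in a few lines (N = 15 and k = 3 come from the example):

```python
N = 15                  # total scores: 5 children in each of 3 grades
k = 3                   # treatment groups

df_total = N - 1                     # 15 - 1 = 14
df_between = k - 1                   # 3 - 1 = 2
df_within = df_total - df_between    # 14 - 2 = 12, equivalently N - k

print(df_between, df_within, df_total)   # prints: 2 12 14
```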

  24. Filling in the Table

  Source            SS    df
  Between Groups           2
  Within Groups           12
  Total                   14

  25. Now Taking Center Stage…
  • Now let’s work on the Sums of Squares, starting with the Total. OK, you already know how to do this one. Right??
  • Just remember, WHATEVER letter of the alphabet gets used to represent the scores (X, Y, L), it always refers to the values of the DEPENDENT VARIABLE!

  26. Total Sum of Squares SS_Total = ΣX² – G²/N, where
  • X represents each individual’s score on the dependent variable.
  • G represents the Grand Total of all the scores on the dependent variable.
  • N is the total number of scores. We’ll see this again!

  27. For Our Example: Total SS With a grand total of G = 90 across N = 15 scores and ΣX² = 622, we get SS_Total = 622 – 90²/15 = 622 – 540 = 82. Remember that we want to keep track of the G²/N = 540 value.

  28. Break Out Next, let’s partition out the Between Groups (Treatments) Sum of Squares: SS_Between = Σ(T²/n) – G²/N, where T is the total of the scores in each treatment and n is the number of scores in each treatment. There will be as many T²/n terms as there are treatment groups. You were warned!

  29. Example Between Groups SS The sums of the 3rd, 4th, and 5th grade scores are T = 20, 30, and 40, and G²/N = 540 is the number I said you’d want again later. So SS_Between = (20² + 30² + 40²)/5 – 540 = 580 – 540 = 40.

  30. Leftovers Again? So how much variability is left for Within Groups (Treatments)? Within Groups SS = Total SS – Between SS

  31. Example Within Groups SS = 82 – 40 = 42.

  32. Example

  Source            SS    df
  Between Groups    40     2
  Within Groups     42    12
  Total             82    14
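The whole source-table computation can be sketched in Python. The group sums (20, 30, 40) follow from the group means (4, 6, 8) times n = 5; since the raw scores aren’t shown on the slides, ΣX² = 622 is inferred from SS_Total = 82:

```python
T = [20, 30, 40]        # sum of scores in each grade (means 4, 6, 8 times n = 5)
n = 5                   # scores per treatment group
N = n * len(T)          # 15 scores in total
G = sum(T)              # grand total = 90
sum_x_sq = 622          # ΣX², inferred from SS_Total = 82 (raw scores not shown)

ss_total = sum_x_sq - G ** 2 / N                      # 622 - 540 = 82
ss_between = sum(t ** 2 / n for t in T) - G ** 2 / N  # 580 - 540 = 40
ss_within = ss_total - ss_between                     # 42

print(ss_between, ss_within, ss_total)   # prints: 40.0 42.0 82.0
```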

  33. Mean Squares
  • A Mean Square allows us to determine the amount of variability in relation to the degrees of freedom: MS = SS/df.
  • Since we’ve laid out our source table as we have, we just have to divide across each line.

  34. Example

  Source            SS    df    MS
  Between Groups    40     2    40/2 = 20.0
  Within Groups     42    12    42/12 = 3.5
  Total             82    14    (don’t need)

  35. Drumroll, Please
  • The final step is to calculate the F-statistic.
  • The F-value is the ratio between two Mean Squares.
  • Which two? Honey, we only got two!!

  36. Example F = MS_Between / MS_Within = 20.0 / 3.5 = 5.714.

  37. But is it Significant?
  • To answer our research question, we have to determine whether the F-value we have calculated is significantly greater than 0. We would use
  • the degrees of freedom for Between groups as our numerator df
  • the degrees of freedom for Within groups as our denominator df

  38. Example
  • In our example, we had 2 df in the numerator and 12 in our denominator.
  • The F-value in an F-table for 2 and 12 degrees of freedom using α = .05 is 3.89.
  • Since the value we calculated of 5.714 is greater than the value from the table, we reject our null hypothesis.
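The same decision can be sketched in Python using SciPy’s F distribution instead of a printed table (assumes scipy is installed):

```python
from scipy.stats import f

ms_between, ms_within = 20.0, 3.5
F = ms_between / ms_within            # 5.714...

df_num, df_den = 2, 12                # Between df, Within df
f_crit = f.ppf(0.95, df_num, df_den)  # critical F at alpha = .05, about 3.89
p_value = f.sf(F, df_num, df_den)     # upper-tail p-value for our F

print(F > f_crit, p_value < 0.05)     # prints: True True
```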

  39. REMEMBER… …the research question! Is at least one treatment group mean significantly different from any other treatment group mean? We have determined that there is at least one difference. But where is that difference?

  40. Foretelling the Future
  • At this point, we don’t know where the difference(s) is/are.
  • We’ll need a multiple comparison test to determine this.
  • For now, we can take a pretty good guess that at least the smallest mean is significantly different from the largest mean.

  41. A slight detour If the F-test is significant:
  • Reject H0 and accept H1
  • Measure the effect size
  • Determine which group means are significantly different using a multiple comparison test

  42. A Happy Ending • So we can probably guess that the 3rd graders scored significantly lower than the 5th graders.

  43. Wait a Minute!
  • Multiple comparison tests computed after a significant F-test are called post hoc (after the fact) tests.
  • ONLY evaluate the results of a post hoc test when the F-test is significant.

  44. That’s Not Turkey… One of the most popular such post hoc tests is the Tukey. It’s Tukey, pronounced with a long u and e, like two-key. One reason it is so popular is that it uses values we’ve already calculated for the F-test.

  45. Doin’ the Tukey Trot
  1. Divide the value you got for the Within Groups Mean Square by n, where n is the number of scores in each treatment group.
  2. Take the square root of the value you got in Step 1.
  3. Find q. This value is found in a q table. The degrees of freedom are (1) the number of treatment groups (r) and (2) the degrees of freedom for the Mean Square Within Groups (Error df).
  4. Multiply the q value by the answer from Step 2. This is your HSD value.

  46. Doin’ the Tukey Trot Let’s calculate the HSD for our example.
  • Our Mean Square Within was 3.5 and n = 5. So 3.5 / 5 = 0.7.
  • Taking the square root, we get 0.837.
  • In this example, we had 3 treatment groups and our degrees of freedom for Within Groups was 12. So our q value is 3.77.
  • Multiplying 3.77 × 0.837, we get HSD = 3.155.
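The same HSD can be sketched with SciPy, which exposes the studentized range distribution (the q table) as scipy.stats.studentized_range (available in SciPy 1.7+):

```python
from math import sqrt
from scipy.stats import studentized_range

ms_within, n = 3.5, 5    # Within Groups Mean Square and scores per group
k, df_error = 3, 12      # number of treatment groups and Within Groups df

# q from the studentized range distribution, about 3.77 as in the q table
q = studentized_range.ppf(0.95, k, df_error)
hsd = q * sqrt(ms_within / n)    # about 3.77 * 0.837 = 3.16

print(round(q, 2), round(hsd, 2))
```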

  47. Doin’ the Tukey Trot
  • 5th grade mean = 8.0
  • 3rd grade mean = 4.0
  • The difference is 8 – 4 = 4.0.
  • This is larger than 3.155. So the 5th graders scored significantly higher than the 3rd graders.

  48. Doin’ the Tukey Trot 5th grade mean = 8.0 and the 4th grade mean = 6.0. The difference here is 2.0. Likewise, the difference between the 3rd grade mean and the 4th grade mean is 2.0. Neither difference is greater than 3.155. Therefore, the 3rd graders did not score significantly different from the 4th graders, and the 4th graders did not score significantly different from the 5th graders. Only the 3rd and 5th grade scores are significantly different.
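All three pairwise checks can be collected in one loop (the means and HSD are the values from the example above):

```python
from itertools import combinations

means = {"3rd": 4.0, "4th": 6.0, "5th": 8.0}   # group means from the example
hsd = 3.155                                     # Tukey HSD computed earlier

for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    verdict = "significant" if diff > hsd else "not significant"
    print(f"{a} vs {b}: diff = {diff:.1f} -> {verdict}")
# prints:
# 3rd vs 4th: diff = 2.0 -> not significant
# 3rd vs 5th: diff = 4.0 -> significant
# 4th vs 5th: diff = 2.0 -> not significant
```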
