Factorial Analysis of Variance (ANOVA) on SPSS. Practice reproducing the analyses yourself with these data files (all on Portal): • 2 Factor Between (2 levels x 2 levels).sav • 2 Factor Between (2 levels x 3 levels).sav • 3 Factor Between (2 levels x 2 levels x 2 levels).sav • 2 Factor Within (2 levels x 2 levels).sav
Reading • http://www.socialresearchmethods.net/kb/expfact.htm - a simple summary of factorial designs • http://davidmlane.com/hyperstat/index.html - see sections 11 & 12 for between subjects designs and section 13 for within subjects (repeated measures) designs. This is recommended: it is concise, clear and to the point. It also contains a very good glossary from which you can quickly refresh your memory for definitions of such things as the Standard Error. • Chapters 10, 11 and 12 of Gravetter & Forzano cover between, within, and factorial design issues. • Chapters 13, 14 and 15 of Gravetter & Wallnau cover the stats (ANOVA etc.). However, don’t get bogged down with formulas for calculating sums of squares (see the next slide).
Things you should know: • How to interpret interaction plots • How to interpret ANOVA tables and assumption tests • That the Total degrees of freedom is always N - 1 (N = total number of data points), and that in a between subjects design the Error degrees of freedom is N minus the number of cells (e.g. 20 - 4 = 16 in a 2 x 2 design with 5 participants per cell) • That the degrees of freedom for a test of a main effect of a factor = the number of levels the factor contains - 1 • That the degrees of freedom for a test of an interaction between two or more factors = (the number of levels in one factor - 1) x (the number of levels in the other - 1) x … etc. Thus the DF for a 3 way interaction between factors having 2, 2 and 4 levels is 1 x 1 x 3 = 3 • That ANOVA uses F tests and that the F statistic for any effect is the Mean Square for the effect divided by the Error Mean Square: MS_condition / MS_error • That an alpha level of .05 means that, when there is no real effect, the probability of not making a Type 1 error is 95% (.95) for each test you do • Thus if you have 20 F tests in your ANOVA table and none of the effects is real, the probability of none of them being spurious is .95 x .95 x .95 x … = .95^20, or (1 - α)^20 (treating the tests as independent) • This actually = .36 or 36%, which is why (in complex designs especially) you should stick to examining a few predictions.
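The arithmetic behind the last two points can be checked with a short Python sketch. The function names are illustrative only, and the calculation assumes the tests are independent and that no real effects are present.

```python
def prob_no_type1_errors(alpha, n_tests):
    """Probability that none of n_tests independent tests gives a false positive."""
    return (1.0 - alpha) ** n_tests


def familywise_error_rate(alpha, n_tests):
    """Probability that at least one of n_tests independent tests is spurious."""
    return 1.0 - prob_no_type1_errors(alpha, n_tests)


# Compare a small ANOVA table with a large one at alpha = .05.
for n_tests in (1, 3, 7, 20):
    print(n_tests,
          round(prob_no_type1_errors(0.05, n_tests), 2),
          round(familywise_error_rate(0.05, n_tests), 2))
```

With 20 tests the chance of a clean table drops to about .36, so the chance of at least one spurious result is about .64.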
Things you needn’t worry about: • The precise way that Sums of Squares are calculated • (But it will help your understanding of ANOVA if you at least understand the gist of how variability is partitioned.) • How Levene’s test or Mauchly’s test are calculated: only that Levene’s tests the assumption of homogeneity of variance in between subjects designs and Mauchly’s tests its (more or less) equivalent, sphericity, in within subjects designs. • In the SPSS output you can largely ignore the following when doing repeated measures analyses (at this stage at least): • The multivariate tests which you get at the beginning • Tests of within subject contrasts (although these can be a useful tool for examining patterns in the data) • Any tests of between subjects effects that only involve an intercept (i.e. you can ignore this output when all your factors are within subjects)
1. Between Subjects Designs • 2 Factor designs • Data Format All scores in a single column Additional columns for each Factor
Main assumptions of ANOVA: • There are 3 main assumptions underlying ANOVA • 1. Homogeneity of variance • The error variance within each condition should be statistically equal. Thus any differences between conditions should only be a shift in the mean. Put another way, the effect of treatment/condition manipulations is to add a constant to each individual’s score.
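Main assumptions of ANOVA: [Figure: score distributions for conditions A, B and C, labelled with variances σ²A, σ²B, σ²C and means μA, μB, μC. Panel “OK”: the distributions differ only in their means. Panel “NOT OK”: the distributions also differ in spread.]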
Main assumptions of ANOVA: 2. Normality The distribution of errors within each condition should be normal. By errors we mean deviations from the mean for that condition. Because the errors are the deviations from the condition means this is equivalent to saying that the scores should be distributed normally about the condition means.
Main assumptions of ANOVA: 3. Independence of observations The data points should represent independent observations. Knowing the value of one should not tell you anything about the value of any other. N.B. This assumption is obviously violated in repeated measures experiments (because knowing that one data point comes from subject x, who might be a particularly fast responder, say, does tell you something about the likelihood of another observation from subject x being relatively fast). This is why Subjects have to be included as a factor in the analysis of repeated measures designs: the non-independent component is partialled out.
Design • Experiment to investigate the effect of stimulus duration and modality (Word vs Picture) on Recognition performance. • Dependent Variable (Score) • Two Factors: Modality and Duration
Factor Levels • Modality – two levels Word, Picture • Duration – two levels 200msec, 800msec • = 2 x 2 design
Design layout (5 subjects per cell):
                    Modality
Duration      Pictures      Words
200 ms        5 subjects    5 subjects
800 ms        5 subjects    5 subjects
View Factor Level Labels This person scored 127.19 and was tested in the ‘word’ modality and with the 800msec duration
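The data layout (one score column plus one coded column per factor, with labels attached to the codes) can be sketched in Python. The numeric codes 1 and 2 and the second row are assumptions made for illustration; the actual .sav file may code the factors differently.

```python
# Hypothetical numeric codes and their labels (the real file may differ).
MODALITY_LABELS = {1: "Word", 2: "Picture"}
DURATION_LABELS = {1: "200 msec", 2: "800 msec"}

# One row per participant: (score, modality code, duration code).
rows = [
    (127.19, 1, 2),  # the participant described on the slide
    (140.00, 2, 1),  # made-up row, for illustration only
]


def describe(row):
    """Turn a coded row into a readable description using the labels."""
    score, modality, duration = row
    return (f"score={score:.2f}, modality={MODALITY_LABELS[modality]}, "
            f"duration={DURATION_LABELS[duration]}")


for row in rows:
    print(describe(row))
```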
Dependent Variable : Score Fixed Factors: duration + modality
Options – Condition means, descriptive stats, test for homogeneity (equality) of variances.
Means: displays the overall mean, the means for each level of duration, the means for each level of modality and the means for each combination of duration by modality (= the interaction means).
Homogeneity Test: produces Levene’s test for homogeneity of variance (one of the assumptions of ANOVA, i.e. that the variances within each cell of the design are not significantly different).
Descriptive stats: gives descriptive statistics (mean, max, min, SD etc.) for each of the experimental groups.
Output Factors and Factor level labels
Output Descriptives- cell means & SDs
Output Levene’s test. This significant result means the assumption of equal group variances has not been met.
Output In this case the analysis is not valid! A data transformation may be of use here.
At this point either: • Abandon the analysis • See if a data transformation removes the problem (e.g. Log(score)) • Report results but with ‘extreme caution’
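To see why a log transformation can help, here is a minimal Python sketch using made-up scores in which the spread grows in proportion to the mean. The ratio of largest to smallest group variance is only a rough descriptive check, not Levene’s test, and the data are not from the course files.

```python
import math
from statistics import variance


def variance_ratio(groups):
    """Largest group variance divided by the smallest (1.0 = perfectly equal)."""
    variances = [variance(group) for group in groups]
    return max(variances) / min(variances)


# Made-up scores: the second group is the first multiplied by 10,
# so both its mean and its spread are ten times larger.
raw = [[10, 12, 14, 16, 18], [100, 120, 140, 160, 180]]
logged = [[math.log(score) for score in group] for group in raw]

print(variance_ratio(raw))     # variances differ by a factor of 100
print(variance_ratio(logged))  # after the log transform they are equal
```

Real data will rarely behave this neatly, so the assumption test should be rerun on the transformed scores.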
Assume we have different data: 2 Factor Between (2 levels x 2 levels).sav Levene’s test, like any test that checks the assumptions of an analysis, should not be significant. Here the p value of .271 says that ‘there is no evidence for any differences in variances’ between the groups, which is what we want.
ANOVA Table (Ignore shaded items) Test for the Main Effect of Duration (i.e. 200 vs 800 ms pooling across both Modalities) Significant effect of Duration, F(1,16) = 5.5, p = .032
There was a significant effect of Stimulus Duration. Participants who viewed the stimulus for 200 msec scored higher (M =134) than those who viewed it for 800 msec (M = 115), F(1,16) = 5.5, p = .032.
This difference is significant Duration Profile Plot
ANOVA Table (Ignore shaded items) Test for the Main Effect of Modality (i.e. Pictures vs Words pooling across both Durations). No Significant effect of Modality.
Profile Plot for Modality Check the scale! This difference is not significant
Any graphs you present should be using the same scale. By default SPSS changes the scale so that the data takes up the whole graph area. Here are the two graphs on the same scale: Duration Modality
ANOVA Table (Ignore shaded items) Test for the Interaction between Modality and Duration. There was a significant two-way interaction between modality and duration, F(1,16) = 7.2, p = .017.
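For readers who want the gist of how the three F ratios in the table arise, the following Python sketch partitions the variability for a balanced two factor between subjects design. The scores are made up (3 per cell) and are not the data in the .sav file. The function name and dictionary layout are illustrative, and p values are not computed because they need the F distribution, which SPSS supplies.

```python
from statistics import mean


def two_way_anova(cells):
    """cells maps (level_a, level_b) -> list of scores; equal n per cell assumed."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # scores per cell
    assert all(len(v) == n for v in cells.values()), "balanced design required"

    grand = mean([x for v in cells.values() for x in v])
    cell_mean = {k: mean(v) for k, v in cells.items()}
    a_mean = {a: mean([cell_mean[(a, b)] for b in b_levels]) for a in a_levels}
    b_mean = {b: mean([cell_mean[(a, b)] for a in a_levels]) for b in b_levels}

    # Sums of squares for each effect and for error (within cell variability).
    ss = {
        "A": n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        "B": n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        "AxB": n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                       for a in a_levels for b in b_levels),
    }
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df = {"A": len(a_levels) - 1, "B": len(b_levels) - 1}
    df["AxB"] = df["A"] * df["B"]
    df["error"] = n * len(cells) - len(cells)  # N minus number of cells

    ms_error = ss_error / df["error"]
    f = {name: (ss[name] / df[name]) / ms_error for name in ss}
    return {"ss": {**ss, "error": ss_error}, "df": df, "F": f}


# Made-up scores, A = duration, B = modality.
scores = {
    ("200ms", "picture"): [10, 12, 14],
    ("200ms", "word"): [6, 8, 10],
    ("800ms", "picture"): [4, 6, 8],
    ("800ms", "word"): [8, 10, 12],
}
result = two_way_anova(scores)
print(result["df"])
print(result["F"])
```

Each F is the effect’s Mean Square divided by the Error Mean Square, and the four sums of squares add up to the total variability around the grand mean.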
Main effect of Duration is still observable in the graph 200 msec Average 800 msec Average
Interpretation of the Modality by Duration Interaction Several ways of describing the interaction:
Interpretation of the Modality by Duration Interaction “…At the 200 msec duration pictures resulted in scores approximately 20 points higher than words, whereas at the 800 msec duration the opposite pattern was true, with pictures producing scores approximately 20 points below words, F(1,16) = 7.2, p = .017…”
Interpretation of the Modality by Duration Interaction “For words there was a small increase in performance going from the 200 msec (M = …) to the 800 msec duration. With pictures, however, there was a large decrease in performance…”
Alternative Plot (same data): At the 200 msec duration performance was better with pictures (M = 144) than words (M = 124), whereas at the 800 msec duration the opposite was true, with words giving better performance (M = 127) than pictures (M = 103), F(1,16) = 7.2, p = .017.
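The link between the cell means, the main effects and the interaction can be checked with a few lines of Python. The sketch uses the rounded cell means quoted for this plot (pictures 144 and words 124 at 200 msec; words 127 and pictures 103 at 800 msec) and assumes equal cell sizes, so marginal means are simple averages. The helper names are illustrative.

```python
# Rounded cell means as quoted on the slides.
cell_means = {
    ("200 msec", "Picture"): 144,
    ("200 msec", "Word"): 124,
    ("800 msec", "Picture"): 103,
    ("800 msec", "Word"): 127,
}


def marginal(factor_index, level):
    """Average the cell means over the other factor (equal n per cell assumed)."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)


def picture_advantage(duration):
    """Simple effect of modality at one duration: pictures minus words."""
    return cell_means[(duration, "Picture")] - cell_means[(duration, "Word")]


print(marginal(0, "200 msec"), marginal(0, "800 msec"))  # duration means
print(marginal(1, "Picture"), marginal(1, "Word"))       # modality means
print(picture_advantage("200 msec"), picture_advantage("800 msec"))
```

The duration means come out at 134 and 115, matching the values reported for the main effect of Duration. The modality means are almost identical (123.5 vs 125.5), and the picture advantage reverses sign across durations (+20 vs -24), which is the interaction.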
Extension to factors with 3 Levels 10 extra participants at 500 msec duration - 5 with Words, 5 with Pictures 2 Factor Between (2 levels x 3 levels).sav
The analysis is the same; however, the interpretation of the main effect of DURATION is a little more complex. Note the increased degrees of freedom for Duration and for the interaction.
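The change in degrees of freedom can be worked out with a small Python sketch. It assumes a fully between subjects design, where the error df is N minus the number of cells (consistent with F(1,16) for 20 participants in the 2 x 2 example). The function name is illustrative.

```python
from itertools import combinations
from math import prod


def between_subjects_df(levels, n_total):
    """levels: factor name -> number of levels; n_total: number of participants."""
    df = {}
    names = list(levels)
    # Main effects (size 1) and every interaction (size 2 and above):
    # multiply (levels - 1) across the factors involved.
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            df[" x ".join(combo)] = prod(levels[name] - 1 for name in combo)
    df["Error"] = n_total - prod(levels.values())  # N minus number of cells
    df["Total"] = n_total - 1
    return df


print(between_subjects_df({"Duration": 2, "Modality": 2}, 20))
print(between_subjects_df({"Duration": 3, "Modality": 2}, 30))
```

Adding the 500 msec level raises the df for Duration and for the interaction from 1 to 2, and the 10 extra participants raise the error df from 16 to 24.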
Duration Profile Plot: A significant F test only says that ‘not all the means are equal’
To examine individual pair-wise comparisons: 1. If you make a priori predictions about which means you are interested in comparing: • You can use simple t tests (LSD) for 3 means • Sidak or Bonferroni for a greater number of comparisons. 2. If you want to make post hoc comparisons: • You can use Tukey’s test
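As a rough guide to what the Bonferroni and Sidak corrections do, this Python sketch computes the per-comparison alpha each one implies for a family of m comparisons. It uses the standard textbook formulas; SPSS reports adjusted p values rather than adjusted alphas, so this is only a sketch of the idea, and the function names are illustrative.

```python
def n_pairwise(k_means):
    """Number of distinct pairs among k means."""
    return k_means * (k_means - 1) // 2


def bonferroni_alpha(alpha, m):
    """Per-comparison alpha: divide the familywise alpha by m."""
    return alpha / m


def sidak_alpha(alpha, m):
    """Per-comparison alpha that gives exactly alpha familywise for m independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


for k in (3, 4, 5):
    m = n_pairwise(k)
    print(k, m, round(bonferroni_alpha(0.05, m), 5), round(sidak_alpha(0.05, m), 5))
```

Sidak is always slightly less strict than Bonferroni, and both become more demanding as the number of comparisons grows.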