1 / 51

Topic 24: Two-Way ANOVA

Topic 24: Two-Way ANOVA. Outline. Two-way ANOVA Data Cell means model Parameter estimates Factor effects model. Two-Way ANOVA. The response variable Y is continuous There are two categorical explanatory variables or factors. Data for two-way ANOVA. Y is the response variable

duke
Download Presentation

Topic 24: Two-Way ANOVA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Topic 24: Two-Way ANOVA

  2. Outline • Two-way ANOVA • Data • Cell means model • Parameter estimates • Factor effects model

  3. Two-Way ANOVA • The response variable Y is continuous • There are two categorical explanatory variables or factors

  4. Data for two-way ANOVA • Y is the response variable • Factor A with levels i = 1 to a • Factor B with levels j = 1 to b • Yijk is the kth observation in cell (i,j) • In Chapter 19, we assume equal sample size in each cell (nij=n)

  5. KNNL Example • KNNL p 833 • Y is the number of cases of bread sold • A is the height of the shelf display, a=3 levels: bottom, middle, top • B is the width of the shelf display, b=2 levels: regular, wide • n=2 stores for each of the 3x2=6 treatment combinations (nT=12)

  6. Read the data data a1; infile ‘../data/ch19ta07.txt'; input sales height width; proc print data=a1; run;

  7. The data Obs sales height width 1 47 1 1 2 43 1 1 3 46 1 2 4 40 1 2 5 62 2 1 6 68 2 1 7 67 2 2 8 71 2 2 9 41 3 1 10 39 3 1 11 42 3 2 12 46 3 2

  8. Notation • For Yijk we use • i to denote the level of the factor A • j to denote the level of the factor B • k to denote the kth observation in cell (i,j) • i = 1, . . . , a levels of factor A • j = 1, . . . , b levels of factor B • k = 1, . . . , n observations in cell (i,j)

  9. Model • We assume that the response variable observations are • Normally distributed • With a mean that may depend on the levels of the factors A and B • With a constant variance • Independent

  10. Cell Means Model • Yijk = μij + εijk • where μij is the theoretical mean or expected value of all observations in cell (i,j) • the εijk are iid N(0, σ2) • This means Yijk ~ N(μij, σ2), independent • The parameters of the model are • μij, for i = 1 to a and j = 1 to b • σ2

  11. Estimates • Estimate μij by the mean of the observations in cell (i,j), • For each (i,j) combination, we can get an estimate of the variance • We need to combine these to get an estimate of σ2

  12. Pooled estimate of σ2 • In general we pool the sij2, using weights proportional to the df, nij -1 • The pooled estimate is s2 = (Σ (nij-1)sij2) / (Σ(nij-1)) • Here, nij = n, so s2 = (Σsij2) / (ab), which is the average sample variance

  13. Run proc glm proc glm data=a1; class height width; model sales= height width height*width; means height width height*width; run;

  14. Output

  15. Means statement height

  16. Means statement width

  17. Means statement ht*w

  18. Code the factor levels data a1; set a1; if height eq 1 and width eq 1 then hw='1_BR'; if height eq 1 and width eq 2 then hw='2_BW'; if height eq 2 and width eq 1 then hw='3_MR'; if height eq 2 and width eq 2 then hw='4_MW'; if height eq 3 and width eq 1 then hw='5_TR'; if height eq 3 and width eq 2 then hw='6_TW';

  19. Plot the data symbol1 v=circle i=none; proc gplot data=a1; plot sales*hw/frame; run;

  20. The plot

  21. Put the means in a2 proc means data=a1; var sales; by height width; output out=a2 mean=avsales; proc print data=a2; run;

  22. Output Data Set Obs height width _TYPE_ _FREQ_ avsales 1 1 1 0 2 45 2 1 2 0 2 43 3 2 1 0 2 65 4 2 2 0 2 69 5 3 1 0 2 40 6 3 2 0 2 44

  23. Plot the means symbol1 v=square i=join c=black; symbol2 v=diamond i=join c=black; proc gplot data=a2; plot avsales*height=width/frame; run;

  24. The interaction plot

  25. Questions to consider • Does the height of the display affect sales? If yes, compare top with middle, top with bottom, and middle with bottom • Does the width of the display affect sales? If yes, compare regular and wide

  26. But wait!!! Are these factor level comparisons meaningful? • Does the effect of height on sales depend on the width? • Does the effect of width on sales depend on the height? • If yes, we have an interaction and we need to do some additional analysis

  27. Factor effects model • For the one-way ANOVA model, we wrote μi = μ + αi • Here we use μij = μ + αi + βj + (αβ)ij • Under “common” formulation • μ (μ.. in KNNL) is the “overall mean” • αi is the main effect of A • βj is the main effect of B • (αβ)ij is the interaction between A and B

  28. Factor effects model • μ = (Σij μij)/(ab) • μi. = (Σj μij)/b and μ.j = (Σi μij)/a • αi = μi. – μ and βj = μ.j - μ • (αβ)ij is difference between μij and μ + αi + βj • (αβ)ij = μij - (μ + (μi. - μ)+ (μ.j - μ)) = μij – μi. – μ.j + μ

  29. Interpretation • μij = μ + αi + βj + (αβ)ij • μ is the “overall” mean • αi is an adjustment for level i of A • βj is an adjustment for level j of B • (αβ)ij is an additional adjustment that takes into account both i and j that cannot be explained by the previous adjustments

  30. Constraints for this framework • α. = Σi αi= 0 • β. = Σjβj = 0 • (αβ).j = Σi (αβ)ij = 0 for all j • (αβ)i. = Σj (αβ)ij = 0 for all i

  31. Estimates for factor effects model

  32. SS for ANOVA Table

  33. df for ANOVA Table • dfA = a-1 • dfB = b-1 • dfAB = (a-1)(b-1) • dfE = ab(n-1) • dfT = abn-1 = nT-1

  34. MS for ANOVA Table • MSA = SSA/dfA • MSB = SSB/dfB • MSAB = SSAB/dfAB • MSE = SSE/dfE • MST = SST/dfT

  35. Hypotheses for two-way ANOVA • H0A: αi = 0 for all i • H1A: αi ≠ 0 for at least one i • H0B: βj = 0 for all j • H1B: βj ≠ 0 for at least one j • H0AB: (αβ)ij = 0 for all (i,j) • H1AB: (αβ)ij ≠ 0 for at least one (i,j)

  36. F statistics • H0A is tested by FA = MSA/MSE; df=dfA, dfE • H0B is tested by FB = MSB/MSE; df=dfB, dfE • H0AB is tested by FAB = MSAB/MSE; df=dfAB, dfE

  37. ANOVA Table Source df SS MS F A a-1 SSA MSA MSA/MSE B b-1 SSB MSB MSB/MSE AB (a-1)(b-1) SSAB MSAB MSAB/MSE Error ab(n-1) SSE MSE _ Total abn-1 SSTO MST

  38. P-values • P-values are calculated using the F(dfNumerator, dfDenominator) distributions • If P ≤ 0.05 we conclude that the effect being tested is statistically significant

  39. KNNL Example • NKNW p 833 • Y is the number of cases of bread sold • A is the height of the shelf display, a=3 levels: bottom, middle, top • B is the width of the shelf display, b=2: regular, wide • n=2 stores for each of the 3x2 treatment combinations

  40. PROC GLM proc glm data=a1; class height width; model sales= height width height*width; run;

  41. Output Note that there are 6 cells in this design…(6-1)df for model

  42. Output ANOVA Note Type I and Type III Analyses are the same because nij is constant

  43. Other output Commonly do not consider R-sq when performing ANOVA…interested more in difference in levels rather than the models predictive ability

  44. Results • The main effect of height is statistically significant (F=74.71; df=2,6; P<0.0001) • The main effect of width is not statistically significant (F=1.16; df=1,6; P=0.32) • The interaction between height and width is not statistically significant (F=1.16; df=2,6; P=0.37)

  45. Interpretation • The height of the display affects sales of bread • The width of the display has no apparent effect • The effect of the height of the display is similar for both the regular and the wide widths

  46. Plot of the means

  47. Additional analyses • We will need to do additional analyses to explain the height effect (factor A) • There were three levels: bottom, middle and top • We could rerun the data with a one-way anova and use the methods we learned in the previous chapters • Use means statement with lines

  48. Run Proc GLM proc glm data=a1; class height width; model sales= height width height*width; means height / tukey lines; lsmeans height / adjust=tukey; run;

  49. MEANS Output

  50. LSMEANS Output

More Related