Experimental Statistics - week 7

Presentation Transcript

  1. Experimental Statistics - week 7 Chapter 15: Factorial Models (15.5) Chapter 17: Random Effects Models

  2. Model for 2-factor Design where
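
The model equation on this slide did not survive in the transcript. A standard way to write the two-factor fixed-effects model, assuming the common notation (the symbols \alpha, \beta, (\alpha\beta) are an assumption and may differ from the textbook's), is:

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,a;\quad j = 1,\dots,b;\quad k = 1,\dots,n

\text{where } \varepsilon_{ijk} \sim N(0, \sigma^2) \text{ independently, and }
\sum_i \alpha_i = \sum_j \beta_j = \sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0 .
```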

  3. Hypothetical Cell Means [two panels; each shows the Auditory and Visual warning types at warning times 5, 10, 15]

  4. Sum-of-Squares Breakdown (2-factor ANOVA) SSA SSB SSAB SSE
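
The formulas behind SSA, SSB, SSAB and SSE are missing from the transcript. For a balanced design with a levels of A, b levels of B and n observations per cell, the usual breakdown (a sketch in assumed dot-and-bar notation for means) is:

```latex
\underbrace{\sum_{i,j,k} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2}_{TSS}
 = \underbrace{bn \sum_i (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2}_{SSA}
 + \underbrace{an \sum_j (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2}_{SSB}
 + \underbrace{n \sum_{i,j} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2}_{SSAB}
 + \underbrace{\sum_{i,j,k} (y_{ijk} - \bar{y}_{ij\cdot})^2}_{SSE}
```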

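  5. 2-Factor ANOVA Table (2-Factor Completely Randomized Design) (page 900)

     Source          SS      df                MS                              F
     Main Effects
       A             SSA     a - 1             MSA = SSA/(a - 1)               MSA/MSE
       B             SSB     b - 1             MSB = SSB/(b - 1)               MSB/MSE
     Interaction
       AB            SSAB    (a - 1)(b - 1)    MSAB = SSAB/[(a - 1)(b - 1)]    MSAB/MSE
     Error           SSE     ab(n - 1)         MSE = SSE/[ab(n - 1)]
     Total           TSS     abn - 1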

  6. Hypotheses: Main Effects: Interactions:
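
The hypotheses themselves are not preserved in the transcript. Assuming the two-factor model y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} (notation assumed, not taken from the slide), they are usually stated as:

```latex
\text{Main effect of A:}\quad H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0
  \quad\text{vs.}\quad H_a: \text{at least one } \alpha_i \neq 0

\text{Main effect of B:}\quad H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0
  \quad\text{vs.}\quad H_a: \text{at least one } \beta_j \neq 0

\text{Interaction:}\quad H_0: (\alpha\beta)_{ij} = 0 \text{ for all } i, j
  \quad\text{vs.}\quad H_a: \text{at least one } (\alpha\beta)_{ij} \neq 0
```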

  7. STIMULUS EXAMPLE: A personal computer presents a stimulus, and a person responds. Study of how RESPONSE TIME is affected by a WARNING given prior to the stimulus. 2 factors of interest:
     Warning Type: auditory or visual
     Time between warning and stimulus: 5 sec, 10 sec, or 15 sec.

  8.                  Auditory    Visual
     Warning Time
       5 sec            .204       .257
                        .170       .279
                        .181       .269
       10 sec           .167       .283
                        .182       .235
                        .187       .260
       15 sec           .202       .256
                        .198       .281
                        .236       .258

  9. GLM Output: Stimulus Data

     The GLM Procedure
     Dependent Variable: response

     Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
     Model               5        0.02554894     0.00510979      17.66    <.0001
     Error              12        0.00347200     0.00028933
     Corrected Total    17        0.02902094

     R-Square    Coeff Var    Root MSE    response Mean
     0.880362     7.458622    0.017010         0.228056

     Source        DF     Type I SS    Mean Square    F Value    Pr > F
     type           1    0.02354450     0.02354450      81.38    <.0001
     time           2    0.00115811     0.00057906       2.00    0.1778
     type*time      2    0.00084633     0.00042317       1.46    0.2701
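
As a cross-check on the GLM output, the sums of squares can be recomputed directly from the stimulus data. The sketch below is a minimal balanced two-way ANOVA in plain Python, not the SAS algorithm; the names `stimulus`, `mean` and `two_way_anova` are made up for this illustration, and the data layout (three replicates per cell) is read off the data slide.

```python
from itertools import product

# Response times keyed by (warning type, warning time in sec), from the data slide.
stimulus = {
    ("A", 5):  [0.204, 0.170, 0.181],
    ("V", 5):  [0.257, 0.279, 0.269],
    ("A", 10): [0.167, 0.182, 0.187],
    ("V", 10): [0.283, 0.235, 0.260],
    ("A", 15): [0.202, 0.198, 0.236],
    ("V", 15): [0.256, 0.281, 0.258],
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def two_way_anova(cells):
    """Balanced two-factor ANOVA. Returns {source: (SS, df, MS, F)}."""
    a_levels = sorted({key[0] for key in cells})
    b_levels = sorted({key[1] for key in cells})
    a, b = len(a_levels), len(b_levels)
    n = len(next(iter(cells.values())))  # replicates per cell, assumed equal
    grand = mean(y for ys in cells.values() for y in ys)
    a_mean = {i: mean(y for j in b_levels for y in cells[(i, j)]) for i in a_levels}
    b_mean = {j: mean(y for i in a_levels for y in cells[(i, j)]) for j in b_levels}
    cell_mean = {key: mean(ys) for key, ys in cells.items()}

    ss = {
        "A": b * n * sum((a_mean[i] - grand) ** 2 for i in a_levels),
        "B": a * n * sum((b_mean[j] - grand) ** 2 for j in b_levels),
        "AB": n * sum(
            (cell_mean[(i, j)] - a_mean[i] - b_mean[j] + grand) ** 2
            for i, j in product(a_levels, b_levels)
        ),
        "Error": sum((y - cell_mean[key]) ** 2 for key, ys in cells.items() for y in ys),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
    mse = ss["Error"] / df["Error"]

    table = {}
    for source in ("A", "B", "AB"):
        ms = ss[source] / df[source]
        table[source] = (ss[source], df[source], ms, ms / mse)
    table["Error"] = (ss["Error"], df["Error"], mse, None)
    return table

anova = two_way_anova(stimulus)
for source, (ss, df, ms, f) in anova.items():
    f_text = "" if f is None else f"{f:8.2f}"
    print(f"{source:6s} {df:3d} {ss:.8f} {ms:.8f} {f_text}")
```

Here A is warning type and B is warning time, so the rows should line up with the type, time and type*time rows of the GLM output. Because the design is balanced, the Type I sums of squares in that output coincide with these textbook formulas.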

  10. Testing Procedure: 2 factor CRD Design
      Step 1. Test for interaction.
      Step 2. (a) IF there IS NOT a significant interaction - test the main effects
              (b) IF there IS a significant interaction - compare cell means

  11. Stimulus Example Test for Interaction: from the GLM output, F = 1.46 for type*time (p = 0.2701). Therefore we DO NOT reject the null hypothesis of no interaction.

  12. Stimulus Data

  13. Stimulus Example Test for Interaction: from the GLM output, F = 1.46 for type*time (p = 0.2701). Therefore we DO NOT reject the null hypothesis of no interaction. Thus, based on the testing procedure, we next test for main effects.

  14. Testing Main Effects: For each main effect (i.e. A and B). Note: I’ll use LSD from this point on unless otherwise noted. In General: where N denotes the # of observations involved in the computation of a marginal mean.
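
The LSD formula is missing from the transcript. The form below is a reconstruction, not a copy of the slide; it reproduces the "Least Significant Difference" values printed in the SAS output in this deck (for example 0.0175 with N = 9 and 0.0214 with N = 6 for the stimulus data):

```latex
LSD = t_{\alpha/2,\; df_E} \sqrt{\frac{2\, MSE}{N}}

\text{Declare two marginal means different if } |\bar{y}_1 - \bar{y}_2| > LSD .
```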

  15.                 Auditory    Visual
      Warning Time
        5 sec           .204       .257
                        .170       .279
                        .181       .269
        10 sec          .167       .283
                        .182       .235
                        .187       .260
        15 sec          .202       .256
                        .198       .281
                        .236       .258

  16. Stimulus Example Test for Main Effects:
      A (type): F = 81.38, p < .0001
      B (time): F = 2.00, p = 0.1778
      Thus, there is a significant effect due to type but not time, i.e. we can use LSD to compare marginal means for type. We will do this here for illustration, although multiple comparisons (MC) are not needed when there are only 2 groups.

  17. GLM Output -- Comparing “Types”

      The GLM Procedure
      t Tests (LSD) for response

      NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

      Alpha                              0.05
      Error Degrees of Freedom             12
      Error Mean Square              0.000289
      Critical Value of t             2.17881
      Least Significant Difference     0.0175

      Means with the same letter are not significantly different.

      t Grouping        Mean    N    type
      A             0.264222    9    V
      B             0.191889    9    A

  18. GLM Output -- Comparing “Times”

      The GLM Procedure
      t Tests (LSD) for response

      NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

      Alpha                              0.05
      Error Degrees of Freedom             12
      Error Mean Square              0.000289
      Critical Value of t             2.17881
      Least Significant Difference     0.0214

      Means with the same letter are not significantly different.

      t Grouping        Mean    N    time
      A             0.238500    6    15
      A             0.226667    6    5
      A             0.219000    6    10
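
The two "Least Significant Difference" values in the SAS output can be recomputed by hand. The sketch below assumes the usual formula LSD = t * sqrt(2 * MSE / N) and takes the critical t value (2.17881, 12 error df, alpha = 0.05) and the error mean square straight from the GLM output rather than computing them; the function name `lsd` is made up for this example.

```python
import math

def lsd(t_crit, mse, n_per_mean):
    """Least significant difference for two means, each based on n_per_mean observations."""
    return t_crit * math.sqrt(2.0 * mse / n_per_mean)

T_CRIT = 2.17881   # critical t from the SAS output (12 error df, alpha = 0.05)
MSE = 0.00028933   # error mean square from the GLM output

lsd_type = lsd(T_CRIT, MSE, 9)   # each type mean averages 9 observations
lsd_time = lsd(T_CRIT, MSE, 6)   # each time mean averages 6 observations

# Marginal means as printed in the SAS output.
type_means = {"V": 0.264222, "A": 0.191889}
time_means = {15: 0.238500, 5: 0.226667, 10: 0.219000}

type_differs = abs(type_means["V"] - type_means["A"]) > lsd_type
largest_time_gap = max(time_means.values()) - min(time_means.values())
any_time_differs = largest_time_gap > lsd_time

print(f"LSD (type) = {lsd_type:.4f}, types differ: {type_differs}")
print(f"LSD (time) = {lsd_time:.4f}, any times differ: {any_time_differs}")
```

The comparison agrees with the letter groupings: the two warning types fall in different groups, while all three warning times share group A.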

  19. Pilot Plant Data
      Variable = Chemical Yield
      Factors: A – Temperature (160, 180)
               B – Catalyst (C1, C2)

                     Temperature
      Catalyst       160     180
        C1            59      74
                      61      70
                      50      69
                      58      67
        C2            50      81
                      54      85
                      46      79
                      44      81

  20. Pilot Plant -- Probability Plot of Residuals

  21. DATA one;
        INPUT temp catalyst $ yield;
        DATALINES;
      160 C1 59
      160 C1 61
      . . .
      180 C2 79
      180 C2 81
      ;
      PROC GLM;
        CLASS temp catalyst;
        MODEL yield = temp catalyst temp*catalyst;
        TITLE 'Pilot Plant Example -- 2-way ANOVA';
        MEANS temp catalyst / LSD;     /* LSD comparisons of the marginal means */
      RUN;
      PROC SORT; BY temp catalyst;
      PROC MEANS; BY temp catalyst;
        OUTPUT OUT=cells MEAN=yield;   /* cell means saved to data set "cells" */
      RUN;

  22. Pilot Plant -- GLM Output

      Pilot Plant Example -- 2-way ANOVA
      General Linear Models Procedure
      Dependent Variable: YIELD

      Source             DF    Sum of Squares     Mean Square    F Value    Pr > F
      Model               3      2525.0000000     841.6666667      58.05    0.0001
      Error              12       174.0000000      14.5000000
      Corrected Total    15      2699.0000000

      R-Square        C.V.     Root MSE    YIELD Mean
      0.935532    5.926672    3.8078866     64.250000

      Source           DF       Type I SS     Mean Square    F Value    Pr > F
      TEMP              1    2116.0000000    2116.0000000     145.93    0.0001
      CATALYST          1       9.0000000       9.0000000       0.62    0.4461
      TEMP*CATALYST     1     400.0000000     400.0000000      27.59    0.0002

  23. RECALL: Testing Procedure: 2 factor CRD Design
      Step 1. Test for interaction.
      Step 2. (a) IF there IS NOT a significant interaction - test the main effects
              (b) IF there IS a significant interaction - compare cell means

  24. Pilot Plant Example Test for Interaction: from the GLM output, F = 27.59 for TEMP*CATALYST (p = 0.0002). Therefore we reject the null hypothesis of no interaction and conclude that there is an interaction between temperature and catalyst. Thus, we DO NOT test main effects.

  25. Since there is a significant interaction, we do not test for main effects! - instead compare “Cell Means” - NOTE: interaction plot is a plot of the cell means

  26. Pilot Plant Data
      Variable = Chemical Yield
      Factors: A – Temperature (160, 180)
               B – Catalyst (C1, C2)

                     Temperature
      Catalyst       160     180
        C1            59      74
                      61      70
                      50      69
                      58      67
        C2            50      81
                      54      85
                      46      79
                      44      81

  27. Pilot Plant Data -- cell means

                     Temperature
      Catalyst       160      180
        C1           57.0     70.0
        C2           48.5     81.5

  28. Comparing Cell Means: If there is significant interaction, then we compare the a x b cell means using the criteria below. Procedure similar to that for comparing marginal means: where N denotes the # of observations involved in the computation of a cell mean.

  29. GLM Output -- Comparing “Temps”

      The GLM Procedure
      t Tests (LSD) for yield

      NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

      Alpha                              0.05
      Error Degrees of Freedom             12
      Error Mean Square                  14.5
      Critical Value of t             2.17881
      Least Significant Difference     4.1483

      Means with the same letter are not significantly different.

      t Grouping      Mean    N    temp
      A             75.750    8    180
      B             52.750    8    160

      - disregard

  30. GLM Output -- Comparing “Catalysts”

      The GLM Procedure
      t Tests (LSD) for yield

      NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

      Alpha                              0.05
      Error Degrees of Freedom             12
      Error Mean Square                  14.5
      Critical Value of t             2.17881
      Least Significant Difference     4.1483

      Means with the same letter are not significantly different.

      t Grouping      Mean    N    catalyst
      A             65.000    8    C2
      A             63.500    8    C1

      - disregard

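  31. Note: - The MEANS statement with the LSD option compares only the marginal (main-effect) means, so this SAS run does not provide a comparison of cell means (the LSMEANS statement with the PDIFF option can compare cell means, but it is not used here)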
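
Since the cell means have to be compared separately, here is a minimal by-hand sketch in Python. It assumes the LSD formula t * sqrt(2 * MSE / N) with N = 4 observations per cell, and it takes the critical t (2.17881) and the error mean square (14.5) from the pilot plant GLM output; the variable names are made up for this illustration.

```python
import math
from itertools import combinations

# Pilot plant yields keyed by (temperature, catalyst), from the data slide.
pilot = {
    (160, "C1"): [59, 61, 50, 58],
    (180, "C1"): [74, 70, 69, 67],
    (160, "C2"): [50, 54, 46, 44],
    (180, "C2"): [81, 85, 79, 81],
}

T_CRIT = 2.17881  # critical t from the SAS output (12 error df, alpha = 0.05)
MSE = 14.5        # error mean square from the GLM output
N_PER_CELL = 4    # observations behind each cell mean

cell_means = {key: sum(ys) / len(ys) for key, ys in pilot.items()}
lsd_cell = T_CRIT * math.sqrt(2.0 * MSE / N_PER_CELL)

# Compare every pair of cell means against the LSD.
comparisons = []
for (key1, m1), (key2, m2) in combinations(cell_means.items(), 2):
    diff = abs(m1 - m2)
    comparisons.append((key1, key2, diff, diff > lsd_cell))

print(f"LSD for cell means = {lsd_cell:.4f}")
for key1, key2, diff, significant in comparisons:
    flag = "differ" if significant else "do not differ"
    print(f"{key1} vs {key2}: |diff| = {diff:.1f} -> {flag}")
```

Note that the LSD for cell means is larger than the 4.1483 printed for the marginal means, because each cell mean rests on 4 observations rather than 8.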

  32. NOTE: I will be out of the office tomorrow.

  33. Testing Procedure Revisited: 2 factor CRD Design
      Step 1. Test for interaction.
      Step 2. (a) IF there IS NOT a significant interaction - test the main effects
              (b) IF there IS a significant interaction - compare a x b cell means (by hand)
      Main Idea: We are trying to determine whether the factors affect the response either individually or collectively.

  34. Statistics 5372: Experimental Statistics
      Assignment Report Form
      Name:
      Lecture Assigned:
      Data Set or Problem Description
      Key Results of the Analysis
      Conclusions in the Language of the Problem
      Appendices:
        A. Tables and Figures Cited in the Report
        B. SAS Log from the Final SAS Run
      Notes:
        1. All assignments should be typed using a word processor according to the format above.
        2. SAS output should consist only of tables and figures cited in the report. The report should refer to these tables and figures using numbers you assign, i.e. Table 1, etc.
        3. The data should be listed somewhere in the report. (within SAS code is ok)

  35. Homework Assignment Due March 1, 2005 15.41, page 935 In this problem the authors consider two measures of the stability of a drug: MG/ML and pH. They ran a 2-factor ANOVA for each of these response variables using storage time and laboratory used in the analysis as the classification variables. There are 4 storage times considered and 2 labs. The data are in the table on page 935 and the resulting 2-factor ANOVA tables are shown on 935-936. Using SAS, reproduce the ANOVA tables given in the book, and complete an assignment report form for the two analyses.

  36.                 Auditory    Visual
      Warning Time
        5 sec           .204       .257
                        .170       .279
                        .181       .269
        10 sec          .167       .283
                        .182       .235
                        .187       .260
        15 sec          .202       .256
                        .198       .281
                        .236       .258

  37. Note: For balanced designs, i.e. for the STIMULUS data, the overall mean equals the average of the marginal means of either factor: .228 = (.227+.219+.239)/3 = (.192+.264)/2
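
The identity in this note can be checked numerically. The sketch below recomputes the grand mean and the marginal means of the stimulus data from the raw observations (variable names are illustrative only) and confirms that, with equal cell sizes, averaging the marginal means of either factor returns the grand mean.

```python
# Response times keyed by (warning type, warning time in sec), from the data slide.
stimulus = {
    ("A", 5):  [0.204, 0.170, 0.181],
    ("V", 5):  [0.257, 0.279, 0.269],
    ("A", 10): [0.167, 0.182, 0.187],
    ("V", 10): [0.283, 0.235, 0.260],
    ("A", 15): [0.202, 0.198, 0.236],
    ("V", 15): [0.256, 0.281, 0.258],
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

grand_mean = mean(y for ys in stimulus.values() for y in ys)

# Marginal means: average over everything sharing one level of one factor.
type_means = {
    t: mean(y for (wt, _), ys in stimulus.items() if wt == t for y in ys)
    for t in ("A", "V")
}
time_means = {
    s: mean(y for (_, sec), ys in stimulus.items() if sec == s for y in ys)
    for s in (5, 10, 15)
}

avg_of_type_means = mean(type_means.values())
avg_of_time_means = mean(time_means.values())

print(f"grand mean          = {grand_mean:.6f}")
print(f"mean of type means  = {avg_of_type_means:.6f}")
print(f"mean of time means  = {avg_of_time_means:.6f}")
```

The equality holds only because every cell has the same number of observations; once a cell loses an observation, the unweighted average of marginal means and the grand mean generally drift apart.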

  38. Now Consider:

                      Auditory    Visual
      Warning Time
        5 sec           .204       .257
                        .170       .279
                        .181       .269
        10 sec          .167       .283
                        .182       .235
                        .187       .260
        15 sec          .202       .256
                        .198       .281
                        .236       .258

  39. Balanced Experimental Designs
      • Every Combination of the Factor Levels has an Equal Number of Repeats
      • Sums of Squares
          • Uniquely Calculated
          • Usual Textbook Formulas
      Unbalanced Experimental Designs
      • Not Every Combination of the Factor Levels has an Equal Number of Repeats
      • Sums of Squares
          • Not Uniquely Calculated
          • Usual Textbook Formulas Are Not Valid

  40. Unbalanced Experimental Designs
      Many Software Programs Cannot Properly Calculate Sums of Squares for Unbalanced Designs
      - they typically use “Textbook Formulas”
      SAS:
      - must use Proc GLM, not Proc ANOVA
      - Type I and Type III sums-of-squares results will not generally agree
      - use Type III sums of squares -- analysis is closest to that for “Balanced Experiments”

  41. Unbalanced Data -- GLM Output

      The GLM Procedure
      Dependent Variable: response

      Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
      Model               5        0.02547774     0.00509555      19.13    <.0001
      Error              11        0.00293050     0.00026641
      Corrected Total    16        0.02840824

      R-Square    Coeff Var    Root MSE    response Mean
      0.896843     7.112913    0.016322         0.229471

      Source        DF     Type I SS    Mean Square    F Value    Pr > F
      type           1    0.02309680     0.02309680      86.70    <.0001
      time           2    0.00122742     0.00061371       2.30    0.1460
      type*time      2    0.00115351     0.00057676       2.16    0.1611

      Source        DF   Type III SS    Mean Square    F Value    Pr > F
      type           1    0.02367796     0.02367796      88.88    <.0001
      time           2    0.00130085     0.00065042       2.44    0.1326
      type*time      2    0.00115351     0.00057676       2.16    0.1611
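
The transcript does not say which observation was removed to create the unbalanced data set. The sketch below rests on an inference, not on anything stated in the slides: dropping the first auditory, 5 sec value (.204) reproduces the N = 17 response mean, the error sum of squares, and the Type I sum of squares for type shown in the output. Only quantities with simple closed forms are recomputed; the Type III values require a least-squares fit and are not attempted here.

```python
# Stimulus data with the (assumed) dropped observation .204 removed from cell (A, 5).
unbalanced = {
    ("A", 5):  [0.170, 0.181],
    ("V", 5):  [0.257, 0.279, 0.269],
    ("A", 10): [0.167, 0.182, 0.187],
    ("V", 10): [0.283, 0.235, 0.260],
    ("A", 15): [0.202, 0.198, 0.236],
    ("V", 15): [0.256, 0.281, 0.258],
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

all_obs = [y for ys in unbalanced.values() for y in ys]
n_total = len(all_obs)
grand_mean = mean(all_obs)

# Error SS: variation of observations around their own cell mean.
sse = sum((y - mean(ys)) ** 2 for ys in unbalanced.values() for y in ys)
df_error = n_total - len(unbalanced)   # 17 observations minus 6 cells
mse = sse / df_error

# Type I SS for the first term in the model (type): a one-way between-groups SS
# that ignores time, with each group weighted by its own sample size.
by_type = {}
for (warning_type, _), ys in unbalanced.items():
    by_type.setdefault(warning_type, []).extend(ys)
ss_type_type1 = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in by_type.values())

# With unequal cell sizes the average of the marginal means is no longer the grand mean.
avg_of_type_means = mean(mean(ys) for ys in by_type.values())

print(f"N = {n_total}, response mean = {grand_mean:.6f}")
print(f"SSE = {sse:.8f} on {df_error} df, MSE = {mse:.8f}")
print(f"Type I SS(type) = {ss_type_type1:.8f}")
print(f"average of type means = {avg_of_type_means:.6f}")
```

The Type I value depends on type being listed first in the MODEL statement, which is one reason Type I and Type III sums of squares disagree for unbalanced data.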

  42. Model for 3-factor Factorial Design where and also, the sum over any subscript of a 2 or 3 factor interaction is zero
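
The model equation is missing from the transcript. A standard three-factor fixed-effects form, written in assumed notation (the Greek symbols are not taken from the slide), is:

```latex
y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k
         + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}
         + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}

i = 1,\dots,a;\quad j = 1,\dots,b;\quad k = 1,\dots,c;\quad l = 1,\dots,n

\text{where } \varepsilon_{ijkl} \sim N(0, \sigma^2) \text{ independently, and }
\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = 0 .
```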

  43. Sum-of-Squares Breakdown (3-factor ANOVA)
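
The breakdown itself is not preserved in the transcript. For a balanced three-factor design it has one term per line of the ANOVA table, and the degrees of freedom add up in the same way (a sketch, using the SS labels from the 3-factor ANOVA table):

```latex
TSS = SSA + SSB + SSC + SSAB + SSAC + SSBC + SSABC + SSE

abcn - 1 = (a-1) + (b-1) + (c-1) + (a-1)(b-1) + (a-1)(c-1) + (b-1)(c-1)
         + (a-1)(b-1)(c-1) + abc(n-1)
```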

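  44. 3-Factor ANOVA Table (3-Factor Completely Randomized Design) (see page 908)

      Source          SS       df                       MS                  F
      Main Effects
        A             SSA      a - 1                    MSA = SSA/df        MSA/MSE
        B             SSB      b - 1                    MSB = SSB/df        MSB/MSE
        C             SSC      c - 1                    MSC = SSC/df        MSC/MSE
      Interactions
        AB            SSAB     (a - 1)(b - 1)           MSAB = SSAB/df      MSAB/MSE
        AC            SSAC     (a - 1)(c - 1)           MSAC = SSAC/df      MSAC/MSE
        BC            SSBC     (b - 1)(c - 1)           MSBC = SSBC/df      MSBC/MSE
        ABC           SSABC    (a - 1)(b - 1)(c - 1)    MSABC = SSABC/df    MSABC/MSE
      Error           SSE      abc(n - 1)               MSE = SSE/df
      Total           TSS      abcn - 1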

  45. Popcorn Data
      Response variable -- % of kernels that popped
      • Factors
          • (A) Brand (3 brands)
          • (B) Power of Microwave (500, 600 watts)
          • (C) Time (4, 4.5 minutes)
      • n = 2 replications per cell

  46. Popcorn Data

      Brand   Power   Time   % Popped
        1      500    4.5      70.3
        1      500    4.5      91.0
        1      500    4        72.7
        1      500    4        81.9
        1      600    4.5      78.7
        1      600    4.5      88.7
        1      600    4        74.1
        1      600    4        72.1
        2      500    4.5      93.4
        2      500    4.5      76.3
        2      500    4        45.3
        2      500    4        47.6
        2      600    4.5      92.2
        2      600    4.5      84.7
        2      600    4        66.3
        2      600    4        45.7
        3      500    4.5      50.1
        3      500    4.5      81.5
        3      500    4        51.4
        3      500    4        67.7
        3      600    4.5      71.5
        3      600    4.5      80.0
        3      600    4        64.0
        3      600    4        77.0
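
To connect the popcorn data with the 3-factor ANOVA table, the sketch below computes each sum of squares from marginal and cell means using the textbook formulas for a balanced design. It is an illustration in plain Python, not the SAS analysis; the helper names (`subset_mean`, `ss`, `df`) are made up, and the column interpretation (brand, power, time, % popped) follows the factor list on the previous slide. No SAS output for these data appears in this part of the transcript, so the only check made here is that the pieces add up to the total.

```python
from itertools import product

# (brand, power in watts, time in minutes, % popped), copied from the data slide.
rows = [
    (1, 500, 4.5, 70.3), (1, 500, 4.5, 91.0), (1, 500, 4, 72.7), (1, 500, 4, 81.9),
    (1, 600, 4.5, 78.7), (1, 600, 4.5, 88.7), (1, 600, 4, 74.1), (1, 600, 4, 72.1),
    (2, 500, 4.5, 93.4), (2, 500, 4.5, 76.3), (2, 500, 4, 45.3), (2, 500, 4, 47.6),
    (2, 600, 4.5, 92.2), (2, 600, 4.5, 84.7), (2, 600, 4, 66.3), (2, 600, 4, 45.7),
    (3, 500, 4.5, 50.1), (3, 500, 4.5, 81.5), (3, 500, 4, 51.4), (3, 500, 4, 67.7),
    (3, 600, 4.5, 71.5), (3, 600, 4.5, 80.0), (3, 600, 4, 64.0), (3, 600, 4, 77.0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def subset_mean(factors, combo):
    """Mean response over rows where the given factor columns equal combo."""
    return mean(r[3] for r in rows if all(r[f] == v for f, v in zip(factors, combo)))

names = "ABC"
levels = [sorted({r[f] for r in rows}) for f in range(3)]
a, b, c = (len(lv) for lv in levels)
N = len(rows)
n = N // (a * b * c)                     # replications per cell
grand = mean(r[3] for r in rows)

ss = {}
for f in range(3):                       # main effects
    ss[names[f]] = (N / len(levels[f])) * sum(
        (subset_mean((f,), (v,)) - grand) ** 2 for v in levels[f])
for f, g in ((0, 1), (0, 2), (1, 2)):    # two-factor interactions
    per_combo = N / (len(levels[f]) * len(levels[g]))
    ss[names[f] + names[g]] = per_combo * sum(
        (subset_mean((f, g), (u, v)) - subset_mean((f,), (u,))
         - subset_mean((g,), (v,)) + grand) ** 2
        for u, v in product(levels[f], levels[g]))
ss["ABC"] = n * sum(                     # three-factor interaction
    (subset_mean((0, 1, 2), (u, v, w))
     - subset_mean((0, 1), (u, v)) - subset_mean((0, 2), (u, w))
     - subset_mean((1, 2), (v, w))
     + subset_mean((0,), (u,)) + subset_mean((1,), (v,)) + subset_mean((2,), (w,))
     - grand) ** 2
    for u, v, w in product(*levels))
ss["Error"] = sum((r[3] - subset_mean((0, 1, 2), r[:3])) ** 2 for r in rows)
tss = sum((r[3] - grand) ** 2 for r in rows)

df = {"A": a - 1, "B": b - 1, "C": c - 1,
      "AB": (a - 1) * (b - 1), "AC": (a - 1) * (c - 1), "BC": (b - 1) * (c - 1),
      "ABC": (a - 1) * (b - 1) * (c - 1), "Error": a * b * c * (n - 1)}

mse = ss["Error"] / df["Error"]
for source in df:
    ms = ss[source] / df[source]
    f_text = "" if source == "Error" else f"{ms / mse:8.2f}"
    print(f"{source:6s} {df[source]:3d} {ss[source]:12.4f} {ms:12.4f} {f_text}")
print(f"Total  {N - 1:3d} {tss:12.4f}")
```

Because the design is balanced (2 replications in each of the 12 cells), the eight sums of squares are orthogonal and sum to TSS, which is the property the breakdown slide relies on.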