Lecture 8: Multiple comparisons
• What are multiple comparisons?
• The problem of experiment-wise α error
• When do we do multiple comparisons?
• Statistical tests which control αe
• Estimating treatment effects
Bio 4118 Applied Biostatistics

What are multiple comparisons?
[Figure: frequency distributions of yield for the Control, Experimental (N) and Experimental (N+P) treatments, with means μC, μN and μN+P and the pair-wise comparisons μC:μN, μC:μN+P and μN:μN+P]
• Pair-wise comparisons of different treatments
• These comparisons may involve group means, medians, variances, etc.
• For means, done after ANOVA
• In all cases, H0 is that the groups in question do not differ.

Types of comparisons
[Figure: two panels, "Planned" and "Unplanned", each showing Y for treatments X1 to X5]
• Planned (a priori): independent of ANOVA results; theory predicts which treatments should be different.
• Unplanned (a posteriori): depend on ANOVA results; unclear which treatments should be different.
• Tests of significance are very different between the two!

Planned comparisons (a priori contrasts): catecholamine levels in stressed fish
[Figure: [Catecholamine] versus PAO2 (torr), with the predicted threshold marked]
• Comparisons of interest are determined by the experimenter beforehand, based on theory, and do not depend on ANOVA results.
• Prediction from theory: catecholamine levels increase above basal levels only after the threshold PAO2 = 30 torr is reached.
• So, compare only treatments above the threshold with treatments below it (NT = 12).

Unplanned comparisons (a posteriori contrasts): catecholamine levels in stressed fish
[Figure: [Catecholamine] versus PAO2 (torr), with the predicted relationship drawn in]
• Comparisons are determined by ANOVA results.
• Prediction from theory: catecholamine levels increase with increasing PAO2.
• So, comparisons between any pairs of treatments may be warranted (NT = 21).
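
The two comparison counts can be checked with a little arithmetic. The sketch below assumes seven treatments, which is what NT = 21 implies if every pair is compared; the 3-versus-4 split around the threshold is only one assumption consistent with NT = 12 for the planned case, and both function names are hypothetical.

```python
from math import comb

def n_pairwise(n_treatments: int) -> int:
    """Number of distinct pairs among n_treatments groups (unplanned case)."""
    return comb(n_treatments, 2)

def n_between(n_below: int, n_above: int) -> int:
    """Number of below-versus-above pairs (planned case around a threshold)."""
    return n_below * n_above

# Assumed design: 7 treatments in total, 3 on one side of the threshold, 4 on the other.
print(n_pairwise(7))    # all pairs
print(n_between(3, 4))  # only pairs that straddle the threshold
```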

The problem: controlling experiment-wise α error
[Figure: experiment-wise α (αe) versus number of treatments, nominal α = .05]
• For k independent comparisons, the probability of accepting H0 (no difference) in every one of them, when H0 is true, is (1 - α)^k.
• For 4 treatments there are k = 6 pair-wise comparisons, so (1 - α)^k = (0.95)^6 = .735 and experiment-wise α (αe) = 1 - .735 = 0.265.
• Thus we would expect to reject H0 for at least one paired comparison about 27% of the time, even if all four treatments are identical.
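
The calculation above is easy to reproduce for other numbers of treatments. This is a minimal sketch, assuming all pair-wise comparisons are made and treating them as independent (pairs that share a group mean are not strictly independent, so the result is an approximation); the function name is hypothetical.

```python
from math import comb

def experimentwise_alpha(n_treatments: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one false rejection over all pairs."""
    k = comb(n_treatments, 2)          # number of pair-wise comparisons
    return 1.0 - (1.0 - alpha) ** k    # 1 - P(accept H0 in every comparison)

for t in (2, 4, 6, 8, 10):
    print(t, comb(t, 2), round(experimentwise_alpha(t), 3))
```

For 4 treatments this gives the 0.265 quoted above.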

Controlling experiment-wise α error at nominal α by adjusting by total number of comparisons
[Figure: experiment-wise α (αe) versus number of treatments, nominal α = .05]
• To maintain αe at nominal α, we need to adjust α for each comparison by the total number of comparisons.
• In this manner, αe becomes independent of the number of treatments and/or comparisons.
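
As a sketch of what such an adjustment looks like, the code below uses the two corrections named later in the lecture, Bonferroni (α/k) and Sidak (1 - (1 - α)^(1/k)), and checks the experiment-wise error they produce. It assumes independent comparisons, and the function names are hypothetical.

```python
def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-comparison alpha under the Bonferroni adjustment."""
    return alpha / k

def sidak_alpha(alpha: float, k: int) -> float:
    """Per-comparison alpha under the Sidak adjustment."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)

def experimentwise(alpha_per_comparison: float, k: int) -> float:
    """Experiment-wise alpha for k independent comparisons."""
    return 1.0 - (1.0 - alpha_per_comparison) ** k

k = 6  # e.g. all pairs among 4 treatments
for name, adjust in (("Bonferroni", bonferroni_alpha), ("Sidak", sidak_alpha)):
    a = adjust(0.05, k)
    print(name, round(a, 5), round(experimentwise(a, k), 5))
```

Under these assumptions Sidak returns αe to the nominal 0.05, while Bonferroni lands slightly below it, which is why Bonferroni is the more conservative of the two.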

Controlling experiment-wise α error at nominal α by using modified test statistics
[Figure: probability (p) versus value of test statistic (S), with separate curves for NT = 1, 2 and 3]
• Use a modified test statistic S for pair-wise comparisons whose distribution depends on the total number of comparisons NT, such that for a given value of S the associated probability p(S) increases with NT.
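
To illustrate the idea of a statistic whose null distribution shifts with the number of groups compared, here is a small Monte Carlo sketch. It uses the range of group means (largest minus smallest) with a known unit standard error as a simplified stand-in for a range-type statistic; it is not claimed to be the S shown on the slide, and the function name, seed and simulation size are arbitrary choices.

```python
import random

def range_critical_value(n_groups: int, alpha: float = 0.05,
                         n_sim: int = 20000, seed: int = 1) -> float:
    """Simulated upper-alpha critical value of the range of n_groups
    standardized group means when H0 (all means equal) is true."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(n_sim):
        means = [rng.gauss(0.0, 1.0) for _ in range(n_groups)]
        ranges.append(max(means) - min(means))
    ranges.sort()
    return ranges[int((1.0 - alpha) * n_sim)]

for g in (2, 3, 5):
    print(g, round(range_critical_value(g), 2))
```

The critical value grows with the number of groups, so a difference of a given size is harder to declare significant when more comparisons are possible.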

Using multiple comparisons
• Use only after H0 is rejected on the basis of an ANOVA, because ANOVA is more robust and reliable than multiple comparisons.
• So, if H0 is not rejected in the original ANOVA, do not proceed to multiple comparisons.
• Note, however, that there is no universally agreed-upon method for doing multiple comparisons, and results may differ depending on which method you use.
• So, proceed with caution!

Controlling αe by adjusting individual α’s
• p is the probability associated with the t-test of the difference between one pair of means; k is the total number of comparisons.
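
With p and k defined this way, the same adjustments can be applied to the p-values themselves rather than to α. The sketch below shows the Bonferroni and Sidak forms; the list of raw p-values is invented purely for illustration, and the function names are hypothetical.

```python
def bonferroni_p(p: float, k: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, k * p)

def sidak_p(p: float, k: int) -> float:
    """Sidak-adjusted p-value."""
    return 1.0 - (1.0 - p) ** k

raw_p = [0.001, 0.012, 0.03, 0.2, 0.5, 0.8]  # hypothetical t-test p-values
k = len(raw_p)                                # total number of comparisons
for p in raw_p:
    print(p, round(bonferroni_p(p, k), 4), round(sidak_p(p, k), 4))
```

Each adjusted p-value is then compared with the nominal α. The Sidak value is never larger than the Bonferroni value, so a comparison that survives Bonferroni also survives Sidak.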

Example: Temporal variation in size of sturgeon (Model II ANOVA)
[Figure: FKLNGTH by YEAR (1954, 1958, 1965, 1966), with dam construction marked]
• Prediction: dam construction resulted in loss of large sturgeon
• Test: compare sturgeon size before and after dam construction
• H0: mean size is the same for all years

Example: Temporal variation in size of sturgeon (ANOVA results)
• Conclusion: reject H0

Multiple comparison results
[Figure: FKLNGTH by YEAR (1954, 1958, 1965, 1966), with dam construction marked]

Controlling αe by using modified test statistics

Making a choice
• Tukey’s, GT2
• Try several methods, and if you get similar results, you’re on safe ground.
• If you get differences, they are due to:
  • how conservative/liberal the test is
  • how powerful it is
• If comparisons using Bonferroni are still significant, you’re O.K.
• If comparisons using Sidak are still non-significant, you’re also O.K.

Estimating treatment effects (Model I ANOVA only!)
[Figure: growth rate λ (cm/day) versus water temperature (°C), showing the estimated effect and the range of possible effects]
• Concern is not just with whether treatments differ, but with how much they differ.
• For example, what effect does a change in water temperature of 4°C have on trout growth rate?
• Since each treatment mean has a certain precision, so too will the estimate of the effect.

Estimating treatment effects: confidence limits for group means using MS_error
• Use MS_error from the ANOVA table as an estimate of the variance of each treatment to calculate confidence intervals for treatment means.
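
The usual form of this interval, written out as a sketch: here ν_error is the error degrees of freedom from the ANOVA table and n_i is the sample size of group i (symbols chosen for this note, not taken from the slide).

$$\bar{Y}_i \;\pm\; t_{1-\alpha/2,\;\nu_{error}} \sqrt{\frac{MS_{error}}{n_i}}$$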

Estimating treatment effects: confidence limits for group means using within-group variances
• Use estimated standard deviations s_i for each treatment (group):
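
The standard single-sample form of this interval, given as a sketch with n_i the sample size of group i (notation chosen for this note):

$$\bar{Y}_i \;\pm\; t_{1-\alpha/2,\;n_i-1}\, \frac{s_i}{\sqrt{n_i}}$$

Because each group uses only its own s_i, the degrees of freedom are n_i - 1 rather than the pooled error degrees of freedom.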

Computing multiple CI’s for treatment means
• For any single group, α = .05, i.e. 5% of the time the true mean will lie outside the estimated 95% CI.
• So, if you calculate multiple CI’s, you should control for αe.
• For example, using Bonferroni α’ = α/k, we would have:
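
A sketch of the Bonferroni-adjusted interval, assuming the MS_error form of the confidence limits, with k the number of intervals computed and ν_error the error degrees of freedom (notation chosen for this note):

$$\bar{Y}_i \;\pm\; t_{1-\alpha'/2,\;\nu_{error}} \sqrt{\frac{MS_{error}}{n_i}}, \qquad \alpha' = \frac{\alpha}{k}$$

For k = 4 intervals at nominal α = .05, for instance, each interval would be computed at α’ = .0125, i.e. as a 98.75% CI.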