
Aesthetics and power in multiple testing – a contradiction?

MCP 2007, Vienna. Gerhard Hommel.


Presentation Transcript


  1. Aesthetics and power in multiple testing – a contradiction? MCP 2007, Vienna. Gerhard Hommel.

  2. Introduction: Economics and Statistics. Economics: profit is not everything. • Ethical / social component • Competing interests • Aesthetics: protection of environment, industrial art, patronage. Statistics: power is not everything. • Ethics: decisions are logical, conceivable, simple • Competing interests • Aesthetics: “beauty of mathematics” (subjective), but also the same points as for ethics.

  3. Examples for (non-)aesthetics: • Closure test: (+) principle simple to describe; (+) coherence directly obtained; (–) often very cumbersome to perform • Bonferroni-Holm: SD(α/n, α/(n-1), … , α/2, α) • Hochberg: SU(α/n, α/(n-1), … , α/2, α) • FDP, e.g. control of P(FDP > 0.2): SD(α/n, α/(n-1), α/(n-2), α/(n-3), 2α/(n-3), 2α/(n-4), … , 3α/(n-7), …): not beautiful (and not powerful)! (SD = step-down, SU = step-up, each with the listed critical values.)
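The critical-value sequences on this slide can be generated and applied in a few lines. The following is a minimal sketch, not the speaker's code: all function names are hypothetical, and the FDP constants assume the general form (⌊γi⌋+1)α/(n+⌊γi⌋+1−i) with γ = 0.2, which reproduces the sequence listed on the slide (the slide itself states no formula). Exact fractions are used so that the constants can be compared term by term.

```python
from fractions import Fraction
from math import floor

def holm_constants(n, alpha):
    """alpha/n, alpha/(n-1), ..., alpha: used step-down (Holm) or step-up (Hochberg)."""
    return [alpha / (n - i) for i in range(n)]

def fdp_constants(n, alpha, gamma):
    """Assumed form of the step-down constants for control of P(FDP > gamma)."""
    out = []
    for i in range(1, n + 1):
        k = floor(gamma * i)
        out.append((k + 1) * alpha / (n + k + 1 - i))
    return out

def step_down(pvals, crit):
    """Reject in order of increasing p-value; stop at the first p(i) > c(i)."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    rejected = set()
    for rank, i in enumerate(order):
        if pvals[i] > crit[rank]:
            break
        rejected.add(i)
    return rejected

def step_up(pvals, crit):
    """Find the largest rank with p(i) <= c(i); reject it and all smaller p-values."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    for rank in range(len(order) - 1, -1, -1):
        if pvals[order[rank]] <= crit[rank]:
            return set(order[: rank + 1])
    return set()

alpha = Fraction(1, 20)                      # alpha = .05
pvals = [0.02, 0.03, 0.04]                   # hypothetical p-values
crit = holm_constants(len(pvals), alpha)
holm_rejections = step_down(pvals, crit)     # set(): .02 > .05/3
hochberg_rejections = step_up(pvals, crit)   # {0, 1, 2}: largest p-value .04 <= .05
fdp = fdp_constants(12, alpha, Fraction(1, 5))
```

With the same constants, the step-up version rejects everything here while the step-down version rejects nothing, which shows how much the direction of stepping matters.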

  4. Logical decisions: Coherence. Coherence: when a hypothesis (= subset of the parameter space) is rejected, each of its subsets can be rejected, too. Closure test: local level α tests for all ∩-hypotheses (intersection hypotheses) + coherence ⇒ control of multiple level (FWER) α. Closure tests form a complete class within all MTPs controlling the FWER α. But: Bonferroni-Holm is not coherent, in general! Quasi-coherence: coherence for all index sets forming an intersection. Quasi-closure test: local level α tests for all index sets + quasi-coherence ⇒ control of multiple level (FWER) α.
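The closure principle is easy to state and, as the previous slide notes, cumbersome to perform, because it visits all 2^n − 1 index sets. A minimal sketch follows. It assumes unweighted Bonferroni local tests and hypotheses without logical constraints; both are illustrative choices, not taken from the talk, and the function names are hypothetical.

```python
from itertools import combinations

def bonferroni_local(index_set, pvals, alpha):
    """Local level-alpha test of the intersection hypothesis over index_set."""
    return min(pvals[i] for i in index_set) <= alpha / len(index_set)

def closure_test(pvals, alpha, local_test=bonferroni_local):
    """Reject H_i iff every intersection hypothesis containing i is rejected locally."""
    n = len(pvals)
    subsets = [frozenset(c) for r in range(1, n + 1)
               for c in combinations(range(n), r)]
    local_reject = {s: local_test(s, pvals, alpha) for s in subsets}
    return {i for i in range(n)
            if all(local_reject[s] for s in subsets if i in s)}

# Hypothetical p-values; alpha = .05
all_three = closure_test([0.01, 0.02, 0.04], 0.05)   # {0, 1, 2}
only_first = closure_test([0.01, 0.03, 0.04], 0.05)  # {0}: set {1, 2} fails, .03 > .025
```

Under these assumptions the decisions coincide with the Bonferroni-Holm step-down shortcut, which needs only n comparisons instead of 2^n − 1 local tests.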

  5. Monotonic decisions. Consider monotonicity between different hypotheses: p1, …, pn = p-values; pi ≤ pj and Hj rejected ⇒ Hi rejected. Not obligatory when there are weights for hypotheses (from importance or expected power): • See Benjamini / Hochberg (1997) • Fixed sequence tests • Gatekeeping procedures

  6. Monotonic decisions: nested hypotheses. Example: Yi = β0 + β1 xi + β2 xi² + εi; H1: β1 = β2 = 0; H2: β2 = 0. F test of H1: p = .051; t test of H2: p = .024. Bonferroni-Holm (α = .05) rejects only H2. Logical: reject H1, too, because H1 implies H2. Size of a p-value is not the only criterion for rejection!
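The arithmetic of this example can be checked directly: Holm compares .024 with .05/2 = .025 and then .051 with .05. The sketch below adds a hypothetical helper, `enforce_coherence`, that rejects a hypothesis whenever a hypothesis it implies has been rejected. It is an illustration of the slide's argument under that assumption, not a procedure given in the talk.

```python
def holm(pvals, alpha):
    """Unweighted Bonferroni-Holm step-down on a dict {name: p-value}."""
    order = sorted(pvals, key=pvals.get)
    rejected = set()
    for rank, name in enumerate(order):
        if pvals[name] > alpha / (len(order) - rank):
            break
        rejected.add(name)
    return rejected

def enforce_coherence(rejected, implies):
    """implies[h] lists hypotheses implied by h; if one of them is rejected, reject h."""
    out = set(rejected)
    changed = True
    while changed:
        changed = False
        for h, implied in implies.items():
            if h not in out and any(g in out for g in implied):
                out.add(h)
                changed = True
    return out

pvals = {"H1": 0.051, "H2": 0.024}      # values from the slide
holm_only = holm(pvals, 0.05)           # {'H2'}: .024 <= .025, then .051 > .05
coherent = enforce_coherence(holm_only, {"H1": ["H2"]})  # {'H1', 'H2'}
```

Adding H1 cannot produce a first false rejection: if H1 were true, H2 would be true as well, so the rejection of H2 would already have been an error.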

  7. Monotonic decisions: multiple comparisons. Example: comparison of k = 4 means (ANOVA); Hij: μi = μj, 1 ≤ i < j ≤ 4. p13 = .0241 < p34 = .0244 (t test; pooled variance). Closure test rejects H14, H24, H34, but not H13! (Same result with regwq.) Non-monotonicity may be reasonable: it is easier to separate group 4 from the cluster of groups 1, 2, 3 than to find differences within the cluster.

  8. Monotonic decisions My conclusion: Only for equal weights and no logical constraints, it is mandatory that • decisions are monotonic in p-values, and • decisions are exchangeable.

  9. Monotonicity within same hypothesis (α-consistency). Given p-values p1, …, pn; q1, …, qn with qi ≤ pi for i = 1, …, n. When a hypothesis is rejected based on the pi's, it should also be rejected when based on the qi's. Counterexample 1 (WAP procedure of Benjamini-Hochberg, 1997): step-down based on p(j) ≤ w(j)α/(w(j)+…+w(n)): controls the FWER, but is not α-consistent.
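A small numerical case makes the failure of α-consistency concrete. The sketch below implements the step-down rule exactly as written on the slide; the weights and p-values are hypothetical numbers chosen for illustration and do not come from the talk.

```python
def weighted_step_down(pvals, weights, alpha):
    """Step-down in order of increasing p-value with thresholds
    w(j) * alpha / (w(j) + ... + w(n)), as stated on the slide."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    remaining = sum(weights)   # total weight of hypotheses not yet rejected
    rejected = set()
    for i in order:
        if pvals[i] > weights[i] * alpha / remaining:
            break
        rejected.add(i)
        remaining -= weights[i]
    return rejected

weights = [9, 1]     # hypothetical: H1 carries 90% of the weight
alpha = 0.05
p = [0.04, 0.045]    # H1 ordered first: .04 <= .045, then .045 <= .05
q = [0.04, 0.03]     # q_i <= p_i, but now H2 is ordered first: .03 > .005, stop

with_p = weighted_step_down(p, weights, alpha)   # {0, 1}
with_q = weighted_step_down(q, weights, alpha)   # set()
```

Lowering the p-value of H2 changes the ordering, so the lightly weighted hypothesis is tested first against a very small threshold and blocks the rejection of H1.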

  10. Monotonicity within same hypothesis (α-consistency). Counterexample 2: Tarone's (1990) MTP uses information about minimum attainable p-values α1*, …, αn*. n = 2, α1* = .03, α2* = .04: • α = .05: no Hj can be rejected; • α = .035: H1 can be rejected if p1 ≤ .035. Hommel/Krummenauer (1998): monotonic improvement of Tarone's procedure (using a “rejection function” b(α)).
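The two bullet points can be reproduced with a short sketch of Tarone's procedure. The formulation used here is an assumption, since the slide does not spell it out: take the smallest k such that at most k hypotheses have a minimum attainable p-value ≤ α/k, then test exactly those hypotheses at level α/k. The function name is hypothetical.

```python
def tarone(pvals, min_attainable, alpha):
    """Sketch of Tarone's modified Bonferroni procedure for discrete tests."""
    n = len(pvals)
    for k in range(1, n + 1):
        # Hypotheses that could reach significance at level alpha / k at all
        testable = [i for i in range(n) if min_attainable[i] <= alpha / k]
        if len(testable) <= k:
            return {i for i in testable if pvals[i] <= alpha / k}
    return set()   # not reached: the condition always holds for k = n

min_attainable = [0.03, 0.04]   # alpha_1*, alpha_2* from the slide
pvals = [0.03, 0.04]            # each p-value at its minimum attainable value

at_05 = tarone(pvals, min_attainable, 0.05)    # set(): k = 2, nothing attains .025
at_035 = tarone(pvals, min_attainable, 0.035)  # {0}: k = 1, H1 tested at .035
```

The larger level α = .05 yields fewer rejections than α = .035, which is the non-monotonicity that the Hommel/Krummenauer improvement is meant to remove.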

  11. The fallback procedure (I). Wiens (2003): “fixed sequence testing procedure” with possibility to continue. Dmitrienko, Wiens, Westfall (2005): “fallback procedure”. Wiens + Dmitrienko (2005): proof that FWER is controlled, suggestion for improvement. Two types of weights: • sequence of hypotheses; • “assigned weights” α1′, …, αn′ with Σαi′ = α.

  12. The fallback procedure (II). Use “assigned weights” α1′, …, αn′ with Σαi′ = α. Actual significance levels: α1 = α1′; αi = αi′ + αi-1 if Hi-1 has been rejected; αi = αi′ if Hi-1 has not been rejected. Special case: α1′ = α, α2′ = ... = αn′ = 0 ⇒ fixed sequence test.
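The recursion for the actual significance levels translates directly into a loop. This is a minimal sketch of the rule as stated on the slide, without the improvement suggested by Wiens + Dmitrienko; the function name and the example p-values are hypothetical.

```python
def fallback(pvals, assigned):
    """Fallback procedure: hypotheses are tested in the given order.
    The level of a rejected hypothesis is carried forward to the next one."""
    rejected = []
    carry = 0.0
    for i, (p, a) in enumerate(zip(pvals, assigned)):
        level = a + carry       # alpha_i = alpha_i' (+ alpha_{i-1} if H_{i-1} rejected)
        if p <= level:
            rejected.append(i)
            carry = level
        else:
            carry = 0.0
    return rejected

# Assigned weights .04 and .01 (alpha = .05), hypothetical p-values
both = fallback([0.03, 0.03], [0.04, 0.01])     # [0, 1]: H2 is tested at .05
none = fallback([0.045, 0.03], [0.04, 0.01])    # []: H2 is tested at .01 only

# Special case alpha_1' = alpha, all other weights 0: fixed sequence test
fixed_seq = fallback([0.01, 0.2, 0.001], [0.05, 0.0, 0.0])   # [0]: stops after H2 fails
```

In the fixed sequence case a retained hypothesis leaves level 0 for everything after it, so testing effectively stops there.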

  13. Example for n = 2. • Endpoint 1: functional capacity of heart • Endpoint 2: mortality • α = .05, α1′ = .04, α2′ = .01 • p1 ≤ .04: reject H1 and test H2 with α2 = .05. • p1 > .04: retain H1 and test H2 with α2 = .01. Weighted Bonferroni-Holm with α1′ = .04, α2′ = .01: rejects H1, in addition, when p2 ≤ .01 and .04 < p1 ≤ .05!
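The extra rejection by weighted Bonferroni-Holm can be verified numerically. The sketch below assumes that weighted Bonferroni-Holm is carried out as a closure test with weighted Bonferroni local tests and strictly positive weights; the function names and the p-values .045 and .008 are hypothetical. The fallback rule for n = 2 is coded literally from the two bullet points.

```python
from itertools import combinations

def weighted_holm(pvals, assigned):
    """Closure test with weighted Bonferroni local tests (weights must be > 0)."""
    alpha = sum(assigned)
    n = len(pvals)

    def local(s):
        total = sum(assigned[j] for j in s)
        return any(pvals[i] <= alpha * assigned[i] / total for i in s)

    subsets = [c for r in range(1, n + 1) for c in combinations(range(n), r)]
    return {i for i in range(n) if all(local(s) for s in subsets if i in s)}

def fallback_n2(p1, p2, a1, a2):
    """Fallback rule for two hypotheses, as in the bullet points above."""
    rejected = set()
    if p1 <= a1:
        rejected.add(0)
        level2 = a1 + a2     # H1 rejected: H2 gets the full alpha
    else:
        level2 = a2          # H1 retained: H2 gets only its assigned weight
    if p2 <= level2:
        rejected.add(1)
    return rejected

p1, p2 = 0.045, 0.008        # hypothetical: .04 < p1 <= .05 and p2 <= .01
wbh = weighted_holm([p1, p2], [0.04, 0.01])   # {0, 1}
fb = fallback_n2(p1, p2, 0.04, 0.01)          # {1}
```

Once H2 is rejected, weighted Bonferroni-Holm hands its weight back to H1, whereas the fallback procedure only ever passes level forward in the sequence.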

  14. Comparison with weighted Bonferroni-Holm (WBH). • For n = 2: WBH is strictly more powerful than the fallback procedure. The improvement by Wiens + Dmitrienko is identical to WBH. • For n ≥ 3: there exist situations where fallback rejects and WBH does not, and conversely (⇒ the improvement by W+D is not identical to WBH).

  15. The fallback procedure for n=3: weights for intersection hypotheses. αi′ = wi α, Σ wi = 1 (see W+D).

  16. The fallback procedure for n=3: equal weights. αi′ = wi α, wi = 1/3. Consequence for importance: H2 > H3 > H1?

  17. The fallback procedure for n=3: equal weights. αi′ = wi α, wi = 1/3. Consequence for importance: H2 > H3 > H1?

  18. The fallback procedure for n=3: equal weights; improvement by W+D. αi′ = wi α, wi = 1/3. Consequence for importance: H2 > H3 > H1 (remains).

  19. The fallback procedure for n=3: equal weights. The decisions of the fallback procedure (with equal weights) are not exchangeable (and can never become so!). Example: p(1) = .015, p(2) = .02, p(3) = 1; α = .05. (Bonferroni-Holm rejects H(1) and H(2).) • p1 < p2 < p3: reject H1, H2 • p1 < p3 < p2: reject H1 • p2 < p1 < p3: reject H2 • p2 < p3 < p1: reject H2, H3 • p3 < p1 < p2: reject H3 (, H1) • p3 < p2 < p1: reject H3
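The six cases can be enumerated mechanically by assigning the three p-values to H1, H2, H3 in every possible order. This is a minimal sketch with a hypothetical `fallback` helper that applies the basic fallback rule with equal assigned weights α/3. It does not include the W+D improvement, so the parenthesised H1 in the fifth case is not reproduced.

```python
from itertools import permutations

def fallback(pvals, assigned):
    """Basic fallback procedure: carry the level forward after each rejection."""
    rejected, carry = [], 0.0
    for i, (p, a) in enumerate(zip(pvals, assigned)):
        level = a + carry
        if p <= level:
            rejected.append(i)
            carry = level
        else:
            carry = 0.0
    return rejected

alpha = 0.05
assigned = [alpha / 3] * 3
results = {}
for perm in permutations([0.015, 0.02, 1.0]):
    # perm = (p1, p2, p3): p-values of H1, H2, H3 in testing order
    results[perm] = [f"H{i + 1}" for i in fallback(perm, assigned)]
    print(perm, "->", results[perm])
```

The same three p-values lead to one or two rejections depending only on which hypothesis they belong to, which is what the lack of exchangeability means here.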

  20. The fallback procedure: critical questions. • How are the two different types of weighting related? • Can it be meaningful to give higher assigned weights to higher indices? • Can one give “guidelines” on how to choose the weights? • Equal assigned weights: what is the influence of the ordering? (Anyway: the procedure has “aesthetic” drawbacks.) • In which situations can one expect the fallback procedure to be more powerful than WBH? • Or would it be better to abandon it completely?

  21. Thank you for your attendance! Are there more questions? Or some answers?
