

Aesthetics and power in multiple testing – a contradiction?

MCP 2007, Vienna

Gerhard Hommel


Introduction: Economics and Statistics

Economics: profit is not everything

  • Ethical / social component

  • Competing interests

  • Aesthetics: protection of environment, industrial art, patronage

Statistics: power is not everything

  • Ethics: decisions are logical, conceivable, simple

  • Competing interests

  • Aesthetics: “beauty of mathematics” (subjective), but also same points as for ethics


Examples for (non-) aesthetics:

  • Closure test

    + : principle is simple to describe

    + : coherence is obtained directly

    – : often very cumbersome to perform

  • Bonferroni-Holm: step-down, SD(α/n, α/(n-1), … , α/2, α)

  • Hochberg: step-up, SU(α/n, α/(n-1), … , α/2, α)

  • FDP, e.g. control of P(FDP > 0.2):

    SD(α/n, α/(n-1), α/(n-2), α/(n-3), 2α/(n-3), 2α/(n-4), … , 3α/(n-7), …)

    not beautiful (and not powerful)!
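The SD(...) and SU(...) notation above can be read as a step-down or step-up scan of the ordered p-values against a list of critical constants. The following is a minimal sketch under that reading; the function names and the two p-values in the usage note are illustrative assumptions, not taken from the talk.

```python
def step_down(p, crit):
    """Step-down: start at the smallest p-value, stop at the first failure."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    rejected = set()
    for rank, i in enumerate(order):
        if p[i] > crit[rank]:
            break
        rejected.add(i)
    return rejected


def step_up(p, crit):
    """Step-up: start at the largest p-value, reject everything below the first success."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    for rank in range(len(p) - 1, -1, -1):
        if p[order[rank]] <= crit[rank]:
            return set(order[: rank + 1])
    return set()


def holm_constants(n, alpha):
    """Critical constants alpha/n, alpha/(n-1), ..., alpha/2, alpha."""
    return [alpha / (n - k) for k in range(n)]
```

With the same constants, step_down gives Bonferroni-Holm and step_up gives Hochberg. For the assumed p-values (.04, .045) and α = .05, the step-down scan stops immediately because .04 > .025, while the step-up scan rejects both hypotheses because .045 ≤ .05.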


Logical decisions: Coherence

Coherence: When a hypothesis (= subset of the parameter space) is rejected, every one of its subsets can be rejected.

Closure test: Local level-α tests for all ∩-hypotheses + coherence ⇒ control of the multiple level (FWER) α.

Closure tests form a complete class within all MTPs controlling the FWER at level α.

But: Bonferroni-Holm is not coherent, in general!

Quasi-coherence: coherence for all index sets forming an intersection.

Quasi-closure test: Local level-α tests for all index sets + quasi-coherence ⇒ control of the multiple level (FWER) α.
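As an illustration of the closure principle, here is a minimal sketch that uses a Bonferroni test as the local level-α test for each intersection hypothesis. The choice of local test and the function name are assumptions made for this example. The sketch enumerates all 2^n − 1 intersection hypotheses, which also shows why a closure test can be cumbersome to perform without a shortcut.

```python
from itertools import combinations


def bonferroni_closure(p, alpha):
    """Closure test with a Bonferroni local test for every intersection hypothesis."""
    n = len(p)
    subsets = [s for r in range(1, n + 1) for s in combinations(range(n), r)]
    # Local level-alpha test of the intersection over s: reject if min p <= alpha / |s|.
    local = {s: min(p[i] for i in s) <= alpha / len(s) for s in subsets}
    # H_i is rejected iff every intersection hypothesis containing i is rejected.
    return {i for i in range(n) if all(local[s] for s in subsets if i in s)}
```

With this particular local test the closure test gives the same decisions as Bonferroni-Holm. For p = (.015, .02, 1) and α = .05 it rejects the first two hypotheses.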


Monotonic decisions

Consider monotonicity between different hypotheses:

p1, … , pn = p-values

p_i ≤ p_j and H_j rejected ⇒ H_i rejected.

Not obligatory when the hypotheses carry weights (from importance or expected power):

  • See Benjamini / Hochberg (1997)

  • Fixed sequence tests

  • Gatekeeping procedures


Monotonic decisions: nested hypotheses

Example: Y_i = β0 + β1 x_i + β2 x_i² + ε_i

H1: β1 = β2 = 0

H2: β2 = 0

F test of H1: p = .051

t test of H2: p = .024

Bonferroni-Holm (α = .05) rejects only H2

Logical: reject H1, too.

Size of a p-value is not the only criterion for rejection!
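The nested example can be checked numerically. The sketch below applies Bonferroni-Holm to the two p-values and then adds the rejections that coherence demands. The helper names and the subsets mapping are illustrative assumptions. Because H1 is a subset of H2, H2 is true whenever H1 is true, so adding H1 to the rejections cannot create a first false rejection.

```python
def holm(p, alpha):
    """Bonferroni-Holm on a dict {hypothesis name: p-value}."""
    names = sorted(p, key=p.get)
    rejected = set()
    for k, name in enumerate(names):
        if p[name] > alpha / (len(names) - k):
            break
        rejected.add(name)
    return rejected


def enforce_coherence(rejected, subsets):
    """Also reject every hypothesis that is a subset of a rejected one."""
    closed = set(rejected)
    stack = list(rejected)
    while stack:
        for h in subsets.get(stack.pop(), ()):
            if h not in closed:
                closed.add(h)
                stack.append(h)
    return closed


p_values = {"H1": 0.051, "H2": 0.024}
subsets = {"H2": ["H1"]}  # H1 (beta1 = beta2 = 0) is a subset of H2 (beta2 = 0)
holm_only = holm(p_values, 0.05)
coherent = enforce_coherence(holm_only, subsets)
```

Here holm_only contains just H2 (since .024 ≤ .025 but .051 > .05), while coherent contains both H1 and H2.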


Monotonic decisions: multiple comparisons

Example: Comparison of k=4 means (ANOVA)

H_ij: μ_i = μ_j, 1 ≤ i < j ≤ 4

p13 = .0241 < p34 = .0244 (t test; pooled variance)

Closure test rejects H14, H24, H34, but not H13!

(same result with regwq)

Non-monotonicity may be reasonable:

It is easier to separate group 4 from the cluster of groups 1,2,3 than to find differences within the cluster.


Monotonic decisions

My conclusion:

Only with equal weights and no logical constraints is it mandatory that

  • decisions are monotonic in p-values, and

  • decisions are exchangeable.


Monotonicity within same hypothesis (α-consistency)

Given p-values p1, …, pn; q1, …, qn

with q_i ≤ p_i for i=1,…,n.

When a hypothesis is rejected based on the p_i's, it should also be rejected when based on the q_i's.

Counterexample 1 (WAP procedure of Benjamini-Hochberg, 1997):

Step-down based on p(j) ≤ w(j)α/(w(j)+…+w(n)):

Controls the FWER, but is not α-consistent.
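A small numerical illustration of the failure of α-consistency follows. It implements the step-down rule exactly as written on the slide, with hypotheses ordered by their raw p-values. The weights and p-values are hypothetical numbers chosen for this sketch and do not come from the talk or from Benjamini and Hochberg (1997).

```python
def weighted_step_down(p, w, alpha):
    """Step-down on ordered p-values with thresholds w(j)*alpha / (w(j)+...+w(n))."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    rejected = set()
    for rank, i in enumerate(order):
        remaining_weight = sum(w[j] for j in order[rank:])
        if p[i] > w[i] * alpha / remaining_weight:
            break
        rejected.add(i)
    return rejected


w = [0.9, 0.1]      # hypothetical weights
p = [0.04, 0.045]   # hypothetical p-values
q = [0.04, 0.03]    # q_i <= p_i for every i
with_p = weighted_step_down(p, w, 0.05)
with_q = weighted_step_down(q, w, 0.05)
```

Based on p, both hypotheses are rejected. Based on the smaller values q, the lightly weighted hypothesis moves to the front of the ordering, fails its threshold of .005, and nothing is rejected.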


Monotonicity within same hypothesis (α-consistency)

Counterexample 2: Tarone‘s (1990) MTP

Uses information about minimum attainable p-values α1*, …, αn*

n=2, α1*=.03, α2*=.04:

  • α = .05: no Hj can be rejected;

  • α = .035: H1 can be rejected if p1 ≤ .035.

Hommel/Krummenauer (1998): monotonic improvement of Tarone's procedure (using a "rejection function" b(α))
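The n = 2 counterexample can be reproduced with a short sketch of Tarone's procedure as it is usually stated: take the smallest K for which at most K hypotheses have a minimum attainable p-value ≤ α/K, then test those hypotheses at level α/K. The function name and the observed p-values (set equal to their minimum attainable values) are assumptions made for this illustration.

```python
def tarone(p, min_attainable, alpha):
    """Return (K, rejected indices) for Tarone's modified Bonferroni procedure."""
    n = len(p)
    for k in range(1, n + 1):
        # Hypotheses that could be rejected at all at level alpha / k.
        testable = [i for i in range(n) if min_attainable[i] <= alpha / k]
        if len(testable) <= k:
            return k, [i for i in testable if p[i] <= alpha / k]
    return n, []  # not reached: the condition always holds for k = n


min_p = [0.03, 0.04]
observed = [0.03, 0.04]  # assumed: each p-value at its minimum attainable value
at_050 = tarone(observed, min_p, 0.05)
at_035 = tarone(observed, min_p, 0.035)
```

At α = .05 the procedure settles on K = 2 and level .025, which no hypothesis can reach. At the smaller α = .035 it settles on K = 1 and rejects H1, so a stricter level leads to more rejections.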


The fallback procedure (I)

Wiens (2003): "fixed sequence testing procedure" with possibility to continue

Dmitrienko, Wiens, Westfall (2005): "fallback procedure"

Wiens + Dmitrienko (2005): Proof that FWER is controlled, suggestion for improvement

Two types of weights:

  • sequence of hypotheses;

  • "assigned weights" α1′,…,αn′ with Σαi′=α.


The fallback procedure (II)

Use "assigned weights" α1′,…,αn′ with Σαi′=α.

Actual significance levels:

α1 = α1′

α_i = α_i′ + α_{i-1} if H_{i-1} has been rejected

α_i = α_i′ if H_{i-1} has not been rejected.

α1′ = α, α2′ = ... = αn′ = 0: fixed sequence test.
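The recursion for the actual significance levels translates directly into a loop. This is a minimal sketch; the function name and the p-values in the usage note are illustrative assumptions.

```python
def fallback(p, assigned):
    """Fallback procedure: hypotheses are tested in the given order.

    The actual level of hypothesis i is its assigned level plus the actual
    level of hypothesis i-1 if that hypothesis was rejected.
    """
    rejected = []
    carry = 0.0  # actual level handed on by a rejected predecessor
    for i, (p_i, a_i) in enumerate(zip(p, assigned)):
        level = a_i + carry
        if p_i <= level:
            rejected.append(i)
            carry = level
        else:
            carry = 0.0
    return rejected
```

With assigned levels (.05, 0) the loop behaves like a fixed sequence test: fallback([.06, .001], [.05, 0.0]) rejects nothing, because the second hypothesis only receives a positive level after the first has been rejected.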


Example for n = 2

  • Endpoint 1: Functional capacity of heart

  • Endpoint 2: Mortality

  • α = .05, α1′ = .04, α2′ = .01

  • p1 ≤ .04: Reject H1 and test H2 with α2 = .05.

  • p1 > .04: Retain H1 and test H2 with α2 = .01.

Weighted Bonferroni-Holm with α1′ = .04, α2′ = .01:

Rejects H1, in addition, when p2 ≤ .01 and .04 < p1 ≤ .05!
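The difference between the two procedures for n = 2 can be seen in a small sketch. The weighted Bonferroni-Holm function below is written in its usual form, rescaling the assigned levels among the hypotheses not yet rejected. The function names and the p-values (.045, .005) are assumptions chosen to fall in the region described above.

```python
def fallback(p, assigned):
    """Fallback procedure: unused level is passed on only after a rejection."""
    rejected, carry = set(), 0.0
    for i, (p_i, a_i) in enumerate(zip(p, assigned)):
        level = a_i + carry
        if p_i <= level:
            rejected.add(i)
            carry = level
        else:
            carry = 0.0
    return rejected


def weighted_holm(p, assigned, alpha):
    """Weighted Bonferroni-Holm: rescale the levels among the remaining hypotheses."""
    remaining, rejected = set(range(len(p))), set()
    while remaining:
        total = sum(assigned[j] for j in remaining)
        if total == 0:
            break
        hits = {i for i in remaining if p[i] <= assigned[i] * alpha / total}
        if not hits:
            break
        rejected |= hits
        remaining -= hits
    return rejected


p = [0.045, 0.005]  # assumed p-values with .04 < p1 <= .05 and p2 <= .01
by_fallback = fallback(p, [0.04, 0.01])
by_wbh = weighted_holm(p, [0.04, 0.01], 0.05)
```

The fallback procedure rejects only H2, because H1 has already been retained at level .04 by the time H2 is tested. Weighted Bonferroni-Holm rejects H2 first and then tests H1 at the full level .05, so it rejects both.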


Comparison with weighted Bonferroni-Holm

  • For n = 2: WBH is strictly more powerful than the fallback procedure. The improvement by Wiens + Dmitrienko is identical to WBH.

  • For n ≥ 3: There exist situations where fallback rejects and WBH does not, and conversely. (Hence the improvement by W+D is not identical to WBH.)


The fallback procedure for n=3: weights for intersection hypotheses

αi′ = wi α, Σ wi = 1 (see W+D)


The fallback procedure for n=3: equal weights

αi′ = wi α, wi = 1/3

Consequence for importance: H2 H3 H1?



The fallback procedure for n=3: equal weights; improvement by W+D

αi′ = wi α, wi = 1/3

Consequence for importance: H2 H3 H1 (remains)


The fallback procedure for n=3: equal weights

The decisions of the fallback procedure (with equal weights) are not exchangeable (and can never become so!).

Example: p(1)=.015, p(2)=.02, p(3)=1; α=.05.

(Bonferroni-Holm: rejects H(1) and H(2) )

  • p1 < p2 < p3 : reject H1, H2

  • p1 < p3 < p2 : reject H1

  • p2 < p1 < p3 : reject H2

  • p2 < p3 < p1 : reject H2, H3

  • p3 < p1 < p2 : reject H3 (, H1)

  • p3 < p2 < p1 : reject H3
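The six cases above can be reproduced by running the fallback procedure, without the W+D improvement, on every assignment of the three p-values to H1, H2, H3. This is a minimal sketch and the function name is an illustrative assumption.

```python
from itertools import permutations


def fallback(p, assigned):
    """Fallback procedure; returns the rejected hypotheses as names H1, H2, ..."""
    rejected, carry = [], 0.0
    for i, (p_i, a_i) in enumerate(zip(p, assigned)):
        level = a_i + carry
        if p_i <= level:
            rejected.append(f"H{i + 1}")
            carry = level
        else:
            carry = 0.0
    return rejected


alpha = 0.05
assigned = [alpha / 3] * 3  # equal assigned weights
decisions = {
    perm: fallback(perm, assigned)
    for perm in permutations((0.015, 0.02, 1.0))
}
for perm, rejected in decisions.items():
    print(perm, "->", rejected)
```

Each key is the tuple (p1, p2, p3). The same three p-values lead to anything between one and two rejections depending on which hypothesis they belong to, whereas Bonferroni-Holm always rejects the two hypotheses with the smallest p-values.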


The fallback procedure: critical questions

  • What are the relations of the two different types of weighting?

  • Can it be meaningful to give higher assigned weights for higher indices?

  • Can one give "guidelines" for how to choose the weights?

  • Equal assigned weights: what is the influence of the ordering? (In any case, the procedure has "aesthetic" drawbacks.)

  • For which situations can one expect that the fallback procedure is more powerful than WBH?

  • Or would it be better to abandon it completely?


Thank you for your attendance! Are there more questions? Or some answers?

