
You may believe you are a Bayesian, but you are probably wrong


Stephen Senn

- The four systems of statistical inference
- An example of where it is good to be Bayesian
- Fisher’s argument against the Neyman-Pearson approach

- Examples of experts applying ‘the Bayesian’ approach
- Adrian Smith and colleagues 1987
- Lindley, 1993
- Howson and Urbach, 1989

- Some theoretical reasons for hesitation
- Conclusion
- Why I shall (probably) still be using mongrel statistics after this conference

- This talk should not be taken as an attack on the subjective Bayesian approach to statistical inference
- I do not claim it is a bad approach
- I do claim it can be very difficult and perhaps dangerous to rely on it as the only approach

- Fisherian
- Neyman-Pearson
- Jeffreys
- Bayesian (Ramsey-De Finetti-Savage)

George Barnard’s advice was to be familiar with all four

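A two-dimensional view of the four systems

[Figure: the four systems placed on two axes, inverse probability versus direct probability and inferences versus decisions.]

- Inverse probability, inferences: Jeffreys (use of semi-objective prior distributions to produce inverse probabilities)
- Inverse probability, decisions: De Finetti (use of subjective expectation via utility)
- Direct probability, inferences: Fisher (fiducial inference, likelihood, significance tests)
- Direct probability, decisions: Neyman and Pearson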

- A monoclonal antibody
- First-in-man study on 13 March 2006 carried out by Parexel on behalf of TeGenero
- In first cohort 8 volunteers
- Six allocated TGN1412 and two allocated placebo
- All six given TGN1412 suffered a cytokine storm

See Senn SJ. Lessons from TGN1412. Applied Clinical Trials 2007;16(6):18-22.

FISHER'S EXACT TEST

Statistic based on the observed 2 by 2 table (x):

P(X) = Hypergeometric Prob. of the table = 0.0357
FI(X) = Fisher statistic = 6.095

Asymptotic p-value (based on Chi-Square distribution with 1 df):

Two-sided: Pr{FI(X) .GE. 6.095} = 0.0136
One-sided: 0.5 * Two-sided = 0.0068

Exact p-value and point probabilities:

Two-sided: Pr{FI(X) .GE. 6.095} = Pr{P(X) .LE. 0.0357} = 0.0357
           Pr{FI(X) .EQ. 6.095} = Pr{P(X) .EQ. 0.0357} = 0.0357

One-sided: Let y be the value in Row 1 and Column 1

y = 6   min(Y) = 4   max(Y) = 6   mean(Y) = 4.500   std(Y) = 0.5669

Pr{Y .GE. 6} = 0.0357
Pr{Y .EQ. 6} = 0.0357

Datafile: C:\Program Files\Numerical\StatXact-4.0.1\Files\Research\TGN1412.cy3
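
The exact figures in this output can be checked by hand: with both margins fixed, the count in row 1, column 1 follows a hypergeometric distribution. The sketch below is a minimal check in Python, not the StatXact algorithm. It assumes the table has rows TGN1412 and placebo and columns cytokine storm yes and no (the output does not print the table, so this layout is inferred from the cohort description), and the name `hypergeom_pmf` is ours.

```python
from math import comb, sqrt

# Assumed 2 by 2 table, inferred from the cohort description:
#              storm   no storm
# TGN1412        6        0
# placebo        0        2
n_treated, n_placebo = 6, 2
n_storm = 6
n_total = n_treated + n_placebo


def hypergeom_pmf(y):
    """P(Y = y): y of the n_storm events fall among the treated."""
    return comb(n_treated, y) * comb(n_placebo, n_storm - y) / comb(n_total, n_storm)


# Range of Y that is possible once both margins are fixed.
y_min = max(0, n_storm - n_placebo)
y_max = min(n_treated, n_storm)
probs = {y: hypergeom_pmf(y) for y in range(y_min, y_max + 1)}

y_obs = 6
p_table = probs[y_obs]
mean_y = sum(y * p for y, p in probs.items())
sd_y = sqrt(sum((y - mean_y) ** 2 * p for y, p in probs.items()))

# One-sided: tables with at least as many storms among the treated.
p_one_sided = sum(p for y, p in probs.items() if y >= y_obs)
# Two-sided: tables no more probable than the observed one.
p_two_sided = sum(p for p in probs.values() if p <= p_table * (1 + 1e-9))

print(f"min(Y) = {y_min}, max(Y) = {y_max}")
print(f"mean(Y) = {mean_y:.3f}, std(Y) = {sd_y:.4f}")
print(f"P(table) = {p_table:.4f}")
print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```

Because the observed table is the most extreme one possible, the point probability 1/28 is also both exact p-values, which is why 0.0357 appears so often in the output.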

BARNARD'S UNCONDITIONAL TEST FOR DIFFERENCE OF TWO BINOMIAL PROPORTIONS

Statistic based on the observed 2 by 2 table:

Binomial proportion for column <Yes>: pi_1 = 1.000
Binomial proportion for column <No>: pi_2 = 0.0000
Difference of binomial proportions: Delta = pi_2 - pi_1 = -1.000
Standardized difference of binomial proportions: Delta/Stdev = -2.828

Results:

-------------------------------------------------------------------------
Method    P-value (1-sided)      P-value (2-sided)
-------------------------------------------------------------------------
Asymp     0.0023 (Left Tail)     0.0047
Exact     0.0111 (Left Tail)     0.0113
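
The unconditional results can also be approximated with a few lines of Python. This is a sketch under stated assumptions, not StatXact's implementation: it takes pi_1 to be the storm proportion on TGN1412 (6 of 6) and pi_2 the proportion on placebo (0 of 2), uses the pooled-variance standardized difference as the ordering statistic, and replaces the supremum over the common nuisance probability with a maximum over a grid of 1001 values. All function names are ours.

```python
from math import comb, erfc, sqrt

# Assumed data: cytokine storm in 6 of 6 on TGN1412 and 0 of 2 on placebo.
X1, N1 = 6, 6  # TGN1412 arm
X2, N2 = 0, 2  # placebo arm


def z_stat(a, b):
    """Standardized difference (placebo minus TGN1412), pooled variance."""
    pooled = (a + b) / (N1 + N2)
    var = pooled * (1 - pooled) * (1 / N1 + 1 / N2)
    if var == 0:
        return 0.0  # all or none responded: no evidence of a difference
    return (b / N2 - a / N1) / sqrt(var)


def table_prob(a, b, pi):
    """Probability of the table when both arms share event probability pi."""
    return (comb(N1, a) * pi**a * (1 - pi) ** (N1 - a)
            * comb(N2, b) * pi**b * (1 - pi) ** (N2 - b))


def exact_p(two_sided):
    """Maximise the tail probability over a grid for the nuisance parameter."""
    tol = 1e-12
    best = 0.0
    for i in range(1001):
        pi = i / 1000
        total = 0.0
        for a in range(N1 + 1):
            for b in range(N2 + 1):
                z = z_stat(a, b)
                if two_sided:
                    extreme = abs(z) >= abs(Z_OBS) - tol
                else:
                    extreme = z <= Z_OBS + tol
                if extreme:
                    total += table_prob(a, b, pi)
        best = max(best, total)
    return best


Z_OBS = z_stat(X1, X2)
p_asymp_one = 0.5 * erfc(-Z_OBS / sqrt(2))  # standard normal left tail
p_asymp_two = 2 * p_asymp_one
p_exact_one = exact_p(two_sided=False)
p_exact_two = exact_p(two_sided=True)

print(f"Delta/Stdev = {Z_OBS:.3f}")
print(f"Asymp  {p_asymp_one:.4f}  {p_asymp_two:.4f}")
print(f"Exact  {p_exact_one:.4f}  {p_exact_two:.4f}")
```

Only the observed table is as extreme as itself in the left tail, so the one-sided exact p-value is simply the largest value of pi^6 (1 - pi)^2, attained at pi = 0.75.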

- “If you need statistics to prove it, I don’t believe it”
- Here the problem is the reverse
- You can’t prove it with statistics but everybody believes
- So does this mean statistics is irrelevant?
- Not if you look more closely…

- Timing of adverse events
- Increasing interest in using this feature in epidemiological studies
- Case series methodology
- Farrington and Whitaker (2006)

- Case series methodology
- Also if we use background knowledge of risk of cytokine storm we come to quite different conclusions
- But this is to be rather Bayesian

‘Their method only leads to definite results when mathematical postulates are introduced which could only be justified as a result of extensive experience.’

Fisher to Chester Bliss 6 October 1938 (Published in Bennett, 1990)

What Fisher is pointing to here is that although a null hypothesis may be more primitive than a test statistic, the same is not true of an alternative hypothesis.

Thus the alternative hypothesis cannot be made the justification for choosing the test statistic

- Two involve choice of prior distribution followed by formal Bayesian updating
- Racine-Poon, Grieve, Fluehler and Smith 1987
- Lindley 1993

- One involves an intuitive assertion of the posterior result, which is claimed to be Bayesian
- Howson and Urbach

- This is a fine paper with many examples as to how the Bayesian approach can be applied in drug development
- I shall just look at one of these
- The analysis of the Martin and Browning (1985) Data of metoprolol
- Actually, this paper is not cited by Racine et al but this is the relevant citation

[Figure: two-period cross-over design. A 4-week run-in is followed by randomisation to one of two sequences. Each sequence gives 100 mg once daily in one 6-week period and 200 mg once daily in the other (Period 1 and Period 2).]

31 patients aged 65+ with diastolic blood pressure in excess of 100mmHg randomised to these sequences. DBP measured after 6 weeks and 4-8 hours after last dose.

- The period 2 values could still be being influenced by the period 1 treatment
- Hence a comparison of period 1 and period 2 results would provide a biased measurement of the effect of treatment
- However, if we knew the magnitude of the carry-over we could take account of it
- Hence carry-over is a nuisance parameter and a prime candidate for the Bayesian approach

- None of the authors noted that the carry-over effect has to last for six weeks
- Nor did any of the discussants, whether Bayesian or frequentist

- However the treatment effect only has to last for 4-8 hours
- The ratio of one to the other is at least 126
- You cannot use an uninformative prior for carry-over and be coherent
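
The figure of 126 comes from putting both durations in hours and taking the longer end of the 4-8 hour window, which gives the smaller ratio. A short check:

```latex
\frac{6 \text{ weeks}}{8 \text{ hours}}
  = \frac{6 \times 7 \times 24 \text{ hours}}{8 \text{ hours}}
  = \frac{1008}{8} = 126,
\qquad
\frac{1008}{4} = 252 .
```

So the carry-over has to persist between 126 and 252 times as long as the treatment effect needs to.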

Anyone who is not shocked by quantum theory has not understood a single word.

Niels Bohr

Anyone who is not shocked by the Bayesian theory of statistical inference has not understood a single word

Stephen Senn

Paper by Lindley.

Lindley, D. The Analysis of Experimental Data, Teaching Statistics, 15, 22-25 (1993)

“The lady is a wine expert, testified by her being a Master (sic) of Wine, MW. She was given 6 pairs of glasses (not cups). One member of each pair contained some French claret. The other had a Californian Cabernet Sauvignon Merlot Blend.”

see also

Lindley, D. A Bayesian lady tasting tea. In Statistics: An Appraisal, David and David (eds), Iowa State University Press (1984).

Lindley’s Prior for Wine Tasting

‘At this point I can only speak for myself though I hope many will agree with me. You may freely disagree and still be sensible.’

Lindley

I do disagree

Either the Lady knows something about wine or she hasn’t a clue. If she has, I think that she can repeat the trick of correct identification with high probability.

If she is a charlatan, there is a small probability that she may have a fine palate

Mathematical statistics is full of lemmas whereas applied statistics is full of dilemmas.

Senn’s Prior for Wine Tasting

- Imagine the lady has to distinguish between 20 pairs of glasses.
- You are given £100,000 to place at evens either for or against the following
- The lady will choose correctly in 12, 13, 14, 15 or 16 pairs.
- How do you choose?
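
The bet is a way of asking how much prior predictive probability you place on middling performance. The sketch below works out the probability of 12 to 16 correct pairs under three hypothetical states of belief. None of them is Lindley's prior or the speaker's: pure guessing, a uniform prior on the success probability, and an even mixture of a guesser and an expert who is right 95% of the time are all assumptions chosen for illustration.

```python
from math import comb, factorial

N_PAIRS = 20
LOW, HIGH = 12, 16  # the bet: 12 to 16 correct pairs inclusive


def prob_in_range(theta):
    """P(LOW <= correct <= HIGH) if each pair is identified with probability theta."""
    return sum(comb(N_PAIRS, k) * theta**k * (1 - theta) ** (N_PAIRS - k)
               for k in range(LOW, HIGH + 1))


def uniform_prior_prob():
    """Same event under a uniform prior on theta (illustrative assumption)."""
    # Integrating C(n, k) theta^k (1 - theta)^(n - k) over [0, 1] gives
    # C(n, k) * k! * (n - k)! / (n + 1)!, which is 1 / (n + 1) for every k.
    return sum(comb(N_PAIRS, k) * factorial(k) * factorial(N_PAIRS - k)
               / factorial(N_PAIRS + 1)
               for k in range(LOW, HIGH + 1))


p_guessing = prob_in_range(0.5)
p_uniform = uniform_prior_prob()
# Hypothetical "expert or charlatan" prior: half the mass at 0.5, half at 0.95.
p_two_point = 0.5 * prob_in_range(0.5) + 0.5 * prob_in_range(0.95)

print(f"pure guessing:       {p_guessing:.3f}")
print(f"uniform prior:       {p_uniform:.3f}")
print(f"expert or charlatan: {p_two_point:.3f}")
```

Under the uniform prior every score from 0 to 20 is equally likely, so the event has probability 5/21, about 0.24. Any of these beliefs would lead you to bet against, and a prior that makes the bet attractive has to concentrate on moderate ability.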

- Consider example of die rolled 600 times
- Results are
- 100 each of sixes, fives, fours and threes
- 123 twos
- 77 ones

- Pearson-Fisher chi-square statistic is 10.58
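
The statistic is easy to reproduce. The sketch below recomputes it from the counts listed above and adds a tail probability from a closed form that holds for 5 degrees of freedom only; the function name `chi2_sf_5df` is ours.

```python
from math import erfc, exp, pi, sqrt

# Observed counts by face, as listed above.
observed = {6: 100, 5: 100, 4: 100, 3: 100, 2: 123, 1: 77}
n_rolls = sum(observed.values())
expected = n_rolls / len(observed)  # 100 per face for a fair die

chi_square = sum((count - expected) ** 2 / expected
                 for count in observed.values())
df = len(observed) - 1


def chi2_sf_5df(x):
    """Upper tail of chi-square with 5 df (closed form valid for 5 df only)."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2) * (1 + x / 3)


p_value = chi2_sf_5df(chi_square)

print(f"chi-square = {chi_square:.2f} on {df} df")
print(f"p-value = {p_value:.3f}")
```

Only the twos and the ones contribute, 5.29 each, and the total falls short of the conventional 5% critical value of 11.07 on 5 degrees of freedom.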

“...one is, therefore, under no obligation to reject the null hypothesis, even though that hypothesis has pretty clearly got it badly wrong, in particular, in regard to the outcomes two and one” (p136, my italics).

From the second edition

- Lump of probability on fair die
- Symmetric Dirichlet prior over alternative
- Do not commit yourself to a particular value of k (the Dirichlet parameter)
- Instead plot the Bayes factor as a function of k
- This is a sort of Type II likelihood

[Figure: Bayes factor as a function of the prior, plotted against the parameter of the symmetric Dirichlet.]
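
A curve of this kind can be computed as follows. This is a minimal sketch, not the code behind the original plot: it assumes the Bayes factor is taken as the fair die against the symmetric Dirichlet alternative, uses the die counts from the example, and picks an arbitrary grid of k values. The multinomial coefficient is common to both marginal likelihoods and cancels.

```python
from math import exp, lgamma, log

counts = [100, 100, 100, 100, 123, 77]  # sixes, fives, fours, threes, twos, ones


def log_bf_fair_vs_dirichlet(counts, k):
    """Log Bayes factor: fair die against a symmetric Dirichlet(k) alternative."""
    n = sum(counts)
    m = len(counts)
    # Log probability of the observed sequence under equal face probabilities.
    log_null = -n * log(m)
    # Log marginal probability of the sequence under the Dirichlet prior.
    log_alt = (lgamma(m * k) - lgamma(m * k + n)
               + sum(lgamma(k + c) - lgamma(k) for c in counts))
    return log_null - log_alt


# Arbitrary grid for illustration; a plot would use a finer one.
for k in [0.1, 0.5, 1, 2, 5, 10, 50, 100, 1000]:
    log_bf = log_bf_fair_vs_dirichlet(counts, k)
    print(f"k = {k:>6}: log BF = {log_bf:8.3f}, BF = {exp(log_bf):.3g}")
```

As k grows the Dirichlet prior concentrates on the fair die, so the log Bayes factor tends to zero. The interest lies in how far, and in which direction, it departs from zero for smaller k.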

- If you had witnessed the die being rolled you would not necessarily conclude it was unfair
- If you were asked to decide whether these were results from a real die or one some philosophers had written down in a book you might decide on the latter
- This is because the Dirichlet distribution could not model your prior distribution
- It is somewhat unfair of H&U to claim that the frequentist approach has pretty clearly got it badly wrong
- I think that they would have great difficulty honestly specifying a prior distribution that allowed them to ‘get it right’ for this example and not look foolish for others

- Yes, if my aim is to claim that Bayesian methods are particularly bad
- They are not
- We all make errors in our search for errors and I am no exception

- No, if my aim is to counter the claim that Bayesian statistics is uniquely good
- In particular if the argument is that the only requirement for inference is coherence

- The De Finetti theory is a theory of how to remain perfect
- You have a prior probability of all possible sequences of events
- As events unfold you strike out the sequences that did not occur and renormalise
- This is not, however, a theory of how to become good
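
The strike-out-and-renormalise step can be made concrete on a toy example. The sketch below assumes three binary events and a hypothetical exchangeable prior over the eight possible sequences (the one implied by a uniform mixing distribution); the function names are ours.

```python
from fractions import Fraction
from itertools import product
from math import comb

N_EVENTS = 3

# Hypothetical exchangeable prior: a sequence with s successes out of n
# gets probability 1 / ((n + 1) * C(n, s)).
prior = {
    seq: Fraction(1, (N_EVENTS + 1) * comb(N_EVENTS, sum(seq)))
    for seq in product((0, 1), repeat=N_EVENTS)
}


def observe(beliefs, position, outcome):
    """Strike out sequences that disagree with the outcome, then renormalise."""
    kept = {seq: p for seq, p in beliefs.items() if seq[position] == outcome}
    total = sum(kept.values())
    return {seq: p / total for seq, p in kept.items()}


def predictive(beliefs, position):
    """Probability that the event at the given position is a success."""
    return sum(p for seq, p in beliefs.items() if seq[position] == 1)


posterior = observe(prior, position=0, outcome=1)
pred = predictive(posterior, position=1)

print("before any data, P(second event = 1) =", predictive(prior, 1))
print("after a success, P(second event = 1) =", pred)
```

Nothing is learned here that was not already implicit in the prior over sequences. Updating only deletes and rescales, which is the sense in which the theory tells you how to remain coherent rather than how to get there.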

You have a collection of priors which do not form a coherent set.

You can only become Bayesian by trashing some of the priors until those that are left are coherent.

But if this is a legitimate thing to do, it seems to me that it must remain a legitimate thing to do in the future. This is then a license to continue not being Bayesian.

This then means that the Dutch book argument loses much of its force.

Statistician: Here is the result of the analysis of the trial you asked me to look at. I have added the likelihood to your prior. This is the posterior distribution.

Physician: Excellent! Now could you please take the results of the previous trials and do a meta-analysis?

Statistician (after a pause): There is no need. The result I gave you is the meta-analysis. The previous trials are in your prior.

Physician (after a pause): If the previous trials are in my prior, they got into my prior without your help at all. Why did I need you to help with producing the posterior?

In general $P_{n-1} + D_n \rightarrow P_n$

Step 1: $P_0 + D_1 \rightarrow P_1$,

Step 2: $P_1 + D_2 \rightarrow P_2$,

Or equivalently, $P_0 + D_1 + D_2 \rightarrow P_2$

But suppose $P_0$ already includes $D_1$. Then this analysis would be illegitimate (like analysing 50 values using a chi-square on the percentages).

So use step 2 only. But suppose that $P_1$ does not include $D_1$.

This would be equivalent to analysing a contingency table of 200 observations using a chi-square on the percentages.

Then the principle of total information has been violated.
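
The bookkeeping problem can be illustrated with a conjugate update, where a prior and a data set are both just pairs of counts. The sketch below is an illustration only: the Beta(1, 1) starting prior and the two trial results are hypothetical numbers, and `update` is our name for the Beta-binomial updating rule.

```python
def update(prior, data):
    """Conjugate Beta-binomial update: add successes and failures to (a, b)."""
    a, b = prior
    successes, failures = data
    return (a + successes, b + failures)


p0 = (1, 1)    # hypothetical starting prior, Beta(1, 1)
d1 = (7, 3)    # hypothetical first trial: 7 successes, 3 failures
d2 = (12, 8)   # hypothetical second trial: 12 successes, 8 failures

# Coherent route: step 1 then step 2 equals one update with the pooled data.
p1 = update(p0, d1)
p2_sequential = update(p1, d2)
p2_batch = update(p0, (d1[0] + d2[0], d1[1] + d2[1]))

# If the stated prior already reflects the first trial, adding that trial
# again counts it twice and overstates the information.
prior_including_d1 = update(p0, d1)
p2_double_counted = update(update(prior_including_d1, d1), d2)

# If only step 2 is applied to a prior that does not reflect the first
# trial, that trial is lost and the information is understated.
p2_d1_ignored = update(p0, d2)

print("sequential:    ", p2_sequential)
print("batch:         ", p2_batch)
print("double counted:", p2_double_counted)
print("first ignored: ", p2_d1_ignored)
```

The arithmetic is trivial once you know what the prior contains. The difficulty raised in the dialogue is that nothing in the updating rule tells you which of these three cases you are in.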

(Note, however, that according to David Miller the principle of total information seems to be an independent principle which cannot be derived from maximising expected posterior utility except by imposing very artificial additional conditions.)

Theory

Elegant development based on coherence

Claim that it is the only way to behave

Claim to integrate all sources of information

Requires (in my view) perfect temporal coherence

Practice

A rag-bag of computational tools

Use of Bayes theorem but not therefore coherent

Often surprisingly poor treatment of prior distributions

Back to the drawing board allowed

- It is highly doubtful that the strong claims for Bayesian theory are a justification for Bayesian practice
- This does not mean that Bayesian statistics as practised is not useful
- The applied statistician needs a method that is useful in practice and not just in theory

- I remain sceptical of its claims to be the only useful statistical approach, not least because admitting this to be true would still leave you sorely puzzled as to what to do in practice

- When the robot gets stuck the scientist gets up and gives it a kick
- The robot does not know it is stuck
- To avoid being stuck in the inferential corner it is useful for us to have different ways of making inferences
- Where they disagree there is a warning that it is time to do some creative thinking

- Bayesian approach is excellent when you have to make decisions
- If you are going to use frequentist approaches to decision making you may need to use stopping rules
- However, stopping rule adjustments are not a good way to summarise evidence
- And the same is true of Bayesian analyses
- I like randomisation but don’t make a fetish of it
- I like the likelihood principle but don’t make a fetish of it
- No (current) single approach to statistical inference seems to fit my needs as a jobbing statistician
- I like (to the limits of my lesser ability) following George Barnard’s advice of being prepared to consider four

Frequentists think that it is the thought that counts whereas Bayesians count the thoughts.