You may believe you are a Bayesian But you are probably wrong

Stephen Senn. Outline: the four systems of statistical inference; an example of where it is good to be Bayesian; Fisher’s argument against the Neyman-Pearson approach; examples of experts applying ‘the Bayesian’ approach.

Presentation Transcript


  1. You may believe you are a Bayesian But you are probably wrong. Stephen Senn

  2. Outline • The four systems of statistical inference • An example of where it is good to be Bayesian • Fisher’s argument against the Neyman-Pearson approach • Examples of experts applying ‘the Bayesian’ approach • Adrian Smith and colleagues 1987 • Lindley, 1993 • Howson and Urbach, 1989 • Some theoretical reasons for hesitation • Conclusion • Why I shall (probably) still be using mongrel statistics after this conference

  3. Warning • This talk should not be taken as an attack on the subjective Bayesian approach to statistical inference • I do not claim it is a bad approach • I do claim it can be very difficult and perhaps dangerous to rely on it as the only approach

  4. Four systems (Barnard) • Fisherian • Neyman-Pearson • Jeffreys • Bayesian (Ramsey-De Finetti-Savage) • George Barnard’s advice was to be familiar with all four

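  5. A two dimensional view of the four systems [Figure: the four systems arranged on two axes, inverse probability versus direct probability and inferences versus decisions. Jeffreys (inverse probability, inferences): use of semi-objective prior distributions to produce inverse probabilities. De Finetti (inverse probability, decisions): use of subjective expectation via utility. Fisher (direct probability, inferences): fiducial inference, likelihood, significance tests. Neyman and Pearson (direct probability, decisions).]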

  6. TGN1412 • A monoclonal antibody • First-in-man study on 13 March 2006 carried out by Parexel on behalf of TeGenero • In first cohort 8 volunteers • Six allocated TGN1412 and two allocated placebo • All six given TGN1412 suffered a cytokine storm

  7. See: Senn SJ. Lessons from TGN1412. Applied Clinical Trials 2007;16(6):18-22.

  8. A Conventional Analysis
     FISHER'S EXACT TEST
     Statistic based on the observed 2 by 2 table (x):
       P(X) = Hypergeometric Prob. of the table = 0.0357
       FI(X) = Fisher statistic = 6.095
     Asymptotic p-value (based on Chi-Square distribution with 1 df):
       Two-sided: Pr{FI(X) .GE. 6.095} = 0.0136
       One-sided: 0.5 * Two-sided = 0.0068
     Exact p-value and point probabilities:
       Two-sided: Pr{FI(X) .GE. 6.095} = Pr{P(X) .LE. 0.0357} = 0.0357
                  Pr{FI(X) .EQ. 6.095} = Pr{P(X) .EQ. 0.0357} = 0.0357
       One-sided: Let y be the value in Row 1 and Column 1
                  y = 6, min(Y) = 4, max(Y) = 6, mean(Y) = 4.500, std(Y) = 0.5669
                  Pr{Y .GE. 6} = 0.0357
                  Pr{Y .EQ. 6} = 0.0357
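The exact figures in this output can be checked directly from the hypergeometric distribution. The following is a minimal sketch, assuming the 2 by 2 table has six volunteers on TGN1412 and two on placebo, with six cytokine storms in total, and that y counts the storms among those given TGN1412. The function and variable names are illustrative and are not part of StatXact.

```python
from math import comb, sqrt

def hypergeom_pmf(y, n_treated=6, n_placebo=2, n_events=6):
    """P(Y = y): y events among the treated, with all margins held fixed."""
    total = n_treated + n_placebo
    return comb(n_treated, y) * comb(n_placebo, n_events - y) / comb(total, n_events)

support = range(4, 7)  # with these margins y can only be 4, 5 or 6
probs = {y: hypergeom_pmf(y) for y in support}

p_observed = probs[6]
p_one_sided = sum(p for y, p in probs.items() if y >= 6)
# Two-sided: add up every table no more probable than the observed one.
p_two_sided = sum(p for p in probs.values() if p <= p_observed + 1e-12)

mean_y = sum(y * p for y, p in probs.items())
sd_y = sqrt(sum((y - mean_y) ** 2 * p for y, p in probs.items()))

print(round(p_observed, 4), round(p_one_sided, 4), round(p_two_sided, 4))
print(round(mean_y, 3), round(sd_y, 4))
```

The observed table is the most extreme one possible, which is why the point probability, the one-sided p-value and the two-sided p-value all coincide at 1/28.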

  9. A Slightly Less Conventional Analysis
     Datafile: C:\Program Files\Numerical\StatXact-4.0.1\Files\Research\TGN1412.cy3
     BARNARD'S UNCONDITIONAL TEST FOR DIFFERENCE OF TWO BINOMIAL PROPORTIONS
     Statistic based on the observed 2 by 2 table:
       Binomial proportion for column <Yes>: pi_1 = 1.000
       Binomial proportion for column <No>: pi_2 = 0.0000
       Difference of binomial proportions: Delta = pi_2 - pi_1 = -1.000
       Standardized difference of binomial proportions: Delta/Stdev = -2.828
     Results:
       Method   P-value (1-sided)     P-value (2-sided)
       Asymp    0.0023 (Left Tail)    0.0047
       Exact    0.0111 (Left Tail)    0.0113
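The one-sided exact figure can be approximated by brute force. This is a sketch of one common form of Barnard's unconditional test, not a reproduction of StatXact's algorithm. It assumes a pooled-variance standardised difference as the ordering statistic, a common event probability pi under the null, and a simple grid search over pi. The sign is taken as treated minus placebo, so the statistic is positive where the slide reports -2.828.

```python
from math import comb, sqrt

N1, N2 = 6, 2  # TGN1412 and placebo group sizes

def z_stat(x1, x2):
    """Pooled-variance standardised difference in event proportions."""
    pooled = (x1 + x2) / (N1 + N2)
    var = pooled * (1 - pooled) * (1 / N1 + 1 / N2)
    if var == 0:
        return 0.0  # no events or all events: no evidence of a difference
    return (x1 / N1 - x2 / N2) / sqrt(var)

def table_prob(x1, x2, pi):
    """Probability of a table when both groups share event probability pi."""
    return (comb(N1, x1) * comb(N2, x2)
            * pi ** (x1 + x2) * (1 - pi) ** (N1 + N2 - x1 - x2))

z_obs = z_stat(6, 0)
# Tables at least as extreme as the observed one, judged by the statistic.
extreme = [(a, b) for a in range(N1 + 1) for b in range(N2 + 1)
           if z_stat(a, b) >= z_obs - 1e-12]

# Maximise the tail probability over the nuisance parameter pi.
grid = [i / 1000 for i in range(1001)]
p_exact = max(sum(table_prob(a, b, pi) for a, b in extreme) for pi in grid)

print(round(z_obs, 3), extreme, round(p_exact, 4))
```

Only the observed table qualifies as extreme, so the tail probability is pi^6 (1 - pi)^2, which peaks at pi = 0.75 and gives roughly 0.0111.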

  10. Conclusions • “If you need statistics to prove it, I don’t believe it” • Here the problem is the reverse • You can’t prove it with statistics but everybody believes • So does this mean statistics is irrelevant? • Not if you look more closely…

  11. Further information • Timing of adverse events • Increasing interest in using this feature in epidemiological studies • Case series methodology • Farrington and Whitaker (2006) • Also if we use background knowledge of risk of cytokine storm we come to quite different conclusions • But this is to be rather Bayesian

  12. Fisher on Neyman-Pearson ‘Their method only leads to definite results when mathematical postulates are introduced which could only be justified as a result of extensive experience.’ Fisher to Chester Bliss 6 October 1938 (Published in Bennett, 1990) What Fisher is pointing to here is that although a null hypothesis may be more primitive than a test statistic, the same is not true of an alternative hypothesis. Thus the alternative hypothesis cannot be made the justification for choosing the test statistic

  13. Three Examples Provided by Expert Bayesians • Two involve choice of prior distribution followed by formal Bayesian updating • Racine-Poon, Grieve, Fluehler and Smith 1987 • Lindley 1993 • One involves an intuitive assertion of the posterior result, which is claimed to be Bayesian • Howson and Urbach

  14. Racine et al • This is a fine paper with many examples as to how the Bayesian approach can be applied in drug development • I shall just look at one of these • The analysis of the Martin and Browning (1985) Data of metoprolol • Actually, this paper is not cited by Racine et al but this is the relevant citation

  15. Design • 4-week run-in, followed by randomisation • Two treatment periods of 6 weeks each (Period 1, Period 2) • Treatments: 100 mg once daily and 200 mg once daily, given in a two-period cross-over • 31 patients aged 65+ with diastolic blood pressure in excess of 100 mmHg randomised to these sequences • DBP measured after 6 weeks and 4-8 hours after last dose

  16. Carry-over Problem • The period 2 values could still be being influenced by the period 1 treatment • Hence a comparison of period 1 and period 2 results would provide a biased measurement of the effect of treatment • However, if we knew what the magnitude of the carry-over was we could take account of it • Hence carry-over is a nuisance parameter and a prime candidate for the Bayesian approach

  17. Unfortunately • None of the authors noted that the carry-over effect has to last for six weeks • Nor did any of the discussants whether Bayesian or frequentist • However the treatment effect only has to last for 4-8 hours • The ratio of one to the other is at least 126 • You cannot use an uninformative prior for carry-over and be coherent
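The figure of 126 can be recovered by comparing the six weeks over which carry-over would have to persist with the longest delay between last dose and measurement, 8 hours. This worked line is a reconstruction of the arithmetic and is not shown on the slide:

```latex
\frac{6\ \text{weeks}}{8\ \text{hours}}
  = \frac{6 \times 7 \times 24\ \text{hours}}{8\ \text{hours}}
  = \frac{1008}{8}
  = 126
```

With the shortest delay of 4 hours the ratio doubles to 252, hence "at least 126".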

  18. “Anyone who is not shocked by quantum theory has not understood a single word.” Niels Bohr. “Anyone who is not shocked by the Bayesian theory of statistical inference has not understood a single word.” Stephen Senn

  19. A Bayesian Lady Tasting Wine Paper by Lindley. Lindley, D. The Analysis of Experimental Data, Teaching Statistics, 15, 22-25 (1993) “The lady is a wine expert, testified by her being a Master (sic) of Wine, MW. She was given 6 pairs of glasses (not cups). One member of each pair contained some French claret. The other had a Californian Cabernet Sauvignon Merlot Blend.” see also Lindley, D. A Bayesian lady tasting tea. In Statistics an Appreciation, David and David (ed) Iowa State University Press (1984).

  20. Lindley’s Prior for Wine Tasting

  21. ‘At this point I can only speak for myself though I hope many will agree with me. You may freely disagree and still be sensible.’ Lindley. I do disagree. Either the Lady knows something about wine or she hasn’t a clue. If she has, I think that she can repeat the trick of correct identification with high probability. If she is a charlatan, there is a small probability that she may have a fine palate.

  22. The Difference between Mathematical and Applied Statistics Mathematical statistics is full of lemmas whereas applied statistics is full of dilemmas.

  23. Senn’s Prior for Wine Tasting

  24. Place Your Bets • Imagine the lady has to distinguish between 20 pairs of glasses. • You are given £100,000 to place at evens either for or against the following • The lady will choose correctly in 12, 13,14, 15 or 16 pairs. • How do you choose?
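How one should bet depends on the prior predictive probability that the number of correct pairs falls between 12 and 16. The sketch below shows that calculation for two priors on the lady's per-pair success probability. Both priors are hypothetical stand-ins invented for illustration; they are not the priors plotted on the preceding slides.

```python
from math import comb

def prob_between(theta, n=20, low=12, high=16):
    """P(low <= correct <= high) for n pairs with success chance theta per pair."""
    return sum(comb(n, k) * theta ** k * (1 - theta) ** (n - k)
               for k in range(low, high + 1))

def prior_predictive(prior):
    """Average prob_between over a discrete prior given as {theta: weight}."""
    return sum(weight * prob_between(theta) for theta, weight in prior.items())

# Hypothetical priors (assumptions for illustration only).
moderate_skill = {0.6: 0.25, 0.7: 0.5, 0.8: 0.25}  # mass on middling ability
two_camps = {0.5: 0.5, 0.95: 0.5}                  # pure guesser or near-infallible

for name, prior in [("moderate skill", moderate_skill), ("two camps", two_camps)]:
    p = prior_predictive(prior)
    print(name, round(p, 3), "bet for" if p > 0.5 else "bet against")
```

Under these made-up priors the two bets point in opposite directions: a prior concentrated on middling ability favours betting for, while an all-or-nothing prior favours betting against.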

  25. An Example of Howson and Urbach’s • Consider example of die rolled 600 times • Results are • 100 each of sixes, fives, fours and threes • 123 twos • 77 ones • Pearson-Fisher chi-square statistic is 10.58
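The chi-square statistic quoted above follows from the counts. A minimal sketch, assuming a fair-die null with expected count 100 per face; the comparison value 11.07 is the standard upper 5% point of the chi-square distribution with 5 degrees of freedom and is not quoted on the slide.

```python
observed = {6: 100, 5: 100, 4: 100, 3: 100, 2: 123, 1: 77}
expected = sum(observed.values()) / len(observed)  # 100 per face for a fair die

chi_square = sum((count - expected) ** 2 / expected for count in observed.values())

CRITICAL_5PCT_5DF = 11.07  # upper 5% point, chi-square with 5 degrees of freedom
print(round(chi_square, 2), chi_square > CRITICAL_5PCT_5DF)
```

Only the ones and twos contribute, each adding 23²/100 = 5.29, and the total falls just short of the conventional 5% critical value.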

  26. Howson and Urbach’s Conclusion “...one is, therefore, under no obligation to reject the null hypothesis, even though that hypothesis has pretty clearly got it badly wrong, in particular, in regard to the outcomes two and one” (p136, my italics). From the second edition

  27. An Analysis Using Good’s Approach • Lump of probability on fair die • Symmetric Dirichlet prior over alternative • Do not commit yourself to particular value of k, (Dirichlet parameter) • Instead plot Bayes factor as function of K • This is a sort of Type II likelihood
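The Bayes factor for this set-up has a closed form, since the Dirichlet prior is conjugate to the multinomial. The sketch below evaluates it for the die counts of the Howson and Urbach example over a handful of values of k. It assumes the Bayes factor is defined as alternative over fair die, so values below 1 favour the fair die; the grid of k values and the function name are arbitrary choices for illustration.

```python
from math import lgamma, log, exp

counts = [77, 123, 100, 100, 100, 100]  # ones, twos, threes, fours, fives, sixes
n_total = sum(counts)
n_faces = len(counts)

def log_bayes_factor(k):
    """Log Bayes factor: symmetric Dirichlet(k) alternative against the fair die."""
    # Dirichlet-multinomial marginal likelihood of the observed sequence.
    log_marginal = (lgamma(n_faces * k) - lgamma(n_faces * k + n_total)
                    + sum(lgamma(k + c) - lgamma(k) for c in counts))
    log_fair = -n_total * log(n_faces)  # likelihood of the sequence under a fair die
    return log_marginal - log_fair

for k in [0.1, 1, 10, 100, 1000]:
    print(k, round(exp(log_bayes_factor(k)), 4))
```

As k grows the Dirichlet prior concentrates on the fair die, so the Bayes factor tends to 1; plotting it against k gives the kind of curve the next slide refers to.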

  28. Bayes factor as function of prior Parameter of symmetric Dirichlet

  29. Conclusion • If you had witnessed the die being rolled you would not necessarily conclude it was unfair • If you were asked to decide whether these were results from a real die or one some philosophers had written down in a book you might decide on the latter • This is because the Dirichlet distribution could not model your prior distribution • It is somewhat unfair of H&U to claim that the frequentist approach has pretty clearly got it badly wrong • I think that they would have great difficulty honestly specifying a prior distribution that allowed them to ‘get it right’ for this example and not look foolish for others

  30. Am I being unfair? • Yes, if my aim is to claim that Bayesian methods are particularly bad • They are not • We all make errors in our search for errors and I am no exception • No, if my aim is to counter the claim that Bayesian statistics is uniquely good • In particular if the argument is that the only requirement for inference is coherence

  31. Perfection and Goodness • The De Finetti theory is a theory of how to remain perfect • You have a prior probability of all possible sequences of events • As events unfold you strike out the sequences that did not occur and renormalise • This is not, however, a theory of how to become good
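The strike-out-and-renormalise step can be made concrete with a toy example. This is a sketch only: the prior below, which spreads probability evenly over all sequences of three binary events, is a hypothetical choice for illustration.

```python
from itertools import product

def condition(prior, observed):
    """Strike out sequences that do not start with `observed`, then renormalise."""
    kept = {seq: p for seq, p in prior.items() if seq[:len(observed)] == observed}
    total = sum(kept.values())
    return {seq: p / total for seq, p in kept.items()}

# Hypothetical prior: all eight sequences of three binary events equally likely.
prior = {seq: 1 / 8 for seq in product("01", repeat=3)}

posterior = condition(prior, ("1",))  # the first event turned out to be "1"
p_next_is_1 = sum(p for seq, p in posterior.items() if seq[1] == "1")

print(len(posterior), p_next_is_1)
```

With this particular prior nothing is learned about the next event: the updating is perfectly coherent, which illustrates that coherence alone does not say whether the prior was a good one.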

  32. If You are not Already a Bayesian You have a collection of priors which do not form a coherent set. You can only become Bayesian by trashing some of the priors until those that are left are coherent. But if this is a legitimate thing to do, it seems to me that it must remain a legitimate thing to do in the future. This is then a license to continue not being Bayesian. This then means that the Dutch book argument loses much of its force.

  33. The Date of Information Problem Statistician: Here is the result of the analysis of the trial you asked me to look at. I have added the likelihood to your prior. This is the posterior distribution. Physician: Excellent! Now could you please take the results of the previous trials and do a meta-analysis? Statistician (after a pause): There is no need. The result I gave you is the meta-analysis. The previous trials are in your prior. Physician (after a pause): If the previous trials are in my prior, they got into my prior without your help at all. Why did I need you to help with producing the posterior?

  34. The Bayesian Meta-Analyst’s Dilemma In general, $P_{n-1} + D_n \rightarrow P_n$. Step 1: $P_0 + D_1 \rightarrow P_1$. Step 2: $P_1 + D_2 \rightarrow P_2$. Or, equivalently, $P_0 + D_1 + D_2 \rightarrow P_2$. But suppose $P_0$ already includes $D_1$; then this analysis would be illegitimate (like analysing 50 values using a chi-square on the percentages).

  35. The Dilemma Continued So use step 2 only. But suppose that $P_1$ does not include $D_1$. This would be equivalent to analysing a contingency table of 200 observations using a chi-square on the percentages. Then the principle of total information has been violated. (Note, however, that according to David Miller the principle of total information seems to be an independent principle which cannot be derived from maximising expected posterior utility except by imposing very artificial additional conditions.)
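Both horns of the dilemma can be illustrated with a conjugate Beta-binomial model, where a prior is summarised by two counts and updating is simple addition. The sketch below uses hypothetical trial results, and reading a + b as an effective sample size is a convenience of this particular model.

```python
def update(prior, data):
    """Beta-binomial update: prior (a, b) plus data (successes, failures)."""
    return (prior[0] + data[0], prior[1] + data[1])

# Hypothetical numbers, for illustration only.
P0 = (1, 1)    # Beta(1, 1) prior held before any trial
D1 = (12, 8)   # first trial: successes, failures
D2 = (15, 5)   # second trial: successes, failures

P1 = update(P0, D1)
P2_sequential = update(P1, D2)                           # step 1 then step 2
P2_batch = update(P0, (D1[0] + D2[0], D1[1] + D2[1]))    # P0 + D1 + D2 at once

P2_double_counted = update(update(P1, D1), D2)  # prior already held D1, added again
P2_omitted = update(P0, D2)                     # step 2 only, but prior lacked D1

for label, (a, b) in [("coherent", P2_sequential),
                      ("double counted", P2_double_counted),
                      ("D1 omitted", P2_omitted)]:
    print(label, "effective sample size:", a + b)
```

Sequential and batch updating agree, but the answer is only right if the analyst knows exactly which data the prior already contains: counting D1 twice overstates the information, and leaving it out understates it.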

  36. The Two Faces of (subjective) Bayes Theory • Elegant development based on coherence • Claim that it is the only way to behave • Claim to integrate all sources of information • Requires (in my view) perfect temporal coherence Practice • A rag-bag of computational tools • Use of Bayes theorem but not therefore coherent • Often surprisingly poor treatment of prior distributions • Back to the drawing board allowed

  37. My conclusion • It is highly doubtful that the strong claims for Bayesian theory are a justification for Bayesian practice • This does not mean that Bayesian statistics as practised is not useful • The applied statistician needs a method that is useful in practice and not just in theory • I remain sceptical of its claims to be the only useful statistical approach, not least because admitting this to be true would still leave you sorely puzzled as to what to do in practice

  38. The Robot Turtle in the Corner • When the robot gets stuck the scientist gets up and gives it a kick • The robot does not know it is stuck • To avoid being stuck in the inferential corner it is useful for us to have different ways of making inferences • Where they disagree there is a warning that it is time to do some creative thinking

  39. Where this leaves me • Bayesian approach is excellent when you have to make decisions • If you are going to use frequentist approaches to decision making you may need to use stopping rules • However, stopping rule adjustments are not a good way to summarise evidence • And the same is true of Bayesian analyses • I like randomisation but don’t make a fetish of it • I like the likelihood principle but don’t make a fetish of it • No (current) single approach to statistical inference seems to fit my needs as a jobbing statistician • I like (to the limits of my lesser ability) following George Barnard’s advice of being prepared to consider four

  40. Finally Frequentists think that it is the thought that counts whereas Bayesians count the thoughts.
