Using and understanding numbers in health news and research
Heejung Bang, PhD
Department of Public Health
Weill Medical College of Cornell University
A rationale for today’s talk
Coffee was bad yesterday, is good today, and will be bad again tomorrow.
Hardly a day goes by without a new headline about the supposed health risks or benefits of some thing…
Are these headlines justified?
Often, the answer is NO.
R. Peto phrases the nature of the conflict this way: “Epidemiology is so beautiful and provides such an important perspective on human life and death, but an incredible amount of rubbish is published,”
by which he means the results of observational studies that appear daily in the news media and often become the basis of public-health recommendations about what we should or should not do.
-- Ask people on the street “what is a p-value?”
-- Only we may laugh if I make a statistical joke using 0.05, 1.96 and 95%, etc.
-- If there is no hypothesis, there is no test and no p-value.
-- You cannot say “it depends” all the time, although it can be true.
-- 100% increase can be 1 → 2 cases
-- 20% event rate can be 1 out of 5 samples
-- Perhaps, for DNA and race, Watson should see the entire distribution or SD!
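The two bullets on small counts can be made concrete. The following is a minimal sketch, assuming Python and a Wilson score interval; the interval method and the function names are choices made here for illustration, not something the talk specifies.

```python
import math

def relative_increase(before, after):
    """Percent change from `before` to `after`."""
    return 100.0 * (after - before) / before

def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion events/n."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Going from 1 to 2 cases is a 100% increase, yet only one extra case.
print("relative increase:", relative_increase(1, 2), "%")

# 1 event out of 5 is a 20% rate, but the interval around it is very wide.
low, high = wilson_interval(1, 5)
print(f"rate = {1 / 5:.0%}, approx. 95% interval = ({low:.2f}, {high:.2f})")
```

The point of the sketch is that a percentage alone hides the denominator; reporting the counts or an interval alongside it shows how little information a tiny sample carries.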
--- Title speaks for itself
--- some empirical evidence is almost always good to have
--- it is hard to fight with numbers (and age)!
See ‘Statistical flaw trips up study of bad stats’. Nature 2006
---this is a true cause!
A more profound quote from Hume is
‘All arguments concerning existence are founded on the relation of cause and effect.’
--- Every morning, we hear that new causes of some disease have been found.
-- There is an exceedingly large number of associated and correlated factors compared to true causes.
-- a survey of 246 suggested coronary risk factors. Hopkins & Williams (1981)
-- I believe cancer has >1000 risk factors.
‘Too many don’ts’ is no better than ‘do anything’.
Confound: vt. Throw (things) into disorder; mix up; confuse. (Oxford Dictionary)
-- The first thing to do is ‘Use common sense’. Think about any other (hidden) factor or alternative explanation.
Common sense is the basis for most of the ideas for designing scientific investigations. --- M Davidian
although we should not ignore the importance of serendipity in science
Neither associated nor correlated factors have this power.
‘Distinguishing Association from Causation:
A Backgrounder for Journalists’ from
American Council on Science and Health
There is nothing sinful about going out and getting evidence, like asking people how much do you drink and checking breast cancer records.
There’s nothing sinful about seeing if that evidence correlates.
There’s nothing sinful about checking for confounding variables.
The sin comes in believing a causal hypothesis is true because your study came up with a positive result, or believing the opposite because your study was negative.
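In 1965, Hill proposed the following set of causal criteria: strength, consistency, specificity, temporality, biological gradient, plausibility, coherence, experiment, and analogy.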
However, Hill also said “None of my nine viewpoints can bring indisputable evidence for or against the cause-and-effect hypothesis and none can be required as a sine qua non”.
-- Make group1: group2 = healthy people: sick people.
-- Oftentimes, treatment looks bad in observational studies. Why?
-- Do a survey among your friends only
-- People are different from the beginning? (e.g., vegetarians vs. meat-lovers, HRT users vs. non-users)
-- Vitamin C and cancer
-- Numerous other biases exist
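One way to see why treatment can look bad in an observational study is to work through a small table in which sicker patients are more likely to be treated. The sketch below uses Python and counts that are entirely hypothetical, invented here for illustration; they do not come from any study cited in this talk.

```python
# Hypothetical (deaths, patients) counts, invented for illustration only.
# Sicker patients are far more likely to receive the treatment.
data = {
    "mild":   {"treated": (5, 100),   "untreated": (90, 900)},
    "severe": {"treated": (360, 900), "untreated": (50, 100)},
}

def stratum_rate(stratum, arm):
    """Death rate for one arm within one severity stratum."""
    deaths, patients = data[stratum][arm]
    return deaths / patients

def crude_rate(arm):
    """Death rate for one arm, ignoring severity altogether."""
    deaths = sum(data[s][arm][0] for s in data)
    patients = sum(data[s][arm][1] for s in data)
    return deaths / patients

for s in data:
    print(f"{s:>6}: treated {stratum_rate(s, 'treated'):.1%}, "
          f"untreated {stratum_rate(s, 'untreated'):.1%}")

print(f" crude: treated {crude_rate('treated'):.1%}, "
      f"untreated {crude_rate('untreated'):.1%}")
```

Within each severity stratum the treated group does better, yet the crude comparison makes the treatment look harmful, because the treated group is dominated by severe cases. Severity is the hidden factor here.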
This famous study has failed to replicate 16 or so times! Pauling received two Nobel Prizes.
-- To know the true effect of treatment within a person, the same person would have to be treated and untreated at the same time.
i.e., retrospective rationalization.
--because real life stories can be complicated.
Remark: However, advanced statistical techniques for causal inference may help.
-- people tend to say ‘yes’, ‘moderately’
vs. Replicate or perish: New era
Replicability of the scientific findings can never be overemphasized. Results being ‘significant’ or ‘predictive’ without being replicable misinform the public and needlessly expend time and resources, and they are no service to investigators and science –S. Young
Given that we currently have too many findings, often with low credibility, replication and rigorous evaluation become as important as or even more important than discovery - J. Ioannidis (2006)
-- Pay more attention to 2nd study!
-- The Nurses' Health Study showed a 44% relative risk reduction in coronary disease in women receiving hormone therapy. It was later refuted by the Women's Health Initiative, which found that hormone treatment significantly increases the risk of coronary events.
-- Two large cohort studies, the Health Professionals Follow-Up Study and the Nurses' Health Study, and an RCT all found that vitamin E was associated with a significantly reduced risk of coronary artery disease. But larger randomized trials subsequently showed no benefit of vitamin E on coronary disease.
Modeling, the search for significance, the preference for novelty, and lack of interest in assumptions --- these norms are likely to generate a flood of nonreproducible results
ANSWER is total evidence.
An RCT can provide strong evidence for a causal effect, especially if its findings are replicated by other studies.
e.g., depression vs. obesity
-- Bonferroni might be the most hated statistician in history.
-- ‘Escaping the Bonferroni iron claw in ecological studies’ by García et al. (2004)
vs. Type II (false negative: accepting H0 when it is false)
-- Controlling Type I error is more important in statistics and in court. (e.g., innocent → guilty: disaster!)
-- In other fields, Type II error can be more important.
-- You should always do subgroup analyses but never believe them. – R. Peto
-- Multiple testing adjustment and cross-validation may be solutions.
-- A priori chosen cutpoints or multiple testing adjustment can be solutions.
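As an illustration of what a multiple testing adjustment does, here is a minimal Python sketch of the Bonferroni correction. Holm's step-down method is added as one common, less conservative alternative; the talk itself names only Bonferroni. The p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values from five tests in one study.
raw = [0.001, 0.012, 0.03, 0.04, 0.20]
print("raw       :", raw)
print("Bonferroni:", [round(p, 3) for p in bonferroni(raw)])
print("Holm      :", [round(p, 3) for p in holm(raw)])
```

With these made-up numbers, four raw p-values fall below 0.05, but only one survives Bonferroni and two survive Holm. Both methods control the family-wise error rate; neither replaces an a priori analysis plan.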
Lottery tickets should not be free. In random and independent events as the lottery, the probability of having a winning number depends on the N of tickets you have purchased. When one evaluates the outcome of a scientific work, attention must be given not only to the potential interest of the ‘significant’ outcomes but also to the N of ‘lottery tickets’ the authors have ‘bought’. Those having many have a much higher chance of ‘winning a lottery prize’ than of getting a meaningful scientific result. It would be unfair not to distinguish between significant results of well-planned, powerful, sharply focused studies, and those from ‘fishing expeditions’, with a much higher probability of catching an old truck tyre than of a really big fish. --- García et al. (2004)
In the 1970s, every disease was reported to be associated with an HLA allele (schizophrenia, hypertension.... you name it!). Researchers did case control studies with 40 antigens, so there was a very high probability that at least one result was significant. This result was reported without any mention of the fact that it was the most significant of 40 tests --- R. Elston
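The "very high probability" with 40 tests can be put into numbers. The sketch below is in Python and assumes, for simplicity, that the tests are independent, that every null hypothesis is true, and that each test uses a 0.05 threshold; the function name is hypothetical.

```python
def prob_at_least_one_false_positive(m, alpha=0.05):
    """P(at least one of m independent null tests has p < alpha)."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 10, 40):
    print(f"{m:>2} tests: {prob_at_least_one_false_positive(m):.3f}")

# The Bonferroni per-test threshold that keeps the overall rate near 0.05.
print("per-test threshold for 40 tests:", 0.05 / 40)
```

Under these assumptions, 40 tests give roughly an 87% chance of at least one "significant" result by chance alone, which is why the number of tests has to be reported along with the winner.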
Association between reserpine (then a popular antihypertensive) and breast cancer. Shapiro (2004) gave the history. His team published initial results that were extensively covered by the media, with a huge impact on the research community. When the results did not replicate, he confessed that the initial findings were chance findings due to thousands of comparisons involving hundreds of outcomes and hundreds of exposures. He hopes that we have learned from his mistake for the future.
--possible ‘pick and choose’!
Usually these attempts through which the experimenter passed don't leave any traces; the public will only know the result that has been found worth pointing out; and as a consequence, someone unfamiliar with the attempts which have led to this result completely lacks a clear rule for deciding whether the result can or can not be attributed to chance.
The only thing to fear is fear itself … and everything else
-- If not, more (interim) looks would eventually lead to what you want
-- think about # of genes!
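The effect of taking repeated unadjusted interim looks can be checked by simulation. The following Python sketch assumes a one-sample z-test with known SD of 1, data generated under the null hypothesis, and an unadjusted 1.96 cutoff at every look; the function names, sample sizes, and seed are arbitrary choices made for illustration.

```python
import math
import random

def trial_rejects(n_per_look, looks, rng, z_crit=1.96):
    """Simulate one trial under H0; True if any look gives |z| > z_crit."""
    total = 0.0
    n = 0
    for _ in range(looks):
        for _ in range(n_per_look):
            total += rng.gauss(0.0, 1.0)  # true mean is 0, so H0 holds
        n += n_per_look
        z = total / math.sqrt(n)  # z-statistic with known SD = 1
        if abs(z) > z_crit:
            return True  # stop at the first "significant" look
    return False

def false_positive_rate(looks, n_trials=2000, n_per_look=20, seed=1):
    """Fraction of simulated null trials declared significant."""
    rng = random.Random(seed)
    hits = sum(trial_rejects(n_per_look, looks, rng) for _ in range(n_trials))
    return hits / n_trials

print("1 look :", false_positive_rate(1))
print("5 looks:", false_positive_rate(5))
```

With a single look the false positive rate stays close to the nominal 5%, while five unadjusted looks push it well above that. This is the reason group sequential designs use stricter boundaries at each look.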
Realistic strategies can be:
Replication is a universal solution for multiplicity and subgroup analyses (Vandenbroucke 2008)
In genome-wide analyses, it is a prerequisite for publication (Khoury et al. 2007)
-- However, replication is for someone else! The data analysis strategy of splitting the data into two parts, testing and verification, can be considered.
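The strategy of splitting the data into a testing part and a verification part can be sketched in a few lines of Python. The function name, the 50/50 split, and the subject IDs are assumptions made here for illustration; the talk does not prescribe a particular split.

```python
import random

def split_discovery_verification(records, fraction=0.5, seed=0):
    """Randomly split records into a discovery part and a verification part."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(records)    # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]

subject_ids = list(range(1, 11))  # hypothetical subject IDs
discovery, verification = split_discovery_verification(subject_ids)
print("discovery   :", sorted(discovery))
print("verification:", sorted(verification))
```

Hypotheses are explored freely in the discovery part, and only the few findings chosen in advance are then tested once in the verification part. The split must be made before looking at the outcomes, otherwise the verification set is no longer independent.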
e.g., risk, hazard, odds, likelihood, rate, prevalence, incidence, valid, unbiased, consistent, cost-effective (≠cheap), efficient, SD vs. SE
responding to ‘File-drawer problem, revisited’ by Young & Bang (2004)
We are all responsible for all ---Dostoevsky (Rose’s Epi book)
(protocol and SAS output). --- W. Deming (& K. Griffin)
‘We know exactly why certain people commit suicide. We don’t know, within the ordinary concepts of causality, why certain others don't commit suicide. …. We know a great deal more about the causes of physical disease than we do about the causes of physical health.’ --- Scott Peck, MD, in the book ‘The Road Less Travelled’.