
Evaluating Evidence in Medicine: What Can Go Wrong?

Evaluating Evidence in Medicine: What Can Go Wrong? Skeptic’s Toolbox 2012. Harriet Hall, MD, The SkepDoc. Overview: What constitutes evidence in medicine? What can go wrong in clinical studies? Why even “evidence-based medicine” is flawed. Is This Evidence?


Presentation Transcript


  1. Evaluating Evidence in Medicine: What Can Go Wrong? Skeptic’s Toolbox 2012 Harriet Hall, MD The SkepDoc

  2. Overview • What constitutes evidence in medicine? • What can go wrong in clinical studies? • Why even “evidence-based medicine” is flawed.

  3. Is This Evidence?

  4. Is This Evidence? fMRI Study of Salmon • A salmon was shown photographs of humans in social situations. It was asked to think about what emotion the individual in the photo must have been experiencing. • The salmon couldn’t talk, but: • On the fMRI scan, areas in the salmon’s brain lit up, indicating increased blood flow, indicating that the salmon was thinking.

  5. Is This Evidence That: • Salmon can see pictures? • Salmon know what human emotions are? • Salmon can identify emotions from pictures? • Salmon can respond to requests of what to think about?

  6. What’s Wrong With This Picture? The Salmon Was Dead and Gutted!

  7. Statistical Artifact • Each fMRI scan measures 50,000 voxels (3-D pixels) and each study involves thousands of scans. • If you mine the data, you can find practically anything you want. • Brain scans are the new phrenology • A blunt instrument • Scans are pooled to establish normal average • Often don’t mean what people think they mean
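
To put the data-mining point in numbers, here is a minimal sketch. Only the 50,000-voxel count comes from the slide; the uncorrected per-voxel threshold of p < 0.001 and the assumption that voxels are tested independently are illustrative choices, not figures from the talk.

```python
# Expected number of voxels that "light up" by chance alone in one scan.
N_VOXELS = 50_000   # from the slide
ALPHA = 0.001       # assumed uncorrected per-voxel threshold (hypothetical)


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Mean number of false positives when there is no real effect anywhere."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold that holds the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


chance_hits = expected_false_positives(N_VOXELS, ALPHA)
corrected = bonferroni_threshold(N_VOXELS)

print(f"Expected chance hits per scan: {chance_hits:.0f}")
print(f"Bonferroni-corrected per-voxel threshold: {corrected:.1e}")
```

Under these assumptions, dozens of voxels would cross the threshold even in a dead fish, which is why a correction for multiple comparisons (Bonferroni is only one option, and a conservative one) is needed before reading anything into the bright spots.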

  8. Amen poster

  9. Would You Accept This Evidence? • I tried it. I got better. It worked for me. • Lots of people tried it and got better. • We gave it to a lot of people in a study and they improved. • We compared it to a no-treatment group or a usual-treatment group and it worked better. • We compared it to a placebo and it worked better. • The weight of evidence from a large body of studies shows that it works better than placebo

  10. Is This Evidence? • I tried it. I got better. It worked for me. • Anecdote. Plural of anecdote is not data. • Post hoc ergo propter hoc fallacy • Does Echinacea prevent colds? • Removing glucosamine didn’t remove effects • We gave it to a lot of people in a study and they improved. • Uncontrolled study. Maybe they would have improved without treatment. • Cold got better in a week with treatment, lasted 7 days without treatment.

  11. Is This Evidence? • Our study compared it to a no-treatment group or a usual-treatment group and it worked better. • Hawthorne effect: Doing something is better than doing nothing. • Our study compared it to a placebo and it worked better. • Was the study blinded? • Double blind, placebo-controlled randomized study is the Gold Standard. • BUT: What if we do a Gold Standard study on something totally implausible and it works better than a placebo?

  12. There’s A Lot of Evidence: A Fire Hose of Information • 21 million papers are listed in PubMed: • 700,000 more each year • One a minute • PubMed lists 23,000 journals, and there are many more not listed • You can find a study to support any belief.

  13. Never Believe One Study • Early positive studies often superseded by better, negative studies (HRT). • Ioannidis: Most published research findings are false.

  14. Ioannidis • The smaller the study, the less likely the research findings are to be true. • The smaller the effect, the less likely the research findings are to be true. • The greater the financial and other interests, the less likely the research findings are to be true. • The hotter a scientific field (with more research teams involved), the less likely the research findings are to be true.

  15. Evaluating a Study • Ask a lot of questions • I’ll cover some

  16. Skeptics Question Everything

  17. What Kind of Study? • Case report • Case series • Case-control • Cohort • Epidemiologic • RCT • Placebo-controlled • Blinded (single or double)

  18. Who’s Paying? • Studies sponsored by pharmaceutical companies more likely to be positive • Subtle bias • Unpublished negative information • Studies by researchers with financial conflicts of interest (consulting fees, honoraria from pharmaceutical company) more likely to be positive 91% vs. 67%

  19. Big Pharma Distortion • Turner looked at all antidepressant studies registered with FDA • Published studies: 94% positive • Unpublished studies: 51% positive Evidence that antidepressants don’t work? No.

  20. Effect Size

  21. Turner vs. Kirsch • Kirsch said < .5 means ineffective • Effect size from journals: .41 • True effect size: .31 • Therefore antidepressants are not effective • Turner said glass not empty, 1/3 full • Patients’ responses not all-or-none; partial responses can be meaningful • Antidepressants DO work, just not as well as originally thought. • Kirsch supports psychotherapy, but its effect size is much less than .5.

  22. Scam Product Testing • In-house: by non-academics on company’s payroll • Worthless. Tweaked to get desired results • Independent testing companies: guns for hire • Minuscule effects touted as significant • Effects found, but not specific to product • Amino acids may improve muscle strength • Effects may not apply to average people (e.g., taping injuries)

  23. Are the Researchers Biased? • Homeopathy studies done by homeopaths • Chiropractic studies done by chiropractors • Surgical studies done by surgeons • Studies published in specialty journals for a biased audience

  24. Who Are the Subjects? • Self selection bias: who volunteers? • Believers? • Professional subjects? • Select group not typical of the general population. • Men only? No children? Limited age group? • Subjects with concurrent diseases not accepted • Subjects taking other medications not accepted.

  25. Were Negative Studies Suppressed? • File drawer effect • Negative studies not submitted for publication. • What if 4/5 studies were negative but only the positive one published? • Publication bias • Journals don’t like to publish negative studies. • Journals don’t like to publish replications that debunk original results. (Bem, Wiseman)

  26. Did Workers Mislead Author? • Technicians and subordinates know what the researcher hopes to find. • May try to please the boss, consciously or unconsciously • May circumvent blinding procedures • Can record 4.5 as 4 or 5. • Faking to make job easier (homeopathy prep)

  27. Did Workers Mislead Author? • Benveniste homeopathy study • Counting basophil degranulation under the microscope is somewhat subjective • Only one technician got positive results

  28. What Are the Odds? • 9 out of 10 drugs in Phase I clinical trials fail. • 50% of drugs that reach Phase III trials fail. • A far higher percentage of promising drugs never make it to clinical trials; they fail in animal and in vitro trials.

  29. Do the Data Justify the Conclusion? • Teaching exercise: • Read the data section first • Draw your own conclusions • Read the paper’s conclusions • Scratch your head

  30. Do the Data Justify the Conclusion? Conclusion: low cholesterol kills children. The higher the cholesterol, the better for health.

  31. Do the Data Justify the Conclusion? • Sample of opportunity: data not collected systematically • Too few points to show correlation • Correlation doesn’t prove causation • Other explanations: • Hygiene • Poverty • Disease • Starvation • Genetic factors • Less access to medical care • Better explanation: undernourished children have abnormally low cholesterol levels

  32. Do the Data Justify the Conclusion? Conclusion: by the year 2038 100% of children will be autistic

  33. What Aren’t They Telling Us? • Selection methods • Randomization methods • Identity of placebo • Whether people were fooled by placebo • Proper blinding procedures? • Other factors • Glassware not thoroughly washed? • Contaminants in lab? • Mouse XMRV virus contaminated cell cultures in CFS study • Did they really do what they said they did?

  34. How Many Dropouts? - 10 total patients: 7 neg. 3 pos. = 30% pos. - 6 drop out because it’s not working - 30% success rate now looks like 75%
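
The dropout arithmetic on this slide can be checked with a short sketch. The counts are the slide's own; the function name and the labels "enrolled" and "completers" are introduced here for illustration.

```python
def success_rate(positive: int, negative: int) -> float:
    """Fraction of patients counted as successes."""
    return positive / (positive + negative)


# Everyone who started the study: 3 improved, 7 did not.
enrolled_rate = success_rate(positive=3, negative=7)

# Six of the seven non-responders quit because the treatment was not working,
# so only 3 successes and 1 failure remain to be counted at the end.
dropouts = 6
completers_rate = success_rate(positive=3, negative=7 - dropouts)

print(f"Counting everyone enrolled: {enrolled_rate:.0%}")
print(f"Counting only those who finished: {completers_rate:.0%}")
```

Nothing about the treatment changed between the two lines of output; only the denominator did. That is why a study should report how many subjects dropped out and why.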

  35. Where Was the Study Done? Percent of Acupuncture Trials with Positive Results • Canada, Australia, New Zealand 30% • US 53% • Scandinavia 55% • UK 60% • Rest of Europe 78% • Asia 98% • Brazil, Israel, Nigeria 100%

  36. What Was the Sample Size? • 1/3 of the chickens got better • 1/3 of the chickens stayed the same • What about the other third?

  37. Were There Errors in Statistics? • Wrong statistical test used • Errors in calculation

  38. What About Noncompliance? • Did all subjects take their pills? • Did they take them on time?

  39. Noncompliance • HIV Prophylaxis study in Africa • 95% said they usually or always took meds on time • Pill count data: 88% • Tests showed adequate plasma levels of drug: 15-26%

  40. Tooth Fairy Science • Are they trying to study something that doesn’t exist?

  41. Emily Rosa and the Emperor's New Clothes

  42. Inaccurate Measuring Methods? • Questionnaires rely on unreliable memories and patient honesty. • “30% less pain” • “I eat like a bird” • “Only one drink”

  43. Using a Bogus Test? Measuring the Components of ASEA • A mixture of 16 chemically recombined products of salt and water with completely new chemical properties. • They used a fluorescent indicator as a probe for unspecified “highly reactive oxygen species”

  44. How Many Endpoints Were There? • Multiple endpoints: some will show false correlations just by chance • Statistical corrections applied? • Inappropriate data mining? • The heart prayer study • 6 positive out of 26 factors studied • Inconsistent pattern
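
A rough sketch of why multiple endpoints invite chance findings. The figure of 26 endpoints is from the slide; the significance level of 0.05 and the assumption that the endpoints are independent are illustrative, and nothing here is taken from the prayer study's actual analysis.

```python
N_ENDPOINTS = 26   # from the slide
ALPHA = 0.05       # assumed conventional significance level (hypothetical)


def prob_at_least_one_false_positive(n_endpoints: int, alpha: float) -> float:
    """Chance that one or more endpoints look 'significant' when nothing works,
    assuming the endpoints are statistically independent."""
    return 1.0 - (1.0 - alpha) ** n_endpoints


def expected_false_positives(n_endpoints: int, alpha: float) -> float:
    """Average number of spurious 'hits' under the same assumptions."""
    return n_endpoints * alpha


p_any = prob_at_least_one_false_positive(N_ENDPOINTS, ALPHA)
mean_hits = expected_false_positives(N_ENDPOINTS, ALPHA)

print(f"Chance of at least one spurious result: {p_any:.0%}")
print(f"Expected number of spurious results: {mean_hits:.1f}")
```

Under these assumptions a useless treatment would more often than not produce at least one "positive" endpoint. This does not show that every positive result in a given study is spurious; it shows why the question "statistical corrections applied?" has to be asked.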

  45. Were Goalposts Moved? • AIDS prayer study: endpoint death • Not enough subjects died: AIDS drugs kept them alive • They went back and looked at a lot of other factors and found some apparent successes (e.g., fewer doctor visits) but no change in objective tests like CD4 count. • Only 40 patients. Study wasn’t designed to test non-death outcomes.

  46. Statistical Significance ≠ Clinical Significance • Did the drug lower the BP by 1% or 30%? • Was the endpoint a lab value or a clinical benefit? • B vitamin supplements lower homocysteine but don’t lower risk of heart disease • PSA screening finds cancers; doesn’t improve survival • Are the results POEMs – Patient Oriented Evidence that Matters?

  47. Was There Fraud? • Dipak Das, resveratrol researcher • Review board found him guilty of 145 counts of fabrication or falsification of data • 12 of his papers retracted so far

  48. “I was blinded by work and my drive for achievement” • Hwang Woo-suk, stem cell researcher in South Korea, claimed to have cloned human embryonic stem cells • Fabricated crucial data • Embezzlement and bioethics law violations • Prison sentence (suspended) • 2 papers in Science retracted. • Fired from his job

  49. Columbia Prayer Study • Prayer doubled success of in vitro fertilization • Seriously flawed study • Convoluted design with 3 levels of overlapping prayer groups • No controls for prayers outside study • Investigated for lack of informed consent • Authors • Lobo, lead author, only learned of study 6-12 months after it was completed. Denied any involvement other than editorial help. • Cha severed his relationship with Columbia, refused to comment • Wirth: • Paranormal researcher with no medical degree • Con man who went to federal prison for fraud and conspiracy • Bruce Flamm debunked it in Skeptical Inquirer • Retracted by journal, but only years later • Still being cited as a valid study

  50. How Were the Data Reported? • NNT and NNH • Lipitor for primary prevention of heart attacks: • 19% Reduction • NNT 75-250, NNH 200. • Absolute risk vs. relative risk • Cellphones increase the risk of acoustic neuroma. Relative risk 200%. • Baseline risk is 1:100,000 • 200% of 1 is 2 • Absolute risk 1 more in 100,000, or 0.001%
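
The relative-versus-absolute comparison can be worked through in a few lines. This sketch follows the slide's reading that a relative risk of 200% means the baseline risk doubles, from 1 to 2 cases per 100,000; the variable names and the number-needed-to-harm calculation (1 divided by the absolute risk increase) are added here for illustration.

```python
BASELINE_RISK = 1 / 100_000   # from the slide: 1 case per 100,000 people
RELATIVE_RISK = 2.0           # "200%" read as a doubling of risk (per the slide)

exposed_risk = BASELINE_RISK * RELATIVE_RISK

# Absolute risk increase: the extra cases attributable to the exposure.
absolute_increase = exposed_risk - BASELINE_RISK
absolute_increase_percent = absolute_increase * 100

# Number needed to harm: how many people must be exposed for one extra case.
number_needed_to_harm = 1 / absolute_increase

print(f"Extra cases per 100,000 people: {absolute_increase * 100_000:.0f}")
print(f"Absolute risk increase: {absolute_increase_percent:.3f}%")
print(f"Number needed to harm: about {number_needed_to_harm:,.0f}")
```

The same finding can be reported as "risk doubled" or as "one extra case per 100,000 people" (an absolute increase of 0.001%). Both are accurate; only the second tells a patient how much to worry.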
