Evaluating and grading evidence • Phil Wiffen, UK & Chinese Cochrane Centre
Quality assessments • Need to be specific for each type of study design.
What do we mean by quality for systematic reviews? • The authors have used recognised methods to avoid bias in all stages of the review • The review is an overview of the world's literature on the topic • The conclusions and recommendations fairly reflect the results found in the included RCTs
What do we mean by quality for RCTs? • A trial is measuring the effect of an intervention in a sample • This is used to estimate the effect in the population from which the sample came • Quality means how well we think the trial has estimated this effect • Also called ‘internal validity’
What we don’t mean by quality for RCTs • How well the estimate of effect is likely to hold for populations other than the one from which the trial sample came (external validity) • Quality of reporting: by quality we mean how the trial was actually conducted, not how well it was reported
Why worry about trial quality in a review? • Garbage in, garbage out • especially important if reviews are looked at uncritically • Even if the review follows high quality methods, the answer may be wrong if the component studies are poor quality
Tools for evaluating and grading evidence • Basic Concepts • Critical appraisal tools • Other systems • Scoring/grading tools
3 key elements to consider • Quality • Size • Validity
Critical appraisal tools • Critical Appraisal Skills Programme (CASP) Oxford • http://www.phru.nhs.uk/casp/critical_appraisal_tools.htm
Tools available for: • systematic reviews • randomised controlled trials • qualitative research studies • economic evaluation studies • cohort studies • case control studies • diagnostic test studies
Ten questions to make sense of a review Adapted from Oxman AD et al. Users' guides to the medical literature. VI. How to use an overview. JAMA 1994; 272(17): 1367-71 Answer: YES, NO, DON'T KNOW A Are the results of the review valid? • 1. Did the review address a clearly focused issue? • e.g. the population, intervention and/or outcomes • 2. Did the authors look for the appropriate sort of papers? • Did they address the review's question and have an appropriate study design? • Is it worth continuing?? • 3. Do you think the important relevant studies were included? • Look at the search methods, use of reference lists, unpublished studies and non-English language studies • 4. Did the authors do enough to assess the quality of the included studies? • 5. If the results of studies have been combined, was it reasonable to do so?
Ten questions to make sense of a review cont’d B What are the results? 6. What is the overall result of the review? Is there a clear numerical expression? 7. How precise are the results? Confidence intervals? (see the sketch below) C Will the results help my local situation? 8. Can the results be applied locally? 9. Were all important outcomes considered? 10. Are the benefits worth the harms and costs?
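To make questions 6 and 7 concrete, here is a minimal sketch (not from the slides) of what a clear numerical expression of an overall result can look like: a fixed-effect, inverse-variance pooling of trial risk ratios on the log scale, with a 95% confidence interval. The trial counts are invented illustration data.

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio for one trial and the standard error of its log."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log_rr = math.sqrt(1/events_t - 1/n_t + 1/events_c - 1/n_c)
    return rr, se_log_rr

def pooled_rr(trials):
    """Fixed-effect (inverse-variance) pooled risk ratio with 95% CI."""
    weights, weighted_logs = [], []
    for events_t, n_t, events_c, n_c in trials:
        rr, se = risk_ratio(events_t, n_t, events_c, n_c)
        w = 1 / se**2                      # inverse-variance weight
        weights.append(w)
        weighted_logs.append(w * math.log(rr))
    log_pooled = sum(weighted_logs) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    lo = math.exp(log_pooled - 1.96 * se_pooled)
    hi = math.exp(log_pooled + 1.96 * se_pooled)
    return math.exp(log_pooled), lo, hi

# Hypothetical trials: (events_treatment, n_treatment, events_control, n_control)
trials = [(12, 100, 20, 100), (8, 80, 15, 80), (30, 250, 45, 250)]
rr, lo, hi = pooled_rr(trials)
print(f"Pooled RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Heterogeneity checks and random-effects models are deliberately left out to keep the sketch short; a real review would need both.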
Additional questions to make sense of a review • What size are the studies? Beware small numbers: <20? <50? <500? • Is there any bias not yet covered? • Do I need to look at the primary studies?
Eleven questions to make sense of a trial Answer: YES, NO, DON'T KNOW A Are the results of the trial valid? • 1. Did the trial address a clearly focused issue? • e.g. the population, intervention and/or outcomes • 2. Was the trial randomised? • Was this done properly? • 3. Were all the subjects who entered accounted for in the results? • Check for withdrawals and dropouts • Is it worth continuing?? • 4. Were the study personnel and patients ‘blind’ to treatments? • 5. Were the groups similar at the start of the study? • 6. Were the groups treated equally apart from the intervention?
Eleven questions to make sense of a trial cont’d B What are the results? 7. How large was the treatment effect? 8. How precise was the estimate of the treatment effect? Confidence intervals? (see the sketch below) C Will the results help my local situation? 9. Can the results be applied locally? 10. Were all important outcomes considered? 11. Are the benefits worth the harms and costs?
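For questions 7 and 8, a minimal sketch of how the size and precision of a treatment effect fall out of a single trial's 2x2 table: relative risk with a 95% confidence interval (the standard log-scale approximation), plus absolute risk reduction and number needed to treat. The event counts are hypothetical.

```python
import math

# Hypothetical 2x2 trial results: outcome events / group size
events_t, n_t = 15, 120   # treatment group
events_c, n_c = 30, 118   # control group

risk_t = events_t / n_t
risk_c = events_c / n_c

rr = risk_t / risk_c       # relative risk (treatment effect)
arr = risk_c - risk_t      # absolute risk reduction
nnt = 1 / arr              # number needed to treat

# 95% CI for the RR via the log scale (standard approximation)
se = math.sqrt(1/events_t - 1/n_t + 1/events_c - 1/n_c)
lo = math.exp(math.log(rr) - 1.96 * se)
hi = math.exp(math.log(rr) + 1.96 * se)

print(f"RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f}), "
      f"ARR = {arr:.1%}, NNT = {nnt:.0f}")
```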
Guidance on reporting of studies • Clinical trials: CONSORT • Systematic reviews: QUOROM • Diagnostic accuracy studies: STARD • Tumour marker prognostic studies: REMARK • Other: GRADE
Rowbotham et al. Gabapentin for the Treatment of Postherpetic Neuralgia: A Randomized Controlled Trial. JAMA 1998; 280: 1837-1842.
Assessment scales for RCTs • At least 20 scales • 11 checklists • 7 guidance documents • (AHRQ Evidence Report/Technology Assessment: Number 47 (2006)) • Are they of value? • Tools NOT rules!
Scoring • Was the study described as randomised? yes = 1 • Was the randomisation method appropriate? yes (e.g. random number table) = +1, no (e.g. alternation) = -1 • Was the study described as double-blind? yes = 1 • Was the blinding method appropriate? yes (e.g. double-dummy) = +1, no = -1 • Were withdrawals and dropouts described? yes = 1 • Maximum score 5 (Jadad/Oxford scale)
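The rules above are simple enough to express as a small function, shown here as a sketch; the function name and the True/False/None convention for the "appropriate?" follow-ups are my own, not part of the published scale.

```python
def jadad_score(randomised, randomisation_appropriate,
                double_blind, blinding_appropriate,
                withdrawals_described):
    """Jadad/Oxford quality score (0-5) following the rules above.

    The *_appropriate flags are True, False or None
    (None = method not described, so no bonus or penalty point).
    """
    score = 0
    if randomised:
        score += 1
        if randomisation_appropriate is True:     # e.g. random number table
            score += 1
        elif randomisation_appropriate is False:  # e.g. alternation
            score -= 1
    if double_blind:
        score += 1
        if blinding_appropriate is True:          # e.g. double-dummy
            score += 1
        elif blinding_appropriate is False:
            score -= 1
    if withdrawals_described:
        score += 1
    return max(score, 0)

# A trial randomised by table, double-dummy blinded, withdrawals reported:
print(jadad_score(True, True, True, True, True))  # -> 5
```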
Results from empirical studies • Jüni et al. BMJ 2001; 323: 42-46
So… • Inadequate allocation concealment is a major threat to validity • Failure to blind is probably a threat, particularly when the outcome is ‘subjective’ • We have little evidence about exclusions
Quality scales • Assign an overall quality score • Lots of scales (approx 40) • Problems • selection of items is often not based on likelihood of bias • scales often assess how things were reported rather than how they were done • validity/reliability often not tested • the estimated effect of quality may depend on the scale used
[Figure: LMWH for general surgery, relative risk of deep vein thrombosis with 95% CI. Trials are pooled three ways (all trials, high quality trials, low quality trials) under each of 25 quality scales: Nurmohamed, Chalmers I, Chalmers TC, Imperiale, Smith, Jadad, Cho, Onghena, Poynard, Spitzer, ter Riet, Andrew, Beckerman, Jonas, Reisch, Detsky, Brown, Kleijnen, Gøtzsche, Evans, Goodman, Levine, Koes, Linde, Colditz. RR < 1: LMWH better; RR > 1: LMWH worse. Jüni et al, JAMA 1999]
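The pattern in the figure, where the pooled estimate shifts depending on which trials a scale labels high or low quality, can be mimicked with a small sensitivity analysis. In this sketch the risk ratios, standard errors and quality judgements are invented; only the structure (pool all trials, then each quality stratum) mirrors the Jüni analysis.

```python
import math

def pool(trials):
    """Fixed-effect inverse-variance pool of (rr, se_log_rr) pairs."""
    ws = [1 / se**2 for _, se in trials]
    log_pooled = sum(w * math.log(rr)
                     for (rr, _), w in zip(trials, ws)) / sum(ws)
    return math.exp(log_pooled)

# Hypothetical trials: (risk ratio, SE of log RR, judged high quality?)
trials = [(0.60, 0.25, True), (0.70, 0.30, True),
          (0.45, 0.35, False), (0.50, 0.40, False)]

all_rr  = pool([(rr, se) for rr, se, _ in trials])
high_rr = pool([(rr, se) for rr, se, hq in trials if hq])
low_rr  = pool([(rr, se) for rr, se, hq in trials if not hq])

print(f"All trials: RR {all_rr:.2f} | "
      f"high quality: {high_rr:.2f} | low quality: {low_rr:.2f}")
```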
Conclusions • Some assessment is better than none. • Critical appraisal tools are very helpful in determining usefulness • Quality scales are very attractive but can be misleading. Use with caution