Interpreting the Medical Literature: A real example Sandra E Moore, MD MSc FAAP Assistant Professor of Clinical Pediatrics Morehouse School of Medicine
Objectives • To determine how to critically evaluate the literature • To determine how to choose an appropriate question to study
Where Do You Start? • Start with the patient -- a clinical problem or question arises out of the care of the patient • The question - Construct a well-built clinical question derived from the case • The resource - Select the appropriate resource(s) and conduct a search • The evaluation - Appraise that evidence for its validity (closeness to the truth) and applicability (usefulness in clinical practice) • The patient - Return to the patient, integrate that evidence with clinical expertise and patient preferences, and apply it to practice • Self-evaluation - Evaluate your performance with this patient
Background questions • Information about a disease or process that you should already know or can readily access • Who • What • When • Why • Where and • How
Foreground questions • Compare two or more things/groups • Drugs • Treatment modalities • Groups based on exposures • Diagnostic tests or • The harms or benefits of two approaches
PICO format • P - Patient / Problem • What are the characteristics of the patient or population? • What is the condition or disease you are interested in? • I - Intervention (or exposure) • What do you want to do with this patient (e.g. treat, diagnose, observe)? • C – Comparison (or Control), if any • What is the alternative to the intervention (e.g. placebo, different drug, surgery)? • O – Outcome • What are the relevant outcomes (e.g. morbidity, death, complications)?
PP-ICONS • P - Problem • P - Patient or population • I - Intervention (or exposure) • C - Comparison (or Control), if any • O - Outcome (disease-oriented evidence, DOEs, or patient-oriented evidence that matters, POEMs) • N - Number of Subjects • S - Statistics
Intervention Studies • Is the study valid? • What are the results? and • Is it applicable to the patient?
Clinical Scenario • You are in Continuity Clinic • A mother brings in her 9-year-old daughter because of warts on her hand. The mother has read about the efficacy of duct tape in treating warts and wants to know your opinion
Background Questions about Warts • Who? • What? • When? • Why? • Where? and • How?
What is our Question? • P (problem/patient): children and warts • Intervention: duct tape • Control or comparisons: nothing or other therapies • Outcomes: resolution of warts • Compared to standard treatment, is duct tape effective for the treatment (resolution) of warts in children?
PP-ICONS for Study • P - Problem • P - Patient or population • I - Intervention (or exposure) • C - Comparison (or Control), if any • O - Outcome (disease-oriented evidence, DOEs, or patient-oriented evidence that matters, POEMs) • N - Number of Subjects • S - Statistics
Search in Appropriate Database • Medline • PubMed • Cochrane • InfoPOEMs (Patient-Oriented Evidence that Matters) • UpToDate • MD Consult • MSM Library • Search result: Focht DR III, et al. The efficacy of duct tape vs cryotherapy in the treatment of verruca vulgaris (the common wart). Arch Pediatr Adolesc Med. October 2002;156:971-4.
Is duct tape effective for the treatment (resolution) of warts in children? Focht DR et al. • Objective: To determine if application of duct tape is as effective as cryotherapy in the treatment of common warts. • Design: A prospective, randomized controlled trial with 2 treatment arms for warts in children. • Patients: A total of 61 patients (age range, 3-22 years) were enrolled in the study from October 31, 2000, to July 25, 2001; 51 patients completed the study and were available for analysis. • Intervention: Patients were randomized using computer-generated codes to receive either cryotherapy (liquid nitrogen applied to each wart for 10 seconds every 2-3 weeks) for a maximum of 6 treatments or duct tape occlusion (applied directly to the wart) for a maximum of 2 months. Patients had their warts measured at baseline and with return visits. • Main Outcome Measure: Complete resolution of the wart being studied. • Results: Of the 51 patients completing the study, 26 (51%) were treated with duct tape, and 25 (49%) were treated with cryotherapy. Twenty-two patients (85%) in the duct tape arm vs 15 patients (60%) enrolled in the cryotherapy arm had complete resolution of their warts (P = .05 by χ² analysis). The majority of warts that responded to either therapy did so within the first month of treatment. • Conclusion: Duct tape occlusion therapy was significantly more effective than cryotherapy for treatment of the common wart.
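The reported P value can be checked from the counts in the abstract. The sketch below assumes a Pearson chi-square test on the 2x2 table of completers with no continuity correction; the abstract does not say which variant the authors used, so this is an illustration and not a reconstruction of their analysis. The function names are made up for this example.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Rows: duct tape, cryotherapy. Columns: resolved, not resolved.
# 22 of 26 duct tape patients and 15 of 25 cryotherapy patients resolved.
chi2 = chi_square_2x2(22, 4, 15, 10)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the statistic lands just above the 1-degree-of-freedom cutoff of 3.84, which is consistent with a P value that rounds to .05.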
Is this a clinically relevant question? • Warts are common in children • Quick, effective, and inexpensive treatment is not available (maybe salicylic acid) • Although warts are medically benign, they are unsightly and may cause a child to feel self-conscious • Parents/children often seek medical therapy • Up to 30% resolve in 10 weeks without therapy
Are the results of this therapy (intervention) study valid? • Was the assignment of patients to treatment randomized? • Were all the patients who entered the trial properly accounted for at its conclusion? • Were patients analyzed in the groups to which they were (originally) randomized? • Were there enough patients (N)? • Was the statistical test appropriate? • Were the results statistically significant?
Was the assignment of patients to treatment randomized? • Random allocation comes closest to ensuring the creation of groups of patients who will be similar in their risk of the events you hope to prevent. • Randomization tends to balance the groups for prognostic factors (e.g., disease severity), which limits over-representation of any one characteristic within the study groups. • The randomization sequence should also be concealed from the clinicians and researchers of the study to help eliminate conscious or unconscious bias. • Were the patients randomized in the duct tape study?
Number of Subjects (N) • The number of subjects is crucial to whether accurate statistics can be generated from the data. • A research study with too few patients may be unable to show a difference that actually exists between the intervention and comparison groups (the ability to detect such a difference is the "power" of a study). • Many studies are published with fewer than 100 subjects, which is usually inadequate to provide reliable statistics. • A good rule of thumb is 400 subjects
Did the study have sufficient power? • The power of a test refers to its ability to detect what it is looking for: the probability of correctly rejecting the null hypothesis (i.e., the probability of finding a difference if it truly exists) • Alpha (α): Usually set to 0.05, although this is somewhat arbitrary. This is the probability of a type I error, that is, the probability of rejecting the null hypothesis given that the null hypothesis is true. In other words, it is the probability of thinking we have found something when it is not really there. • Power = 1 - Beta (β), where beta is the probability of a type II error (failing to reject a false null hypothesis). Typically 80% (0.8) is considered adequate power. • Power is typically increased by increasing the sample size • No mention of power, α, or β in the duct tape study
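To make the link between sample size and power concrete, here is a rough sketch using a normal approximation for comparing two proportions. It assumes the true resolution rates equal the observed 85% and 60%, which is only an illustrative assumption (power is properly set before a trial, from a clinically meaningful difference). The 100-per-arm scenario is hypothetical and does not come from the study.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Normal approximation with an unpooled standard error; the negligible
    contribution of the opposite tail is ignored.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return std_normal.cdf(abs(p1 - p2) / se - z_crit)

# Arm sizes actually analyzed in the duct tape study: 26 and 25.
power_small = approx_power(0.85, 0.60, 26, 25)

# Hypothetical larger trial with 100 patients per arm (assumption for illustration).
power_large = approx_power(0.85, 0.60, 100, 100)

print(f"26 vs 25 patients:   power ~ {power_small:.2f}")
print(f"100 vs 100 patients: power ~ {power_large:.2f}")
```

Under these assumptions, a trial of the size analyzed falls well short of the conventional 80% target, while the hypothetical larger trial exceeds it.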
Were all the patients who entered the trial properly accounted for at its conclusion? • All patients who started the trial should be accounted for at the end of the trial. If patients are not accounted for, the validity of the study may be jeopardized. • A good study will have better than 80% follow-up of its patients. • Patients may drop out of a study for various reasons. If these patients are not included in the results, they can make the treatment look better than it really is (and vice versa). • To be sure of a study's conclusions, lost patients should be assigned to the "worst-case" outcomes and the results recalculated. The results are still valid if the recalculations do not change the end results. This worst-case recalculation is often reported alongside an “Intention to Treat” analysis
Statistical Tests in Our Study • Relative risk reduction (RRR): the percent reduction in events in the treated group compared to the control group event rate. • Not a good way to compare outcomes • Amplifies small differences and makes insignificant findings appear significant • Doesn’t reflect the baseline risk of the outcome event • Can make weak results look good • Popular and will be reported in almost every journal article • Can mislead you • RRR would be (85 percent - 60 percent) / (60 percent) x 100 = 42% • i.e., duct tape appears to be 42% more effective than cryotherapy in treating warts • Absolute risk reduction (ARR): the difference in the outcome event rate between the control group and the experimental group. • A better statistic to evaluate outcome, as it does not amplify small differences, but shows the true difference between the experimental and control interventions • ARR for the wart study is the outcome event rate (complete resolution of warts) for duct tape (85 percent) minus the outcome event rate for cryotherapy (60 percent) = 25 percent
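The same arithmetic can be sketched from the raw counts in the abstract (22 of 26 for duct tape, 15 of 25 for cryotherapy). Because the outcome here is a favorable event (wart resolution), both differences run in favor of duct tape. Variable names are illustrative only.

```python
def event_rate(events, total):
    """Proportion of patients in an arm who had the outcome event."""
    return events / total

duct_tape_rate = event_rate(22, 26)     # about 0.85
cryotherapy_rate = event_rate(15, 25)   # 0.60

# Absolute difference in resolution rates (the slide's ARR).
absolute_difference = duct_tape_rate - cryotherapy_rate

# Relative difference, expressed against the cryotherapy rate (the slide's RRR).
relative_difference = absolute_difference / cryotherapy_rate

print(f"Absolute difference: {absolute_difference:.1%}")
print(f"Relative difference: {relative_difference:.1%}")
```

With unrounded rates the relative figure comes out near 41% instead of the 42% obtained from the rounded 85% and 60%. That gap is small, but it shows how sensitive the relative measure is to rounding.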
Statistical Significance • Are the results statistically significant? • Statistical test p < 0.05 • Confidence interval (CI) for a difference should not include 0 • Statistical significance DOES NOT mean clinical significance • Results of the duct tape study had a p value of .05 • CI not listed • The calculated 95% CI for this study's reported treatment effect is 1.1 to 48.1 (percentage points), and we can state with 95% confidence that the true treatment effect is somewhere between these 2 values. • With a maximally conservative estimate that includes patients lost to follow-up, the CI contains 0, and therefore the difference is not statistically significant
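A confidence interval for the difference in resolution rates can be approximated directly from the counts. This sketch assumes a simple Wald (normal approximation) interval. The method behind the quoted 1.1 to 48.1 is not stated on the slide, so the result is expected to be close to that interval but not identical.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_ci(events1, n1, events2, n2, level=0.95):
    """Wald confidence interval for the difference between two proportions."""
    p1, p2 = events1 / n1, events2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se

# Completers only: 22/26 resolved with duct tape, 15/25 with cryotherapy.
diff, low, high = risk_difference_ci(22, 26, 15, 25)
print(f"Difference: {diff:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The lower bound sits only slightly above 0, so the result is barely significant and small changes in the data could move the interval across 0.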
Key terminology for estimating the size of the treatment effect • Relative Risk (RR) is the risk of the outcome in the treated group (Y) compared to the risk in the control group (X): RR = Y / X • Relative Risk Reduction (RRR) is the percent reduction in risk in the treated group (Y) compared to the control group (X): RRR = (1 - Y / X) x 100% • Absolute Risk Reduction (ARR) is the difference in risk between the control group (X) and the treatment group (Y): ARR = X - Y • Number Needed to Treat (NNT) is the number of patients that must be treated over a given period of time to prevent one adverse outcome: NNT = 1 / (X - Y)
Intention to Treat (ITT) • Intention to treat: subjects are analyzed according to the categories into which they were originally randomized. • Missing outcomes are often handled by assuming a worst-case scenario • Benefits of a treatment are more difficult to demonstrate with intention-to-treat analysis • Helps to mitigate differences by including subjects who are unlikely to have experienced benefit from the intervention • Six patients from the cryotherapy group and 4 patients from the duct tape group were lost to follow-up (16% of patients). • Worst-case scenario: assume the 6 cryotherapy patients had wart resolution and the 4 duct tape patients had residual warts. • Wart resolution would then be: duct tape 22/30 (73%) and cryotherapy 21/31 (68%) (95% CI for the difference, -17 to 28); therefore there is no statistically significant difference between the two treatments.
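The worst-case recalculation can be sketched from the counts given: 26 and 25 completers, plus 4 and 6 patients lost to follow-up. The rates are computed directly from those counts (22 of 30 and 21 of 31). The interval is a Wald approximation, which is an assumption about method and not something the slide specifies.

```python
from math import sqrt
from statistics import NormalDist

# Completers (from the abstract) and losses to follow-up (from the slide).
duct_resolved, duct_completed, duct_lost = 22, 26, 4
cryo_resolved, cryo_completed, cryo_lost = 15, 25, 6

# Worst case for duct tape: its lost patients count as failures,
# while lost cryotherapy patients count as successes.
duct_n = duct_completed + duct_lost
cryo_n = cryo_completed + cryo_lost
duct_rate = duct_resolved / duct_n
cryo_rate = (cryo_resolved + cryo_lost) / cryo_n

diff = duct_rate - cryo_rate
se = sqrt(duct_rate * (1 - duct_rate) / duct_n
          + cryo_rate * (1 - cryo_rate) / cryo_n)
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = diff - z * se, diff + z * se

lost_fraction = (duct_lost + cryo_lost) / (duct_n + cryo_n)

print(f"Lost to follow-up: {lost_fraction:.0%}")
print(f"Duct tape {duct_rate:.0%}, cryotherapy {cryo_rate:.0%}")
print(f"Difference {diff:.1%} (95% CI {ci_low:.0%} to {ci_high:.0%})")
```

Because the interval spans 0, the apparent advantage of duct tape does not survive this conservative treatment of the missing patients.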
Number needed to treat (NNT) • Number needed to treat (NNT): number of patients who must be treated to prevent one adverse outcome OR the number of patients who must be treated for one patient to benefit • NNT = 1/ARR. In the case of the duct tape study, 1/0.25 = 4 • The lower the NNT the better (for intervention studies, an NNT of 10 is good and 5 is excellent; for preventive studies, 20 is good)
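As a small worked example, the NNT can be computed both from the rounded 25% difference used on the slide and from the unrounded counts. The helper name is illustrative. NNT is conventionally reported as a whole number of patients, and the rounding convention is left to the reader here.

```python
def number_needed_to_treat(absolute_difference):
    """NNT as the reciprocal of the absolute difference in event rates."""
    if absolute_difference <= 0:
        raise ValueError("NNT is only defined here for a positive difference")
    return 1 / absolute_difference

# Rounded difference from the slide: 85% - 60% = 25%.
nnt_rounded = number_needed_to_treat(0.25)

# Unrounded difference from the counts: 22/26 - 15/25.
nnt_exact = number_needed_to_treat(22 / 26 - 15 / 25)

print(f"NNT from rounded rates: {nnt_rounded:.2f}")
print(f"NNT from raw counts:    {nnt_exact:.2f}")
```

Either way, roughly 4 children would need to be treated with duct tape instead of cryotherapy for one additional child to have complete resolution, assuming the completers-only difference is taken at face value.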
Interpreting our Wart Study • Is this an important clinical question? • Are the patients studied similar to our patient? • Was the intervention acceptable? • Were the outcomes clinically relevant? • Was the assignment of patients to treatment randomized? • Were all the patients who entered the trial properly accounted for at its conclusion? • Were patients analyzed in the groups to which they were (originally) randomized? • Were there enough patients (N)? • Are the results statistically significant? • Are the results clinically significant? • Is the NNT appropriate?
Back to the patient • Would you recommend duct tape to your patient?
REFERENCES • Focht DR III, et al. The efficacy of duct tape vs cryotherapy in the treatment of verruca vulgaris (the common wart). Arch Pediatr Adolesc Med. 2002;156:971-974. • Krejcie RV, Morgan DW. Determining sample size for research activities. Educational and Psychological Measurement. 1970;30:607-610. • Ringold et al. Is duct tape occlusion therapy as effective as cryotherapy for the treatment of the common wart? Arch Pediatr Adolesc Med. 2002;156(10):975-977. • de Haen et al. Efficacy of duct tape vs placebo in the treatment of verruca vulgaris (warts) in primary school children. Arch Pediatr Adolesc Med. 2006;160:1121-1125. • Van Cleave et al. Interpreting negative results from an underpowered clinical trial: warts and all. Arch Pediatr Adolesc Med. 2006;160:1126-1129. • Cummings. Studies should report estimates of treatment effects with confidence intervals. Arch Pediatr Adolesc Med. 2007;161:518-519.
Online References/Tools • http://www.med.umich.edu/pediatrics/ebm/Jcguide.htm • http://southmed.usouthal.edu/library/ebmclass/rotationswinterspring.htm • http://www.hsl.unc.edu/services/tutorials/ebm/Evidence.htm • http://www.aafp.org/fpm/20050700/37howt.html • http://healthlinks.washington.edu/ebp/pico.html • http://www.aafp.org/fpm/20040500/47asim.html • http://www.jeremymiles.co.uk/misc/power/ • http://www.graphpad.com/www/book/Choose.htm