Critical Appraisal Dr. Chris Hall – Facilitator Dr. Dave Dyck R3 March 20/2003
Objectives: • Review study design and the advantages/ disadvantages of each • Review key concepts in hypothesis, measurement, and analysis • Article appraisal • Treatment articles • Diagnosis articles • Harm articles • Overviews/meta-analysis • Survive the next hour and still be able to smile
Study Design: • Ecological studies • Case Reports • Case Series • Cross-Sectional Studies • Case Control and Retrospective Cohort Studies • Prospective Cohort Studies • Randomized Controlled Trials
Ecological Studies: • Studies of a group rather than individual subjects • Supplies data on exposure and disease as a summary measure of the total population as an aggregate eg. Incidence studies • Ecological fallacy: ie. The correlation between the variables is not the same on the individual level as it is for the group. Therefore you cannot link exposures to disease on an individual basis • Also, difficult to account for confounding variables
Case Reports • Submission of individual cases with rare or interesting findings • ++++ subject to bias (selection / submission and publication) • Should not infer causality or suggest practice change
Case Series: • A group of “consecutive cases” with unifying features • Selection bias = what constitutes a case, is it truly consecutive, response bias • Publication bias • Measurement bias (presence of ‘disease’ or exposure may be variable)
Cross Sectional Studies: • Ie. Prevalence study • Presence or absence of a specific disease compared with one or several variables within a defined population at a specific point in time
Cross Sectional Studies disadvantages: • Subject to selection bias (see HO) • Cause and effect cannot be determined (see HO) (ie. Don’t know whether the exposure occurs before the outcome or the outcome occurs before the exposure) • Temporal trends may be missed (seasonal variations) • Previous deaths, drop-outs, and migration are not counted; and short lived, transient outcomes are underrepresented. Thus, CSS are best suited to study chronic, non-fatal conditions.
Cross sectional studies – advantages: • Can do quickly • May provide enough of an association between an exposure/outcome to generate a hypothesis which can be studied by another method. • Useful for descriptive/analytical studies
Case Control Studies: • Starts now and goes back in time • Start with the outcome and ask or find out about prior exposure • Specific hypothesis usually tested • Select all cases of a specific disease during a certain time and select a number of controls who represent the general population, then determine exposure to the factor in each group and calculate the odds ratio • May match controls to patients (but can never be sure of similar baseline states)
CCS cont • Odds ratio (OR) provides an estimate of the relative risk (esp when disease is rare) • Thus, the OR is a good approximation of the relative risk only when disease is rare (< 10% of population) • OR > 1 = greater risk with exposure • OR = 1 = no association • OR < 1 = reduced risk with exposure
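A minimal sketch of the odds ratio calculation for a case control study. The function name and the counts are hypothetical and chosen only for illustration; they do not come from any study discussed here.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical data: 100 cases (40 exposed) and 100 controls (20 exposed).
or_estimate = odds_ratio(
    exposed_cases=40, unexposed_cases=60,
    exposed_controls=20, unexposed_controls=80,
)
print(round(or_estimate, 2))
```

With these made-up counts the OR is about 2.67, ie. greater than 1, suggesting the exposure is associated with increased risk.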
CCS advantages: • Small # needed (good for rare diseases or when outcomes are rare or delayed) • Quick • Inexpensive • Can study many factors
CCS disadvantages: • Problems selecting/matching controls • Only an estimate of relative risk • No incidence rates • Biases (eg. Unequal ascertainment of exposure between cases and controls) • Recall bias = cases are more likely to remember exposure than controls • Selection bias = cases and controls should be selected according to predetermined, strict, objective criteria
Cohort Study (prospective) • Start with 2 groups free of disease and follow forward for a period of time • 1 group has the factor (eg. Smoking) the other group does not • Define 1 or more outcomes (eg. Lung CA) • Tabulate the # of persons who develop the outcome • Provides estimates of incidence, relative risk, and attributable risk
Relative risk / Attributable risk • Relative risk = measures the strength of association between exposure and disease • Attributable risk = measures the number of cases of disease that can be attributed to exposure • Given a constant relative risk, attributable risk rises with incidence of the disease in members of the population who are not exposed
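A short sketch of these cohort measures, assuming attributable risk is taken as the risk difference (incidence in exposed minus incidence in unexposed), which is one common definition. The function name and cohort counts are hypothetical. The two cohorts share the same relative risk but differ in baseline incidence, to illustrate the last point above.

```python
def cohort_measures(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Return (incidence exposed, incidence unexposed, relative risk, attributable risk)."""
    incidence_exposed = exposed_events / exposed_total
    incidence_unexposed = unexposed_events / unexposed_total
    relative_risk = incidence_exposed / incidence_unexposed
    # Assumption: attributable risk = risk difference between the two groups.
    attributable_risk = incidence_exposed - incidence_unexposed
    return incidence_exposed, incidence_unexposed, relative_risk, attributable_risk


# Hypothetical cohorts of 1000 exposed and 1000 unexposed subjects each.
rare = cohort_measures(3, 1000, 1, 1000)        # rare disease
common = cohort_measures(300, 1000, 100, 1000)  # common disease

for label, (_, _, rr, ar) in (("rare", rare), ("common", common)):
    print(label, "RR =", round(rr, 2), "AR =", round(ar, 3))
```

Both hypothetical cohorts have a relative risk of 3, but the attributable risk is 0.002 for the rare disease and 0.2 for the common one.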
Cohort Study • Cannot by itself establish causation, but can show an association between a factor and an outcome • Generally provides stronger evidence for causation than case control studies
Cohort Study advantages: • Lack of bias in factor • Uncovers natural history • Can study many diseases • Yields incidence rates, relative, and attributable risk • Allows for more control of confounding variables
Cohort Study Disadvantages: • Possible bias in ascertainment of disease. • Need large numbers and long follow-up • Easy to lose patients in follow-up (attrition of subjects). This may introduce bias if lost subjects are different from those who continue to be followed • Hard to maintain comparable follow-up for all levels of exposure
Cohort Study disadvantages cont. • Expensive • Locked into the factor(s) measured • Measurement bias (eg. Unblinded physician who looks harder for + outcomes in the exposed pt) • Confounding variables still present
Randomized Controlled Trials: • To test the hypothesis that an intervention (treatment or manipulation) makes a difference. • An experimental group is manipulated while a control group receives a placebo or standard procedure • All other conditions are kept the same between the groups
RCTs • Goals= • Prevention (to decrease risk of disease or death) • Therapeutic (decrease symptoms, prevent recurrences, decrease mortality) • Diagnostic (evaluate new diagnostic procedures)
RCT problems: • Ethical issues • Difficult to test an intervention that is already widely used • Randomization • Blinding techniques (may be difficult due to common SE of drugs) • Control group (placebo, conventional tx, specific tx) • Subject selection and issues of generalizability • Are refusers different in some way?
Key Terms for diagnostic tests: • Sensitivity= proportion with the disease identified by the test • Specificity= proportion without the disease with a negative test
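Sensitivity = a/(a+c) Specificity = d/(b+d) where, in the 2x2 table of test result vs disease, a = true positives, b = false positives, c = false negatives, d = true negatives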
Other key terms: • Positive Predictive Value = The probability of having the disease given a positive test = a/(a+b) • Negative Predictive Value = The probability of not having the disease given a negative test = d/(c+d)
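The four formulas above can be sketched together. The cell labels follow the formulas as written (a = disease present, test positive; b = disease absent, test positive; c = disease present, test negative; d = disease absent, test negative); the function name and the counts are hypothetical.

```python
def diagnostic_metrics(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),  # positives among those with disease
        "specificity": d / (b + d),  # negatives among those without disease
        "ppv": a / (a + b),          # disease among those testing positive
        "npv": d / (c + d),          # no disease among those testing negative
    }


# Hypothetical test applied to 1000 people, 100 of whom have the disease.
metrics = diagnostic_metrics(a=90, b=50, c=10, d=850)
for name, value in metrics.items():
    print(name, round(value, 2))
```

With these made-up counts, sensitivity is 0.90 and specificity about 0.94, yet the PPV is only about 0.64 because the disease is present in just 10% of those tested.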
Statistical Hypothesis: • Null Hypothesis • Hypothesis of no difference between a test group and a control group (ie. There is no association between the disease and the risk factor in the population) • Alternative Hypothesis • Hypothesis that there is some difference between a test group and control group
Measurements and Analysis: • Sampling bias = selecting a sample that does not truly represent the population • Sample size = contributes to the credibility of “positive” studies and the power of “negative” studies. For a given alpha, increasing the sample size decreases the probability of making a type II error (ie. Increases power).
Errors • Type I Error (alpha error) = rejecting a null hypothesis when it is actually true (ie. Declaring an effect to be present when it is not) • The probability of this error is alpha, the significance level chosen before the study • The p value is the probability of seeing a difference at least as large as the one observed if the null hypothesis were true (ie. By chance alone); the result is called significant when p < alpha
Errors cont. • Type II Error (Beta Error) = accepting a null hypothesis as true when it is actually false (ie. Declaring a difference/effect to be absent when it is present) • The probability of this error is beta • Power (1-Beta) = the probability of detecting a difference when one truly exists
Significance: • Statistical Significance: determination by a statistical test that there is evidence against the null hypothesis. • The level of significance depends on the values chosen for alpha error • Usually alpha is set at .05 and beta at .20 (ie. Power of 80%; studies rarely aim for higher power)
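To show how alpha, beta, and sample size interact, here is a rough sketch of power for comparing two proportions using a normal approximation. This is one simplified textbook approximation, not a substitute for a formal sample size calculation. The function name and the event rates (20% vs 15%) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control, p_treatment, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided test of two proportions."""
    standard_normal = NormalDist()
    # Standard error of the difference in proportions (unpooled).
    se = sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treatment * (1 - p_treatment) / n_per_group
    )
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    # Probability the observed difference exceeds the critical value.
    return standard_normal.cdf(abs(p_control - p_treatment) / se - z_critical)


power_100 = approx_power(0.20, 0.15, n_per_group=100)
power_1000 = approx_power(0.20, 0.15, n_per_group=1000)
print(round(power_100, 2), round(power_1000, 2))
```

Under these assumptions, 100 patients per group gives power of roughly 0.15, while 1000 per group gives roughly 0.84, which is why a small "negative" study says little about whether a true difference exists.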
Significance cont. • Clinical Significance: statistical significance is necessary but not sufficient for clinical significance which reflects the meaningfulness of the difference (eg. A statistically significant 1mm Hg BP reduction is not clinically significant) • Also includes such factors as cost, SE.
Other terms: • Accuracy= how closely a measurement approaches the true value • Reliability= how consistent or reproducible a measurement is when performed by different observers under the same conditions or the same observer under different conditions • Validity= describes the accuracy and reliability of a test (ie. The extent to which a measurement approaches what it is designed to measure)
Appraising an article (JAMA): • 3 basic stages • 1) the validity – are the conclusions justified? • 2) the message – what are the results? • 3) the utility – can I generalise the findings to my patients?
Are the results valid? – (therapy article) • Primary guides • Was the assignment of patients to treatment randomized? • Were all patients who entered the trial properly accounted for and attributed at its conclusion? • Was follow-up complete? • Were patients analyzed in the groups to which they were randomized? Ie. Intention to treat analysis
Are the results valid? • Secondary guides: • Were patients, their clinicians, and study personnel “blind” to treatment? (avoids bias) • Were the groups similar at the start of the trial? (randomization not always effective if sample size small) • Aside from the experimental intervention, were the groups treated equally? (ie. Cointerventions)
What are the results? • How large was the treatment effect? • Relative risk reduction vs absolute risk reduction
Eg. • Baseline risk of death without therapy=20/100 = .20 = 20% (X = .20) • Risk with therapy reduced to 15/100 = .15 = 15% (Y = .15) • Absolute Risk Reduction = (X-Y) = .20-.15 = .05 (5%) • Relative Risk = (Y/X) = .15/.20 = .75 • Relative Risk Reduction = [1-(Y/X)] x 100% = [1-(.75)] x 100% = 25%
Number needed to treat = NNT • To calculate simply take the inverse of the absolute risk reduction • In last example= 1/.05 = 20 is the NNT
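The worked example can be reproduced in a few lines. This is a minimal sketch using the figures from the example (20/100 deaths without therapy, 15/100 with therapy); the function name is hypothetical.

```python
def treatment_effect(control_events, control_total, treatment_events, treatment_total):
    """Return (ARR, RR, RRR, NNT) from event counts in each group."""
    control_risk = control_events / control_total        # X
    treatment_risk = treatment_events / treatment_total  # Y
    arr = control_risk - treatment_risk                  # X - Y
    rr = treatment_risk / control_risk                   # Y / X
    rrr = 1 - rr                                         # 1 - (Y / X)
    nnt = 1 / arr                                        # inverse of the ARR
    return arr, rr, rrr, nnt


arr, rr, rrr, nnt = treatment_effect(20, 100, 15, 100)
print(f"ARR = {arr:.2f}, RR = {rr:.2f}, RRR = {rrr:.0%}, NNT = {nnt:.0f}")
```

This gives ARR = 0.05, RR = 0.75, RRR = 25% and NNT = 20, matching the hand calculation.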
What are the results? Cont. • How precise was the estimate of treatment effect? • Use confidence intervals (CI) = a range of values reflecting the statistical precision of an estimate (eg. A 95% CI is commonly read as having a 95% chance of including the true value) • CIs narrow as sample size increases eg. In the last example, with 100 patients in each group and 20 pts dying in the control group and 15 in the tx group, the 95% CI for the RRR was -38% to 59%. If 1000 patients were enrolled in each group with 200 dying in the controls and 150 in the tx group the 95% CI for the RRR is 9% to 41%.
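One way to see the effect of sample size on the CI is to compute it. The sketch below assumes the usual normal approximation on the log relative risk; the method behind the quoted intervals is not stated, so this is an illustration rather than a reproduction. The function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def rrr_confidence_interval(tx_events, tx_total, control_events, control_total, level=0.95):
    """Approximate CI for the relative risk reduction via the log relative risk."""
    rr = (tx_events / tx_total) / (control_events / control_total)
    # Standard error of log(RR) for two independent groups.
    se_log_rr = sqrt(1 / tx_events - 1 / tx_total + 1 / control_events - 1 / control_total)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    rr_low = exp(log(rr) - z * se_log_rr)
    rr_high = exp(log(rr) + z * se_log_rr)
    # RRR = 1 - RR, so the bounds swap.
    return 1 - rr_high, 1 - rr_low


small_trial = rrr_confidence_interval(15, 100, 20, 100)
large_trial = rrr_confidence_interval(150, 1000, 200, 1000)
print([round(bound * 100) for bound in small_trial])
print([round(bound * 100) for bound in large_trial])
```

With this approximation the 100-per-group trial gives about -38% to 59%. The 1000-per-group trial gives a much narrower interval with a lower bound near 9%; its upper bound can differ by a few points from a quoted figure depending on the method used.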
CI cont • If the CI for the treatment effect (eg. RRR) crosses 0, the result is not statistically significant and is generally unhelpful in making conclusions • When is the sample size big enough? • If the lower boundary of the CI is still clinically significant to you (in + studies) • (or if the upper CI boundary is not clinically significant in negative studies)