Critical appraisal

Critical Appraisal. Dr. Chris Hall – Facilitator Dr. Dave Dyck R3 March 20/2003. Objectives:. Review study design and the advantages/ disadvantages of each Review key concepts in hypothesis, measurement, and analysis Article appraisal Treatment articles Diagnosis articles Harm articles

Critical appraisal

Critical Appraisal

Dr. Chris Hall – Facilitator

Dr. Dave Dyck R3

March 20/2003

Objectives:


  • Review study design and the advantages/ disadvantages of each

  • Review key concepts in hypothesis, measurement, and analysis

  • Article appraisal

    • Treatment articles

    • Diagnosis articles

    • Harm articles

    • Overviews/meta-analysis

  • Survive the next hour and still be able to smile

Study design
Study Design:

  • Ecological studies

  • Case Reports

  • Case Series

  • Cross-Sectional Studies

  • Case Control and Retrospective Cohort Studies

  • Prospective Cohort Studies

  • Randomized Controlled Trials

Ecological studies
Ecological Studies:

  • Studies of a group rather than individual subjects

  • Supplies data on exposure and disease as a summary measure for the whole population (ie. in aggregate), eg. incidence studies

  • Ecological fallacy: the correlation between variables at the group level is not necessarily the same as at the individual level. Therefore you cannot link exposures to disease on an individual basis

  • Also, difficult to account for confounding variables

Case reports
Case Reports

  • Submission of individual cases with rare or interesting findings

  • Highly subject to bias (selection / submission and publication)

  • Should not infer causality or suggest practice change

Case series
Case Series:

  • A group of “consecutive cases” with unifying features

  • Selection bias = what constitutes a case? Is the series truly consecutive? Is there response bias?

  • Publication bias

  • Measurement bias (presence of ‘disease’ or exposure may be variable)

Cross sectional studies
Cross Sectional Studies:

  • Ie. Prevalence study

  • Presence or absence of a specific disease compared with one or several variables within a defined population at a specific point in time

Cross sectional studies disadvantages
Cross Sectional Studies disadvantages:

  • Subject to selection bias (see HO)

  • Cause and effect cannot be determined (see HO) (ie. it is not known whether the exposure preceded the outcome or the outcome preceded the exposure)

  • Temporal trends may be missed (seasonal variations)

  • Previous deaths, drop-outs, and migration are not counted; and short lived, transient outcomes are underrepresented. Thus, CSS are best suited to study chronic, non-fatal conditions.

Cross sectional studies advantages
Cross sectional studies – advantages:

  • Can do quickly

  • May provide enough of an association between an exposure/outcome to generate a hypothesis which can be studied by another method.

  • Useful for descriptive/analytical studies

Case control studies
Case Control Studies:

  • Starts now and goes back in time

  • Start with the outcome and ask or find out about prior exposure

  • Specific hypothesis usually tested

  • Select all cases of a specific disease during a certain time and select a number of controls who represent the general population → then determine exposure to the factor in each → odds ratio

  • May match controls to patients (but can never be sure of similar baseline states)

Ccs cont
CCS cont

  • Odds ratio provides an estimate of the relative risk (esp when disease is rare)

  • Thus, interpret the OR as a relative risk only when the disease is rare (< 10% of population)

  • As OR increases (>1) → greater risk

  • As OR decreases (<1) → reduced risk
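The odds ratio described above can be sketched in a few lines. This is a minimal illustration, assuming the standard definition (odds of exposure among cases divided by odds of exposure among controls); the function name and the counts are hypothetical and do not come from the presentation.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Illustrative counts only: 30 of 100 cases exposed, 15 of 100 controls exposed.
example = odds_ratio(30, 70, 15, 85)
print(f"OR = {example:.2f}")  # OR > 1 suggests greater risk with exposure
```

An OR of 1 means exposure is equally common in cases and controls.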

Ccs advantages
CCS advantages:

  • Small # needed (good for rare diseases or when outcomes are rare or delayed)

  • Quick

  • Inexpensive

  • Can study many factors

Ccs disadvantages
CCS disadvantages:

  • Problems selecting/matching controls

  • Only an estimate of relative risk

  • No incidence rates

  • Biases (eg. unequal ascertainment of exposure between cases and controls)

    • Recall bias = cases are more likely to remember exposure than controls

    • Selection bias = avoided by selecting cases and controls according to predetermined, strict, objective criteria

Cohort study prospective
Cohort Study (prospective)

  • Start with 2 groups free of disease and follow forward for a period of time

  • 1 group has the factor (eg. Smoking) the other group does not

  • Define 1 or more outcomes (eg. Lung CA)

  • Tabulate the # of persons who develop the outcome

  • Provides estimates of incidence, relative risk, and attributable risk

Relative risk attributable risk
Relative risk / Attributable risk

  • Relative risk = measures the strength of association between exposure and disease

  • Attributable risk = measures the number of cases of disease that can be attributed to exposure

  • Given a constant relative risk, attributable risk rises with incidence of the disease in members of the population who are not exposed
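A short sketch of these two measures, assuming attributable risk is taken as the risk difference (incidence in the exposed minus incidence in the unexposed). The function name and counts are hypothetical. The two calls hold the relative risk at 3 and show attributable risk rising with the incidence in the unexposed.

```python
def cohort_risks(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Return relative risk and attributable risk (risk difference) for a cohort."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "attributable_risk": risk_exposed - risk_unexposed,
    }


# Illustrative counts only: same relative risk, different baseline incidence.
rare = cohort_risks(30, 1000, 10, 1000)      # unexposed incidence 1%
common = cohort_risks(300, 1000, 100, 1000)  # unexposed incidence 10%
print(rare)
print(common)
```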

Cohort study
Cohort Study

  • Cannot by itself establish causation, but can show an association between a factor and an outcome

  • Generally provides stronger evidence for causation than case control studies

Cohort study advantages
Cohort Study advantages:

  • Lack of bias in measuring the factor (exposure is recorded before the outcome is known)

  • Uncovers natural history

  • Can study many diseases

  • Yields incidence rates, relative, and attributable risk

  • Allows for more control of confounding variables

Cohort study disadvantages
Cohort Study Disadvantages:

  • Possible bias in ascertainment of disease.

  • Need large numbers and long follow-up

  • Easy to lose patients in follow-up (attrition of subjects). This may introduce bias if lost subjects are different from those who continue to be followed

  • Hard to maintain comparable follow-up for all levels of exposure

Cohort study disadvantages cont
Cohort Study disadvantages cont.

  • Expensive

  • Locked into the factor(s) measured

  • Measurement bias (eg. Unblinded physician who looks harder for + outcomes in the exposed pt)

  • Confounding variables still present

Randomized control trials
Randomized Control Trials:

  • To test the hypothesis that an intervention (treatment or manipulation) makes a difference.

  • An experimental group is manipulated while a control group receives a placebo or standard procedure

  • All other conditions are kept the same between the groups

Critical appraisal

  • Goals of RCTs:

    • Prevention (to decrease risk of disease or death)

    • Therapeutic (decrease symptoms, prevent recurrences, decrease mortality)

    • Diagnostic (evaluate new diagnostic procedures)

Rct problems
RCT problems:

  • Ethical issues

  • Difficult to test an intervention that is already widely used

  • Randomization

  • Blinding techniques (may be difficult due to common side effects of drugs)

  • Control group (placebo, conventional tx, specific tx)

  • Subject selection and issues of generalizability

  • Are refusers different in some way

Key terms for diagnostic tests
Key Terms for diagnostic tests:

  • Sensitivity= proportion of those with the disease who have a positive test

  • Specificity= proportion of those without the disease who have a negative test

Critical appraisal

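With a = true positives, b = false positives, c = false negatives, d = true negatives:

Sensitivity = a/(a+c)

Specificity = d/(b+d)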


Other key terms
Other key terms:

  • Positive Predictive Value= the probability of having the disease given a positive test: a/(a+b)

  • Negative Predictive Value= the probability of not having the disease given a negative test: d/(c+d)
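The four measures can be computed together from a 2x2 table. This sketch assumes the usual layout implied by the formulas above (a = true positives, b = false positives, c = false negatives, d = true negatives); the function name and the counts are hypothetical.

```python
def diagnostic_metrics(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),  # positives among the diseased
        "specificity": d / (b + d),  # negatives among the non-diseased
        "ppv": a / (a + b),          # diseased among positive tests
        "npv": d / (c + d),          # non-diseased among negative tests
    }


# Illustrative counts only.
for name, value in diagnostic_metrics(a=90, b=30, c=10, d=170).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test, whereas the predictive values also depend on how common the disease is in the group tested.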

Statistical hypothesis
Statistical Hypothesis:

  • Null Hypothesis

    • Hypothesis of no difference between a test group and a control group (ie. There is no association between the disease and the risk factor in the population)

  • Alternative Hypothesis

    • Hypothesis that there is some difference between a test group and control group

Measurements and analysis
Measurements and Analysis:

  • Sampling bias = selecting a sample that does not truly represent the population

  • Sample size = contributes to the credibility of “positive” studies and the power of “negative” studies. For a given alpha, increasing the sample size decreases the probability of making a type II error (ie. increases power).


  • Type I Error (alpha error) = rejecting a null hypothesis that is actually true (ie. Declaring an effect to be present when it is not)

    The acceptable probability of this error is set in advance as alpha (the significance level). The p value is the probability of seeing a difference at least as large as the one observed if the null hypothesis were true.

Errors cont
Errors cont.

  • Type II Error (Beta Error) = accepting a null hypothesis as true when it is actually false (ie. Declaring a difference/effect to be absent when it is present)

    • Beta is the probability of missing a difference that truly exists

    • Power (1-Beta) is the probability of detecting a difference that truly exists


  • Statistical Significance: determination by a statistical test that there is evidence against the null hypothesis.

  • The level of significance depends on the value chosen for alpha error

  • Usually alpha = .05 and beta = .20 (ie. power of 80%; studies rarely aim for power >80%)
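To see how alpha, beta, and sample size interact, here is a rough sketch using the common normal-approximation formula for comparing two proportions. The formula choice, the function name, and the event rates (20% vs 15%) are assumptions for illustration; a statistician may prefer a different method.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p_control vs p_treatment."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Illustrative event rates only.
print(sample_size_per_group(0.20, 0.15))              # alpha .05, power 80%
print(sample_size_per_group(0.20, 0.15, power=0.90))  # higher power needs more patients
```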

Significance cont
Significance cont.

  • Clinical Significance: statistical significance is necessary but not sufficient for clinical significance, which reflects the meaningfulness of the difference (eg. A statistically significant 1 mm Hg BP reduction is not clinically significant)

  • Also includes such factors as cost and side effects.

Other terms
Other terms:

  • Accuracy= how closely a measurement approaches the true value

  • Reliability= how consistent or reproducible a measurement is when performed by different observers under the same conditions or the same observer under different conditions

  • Validity= describes the accuracy and reliability of a test (ie. The extent to which a measurement approaches what it is designed to measure)

Appraising an article jama
Appraising an article (JAMA):

  • 3 basic stages

    • 1) the validity – are the conclusions justified?

    • 2) the message – what are the results?

    • 3) the utility – can I generalise the findings to my patients?

Are the results valid therapy article
Are the results valid? – (therapy article)

  • Primary guides

    • Was the assignment of patients to treatment randomized?

    • Were all patients who entered the trial properly accounted for and attributed at its conclusion?

    • Was follow-up complete?

    • Were patients analyzed in the groups to which they were randomized? Ie. Intention to treat analysis

Are the results valid
Are the results valid?

  • Secondary guides:

    • Were patients, their clinicians, and study personnel “blind” to treatment? (avoids bias)

    • Were the groups similar at the start of the trial? (randomization not always effective if sample size small)

    • Aside from the experimental intervention, were the groups treated equally? (ie. Cointerventions)

What are the results
What are the results?

  • How large was the treatment effect?

    • Relative risk reduction vs absolute risk reduction

Critical appraisal

  • Baseline risk of death without therapy=20/100 = .20 = 20% (X = .20)

  • Risk with therapy reduced to 15/100 = .15 = 15% (Y = .15)

  • Absolute Risk Reduction = (X-Y) = .20-.15 = .05 (5%)

  • Relative Risk = (Y/X) = .15/.20 = .75

  • Relative Risk Reduction = [1-(Y/X)] x 100% = [1-(.75)] x 100% = 25%

Number needed to treat nnt
Number needed to treat = NNT

  • To calculate simply take the inverse of the absolute risk reduction

  • In last example= 1/.05 = 20 is the NNT
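The worked example (baseline risk .20, risk with therapy .15) can be checked with a short sketch. The function name is hypothetical; the formulas are the ones given on the preceding slides.

```python
def treatment_effect(control_risk, treated_risk):
    """Return ARR, RR, RRR and NNT from the two event rates."""
    arr = control_risk - treated_risk  # absolute risk reduction (X - Y)
    rr = treated_risk / control_risk   # relative risk (Y / X)
    return {"arr": arr, "rr": rr, "rrr": 1 - rr, "nnt": 1 / arr}


result = treatment_effect(control_risk=0.20, treated_risk=0.15)
print({name: round(value, 2) for name, value in result.items()})
```

Because NNT is the inverse of the ARR, the same RRR gives a much larger NNT when the baseline risk is low.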

What are the results cont
What are the results? Cont.

  • How precise was the estimate of treatment effect?

    • Use confidence intervals (CI) = a range of values reflecting the statistical precision of an estimate (eg. A 95% CI has a 95% chance of including the true value)

    • CIs narrow as sample size increases, eg. in the last example, with 100 patients in each group (20 dying in the control group and 15 in the tx group), the 95% CI for the RRR was -38% to 59%. If 1000 patients were enrolled in each group, with 200 dying in the controls and 150 in the tx group, the 95% CI for the RRR is 9% to 41%.
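The narrowing of the CI with sample size can be illustrated with a sketch. It assumes the log relative-risk method for the interval, which may not be the method behind the figures quoted above, so small differences in the bounds are to be expected. The function name is hypothetical.

```python
from math import exp, log, sqrt


def rrr_confidence_interval(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Return (lower, point estimate, upper) for the RRR using the log-RR method."""
    rr = (events_tx / n_tx) / (events_ctrl / n_ctrl)
    se_log_rr = sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl)
    rr_low = exp(log(rr) - z * se_log_rr)
    rr_high = exp(log(rr) + z * se_log_rr)
    # RRR = 1 - RR, so the upper RR bound gives the lower RRR bound.
    return 1 - rr_high, 1 - rr, 1 - rr_low


small = rrr_confidence_interval(15, 100, 20, 100)
large = rrr_confidence_interval(150, 1000, 200, 1000)
print("100 per group: ", [f"{value:.0%}" for value in small])
print("1000 per group:", [f"{value:.0%}" for value in large])
```

With 100 per group the interval includes 0. With 1000 per group the point estimate is the same, but the interval is much narrower and excludes 0.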

Ci cont
CI cont

  • If the CI crosses 0 (ie. includes no effect), the results are generally unhelpful in making conclusions

  • When is the sample size big enough?

    • If the lower boundary of the CI is still clinically significant to you (in + studies)

    • (or if the upper CI boundary is not clinically significant in negative studies)

What if no ci reported
What if no CI reported?

  • 1) use the p value = as the p value decreases below .05, the lower bound of the 95% confidence limit for the RRR rises above 0

  • 2) If the standard error (SE) of the RRR is presented it is easy to calculate the 95% CI as point estimate (RRR) +/- 2xSE

  • 3) Calculate CI yourself or with a statistician

Will the results help me in caring for my patients
Will the results help me in caring for my patients?

  • Can the results be applied to my patient population?

  • Were all clinically important outcomes considered? Ie. Mortality, morbidity, quality of life endpoints

  • Are the likely treatment benefits worth the potential harm and costs? Ie. What is the patient’s baseline risk if left untreated. (NNT is helpful here)

Are the results valid1
Are the results valid?

  • Primary guides:

    • Was there an independent, blind comparison with a reference standard? (ie. Gold standard)

    • Did the patient sample include an appropriate spectrum of patients to whom the diagnostic test will be applied in clinical practice?

Are the results valid2
Are the results valid?

  • Secondary guides

    • Did the results of the test being evaluated influence the decision to perform the reference standard? Ie. verification bias, eg. in PIOPED, only 69% of patients with normal, near normal, or low probability V/Q scans went on to pulmonary angiogram, whereas 92% of those with more positive V/Q scans did

    • Were the methods for performing the test described in sufficient detail to permit replication?

What are the results1
What are the results?

  • Are likelihood ratios for the test results presented or data necessary for their calculation included?

  • Likelihood ratio = how much more likely a given test result is in patients with the disease than in patients without the disease (eg. for a + test, sensitivity/(1-specificity))

Likelihood ratios
Likelihood Ratios:

  • LR >10 or <.1 generate large and often conclusive changes from pretest to posttest probability

  • LR of 5-10 and .1-.2 generate moderate shifts from pretest to posttest probability

  • LR of 2-5 and .2-.5 generate small (but sometimes important) changes in probability

  • LR of 1-2 and .5-1 alter probability to a small (and rarely important) degree

Bayesian analysis
Bayesian analysis

  • Makes use of LR to change pretest probabilities to posttest probabilities (can use Fagan’s nomogram)
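The calculation that the nomogram performs graphically can be sketched as follows: convert the pretest probability to odds, multiply by the LR, and convert back. The function names and the example values are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR for a positive test, LR for a negative test)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, and convert back."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Illustrative values only: pretest probability of 30%.
print(f"{posttest_probability(0.30, 10):.2f}")   # after a result with LR 10
print(f"{posttest_probability(0.30, 0.1):.2f}")  # after a result with LR 0.1
```

An LR of 1 leaves the pretest probability unchanged.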

Will the results help me in caring for my patients1
Will the results help me in caring for my patients?

  • Will the reproducibility of the test result and its interpretation be satisfactory in my setting?

  • Are the results applicable to my patient?

  • Will the results change my management?

  • Will patients be better off as a result of the test?

Articles about harm
Articles about Harm?

  • 1st – what is the study design (RCT, cohort, case control, case series, etc)

    • Most important is that there is an appropriate control population

Are the results valid3
Are the results valid?

  • Were the exposures and outcomes measured in the same way in the groups being compared? (minimize recall/interviewer bias)

  • Was follow-up sufficiently long and complete?

  • Is the temporal relationship correct?

  • Is there a dose response gradient?

What are the results2
What are the results?

  • How strong is the association between exposure and outcome? Ie. Relative risk (>1 = increase in risk associated with exposure; <1 = decrease in risk associated with exposure)

  • How precise is the estimate of risk? Ie. CI

What are the implications for my practice
What are the implications for my practice?

  • Are the results applicable to my practice?

  • What is the magnitude of the risk?

  • Should I attempt to stop the exposure?

Overviews systematic reviews and meta analysis
Overviews, Systematic Reviews, and Meta-analysis

  • Did the overview address a focussed clinical question?

  • Were the criteria used to select articles for inclusion appropriate? - these should be revealed in the paper

  • Is it unlikely that important, relevant studies were missed? (avoids publication bias = a higher likelihood for studies with positive results to be published)

  • Was the validity of the included studies appraised? (peer review does not guarantee the validity of published research)

Critical appraisal

  • Were assessments of studies reproducible? (better if there are more reviewers who are deciding which articles to include)

  • Were the results similar from study to study? (can use “tests of homogeneity” statistical analysis)


  • What are the overall results of the overview? (are studies weighted according to their size?)

  • There should be a summary measure which clearly conveys the practical importance of the result – eg. RRR, LR, NNT etc.

  • How precise were the results? CI still very helpful
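One common way to weight studies and test homogeneity is fixed-effect inverse-variance pooling with Cochran's Q. This sketch assumes that approach, which is not necessarily the method used in any given overview. The function name and the two study inputs (log relative risks with standard errors) are hypothetical.

```python
from math import sqrt


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance pooled estimate, its standard error, and Cochran's Q."""
    weights = [1 / se ** 2 for se in standard_errors]  # precise studies weigh more
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    # Q grows when individual studies disagree with the pooled estimate.
    q = sum(w * (est - pooled) ** 2 for w, est in zip(weights, estimates))
    return pooled, pooled_se, q


# Illustrative inputs only: log relative risks from two hypothetical trials.
pooled, pooled_se, q = pool_fixed_effect([-0.2, -0.4], [0.1, 0.2])
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled = {pooled:.2f}, 95% CI {low:.2f} to {high:.2f}, Q = {q:.2f}")
```

The study with the smaller standard error (usually the larger study) pulls the pooled estimate towards its own result.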

Will the results help me in caring for my patients2
Will the results help me in caring for my patients?

  • Can the results be applied to my patient care? (subgroup analysis should be critiqued closely)

  • Were all clinically important outcomes considered? ( a clinical decision will require considering all outcomes both good and bad)

  • Are the benefits worth the harms and costs?