Diagnostic Testing
Ethan Cowan, MD, MS, Department of Emergency Medicine, Jacobi Medical Center; Department of Epidemiology and Population Health, Albert Einstein College of Medicine
The Provider Dilemma • A 26-year-old pregnant woman presents after twisting her ankle. She has no abdominal or urinary complaints. The nurse sends a UA and a Uricult dipslide before you see the patient. What should you do with the results of these tests?
The Provider Dilemma • Should a provider give antibiotics if either one or both of these tests come back positive?
Why Order a Diagnostic Test? • When the diagnosis is uncertain • Incorrect diagnosis leads to clinically significant morbidity or mortality • Diagnostic test result changes management • Test is cost effective
Clinician Thought Process • Clinician derives the patient's prior probability of disease from: • H & P • Literature • Experience • Expressed as an “index of suspicion”: • 0% - 100% • “Low, Med., High”
Threshold Approach to Diagnostic Testing • [Figure: probability-of-disease scale from 0% to 100%, with a testing zone between the thresholds P(-) and P(+)] • P < P(-): Dx testing & therapy not indicated • P(-) < P < P(+): Dx testing needed prior to therapy • P > P(+): only intervention needed Pauker and Kassirer, 1980, Gallagher, 1998
Threshold Approach to Diagnostic Testing • [Figure: probability-of-disease scale from 0% to 100%, with a testing zone between the thresholds P(-) and P(+)] • Width of testing zone depends on: • Test properties • Risk of excess morbidity/mortality attributable to the test • Risk/benefit ratio of available therapies for the Dx Pauker and Kassirer, 1980, Gallagher, 1998
Test Characteristics • Reliability: inter-observer, intra-observer, correlation, Bland & Altman plot, simple agreement, kappa statistics • Validity: sensitivity, specificity, NPV, PPV, ROC curves
Reliability • The extent to which results obtained with a test are reproducible.
Reliability • [Figure: two panels labeled “Not Reliable” and “Reliable”]
Intra-rater reliability • Extent to which a measure produces the same result at different times for the same subjects
Inter-rater reliability • Extent to which a measure produces the same result on each subject regardless of who makes the observation
Correlation (r) • For continuous data • r = 1: perfect correlation • r = 0: no correlation • [Figure: scatter plot of O1 against O2 with the line O1 = O2] Bland & Altman, 1986
Correlation (r) • Measures the strength of a relation, not agreement • Problem: even near-perfect correlation can coexist with significant differences between observations • [Figure: scatter plot of O1 against O2 with r = 0.8 and the line O1 = O2] Bland & Altman, 1986
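Bland & Altman Plot • For continuous data • Plot of the differences between observations (O1 - O2) against their means ([O1 + O2] / 2) • Data that are evenly distributed around 0 and fall within 2 standard deviations of the mean difference exhibit good agreement • [Figure: O1 - O2 on the vertical axis from -10 to 10, [O1 + O2] / 2 on the horizontal axis] Bland & Altman, 1986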
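The contrast between correlation and agreement can be made concrete with a few lines of Python. This is a sketch under stated assumptions: the paired readings are invented for illustration, the function names are hypothetical, and the limits use 2 standard deviations as on the slide.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman(obs1, obs2):
    """Return the points and summary lines of a Bland & Altman plot."""
    diffs = [a - b for a, b in zip(obs1, obs2)]        # vertical axis: O1 - O2
    means = [(a + b) / 2 for a, b in zip(obs1, obs2)]  # horizontal axis
    bias = mean(diffs)                                 # mean difference
    sd = stdev(diffs)                                  # sample SD of differences
    return {"means": means, "diffs": diffs, "bias": bias,
            "lower": bias - 2 * sd, "upper": bias + 2 * sd}


# Invented paired readings: observer 2 reads about 10 units higher throughout.
o1 = [100, 110, 120, 130, 140]
o2 = [109, 121, 129, 141, 150]

result = bland_altman(o1, o2)
print("r =", round(pearson_r(o1, o2), 3))
print("mean difference =", result["bias"])
print("limits =", result["lower"], "to", result["upper"])
```

Here r is close to 1 even though every difference is far from 0, which is the situation the correlation slide warns about.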
Simple Agreement • Extent to which two or more raters agree on the classifications of all subjects • % of concordance in the 2 x 2 table: (a + d) / N • Not ideal: subjects may fall on the main diagonal by chance
2 x 2 table of Rater 1 against Rater 2:
          -        +        total
-         a        b        a + b
+         c        d        c + d
total     a + c    b + d    N
Kappa • The proportion of the best possible improvement in agreement beyond chance obtained by the observers • K = (pa - p0) / (1 - p0) • pa = (a + d) / N (observed proportion of subjects along the main diagonal of the 2 x 2 rater table) • p0 = [(a + b)(a + c) + (c + d)(b + d)] / N^2 (proportion expected by chance)
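The two agreement measures can be computed directly from the four cell counts. The sketch below follows the slide's notation, with p0 as the chance-expected proportion; the function names and the example counts are assumptions for illustration.

```python
def simple_agreement(a, b, c, d):
    """Proportion of subjects on the main diagonal: (a + d) / N."""
    n = a + b + c + d
    return (a + d) / n


def cohen_kappa(a, b, c, d):
    """Kappa for a 2 x 2 rater table: K = (pa - p0) / (1 - p0)."""
    n = a + b + c + d
    pa = (a + d) / n                                       # observed agreement
    p0 = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2  # chance agreement
    if p0 == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (pa - p0) / (1 - p0)


# Invented counts: the raters agree on 85 of 100 subjects.
print(simple_agreement(40, 10, 5, 45))  # observed agreement
print(cohen_kappa(40, 10, 5, 45))       # agreement beyond chance
```

With these counts chance alone would produce 50% agreement, so kappa is noticeably lower than the raw 85% concordance.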
Interpreting Kappa Values • K = 1: perfect • K > 0.80: excellent • 0.60 < K < 0.80: good • 0.40 < K < 0.60: fair • 0 < K < 0.40: poor • K = 0: chance (pa = p0) • K < 0: less than chance
Weighted Kappa • Used when the two raters classify subjects into more than 2 ordered categories • Perfect agreement on the main diagonal is weighted more than partial agreement off of it
C x C table of Rater 1 against Rater 2:
          1      2      ...    C      total
1         n11    n12    ...    n1C    n1.
2         n21    n22    ...    n2C    n2.
...       ...    ...    ...    ...    ...
C         nC1    nC2    ...    nCC    nC.
total     n.1    n.2    ...    n.C    N
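The slide does not say which weighting scheme is meant, so the following sketch assumes linear disagreement weights, |i - j| / (C - 1), which is one common choice. The function name and the 3 x 3 counts are hypothetical.

```python
def weighted_kappa(table):
    """Linearly weighted kappa for a C x C table of counts (two raters).

    table[i][j] is the number of subjects placed in category i by one
    rater and in category j by the other. Linear weights are an assumption.
    """
    c = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(c)) for j in range(c)]

    observed = 0.0  # weighted observed disagreement (in counts)
    expected = 0.0  # weighted disagreement expected by chance (in counts)
    for i in range(c):
        for j in range(c):
            weight = abs(i - j) / (c - 1)  # 0 on the diagonal, 1 at the corners
            observed += weight * table[i][j]
            expected += weight * row_totals[i] * col_totals[j] / n
    if expected == 0:
        raise ValueError("weighted kappa is undefined for this table")
    return 1 - observed / expected


# Invented counts for three ordered categories.
ratings = [[20, 5, 0],
           [5, 20, 5],
           [0, 5, 20]]
print(weighted_kappa(ratings))
```

For a 2 x 2 table these weights reduce to 0 on the diagonal and 1 off it, so the result matches the unweighted kappa formula.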
Validity • The degree to which a test correctly diagnoses people as having or not having a condition • Internal Validity • External Validity
Validity • [Figure: two panels labeled “Valid, not reliable” and “Reliable and Valid”]
Internal Validity • Performance Characteristics • Sensitivity • Specificity • NPV • PPV • ROC Curves
2 x 2 Table • TP = True Positives • FP = False Positives • FN = False Negatives • TN = True Negatives
                     Disease Status
Test Result    cases    noncases    total
+              TP       FP          positives
-              FN       TN          negatives
total          cases    noncases    N
Gold Standard • Definitive test used to identify cases • Example: traditional agar culture • The dipstick and dipslide are measured against the gold standard
Sensitivity (SN) • Probability of correctly identifying a true case • SN = TP / (TP + FN) = TP / cases • With high SN, a negative test result rules out the Dx (SnNout) Sackett & Straus, 1998
Specificity (SP) • Probability of correctly identifying a true noncase • SP = TN / (TN + FP) = TN / noncases • With high SP, a positive test result rules in the Dx (SpPin) Sackett & Straus, 1998
Problems with Sensitivity and Specificity • Remain constant over patient populations • But SN and SP convey how likely a test result is to be positive or negative given that the patient does or does not have disease • Paradoxical inversion of clinical logic • Prior knowledge of disease status obviates the need for the diagnostic test Gallagher, 1998
Positive Predictive Value (PPV) • Probability that a patient labeled (+) is a true case • PPV = TP / (TP + FP) = TP / total positives • High SP corresponds to very high PPV (SpPin) Sackett & Straus, 1998
Negative Predictive Value (NPV) • Probability that a patient labeled (-) is a true noncase • NPV = TN / (TN + FN) = TN / total negatives • High SN corresponds to very high NPV (SnNout) Sackett & Straus, 1998
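The four definitions above all read off the same 2 x 2 table, which a short function makes explicit. The function name and the example counts are assumptions for illustration and are not data from the talk.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute SN, SP, PPV and NPV from the cells of a 2 x 2 table."""
    return {
        "SN": tp / (tp + fn),   # TP / cases
        "SP": tn / (tn + fp),   # TN / noncases
        "PPV": tp / (tp + fp),  # TP / total positives
        "NPV": tn / (tn + fn),  # TN / total negatives
    }


# Invented counts: 100 cases and 200 noncases.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name} = {value:.1%}")
```

SN and SP are computed down the columns (by disease status), while PPV and NPV are computed across the rows (by test result).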
Predictive Value Problems • Vulnerable to disease prevalence (P) shifts • Do not remain constant over patient populations • As P increases, PPV increases and NPV decreases • As P decreases, PPV decreases and NPV increases Gallagher, 1998
Flipping a Coin to Dx AMI for People with Chest Pain • ED AMI prevalence: 6% • SN = 3 / 6 = 50% • SP = 47 / 94 = 50% • PPV = 3 / 50 = 6% • NPV = 47 / 50 = 94% Worster, 2002
Flipping a Coin to Dx AMI for People with Chest Pain • CCU AMI prevalence: 90% • SN = 45 / 90 = 50% • SP = 5 / 10 = 50% • PPV = 45 / 50 = 90% • NPV = 5 / 50 = 10% Worster, 2002
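The two coin-flip slides can be reproduced by applying Bayes' theorem to a fixed SN and SP at different prevalences. This is a sketch; the function name is an assumption, and the inputs are the SN, SP, and prevalence values quoted on the slides.

```python
def predictive_values(sn, sp, prevalence):
    """Return (PPV, NPV) for a test with the given SN and SP at a prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sn * prevalence               # P(test +, disease)
    false_pos = (1 - sp) * (1 - prevalence)  # P(test +, no disease)
    true_neg = sp * (1 - prevalence)         # P(test -, no disease)
    false_neg = (1 - sn) * prevalence        # P(test -, disease)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# The coin flip (SN = SP = 50%) in the ED and in the CCU.
for setting, prev in (("ED", 0.06), ("CCU", 0.90)):
    ppv, npv = predictive_values(0.5, 0.5, prev)
    print(f"{setting}: PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

A coin carries no information, so its PPV simply equals the prevalence and its NPV equals one minus the prevalence; the apparent 94% NPV in the ED is entirely a prevalence effect.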
Receiver Operating Characteristic (ROC) Curve • Allows consideration of test performance across a range of threshold values • Well suited for continuous-variable Dx tests • [Figure: ROC plot of sensitivity (TPR) against 1 - specificity (FPR), both axes from 0.0 to 1.0]
Receiver Operating Characteristic (ROC) Curve • Avoids the “single cutoff trap” • [Figure: WBC count in sepsis, with regions labeled “Effect” and “No Effect”] Gallagher, 1998
Area Under the Curve (θ) • Measure of test accuracy • θ 0.5 - 0.7: no to low discriminatory power • θ 0.7 - 0.9: moderate discriminatory power • θ > 0.9: high discriminatory power • [Figure: ROC plot of sensitivity (TPR) against 1 - specificity (FPR), both axes from 0.0 to 1.0] Grzybowski, 1997
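An ROC curve and its area can be built by sweeping the cutoff over every observed test value. The sketch below assumes that higher values indicate disease and uses the trapezoidal rule for the area; the function names and the WBC values are invented for illustration.

```python
def roc_points(values, has_disease):
    """Return (FPR, TPR) pairs, calling a value positive when it is >= cutoff."""
    cases = sum(1 for d in has_disease if d)
    noncases = len(has_disease) - cases
    points = [(0.0, 0.0)]  # cutoff above every value: nothing is positive
    for cutoff in sorted(set(values), reverse=True):
        tp = sum(1 for v, d in zip(values, has_disease) if d and v >= cutoff)
        fp = sum(1 for v, d in zip(values, has_disease) if not d and v >= cutoff)
        points.append((fp / noncases, tp / cases))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points ordered by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Invented WBC counts (x 1000) and disease status (1 = case, 0 = noncase).
wbc = [6, 8, 9, 11, 12, 14, 15, 18]
disease = [0, 0, 1, 0, 1, 0, 1, 1]

curve = roc_points(wbc, disease)
print(curve)
print("AUC =", area_under_curve(curve))
```

Each point on the curve is one candidate cutoff, which is how the ROC approach avoids committing to a single threshold in advance.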
Problem with ROC Curves • Same “reverse logic” problem as SN and SP • Mainly used to describe Dx test performance
Appendicitis Example • Study design: prospective cohort • Gold standard: pathology report from appendectomy or CT finding (negatives) • Diagnostic test: total WBC • [Figure: flow diagram with nodes Physical Exam, OR, CT Scan, No Appy, and Appy, joined by +/- branches] Cardall, 2004
Appendicitis Example • SN 76% (65%-84%) • SP 52% (45%-60%) • PPV 42% (35%-51%) • NPV 82% (74%-89%) Cardall, 2004
Appendicitis Example • Patient WBC: 13,000 • Management: get CT with PO & IV contrast • [Figure: flow diagram with nodes Physical Exam, OR, CT Scan, No Appy, and Appy, joined by +/- branches] Cardall, 2004
Follow-Up • CT result: acute appendicitis • Patient taken to OR for appendectomy
But was the WBC necessary? • Answer given in the talk on likelihood ratios