Screening and Prognostic Tests

Screening and Prognostic Tests Thomas B. Newman, MD, MPH October 20, 2005

Overview • Questions from last time; administrative stuff • Screening tests • Introduction • Biases in observational studies • Biases in randomized trials • Conclusion – ecologic view • Prognostic tests • Differences from diagnostic tests and risk factors • Quantifying prediction: calibration and discrimination • Value of information • Common problems

TN Biases • “When your only tool is a hammer, you tend to see every problem as a nail.” • Biggest gains in longevity have been PUBLIC HEALTH interventions, not interventions aimed at individuals • Biggest threats are still public health threats • Interventions aimed at individuals are overemphasized

Cultural characteristics "We live in a wasteful, technology driven, individualistic and death-denying culture." --George Annas, New Engl J Med, 1995

What is screening? • Common definition: testing to detect asymptomatic disease • Better definition*: application of a test to detect a potential disease or condition in people with no known signs or symptoms of that disease or condition. • Disease vs condition • Asymptomatic vs no known signs or symptoms *Common screening tests. David M. Eddy, editor. Philadelphia, PA: American College of Physicians, 1991

Types of screening • Unrecognized symptomatic disease screening: what IS making the person sick. • Disease screening: what WILL make the person sick. • Risk factor screening: what MIGHT make the person sick.

Examples and overlap • Continuum related to both certainty and timing of symptoms • May vary with age • Unrecognized symptomatic disease: vision and hearing problems in young children; iron deficiency anemia, depression • Presymptomatic disease: neonatal hypothyroidism, syphilis, HIV • Risk factor: hypercholesterolemia, hypertension • Somewhere between: prostate cancer, breast carcinoma in situ, more severe hypertension

Disease vs. Risk factor screening. 1

Disease vs. Risk factor screening. 2

Disease vs. Risk factor screening. 3 *May be political as well as scientific decision

Possible harms from screening • To all tested • To those with negative results • To those with positive results • To those not tested • See course book

Forces behind excessive screening -1 • Companies selling machines to do the test • Companies selling the test itself • Companies selling products to treat the condition • Clinicians who treat the condition • Politicians who are (or want to appear) sympathetic

Forces behind excessive screening -2 • Disease research and advocacy groups • Academics who study the condition • Clinicians doing or interpreting the test • Managed care organizations • The public

E-mail excerpt -1 > PLEASE, PLEASE, PLEASE TELL ALL YOUR FEMALE FRIENDS AND RELATIVES TO INSIST ON A CA-125 BLOOD TEST EVERY YEAR AS PART OF THEIR ANNUAL PHYSICAL EXAMS. Be forewarned that their doctors might try to talk them out of it, saying, "IT ISN'T NECESSARY." > > …Insist on the CA-125 BLOOD TEST; DO NOT take "NO" for an answer!

Biases in Observational Studies of Screening Tests • Volunteer bias • Lead time bias • Length time bias • Stage migration bias • Pseudodisease

Volunteer Bias • People who volunteer for studies differ from those who do not • Examples • HIP Mammography study: women who volunteered for mammography had lower heart disease death rates • Coronary drug project: Men who took their medicine had about half the mortality of men who didn't, whether they were on drug or placebo

Lead time bias Source: Editorial: Finding and Redefining Disease. Effective Clinical Practice, March/April 1999. Available at: ACP-Online http://www.acponline.org/journals/ecp/marapr99/primer.htm, accessed 8/30/02

Length Bias (Different natural history bias) • Screening picks up prevalent disease • Prevalence = incidence x duration • Slowly growing tumors have greater duration in presymptomatic phase, therefore greater prevalence • Therefore, cases picked up by screening will be disproportionately those that are slow growing
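A minimal sketch of the arithmetic behind length bias. The incidence and duration figures are invented for illustration; the only relationship taken from the slide is prevalence = incidence x duration.

```python
# Hypothetical illustration of length bias; all numbers are invented.
def prevalent_cases(incidence_per_year, presymptomatic_years):
    """Steady-state prevalence = incidence x duration."""
    return incidence_per_year * presymptomatic_years

# Assume two tumor types with EQUAL incidence (cases per 100,000 per year)
# but different durations of the presymptomatic (screen-detectable) phase.
slow = prevalent_cases(incidence_per_year=50, presymptomatic_years=8)
fast = prevalent_cases(incidence_per_year=50, presymptomatic_years=2)

# Share of slow-growing tumors among all new cases vs. among the
# prevalent cases that a single screen can pick up.
share_slow_among_incident = 50 / (50 + 50)
share_slow_among_screen_detected = slow / (slow + fast)

print(share_slow_among_incident)         # 0.5
print(share_slow_among_screen_detected)  # 0.8
```

Under these assumptions half of all tumors are slow growing, but four out of five screen-detected tumors are, so screen-detected cases would look better even if early treatment did nothing.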

Length bias Source: Editorial: Finding and Redefining Disease. Effective Clinical Practice, March/April 1999. Available at: ACP-Online http://www.acponline.org/journals/ecp/marapr99/primer.htm

Length Bias Slower growing tumor with better prognosis ? Early detection Higher cure rate

Stage migration bias Old tests New tests

Stage migration bias • Also called the "Will Rogers Phenomenon" • "When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states." -- Will Rogers • Documented with colon cancer at Yale • Other examples abound – the more you look for disease, the higher the prevalence and the better the prognosis • More generally, be careful with stratified analyses Best reference on this topic: Black WC and Welch HG. Advances in diagnostic imaging and overestimation of disease prevalence and the benefits of therapy. NEJM 1993;328:1237-43.
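A small numeric sketch of the Will Rogers phenomenon. The eight survival figures are hypothetical, chosen only to show how reclassifying one patient can improve the average in both strata while the overall average stays the same.

```python
from statistics import mean

# Hypothetical 5-year survival (percent) for individual patients,
# staged with the OLD test. Numbers are invented for illustration.
early_stage = [90, 80, 70, 60]
late_stage = [40, 30, 20, 10]

before = (mean(early_stage), mean(late_stage), mean(early_stage + late_stage))

# Assume a more sensitive NEW test finds occult spread in the
# worst-prognosis "early" patient and reclassifies that patient as late.
migrated = 60
new_early = [s for s in early_stage if s != migrated]
new_late = late_stage + [migrated]

after = (mean(new_early), mean(new_late), mean(new_early + new_late))

# Early-stage mean rises from 75 to 80, late-stage mean rises from 25 to 32,
# yet the overall mean is 50 both before and after: nobody lived longer.
print(before)
print(after)
```

The same caution applies to any stratified analysis where the exposure or the test changes who lands in which stratum.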

A more general example of Stage Migration Bias • VLBW (< 1500 g), LBW (1500-2499g) and NBW (>= 2500g) fetuses exposed to Factor X all have decreased mortality compared with those not exposed • Is factor X good? • Maybe not! Factor X could be cigarette smoking! • Smoking moves babies to lower birthweight strata • Compared with other causes of LBW (i.e., prematurity) it is not as bad

Pseudodisease • A condition that looks just like the disease, but never would have bothered the patient • In an individual treated patient it is impossible to distinguish pseudodisease from successfully treated asymptomatic disease • Existence of pseudodisease can only be detected in groups of treated patients • Treating pseudodisease can only cause harm because (by definition) it is unnecessary

Example: Mayo Lung Project (MLP) • RCT of lung cancer screening • Enrollment 1971-76 • 9,211 male smokers • Two study arms • Intervention arm: chest x-ray and sputum cytology every 4 months for 6 years (75% compliance) • Usual care (control) arm: at trial entry only, a recommendation to receive same tests annually

MLP Extended Follow-up Results* • Intervention group: more cancers diagnosed at early, resectable stage • Better survival of those with lung cancer *Marcus et al., JNCI 2000;92:1308-16

MLP Extended Follow-up Results* • Slight increase in lung-cancer mortality (P=0.09 by 1996) *Marcus et al., JNCI 2000;92:1308-16

What happened? • Lead-time bias? • Length bias? • Volunteer bias? • Overdiagnosis (pseudodisease) Black, WC. Overdiagnosis: An unrecognized cause of confusion and harm in cancer screening. JNCI 2000;92:1280-1

NHLBI National Lung Screening Trial • 46,000 participants randomized in 2 years • Equal randomization • Three annual screens • Spiral CT versus chest x-ray!

Each year, 182,000 women are diagnosed with breast cancer and 43,300 die. One woman in eight either has or will develop breast cancer in her lifetime... If detected early, the five-year survival rate exceeds 95%. Mammograms are among the best early detection methods, yet 13 million women in the U.S. are 40 years old or older and have never had a mammogram. 39,800 Clicks per mammogram (Sept, ’04)

RCTs of screening tests, Example: Mammography • New York Times: Expert Panel Cites Doubts On Mammogram's Worth • Washington Post: Mammography Review Shatters the Status Quo; Doubts About Its Value Alarm Many

Is screening for breast cancer with mammography justifiable?* • Meta-analysis of randomized trials • Methodologic issues raised • Quality of randomization • Post-randomization exclusions • Choice of outcome variable: Breast cancer mortality vs. total mortality *Gotzsche P,Olsen O. Lancet 2000;355:1293

Poor Quality Randomization. Example: Edinburgh trial • Randomization by practice (N=87?), not by woman • 7 practices changed allocation status • Highest SES • 26% of women in control group • 53% of women in screening group • 26% reduction in cardiovascular mortality in mammography group

Example 2: Biased post-randomization exclusion for previous breast cancer • New York Trial • N=853 in screened group • N=336 in control group • Breast cancer mortality difference at 18 years: 44 deaths • Edinburgh trial • N=338 in screened group • N=177 in control group

Explanation for differences in NY Trial* • In screened group, women with previous breast cancer excluded at entry • In control group, women with previous breast cancer excluded only if they developed breast cancer • Thus, women with previous breast cancer who did NOT develop breast cancer were included in the denominator of the control group but not the mammography group • Therefore, bias against mammography * Fletcher SW, Gilmore JG. Mammography screening for breast cancer. NEJM 2003;348:1672-80. (Appendix 2)

Problems with breast cancer mortality as an endpoint • Assignment of cause of death is subjective • Unblinded in NY, Two-county trials • Treatment may have effects on other causes of death

Meta-analysis of radiotherapy for early breast cancer* • Meta-analysis of 40 RCTs • Central review of individual-level data; N = 20,000 • Breast cancer mortality reduced (20-yr ARR 4.8%; P = .0001) • Mortality from other causes increased (20-yr ARR -4.3%; P = 0.003) *Early Breast Cancer Trialists Collaborative Group. Lancet 2000;355:1757
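A back-of-the-envelope sketch of why the choice of endpoint matters here. The two ARR values are the ones on the slide; adding them to get a net effect is a simplifying assumption (it treats both as measured on the same denominator), not a figure reported on the slide.

```python
# 20-year absolute risk reductions (ARR) as quoted on the slide.
arr_breast_cancer = 0.048   # breast cancer mortality reduced by 4.8%
arr_other_causes = -0.043   # mortality from other causes increased by 4.3%

# ASSUMPTION: the two ARRs can simply be added to approximate the
# effect on total mortality.
net_arr = arr_breast_cancer + arr_other_causes

def number_needed_to_treat(arr):
    """NNT = 1 / ARR (only meaningful for a positive ARR)."""
    return 1 / arr

print(round(number_needed_to_treat(arr_breast_cancer)))  # about 21
print(round(net_arr, 3))                                  # 0.005
print(round(number_needed_to_treat(net_arr)))             # about 200
```

Judged on breast cancer mortality alone the treatment looks roughly ten times more effective than it does on this rough total-mortality estimate.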

Mastectomies Radiotherapy

13-year total mortality, > 50 y.o. Breast cancer deaths, 7 yr

NCI Position* • “NCI recommends mammography for women starting in their 40s” -- Dr. Peter Greenwald, NCI director of cancer prevention • "Everyone agrees that mammography detects breast cancer when it's smaller, when it's earlier. There's no debate about that," Greenwald added. "And everybody agrees mammography detects more cancers. • "The debate is whether that has an impact on mortality later on. It is the only real method that we have, other than clinical exam, that's useful as screening for early detection in healthy women." *Washington Post, January 24, 2002

Cancer mortality vs Total mortality in RCTs

TN Conclusions on Screening • Screening decisions are heavily influenced by politics, economics, emotion and wishful thinking • Most screening occurs without informed consent • High quality RCTs are needed • Low power to discern effect on total mortality • Big debate about efficacy. But even if proponents are right, much screening is not cost-effective and its disadvantages are consistently downplayed
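A rough sketch of why trials have low power for total mortality. It uses the standard normal-approximation sample size formula for comparing two proportions; the mortality risks are hypothetical and chosen so that the absolute difference is the same for both endpoints.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p_control, p_screened, alpha=0.05, power=0.80):
    """Approximate sample size per arm for comparing two proportions
    (normal approximation, two-sided alpha, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_screened * (1 - p_screened)
    return ceil((z_alpha + z_power) ** 2 * variance
                / (p_control - p_screened) ** 2)

# HYPOTHETICAL risks over the trial period, invented for illustration:
# screening prevents 1 death per 1,000 women, all from the target cancer.
n_disease_specific = n_per_arm(0.004, 0.003)   # cancer mortality 0.4% vs 0.3%
n_total_mortality = n_per_arm(0.100, 0.099)    # total mortality 10.0% vs 9.9%

print(n_disease_specific, n_total_mortality)
```

With these assumed risks the total-mortality comparison needs well over twenty times as many participants, because the same absolute difference is buried in far more deaths from unrelated causes.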

Cost per QALY • Mammography, age 40-50: $105,000* • Mammography, age 50-69: $21,400* • Smoking cessation counseling: $2000** • HIV prevention in Africa: $1-20*** *Salzman P et al. Ann Int Med 1997;127:955-65 (Based on optimistic assumptions about mammography.) **Cromwell J et al. JAMA 1997;278:1759-66 ***Marseille E et al. Lancet 2002; 359: 1851-56
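To make the comparison concrete, the sketch below converts the cost-per-QALY figures on the slide into QALYs bought by a fixed budget. The $1,000,000 budget is an arbitrary assumption, and the HIV prevention entry uses the upper end ($20) of the quoted $1-20 range.

```python
# Cost per QALY in US dollars, as quoted on the slide.
cost_per_qaly = {
    "Mammography, age 40-50": 105_000,
    "Mammography, age 50-69": 21_400,
    "Smoking cessation counseling": 2_000,
    "HIV prevention in Africa (upper end of $1-20)": 20,
}

budget = 1_000_000  # hypothetical budget, chosen only for illustration

# QALYs gained if the whole budget went to a single intervention.
qalys_gained = {name: budget / cost for name, cost in cost_per_qaly.items()}

for name, qalys in qalys_gained.items():
    print(f"{name}: {qalys:,.1f} QALYs")
```

On these figures the same budget buys about 10 QALYs through mammography at age 40-50, 500 through smoking cessation counseling, and at least 50,000 through HIV prevention in Africa.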

Return to George Annas* • Need to begin to think differently about health. Two dysfunctional metaphors: • Military metaphor – battle disease, no cost too high for victory, no room for uncertainty • Market metaphor -- medicine as a business; health care as a product; success measured economically *Annas G. Reframing the debate on health care reform by replacing our metaphors. NEJM 1995;332:744-7

Ecology metaphor • Sustainability • Limited resources • Interconnectedness • More critical of technology • Move away from domination, buying, selling, exploiting • Focus on the big picture • Populations rather than individuals • Causes rather than symptoms

Assessment of Prognostic Tests • Difference from diagnostic tests and risk factors • Quantifying accuracy • Value of prognostic information • Common problems

Potential confusion: “cross-sectional” means 2 things • Cross-sectional sampling means sampling does not depend on either the predictor variable or the outcome variable (e.g., as opposed to case-control sampling) • Cross-sectional time dimension means that predictor and outcome are measured at the same time; this is the opposite of longitudinal

Difference from Diagnostic Tests • Longitudinal rather than cross-sectional time dimension • Incidence rather than prevalence • Sensitivity, specificity, prior probability confusing • Time to an event may be important • Harder to quantify accuracy in individuals • Exceptions: short time course, continuous outcomes

Difference from Risk Factors • Causality not important • Absolute risk very important • Sampling scheme makes a much bigger difference because absolute risks are less generalizable than relative risks • Can be informative even if no bad outcomes!

Quantifying Prediction 1: Calibration • How accurate are the predicted probabilities? • Assemble a group • Compare actual and predicted probabilities • Calibration is important for decision making and giving information to patients • Like absolute risk in this way – less generalizable
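One way to "assemble a group and compare actual and predicted probabilities" is a simple calibration table. The sketch below is an illustration under assumptions: the function name `calibration_table`, the equal-width risk bins, and the ten-patient data set are all hypothetical.

```python
from collections import defaultdict

def calibration_table(predicted, observed, n_bins=5):
    """Group subjects by predicted risk and compare the mean predicted
    risk with the observed event rate in each group.

    Returns rows of (bin index, n, mean predicted risk, observed rate).
    """
    bins = defaultdict(list)
    for p, y in zip(predicted, observed):
        # Equal-width bins on [0, 1]; a prediction of exactly 1.0
        # goes into the top bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))

    rows = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_predicted = sum(p for p, _ in pairs) / len(pairs)
        observed_rate = sum(y for _, y in pairs) / len(pairs)
        rows.append((index, len(pairs), mean_predicted, observed_rate))
    return rows

# Hypothetical predicted risks and outcomes (1 = event occurred).
predicted = [0.05, 0.10, 0.15, 0.30, 0.35, 0.62, 0.65, 0.85, 0.90, 0.95]
observed = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

rows = calibration_table(predicted, observed)
for index, n, mean_predicted, observed_rate in rows:
    print(f"bin {index}: n={n}, predicted={mean_predicted:.2f}, "
          f"observed={observed_rate:.2f}")
```

A well-calibrated test has observed rates close to the mean predicted risk in every group; with real data each group would need far more than two or three subjects for the comparison to mean anything.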
