ESTIMATING HOSPITAL QUALITY OF CARE: ISSUES & APPLICATIONS Sharon-Lise T. Normand Health Care Policy, Harvard Medical School & Biostatistics, Harvard School of Public Health Quantify and compare the attributes of health care delivery and subsequent outcomes of health care
Acknowledgements • Mass-DAC (www.massdac.org) • Matt Cioffi, Jennifer Grandfield, Ann Lovett, Treacy Silverstein, Robert Wolf, Katya Zelevinsky • David Shahian, Massachusetts General Hospital & Harvard Medical School • Barbara J. McNeil, Brigham and Women’s Hospital • Armando Teixeira-Pinto (University of Porto, Portugal). • Paul Dreyer, Massachusetts Department of Public Health, Division of Health Care Quality
Context: Annual growth in US health spending exceeds GDP growth by 2.7%. December 12, 2006 NY Times: “Medicare links doctors’ pay to practices” “After years of trying to rein in the runaway cost of the Medicare program, Congress has decided to use a carrot instead of a stick to change doctors’ behavior.” • Doctors’ payment rates frozen • Can qualify for a 1.5% bonus in 2nd half of 2007 using measures specified by the government
Context November 21, 2007: CMS proposes to implement a Medicare hospital value-based program based on: • Benchmark to define high level of performance. • Attainment threshold, range, and score. • Each hospital will have a score comprised of the clinical process measures, outcomes, and surveys.
Context • April 14, 2008: “CMS will no longer reimburse hospitals for 9 conditions” (surgical site infection, collapsed lungs, etc.). • Estimated savings of $50 million per year. • Hospitals need to report on a total of 73 quality measures or face a loss of 2 percentage points in any payment increase.
Why Cardiac Surgery? • Total cost of cardiovascular diseases and stroke in 2002 estimated $329 billion (twice that of all cancers). • CABG surgery accounts for about 60% of all cardiac surgeries. • Cost of a CABG operation is $16,000; hospital reimbursement is $26,000 - $40,000. • Between 1979 and 1999, number of cardiovascular operations and procedures increased by 413%. • 30-day mortality = “operative” mortality common endpoint.
30-DAY MORTALITY IN 13 HOSPITALS FOLLOWING ISOLATED CABG SURGERY, MASSACHUSETTS, USA (2002)
Statistical Concerns • No separation of within and between hospital variance. • No randomization of patients to hospitals. • What is the comparison group? • Inaccuracy of estimates from small hospitals. • Regression to the mean. • Non-independence of observations within a hospital.
Approach: Fully Bayesian Hierarchical Model. $\mu \sim$ Prior and $\tau^2 \sim$ Prior (a priori independent), where $\tau^2$ is the between-hospital variance. Problem: It is almost always true that $\tau^2 > 0$. How large is meaningful? Solution: Don’t test; rather, estimate.
Interpreting Between-Hospital Standard Deviation. Interpretation: The odds of dying in a high-mortality hospital are seven times the odds of dying in a low-mortality hospital.
Priors for Between-Hospital Precision or Standard Deviation (Mass-DAC) • $E(\tau^{-2}) = 1$ • $E(\tau) = 0.75$; range in odds from 1 to 358 • Mode = 0 (no differences); median = 0.39 (range in odds 4.6) • $E(\tau) = 0.21$
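The odds figures quoted on these slides appear consistent with comparing a hospital 1.96 standard deviations above the mean with one 1.96 standard deviations below it on the log-odds scale, that is, an odds ratio of exp(2 × 1.96 × SD). That formula is inferred from the quoted numbers rather than stated on the slides, and `odds_range` is a hypothetical helper name. A minimal sketch under that assumption:

```python
import math


def odds_range(tau: float, z: float = 1.96) -> float:
    """Odds ratio between hospitals z SDs above and z SDs below the mean.

    tau is the between-hospital standard deviation on the log-odds scale.
    Assumption: "high" and "low" mortality hospitals sit at +/- z SDs.
    """
    return math.exp(2.0 * z * tau)


# Standard deviations mentioned in the talk (1.5 is the value that
# reproduces the quoted upper limit of 358; it is inferred, not quoted).
for tau in (0.205, 0.39, 0.5, 1.5):
    print(f"SD = {tau:5.3f} -> odds ratio = {odds_range(tau):6.1f}")
```

Under this reading, a standard deviation of 0.39 gives odds of about 4.6, 0.5 gives about 7, and 0.205 (the 2002 Massachusetts estimate) gives roughly 2.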
How to Identify Outlying Hospitals? Complete Exchangeability: random effects arise from a normal model. • Estimate model and compute posterior interval estimates of risk-standardized quantities for each hospital. • Cross-validation: predict number of mortalities at dropped hospital based on parameter estimates obtained from all remaining hospitals and all data.
Identifying Outlying Institutions: Posterior Estimates Under Complete Exchangeability Assumption, $\beta_{0i} \sim N(\mu, \tau^2)$. Standardized Mortality Incidence Rate (SMIR)
Identifying Outlying Institutions: Predictive Cross-Validation Under Complete Exchangeability. [Equation for the predicted number dying at the dropped hospital not recovered.]
What is the comparison group? Indirect Standardization: • Expected rate = what the mortality rate would have been at a hospital given its actual distribution of patients across risk strata, but replacing mortality rates for these strata with those derived from all hospitals in the state. Counterfactual population is all hospitals. • Cross-validated expected rate = what the mortality rate would have been at a hospital given its actual distribution of patients across risk strata, but replacing mortality rates for these strata with those derived from all other hospitals in the state. Counterfactual population is hospital’s peers.
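The two expected rates can be illustrated with a small stratified calculation. This is a simplified sketch using raw stratum rates and made-up counts, not the model-based estimates used in the talk; the arrays and the `expected_rate` function are hypothetical.

```python
import numpy as np

# Hypothetical counts: rows = hospitals, columns = risk strata (low, medium, high).
admissions = np.array([[300, 150, 50],
                       [100, 200, 120],
                       [400, 100, 20]])
deaths = np.array([[3, 4, 5],
                   [1, 6, 14],
                   [4, 2, 2]])


def expected_rate(hospital: int, cross_validated: bool = False) -> float:
    """Expected mortality rate for one hospital under indirect standardization.

    The hospital keeps its own case mix; stratum mortality rates come from
    all hospitals, or from all *other* hospitals when cross_validated=True.
    """
    n_hospitals = admissions.shape[0]
    if cross_validated:
        keep = np.arange(n_hospitals) != hospital
    else:
        keep = np.ones(n_hospitals, dtype=bool)
    stratum_rates = deaths[keep].sum(axis=0) / admissions[keep].sum(axis=0)
    case_mix = admissions[hospital] / admissions[hospital].sum()
    return float(case_mix @ stratum_rates)


for h in range(admissions.shape[0]):
    observed = deaths[h].sum() / admissions[h].sum()
    print(f"hospital {h}: observed = {observed:.4f}, "
          f"expected = {expected_rate(h):.4f}, "
          f"cross-validated expected = {expected_rate(h, cross_validated=True):.4f}")
```

In this toy data the second hospital has high mortality in its high-risk stratum, so removing it from the reference population lowers its own expected rate and makes its excess mortality more visible.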
95% Posterior SMIR Intervals Following Isolated CABG Surgery, Massachusetts (2002).
Why such a large interval estimate (Hospital 13) with 419 Isolated CABG admissions? Numerator Denominator
Between-Hospital Variation Based on 2002 MA data: • Estimated between-hospital standard deviation was 0.205. • Odds of dying in a “high” mortality hospital is twice that in a “low” mortality hospital. • Results insensitive to choice of priors for between-hospital variance component.
IN 2003 CROSS-VALIDATED P-VALUES: (predicted = 2.05% versus observed = 4.31%)
What Did Evidence Indicate? 2003: • 95% SMIR interval just includes 2.22%. • Cross-validated p-value small (0.01). • Cross-validated prediction of 2.05 versus observed of 4.31. • Between-hospital variation in risk-adjusted rates reduced by 50% when UMass Memorial Medical Center is eliminated. 2002+2003 Combined Data: • Lower limit of SMIR interval = state rate of 2.25%. • Cross-validated p-value small (0.0004). • Cross-validated prediction of 1.79 versus observed of 3.92. • Between-hospital variation reduced by almost 75% when UMass Memorial Medical Center is eliminated.
Remarks • Counterfactual uses the entire state as the reference population. • Pair-wise comparisons of estimates therefore may not be statistically valid. • Explicitly separate sampling variability from between-hospital variability.
Risk as measured by estimated Propensity Score Shahian and Normand, Circulation (2008)
What about process measures? • A process measure reports on what was “done” to or for the patient. • Did the patient receive clinically-needed beta-blocker therapy? • Did the patient receive smoking cessation counseling if the patient smokes? • An outcome measure reports on the consequences of care.
Profiling: unit-specific measures. • Estimation and classification problem. • Several process-based measures for a provider, e.g., $y_{ik}$ = kth measure at ith hospital • Strategy now: create a composite measure: • Calculate raw sum score, $y_i = (\sum_k y_{ik}) / (\sum_k n_{ik})$ • Identify 90th percentile of the distribution of the $y_i$, $y_{.90}$ • Bonus if $y_i \geq y_{.90}$. • In 2005, CMS paid bonuses totaling $8.85 million to 123 “superior-performing” hospitals based on this “raw-sum score.”
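The raw-sum-score rule can be sketched in a few lines. The counts below are made up, and the reading of the numerator as the number of eligible patients who received each measure (with the denominator as the number eligible) is an assumption about the slide's notation.

```python
import numpy as np

# Hypothetical data: rows = hospitals, columns = process measures.
# n[i, k] = patients at hospital i eligible for measure k (assumed meaning);
# y[i, k] = eligible patients who actually received measure k (assumed meaning).
n = np.array([[50, 40, 10],
              [200, 180, 60],
              [20, 15, 5],
              [120, 100, 30],
              [80, 90, 25]])
y = np.array([[45, 30, 9],
              [190, 150, 40],
              [20, 14, 2],
              [100, 95, 28],
              [70, 60, 20]])

# Raw sum score: pool numerators and denominators across measures.
raw_sum = y.sum(axis=1) / n.sum(axis=1)

# 90th percentile of the hospital scores, then flag hospitals at or above it.
cutoff = np.quantile(raw_sum, 0.90)
bonus = raw_sum >= cutoff

for i, (score, flag) in enumerate(zip(raw_sum, bonus)):
    print(f"hospital {i}: raw sum score = {score:.3f}, bonus = {bool(flag)}")
```

In this toy example the only hospital clearing the cutoff is also the smallest one (40 eligible patient-measures in total), which illustrates why a rule based on raw scores is sensitive to sampling variability at small hospitals.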
Statistical Concerns • No separation of within and between hospital variance. • No randomization of patients to hospitals. • Inaccuracy of estimates from small hospitals. • Non-independence of observations within a hospital. • Multiple measures per person and per hospital. • $y_i = (\sum_k y_{ik}) / (\sum_k n_{ik})$ may not be a sufficient statistic.
Hospital Quality: Raw sum scores versus latent scores, 2004 (USA). Spearman Correlation AMI: 0.99 CHF: 0.92 Pneumonia: 0.91
Comparing Composites: 2005 Data (Teixeira-Pinto and Normand, Statistics in Medicine (2008))
Model Checking: posterior predictive p-values. Teixeira-Pinto and Normand, Statistics in Medicine (2008)
Concluding Remarks • Empirical problem has real policy implications (UMass Memorial Cardiac Surgery Program voluntarily closed for several months in 2005). • CMS is using the Bayesian approach to assess quality at all non-federal acute care US hospitals using mortality following heart failure, heart attack, and pneumonia (www.hospitalcompare.hhs.gov). • Challenges: conveying “uncertainty” and justifying use of posterior shrinkage estimates. • Policy focus will now be on using multiple “outcomes” for each institution. • Details found in Normand and Shahian, Statistical Science, 2007;22(2):206-226.
Sensitivity to Prior Specification of Between-Institution Variance. MA Isolated CABG Data (2002)