More practical metrics for standardizing health outcomes in effectiveness research

More Practical Metrics for Standardizing Health Outcomes in Effectiveness Research. John E. Ware, Jr., PhD, Professor and Chief Division of Measurement Sciences, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA


More Practical Metrics for Standardizing Health Outcomes in Effectiveness Research

John E. Ware, Jr., PhD, Professor and Chief

Division of Measurement Sciences, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA

Track A - Patient Reported Outcome Measurement and Comparative Effectiveness Research to Reform: Achieving Health System Change

AHRQ 2009 Annual Conference, Bethesda MD September 13-16, 2009


What is the Relationship Between Health Care Expenditures & Outcomes?

[Figure: Health Outcome plotted against Expenditures for Health Care ($)]


Health Insurance Experiment Revealed: More Health Care is Not Always Better

[Figure: Health Outcome plotted against Expenditures for Health Care ($); annotation: “Flat of the Curve”]


When the Same Outcome Costs More, Payers & Consumers Want to Pay Less

[Figure: Health Outcome plotted against Expenditures for Health Care ($)]


Who is Most Vulnerable with Aggressive Cost Containment?

Health Insurance Experiment (HIE) (1974-1981)

Medical Outcomes Study (MOS) (1986-1990)

  • Most vulnerable in the MOS:

  • Chronically ill

  • Elderly

  • Poor

  • Non-white

[Figure: Health Outcome plotted against Expenditures for Health Care ($); labels: Health Decline, Cost Containment, Well, Well off, Young]


4-Year Physical Health Outcomes Favored FFS > HMO for Chronically-Ill Medicare in the MOS

These percentages, better & worse, would be only about 5% due to measurement error

Source: Ware, Bayliss, Rogers et al., JAMA, 1996; 276:1039-1047


When Outcomes Vary at the Same Price, Payers & Consumers Want the Best Outcomes

[Figure: Health Outcome plotted against Expenditures for Health Care ($)]


To Compare Health Care Effectiveness We Need Health Outcomes “Rulers”

[Figure: Health Outcome “ruler” scaled 1 to 7, with levels labeled Worse, Same and Better, plotted against Expenditures for Health Care ($)]


Continuum of Disease-specific and Generic Health Measures

(1) Clinical Markers

(2) Specific Symptoms

(3) Impact of Disease-specific Problems

(4) Generic Functioning, Well-being and Evaluation

Adapted from: Wilson and Cleary, JAMA, 1995; Ware, Annual Rev. Pub. Health, 1995


Continuum of Disease-specific and Generic Health Measures

(1) Clinical Markers
    Example: Spirometry

(2) Specific Symptoms
    Example (Shortness of Breath): “Over the last 4 weeks I have had shortness of breath”
    Almost every day / Several days a week / A few days a month / Not at all

(3) Impact of Disease-specific Problems

(4) Generic Functioning, Well-being and Evaluation

Adapted from: Wilson and Cleary, JAMA, 1995; Ware, Annual Rev. Pub. Health, 1995


Continuum of Disease-specific and Generic Health Measures

(1) Clinical Markers
    Example: Spirometry

(2) Specific Symptoms
    Example (Shortness of Breath): “Over the last 4 weeks I have had shortness of breath”
    Almost every day / Several days a week / A few days a month / Not at all

(3) Impact of Disease-specific Problems
    Example (Respiratory-specific): “How much did your lung/respiratory problems limit your usual activities or enjoyment of everyday life?”
    Not at all / A little / Moderately / Extremely

(4) Generic Functioning, Well-being and Evaluation

Adapted from: Wilson and Cleary, JAMA, 1995; Ware, Annual Rev. Pub. Health, 1995


Continuum of Disease-specific and Generic Health Measures

(1) Clinical Markers
    Example: Spirometry

(2) Specific Symptoms
    Example (Shortness of Breath): “Over the last 4 weeks I have had shortness of breath”
    Almost every day / Several days a week / A few days a month / Not at all

(3) Impact of Disease-specific Problems
    Example (Respiratory-specific): “How much did your lung/respiratory problems limit your usual activities or enjoyment of everyday life?”
    Not at all / A little / Moderately / Extremely

(4) Generic Functioning, Well-being and Evaluation
    Example (Generic): “In general, would you say your health is…”
    Excellent / Very good / Good / Fair / Poor

Adapted from: Wilson and Cleary, JAMA, 1995; Ware, Annual Rev. Pub. Health, 1995


There is More to the Continuum

(1) Clinical Markers

(2) Specific Symptoms

(3) Impact of Disease-specific Problems

(4) Generic Functioning, Well-being and Evaluation


Prediction and Risk Management: PROs are among the Best Predictors

Health-Related QOL (HR-QOL):

(3) Impact of Disease-specific Problems

(4) Generic Functioning, Well-being and Evaluation

Outcomes predicted:

  • Future health

  • Inpatient expenditures

  • Outpatient expenditures

  • Job loss

  • Response to treatment

  • Return to work

  • Work productivity

  • Mortality


What Do We Need for Comparative Effectiveness Research?

  • Outcomes that matter to patients

  • Practical measures

  • Coverage of a wide range

  • Greater precision

  • Comparability of scores

  • Ease of interpretation

Physical activity limitations

Symptoms of psychological distress

Physical well-being

Life satisfaction

Emotional behavior

Role disability due to physical problems

Psychological well-being

General health perceptions

Physical mobility

Role disability due to emotional problems

Satisfaction with physical condition

Social activities with friends/relatives


Content of Widely-Used Patient-Reported Outcome Measures

Psychometric measures:

SIP       = Sickness Impact Profile (1976)
HIE       = Health Insurance Experiment surveys (1979)
NHP       = Nottingham Health Profile (1980)
QLI       = Quality of Life Index (1981)
COOP      = Dartmouth Function Charts (1987)
DUKE      = Duke Health Profile (1990)
MOS FWBP  = MOS Functioning and Well-Being Profile (1992)
MOS SF-36 = MOS 36-Item Short-Form Health Survey (1992)

Utility-related measures:

QWB       = Quality of Well-Being Scale (1973)
EUROQOL   = European Quality of Life Index (1990)
HUI       = Health Utility Index (1996)
SF-6D     = SF-36 Utility Index (Brazier, 2002)

PROMIS    = Patient Reported Outcomes Measurement Information System

[Table: concepts (rows) by measures (columns); the counts below give the number of measures marked as covering each concept]

CONCEPTS                        Measures marked
Physical functioning            12
Social functioning              11
Role functioning                11
Psychological distress          11
Health perceptions (general)     6
Pain (bodily)                   10
Energy/fatigue                   8
Psychological well-being         5
Sleep                            4
Cognitive functioning            4
Quality of life                  3
Reported health transition       3

Source: Adapted from Ware, 1995


What Do We Need for Comparative Effectiveness Research?

  • Outcomes that matter to patients

  • Practical measures

  • Coverage of a wide range

  • Greater precision

  • Comparability of scores

  • Ease of interpretation


What Do We Need for Comparative Effectiveness Research?

[Callout: “Ceiling Effect”]

  • Outcomes that matter to patients

  • Practical measures

  • Coverage of a wide range

  • Greater precision

  • Comparability of scores

  • Ease of interpretation


A Practical Solution in 1999: Computerized Dynamic Health Assessment

IRT/CAT will spawn a new generation of static tools

[Figure: two scatter plots against the Criterion Score, N = 1016 each]

  • Dynamic 5-Item Headache Pain Measure: r = 0.938

  • Skewed 5-Item Headache Pain Measure: r = 0.536; annotations: “Ceiling Effect”, 3 SD units, No Disability

Ware JE, Jr, et al. Med Care. 2000;38:1173-82.


What Do We Need for Comparative Effectiveness Research?

[Figure labels: Criterion, VAS]

  • Outcomes that matter to patients

  • Practical measures

  • Coverage of a wide range

  • Greater precision

  • Comparability of scores

  • Ease of interpretation


What Do We Need for Comparative Effectiveness Research?

  • Outcomes that matter to patients

  • Practical measures

  • Coverage of a wide range

  • Greater precision

  • Comparability of scores

  • Ease of interpretation


Practical Solution in 2000: Cross-Calibration of Headache Pain Disability Measures

Theta (θ) [Best Possible Estimate]

Scales         20    30    40    50    60    70
HDI            16    43    73    91    98   100
HIMQ           74    53    31    17     8     2
MIDAS          58    28     5     1     0     0
MSQ            31    53    79    92    96    99
DYNHA-5 (+)    23    32    41    51    58    66

Note: Direction of scoring shown with arrows

Source: Ware, Bjorner & Kosinski, Medical Care, 2000
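The cross-calibration table can be read as a crosswalk: find a score on one scale, read off theta, then read off the equivalent score on another scale. The sketch below illustrates that lookup using the six tabulated theta points. Linear interpolation between the tabulated points is an assumption made here for illustration and is not the IRT calibration used in the cited study. The names `CROSSWALK`, `to_theta` and `convert` are hypothetical.

```python
# Tabulated theta points and the score each scale shows at those points.
THETA = [20, 30, 40, 50, 60, 70]
CROSSWALK = {
    "HDI": [16, 43, 73, 91, 98, 100],
    "HIMQ": [74, 53, 31, 17, 8, 2],
    "MIDAS": [58, 28, 5, 1, 0, 0],
    "MSQ": [31, 53, 79, 92, 96, 99],
    "DYNHA-5": [23, 32, 41, 51, 58, 66],
}


def _interp(x, xs, ys):
    """Piecewise-linear lookup (an assumption, not the study's method).

    xs may run in ascending or descending order; values outside the
    tabulated range are clamped to the nearest end point.
    """
    if xs[0] > xs[-1]:
        xs, ys = xs[::-1], ys[::-1]
    if x <= xs[0]:
        return float(ys[0])
    if x >= xs[-1]:
        return float(ys[-1])
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            if x1 == x0:
                return float(ys[i])
            return ys[i] + (ys[i + 1] - ys[i]) * (x - x0) / (x1 - x0)
    raise ValueError("x not covered by xs")


def to_theta(score, scale):
    """Map a score on one scale to the common theta metric."""
    return _interp(score, CROSSWALK[scale], THETA)


def convert(score, source, target):
    """Map a score from one scale to another by way of theta."""
    return _interp(to_theta(score, source), THETA, CROSSWALK[target])


print(to_theta(51, "DYNHA-5"))      # 50.0
print(convert(28, "MIDAS", "HDI"))  # 43.0
```

Scores at the flat end of a scale (for example MIDAS scores of 0) cannot be separated by this lookup, which is one reason a crosswalk is only as good as the range each scale covers.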


We Need the Health Equivalent of a Two-Sided Tape Measure and Public-Private Partnerships That Meet the Needs of Research and Business

52 centimeters = 20.5 inches


What Do We Need for Comparative Effectiveness Research?

[Callout: What do the results mean?]

  • Outcomes that matter to patients

  • Practical measures

  • Coverage of a wide range

  • Greater precision

  • Comparability of scores

  • Ease of interpretation


PRO Validation Must be Comprehensive

[Diagram: Measures In Question compared with a Gold Standard, with Other Measures & Methods, with Causes and with Consequences]

Causes:

  • Diagnosis

  • Disease severity

  • Responders

  • Treatments

Consequences:

  • Work productivity

  • Costs of care

  • Mortality

  • Self-evaluated health

Adapted from: Ware JE, Jr. and Keller SD: Interpreting general health measures, in: Quality of Life and Pharmacoeconomics in Clinical Trials. Philadelphia, PA: Lippincott-Raven Publishers; 1995: Chapter 47.


What Do Differences in Treatment Effectiveness Mean?

[Figure: Physical Component Summary (PCS) scale with ticks at 30, 40 and 50; labels: Congestive Heart Failure, Chronic Lung Disease, Diabetes Type II, Asthma Before Rx, Asthma After Rx, Treatment, Average Adult, Average Well Adult]

  • 50% reduction in disease burden

  • 33% reduction in hospitalization

  • Substantial increase in work productivity

  • Subsequent cost savings


Matching Methods to Applications: “Choosing the Right Horse for the Course”

  • Population monitoring

  • Group-Level outcomes monitoring

  • Patient-level measurement/management


Matching Methods to Applications

[Figure: three rulers scaled from 1 (Most Functionally Impaired) to 7]

  • Single-Item: Population Monitoring; Noisy Individual Classification

  • Multi-Item Scale: Group-Level Outcomes Monitoring

  • “Item Pool” (CAT Dynamic): Patient-Level Management; Very Accurate Individual Classification


Solutions

  • Improved psychometrics (Item response theory – IRT)

  • Computerized adaptive testing (CAT) software

  • The Internet (and other connectivity)

Business Week. November 26, 2001.


First, Construct Better Metrics

  • Comprehensive Item “Pools”

  • IRT Cross Calibration of Items

% @ Ceiling, Physical Functioning (PF):

  • 1980 “PF Ruler”: > 75% @ Ceiling

  • 1990 “PF Ruler” (PF-10): > 30% @ Ceiling

  • 2008 “PF Ruler” (NEW PF): < 3% @ Ceiling

[Figure labels: ADL, SIP, FIM]

Source: Business Week 11/26/01
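The “% @ Ceiling” figures are the share of respondents who obtain the best possible score, so the measure cannot register any further improvement for them. A minimal sketch of that calculation follows. The function name `percent_at_ceiling` and both score lists are hypothetical and are not data from the presentation.

```python
def percent_at_ceiling(scores, max_score):
    """Percentage of respondents scoring at the top of the scale."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_top = sum(1 for s in scores if s >= max_score)
    return 100.0 * at_top / len(scores)


# Hypothetical scores on a 0-100 scale, for illustration only.
coarse_ruler = [100, 100, 100, 85, 100, 100, 70, 100]
fine_ruler = [62, 71, 58, 85, 66, 100, 70, 77]

print(percent_at_ceiling(coarse_ruler, 100))  # 75.0
print(percent_at_ceiling(fine_ruler, 100))    # 12.5
```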


Precision Varies Across “Static” and Dynamic Forms and Across Score Levels

[Figure: Standard Error (0 to 6.0) and Reliability (0.75, 0.90, 0.95) plotted against Physical Function (PF) score (10 to 80, Mean = 50); curves: PF-1 (“Static”), PF-2 (“Static”), PF-10 (“Static”), PF CAT-10, PF “Criterion” (Item Bank); label: Rheumatoid Arthritis]

Source: Rose M, Bjorner JB, Becker J, Fries JF and Ware JE. Evaluation of a preliminary physical function item bank supported expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS). Journal of Clinical Epidemiology, 2008, 61, 17-33.
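The figure pairs a Standard Error axis with a Reliability axis. Under classical test theory the two are linked by reliability = 1 - (SE / SD)^2. The sketch below assumes a norm-based metric with a standard deviation of 10 around the stated mean of 50. The SD of 10 is an assumption not given on the slide, and the function names are hypothetical.

```python
import math


def reliability_from_se(se, sd=10.0):
    """Reliability implied by a standard error, given the score SD (assumed 10)."""
    return 1.0 - (se / sd) ** 2


def se_from_reliability(reliability, sd=10.0):
    """Standard error implied by a reliability, given the score SD (assumed 10)."""
    return sd * math.sqrt(1.0 - reliability)


print(reliability_from_se(5.0))             # 0.75
print(round(se_from_reliability(0.90), 2))  # 3.16
print(round(se_from_reliability(0.95), 2))  # 2.24
```

With this assumption, the reliability levels of 0.75, 0.90 and 0.95 correspond to standard errors of about 5.0, 3.2 and 2.2, which matches where those labels sit between the standard error ticks.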


2nd Solution, Assess Health Dynamically

[Figure: CAT item selection; annotation: “Patient scores here”]

CAT = Computerized Adaptive Testing
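A computerized adaptive test picks each next question based on the current score estimate, then stops once the estimate is precise enough or a maximum length is reached. The sketch below is one generic way to do this and is not the software described in the presentation. It assumes a two-parameter logistic (2PL) IRT model, maximum-information item selection and an expected a posteriori (EAP) estimate under a standard normal prior. The item bank values and all function names are hypothetical.

```python
import math

# Hypothetical 2PL item bank: (discrimination a, difficulty b). Not from the source.
ITEM_BANK = [(1.5, -2.0), (1.2, -1.0), (1.8, -0.5), (1.4, 0.0),
             (1.6, 0.5), (1.3, 1.0), (1.7, 1.5), (1.5, 2.0)]

GRID = [g / 10.0 for g in range(-40, 41)]  # theta grid from -4.0 to 4.0


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap(responses):
    """Posterior mean and SD of theta under a standard normal prior.

    responses is a list of (item_index, answer) pairs with answer 0 or 1.
    """
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized normal prior
        for idx, r in responses:
            p = p_endorse(t, *ITEM_BANK[idx])
            w *= p if r == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(answer, max_items=5, se_target=0.3):
    """Administer items adaptively; answer(item_index) must return 0 or 1."""
    responses, theta, se = [], 0.0, 1.0
    while len(responses) < max_items and se > se_target:
        used = {idx for idx, _ in responses}
        remaining = [i for i in range(len(ITEM_BANK)) if i not in used]
        if not remaining:
            break
        # Ask the unused item that is most informative at the current estimate.
        nxt = max(remaining, key=lambda i: information(theta, *ITEM_BANK[i]))
        responses.append((nxt, answer(nxt)))
        theta, se = eap(responses)
    return theta, se, [idx for idx, _ in responses]


# Simulated respondent who endorses every item easier than theta = 1.0.
theta, se, asked = run_cat(lambda i: 1 if ITEM_BANK[i][1] < 1.0 else 0)
print(asked, round(theta, 2), round(se, 2))
```

The stopping rule combines a length cap with a precision target, which is how a dynamic form can stay short while still reporting a standard error for each individual score.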


What are the Advantages of Dynamic Assessments?

  • More accurate risk screening

  • Reliable enough to monitor individual outcomes

  • Brevity of a short form – 90% reduction in respondent burden

  • Elimination of “ceiling” & “floor” effects

  • Can be administered using various data collection technologies

  • Markedly reduced data collection costs

  • Monitor data quality in real time


3rd Solution: The Internet

www.asthmacontroltest.com

www.amIhealthy.com

Reference – Headache Impact: Bayliss MS, Dewey JE, Cady R et al., A Study of the Feasibility of Internet Administration of a Computerized Health Survey: The Headache Impact Test (HIT). Quality of Life Research, 2003, 12: 953-961.

Reference – Asthma Control: Nathan RA, Sorkness CA, Kosinski M et al., Development of the Asthma Control Test: A survey for assessing asthma control. Journal of Allergy and Clinical Immunology. 2004;113: 59-65.


Conclusions

  • Patient-reported outcomes (PROs) are very useful

  • Standardization of concepts & metrics is enabling comparisons across treatments & settings

  • Increasing widespread use proves that more practical tools will be adopted

  • Promising technological advances include: item response theory (IRT), computerized adaptive testing (CAT) and Internet-based data capture

