Methods to analyze real world databases and registries

Methods to analyze real world databases and registries. Hilal Maradit Kremers, MD MSc Mayo Clinic, Rochester, MN. Clinical Research Methodology Course NYU-Hospital for Joint Diseases December 11, 2008. Disclosure. Research funding from National Institutes of Health (RA) Amgen (psoriasis)


Methods to analyze real world databases and registries

Hilal Maradit Kremers, MD MSc

Mayo Clinic, Rochester, MN

Clinical Research Methodology Course

NYU-Hospital for Joint Diseases

December 11, 2008


Disclosure

Research funding from

  • National Institutes of Health (RA)
  • Amgen (psoriasis)
  • Pfizer (pulmonary arterial hypertension)

  • Terminology
  • Clinical trials versus observational studies and registries
  • Types of observational studies in rheumatic diseases
    • Descriptive epidemiology (incidence, prevalence)
    • Disease definitions (i.e. classification criteria)
    • Examining outcomes (including effectiveness of therapy) and risk factors (environmental, genetic)
  • Tips when interpreting results

“Real-world databases” & registries




Terminology of related observational research disciplines

[Figure: related observational research disciplines; surviving label: Health Services]












Terminology: Clinical medicine versus epidemiology

Clinical medicine

  • Natural history of the disease
  • Signs and symptoms
  • Diagnosis (how and when)
  • Current clinical practice
  • Clinical literature
  • Drug-induced illnesses

Epidemiology

  • Distribution and determinants of diseases in populations
    • Study design
    • Data collection
    • Measurement
    • Analyses
    • Interpretation
    • Critical review
Clinical trials versus observational studies and registries

[Figure: schematic contrasting the study designs; surviving labels: Exposure + / Exposure -, Disease + / Disease -]

Why do we need registries?
  • Limitations of pre-marketing trials
  • Unresolved issues from pre-marketing studies
  • New signals or inconsistent signals from post-marketing surveillance
  • Evolving concerns about safety
  • Establishing risk-benefit margins
  • Learn about use, Rx decisions, compliance and other physician/patient behaviors
  • To evaluate a risk management program
Clinical trial vs observational studies/registries – four “toos”
  • Too few
  • Too brief
  • Too simple
  • Too median-aged
Implications of four “toos”
  • Relative effectiveness unknown
    • Effectiveness in comparison to alternative therapies
  • Surrogate vs. clinical endpoints
    • Bone mineral density, blood pressure, lipid levels, tumor size, joint counts vs radiographic damage
  • Infrequent adverse events
  • Long latency adverse events
    • DES & adenocarcinoma of vagina
  • Special populations
    • Women, children, elderly, multiple comorbidities
  • Drug use in clinical practice
What is a registry?
  • Definition of a registry
    • An organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by particular disease, condition or exposure, and that serves a predetermined scientific, clinical, or policy purpose(s).
  • Different types of registries
    • Disease registry
    • Product registry
    • Health services registry
  • Pregnancy registries

Registries for Evaluating Patient Outcomes. AHRQ Publication No. 07-EHC001. May 2007.

Purpose of a registry
  • Describe the natural history of disease
  • Determine clinical effectiveness or cost effectiveness of health care products, drugs and services
  • Measure or monitor safety and harm
  • Measure quality of care
Registry types
  • Disease registry
    • Patients who have the same diagnosis
    • e.g. all RA or SLE patients or rheumatic diseases
  • Product registry
    • Patients who have been exposed to biopharmaceutical products or medical devices
  • Health services registry
    • Patients who have had a common procedure, clinical encounter or hospitalization (TKA-THA registries)
Registries useful when:
  • Outcome is relatively common, well-defined and ascertainable & serious
  • Extensive drug exposure
  • Appropriate reference group
  • Data on relevant covariates ascertainable
  • Minimal channeling (preferential prescribing of a new drug to patients at a higher risk)
  • Minimal confounding by indication
  • Onset latency <2-3 years
  • Required drug exposure <2-3 years
  • Not an urgent drug safety crisis
Registries may not be useful when:
  • Outcome: poorly-defined, difficult to validate outcomes (depression, psychosis)
  • Exposure
    • Rare drug exposure
    • Intermittent exposure
    • OTC drugs, herbals
  • Significant confounding by indication
    • Antidepressants and suicides
    • Inhaled beta-agonists and asthma death
  • Certain settings
    • Specialty clinics, in-hospital drug use
Consequences of not doing registries or observational studies
  • Arguing over case reports
  • Lack of data on real benefit-risk balance
  • Less effective and usually biased decision-making
  • Possibly false conclusions
  • Lawsuits
Observational study designs
  • Drug exposed patients: case reports, case series
  • Ecological studies
  • Exposed vs. unexposed: prospective cohort
Ecological studies – time series
  • When drug is predominant cause of the disease
  • Changes in outcomes following an abrupt change in drug exposure, as result of a policy or regulatory change, publications, media coverage
  • Reported Cases of Reye's Syndrome in Relation to the Timing of Public Announcements

Belay et al. NEJM 1999; 340:1377

Ecological studies – time series: Secular trends in NSAID use and colorectal cancer incidence

Lamont: Cancer J 2008:14(4):276-277

Ecological studies – time series: Rofecoxib-celecoxib and myocardial infarction

Brownstein et al. PLoS ONE. 2007:2(9):e840.

Summary: ecological studies

Limitations

  • Complexity of disease causation
  • Confounding by the “ecological fallacy”

Strengths

  • Cost ↓, time ↓, using routinely collected data
  • New hypotheses about the causes of a disease and new potential risk factors (e.g. air pollution)
  • Provides estimates of causal effects that are not attenuated by measurement error
  • Some risk factors for disease operate at the population level (e.g. socioeconomic status)
Prevalence: Proportion of individuals in a defined population who have a particular disease at a given point in time

[Figure: population on 1/1/2005; diseased (RA) individuals N=9]

Prevalence = 9/100

Prevalence = Incidence of disease x Duration of disease

Incidence: Proportion of new cases of a disease or health-related condition in a population-at-risk over a specified period of time

[Figure: population on 1/1/2005 with 1 year of follow-up; individuals diseased on 1/1/2005 are excluded as prevalent cases, leaving N=91 at risk; new-onset disease is counted during the 1-year follow-up]

Incidence = 2 cases/91 person-years
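The arithmetic on these two slides can be sketched in a few lines. This is a minimal illustration using only the slide's numbers (100 people, 9 prevalent cases, 2 new cases); the function names are made up for the example, and the person-time assumes every at-risk person contributes a full year, which is the same simplification the slide makes.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Point prevalence: existing cases divided by the whole population."""
    return existing_cases / population


def incidence_rate(new_cases: int, person_years: float) -> float:
    """Incidence rate: new cases divided by person-time at risk."""
    return new_cases / person_years


population = 100        # population on 1/1/2005
prevalent_cases = 9     # already diseased on 1/1/2005
new_cases = 2           # new-onset disease during 1 year of follow-up

# Prevalent cases are not at risk of becoming new cases, so they are excluded.
at_risk = population - prevalent_cases

# Simplifying assumption: each at-risk person is followed for the full year.
person_years = at_risk * 1.0

prev = prevalence(prevalent_cases, population)
rate = incidence_rate(new_cases, person_years)

print(f"Prevalence: {prev:.2%}")
print(f"Incidence rate: {rate * 100_000:.0f} per 100,000 person-years")
```

In a real study, people who develop disease, die, or move away partway through the year contribute less than a full person-year, so the denominator would be somewhat smaller than 91.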


Incidence of RA in Olmsted County, MN (1955-2005)

Gabriel et al. A&R 2008: 58(9):S453

Incidence of PsA by age and sex (1970-2000)

[Figure: incidence rate (per 100,000) by age and sex]

Wilson et al. AC&R 2009: in press.

Incidence study requires keeping track of both the numerator & denominator!

[Figure: population on 1/1/2005 followed over successive 1-year intervals]

  • Residents who die or move out of the city
  • New residents (i.e. new folks who move into the city)
  • All new-onset disease while living in the city
  • Possible in few locations in the world
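To make the bookkeeping concrete, here is a rough sketch of how person-time might be tallied in an open population. The four residents and their dates are invented for illustration, and `days_at_risk` is a hypothetical helper, not part of any particular registry's software.

```python
from datetime import date

STUDY_START = date(2005, 1, 1)
STUDY_END = date(2006, 1, 1)  # exclusive end of the 1-year window

# Hypothetical residents: (id, moved_in, left_or_died, disease_onset)
residents = [
    ("A", date(2000, 1, 1), None, None),              # resident all year
    ("B", date(2000, 1, 1), date(2005, 7, 1), None),  # moved out mid-year
    ("C", date(2005, 10, 1), None, None),             # new resident
    ("D", date(2000, 1, 1), None, date(2005, 4, 1)),  # new-onset disease
]


def days_at_risk(moved_in, left, onset):
    """Days a resident is both living in the city and still disease-free."""
    start = max(moved_in, STUDY_START)
    stops = [STUDY_END] + [d for d in (left, onset) if d is not None]
    end = min(stops)
    return max((end - start).days, 0)


total_days = sum(days_at_risk(m, l, o) for _, m, l, o in residents)

# Numerator: onsets that occur inside the window while the person is a resident.
new_cases = sum(
    1
    for _, m, l, o in residents
    if o is not None
    and STUDY_START <= o < STUDY_END
    and m <= o
    and (l is None or o < l)
)

person_years = total_days / 365.25
rate = new_cases / person_years
print(f"{new_cases} case(s) over {person_years:.2f} person-years")
```

The point of the sketch is that the denominator changes with every move-in, move-out, death, and disease onset, which is why complete population tracking is needed.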
Mortality analyses
  • RA: 124 studies in 84 unique cohorts¹
  • 15 key points in interpretation¹
    • Incident vs prevalent cases
    • Population-based vs clinic-based
    • SMR
  • Cause-specific mortality²
  • 3 time dimensions in mortality analyses³
    • Duration of RA
    • Timing of onset of RA relative to death
    • Calendar time

¹ Sokka et al. Clinical Exp Rheum 2008;26(Suppl. 51): S35-S61

² Aviña-Zubieta et al. A&R 2008; 59:1690-1697

³ Ward. A&R 2008; 59: 1687-1689

Mortality in incidence cohorts < prevalence cohorts

¹ Sokka et al. Clinical Exp Rheum 2008;26(Suppl. 51): S35-S61

² Aviña-Zubieta et al. A&R 2008; 59:1690-1697

Referral bias: Population-based vs clinic-based cohorts

[Figure: nested populations showing reality in the population, what the GP sees, and what the rheumatologist sees (N=40); surviving label: mild disease]

Standardized mortality ratio (SMR)
  • Observed deaths ÷ expected deaths
  • Strongly influenced by choice of data to calculate expected deaths
    • Age and gender specific
    • Time period
    • Complete follow-up
  • Example:
    • RA cohort assembled between 1970-1990 with follow-up until 2000
    • Expected mortality derived from US mortality rates between 1970-1990
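A small sketch may help show why the choice of reference rates matters. All person-years, rates, and the observed death count below are invented for illustration; the only thing taken from the slide is the ratio itself (observed ÷ expected) and the idea that expected deaths come from age-specific reference rates for a chosen time period. The second set of rates assumes, hypothetically, that general-population mortality fell during the later follow-up years.

```python
# Hypothetical RA cohort: person-years of follow-up per age stratum.
person_years = {"40-59": 1000, "60-79": 800, "80+": 200}
observed_deaths = 50  # hypothetical

# Hypothetical general-population death rates (per person-year).
rates_enrollment_period = {"40-59": 0.005, "60-79": 0.020, "80+": 0.080}
rates_followup_period = {"40-59": 0.004, "60-79": 0.016, "80+": 0.070}


def expected_deaths(py_by_stratum, rate_by_stratum):
    """Indirect standardization: sum of person-years times reference rate."""
    return sum(py_by_stratum[s] * rate_by_stratum[s] for s in py_by_stratum)


def smr(observed, expected):
    """Standardized mortality ratio: observed deaths over expected deaths."""
    return observed / expected


exp_old = expected_deaths(person_years, rates_enrollment_period)
exp_new = expected_deaths(person_years, rates_followup_period)

smr_old = smr(observed_deaths, exp_old)
smr_new = smr(observed_deaths, exp_new)

print(f"Expected (enrollment-period rates): {exp_old:.1f}, SMR = {smr_old:.2f}")
print(f"Expected (follow-up-period rates):  {exp_new:.1f}, SMR = {smr_new:.2f}")
```

Under these assumed numbers, using the older and higher reference rates inflates the expected deaths and makes the excess mortality look smaller than it would against rates matched to the follow-up period.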
Trends in RA Mortality vs. Expected*

[Figure: two panels; x-axis: calendar year; y-axis: mortality rate (per 100 py)]

Gonzalez A, et al. Arthritis Rheum 2007;56(11):3583-3587

Observed: expected mortality in RA

[Figure: survival (%) by years after RA incidence; curves: Expected (non-RA) and Observed (RA)]

Gabriel et al. A&R 2003; 48:54-58

Time: disease duration and CV mortality in RA

Maradit Kremers A&R 2005; 52: 722-732

Summary: incidence, prevalence and mortality


  • Underlying data source
    • Population-based or not
    • Incident vs prevalent cases
  • Methodology
    • Case ascertainment
    • Completeness of follow-up
  • Comparison data!

Disease definitions and classification criteria in rheumatic diseases

Developed using observational study methodologies

Dynamic nature of rheumatic diseases

[Figure: cumulative incidence (%) by years since RA incidence; curves labeled 2 or more, 3 or more, 4 or more, 5 or more]

  • 25% of patients who initially met RA criteria still had evidence of RA 3-5 years later

O’Sullivan et al. Ann Intern Med 1972; 76: 573-7.

Mikkelsen et al. A&R 1969; 12: 87-91.

Lichtenstein et al. J Rheumatol 1991; 18: 989-93.

Icen et al. J Rheumatol 2008.

Typical vs desired methodology for classification criteria

[Figure: two approaches. Surviving labels: patients with established disease, patients with other established rheumatic diseases, compare characteristics; patients with new-onset disease, patients with other new-onset rheumatic diseases, observe disease evolution]


Examining outcomes and risk factors in rheumatic diseases

Cohort Studies (outcomes)

Registries (outcomes)

Case-control studies (risk factors)

Types of Cohort Studies
  • Designated by the timing of data collection relative to when the investigator begins the study:
    • Prospective
    • Retrospective (historical)
    • Mixed
  • Mayo studies: retrospective
  • Registries: prospective
Types of Cohort Studies

[Figure: timelines showing cohort selection relative to when the investigator begins the study, for each design including Mixed (P+R)]


All designs feasible either as ad hoc registries or in automated database studies.

Cohort study: design options
  • Prospective vs. retrospective
  • Entry into cohort: closed or open
  • Timing of exposure: new users or not
  • Source of un-exposed cohort
    • Internal
    • External
      • drug exposed subjects only, registries
Cohort Study: Steps
  • Cohort identification
      • Define subjects & follow-up period
  • Risk factor/drug exposure measurement throughout follow-up
  • Outcome (disease) ascertainment
  • Confounder measurements (throughout follow-up)
  • Analysis
Step 1 - Cohort identification
  • Trade-off between external and internal validity
  • Retrospective vs. prospective
    • Consider feasibility and costs
  • Follow-up
    • Tracking of drug changes over time
    • Losses to follow-up, esp. if likely to be differential (different for drug users and non-users)
Step 2 – Risk factor/Drug exposure measurement
  • New versus old users
    • Ability to account for confounders before the drug was started
    • Ability to quantify outcomes early after starting the drug (compliance, early drop-offs due to intolerance)
  • Incomplete drug exposure
    • E.g. One time measurement of DMARD use and mortality
  • Drug exposure metric
    • Ever/never, dose (average, cumulative), duration
  • Reference group
    • Non-users, past users, users of other drugs
  • Misclassification of episodic use
Step 2 - Drug exposure measurement methods
  • Interviews
      • Face-to-face, phone or self-administered
      • Excellent to capture current use but not for past use or changing drug use over time
      • Recall is a limitation: works best for cognitively intact subjects & regularly used drugs
  • Biological testing
      • Blood or urine
      • Excellent to capture current use but not for past use
      • Non-differential (unless disease affects the assay)
  • Pharmacy or claims records
  • Medical records
Step 2 - Pharmacy or claims records for drug exposure
  • Drugs obtained by prescription
  • Drug details available
  • Accurate & complete for both past and current drug exposure
  • Temporal tracking possible
  • Limitation: compliance
    • A filled prescription does not guarantee that the drug was taken
  • Validation studies are necessary
Step 2 - Misclassification of drug exposure

[Figure: timeline comparing actual drug use with exposure inferred from claims data]

  • Actual use: MD prescription; free sample (15 days); Rx fill for 30 days, patient used for 40 days; refill for 30 days, used 20 days
  • Claims data: 30 days + 30 days, extended by a +15 days rule
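One common way to turn pharmacy fills into exposure periods is to chain fills together when the gap between them is short, then extend the end of each period by a grace window. The sketch below assumes that interpretation of the "+15 days rule"; the slide does not spell out the exact algorithm, and the fill dates and the `build_episodes` helper are hypothetical.

```python
from datetime import date, timedelta


def build_episodes(fills, grace_days=15):
    """Merge (fill_date, days_supply) records into exposure episodes.

    A fill joins the current episode if it occurs no later than
    grace_days after the previous supply runs out. Each episode's end
    is then extended by the same grace period.
    """
    grace = timedelta(days=grace_days)
    merged = []
    for fill_date, days_supply in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply)
        if merged and fill_date <= merged[-1][1] + grace:
            start, end = merged[-1]
            merged[-1] = (start, max(end, supply_end))
        else:
            merged.append((fill_date, supply_end))
    return [(start, end + grace) for start, end in merged]


# Hypothetical claims: two fills close together, then a long gap.
fills = [
    (date(2008, 1, 1), 30),
    (date(2008, 2, 10), 30),
    (date(2008, 5, 1), 30),
]

episodes = build_episodes(fills)
for start, end in episodes:
    print(f"exposed from {start} to {end}")
```

Whatever rule is chosen, claims only see dispensing dates and days supplied. Free samples, stretched-out use, and early discontinuation are invisible, which is the misclassification the slide illustrates.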

Step 2 summary: Aspects of drug exposure measurement
  • Completeness & accuracy
  • Response rate
  • Change over time
  • Special populations
  • Details of the drug
  • Details of utilization
  • Availability & cost (reimbursement)
  • Differential or non-differential
Step 3 – Outcome ascertainment
  • Low specificity – methods used to find outcomes incorrectly include subjects without the outcome
    • Validation of outcomes in database studies
  • Low sensitivity - incomplete (and potentially differential) identification of outcomes
    • Increased diagnostic surveillance (e.g. NSAIDs and GI events)
    • Under-diagnosed & un-treated conditions
  • Timing of disease onset (protopathic bias)
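A validation study typically compares the database's outcome definition against a reference standard such as chart review. As a rough illustration, the counts below are invented; only the definitions of sensitivity, specificity, and positive predictive value are standard.

```python
# Hypothetical validation sample: database algorithm vs chart review.
true_pos = 80    # algorithm says outcome, chart review confirms
false_pos = 20   # algorithm says outcome, chart review does not confirm
false_neg = 10   # algorithm misses a chart-confirmed outcome
true_neg = 890   # both agree there is no outcome

# Sensitivity: share of true outcomes that the algorithm finds.
sensitivity = true_pos / (true_pos + false_neg)

# Specificity: share of outcome-free subjects correctly left out.
specificity = true_neg / (true_neg + false_pos)

# Positive predictive value: share of algorithm-flagged cases that are real.
ppv = true_pos / (true_pos + false_pos)

print(f"Sensitivity: {sensitivity:.3f}")
print(f"Specificity: {specificity:.3f}")
print(f"PPV:         {ppv:.3f}")
```

If these measures differ between exposed and unexposed subjects, for example because drug users are monitored more closely, the misclassification is differential and can bias the comparison in either direction.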
Step 3 challenges: Protopathic bias

[Figure: timeline around study start; surviving label: stomach pain]

nsNSAID = non-specific NSAID

Step 3 – Outcome of interest in rheumatology
  • Beneficial effects/effectiveness
    • Disease progression
  • Adverse effects
    • Mortality
    • Cardiovascular morbidity
    • Infections
    • Lymphomas & solid malignancies
    • Autoimmunity
    • GI events (NSAIDs)
    • Pregnancy outcomes
Step 3 - Consistency in outcome definitions – infections in RA

Askling: Curr Opin Rheumatol 2008; 20(2): 138–144

Step 3 challenges: Differential misclassification of outcome
  • Cohort study: May result from misclassifying outcome-free subjects as having the outcome (specificity) or from incomplete identification of persons with the outcome (sensitivity), when either differs between exposed and unexposed subjects
    • Under-diagnosed conditions
    • Example: Patients with RA, especially those on biologics, are likely to see their doctors more often and are more likely to have laboratory tests or be evaluated for CVD
  • Using medication-taking as a surrogate of outcome can be problematic
Step 3 – Outcome ascertainment: Competing risk of death

Melton et al. Osteoporos Int. 2008 Sep 17.

Step 4 – Confounder measurements

What is a confounder?

  • The clinical condition which determines drug selection (channeling) and is linked to the adverse event
    • Indication
    • Severity
    • Contraindication

[Diagram: the confounder (INDICATION) is linked to both drug exposure and the adverse event]

Step 4 - Confounding by indication
  • The indications for drug use, because of their natural association with prognosis, may confound the comparison so that it looks as if the treatment causes the disease

“You’d better avoid antihypertensive treatment because treated patients have higher stroke rates”

Step 4 - Confounding by disease severity
  • The severity of RA is a confounder because:
    • Associated with use of biologics
    • Independent risk factor for CVD
    • Not in causal pathway



Rheumatoid arthritis (RA) severity


Step 4 - Confounding by contraindication
  • MD’s perception of the patient’s tendency to develop peptic ulcer & bleeding is a confounder because:
    • Associated with NSAID choice
    • Independent risk factor for GI bleeding
    • Not in causal pathway

Celebrex vs


GI bleeding

MD perception of risk

Step 4 - Confounding by indication
  • Prescription Channeling
    • New versus older products
    • Example: Comparison of the risk of upper GI bleeding among coxibs versus traditional NSAIDs
      • Coxibs preferentially prescribed to patients at high risk for upper GI bleeding

Moride et al. Arthritis Res Ther. 2005;7:R333-342.

Step 4 – Extent of confounding by indication

[Figure: potential for confounding by indication plotted against intentionality of treatment effect by prescriber; examples: coxibs and GI events, coxibs and CV events]

Schneeweiss. Clin Pharmacol Ther 2007: 82:143–156

Step 5 - Analysis
  • Conventional methods to control for confounding
    • Randomization (clinical trials)
    • Restriction - homogeneous study population
    • Matching - select controls comparable to cases with respect to confounders
    • Stratified analysis
    • Statistical modeling
  • Sensitivity analyses
  • Active-competing comparator designs
  • Propensity scores
  • Marginal structural models
  • Instrumental variable analysis
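Of the conventional methods listed, stratified analysis is the easiest to show in a few lines. The sketch below uses invented counts in which a drug is preferentially given to patients with severe disease (channeling) and has no effect on the outcome within either severity stratum. It compares the crude risk ratio with a Mantel-Haenszel pooled risk ratio; the data and variable names are hypothetical.

```python
# Hypothetical cohort, stratified by disease severity.
# Each stratum: (exposed events, exposed total, unexposed events, unexposed total)
strata = {
    "severe": (80, 800, 20, 200),  # 10% risk in both groups
    "mild": (4, 200, 16, 800),     # 2% risk in both groups
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring the stratification variable."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)

print(f"Crude risk ratio:           {crude:.2f}")
print(f"Mantel-Haenszel risk ratio: {adjusted:.2f}")
```

With these made-up numbers the crude comparison suggests the drug more than doubles the risk, while the stratified estimate shows no association. Stratification only helps for confounders that were actually measured, which is one motivation for the other approaches on the list.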
Example: Sensitivity analysis

Setoguchi Am Heart J 2008;156:336-41

Example: Propensity score

Wiles Arthritis Rheum 2001;44:1033-42

Example: Marginal structural models to examine MTX and CV Death

Outcome                    Deaths   Hazard ratio (95% CI)
All Cause Mortality        191      0.8 (0.6-1.0)§
                                    0.4 (0.2-0.8)*
Cardiovascular Mortality   84       0.3 (0.2-0.7)*
Non-CV Mortality           107      0.6 (0.2-1.2)*

§ Unadjusted

* Adjusted for: age, sex, RF, calendar year, duration of disease, smoking, education, HAQ score, patient global assessment, joint counts, ESR, and prednisone status and number of other DMARDs used

Choi HK, et al. Lancet 2002;359:1173-7

Cohort studies example: Exposed cohort only
  • Usually prospective
  • Biologics registries by Pharma
    • All patients getting one or more biologics
    • Typically one-armed cohort: No comparator
      • 9882 patients on anti-TNF observed for ~2-3 years
      • 25 new onset psoriasis (what does this mean?)*
    • Total denominator known; total number of effects uncertain
  • Comparison data
    • External and typically not the same sampling frame as patients on biologics

* Harrison et al. Ann Rheum Dis April 2008.
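The question "what does this mean?" can be made concrete with a back-of-the-envelope rate. The sketch below uses only the two figures on the slide (9882 patients, 25 cases) and brackets the unknown person-time by assuming every patient was followed for exactly 2 or exactly 3 years. That assumption is a simplification for illustration and is not taken from the cited study.

```python
patients = 9882
new_psoriasis_cases = 25


def rate_per_1000_py(cases, n_patients, years_each):
    """Crude incidence rate, assuming equal follow-up for every patient."""
    person_years = n_patients * years_each
    return cases / person_years * 1000


# Shorter assumed follow-up gives less person-time and a higher rate.
rate_if_2_years = rate_per_1000_py(new_psoriasis_cases, patients, 2)
rate_if_3_years = rate_per_1000_py(new_psoriasis_cases, patients, 3)

print(f"Roughly {rate_if_3_years:.2f} to {rate_if_2_years:.2f} "
      f"cases per 1,000 person-years")
```

Even with a rate in hand, a one-armed cohort cannot say whether it is high or low. That judgment needs a comparison rate, ideally from patients drawn from the same sampling frame, which is the limitation the slide points to.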

Cohort studies example: Exposed & comparison cohort
  • NSAIDs and GI bleeding
    • Cohort of patients taking NSAID of interest compared with one or more other NSAIDs
    • Rate of GI bleeding during follow up period compared
  • Glucocorticoids and risk of CVD in RA patients
    • Cohort of RA patients taking glucocorticoids – comparison of users vs non-users
    • Rate of CVD during follow-up compared

Solomon et al. Arthritis Rheum 2006;54:1378-89

Davis et al. Arthritis Rheum 2007;56:820-830

Cohort studies example: Analysis within existing cohort
  • Identify general population cohort study where extensive longitudinal data available
    • Nurses Health Study, Framingham Study, Physician’s Health Study, National Databank of Rheum Diseases
  • REP - Rochester Epidemiology Project: Cohort is the Olmsted County population
  • Advantages: If the data are already collected, only the analysis is needed
  • Disadvantages: Biases, confounding relative to nature of population + lack of key covariates
Cohort studies example: Database cohort study
  • Most common form in pharmacoepidemiology
  • Usually retrospective, but can be mixed
  • Many large multi-purpose databases are used
    • HMO, Managed Care (Puget Sound, United Health Care)
    • Electronic medical records (GPRD, MediPlus)
    • Provincial health plans (Saskatchewan)
  • Advantages: large, data already exists, complete for billable services
  • Disadvantages: Claims ≠ diagnoses


Summary: Cohort studies. There is a difference between relative versus absolute risk

[Figure: two panels of rates across x-axis values 30 to 80 for two groups (e.g. RA and e.g. non-RA). One panel: rate difference increasing but rate ratio constant. Other panel: rate difference is constant but rate ratio decreasing.]
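The two patterns in the figure can be reproduced with made-up numbers. In the sketch below, the baseline rates are hypothetical and rise across four age groups; one scenario doubles the baseline rate in the exposed group, the other adds a fixed excess of 3 events per 1,000 person-years.

```python
# Hypothetical comparison-group rates (per 1,000 person-years), rising with age.
baseline = [1, 2, 4, 8]


def compare(exposed, unexposed):
    """Return (rate difference, rate ratio) for each pair of rates."""
    return [(e - u, e / u) for e, u in zip(exposed, unexposed)]


# Scenario 1: the exposed rate is always twice the baseline rate.
constant_ratio = compare([2 * r for r in baseline], baseline)

# Scenario 2: the exposed rate is always the baseline rate plus 3.
constant_difference = compare([r + 3 for r in baseline], baseline)

print("Constant ratio:     ", constant_ratio)
print("Constant difference:", constant_difference)
```

In the first scenario the ratio stays at 2 while the absolute excess grows from 1 to 8 per 1,000 person-years. In the second the excess stays at 3 while the ratio falls from 4 to about 1.4. Reporting only one of the two measures would hide half of the picture.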


Consider these before you believe the results!
  • If negative study
    • Power
    • Outcome & exposure definition
    • Comparison group
    • Non-differential misclassification
    • Replication
Consider these before you believe the results!
  • If positive study
    • Confounding
    • Channeling
    • Differential misclassification
    • Generalizability
    • Implications
    • Replication