Methods to analyze real world databases and registries

Methods to analyze real world databases and registries. Hilal Maradit Kremers, MD MSc Mayo Clinic, Rochester, MN. Clinical Research Methodology Course NYU-Hospital for Joint Diseases December 11, 2008. Disclosure. Research funding from National Institutes of Health (RA) Amgen (psoriasis)

Methods to analyze real world databases and registries

Hilal Maradit Kremers, MD MSc

Mayo Clinic, Rochester, MN

Clinical Research Methodology Course

NYU-Hospital for Joint Diseases

December 11, 2008

Disclosure

Research funding from

  • National Institutes of Health (RA)

  • Amgen (psoriasis)

  • Pfizer (pulmonary arterial hypertension)

Outline

  • Terminology

  • Clinical trials versus observational studies and registries

  • Types of observational studies in rheumatic diseases

    • Descriptive epidemiology (incidence, prevalence)

    • Disease definitions (i.e. classification criteria)

    • Examining outcomes (including effectiveness of therapy) and risk factors (environmental, genetic)

  • Tips when interpreting results

Terminology

“Real-world databases” & registries




Terminology of related observational research disciplines

[Diagram; only surviving label: Health Services]












Terminology: Clinical medicine versus epidemiology

Clinical medicine

  • Natural history of the disease

  • Signs and symptoms

  • Diagnosis (how and when)

  • Current clinical practice

  • Clinical literature

  • Drug-induced illnesses

Epidemiology

  • Distribution and determinants of diseases in populations

    • Study design

    • Data collection

    • Measurement

    • Analyses

    • Interpretation

    • Critical review

Clinical trials versus observational studies and registries

[Figure: study-design diagrams; surviving labels: Exposure +, Exposure -, Disease +, Disease -]

Why do we need registries?

  • Limitations of pre-marketing trials

  • Unresolved issues from pre-marketing studies

  • New signals or inconsistent signals from post-marketing surveillance

  • Evolving concerns about safety

  • Establishing risk-benefit margins

  • Learn about use, Rx decisions, compliance and other physician/patient behaviors

  • To evaluate a risk management program

Clinical trial vs observational studies/registries – four “toos”

  • Too few

  • Too brief

  • Too simple

  • Too median-aged

Implications of four “toos”

  • Relative effectiveness unknown

    • Effectiveness in comparison to alternative therapies

  • Surrogate vs. clinical endpoints

    • Bone mineral density, blood pressure, lipid levels, tumor size, joint counts vs radiographic damage

  • Infrequent adverse events

  • Long latency adverse events

    • DES & adenocarcinoma of vagina

  • Special populations

    • Women, children, elderly, multiple comorbidities

  • Drug use in clinical practice

What is a registry?

  • Definition of a registry

    • An organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition or exposure, and that serves a predetermined scientific, clinical, or policy purpose(s).

  • Different types of registries

    • Disease registry

    • Product registry

    • Health services registry

  • Pregnancy registries

Registries for Evaluating Patient Outcomes. AHRQ Publication No. 07-EHC001. May 2007.

Purpose of a registry

  • Describe the natural history of disease

  • Determine clinical effectiveness or cost effectiveness of health care products, drugs and services

  • Measure or monitor safety and harm

  • Measure quality of care

Registry types

  • Disease registry

    • Patients who have the same diagnosis

    • e.g. all RA or SLE patients or rheumatic diseases

  • Product registry

    • Patients who have been exposed to biopharmaceutical products or medical devices

  • Health services registry

    • Patients who have had a common procedure, clinical encounter or hospitalization (TKA-THA registries)

Registries useful when:

  • Outcome is relatively common, well-defined and ascertainable & serious

  • Extensive drug exposure

  • Appropriate reference group

  • Data on relevant covariates ascertainable

  • Minimal channeling (preferential prescribing of a new drug to patients at a higher risk)

  • Minimal confounding by indication

  • Onset latency <2-3 years

  • Required drug exposure <2-3 years

  • Not an urgent drug safety crisis

Registries may not be useful when:

  • Outcome: poorly-defined, difficult to validate outcomes (depression, psychosis)

  • Exposure

    • Rare drug exposure

    • Intermittent exposure

    • OTC drugs, herbals

  • Significant confounding by indication

    • Antidepressants and suicides

    • Inhaled beta-agonists and asthma death

  • Certain settings

    • Specialty clinics, in-hospital drug use

Consequences of not doing registries or observational studies

  • Arguing over case reports

  • Lack of data on real benefit-risk balance

  • Less effective and usually biased decision-making

  • Possibly false conclusions

  • Lawsuits

Observational study designs

[Diagram labels: Drug exposed patients; Case reports; Case series; Ecological studies; Exposed vs. unexposed; Prospective cohort]

Ecological studies – time series studies

  • When drug is predominant cause of the disease

  • Changes in outcomes following an abrupt change in drug exposure, as result of a policy or regulatory change, publications, media coverage

  • Reported Cases of Reye's Syndrome in Relation to the Timing of Public Announcements

Belay et al. NEJM 1999; 340:1377

Ecological studies – time series studies: Secular trends in NSAID use and colorectal cancer incidence

Lamont: Cancer J 2008:14(4):276-277

Ecological studies – time series studies: Rofecoxib-celecoxib and myocardial infarction

Brownstein et al. PLoS ONE. 2007:2(9):e840.

Summary: ecological studies

Limitations

  • Complexity of disease causation

  • Confounding by the “ecological fallacy”

Strengths

  • Cost ↓, time ↓, using routinely collected data

  • New hypotheses about the causes of a disease and new potential risk factors (e.g. air pollution)

  • Provides estimates of causal effects that are not attenuated by measurement error

  • Some risk factors for disease operate at the population level (e.g. socioeconomic status)

Studies on descriptive epidemiology of rheumatic diseases




Prevalence: Proportion of individuals in a defined population who have a particular disease at a given point in time

[Figure: population on 1/1/2005; diseased (RA) individuals, N=9]

Prevalence = 9/100

Prevalence = Incidence of disease × Duration of disease
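The two prevalence formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original talk: the function names are made up for the example, the incidence and duration values are hypothetical, and the incidence × duration shortcut is only an approximation that assumes a stable, low-prevalence disease.

```python
def point_prevalence(cases: int, population: int) -> float:
    """Proportion of a defined population with the disease at one point in time."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population


def steady_state_prevalence(incidence_rate: float, mean_duration_years: float) -> float:
    """Approximate prevalence as incidence x duration (assumes a stable, rare disease)."""
    return incidence_rate * mean_duration_years


# Slide example: 9 diseased individuals in a population of 100 on 1/1/2005.
prevalence = point_prevalence(9, 100)
print(f"Point prevalence: {prevalence:.1%}")

# Hypothetical inputs: 2 new cases per 100 person-years, mean duration 4.5 years.
approx = steady_state_prevalence(0.02, 4.5)
print(f"Approximate prevalence from incidence x duration: {approx:.1%}")
```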

Incidence: Proportion of new cases of a disease or health-related condition in a population-at-risk over a specified period of time

[Figure: population on 1/1/2005 followed for 1 year; diseased individuals on 1/1/2005 (prevalent cases) are excluded, leaving N=91 at risk; new-onset disease during the 1-year follow-up]

Incidence = 2 cases/91 person-years
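The arithmetic behind the incidence example can be written out as follows. This is a sketch under the slide's simplification that each of the 91 at-risk people contributes one full person-year; the function name is illustrative.

```python
def incidence_rate(new_cases: int, person_years: float) -> float:
    """New cases divided by the person-time at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years


population = 100        # residents on 1/1/2005
prevalent_cases = 9     # already diseased, so not at risk of new onset
at_risk = population - prevalent_cases

new_cases = 2           # new-onset disease during the 1-year follow-up
# Simplification from the slide: every at-risk person is followed for a full year.
rate = incidence_rate(new_cases, at_risk * 1.0)
print(f"Incidence: {rate * 1000:.1f} per 1,000 person-years")
```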


Incidence of RA in Olmsted County, MN (1955-2005)

Gabriel et al. A&R 2008: 58(9):S453

Incidence of PsA by age and sex (1970-2000)

[Figure: incidence rate (per 100,000) by age and sex]

Wilson et al. AC&R 2009: in press.

Incidence study requires keeping track of both the numerator & denominator!

[Figure: population on 1/1/2005 followed over successive 1-year intervals]

  • Residents who die or move out of the city

  • New residents (i.e. new folks who move into the city)

  • All new-onset disease while living in the city

  • Possible in few locations in the world
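Tracking the denominator means accumulating person-time only while each resident is present and still disease-free. The sketch below shows one way to do that; the `Resident` record, the `follow_up` helper and all dates are hypothetical and chosen only to illustrate move-ins, move-outs, prevalent cases and new-onset disease.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Resident:
    entered: date                  # date residency began
    left: Optional[date] = None    # death or move-out date, if any
    onset: Optional[date] = None   # disease onset date, if any


def follow_up(resident, start, end):
    """Return (person-years at risk, counted as incident case?) within the study window."""
    begin = max(resident.entered, start)
    # Time at risk stops at the first of: study end, leaving the city, disease onset.
    stop = min([end] + [d for d in (resident.left, resident.onset) if d is not None])
    if stop <= begin:
        return 0.0, False          # prevalent case, or not resident during the window
    is_case = resident.onset is not None and resident.onset == stop
    return (stop - begin).days / 365.25, is_case


start, end = date(2005, 1, 1), date(2006, 1, 1)
residents = [
    Resident(entered=date(2000, 1, 1)),                           # full year at risk
    Resident(entered=date(2000, 1, 1), left=date(2005, 7, 1)),    # moved out mid-year
    Resident(entered=date(2005, 10, 1)),                          # moved in late
    Resident(entered=date(2000, 1, 1), onset=date(2005, 4, 1)),   # new-onset case
    Resident(entered=date(1990, 1, 1), onset=date(2003, 5, 1)),   # prevalent case
]

results = [follow_up(r, start, end) for r in residents]
person_years = sum(py for py, _ in results)
cases = sum(1 for _, is_case in results if is_case)
print(f"{cases} case(s) over {person_years:.2f} person-years")
```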

Mortality analyses

  • RA: 124 studies in 84 unique cohorts¹

  • 15 key points in interpretation¹

    • Incident vs prevalent cases

    • Population-based vs clinic-based

    • SMR

  • Cause-specific mortality²

  • 3 time dimensions in mortality analyses³

    • Duration of RA

    • Timing of onset of RA relative to death

    • Calendar time

¹ Sokka et al. Clinical Exp Rheum 2008; 26 (Suppl. 51): S35-S61

² Aviña-Zubieta et al. A&R 2008; 59: 1690-1697

³ Ward. A&R 2008; 59: 1687-1689

Mortality in incidence cohorts < prevalence cohorts

1 Sokka et al. Clinical Exp Rheum 2008; 26 (Suppl. 51): S35-S61

2 Aviña-Zubieta et al. A&R 2008; 59: 1690-1697

Referral bias: Population-based vs clinic-based cohorts

[Figure labels: Reality in the population; Mild disease; What the GP sees; What the rheumatologist sees! N=40]

SMR (standardized mortality ratio)

  • Observed deaths ÷ expected deaths

  • Strongly influenced by choice of data to calculate expected deaths

    • Age and gender specific

    • Time period

    • Complete follow-up

  • Example:

    • RA cohort assembled between 1970-1990 with follow-up until 2000

    • Expected mortality derived from US mortality rates between 1970-1990
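The point of the example above is that expected deaths depend on which reference rates are applied to the cohort's person-years. A minimal sketch of the calculation follows; all person-years, rates and the observed death count are hypothetical numbers chosen for illustration, and the stratum keys (age band, sex, calendar period) are an assumed layout.

```python
def expected_deaths(person_years_by_stratum, reference_rates):
    """Sum of cohort person-years x reference mortality rate over all strata."""
    return sum(py * reference_rates[stratum]
               for stratum, py in person_years_by_stratum.items())


def smr(observed, expected):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return observed / expected


# Hypothetical cohort person-years by (age band, sex, calendar period).
person_years = {
    ("65-74", "F", "1970-1990"): 1000.0,
    ("65-74", "F", "1991-2000"): 1000.0,
}

# Hypothetical reference rates (deaths per person-year), falling over time.
matched_rates = {
    ("65-74", "F", "1970-1990"): 0.030,
    ("65-74", "F", "1991-2000"): 0.020,
}
# Mismatch as in the slide example: 1970-1990 rates applied to all follow-up.
outdated_rates = {stratum: 0.030 for stratum in person_years}

observed = 75  # hypothetical observed deaths in the cohort
smr_matched = smr(observed, expected_deaths(person_years, matched_rates))
smr_outdated = smr(observed, expected_deaths(person_years, outdated_rates))
print(f"SMR with period-matched rates: {smr_matched:.2f}")
print(f"SMR with outdated rates:       {smr_outdated:.2f}")
```

With falling background mortality, the outdated rates overstate expected deaths, so the SMR comes out lower than with period-matched rates.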

Trends in RA Mortality vs. Expected*

[Figure: two panels plotting mortality rate (per 100 py) against calendar year]

Gonzalez A, et al. Arthritis Rheum 2007;56(11):3583-3587

Observed: expected mortality in RA

[Figure: survival (%) by years after RA incidence; curves: Expected (non-RA), Observed (RA)]

Gabriel et al. A&R 2003; 48:54-58

Time: disease duration and CV mortality in RA

Maradit Kremers A&R 2005; 52: 722-732

Summary: incidence, prevalence and mortality


  • Underlying data source

    • Population-based or not

    • Incident vs prevalent cases

  • Methodology

    • Case ascertainment

    • Completeness of follow-up

  • Comparison data!

Disease definitions and classification criteria in rheumatic diseases

Developed using observational study methodologies

Dynamic nature of rheumatic diseases

[Figure: cumulative incidence (%) by years since RA incidence; curves for 2 or more, 3 or more, 4 or more, 5 or more]

  • 25% of those who initially met RA criteria still had evidence of RA 3-5 years later

O’Sullivan et al. Ann Intern Med 1972; 76: 573-7.

Mikkelsen et al. A&R 1969; 12: 87-91.

Lichtenstein et al. J Rheumatol 1991; 18: 989-93.

Icen et al. J Rheumatol 2008.

Typical vs desired methodology for classification criteria

[Diagram labels: Patients with established disease; Patients with other established rheumatic diseases; Compare characteristics; Patients with new-onset disease; Patients with other new-onset rheumatic diseases; Observe disease evolution]

Examining outcomes and risk factors in rheumatic diseases

Cohort Studies (outcomes)

Registries (outcomes)

Case-control studies (risk factors)

Types of cohort studies

  • Designated by the timing of data collection relative to when the investigator begins the study:

    • Prospective

    • Retrospective (historical)

    • Mixed

  • Mayo studies: retrospective

  • Registries: prospective

Types of cohort studies

[Figure: timelines marking the point at which the investigator begins the study for each design, including mixed (P+R)]


All designs feasible either as ad hoc registries or in automated database studies.

Cohort study: design options

  • Prospective vs. retrospective

  • Entry into cohort: closed or open

  • Timing of exposure: new users or not

  • Source of un-exposed cohort

    • Internal

    • External

      • drug exposed subjects only, registries

Cohort study: steps

  • Cohort identification

    • Define subjects & follow-up period

  • Risk factor/drug exposure measurement throughout follow-up

  • Outcome (disease) ascertainment

  • Confounder measurements (throughout follow-up)

  • Analysis

Step 1 - Cohort identification

  • Trade-off between external and internal validity

  • Retrospective vs. prospective

    • Consider feasibility and costs

  • Follow-up

    • Tracking of drug changes over time

    • Losses to follow-up, esp. if likely to be differential (different for drug users and non-users)

Step 2 – Risk factor/Drug exposure measurement

  • New versus old users

    • Ability to account for confounders measured before the drug was started

    • Ability to quantify outcomes early after starting the drug (compliance, early drop-offs due to intolerance)

  • Incomplete drug exposure

    • e.g. one-time measurement of DMARD use and mortality

  • Drug exposure metric

    • Ever/never, dose (average, cumulative), duration

  • Reference group

    • Non-users, past users, users of other drugs

  • Misclassification of episodic use

Step 2 - Timing: patterns of drug use




Step 2 - Drug exposure measurement methods

  • Interviews

    • Face-to-face, phone or self-administered

    • Excellent to capture current use but not for past use or changing drug use over time

    • Loss of memory – cognitively intact subjects & regularly used drugs

  • Biological testing

    • Blood or urine

    • Excellent to capture current use but not for past use

    • Non-differential (unless disease affects the assay)

  • Pharmacy or claims records

  • Medical records

Step 2 - Pharmacy or claims records for drug exposure

  • Drugs obtained by prescription

  • Drug details available

  • Accurate & complete for both past and current drug exposure

  • Temporal tracking possible

  • Limitation: compliance

    • Prescription filled vs. drug actually taken

  • Validation studies are necessary

Step 2 - Misclassification of drug exposure

[Figure: timeline comparing the patient's actual drug use with what claims data record]

  • Labels along the timeline: MD prescription; free sample; 15 days; Rx fill for 30 days; patient used for 40 days; refill for 30 days; used 20 days

  • Labels for claims data: 30 days; 30 days; +15 days rule
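One common way to turn dispensing claims into exposure time is to extend each fill by a grace period and merge overlapping windows, which is how the "+15 days rule" is read here. The sketch below is an assumption about that rule, not the speaker's algorithm: the function name and fill dates are hypothetical, and early refills (stockpiling) and free samples are ignored, which is exactly where misclassification arises.

```python
from datetime import date, timedelta


def exposure_episodes(fills, grace_days=15):
    """Merge (fill_date, days_supply) claims into continuous exposure episodes.

    Each fill is assumed to cover days_supply + grace_days from the fill date;
    a new fill inside the current window extends the same episode.
    """
    episodes = []
    for fill_date, days_supply in sorted(fills):
        covered_until = fill_date + timedelta(days=days_supply + grace_days)
        if episodes and fill_date <= episodes[-1][1]:
            start, end = episodes[-1]
            episodes[-1] = (start, max(end, covered_until))
        else:
            episodes.append((fill_date, covered_until))
    return episodes


# Hypothetical claims: two 30-day fills close together, then a later isolated fill.
fills = [
    (date(2008, 1, 1), 30),
    (date(2008, 2, 10), 30),
    (date(2008, 6, 1), 30),
]
episodes = exposure_episodes(fills)
for start, end in episodes:
    print(f"Exposed from {start} to {end}")
```

Claims see only the fills, so a patient who stretches 30 tablets over 40 days, or stops after 20, is still classified by the dispensed supply plus the grace period.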

Step 2 summary: Aspects of drug exposure measurement

  • Completeness & accuracy

  • Response rate

  • Change over time

  • Special populations

  • Details of the drug

  • Details of utilization

  • Availability & cost (reimbursement)

  • Differential or non-differential

Step 3 – Outcome ascertainment

  • Low specificity – methods used to find outcomes incorrectly include subjects without the outcome

    • Validation of outcomes in database studies

  • Low sensitivity - incomplete (and potentially differential) identification of outcomes

    • Increased diagnostic surveillance (e.g. NSAIDs and GI events)

    • Under-diagnosed & un-treated conditions

  • Timing of disease onset (protopathic bias)
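The effect of imperfect sensitivity and specificity on a cohort risk ratio can be checked with simple arithmetic. The sketch below assumes non-differential misclassification (the same sensitivity and specificity in exposed and unexposed subjects); the function names and the risks of 10% and 5% are hypothetical.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk as measured: true cases detected plus false positives among non-cases."""
    return true_risk * sensitivity + (1 - true_risk) * (1 - specificity)


def observed_risk_ratio(risk_exposed, risk_unexposed, sensitivity, specificity):
    """Risk ratio after applying the same outcome misclassification to both groups."""
    return (observed_risk(risk_exposed, sensitivity, specificity)
            / observed_risk(risk_unexposed, sensitivity, specificity))


# Hypothetical true risks: 10% in exposed, 5% in unexposed (true risk ratio 2.0).
rr_low_sensitivity = observed_risk_ratio(0.10, 0.05, sensitivity=0.80, specificity=1.00)
rr_low_specificity = observed_risk_ratio(0.10, 0.05, sensitivity=1.00, specificity=0.95)
print(f"Sensitivity 80%, specificity 100%: RR = {rr_low_sensitivity:.2f}")
print(f"Sensitivity 100%, specificity 95%: RR = {rr_low_specificity:.2f}")
```

Under these assumptions, missed cases alone leave the risk ratio unchanged, while a small loss of specificity pulls it toward 1, which is one reason outcome validation matters in database studies.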

Step 3 challenges: Protopathic bias

[Figure: timeline with labels Stomach pain and Study start]

nsNSAID = non-specific NSAID

Step 3 – Outcome of interest in rheumatology

  • Beneficial effects/effectiveness

    • Disease progression

  • Adverse effects

    • Mortality

    • Cardiovascular morbidity

    • Infections

    • Lymphomas & solid malignancies

    • Autoimmunity

    • GI events (NSAIDs)

    • Pregnancy outcomes

Step 3 - Consistency in outcome definitions – infections in RA

Askling: Curr Opin Rheumatol 2008; 20(2): 138–144

Step 3 challenges: Differential misclassification of outcome

  • Cohort study: may result from misclassifying outcome-free subjects as having the outcome (specificity) or from incomplete identification of persons with the outcome (sensitivity), to a different degree in exposed and unexposed subjects

    • Under-diagnosed conditions

    • Example: Patients with RA, especially those on biologics, are more likely to see their doctors often and more likely to have labs done or be examined for CVD

  • Using medication-taking as a surrogate of outcome can be problematic

Step 3 – Outcome ascertainment: Competing risk of death

Melton et al. Osteoporos Int. 2008 Sep 17.

Step 4 – Confounder measurements

What is a confounder?

  • The clinical condition which determines drug selection (channeling) and is linked to the adverse event

    • Indication

    • Severity

    • Contraindication

[Diagram labels: Drug Exposure; Adverse event; Confounder: INDICATION]

Step 4 - Confounding by indication

  • The indications for drug use, because of their natural association with prognosis, may confound the comparison so that it looks as if the treatment causes the disease

    “You’d better avoid antihypertensive treatment because treated patients have higher stroke rates”

Step 4 - Confounding by disease severity

  • The severity of RA is a confounder because:

    • Associated with use of biologics

    • Independent risk factor for CVD

    • Not in causal pathway

[Diagram; surviving label: Rheumatoid arthritis (RA) severity]


Step 4 - Confounding by contraindication

  • MD’s perception of the patient’s tendency to develop peptic ulcer & bleeding is a confounder because:

    • Associated with NSAID choice

    • Independent risk factor for GI bleeding

    • Not in causal pathway

[Diagram labels: Celebrex vs; GI bleeding; MD perception of risk]

Step 4 - Confounding by indication

  • Prescription Channeling

    • New versus older products

    • Example: Comparison of the risk of upper GI bleeding among coxibs versus traditional NSAIDs

      • Coxibs preferentially prescribed to patients at high risk for upper GI bleeding

Moride et al. Arthritis Res Ther. 2005;7:R333-342.

Step 4 – Extent of confounding by indication

[Figure: potential for confounding by indication in relation to intentionality of treatment effect by prescriber; examples shown: coxibs and GI events, coxibs and CV events]

Schneeweiss. Clin Pharmacol Ther 2007: 82:143–156

Step 5 - Analysis

  • Conventional methods to control for confounding

    • Randomization (clinical trials)

    • Restriction - homogeneous study population

    • Matching - select controls comparable to cases re. confounders

    • Stratified analysis

    • Statistical modeling

  • Sensitivity analyses

  • Active-competing comparator designs

  • Propensity scores

  • Marginal structural models

  • Instrumental variable analysis
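Of the conventional methods listed, stratified analysis is the easiest to show in a few lines. The sketch below uses a Mantel-Haenszel risk ratio to remove confounding by disease severity; the counts are hypothetical and were constructed so that the drug has no effect within either severity stratum, yet is given mostly to severe patients.

```python
def crude_risk_ratio(strata):
    """Risk ratio ignoring strata; each stratum is
    (exposed cases, exposed total, unexposed cases, unexposed total)."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata of the confounder."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Hypothetical cohort, stratified by disease severity (the confounder).
strata = [
    (80, 800, 20, 200),   # severe disease: risk 10% in both groups
    (4, 200, 16, 800),    # mild disease: risk 2% in both groups
]
crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"Crude risk ratio:           {crude:.2f}")
print(f"Mantel-Haenszel risk ratio: {adjusted:.2f}")
```

The crude comparison suggests the drug more than doubles the risk, while the stratified estimate shows no association, which is the pattern confounding by indication or severity produces.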

Example: Sensitivity analysis

Setoguchi Am Heart J 2008;156:336-41

Example: Propensity score

Wiles Arthritis Rheum 2001;44:1033-42

Example: Marginal structural models to examine MTX and CV death

Outcome                    N     Hazard ratio (95% CI)
All-cause mortality        191   0.8 (0.6-1.0)§
                                 0.4 (0.2-0.8)*
Cardiovascular mortality   84    0.3 (0.2-0.7)*
Non-CV mortality           107   0.6 (0.2-1.2)*

§ Unadjusted

* Adjusted for: age, sex, RF, calendar year, duration of disease, smoking, education, HAQ score, patient global assessment, joint counts, ESR, prednisone status and number of other DMARDs used

Choi HK, et al. Lancet 2002;359:1173-7

Cohort studies example: Exposed cohort only

  • Usually prospective

  • Biologics registries by Pharma

    • All patients getting one or more biologics

    • Typically one-armed cohort: No comparator

      • 9882 patients on anti-TNF observed for ~2-3 years

      • 25 new onset psoriasis (what does this mean?)*

    • Total denominator known; ?total # effects

  • Comparison data

    • External and typically not the same sampling frame as patients on biologics

* Harrison et al. Ann Rheum Dis April 2008.
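The counts on this slide can at least be turned into a crude rate. The sketch below brackets it under the assumption, made only for illustration, that every patient contributed between 2 and 3 full years of follow-up; the actual person-years are not given on the slide.

```python
def rate_per_1000_person_years(cases, persons, mean_years_followed):
    """Crude incidence rate per 1,000 person-years."""
    return 1000 * cases / (persons * mean_years_followed)


cases, persons = 25, 9882   # figures quoted on the slide

# Assumption: mean follow-up somewhere between 2 and 3 years per patient.
upper = rate_per_1000_person_years(cases, persons, 2.0)
lower = rate_per_1000_person_years(cases, persons, 3.0)
print(f"Crude rate: roughly {lower:.2f} to {upper:.2f} per 1,000 person-years")
```

Without a comparator cohort from the same sampling frame, a rate of this size cannot by itself show whether psoriasis is more or less frequent than expected, which is the question the slide raises.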

Cohort studies example: Exposed & comparison cohort

  • NSAIDs and GI bleeding

    • Cohort of patients taking NSAID of interest compared with one or more other NSAIDs

    • Rate of GI bleeding during follow-up period compared

  • Glucocorticoids and risk of CVD in RA patients

    • Cohort of RA patients taking glucocorticoids – comparison of users vs non-users

    • Rate of CVD during follow-up compared

Solomon et al. Arthritis Rheum 2006;54:1378-89

Davis et al. Arthritis Rheum 2007;56:820-830

Cohort studies example: Analysis within existing cohort

  • Identify a general population cohort study where extensive longitudinal data are available

    • Nurses' Health Study, Framingham Study, Physicians' Health Study, National Data Bank for Rheumatic Diseases

  • REP - Rochester Epidemiology Project: Cohort is the Olmsted County population

  • Advantages: if data are already collected, only analysis is needed

  • Disadvantages: biases and confounding related to the nature of the population + lack of key covariates

Cohort studies example: Database cohort study

  • Most common form in pharmacoepidemiology

  • Usually retrospective, but can be mixed

  • Many large multi-purpose databases are used

    • HMO, Managed Care (Puget Sound, United Health Care)

    • Electronic medical records (GPRD, MediPlus)

    • Provincial health plans (Saskatchewan)

  • Advantages: large, data already exist, complete for billable services

  • Disadvantages: Claims ≠ diagnoses


Summary: Cohort studies. There is a difference between relative versus absolute risk

[Figure: two panels with axis ticks 30 to 80 and groups labeled (e.g. RA) and (e.g. non-RA). Panel 1: rate difference increasing but rate ratio constant. Panel 2: rate difference is constant but rate ratio decreasing]
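The two patterns in the figure can be reproduced with made-up rates. In the sketch below, the axis values 30 to 80 are treated as ages and all rates are hypothetical numbers per 1,000 person-years, chosen only to show how the ratio and the difference can move in different directions.

```python
def rate_ratio(exposed, unexposed):
    """Relative measure: how many times higher the rate is in the exposed group."""
    return exposed / unexposed


def rate_difference(exposed, unexposed):
    """Absolute measure: excess events per unit of person-time."""
    return exposed - unexposed


ages = [30, 40, 50, 60, 70, 80]            # assumed meaning of the axis ticks
unexposed = [1, 2, 4, 8, 16, 32]           # hypothetical rates, e.g. non-RA

# Pattern 1: exposed rate is always double, so the ratio stays at 2
# while the difference grows with the baseline rate.
exposed_ratio_constant = [2 * r for r in unexposed]

# Pattern 2: exposed rate is always 5 higher, so the difference stays at 5
# while the ratio shrinks as the baseline rate rises.
exposed_diff_constant = [r + 5 for r in unexposed]

for age, u, e1, e2 in zip(ages, unexposed, exposed_ratio_constant, exposed_diff_constant):
    print(f"age {age}: "
          f"pattern 1 RR={rate_ratio(e1, u):.1f} RD={rate_difference(e1, u)}; "
          f"pattern 2 RR={rate_ratio(e2, u):.2f} RD={rate_difference(e2, u)}")
```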


Summary: Keep in mind the major differences among registries!

Consider these before you believe the results!

  • If negative study

    • Power

    • Outcome & exposure definition

    • Comparison group

    • Non-differential misclassification

    • Replication

Consider these before you believe the results!

  • If positive study

    • Confounding

    • Channeling

    • Differential misclassification

    • Generalizability

    • Implications

    • Replication