Biostatistical Considerations


Presentation Transcript


  1. Biostatistical Considerations Nae-Yuh Wang, PhD ICTR Clinical Registries Workshop November 3, 2010

  2. OVERVIEW • Descriptive vs Analytic Goals • Selection of Controls • Confounding • Measurement errors • Missing data

  3. Purposes of Patient Registry • Document natural history of disease • Evaluate effectiveness of treatment • Monitor safety • Measure quality • Frequently serves multiple purposes, addressing scientific, clinical, and policy questions

  4. Natural History of Disease • Document characteristics, management, and outcomes • May vary across subgroups • May vary over time • May change after new guidelines or treatments are introduced

  5. SMOKING PREVALENCE, %, AMONG U.S. MALE ADULTS* AND 1,213 WHITE MALE PHYSICIANS: THE PRECURSORS STUDY  *CDC, National Health Interview Surveys, 18 years and older, 1965-1994

  6. Effectiveness of Treatment • RCTs usually have well-defined populations • RCTs usually are short term • Clinical effectiveness, cost effectiveness • Comparative effectiveness --- indirect comparisons of differences between treatments

  7. Safety Monitoring • Adverse event reporting relies on recognition of AE by clinician, and clinician’s effort in reporting --- frequently nonsystematic • Serves as active surveillance • Provides denominator to estimate incidence • Enables comparison to a reference rate
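
The denominator and reference-rate bullets above can be made concrete with a few lines of code. The sketch below is an added illustration, not part of the original slides: the event count, person-time, and reference rate are hypothetical, and an exact one-sided Poisson test is used as one reasonable way to compare the registry rate with an external reference.

```python
# Minimal sketch (not from the slides): estimate an AE incidence rate from registry
# person-time and compare it to an external reference rate with an exact Poisson test.
# The event count, person-years, and reference rate are hypothetical.
from scipy.stats import poisson

events = 18               # observed adverse events in the registry (hypothetical)
person_years = 2400.0     # denominator: total follow-up time in the registry
ref_rate = 0.005          # reference: 5 events per 1,000 person-years (hypothetical)

incidence = events / person_years
expected = ref_rate * person_years              # events expected under the reference rate
p_one_sided = poisson.sf(events - 1, expected)  # P(X >= observed) under Poisson(expected)

print(f"observed: {1000 * incidence:.2f} per 1,000 person-years")
print(f"expected under reference rate: {expected:.1f} events; one-sided p = {p_one_sided:.3f}")
```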

  8. Health Care Quality • Compare performance measures (treatments provided or outcomes achieved) against evidence based guidelines or benchmarks (adjusted survival, infection rates) between provider or patient subgroups • Identify disparity in access to care • Demonstrate opportunities for improvement • Establish payment differentials

  9. Types of Registry • Product registries (drug, device) • Health services registries (procedure) • Disease or condition registries • Patients defined by exposure to a product, procedure, or disease/condition • Frequently combination of types

  10. Design of Registry • Research questions, stakeholders, and practical factors (regulatory, political, funding) define purpose and type of registry, and other design considerations such as sampling plan, data collection, validity, sample size, and analytic approaches • Types define the patient population • Purposes define the outcomes • Outcomes define the duration

  11. Design of Clinical Research • Research questions (descriptive vs. hypothesis based --- analytic) • Population; outcome and exposure • Sampling (recruitment); measurements, duration and frequency of data collection • Internal / external validity (bias / generalizability) • Sample size (precision of estimates/degree of association, feasibility, resources) • Analytic plan

  12. Sampling Design • External validity (generalizability): all patients, patients from tertiary medical center only? Single center, multiple center? Which way is more representative of the target population under study? • Do I need controls? (CI in children, language outcomes vs. meningitis) • Selection of controls • Match or not to match?

  13. Cohort Design • Sampling based on predictor (exposure variables) of interest (collect as many exposure variables as possible). Good for rare exposure. • Follow up patients for outcomes, could study multiple outcomes (long time for outcomes to develop?) • Census (not feasible when population and per capita cost are large) • SRS, stratified RS (oversampling subgroup), Cluster RS (cluster characteristics as the aim), multistage RS • Nonrandom: case series / consecutive sampling
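
As a concrete illustration of stratified random sampling with oversampling of a rare exposure, the sketch below draws different sampling fractions by stratum and keeps the design weights for later analysis. It is a minimal example added for this transcript; the patient table, the 'exposed' flag, and the sampling fractions are hypothetical.

```python
# Minimal sketch (added illustration): stratified random sampling with oversampling
# of a rare exposure stratum. Column names and sampling fractions are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2010)
frame = pd.DataFrame({
    "patient_id": np.arange(10_000),
    "exposed": rng.binomial(1, 0.10, 10_000),   # rare exposure (~10% of patients)
})

fractions = {0: 0.05, 1: 0.50}                  # oversample the exposed stratum

parts = []
for stratum, frac in fractions.items():
    part = frame[frame["exposed"] == stratum].sample(frac=frac, random_state=1)
    part = part.assign(weight=1 / frac)         # design weight = 1 / inclusion probability
    parts.append(part)
sample = pd.concat(parts, ignore_index=True)

print(sample.groupby("exposed").size())         # roughly 450 unexposed, 500 exposed
```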

  14. Case-Control Design • Stratified RS based on case status • Oversampling cases, good for rare diseases • No long follow up for disease development • Study multiple exposure variables • Exposure ascertainment is key • Nested case-control study using existing registry • Selection of controls, match or not to match
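
For a nested case-control study drawn from an existing registry, one common control-selection approach is risk-set (incidence-density) sampling: for each case, a few controls are drawn from subjects still under follow-up at the case's event time. The sketch below is an added illustration on a simulated cohort; the cohort, the 4 controls per case, and the variable names are hypothetical.

```python
# Minimal sketch (added illustration) of risk-set sampling for a nested case-control
# study drawn from an existing registry/cohort.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 5000
cohort = pd.DataFrame({
    "id": np.arange(n),
    "time": rng.exponential(scale=10.0, size=n),  # follow-up time in years
    "event": rng.binomial(1, 0.05, size=n),       # 1 = case, 0 = censored
})

m = 4                                             # controls sampled per case
matched_sets = []
for case in cohort[cohort["event"] == 1].itertuples():
    # Risk set: everyone still under follow-up at the case's event time.
    at_risk = cohort[(cohort["time"] >= case.time) & (cohort["id"] != case.id)]
    controls = at_risk.sample(n=min(m, len(at_risk)), random_state=int(case.id))
    case_row = cohort[cohort["id"] == case.id]
    matched_sets.append(pd.concat([case_row, controls]).assign(set_id=case.id))

nested = pd.concat(matched_sets, ignore_index=True)
print(nested["set_id"].nunique(), "matched sets,", len(nested), "rows")
```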

  15. Measurement & Data Collection • Data from clinically based electronic sources only? • Linking from different sources (e.g., NDI searches) • Measurement (different labs) and coding consistency • Additional data collections --- potential confounders, nonclinical outcomes (e.g., QoL, QALY), medications

  16. Measurement & Data Collection • Research versus clinical protocol (BP, busy schedule) • New / changes in treatments and guidelines over time • Changes / improvement in measurement precision and generation of technology over time • Change of outcome definition over time (clinical designation or collect and record raw measures) • Analytic corrections could only be done if needed data / information are available

  17. Internal Validity --- Sources of Bias • Information bias: AEs under-reported if the reporter (provider) would be viewed negatively on care quality; self-reported weight • Selection bias: patients included are not representative (unintentional incentives for provider / patient), loss to follow-up, common exposure to an unaccounted confounder • Confounding by indication: newest drug given to patients with worse prognosis • Survival bias: must live long enough with the exposure to be selected

  18. Internal Validity --- Sources of Bias Confounding: • CVD risk, age, gray hair • Controlled by matching through study design • Accounted for through stratification, covariate adjustment, or propensity score adjustment during analyses • These only work if data on the confounders were collected --- needs to be considered at the design stage
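
As one way to make the stratification / propensity-score bullet concrete, the sketch below simulates confounding by indication, estimates a propensity score with a logistic model, and averages within-quintile treated-vs-untreated contrasts. It is an added illustration, not the analysis from the presentation; variable names, effect sizes, and the quintile stratification are all arbitrary choices (matching or covariate adjustment on the score are common alternatives).

```python
# Minimal sketch (added illustration): propensity-score stratification for measured
# confounders under simulated confounding by indication. True treatment effect = 1.0.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
age = rng.normal(60, 10, n)
severity = rng.normal(0, 1, n)

# Sicker / older patients are more likely to receive the treatment.
p_treat = 1 / (1 + np.exp(-(-0.5 + 0.03 * (age - 60) + 0.8 * severity)))
treat = rng.binomial(1, p_treat)
y = 1.0 * treat + 0.05 * age + 0.7 * severity + rng.normal(0, 1, n)

# Step 1: estimate the propensity score with a logistic regression on the confounders.
X_ps = sm.add_constant(np.column_stack([age, severity]))
ps = sm.Logit(treat, X_ps).fit(disp=0).predict(X_ps)

# Step 2: stratify on propensity-score quintiles and average within-stratum contrasts.
df = pd.DataFrame({"y": y, "treat": treat, "ps": ps})
df["stratum"] = pd.qcut(df["ps"], 5, labels=False)
effects = [g.loc[g["treat"] == 1, "y"].mean() - g.loc[g["treat"] == 0, "y"].mean()
           for _, g in df.groupby("stratum")]

crude = df.loc[df["treat"] == 1, "y"].mean() - df.loc[df["treat"] == 0, "y"].mean()
print(f"crude difference ~ {crude:.2f}, PS-stratified estimate ~ {np.mean(effects):.2f}")
```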

  19. Internal Validity --- Sources of Bias Measurement errors: • Mean of 3 repeatedly measured BP readings used in RCT versus single BP used in clinic • Measured versus self reported body weight • Fruit / vegetable availability in an area used as proxy measure of fruit / vegetable consumption value • Areas measured by 2nd vs. 1st generation CT

  20. Measurement Errors • Non-differential ME in the outcome causes no bias; greater variability in the outcome due to ME reduces statistical power • Differential ME in the outcome causes violation of the constant variance assumption in regression • Non-differential ME in a covariate causes underestimation of the association (bias towards the null)

  21. ME in Covariate β* = λβ, where λ = Var(X) / (Var(X) + Var(ε)) ≤ 1 (attenuation / reliability factor)
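
A quick simulation makes the attenuation visible: regressing the outcome on the error-prone measurement W instead of the true covariate X shrinks the slope by the factor λ. The code below is an added illustration with arbitrary values (equal covariate and error variances, so λ = 0.5 and a true slope of 2.0 attenuates to about 1.0).

```python
# Quick added simulation of classical ME in a covariate, illustrating beta* = lambda * beta.
import numpy as np

rng = np.random.default_rng(1)
n, beta = 100_000, 2.0
sigma_x, sigma_eps = 1.0, 1.0

x = rng.normal(0, sigma_x, n)            # true covariate
w = x + rng.normal(0, sigma_eps, n)      # error-prone measurement (classical model)
y = beta * x + rng.normal(0, 1, n)

beta_hat = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)   # regression of y on X
beta_star = np.cov(y, w, ddof=1)[0, 1] / np.var(w, ddof=1)  # regression of y on W
lam = sigma_x**2 / (sigma_x**2 + sigma_eps**2)

print(f"beta on X ~ {beta_hat:.2f}, beta* on W ~ {beta_star:.2f}, lambda*beta = {lam * beta:.2f}")
```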

  22. ME in Covariate Models • E ( Y | X ) = μ ( Xβ ) • Classical error model: W = X + ε , X || ε (Note: non-differential) • X the measured weight, W the self reported weight • X the measured BMI, W the self reported BMI • Berkson error model: X = W + ε , W || ε (Note: non-differential) • X the “true” F/V consumption, W the proxy value

  23. ME in Covariate Models • Goal: E(Y | X) = μ(Xβ) • Actual: E(Y | W) = μ(Wβ*) • Need to correct the estimate of β* to get a proper estimate of β • Need to quantify the ME so proper correction of β* is possible: Validation --- a subsample with both X and W; Replication --- repeated measures of W (e.g., BP); Transportability --- information from another study, if valid
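
Given a validation subsample with both X and W, regression calibration is one standard correction: fit E(X | W) in the validation data, impute it for everyone, and refit the outcome model. The sketch below is an added illustration on simulated data; the 20% validation fraction is arbitrary.

```python
# Minimal added sketch of regression calibration using a validation subsample.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n, beta = 5000, 2.0
x = rng.normal(0, 1, n)
w = x + rng.normal(0, 1, n)              # classical error model, W = X + eps
y = beta * x + rng.normal(0, 1, n)

val = rng.random(n) < 0.20               # validation subsample with both X and W observed

calib = sm.OLS(x[val], sm.add_constant(w[val])).fit()   # calibration model E(X | W)
x_hat = calib.predict(sm.add_constant(w))               # imputed E(X | W) for all subjects

naive = sm.OLS(y, sm.add_constant(w)).fit().params[1]
corrected = sm.OLS(y, sm.add_constant(x_hat)).fit().params[1]
print(f"naive beta* ~ {naive:.2f}, regression-calibrated beta ~ {corrected:.2f}")
```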

  24. ME in Covariate • Non-differential ME is a key assumption, not testable without validation data • When a covariate with ME is in the model, coefficients of covariates without ME may also be biased; the directions of such biases depend on the directions of association among Y and the covariates in the model • The ME model could be complicated: combined classical & Berkson error model, additive versus multiplicative ME • Differential ME: bias direction depends on how the ME relates to Y

  25. Design Considerations for ME • Conduct periodic validation study on small random sample of participants (e.g. self report vs. measured weight, outcomes coded by billing vs. coded under research protocol) • If not available from external sources, repeat assessments using old and new instruments in random sample of participants during transition to collect calibration data. • Sources of external validation/calibration data

  26. Missing Data • Inevitable in population research • Prevention is better than statistical treatments • Too much missing information invalidates a study • Validity of methods accommodating missing data depends on the missing data mechanism and the analytic approach

  27. Missing Data Mechanism • Missing completely at random (MCAR): Pr(missing) is unrelated to the process under study • Missing at random (MAR): Pr(missing) depends only on observed data → potential “ignorability” • Not missing at random (NMAR): Pr(missing) depends on both observed and unobserved data → non-ignorable
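
The three mechanisms can be contrasted with a small added simulation: generate a baseline and a follow-up outcome, then delete the follow-up value with probabilities that depend on nothing, on the observed baseline, or on the unobserved follow-up value itself. The missingness models and rates below are arbitrary illustrations; note how the complete-case mean holds up only under MCAR.

```python
# Added sketch contrasting MCAR, MAR, and NMAR on a baseline/follow-up outcome pair.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
y0 = rng.normal(0, 1, n)                      # baseline, always observed
y1 = 0.6 * y0 + rng.normal(0, 0.8, n)         # follow-up, subject to missingness

mcar = rng.random(n) < 0.25                               # unrelated to anything
mar = rng.random(n) < 1 / (1 + np.exp(-(y0 - 1)))         # depends on observed y0 only
nmar = rng.random(n) < 1 / (1 + np.exp(-(y1 - 1)))        # depends on unobserved y1 itself

print(f"full data: mean y1 = {y1.mean():.2f}")
for name, miss in [("MCAR", mcar), ("MAR", mar), ("NMAR", nmar)]:
    print(f"{name}: {miss.mean():.0%} missing, complete-case mean y1 = {y1[~miss].mean():.2f}")
```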

  28. Simulations • N = 100 with a repeated outcome (y0 at FV = 0, y1 at FV = 1); Group = 0, 1 (n = 50 / 50) • FV = 0: y0 ~ N(0, 1) if Group = 0; y0 ~ N(1, 1) if Group = 1 • FV = 1: y1 ~ N(0, 1) if Group = 0; y1 ~ N(1, 1) if Group = 1 • E(y0) = E(y1) = 0.5, SD(y0) = SD(y1) = 1.12 • Corr(y0, y1 | Group) = 0.6, Corr(y0, y1) = 0.68
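
The data-generating process on this slide can be reproduced directly. The sketch below is an added illustration (not the presenter's original code); it prints the marginal means, SDs, and correlation, which should be close to the 0.5, 1.12, and 0.68 quoted above, up to simulation noise with N = 100.

```python
# Added sketch reproducing the simulation setup described on this slide.
import numpy as np

rng = np.random.default_rng(2010)
n_per_group = 50
within_cov = np.array([[1.0, 0.6],
                       [0.6, 1.0]])            # Corr(y0, y1 | Group) = 0.6, unit SDs

rows = []
for group in (0, 1):
    y = rng.multivariate_normal(mean=[group, group], cov=within_cov, size=n_per_group)
    rows.append(np.column_stack([np.full(n_per_group, group), y]))
data = np.vstack(rows)                          # columns: Group, y0, y1

print("marginal means:", data[:, 1:].mean(axis=0))                    # ~0.5, ~0.5
print("marginal SDs:  ", data[:, 1:].std(axis=0, ddof=1))             # ~1.12
print("Corr(y0, y1):  ", np.corrcoef(data[:, 1], data[:, 2])[0, 1])   # ~0.68
```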

  29. Analytic Approach • Likelihood approach • Mixed effects models • Mean model = Intercept + FV versus Intercept + FV + Group • Correlation model: Working independent (WI) versus Unstructured (UN) • Model-based versus robust SE
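
One way to implement the comparison on this slide is with GEE in statsmodels, which accommodates both mean models, a choice of working correlation, and model-based versus robust standard errors. This is an added sketch, not the presenter's code; because there are only two time points, an exchangeable working correlation (a single free parameter) stands in here for the unstructured (UN) model, and "WI" denotes working independence.

```python
# Added sketch: fitting the two mean models under two working-correlation choices,
# reporting robust and model-based ("naive") standard errors for the FV effect.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2010)
group = np.repeat([0, 1], 50)
pairs = np.array([rng.multivariate_normal([g, g], [[1, 0.6], [0.6, 1]]) for g in group])

long = pd.DataFrame({
    "id": np.repeat(np.arange(100), 2),
    "fv": np.tile([0, 1], 100),
    "group": np.repeat(group, 2),
    "y": pairs.ravel(),
})

for mean_model in ["y ~ fv", "y ~ fv + group"]:
    for label, cov in [("WI", sm.cov_struct.Independence()),
                       ("EX", sm.cov_struct.Exchangeable())]:
        model = sm.GEE.from_formula(mean_model, groups="id", data=long, cov_struct=cov)
        robust, naive = model.fit(cov_type="robust"), model.fit(cov_type="naive")
        print(f"{mean_model:15s} {label}: FV effect {robust.params['fv']:+.3f} "
              f"(robust SE {robust.bse['fv']:.3f}, model-based SE {naive.bse['fv']:.3f})")
```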

  30. Simulations • Full sample vs. MCAR: 25% of FV = 1 outcomes missing completely at random

  31. Simulations

  32. Simulations • Full sample vs. MAR1: 25% of FV = 1 outcomes missing in Group 0

  33. Simulations

  34. Simulations • Full sample vs. MAR2: 25% of FV = 1 outcomes missing, with probability depending on the values of y0

  35. Simulations

  36. Simulations • Full sample vs. NMAR: 25% of FV = 1 outcomes missing, with probability depending on the values of y1

  37. Simulations

  38. Simulations

  39. Observations MCAR: • Requires only a correctly specified mean model for valid inferences • Complete-case analysis is valid, but not efficient --- partially observed cases are discarded even when estimating quantities involving fully observed variables • Approaches valid under MAR are also valid under MCAR • MCAR is unlikely to be true in population-based research

  40. Observations MAR: • Ignorability of the missingness is possible but not guaranteed • Requires correct specification of the likelihood (both the mean and the covariance model) for the observed data to achieve valid inferences • Cannot be confirmed empirically without auxiliary data

  41. Observations NMAR: • Cannot be ruled out empirically without auxiliary data • Likelihood, multiple imputation, propensity score, and inverse weighting approaches cannot completely eliminate the bias • Need to conduct sensitivity analyses under various plausible NMAR scenarios to evaluate potential impacts on inferences
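
A simple way to start such a sensitivity analysis is delta adjustment: impute the missing follow-up outcome under a MAR model, then shift the imputed values by a grid of offsets representing increasingly severe NMAR departures and watch how the estimate moves. The sketch below is a single-imputation illustration on simulated data (a full analysis would use multiple imputation); the data, missingness model, and delta grid are hypothetical.

```python
# Added sketch of a delta-adjustment sensitivity analysis for NMAR missingness.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 2000
group = rng.binomial(1, 0.5, n)
y0 = group + rng.normal(0, 1, n)
y1 = group + 0.6 * (y0 - group) + rng.normal(0, 0.8, n)

# NMAR missingness: larger (unobserved) y1 values are more likely to be missing.
miss = rng.random(n) < 1 / (1 + np.exp(-(y1 - 1)))

# MAR imputation model for y1 given the observed y0 and group.
X = sm.add_constant(np.column_stack([y0, group]))
fit = sm.OLS(y1[~miss], X[~miss]).fit()
y1_imp = y1.copy()
y1_imp[miss] = fit.predict(X[miss])

for delta in [0.0, 0.25, 0.5, 1.0]:            # delta = 0 corresponds to the MAR analysis
    shifted = y1_imp.copy()
    shifted[miss] += delta
    print(f"delta = {delta:.2f}: estimated mean of y1 = {shifted.mean():.3f}")
print(f"full-data mean of y1 (for reference) = {y1.mean():.3f}")
```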

  42. Observations Observational studies face issues with missing data similar to those in RCTs: • Bias due to missing data → selection bias • Proper selection of analytic models may eliminate bias if the “selection” is based on observed data values, i.e., we have the data needed to adjust for the selection • Bias due to “selection” according to data values not observed will be hard to correct

  43. Sample Size Considerations • Descriptive: estimation precision • Hypothesis based: power to detect association • Design effects • Longitudinal correlations
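
For instance, a descriptive aim might size the registry for the precision of a prevalence estimate and then inflate by a design effect for cluster sampling; the sketch below works through that arithmetic with hypothetical inputs (anticipated prevalence, margin of error, cluster size, and intraclass correlation). Repeated measurements and their longitudinal correlation would enter through a similar design-effect-type factor.

```python
# Added worked example: precision-based sample size for a prevalence estimate,
# then inflation by a design effect for cluster sampling. All inputs are hypothetical.
import math

z = 1.96                     # 95% confidence level
p = 0.20                     # anticipated prevalence
margin = 0.03                # desired half-width of the 95% CI

n_srs = z**2 * p * (1 - p) / margin**2          # simple random sampling
deff = 1 + (20 - 1) * 0.02                      # design effect: 20 per cluster, ICC = 0.02
n_cluster = n_srs * deff

print(f"SRS: n ~ {math.ceil(n_srs)}")
print(f"design effect {deff:.2f} -> cluster design: n ~ {math.ceil(n_cluster)}")
```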
