
Longitudinal Data Fall 2002



Presentation Transcript


  1. Longitudinal Data, Fall 2002. Lecture 1: Overview of Semester

  2. Introduction and Overview • Description of the class of statistical studies that we will focus on in this course. • Overview of the various types of longitudinal data outcomes and statistical models. • Specific examples of longitudinal data. • A demonstration of the pitfalls of ignoring the repeated measures structure of the data (one of the punch lines of the course).

  3. Longitudinal Data • Focuses on changes in outcome variable and what explains such changes • Outcome can be simple (does the subject have the disease or not?) or more complex (CD4 levels in an HIV-infected subject) • Usually involves multiple observations on each subject

  4. Description of the types of data/models covered • Single event data using survival analysis. Examples: time to death, time to tumor recurrence. • Multiple event data using Poisson and negative binomial regression. Examples: number of sex partners, number of seizures, number of trips to the fridge. • Repeated measures: continuous outcome data using mixed (explained later) linear regression models as well as ad hoc data reduction techniques; binary and count data using extensions of logistic and Poisson regression.

  5. Other Topics (time permitting) • Repeated measures serially made on only one unit (time series) • Causal inference in longitudinal data (what to do with time-dependent confounders) • Causal graph theory • Marginal structural models

  6. Example of Causal Diagram from HIV Study

  7. Disease Incidence Data • Individuals followed over time to detect change from disease-free state to incident case status. • Information on change all captured by when subject becomes incident case. • Leads to study of cumulative incidence, incidence and hazard rates. • Logistic regression, Poisson regression, and the Proportional hazards model.

  8. Disease Incidence or Mortality Data • Outcome data is a simple binary indicator of whether disease has begun or not. • So longitudinal information on Y is particularly simple. • If we know the time $t_D$ when the individual becomes diseased, then we know Y for all time: $Y(t) = 0$ for $t < t_D$ and $Y(t) = 1$ for $t \ge t_D$. [Figure: Y plotted against time, stepping from 0 to 1 at $t_D$]

  9. Repeated Measures Data • More complex outcomes • Multiple observations per subject • Variants of linear regression models for continuous outcomes that involve thinking about the correlation structure across a subject’s measurements • Variants of logistic regression models with repeated binary outcomes

  10. Repeated measurements structure • Time when the measurement is made (time-structured, longitudinal studies) • Place (position, region) where the measurement is made (spatial data) • Subunit on which the measurement is made • Combinations

  11. Examples of combinations • Measurements over time at various positions (or in various regions) on the study unit • Measurements over time on various subunits • Measurements on subunits of subunits (hierarchical data structures) • In experimental studies, the repeated measurements over time and/or place and/or subunit may also occur under different experimental conditions.

  12. Working definition of repeated measures • The same outcome characteristic is measured more than once for each first-level study unit (in epidemiology, often the person).

  13. Regression • Course concentrates on estimating the association of an explanatory variable with an outcome variable. • Another way to think of this is trying to estimate how Y (the outcome) is related to one or more X's (the explanatory variables). • The course concentrates on estimating how X affects the mean of Y (although other models are considered): E[Y|X] • E.g., Linear Model: $E[Y \mid X] = b_0 + b_1 X$

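  14. Other Types of Regression (Mean) Models • Binary, typically logistic: $\log\left( \frac{P(Y = 1 \mid X)}{1 - P(Y = 1 \mid X)} \right) = b_0 + b_1 X$ • Counts, typically log-linear: $\log E[Y \mid X] = b_0 + b_1 X$ • In explanation, goal is to estimate $b_0$, $b_1$ (coefficients).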
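The three mean models can be compared side by side in a few lines of Python. This is only a sketch: it assumes the usual identity, logit, and log links for the linear, logistic, and log-linear models, and the coefficient values and function names are hypothetical, chosen for illustration rather than taken from the course.

```python
import math

def linear_mean(x, b0, b1):
    """Linear model: E[Y|X=x] = b0 + b1*x."""
    return b0 + b1 * x

def logistic_mean(x, b0, b1):
    """Logistic model: P(Y=1|X=x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def loglinear_mean(x, b0, b1):
    """Log-linear model: E[Y|X=x] = exp(b0 + b1*x)."""
    return math.exp(b0 + b1 * x)

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.5

# Linear model: b1 is the difference in means per unit increase in x.
mean_difference = linear_mean(1, b0, b1) - linear_mean(0, b0, b1)

# Logistic model: exp(b1) is the odds ratio per unit increase in x.
odds_ratio = odds(logistic_mean(1, b0, b1)) / odds(logistic_mean(0, b0, b1))

# Log-linear model: exp(b1) is the ratio of means per unit increase in x.
mean_ratio = loglinear_mean(1, b0, b1) / loglinear_mean(0, b0, b1)

print(mean_difference, odds_ratio, mean_ratio)
```

The same pair (b0, b1) therefore answers a different question in each model: a difference in means, an odds ratio, or a ratio of means.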

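  15. Hazard Regression Models (Disease Incidence Data) • In survival analysis of disease incidence data, the typical regression approach models the hazard. • The hazard $\lambda(t)$ is: $\lambda(t) = \lim_{\Delta t \to 0} P(t \le T < t + \Delta t \mid T \ge t) / \Delta t$, where $T$ is the time at which the subject becomes an incident case. • Typical model is: $\lambda(t \mid X) = \lambda_0(t) \exp(b_1 X)$, or proportional hazards model.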
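A small numerical sketch may help make the hazard concrete. It assumes the standard definition of the hazard as the instantaneous event rate among those still at risk, approximated here by a finite difference of a survival function, and a proportional hazards form in which a covariate multiplies a baseline hazard by exp(b1 * x). The exponential baseline, the rate, and the coefficient are hypothetical choices for illustration.

```python
import math

def survival_exponential(t, rate):
    """Survival function S(t) = P(T > t) for an exponential event time."""
    return math.exp(-rate * t)

def approx_hazard(survival, t, delta=1e-6):
    """Finite-difference hazard: P(t <= T < t + delta | T >= t) / delta."""
    s_t = survival(t)
    return (s_t - survival(t + delta)) / (delta * s_t)

# Hypothetical values, for illustration only.
baseline_rate = 0.2  # baseline hazard (events per unit time)
b1 = 0.7             # log hazard ratio for a binary covariate x

def baseline_hazard(t):
    """Baseline hazard, recovered numerically from the survival function."""
    return approx_hazard(lambda u: survival_exponential(u, baseline_rate), t)

def hazard_ph(t, x):
    """Proportional hazards: the baseline hazard scaled by exp(b1 * x)."""
    return baseline_hazard(t) * math.exp(b1 * x)

# The exponential distribution has a constant hazard equal to its rate.
hazard_early = baseline_hazard(1.0)
hazard_late = baseline_hazard(10.0)

# Under proportional hazards the ratio between groups does not depend on t.
ratio_early = hazard_ph(1.0, 1) / hazard_ph(1.0, 0)
ratio_late = hazard_ph(10.0, 1) / hazard_ph(10.0, 0)

print(hazard_early, hazard_late, ratio_early, ratio_late)
```

The point of the last two lines is that the hazard ratio exp(b1) is the same at every time, which is what "proportional" refers to.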

  16. Example – Disease Incidence Data

  17. Time until Tumor Recurrence vs. Treatment Type • 225 women who have been diagnosed with breast cancer and have the primary tumors removed • Given several different treatments and followed longitudinally until either the end of the study or the time of 1st recurrence. • Goal is to estimate difference in survival between the treatments.

  18. Example – Multiple Event Data

  19. A randomized, controlled trial of an in-home drinking water intervention among HIV+ persons • Pilot study of 50 HIV+ subjects who were randomized to either an active water filter or a placebo device. • Followed longitudinally, and the number of highly credible gastro-intestinal (HCGI) events was recorded over (on average) a 6-month period. • Purpose is to estimate the amount of HCGI attributable to drinking water among this population.

  20. Results of randomized water intervention among HIV+ persons

  21. Example 1 of Repeated Measures

  22. A Repeated Measures Approach to the Detection of the Acute Behavioral Effects of Toluene at Low Concentrations, Fundamental and Applied Toxicology 25: 293-301 (1995). • 12 rats • Toluene, twice a week for 2 hours. Concentrations of 178, 300, 560, 1000, 1780, 3000 ppm • Outcome is rate of nose poking for food (if the rat waits > 2 minutes, it always gets food, but the probability decreases as waiting time goes down). So, fast rat is dumb rat.

  23. Acute Behavioral Effects of Toluene, cont. • Possible model: ith rat, jth time, Xij is dose of Toluene, Yij is the response time.

  24. Example 2 of Repeated Measures

  25. A Comparative study of four methods for analyzing repeated measures data, Statistics in Medicine 15: 1143-59 (1996). • Subjects are patients having just experienced myocardial infarction • Blood sampled at 6, 8, 16, 24, 36, 48, 72, 96 hours and 7 days after admission. • Outcome of interest is level of certain fatty acids. • Question of interest: What is the relationship between level of each fatty acid and time since admission? Also, does gender influence this relationship?

  26. Fatty Acids in Blood, cont. • Possible model: ith patient, jth time, Xij is time, Yij is the fatty acid.

  27. Example 3 of Repeated Measures

  28. Repeated measures designs in behavioral toxicology: application to chronic marijuana smoke exposure, Neurotoxicology and Teratology 12: 441-8 (1990). • 62 Rhesus monkeys • Outcome is performance on behavioral tasks (do they push the right button to get food) • 30 test sessions before dosing, 130 during dosing and 70 after dosing

  29. Marijuana, cont. • Dose groups are high (once a day), low (only on the weekends), 2nd hand smoke, and just air (smoke administered by mask over monkey). • 23 hours between exposure and testing.

  30. Marijuana, cont. • Possible model is: ith monkey, jth dosing session, kth time, Xi is estimated dose of marijuana for monkey i, Tijk is the time of measurement.

  31. Example 4 of Repeated Measures

  32. CD4 Count vs. Viral Load • Subjects: HIV+ • Repeated and irregular measurements of CD4 and viral load (time-structured repeated measures) • Data not always matched in time. • Goal is to find how CD4 varies with viral load and how this pattern varies in the population.

  33. CD4 Count vs. Viral Load, cont. • Possible Model: ith subject, jth measurement (at time t), Xij(t-d) is viral load at time t-d, Yij(t) is the CD4 count at time t.

  34. Why care about repeated measures? Residual Correlation.

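  35. A demonstration of the potential pitfalls of ignoring the correlation with repeated measures data • We measure cholesterol 2 times each on m individuals. • Model is, for the jth measurement on the ith individual, $Y_{ij} = \mu + \alpha_i + e_{ij}$, $i = 1, \dots, m$, $j = 1, 2$, where $E(\alpha_i) = 0$, $E(e_{ij}) = 0$, and the $\alpha_i$ and $e_{ij}$ are mutually independent.

  36. Pitfalls, cont. • $\sigma^2_\alpha$ = variance between individuals (variance of $\alpha_i$). • $\sigma^2_e$ = variance within an individual (variance of $e_{ij}$). • The correlation between measurements within an individual is (check for yourselves): $\rho = \sigma^2_\alpha / (\sigma^2_\alpha + \sigma^2_e)$.

  37. Pitfalls, cont. • Estimate the mean as: $\bar{Y} = \frac{1}{2m} \sum_{i=1}^{m} \sum_{j=1}^{2} Y_{ij}$. • Naively estimate the variance of the average (ignoring correlation) as: $\widehat{\mathrm{var}}(\bar{Y}) = s^2 / (2m)$, with $s^2 = \frac{1}{2m - 1} \sum_{i=1}^{m} \sum_{j=1}^{2} (Y_{ij} - \bar{Y})^2$.

  38. Pitfalls, cont. • Expected value of this variance estimate is (approximately, for large m): $(\sigma^2_\alpha + \sigma^2_e) / (2m)$. • However, because of the correlation induced by repeated measurements on the same individual, the true variance of the sample average is: $\mathrm{var}(\bar{Y}) = (2\sigma^2_\alpha + \sigma^2_e) / (2m) = (\sigma^2_\alpha + \sigma^2_e)(1 + \rho) / (2m)$.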
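The pitfall can be checked by simulation. The sketch below assumes a random-intercept model in which each individual's two measurements share a person-level deviation with standard deviation `sigma_a`, plus independent measurement noise with standard deviation `sigma_e`, both normally distributed. The function names and all numeric values (mean cholesterol, standard deviations, number of individuals, number of replicates) are hypothetical and chosen only for illustration.

```python
import random
import statistics

def sample_variance(values):
    """Unbiased sample variance (denominator n - 1)."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def simulate(m, mu, sigma_a, sigma_e, n_sims, seed=0):
    """Simulate n_sims studies of m individuals measured twice each.

    Returns (empirical variance of the sample mean across studies,
             average naive variance estimate s^2 / (2m)).
    """
    rng = random.Random(seed)
    sample_means = []
    naive_variances = []
    for _ in range(n_sims):
        ys = []
        for _ in range(m):
            a_i = rng.gauss(0.0, sigma_a)  # shared by both measurements
            for _ in range(2):
                ys.append(mu + a_i + rng.gauss(0.0, sigma_e))
        sample_means.append(statistics.fmean(ys))
        # Naive estimate: treats all 2m measurements as independent.
        naive_variances.append(sample_variance(ys) / len(ys))
    return sample_variance(sample_means), statistics.fmean(naive_variances)

# Hypothetical settings, for illustration only.
m, mu, sigma_a, sigma_e = 50, 200.0, 30.0, 10.0

empirical_var, avg_naive_var = simulate(m, mu, sigma_a, sigma_e, n_sims=2000)

# Theoretical values under the assumed model.
true_var = (2 * sigma_a**2 + sigma_e**2) / (2 * m)     # 1900 / 100 = 19.0
naive_expected = (sigma_a**2 + sigma_e**2) / (2 * m)   # 1000 / 100 = 10.0
rho = sigma_a**2 / (sigma_a**2 + sigma_e**2)           # 900 / 1000 = 0.9

print("within-person correlation:", rho)
print("true variance of the mean:", true_var)
print("empirical variance of the mean:", empirical_var)
print("large-m expectation of naive estimate:", naive_expected)
print("average naive estimate:", avg_naive_var)
```

With a within-person correlation this high, the naive estimate is expected to be roughly half the true variance, so confidence intervals built from it would be too narrow by a factor of about the square root of 1 + rho.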
