Analysis of complex survey data

Day 4: Survival analysis and Cox proportional hazards models



Nonparametric Survival Analysis

  • Kaplan-Meier Method (also called Product-Limit Method)

  • Life Table Method (also called Actuarial Method)



Nonparametric Survival Analysis

A statistical method to study time to an event

1) Divide the risk period into many small time intervals

2) Treat each interval as a small cohort analysis

3) Combine the results for the intervals



Basic Concepts of Survival Analysis

  • Censoring

  • Time to an event

  • Survival Function



Censoring

  • A subject is censored if the event (outcome) has not occurred by the end of the study, or if the subject withdrew from the study (lost to follow-up or died from other diseases).

  • Survival analysis assumes that censoring due to loss to follow-up (LTF) and competing causes is random (independent of exposure and outcome)

  • Survival analysis is most useful with longitudinal complex surveys (e.g., PSID, AddHealth)

  • It can also be used in cross-sectional studies that collect retrospective age-of-onset information.


Example:

Cohort size at start: 1,000, followed for 1 year

Number with disease: 28

Number LTF: 15

If we assume all dropped out on the 1st day of the study, the rate of disease per year = 28 / (1,000 − 15) = 28 / 985 = .0284

If we assume all dropped out on the last day of the study, the probability of disease = 28 / 1,000 = .0280

If the dropout rate is constant over the period, the best estimate of when subjects dropped out is the midpoint; the probability of disease is then 28 / (1,000 − 7.5) = 28 / 992.5 = .0282
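The three estimates in this example differ only in how long the 15 lost subjects are counted as being at risk. A minimal sketch of that arithmetic follows; the function name and its `fraction_at_risk` parameter are illustrative choices, not part of the original slides.

```python
def risk_estimate(cases, cohort_size, lost, fraction_at_risk):
    """Risk of disease when each lost subject counts as at risk for
    `fraction_at_risk` of the follow-up period (0 = dropped out on day 1,
    1 = dropped out on the last day, 0.5 = midpoint assumption)."""
    effective_n = cohort_size - lost * (1.0 - fraction_at_risk)
    return cases / effective_n


first_day = risk_estimate(28, 1000, 15, 0.0)  # 28 / 985
last_day = risk_estimate(28, 1000, 15, 1.0)   # 28 / 1,000
midpoint = risk_estimate(28, 1000, 15, 0.5)   # 28 / 992.5

print(f"{first_day:.4f} {last_day:.4f} {midpoint:.4f}")  # 0.0284 0.0280 0.0282
```

The midpoint assumption is the same half-interval correction that the life table method applies to censored subjects.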



Survival Function

The probability of surviving beyond a specific time [i.e., S(t) = 1 – F(t)]

F(t) = cumulative distribution function of the time to the endpoint (e.g., death), i.e., the probability that the event has occurred by time t


Probability of survival at each new time period = probability of surviving that period, conditional on having survived to the start of it

[Figure: path through survival states S1 to S4 with conditional survival probabilities n, o, p, q and a failure branch (F) at each step]

Probability of survival to S4 = n * o * p * q

Failures (F) = deaths or cases or losses to follow up



Life Table Method

  • Time is partitioned into a fixed sequence of intervals (not necessarily of equal lengths)

A classical method of estimating the survival function in epidemiology and actuarial science

Interval lengths are arbitrary

The larger the interval, the larger the bias

Useful for large samples


The LIFETEST Procedure

Stratum 1: platelet = 0

Life Table Survival Estimates

Interval         Number   Number     Effective     Cond. Prob.   Cond. Prob.
[Lower, Upper)   Failed   Censored   Sample Size   of Failure    Std. Error    Survival   Failure
[ 0, 10)         4        0          9.0           0.4444        0.1656        1.0000     0
[10, 20)         2        1          4.5           0.4444        0.2342        0.5556     0.4444
[20, 30)         0        0          2.0           0             0             0.3086     0.6914
[30, 40)         1        0          2.0           0.5000        0.3536        0.3086     0.6914
[40, 50)         0        0          1.0           0             0             0.1543     0.8457
[50, 60)         1        0          1.0           1.0000        0             0.1543     0.8457

Effective sample size (n*): whenever there is censoring (withdrawal or loss), we assume that, on average, those individuals who became lost or withdrawn during the interval were at risk for half the interval.

Thus, effective sample size (n*) = n – ½ (number censored), where n is the number entering the interval

E.g., effective sample size (1st interval) = 9 – ½ (0) = 9

E.g., effective sample size (2nd interval) = 5 – ½ (1) = 4.5



Cumulative Survival


Conditional probability of failure, P(F) = number failed / effective sample size

e.g., P(F) (1st interval) = 4/9 = .4444

e.g., P(F) (2nd interval) = 2/4.5 = .4444

Survival probability (in each interval) = 1 – failure probability (in each interval)

Cumulative survival probability: S(t) = S(t-1) * [1 – P(F) in interval t], with S(0) = 1

e.g., S(1) = S(0) * (1 – .4444) = 1 * .5556 = .5556

e.g., S(2) = S(1) * (1 – .4444) = .5556 * .5556 = .3086
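The life table columns can be rebuilt from the failed and censored counts alone. The sketch below is a minimal illustration of the steps described above, not the LIFETEST implementation; the function name and the tuple layout are assumptions made for the example, and it does not guard against an interval with an effective sample size of zero.

```python
def life_table(n_start, intervals):
    """Life table (actuarial) estimates.

    intervals: list of (failed, censored) counts, one pair per interval.
    Returns one (effective_n, cond_failure, survival_at_start) row per interval.
    """
    rows = []
    n = n_start          # number entering the current interval
    survival = 1.0       # cumulative survival at the start of the interval
    for failed, censored in intervals:
        effective_n = n - 0.5 * censored        # censored count for half the interval
        cond_failure = failed / effective_n     # P(F) within this interval
        rows.append((effective_n, cond_failure, survival))
        survival *= 1.0 - cond_failure          # carry forward to the next interval
        n -= failed + censored
    return rows


# (failed, censored) counts from the "platelet = 0" stratum shown above
counts = [(4, 0), (2, 1), (0, 0), (1, 0), (0, 0), (1, 0)]
table = life_table(9, counts)
for effective_n, cond_failure, survival in table:
    print(f"{effective_n:4.1f}  {cond_failure:.4f}  {survival:.4f}")
```

As in the table, the survival column holds the cumulative survival at the start of each interval, so the first row is 1.0000.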



Kaplan-Meier (Product-limit) Method

  • Time is partitioned into variable intervals

A new time interval starts whenever a case arises, using the actual event and censoring times

If any censored time is greater than the last event time, the average duration will be underestimated using the KM method


Kaplan-Meier Method

  • Patient 1: died

  • Patient 2: lost to follow-up

  • Patient 3: died

  • Patient 4: died

  • Patient 5: lost to follow-up

  • Patient 6: died

[Figure: follow-up timeline for the six patients; x-axis: Months Since Enrollment, with ticks at 4, 10, 14, 24]




Kaplan-Meier Plot (N=6)

[Figure: Kaplan-Meier step curve; y-axis: % Surviving (0 to 100); x-axis: Months After Enrollment, with ticks at 0, 4, 10, 14, 24; labeled survival values: .833, .625, .417, .0]
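The labeled values .833, .625, .417 and 0 are what the product-limit formula gives if the four deaths occur at 4, 10, 14 and 24 months, with one loss to follow-up between 4 and 10 months and the other between 14 and 24 months. The sketch below checks this under those assumptions. The censoring times 7 and 20 are hypothetical (the slides do not state them), and any values in those two gaps give the same curve.

```python
def kaplan_meier(times, died):
    """Product-limit estimate: returns [(time, survival)] at each death time."""
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)          # still under observation at t
        deaths = sum(1 for u, d in zip(times, died) if u == t and d)
        if deaths:                                          # censoring alone never drops the curve
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Months since enrollment; 7 and 20 are HYPOTHETICAL censoring times
times = [4, 7, 10, 14, 20, 24]
died = [True, False, True, True, False, True]

km_curve = kaplan_meier(times, died)
for t, s in km_curve:
    print(t, round(s, 3))   # 4 0.833 / 10 0.625 / 14 0.417 / 24 0.0
```

Censored patients leave the risk set without lowering the curve, which is why the step at 10 months is 3/4 rather than 4/5.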


Kaplan-Meier Curve (N = 5,398)

[Figure: Kaplan-Meier curves for three groups: Tort, No Fault 1, No Fault 2]

“Effect of eliminating compensation for pain and suffering on the outcome of insurance claims for whiplash injury” Cassidy JD et al., N Engl J Med 2000;342:1179-1186


Median Survival Time

[Figure: median survival time for the Tort, No Fault 1, and No Fault 2 groups]



Semi-Parametric Methods

  • No need to choose a particular probability distribution to represent survival time

  • Can incorporate time-dependent covariates

    Example: exposure that increases over time, as with drug dosage or with workers in hazardous occupations


Cox Proportional Hazards Model

  • 1. Proportional Hazards Model

Basic model of the hazard for individual i at time t:

hi(t) = λ0(t) exp{β1xi1 + … + βkxik}

λ0(t): baseline hazard function (non-negative)

β1xi1 + … + βkxik: linear function of fixed covariates

Take the logarithm of both sides:

log hi(t) = α(t) + β1xi1 + … + βkxik, where α(t) = log λ0(t)

No need to specify the functional form of the baseline hazard function


Cox Proportional Hazards Model

  • 1. Proportional Hazards Model

Consider the hazard ratio of two individuals i and j

hi(t) = λ0(t) exp{β1xi1 + … + βkxik}

hj(t) = λ0(t) exp{β1xj1 + … + βkxjk}

Hazard ratio = hi(t) / hj(t) = exp{β1(xi1 − xj1) + … + βk(xik − xjk)}

  • The baseline hazard λ0(t) cancels, so the hazard functions are multiplicatively related and the hazard ratio is constant over survival time.

  • Hazards of any two individuals are proportional.
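A small numerical check of the proportionality property: whatever value the baseline hazard takes at a given time, the ratio of two individuals' hazards depends only on the coefficients and the covariate differences. The coefficients and covariate values below are hypothetical, chosen only for illustration.

```python
import math


def hazard(baseline, betas, x):
    """Cox model hazard: baseline hazard times exp of the linear predictor."""
    return baseline * math.exp(sum(b * v for b, v in zip(betas, x)))


def hazard_ratio(betas, x_i, x_j):
    """Hazard ratio of individual i to individual j; the baseline cancels."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(betas, x_i, x_j)))


betas = [0.5, -0.2]      # hypothetical coefficients
x_i = [1, 3]             # hypothetical covariates for individual i
x_j = [0, 1]             # hypothetical covariates for individual j

hr = hazard_ratio(betas, x_i, x_j)          # exp(0.5 * 1 - 0.2 * 2) = exp(0.1)

# The ratio is the same at every "time", i.e. for every baseline hazard value
ratios = [hazard(b0, betas, x_i) / hazard(b0, betas, x_j) for b0 in (0.01, 0.2, 5.0)]
print(round(hr, 4), [round(r, 4) for r in ratios])
```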



Cox Proportional Hazards Model

  • 2. Partial Likelihood Estimation

Estimate the β coefficients of the Cox model without having to specify the baseline hazard function λ0(t)

Partial likelihood depends only on the order in which events occur, not on the exact times of occurrence.

Partial likelihood estimates are not fully efficient because of loss of information about exact times of event occurrence
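To make the "order only" property concrete, here is a minimal sketch of the log partial likelihood for a single covariate, assuming no tied event times. It is not how SUDAAN or SAS compute it, and it ignores survey weights and design effects; the data and the covariate `x` are hypothetical.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for one covariate, assuming no tied event times."""
    loglik = 0.0
    for i, (t_i, died) in enumerate(zip(times, events)):
        if not died:
            continue  # censored subjects contribute only through the risk sets
        # Risk set: everyone still under observation at time t_i
        denom = sum(math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += beta * x[i] - math.log(denom)
    return loglik


times = [4, 7, 10, 14, 20, 24]                     # hypothetical follow-up times
events = [True, False, True, True, False, True]    # True = event, False = censored
x = [1, 0, 1, 0, 0, 1]                             # hypothetical binary covariate

original = cox_partial_loglik(0.3, times, events, x)
stretched = cox_partial_loglik(0.3, [t ** 2 for t in times], events, x)
print(math.isclose(original, stretched))           # True: only the ordering matters
```

Squaring the times changes every exact time but not their order, so the risk sets, and therefore the partial likelihood, are unchanged. The baseline hazard never appears in the function.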


Interpretation of Coefficients

  • No intercept: the baseline hazard λ0(t) is an arbitrary function of time and cancels out of the estimating equations

e^β: hazard ratio

Indicator variables (coded as 0 and 1)

Ratio of the estimated hazard for those with a value of 1 in X to the estimated hazard for those with a value of 0 in X (controlling for other covariates)

Quantitative (continuous) variables

100(e^β − 1) is the estimated percent change in the hazard for each one-unit increase in X. For example, for the variable AGE with e^β = 1.5, 100(1.5 − 1) = 50: for each one-year increase in the age at diagnosis, the hazard of death goes up by an estimated 50 percent, controlling for other covariates.
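The percent-change reading of a coefficient can be sketched as below. The function name and the `units` parameter are illustrative additions; the AGE value reproduces the slide's example by choosing β so that e^β = 1.5.

```python
import math


def percent_change_in_hazard(beta, units=1.0):
    """Estimated percent change in the hazard for a `units` increase in X."""
    return 100.0 * (math.exp(beta * units) - 1.0)


beta_age = math.log(1.5)   # chosen so that exp(beta) = 1.5, as in the AGE example

one_year = percent_change_in_hazard(beta_age)             # about 50 percent
two_years = percent_change_in_hazard(beta_age, units=2)   # about 125 percent
print(round(one_year, 1), round(two_years, 1))
```

The effect compounds multiplicatively: a two-year difference corresponds to 1.5 × 1.5 = 2.25 times the hazard, a 125 percent increase rather than 100 percent.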



Lab 4: estimating survival curves and Cox models in SUDAAN

