1 / 39

# Assume the model: - PowerPoint PPT Presentation

Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 2: Nelson-Aalen, Kaplan-Meier and Aalen-Johansen estimators. Nelson-Aalen estimator. Assume the model:. at risk indicator. common hazard/intensity (non-negative function).

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Assume the model:' - alena

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Borgan and Henderson: Event History Methodology Lancaster, September 2006Session 2: Nelson-Aalen, Kaplan-Meier and Aalen-Johansen estimators

Assume the model:

at risk indicator

common hazard/intensity (non-negative function)

Aggregated counting process N(t) has intensity process

number at risk

Have the decomposition:

Estimating equation (when Y(t) > 0)

Thus (when Y(t) > 0 ):

This motivates the Nelson-Aalen estimator:

with

Nelson-Aalen estimator is a sum over the observed event times

(assuming no tied event times)

Males: Females:

We may write:

Thus we have the decomposition: estimator

systematic part

random part

The random part is the stochastic integral:

where is a predictable process

We may use properties of stochastic integrals to study the Nelson-Aalen estimator

In particular

Thus

The Nelson-Aalen estimator is approximately unbiased.

In order to study the variability of the Nelson-Aalen estimator (and a number of other estimators and test statistics), we need a concept of variability of a martingale M(t).

Such a concept of variability is the predictable variation process given by:

One may in general prove that estimator

is a martingale.

In particular therefore

and it follows that

For a estimatorcounting process martingale

we have

This motivates the important result:

We also need the predictable variation of the stochastic integral

We have, since H(t) is predictable:

This motivates the key result estimator

In particular for a counting process martingale

we have

For the Nelson-Aalen estimator we have

Thus

For estimation we replace by

and obtain

Martingale central limit theorem estimator

In order to derive the large sample distribution of the Nelson-Aalen estimator (and a number of other estimators and test statistics), we use the martingale central limit theorem (MCLT)

The classical CLT describes how sums of random variables (properly normalized) becomes approximately normally distributed as n increases. In a similar manner the MCLT describes how martingales (properly normalized) become approximately distributed as Gaussian martingales (transformed Brownian motions)

A estimatorGaussian martingale is a stochastic process X(t) in continuous time satisfying:

• its increment X(t) – X(s) over an interval (s,t] is normally distributed with mean zero and variance V(t) – V(s) for a continuous strictly increasing function V(t)(the variance function)

• its increments over non-overlapping intervals are independent

• its sample paths (realizations) are continuous

For the special case V(t) = t we get the classical Wiener process (or Brownian motion); cf next slide.

A counting process and its cumulative intensity process (left) and the corresponding counting process martingale (right).

(Based on n =10 simulated censored survival times)

Counting process martingales based on (left) and the corresponding counting process martingale (right). n =10, 50, 250 and 1250 simulated censored survival times:

Consider a stochastic integral (left) and the corresponding counting process martingale (right).

where

is a counting process martingale based on observing n individuals

Assume:

(convergence in probability)

(1) implies that the predictable variation process of

converges to the deterministic

function , while (2) states that its

jumps disappear in the limit.

Under assumptions (1) and (2) (plus some (left) and the corresponding counting process martingale (right).

regularity conditions) the process

converges in distribution to a Gaussian martingale

with variance function

In particular for given t0 the random variable

is approximately normally

distributed with mean 0 and variance

Large sample properties of Nelson-Aalen (left) and the corresponding counting process martingale (right).

We have

May use the MCLT with

Assume that

Then:

Thus (1) and (2) hold and the MCLT gives that (left) and the corresponding counting process martingale (right).

converges in distribution to a Gaussian martingale with variance function

In particular for given t0 the random variable

is approximately normally distributed around

with a variance that can be estimated as described earlier.

Survival functions, cumulative hazards, and product integrals: the general case

Uncensored survival time T

Survival function:

For the absolute continuous case, the hazard function is given by:

Cumulative hazard function:

We have the relations: integrals: the general case

For a general distribution the hazard rate is not defined, but we may define the cumulative hazard rate as (generalizing the first relation above):

How can the second relation be generalized?

Need integrals: the general caseproduct-integrals to achieve this generalization.

Partition [0,t] into small time intervals:

si-1

si

0

t

This is a product-integral.

For the integrals: the general casecontinuous case we have:

For the discrete case we have:

where

is the increment of the cumulative hazard (a step function) at s.

For the general case we have a mixture of the two.

The Kaplan-Meier estimator integrals: the general case

For right censored survival data we observe:

Model: the uncensored survival times Ti are i.i.d. with hazard

Counting and intensity processes:

Aggregated counting process: integrals: the general case

Intensity process:

with

the number at risk just before timet

Nelson-Aalen estimator: integrals: the general case

(a step function)

Plug this into the product-integral expression for the survival function:

(a finite product)

This is the Kaplan-Meier estimator

Kaplan-Meier estimator: Examples integrals: the general case

Males: Females:

Kaplan-Meier estimator: Properties integrals: the general case

May show that(this is Duhamel's equation)

Asymptotically:

Thus: integrals: the general case

The statistical properties for Kaplan-Meier may be derived from those of Nelson-Aalen:

Usually the variance is estimated by integrals: the general caseGreenwood's formula:

Only minor difference between the two variance estimators

Pointwise confidence intervals integrals: the general casefor S(t)

Linear:

Log-log-transformed:

The default confidence interval in R is based on the log-transformation, and that is a bad choice for Kaplan-Meier!

The Aalen-Johansen estimator integrals: the general case

A multivariate version of the Kaplan-Meier estimator applies to Markov processes.

Consider Markov process with states 0, 1, …, K.

transition intensities

Phj(s,t)transition probabilities

Transition probability matrix:

Aalen-Johansen estimator: integrals: the general case

Here is the matrix of Nelson-Aalen estimators with

If e.g. K=2 and a 1–>2 transition is observed at tj

The statistical properties of the estimator may be derived in a similar manner as for Kaplan-Meier.

Aalen-Johansen estimator: Examples integrals: the general case

Causes of death in Norway.

Estimates of

1) Cancer

2) Cardiovascular disease

3) Other medical

4) Alcohol abuse, violence, accidents

374 female diabetes patients in Denmark with disease onset before age 10 yrs

Use illness-death model with diabetic nephropathy as "disease state" (state 1). Estimate of P01(5,t):

(Markov assumption is dubious)