Borgan and Henderson: Event History Methodology Lancaster, September 2006 Session 2: Nelson-Aalen, Kaplan-Meier and Aalen-Johansen estimators. Nelson-Aalen estimator. Assume the model:. at risk indicator. common hazard/intensity (non-negative function).
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
Borgan and Henderson: Event History Methodology Lancaster, September 2006Session 2: Nelson-Aalen, Kaplan-Meier and Aalen-Johansen estimators
Assume the model:
at risk indicator
common hazard/intensity (non-negative function)
Aggregated counting process N(t) has intensity process
number at risk
Have the decomposition:
Estimating equation (when Y(t) > 0)
Thus (when Y(t) > 0 ):
This motivates the Nelson-Aalen estimator:
Nelson-Aalen estimator is a sum over the observed event times
(assuming no tied event times)
We may write:
Thus we have the decomposition: estimator
The random part is the stochastic integral:
where is a predictable process
We may use properties of stochastic integrals to study the Nelson-Aalen estimator
The stochastic integral is a martingale. estimator
The Nelson-Aalen estimator is approximately unbiased.
Predictable variation of a martingale estimator
In order to study the variability of the Nelson-Aalen estimator (and a number of other estimators and test statistics), we need a concept of variability of a martingale M(t).
Such a concept of variability is the predictable variation process given by:
One may in general prove that estimator
is a martingale.
In particular therefore
and it follows that
For a estimatorcounting process martingale
This motivates the important result:
We also need the predictable variation of the stochastic integral
We have, since H(t) is predictable:
This motivates the key result estimator
In particular for a counting process martingale
Variance of the Nelson-Aalen estimator estimator
For the Nelson-Aalen estimator we have
For estimation we replace by
Martingale central limit theorem estimator
In order to derive the large sample distribution of the Nelson-Aalen estimator (and a number of other estimators and test statistics), we use the martingale central limit theorem (MCLT)
The classical CLT describes how sums of random variables (properly normalized) becomes approximately normally distributed as n increases. In a similar manner the MCLT describes how martingales (properly normalized) become approximately distributed as Gaussian martingales (transformed Brownian motions)
A estimatorGaussian martingale is a stochastic process X(t) in continuous time satisfying:
For the special case V(t) = t we get the classical Wiener process (or Brownian motion); cf next slide.
A counting process and its cumulative intensity process (left) and the corresponding counting process martingale (right).
(Based on n =10 simulated censored survival times)
Counting process martingales based on (left) and the corresponding counting process martingale (right). n =10, 50, 250 and 1250 simulated censored survival times:
Consider a stochastic integral (left) and the corresponding counting process martingale (right).
is a counting process martingale based on observing n individuals
(convergence in probability)
(1) implies that the predictable variation process of
converges to the deterministic
function , while (2) states that its
jumps disappear in the limit.
Under assumptions (1) and (2) (plus some (left) and the corresponding counting process martingale (right).
regularity conditions) the process
converges in distribution to a Gaussian martingale
with variance function
In particular for given t0 the random variable
is approximately normally
distributed with mean 0 and variance
Large sample properties of Nelson-Aalen (left) and the corresponding counting process martingale (right).
May use the MCLT with
Thus (1) and (2) hold and the MCLT gives that (left) and the corresponding counting process martingale (right).
converges in distribution to a Gaussian martingale with variance function
In particular for given t0 the random variable
is approximately normally distributed around
with a variance that can be estimated as described earlier.
Survival functions, cumulative hazards, and product integrals: the general case
Uncensored survival time T
For the absolute continuous case, the hazard function is given by:
Cumulative hazard function:
We have the relations: integrals: the general case
For a general distribution the hazard rate is not defined, but we may define the cumulative hazard rate as (generalizing the first relation above):
How can the second relation be generalized?
Need integrals: the general caseproduct-integrals to achieve this generalization.
Partition [0,t] into small time intervals:
This is a product-integral.
For the integrals: the general casecontinuous case we have:
For the discrete case we have:
is the increment of the cumulative hazard (a step function) at s.
For the general case we have a mixture of the two.
For right censored survival data we observe:
Model: the uncensored survival times Ti are i.i.d. with hazard
Counting and intensity processes:
Aggregated counting process: integrals: the general case
the number at risk just before timet
Nelson-Aalen estimator: integrals: the general case
(a step function)
Plug this into the product-integral expression for the survival function:
(a finite product)
This is the Kaplan-Meier estimator
Kaplan-Meier estimator: Examples integrals: the general case
Kaplan-Meier estimator: Properties integrals: the general case
May show that(this is Duhamel's equation)
Thus: integrals: the general case
The statistical properties for Kaplan-Meier may be derived from those of Nelson-Aalen:
Usually the variance is estimated by integrals: the general caseGreenwood's formula:
Only minor difference between the two variance estimators
Pointwise confidence intervals integrals: the general casefor S(t)
The default confidence interval in R is based on the log-transformation, and that is a bad choice for Kaplan-Meier!
The Aalen-Johansen estimator integrals: the general case
A multivariate version of the Kaplan-Meier estimator applies to Markov processes.
Consider Markov process with states 0, 1, …, K.
Transition probability matrix:
Aalen-Johansen estimator: integrals: the general case
Here is the matrix of Nelson-Aalen estimators with
If e.g. K=2 and a 1–>2 transition is observed at tj
The statistical properties of the estimator may be derived in a similar manner as for Kaplan-Meier.
Aalen-Johansen estimator: Examples integrals: the general case
Causes of death in Norway.
2) Cardiovascular disease
3) Other medical
4) Alcohol abuse, violence, accidents
374 female diabetes patients in Denmark with disease onset before age 10 yrs
Use illness-death model with diabetic nephropathy as "disease state" (state 1). Estimate of P01(5,t):
(Markov assumption is dubious)