Flexible Survival Modeling in SAS

Presented to the Nova Scotia SAS Users Group meeting, Feb 22, 2013

By Ron Dewar, Dalhousie University and Cancer Care Nova Scotia

Beyond LIFETEST and PHREG



Flexible survival modeling: today’s objectives
  • Introduce time-to-event analysis
  • Survival analysis in SAS
  • Survival modeling: some recent developments
  • Availability of software (Stata, R)
  • Progress in converting to SAS macros: Stata stpm2, predict, rcsgen
  • The road ahead
Time to event
  • Duration from start to event for each subject
  • Status at time (experienced event or still at risk)
  • Non-informative censoring: censoring does not change the risk of eventually experiencing the event
  • Covariate patterns among subjects at risk (at the time of an event)
  • Other aspects:
    • left truncation (delayed entry)
    • attained age as time scale
    • fixed or time-varying covariates
    • strata
    • informative censoring (competing risks)
Lifetest, Phreg
  • Lifetest
    • plotted output, step/smoothed (survival, hazard)
    • life table estimates
    • significance testing between strata
  • Phreg (Cox regression)
    • left truncation
    • covariates (numeric and categorical), strata
    • Wald tests, LR tests, AIC, BIC
    • plotted output (survival, hazard as step functions)
    • regression diagnostics
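The step-function survival estimate that LIFETEST plots is the Kaplan-Meier product-limit estimator. A minimal Python sketch of that estimator follows; the function name and the toy data are hypothetical, and this is an illustration of the technique rather than of LIFETEST's internals.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up durations
    events : 1 = event observed, 0 = censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the conditional survival at t.
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: five subjects, two of them censored.
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 0, 1]):
    print(t, round(s, 4))
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is what the non-informative censoring assumption permits.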
Lifetest, Phreg (cont.)
  • Covariate time-dependency is common and (maybe) more interesting
  • Hazard Ratio (HR) is difficult to interpret at an individual level
  • Interest in other survival functions – hazard, hazard differences, survival differences, relative survival, cumulative hazard, crude probability (of event)
  • Parametric representation of survival functions and time-dependency relations (out of sample prediction)
Royston-Parmar survival model (2002, Statistics in Medicine 21: 2175-2197)
  • Cumulative hazard scale: equivalent to the hazard scale if there are no time-dependencies in covariates
  • Restricted cubic splines for the (log) cumulative hazard
  • Implemented in Stata 11 as stpm2 (model fitting) and predict (post-estimation)
  • Similar (?) implementation in R
  • ‘Cure’ models, left truncation, covariate time-dependency, strata, relative survival (excess hazard), net survival, other scales
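For reference, the model on the log cumulative hazard scale, without time-dependent effects, can be written as below. The notation is chosen here for illustration and is not taken from the slides: $s(\cdot;\gamma)$ is a restricted cubic spline in log time, $x$ the covariate vector and $\beta$ its coefficients. Survival and hazard follow directly from the cumulative hazard.

```latex
\ln H(t \mid x) = s(\ln t;\, \gamma) + x^{\top}\beta,
\qquad
S(t \mid x) = \exp\{-H(t \mid x)\},
\qquad
h(t \mid x) = \frac{1}{t}\,\frac{d\,s(\ln t;\, \gamma)}{d \ln t}\,
              \exp\{ s(\ln t;\, \gamma) + x^{\top}\beta \}.
```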
Development plan, resources
  • Stata code and academic papers
  • Email support from module author
  • Replication of all features may not be feasible (limited programming resources, obscure Stata features)
  • Basic functionality that replicates results
  • SAS code that can be used, understood, modified, enhanced by collaborators
Proportional Hazards Model

Breast cancer survival with proportional hazards

ods pdf file = "&loc.\doc\breast2004\cox regression.pdf";

proc phreg data = _events_;
  class sstage (ref = first);
  model years*_death_(0) = sstage age1 age2 age3;
  title 'Breast cancer survival, 2004 - 2010, Nova Scotia';
  title2 'followed to end of 2011';
  title3 'Proportional Hazards model with stage, age';
run;

ods pdf close;


PHREG output

Number of Observations Read    5475
Number of Observations Used    5475

Total    Event    Censored    Percent Censored
 5475      830        4645    84.84…

PHREG output

Parameter      DF    Parameter Estimate    Standard Error    Chi-Square    Pr > ChiSq
sstage II       1               0.77412           0.10221       57.3652        <.0001
sstage III      1               1.72777           0.10591      266.1493        <.0001
sstage IV       1               3.23040           0.10948      870.6004        <.0001
age1            1               0.18070           0.12381        2.1300        0.1444
age2            1               0.84798           0.11689       52.6287        <.0001
age3            1               1.58877           0.11833      180.2785        <.0001
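The parameter estimates are log hazard ratios, so exponentiating them gives hazard ratios relative to the reference levels. The Python sketch below does this for the estimates in the PHREG output and recomputes the Wald chi-square as (estimate / standard error) squared. The 1.96 multiplier for a 95% interval and the function names are choices made here, not part of the original output.

```python
import math

# (estimate, standard error) pairs copied from the PHREG output.
estimates = {
    "sstage II": (0.77412, 0.10221),
    "sstage III": (1.72777, 0.10591),
    "sstage IV": (3.23040, 0.10948),
    "age1": (0.18070, 0.12381),
    "age2": (0.84798, 0.11689),
    "age3": (1.58877, 0.11833),
}


def hazard_ratio(beta, se, z=1.96):
    """Return (HR, lower, upper) from a log hazard ratio and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def wald_chi_square(beta, se):
    """Wald test statistic on 1 degree of freedom."""
    return (beta / se) ** 2


for name, (beta, se) in estimates.items():
    hr, lower, upper = hazard_ratio(beta, se)
    chi2 = wald_chi_square(beta, se)
    print(f"{name:10s} HR = {hr:6.2f} (95% CI {lower:6.2f} to {upper:6.2f}), Wald = {chi2:8.2f}")
```

Small differences between the recomputed Wald statistics and the printed ones come from rounding in the displayed estimates.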

Outline of analysis steps: R-P model
  • Create ‘standard’ dataset: dataset name and key variable names are fixed (%sas_stset)
  • Describe and fit model (%sas_stpm2)
  • Estimate functions of fitted model parameters (%predict)
  • Plot predicted functions (e.g., with PROC SGPLOT)
Fit above model using stpm2()
  • 3 stage and 3 age binary variables, hazard scale, 5 df for baseline
  • Stata command line

stpm2 st2 st3 st4 age1 age2 age3, scale(hazard) df(5)

  • SAS macro call

%sas_stpm2(st2 st3 st4 age1 age2 age3, scale=hazard, df=5);

Stpm2: baseline knots
  • Log cumulative hazard is parameterised with restricted cubic spline functions:

array z(*) < m - 1 spline variables rcs1, rcs2, ... >;
array k(*) < m knot values >;

z(1) = y;                  * y = log time to event;
do j = 2 to m - 1;         * one term per interior knot (the term for j = m is identically zero);
  phi = (k(m) - k(j)) / (k(m) - k(1));
  z(j) = (y > k(j)) * (y - k(j))**3
         - phi * (y > k(1)) * (y - k(1))**3
         - (1 - phi) * (y > k(m)) * (y - k(m))**3;
end;
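The same basis can be sketched in Python to check its defining property: each term is cubic between knots but collapses to a straight line beyond the last knot. With m knots this yields m - 1 terms, because the term built from the last knot is identically zero. The function name and knot values are illustrative only.

```python
def rcs_basis(y, knots):
    """Restricted cubic spline basis evaluated at y (log time).

    knots: all m knot values, boundary knots included.
    Returns m - 1 values: y itself plus one term per interior knot.
    """
    k = sorted(knots)
    k_min, k_max = k[0], k[-1]

    def pos_cube(u):
        # Truncated cubic: (u)+ ** 3
        return u ** 3 if u > 0 else 0.0

    basis = [y]
    for kj in k[1:-1]:
        phi = (k_max - kj) / (k_max - k_min)
        basis.append(
            pos_cube(y - kj)
            - phi * pos_cube(y - k_min)
            - (1 - phi) * pos_cube(y - k_max)
        )
    return basis


# Hypothetical knots on the log-time scale.
knots = [0.0, 1.0, 2.0, 3.0]
print(rcs_basis(1.5, knots))
```

The weights phi and 1 - phi are what cancel the cubic and quadratic parts above the last knot, so extrapolation past the observed event times stays linear in log time.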

How many knots to use? Where to put them?
  • Too few: unrealistic representation of hazard
  • Too many: over-parameterisation, unrealistic lumps and bumps
  • AIC and BIC may be helpful. LR tests are not, because the models are not nested
  • Some subject matter knowledge can be helpful
  • Choice of position probably doesn’t matter too much
  • ‘Standard’ positions: centile points of the cumulative distribution of times of non-censored events
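The ‘standard’ positions can be sketched as follows: take the log times of the uncensored events, place boundary knots at their minimum and maximum, and place df - 1 interior knots at equally spaced centiles. Both the equal spacing and the linear-interpolation centile rule are assumptions made for this sketch, and software packages may differ in the details. All names are hypothetical.

```python
import math


def centile(sorted_vals, p):
    """p-th centile (0-100) by linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * p / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def default_knots(times, events, df):
    """Boundary knots plus df - 1 interior knots, on the log-time scale."""
    log_t = sorted(math.log(t) for t, d in zip(times, events) if d == 1)
    interior = [centile(log_t, 100 * i / df) for i in range(1, df)]
    return [log_t[0]] + interior + [log_t[-1]]


# Hypothetical data: six subjects, the third one censored.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
print(default_knots(times, events, df=3))
```

Censored times are excluded so that knots sit where events were observed and the spline has information to estimate its shape.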
Programming in %sas_stpm2()
  • Describe model in macro call
  • Internal macro strings drive subsequent processing:
    • compute spline functions
    • 1st derivatives
    • orthogonalisation
    • Define linear predictor, log likelihood
    • derive initial values for optimisation
    • fit model (maximum likelihood with proc nlmixed)
    • save results for later processing
Key macro strings
  • _null_ data step to build macro strings
  • call symput('macro_var', string)
  • Linear predictor:
    • independent variables
    • parameters to be estimated
  • Log likelihood:
    • function to be maximised
    • linear predictor
    • 1st derivative of linear predictor
    • censor indicator

Linear predictor, Likelihood

%sas_stpm2(st2 st3 st4 age1 age2 age3, scale=hazard, df=5);

Linear predictor: &xb.

cons*_cons + st2*_st2 + st3*_st3 + st4*_st4 + age1*_age1 + age2*_age2 + age3*_age3 + rcs1*_rcs1 + rcs2*_rcs2 + rcs3*_rcs3 + rcs4*_rcs4 + rcs5*_rcs5

1st derivative: &dxb.

rcs1*_drcs1 + rcs2*_drcs2 + rcs3*_drcs3 + rcs4*_drcs4 + rcs5*_drcs5

Log likelihood:

_death_*(log(&dxb.) + &xb.) - exp(&xb.)
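The form of the log likelihood follows from the hazard-scale model. Writing $\eta_i$ for the linear predictor (&xb.), $\eta_i'$ for its derivative with respect to log time (&dxb.) and $d_i$ for the event indicator, with notation chosen here for illustration, each subject contributes:

```latex
H(t_i) = \exp(\eta_i), \qquad
h(t_i) = \frac{1}{t_i}\,\eta_i'\,\exp(\eta_i),
\\[4pt]
\ell_i = d_i \ln h(t_i) - H(t_i)
       = d_i\left[\ln \eta_i' + \eta_i\right] - \exp(\eta_i) - d_i \ln t_i .
```

The last term, $d_i \ln t_i$, does not involve any parameter, so it can be dropped from the function being maximised without changing the estimates.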

Linear predictor

* start with the intercept term when &int. is 1;
xb = ifc(&int., 'cons*_cons + ', ' ');

* one "name*_name" term per covariate;
do _i_ = 1 to &n_cov.;
  var = scan("&covar.", _i_);
  xb = trim(xb)||' '||trim(var)||'*_'||trim(var)||' + ';
end;

* one term per spline variable, with no trailing "+" after the last;
do _i_ = 1 to &df.;
  xb = trim(xb)||' rcs'||put(_i_,1.)||'*'||'_rcs'||put(_i_,1.)||ifc(_i_ < &df., " + ", " ");
end;
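The string-building logic is easier to see with a list-and-join approach. This Python sketch, with hypothetical function names, produces the same linear predictor and derivative strings shown earlier for the breast cancer model; it illustrates the idea and is not the macro's code.

```python
def build_linear_predictor(covariates, df, intercept=True):
    """Return the linear predictor as a 'name*_name + ...' expression string."""
    terms = []
    if intercept:
        terms.append("cons*_cons")
    # One term per covariate, then one per spline variable.
    terms += [f"{v}*_{v}" for v in covariates]
    terms += [f"rcs{i}*_rcs{i}" for i in range(1, df + 1)]
    return " + ".join(terms)


def build_derivative(df):
    """Return the derivative of the linear predictor with respect to log time."""
    return " + ".join(f"rcs{i}*_drcs{i}" for i in range(1, df + 1))


covars = "st2 st3 st4 age1 age2 age3".split()
print(build_linear_predictor(covars, df=5))
print(build_derivative(df=5))
```

Joining a list avoids the bookkeeping for the trailing " + " that the concatenation approach has to handle explicitly.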


Example: colon cancer

* data must be sorted by unique ID;
proc sort data = example;
  by pid;
run;

* set up a standardised survival dataset;
%sas_stset(example, censor(0), years, pid);

* fit model of interest;
%sas_stpm2(st1 st2 st3 nsex, scale=hazard, df=3);

* predicted hazard, survival functions for IIb cases;
%predict(haz, hazard, at = st2:1 zero);
%predict(surv, survival, at = st2:1 zero);
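On the hazard scale, survival and hazard predictions are simple transformations of the fitted log cumulative hazard and its derivative with respect to log time. The Python sketch below shows those transformations and checks them against a Weibull model, which is the special case where the log cumulative hazard is linear in log time. The function names are hypothetical and this is not the internal code of %predict.

```python
import math


def survival(eta):
    """S(t) = exp(-H(t)), where eta = ln H(t) is the linear predictor."""
    return math.exp(-math.exp(eta))


def hazard(t, eta, deta_dlnt):
    """h(t) = (1/t) * d(eta)/d(ln t) * exp(eta)."""
    return deta_dlnt * math.exp(eta) / t


# Weibull check: H(t) = lam * t**gam, so eta = ln(lam) + gam*ln(t)
# and d(eta)/d(ln t) = gam. The hazard should equal lam*gam*t**(gam-1).
lam, gam, t = 0.5, 2.0, 3.0
eta = math.log(lam) + gam * math.log(t)
print(survival(eta), math.exp(-lam * t ** gam))
print(hazard(t, eta, gam), lam * gam * t ** (gam - 1))
```

Each printed pair should agree, which confirms that the two formulas recover the familiar Weibull survival and hazard functions.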

Example: time-varying covariate

%sas_stpm2(st1 st2 st3 nsex, scale=hazard, df=3,
           tvc= st2 st3,
           dftvc= st2:21 );

* hazard prediction;
%predict(haz1, hazard, at = st2:1 zero);

* hazard ratio prediction;
%predict(hr1, hratio, hrnum= st2:1 zero,
         hrdenom= st2:0 zero);
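With time-varying covariate effects, each tvc covariate gets its own spline in log time, so the hazard ratio becomes a function of time rather than a single number. In notation chosen here for illustration, with $\eta(t;x)$ the linear predictor and $\eta'(t;x)$ its derivative with respect to $\ln t$:

```latex
\eta(t; x) = \ln H(t \mid x)
  = s(\ln t;\, \gamma) + x^{\top}\beta
    + \sum_{j \in \mathrm{tvc}} s_j(\ln t;\, \delta_j)\, x_j ,
\\[4pt]
\mathrm{HR}(t) = \frac{h(t \mid x_{\mathrm{num}})}{h(t \mid x_{\mathrm{denom}})}
  = \frac{\eta'(t; x_{\mathrm{num}})}{\eta'(t; x_{\mathrm{denom}})}\,
    \exp\{\eta(t; x_{\mathrm{num}}) - \eta(t; x_{\mathrm{denom}})\}.
```

Here $x_{\mathrm{num}}$ and $x_{\mathrm{denom}}$ correspond to the covariate patterns given in hrnum and hrdenom. When no covariate has a time-dependent effect, the ratio of derivatives equals 1 and the expression reduces to a constant $\exp\{(x_{\mathrm{num}} - x_{\mathrm{denom}})^{\top}\beta\}$.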

Example 2: time to initiation of chronic opioid use in new cancer patients
  • t0: date of new cancer diagnosis
  • t1: date of initiation of chronic opioid pain medication
  • Censor: death, end of study period (or 365 days)
  • Tier: cancer type grouped by 5-year survival probability
  • Age: 10-year age groups
  • Other covariates: year of diagnosis, urban/rural, sex

%sas_stpm2(t2 t3 a1 a2 a3 a4 a5 a6 nsex nurb y, scale=hazard, df=3, tvc= a2 a6, dftvc = 2);
Example 2

%macro int(row =, col =, sel = ); …

%predict(surv, survival, at = &sel. nsex:1 nurb:1 y:3 zero);

And then, for each of the 21 combinations of 3 tiers x 7 age groups:

%int( row = t2, col = ag2, sel = a1:1 t2:1);
%int( row = t3, col = ag2, sel = a1:1 t3:1);

The road ahead
  • Documentation (!no, really?)
  • Consistency checking
  • Confidence intervals for cumulative functions (survival, cumulative hazard)
  • Out of sample estimation is inefficient
  • Other survival scales (cumulative log odds, probit…)
  • Cure models
  • Stratified analysis
  • Competing risks framework
The future
  • Use of an optimisation routine that permits analytic 1st and 2nd derivatives (gradient & Hessian)
    • more efficient prediction
    • out of sample prediction
  • Re-code string modification in %predict()