Lecture 20



PowerPoint Slideshow about ' Lecture 20' - frayne



Comparing groups

Cox PHM


Comparing two or more samples

  • ANOVA-type approach

    H0: h1(t) = h2(t) = … = hK(t) for all t ≤ τ, versus HA: at least one of the hj(t) differs for some t ≤ τ,

    where τ is the largest time for which all groups have at least one subject at risk

  • Data can be right-censored for the tests we will discuss


Notation

  • Let t1 < t2 < … < tD be the distinct death times in all samples being compared

  • At time ti, let dij be the number of deaths in group j out of Yij individuals at risk (j = 1, …, K)

  • Define di = Σj dij and Yi = Σj Yij, the pooled number of deaths and number at risk at time ti


Log-Rank Test Rationale

  • Comparisons of the estimated hazard rate of the jth population under the null and alternative hypotheses

  • If the null is true, the pooled estimate of h(t) should be an estimator for hj(t)


Applying the Test

A test statistic Zj(τ) is computed for each group, for j = 1,…,K

If all Zj(τ)’s are close to zero, then there is little evidence to reject the null.
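A sketch of the statistic in the notation above, assuming the standard weighted form used in K&M. The weight function W is not defined on the slides and is an assumption here; d_i and Y_i denote the pooled deaths and number at risk at t_i, summed over the K groups.

```latex
Z_j(\tau) = \sum_{i=1}^{D} W(t_i)\left[\, d_{ij} - Y_{ij}\,\frac{d_i}{Y_i} \,\right],
\qquad j = 1,\dots,K
```

With W(t_i) = 1 this is the log-rank statistic: the observed deaths in group j minus the number expected under the null, where d_i / Y_i is the pooled hazard estimate at t_i.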


Others?

  • LOTS!

    • Gehan test

    • Fleming-Harrington

  • Not all are available in all software; it is worth trying a few in each situation to compare inferences
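As a minimal sketch of trying more than one test in R: `survdiff` in the survival package takes a `rho` argument that selects a member of the G-rho family of Harrington and Fleming. The `lung` data shipped with the package stands in for the lecture's data here, which is an assumption for illustration only.

```r
library(survival)

# Illustrative data: survival::lung (not the lecture's prostate data).
# rho = 0 gives the log-rank test; rho = 1 gives the Peto & Peto
# modification of the Gehan-Wilcoxon test, which weights early deaths more.
lr   <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
peto <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)

# Each chi-square statistic has (number of groups - 1) degrees of freedom.
df     <- length(lr$n) - 1
p_lr   <- pchisq(lr$chisq,   df, lower.tail = FALSE)
p_peto <- pchisq(peto$chisq, df, lower.tail = FALSE)

round(c(logrank = p_lr, peto = p_peto), 4)
```

If the two p-values lead to different conclusions, that is a hint that the group difference is concentrated early or late rather than spread evenly over follow-up.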


2+ samples

  • Let’s look at a prostate cancer dataset

  • Prostate cancer clinical trial

    • 3 treatment groups (Q3 docetaxel, weekly docetaxel, Q3 mitoxantrone)

    • 5 PSA doubling time categories

    • outcome: overall survival



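R: survdiff

library(survival)

# st is assumed to be a Surv object built earlier, e.g. Surv(survtime, died)

# test for differences by trt grp: plot the KM curves, then run log-rank tests

plot(survfit(st~trt), mark.time=FALSE, col=c(1,2,3))

# all three treatment groups
test1 <- survdiff(st~trt)

# groups 1 and 2 only: exclude=3 recodes group 3 as NA, so those rows are dropped
test2 <- survdiff(st~factor(trt, exclude=3))

# the same comparison, by subsetting directly
test3 <- survdiff(st[trt<3]~trt[trt<3])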



R: survdiff

# number of patients in each PSA doubling time category
table(psadt)

plot(survfit(st~psadt), mark.time=FALSE, col=1:5, lwd=rep(2,5))

legend(50,1,as.character(0:4), lty=rep(1,5), col=1:5, lwd=rep(2,5))

# all five categories
test1 <- survdiff(st~psadt)

# categories 1 and 2 only
test2 <- survdiff(st[psadt<3 & psadt>0]~psadt[psadt<3 & psadt>0])

# categories 3 and 4 only
test3 <- survdiff(st[psadt>2]~psadt[psadt>2])


Caveat

  • Note that we are interested in the average difference over time (consider log-rank specifically)

  • What if hazards ‘cross’?

  • There could be a significant difference prior to some t and another significant difference after t; but what if the directions differ? The two can then offset each other in the average.
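A small simulation can make the caveat concrete. This is a sketch under invented assumptions: the Weibull shapes, sample size, seed and cut point are all hypothetical and not from the lecture. The two hazards, 0.5 t^(-0.5) and 2t, cross near t = 0.4, so the split is placed there.

```r
library(survival)
set.seed(20)  # arbitrary seed so the sketch is reproducible

n <- 300
# Group 0: Weibull shape 0.5 (hazard decreasing in t).
# Group 1: Weibull shape 2   (hazard increasing in t). The two hazards cross.
time   <- c(rweibull(n, shape = 0.5, scale = 1),
            rweibull(n, shape = 2,   scale = 1))
group  <- rep(0:1, each = n)
status <- rep(1, 2 * n)  # no censoring, to keep the example small

# Overall log-rank test: early and late differences are averaged together.
overall <- survdiff(Surv(time, status) ~ group)

# Early region: censor everyone at the (hypothetical) cut point.
t_cut <- 0.4
early <- survdiff(Surv(pmin(time, t_cut), status * (time <= t_cut)) ~ group)

# Late region: only subjects still at risk after the cut point.
late <- survdiff(Surv(time, status) ~ group, subset = time > t_cut)

c(overall = overall$chisq, early = early$chisq, late = late$chisq)
```

The printed chi-square statistics depend on the seed. The point of the design is that the early and late tests each look at a region where one group is consistently worse, while the overall test mixes the two.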


What about all those differences in our prostate cancer KM curves?

  • Not much evidence of crossing

  • if there isn’t overlap, then tests will be somewhat consistent

  • log-rank: most appropriate (most powerful) under ‘proportional hazards’


Example

  • K&M (Klein and Moeschberger), Section 1.4

  • Kidney infection data

  • Two groups:

    • patients with percutaneous placement of catheters (N=76)

    • patients with surgical placement of catheters (N=43)



Log-rank


Comparisons

[Table: p-values for the tests compared, in slide order: 0.11, 0.96, 0.53, 0.24, 0.26, 0.002, 0.24, 0.002, 0.002, 0.004]



Notice the differences!

  • Situation of varying inferences

  • Need to be sure that you are testing what you think you are testing

  • Check:

    • look at hazards?

    • do they cross?

  • Problem:

    • estimating hazards is messy and imprecise

    • recall: h(t) = dH(t)/dt, the derivative of the cumulative hazard H(t)
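To see why hazard estimates are rough, here is a sketch that builds the Nelson-Aalen cumulative hazard by hand and then differences it over windows. The `lung` data and the 100-day window width are assumptions chosen only for illustration.

```r
library(survival)

# Illustrative data: survival::lung (an assumption; not the lecture's data).
fit <- survfit(Surv(time, status) ~ 1, data = lung)

# Nelson-Aalen cumulative hazard: H(t) = sum over event times <= t of d_i / Y_i.
increments <- fit$n.event / fit$n.risk
H <- cumsum(increments)

# Crude hazard estimate: slope of H over windows of 100 days.
grid <- seq(0, 900, by = 100)
H_grid <- stepfun(fit$time, c(0, H))(grid)
crude_hazard <- diff(H_grid) / diff(grid)

round(crude_hazard, 4)
```

H is a smooth-looking step function, but its slope jumps around from window to window, and the jumps get worse with narrower windows or fewer subjects at risk.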


Misconception

  • Survival curves crossing ⇒ tells you about the appropriateness of log-rank

  • Not true:

    • whether survival curves cross depends on censoring and study length

    • what if they will cross but the t range isn’t sufficient?

  • Consider:

    • Survival curves cross ⇒ hazards cross

    • Hazards cross ⇒ survival curves may or may not cross

  • solution?

    • test in regions of t

    • prior to and after cross based on looking at hazards

    • some tests allow for crossing (Yang and Prentice 2005)


Cox Proportional Hazards Model

  • Names

    • Cox regression

    • semi-parametric proportional hazards

    • Proportional hazards model

    • Multiplicative hazards model

  • When?

    • 1972

  • Why?

    • allows adjustment for covariates (continuous or categorical) in a survival setting

    • allows prediction of survival based on a set of covariates

  • Analogous to linear and logistic regression in many ways


Cox PHM Notation

  • Data on n individuals:

    • Tj : time on study for individual j

    • δj : event indicator for individual j

    • Zj : vector of covariates for individual j

  • More complicated: Zj(t)

    • covariates are time dependent

    • they may change with time/age


Basic Model

For a Cox model with just one covariate:
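In symbols, assuming the standard form of the model with a single covariate Z (consistent with the h0(t) and β discussed in the comments that follow):

```latex
h(t \mid Z) = h_0(t)\,\exp(\beta Z)
```

Setting Z = 0 gives h(t | Z = 0) = h0(t), which is why h0(t) is called the baseline hazard.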


Comments on basic model

  • h0(t):

    • arbitrary baseline hazard rate.

    • notice that it varies by t

  • β:

    • regression coefficient (vector)

    • interpretation is a log hazard ratio

  • Semi-parametric form

    • non-parametric baseline hazard

    • parametric form assumed only for covariate effects


Linear model formulation

  • Usual formulation

  • Coding of covariates similar to linear and logistic (and other generalized linear models)
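A sketch of the usual formulation, assuming p covariates Z_1, …, Z_p with coefficients β_1, …, β_p (the indexing is an assumption for illustration):

```latex
\log\frac{h(t \mid Z)}{h_0(t)} = \beta_1 Z_1 + \beta_2 Z_2 + \dots + \beta_p Z_p
```

The right-hand side is the same linear predictor as in linear or logistic regression, with no intercept because the baseline hazard absorbs it.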


Why “proportional”?

  • hazard ratio

  • Does not depend on t (i.e., it is a constant over time)

  • But, it is proportional (constant multiplicative factor)

  • Also referred to (sometimes) as the relative risk.
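The reasoning can be written out for one covariate, assuming the model h(t | Z) = h0(t) exp(βZ) and two individuals with covariate values Z and Z* (Z* is notation introduced here for illustration):

```latex
\frac{h(t \mid Z)}{h(t \mid Z^{*})}
  = \frac{h_0(t)\,\exp(\beta Z)}{h_0(t)\,\exp(\beta Z^{*})}
  = \exp\{\beta\,(Z - Z^{*})\}
```

The baseline hazard cancels, so the ratio is free of t even though each hazard on its own varies with time.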


Simple example

  • one covariate: z = 1 for new treatment, z=0 for standard treatment

  • hazard ratio = exp(β)

  • interpretation: exp(β) is the relative risk (hazard ratio) of the event in the new treatment group versus the standard treatment group

  • Interpretation: at any point in time, the risk of the event in the new treatment group is exp(β) times the risk in the standard treatment group
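A minimal sketch of this one-covariate case in R. The `lung` data from the survival package is used, with sex standing in for the treatment indicator z; that substitution is an assumption made only so the example runs.

```r
library(survival)

# Illustrative data: survival::lung, with sex standing in for treatment
# (z = 0 for the reference group, z = 1 for the comparison group).
d <- transform(lung, z = as.integer(sex == 2))

fit <- coxph(Surv(time, status) ~ z, data = d)

beta <- unname(coef(fit))   # log hazard ratio
hr   <- exp(beta)           # hazard ratio for z = 1 versus z = 0

# 95% confidence interval for the hazard ratio, from the interval for beta.
ci <- exp(confint(fit))

c(beta = beta, hr = hr, lower = ci[1, 1], upper = ci[1, 2])
```

A hazard ratio below 1 would mean the z = 1 group has the lower hazard at every time point, under the proportional hazards assumption.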


Hazard Ratio: CAP (cyclophosphamide, doxorubicin, cisplatin) versus paclitaxel


Hazard Ratios

  • Assumption: “Proportional hazards”

  • The relative risk does not depend on time.

  • That is, “relative risk is constant over time”

  • But that is still vague…..

  • Hypothetical Example: Assume hazard ratio is 0.5.

    • Patients in new therapy group are at half the risk of death as those in standard treatment, at any given point in time.

  • Hazard function ≈ P(die at time t | survived to time t), i.e., the instantaneous death rate at t among those still alive


Hazard Ratios

  • Hazard Ratio = (hazard function for New) / (hazard function for Std)

  • Makes assumption that this ratio is constant over time.


Interpretation Again

  • For any fixed point in time, individuals in the new treatment group have half the risk of death of those in the standard treatment group.


Hazard ratio is not always valid …

Hazard Ratio = .71


Refresher of coding covariates

  • This should be nothing new

  • Two kinds of ‘independent’ variables

    • quantitative

    • qualitative

  • Quantitative are continuous

    • need to determine scale

      • units

      • transformation?

  • Qualitative are generally categorical

    • ordered

    • nominal

    • coding affects the interpretation


Tests of the model

  • Testing that βk=0 for all k=1,..,p

  • Three main tests

    • Chi-square/Wald test

    • Likelihood ratio test

    • Score test

  • Under the null, all three are asymptotically chi-square with p degrees of freedom
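As a sketch, the three global tests can be pulled out of a fitted model's summary in R. The model below uses the `lung` data and an arbitrary choice of covariates, both assumptions for illustration rather than the lecture's analysis.

```r
library(survival)

# Illustrative model on survival::lung (an assumption; not the TAX327 data).
fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
s <- summary(fit)

# Each element holds the test statistic, its degrees of freedom, and the p-value.
global_tests <- rbind(
  likelihood_ratio = s$logtest,
  wald             = s$waldtest,
  score            = s$sctest
)
global_tests
```

The three statistics share the same degrees of freedom (one per coefficient) and are usually close to one another; large disagreements tend to show up in small samples or with extreme coefficients.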


Example: TAX327

  • Randomized clinical trial of men with hormone-refractory prostate cancer

  • three treatment arms (Q3 docetaxel, weekly docetaxel, and Q3 mitoxantrone)

  • other covariates of interest:

    • psa doubling time

    • lymph node involvement

    • liver metastases

    • number of metastatic sites

    • pain at baseline

    • baseline psa

    • tumor grade

    • alkaline phosphatase

    • hemoglobin

    • performance status



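Cox PHM approach

library(survival)

# attach the data first so that survtime and died can be found
attach(data, pos=2)

st <- Surv(survtime, died)

# trtgrp as a numeric score (one coefficient) versus as a factor (one coefficient per non-reference arm)
reg1 <- coxph(st ~ trtgrp)

reg2 <- coxph(st ~ factor(trtgrp))

summary(reg2)

# components of the fit and of its summary
attributes(reg2)

reg2$coefficients

summary(reg2)$coef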


Results

> summary(reg2)

Call:
coxph(formula = st ~ factor(trtgrp))

  n= 1006

                 coef exp(coef) se(coef)    z      p
factor(trtgrp)2 0.105      1.11   0.0882 1.19 0.2300
factor(trtgrp)3 0.245      1.28   0.0863 2.84 0.0045

                exp(coef) exp(-coef) lower .95 upper .95
factor(trtgrp)2      1.11      0.900     0.935      1.32
factor(trtgrp)3      1.28      0.783     1.079      1.51

Rsquare= 0.008   (max possible= 1 )
Likelihood ratio test= 8.12  on 2 df,   p=0.0173
Wald test            = 8.16  on 2 df,   p=0.0169
Score (logrank) test = 8.19  on 2 df,   p=0.0167


Multiple regression

  • In the published paper, the model included all the covariates in the previous list


Fitting it in R

# reg3: PSA doubling time category entered as a single numeric (trend) term
reg3 <- coxph(st ~ factor(trtgrp) + liverny + numbersites + pain0c + pskar2c + proml + probs + highgrade + logpsa0 + logalkp0c + hemecenter + psadtmonthcat)

# reg4: the same model with PSA doubling time category entered as a factor
reg4 <- coxph(st ~ factor(trtgrp) + liverny + numbersites +
    pain0c + pskar2c + proml + probs + highgrade + logpsa0 +
    logalkp0c + hemecenter + factor(psadtmonthcat))


> reg3

Call:
coxph(formula = st ~ factor(trtgrp) + liverny + numbersites +
    pain0c + pskar2c + proml + probs + highgrade + logpsa0 +
    logalkp0c + hemecenter + psadtmonthcat)

                   coef exp(coef) se(coef)     z       p
factor(trtgrp)2  0.1230     1.131   0.1099  1.12 2.6e-01
factor(trtgrp)3  0.3784     1.460   0.1070  3.54 4.0e-04
liverny          0.4813     1.618   0.2168  2.22 2.6e-02
numbersites      0.4757     1.609   0.1430  3.33 8.8e-04
pain0c           0.3708     1.449   0.0925  4.01 6.1e-05
pskar2c          0.3167     1.373   0.1339  2.37 1.8e-02
proml            0.3132     1.368   0.1125  2.78 5.4e-03
probs            0.2568     1.293   0.0991  2.59 9.5e-03
highgrade        0.1703     1.186   0.0922  1.85 6.5e-02
logpsa0          0.1549     1.168   0.0312  4.96 7.0e-07
logalkp0c        0.2396     1.271   0.0483  4.96 7.0e-07
hemecenter      -0.1041     0.901   0.0351 -2.96 3.1e-03
psadtmonthcat   -0.0884     0.915   0.0430 -2.05 4.0e-02

Likelihood ratio test=205  on 13 df, p=0  n=641 (365 observations deleted due to missingness)


proportional?

  • recall we are making a strong assumption that we have proportional hazards for each covariate

  • we can investigate this to some extent via graphical displays

  • but, limited for quantitative variables
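One way to investigate the assumption in R is `cox.zph` from the survival package, which works for quantitative covariates as well. The sketch below uses the `lung` data and an arbitrary set of covariates, both assumptions for illustration.

```r
library(survival)

# Illustrative model on survival::lung (an assumption; not the TAX327 data).
fit <- coxph(Surv(time, status) ~ age + sex + ph.karno, data = lung)

# cox.zph tests for a trend over time in the scaled Schoenfeld residuals;
# a small p-value suggests the hazard ratio for that term changes with time.
zph <- cox.zph(fit)
zph$table

# plot(zph) draws the smoothed residuals against time, one panel per term;
# a roughly flat curve is consistent with proportional hazards.
```

The table has one row per model term plus a GLOBAL row that tests all terms together.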


“Local” Tests

  • Testing individual coefficients

  • But, more interestingly, testing sets of coefficients

  • Example:

    • testing the psa variables

    • testing treatment group (3 categories)

  • Same as previous:

    • Wald test

    • Likelihood ratio

    • Score test


TAX327

# reg5: reg4 without treatment group
reg5 <- coxph(st ~ liverny + numbersites +
    pain0c + pskar2c + proml + probs + highgrade + logpsa0 + logalkp0c + hemecenter + factor(psadtmonthcat))

# likelihood ratio test for treatment group (2 coefficients, so 2 df)
lrt.trt <- 2*(reg4$loglik[2] - reg5$loglik[2])

p.trt <- pchisq(lrt.trt, 2, lower.tail=FALSE)

# to compare, you need to have the same dataset:
# set liverny to NA wherever psadtmonthcat is missing, so reg6 drops the same rows as reg4
liverny1 <- ifelse(is.na(psadtmonthcat),NA,liverny)

reg6 <- coxph(st ~ factor(trtgrp) + liverny1 + numbersites + pain0c + pskar2c + proml + probs + highgrade + logpsa0 + logalkp0c + hemecenter)

# likelihood ratio test for PSA doubling time (5 categories, so 4 df)
lrt.psadt <- 2*(reg4$loglik[2] - reg6$loglik[2])

p.psadt <- pchisq(lrt.psadt, 4, lower.tail=FALSE)
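The same local likelihood ratio test can be sketched on public data, with the complete-case restriction made explicit up front. The `lung` data and the choice of `ph.ecog` as the set of coefficients under test are assumptions for illustration.

```r
library(survival)

# Illustrative data: survival::lung (an assumption; not the TAX327 data).
# Drop rows with any missing value first so both models use the same patients.
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

full    <- coxph(Surv(time, status) ~ age + sex + factor(ph.ecog), data = d)
reduced <- coxph(Surv(time, status) ~ age + sex, data = d)

# Likelihood ratio test for the set of ph.ecog coefficients.
lrt <- 2 * (full$loglik[2] - reduced$loglik[2])
df  <- length(coef(full)) - length(coef(reduced))
p   <- pchisq(lrt, df, lower.tail = FALSE)

# anova() reports the same comparison.
anova(reduced, full)
c(lrt = lrt, df = df, p = p)
```

Filtering the data once before fitting avoids the need to recode a covariate to NA, and `anova` gives a quick check on the hand-computed statistic.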

