
# Lecture 20



### Lecture 20

Comparing groups

Cox PHM

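• Anova type approach: test H0: h1(t) = h2(t) = … = hK(t) for all t ≤ τ, versus HA: at least one of the hj(t) is different for some t ≤ τ,

where τ is the largest time for which all groups have at least one subject at risk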

• Data can be right-censored for the tests we will discuss

• Let t1 < t2 < … < tD be the distinct death times in all samples being compared

• At time ti, let dij be the number of deaths in group j out of Yij individuals at risk. (j=1,..,K)

• Define

• Comparisons of the estimated hazard rate of the jth population under the null and alternative hypotheses

• If the null is true, the pooled estimate of h(t) should be an estimator for hj(t)

for j = 1,…,K

If all Zj(τ)’s are close to zero, then little evidence to reject the null.
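The displayed formulas for this slide are not preserved in the transcript. A standard K-sample formulation that matches the notation above is sketched below; the weight function W_j(t_i) is an assumed symbol, since the slides do not show which weights were used.

```latex
d_i = \sum_{j=1}^{K} d_{ij}, \qquad Y_i = \sum_{j=1}^{K} Y_{ij}

Z_j(\tau) = \sum_{i=1}^{D} W_j(t_i)\left\{ \frac{d_{ij}}{Y_{ij}} - \frac{d_i}{Y_i} \right\},
\qquad j = 1, \ldots, K
```

Here d_i / Y_i is the pooled estimate of the hazard at t_i and d_ij / Y_ij is the group-specific estimate, so each Z_j(τ) is a weighted sum of the differences between the two. With weights of the form W_j(t_i) = Y_ij W(t_i), taking W(t_i) = 1 gives the log-rank test and W(t_i) = Y_i gives the Gehan test.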

• LOTS!

• Gehan test

• Fleming-Harrington

• Not all are available in all software. It is worth trying a few in each situation to compare inferences
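As a rough illustration of trying more than one weighting, the sketch below runs two members of the same family with `survdiff()` from the `survival` package. The `lung` dataset and the grouping by `sex` are stand-ins chosen only because they ship with the package; they are not the data used in the lecture.

```r
library(survival)  # ships with standard R distributions

# Stand-in data: survival::lung, with sex as the grouping variable.
# rho = 0 gives the log-rank test.
fit_logrank <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)

# rho = 1 gives the Peto-Peto modification of the Gehan-Wilcoxon test,
# which puts more weight on early event times.
fit_peto <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)

# Both statistics are referred to a chi-square with (number of groups - 1) df.
df <- length(fit_logrank$n) - 1
p_logrank <- 1 - pchisq(fit_logrank$chisq, df)
p_peto    <- 1 - pchisq(fit_peto$chisq, df)

print(c(logrank = p_logrank, peto = p_peto))
```

The two p-values need not agree, which is the point of comparing inferences across tests.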

• Let’s look at a prostate cancer dataset

• Prostate cancer clinical trial

• 3 trt groups (doce Q3, doce weekly, Q3 mitoxantrone)

• 5 PSA doubling times categories

• outcome: overall survival

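#################################

# test for differences by trt grp

library(survival)  # provides survfit() and survdiff()

# Kaplan-Meier curves, one color per treatment group

plot(survfit(st~trt), mark.time=FALSE, col=c(1,2,3))

test1 <- survdiff(st~trt)  # log-rank test across all three groups

test2 <- survdiff(st~factor(trt, exclude=3))  # groups 1 vs 2 (group 3 set to NA and dropped)

test3 <- survdiff(st[trt<3]~trt[trt<3])  # same comparison via subsetting

# legend for a five-category plot (labels 0 to 4)

legend(50,1,as.character(0:4), lty=rep(1,5), col=1:5, lwd=rep(2,5))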

• Note that we are interested in the average difference (consider log-rank specifically)

• What if hazards ‘cross’?

• Could have significant difference prior to some t, and another significant difference after t: but, what if direction differs?

• Not much evidence of crossing

• If there isn't overlap (crossing), then the tests will be somewhat consistent

• log-rank: most appropriate for ‘proportional hazards’

Example

• K&M 1.4

• Kidney infection data

• Two groups:

• patients with percutaneous placement of catheters (N=76)

• patients with surgical placement of catheters (N=43)

Log-rank

Comparisons

p-values from the tests compared, in slide order: 0.11, 0.96, 0.53, 0.24, 0.26, 0.002, 0.24, 0.002, 0.002, 0.004

Notice the differences!

• Situation of varying inferences

• Need to be sure that you are testing what you think you are testing

• Check:

• look at hazards?

• do they cross?

• Problem:

• estimating hazards is messy and imprecise

• recall: h(t) = dH(t)/dt, the derivative of the cumulative hazard H(t)
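To see why hazard estimates are messy, the following sketch differences a Nelson-Aalen estimate of H(t) to get a crude hazard. It uses the `lung` data from the `survival` package as a stand-in, and the simple slope-between-times estimator is an illustrative choice, not a method taken from the slides.

```r
library(survival)

# Stand-in data: survival::lung, all subjects pooled.
fit <- survfit(Surv(time, status) ~ 1, data = lung)

t_obs <- fit$time      # distinct observed times (events and censorings)
H_hat <- fit$cumhaz    # Nelson-Aalen estimate of the cumulative hazard H(t)

# Crude hazard: slope of H(t) between consecutive observed times.
h_crude <- diff(c(0, H_hat)) / diff(c(0, t_obs))

# The cumulative estimate is stable, but its increments are very noisy.
summary(h_crude)
```

The cumulative hazard is a well-behaved step function, while its increments jump between zero and comparatively large values, so in practice some smoothing is usually applied before hazards are compared by eye.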

Misconception

• Claim: whether the survival curves cross tells you whether the log-rank test is appropriate

• Not true:

• survivals crossing depends on censoring and study length

• what if they will cross but the range of t isn't sufficient to see it?

• Consider:

• Survival curves cross ⇒ hazards cross

• Hazards cross ⇒ survivals may or may not cross
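A small numerical illustration of these two statements, using hypothetical hazards that are not from the lecture: group 1 has a constant hazard of 1 and group 2 has hazard 2t. The hazards cross at t = 0.5, but the survival curves do not cross until t = 1, so a study with follow-up shorter than 1 would show crossing hazards and non-crossing survival curves.

```r
# Hypothetical hazards chosen for illustration only.
h1 <- function(t) rep(1, length(t))   # group 1: constant hazard
h2 <- function(t) 2 * t               # group 2: hazard increasing in t

# Corresponding survival functions S(t) = exp(-H(t)).
S1 <- function(t) exp(-t)             # H1(t) = t
S2 <- function(t) exp(-t^2)           # H2(t) = t^2

t_grid <- c(0.25, 0.5, 0.75, 1, 1.25)
print(data.frame(t = t_grid,
                 h1 = h1(t_grid), h2 = h2(t_grid),
                 S1 = S1(t_grid), S2 = S2(t_grid)))
```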

• solution?

• test in regions of t

• prior to and after cross based on looking at hazards

• some tests allow for crossing (Yang and Prentice 2005)

• Names

• Cox regression

• semi-parametric proportional hazards

• Proportional hazards model

• Multiplicative hazards model

• When?

• 1972

• Why?

• allows adjustment for covariates (continuous or categorical) in a survival setting

• allows prediction of survival based on a set of covariates

• Analogous to linear and logistic regression in many ways

Cox PHM Notation

• Data on n individuals:

• Tj : time on study for individual j

• dj : event indicator for individual j

• Zj : vector of covariates for individual j

• More complicated: Zj(t)

• covariates are time dependent

• they may change with time/age

Basic Model

For a Cox model with just one covariate z: h(t | z) = h0(t) exp(βz)

• h0(t):

• arbitrary baseline hazard rate.

• notice that it varies by t

• β:

• regression coefficient (vector)

• interpretation is a log hazard ratio

• Semi-parametric form

• non-parametric baseline hazard

• parametric form assumed only for covariate effects

Linear model formulation

• Usual formulation

• Coding of covariates similar to linear and logistic (and other generalized linear models)
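The formula for the usual formulation is missing from the transcript. Assuming p covariates Z = (Z_1, …, Z_p) and the coefficient vector β already introduced, the standard way to write it is:

```latex
h(t \mid Z) = h_0(t)\,\exp(\beta^{\top} Z)
            = h_0(t)\,\exp\!\left(\sum_{k=1}^{p} \beta_k Z_k\right)

\log\frac{h(t \mid Z)}{h_0(t)} = \sum_{k=1}^{p} \beta_k Z_k
```

The second line shows the analogy with linear and logistic regression: the log of the hazard relative to baseline is a linear predictor in the covariates.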

Why “proportional”?

• hazard ratio

• Does not depend on t (i.e., it is a constant over time)

• But, it is proportional (constant multiplicative factor)

• Also referred to (sometimes) as the relative risk.
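The hazard ratio formula itself is not preserved in the transcript. For two individuals with covariate vectors Z and Z* (Z* is an assumed symbol for the comparison individual), the Cox model gives:

```latex
\frac{h(t \mid Z)}{h(t \mid Z^{*})}
  = \frac{h_0(t)\,\exp\!\left(\sum_{k} \beta_k Z_k\right)}
         {h_0(t)\,\exp\!\left(\sum_{k} \beta_k Z^{*}_k\right)}
  = \exp\!\left(\sum_{k=1}^{p} \beta_k \,(Z_k - Z^{*}_k)\right)
```

The baseline hazard h_0(t) cancels, which is why the ratio does not depend on t.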

Simple example

• one covariate: z = 1 for new treatment, z=0 for standard treatment

• hazard ratio = exp(β)

• interpretation: exp(β) is the relative risk (hazard ratio) of the event in the new treatment group versus the standard treatment group

• Interpretation: at any point in time, the risk of the event in the new treatment group is exp(β) times the risk in the standard treatment group
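A minimal sketch of this one-covariate setup in R, assuming the `survival` package. The `lung` data are a stand-in: the indicator `z` is built from `sex` purely to play the role of a 0/1 treatment variable and has nothing to do with the trial discussed in the lecture.

```r
library(survival)

# Stand-in data: survival::lung. z = 1 for female, z = 0 for male,
# playing the role of "new" versus "standard" treatment.
d <- lung
d$z <- as.integer(d$sex == 2)

fit <- coxph(Surv(time, status) ~ z, data = d)

beta_hat <- coef(fit)[["z"]]    # estimated log hazard ratio
hr_hat   <- exp(beta_hat)       # estimated hazard ratio exp(beta)
hr_ci    <- exp(confint(fit))   # Wald 95% confidence interval for the hazard ratio

print(c(beta = beta_hat, HR = hr_hat))
print(hr_ci)
```

A hazard ratio below 1 here would be read as: at any point in time, the z = 1 group has exp(β) times the hazard of the z = 0 group.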

Hazard Ratio: CAP (cyclophosphamide, doxorubicin, cisplatin) versus paclitaxel

Hazard Ratios

• Assumption: “Proportional hazards”

• The relative risk (hazard ratio) does not depend on time.

• That is, “the relative risk is constant over time”

• But that is still vague…..

• Hypothetical Example: Assume hazard ratio is 0.5.

• Patients in new therapy group are at half the risk of death as those in standard treatment, at any given point in time.

• Hazard function: the instantaneous rate of death at time t given survival to time t, loosely P(die at time t | survived to time t)

Hazard Ratios

• Hazard Ratio = (hazard function for New) / (hazard function for Std)

• Makes assumption that this ratio is constant over time.

Interpretation Again

• For any fixed point in time, individuals in the new treatment group are at half the risk of death as the standard treatment group.

Hazard Ratio = .71

• This should be nothing new

• Two kinds of ‘independent’ variables

• quantitative

• qualitative

• Quantitative are continuous

• need to determine scale

• units

• transformation?

• Qualitative are generally categorical

• ordered

• nominal

• coding affects the interpretation

Tests of the model

• Testing that βk=0 for all k=1,..,p

• Three main tests

• Chi-square/Wald test

• Likelihood ratio test

• Score test

• Under the null, all three have an asymptotic chi-square distribution with p degrees of freedom
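The three global tests can be pulled out of a fitted model as sketched below. This assumes the `survival` package and uses its `lung` data with `age` and `sex` as stand-in covariates, so p = 2 here.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
s <- summary(fit)

# Each element holds the statistic, its degrees of freedom, and the p-value.
tests <- rbind(LRT   = s$logtest,
               Wald  = s$waldtest,
               Score = s$sctest)
print(tests)
```

The three statistics are usually close to one another in moderately large samples; large disagreements are a reason to look more closely at the fit.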

Example: TAX327

• Randomized clinical trial of men with hormone-refractory prostate cancer

• three treatment arms (Q3 docetaxel, weekly docetaxel, and Q3 mitoxantrone)

• other covariates of interest:

• psa doubling time

• lymph node involvement

• liver metastases

• number of metastatic sites

• pain at baseline

• baseline psa

• alkaline phosphatase

• hemoglobin

• performance status

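Cox PHM approach

library(survival)  # Surv() and coxph()

attach(data, pos=2)  # make survtime, died, trtgrp visible before they are used

st <- Surv(survtime, died)

reg1 <- coxph(st ~ trtgrp)  # trtgrp entered as a single numeric covariate

reg2 <- coxph(st ~ factor(trtgrp))  # trtgrp entered as categorical, group 1 as reference

summary(reg2)

attributes(reg2)

reg2$coefficients

summary(reg2)$coef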

Results

> summary(reg2)

Call:

coxph(formula = st ~ factor(trtgrp))

n= 1006

coef exp(coef) se(coef) z p

factor(trtgrp)2 0.105 1.11 0.0882 1.19 0.2300

factor(trtgrp)3 0.245 1.28 0.0863 2.84 0.0045

exp(coef) exp(-coef) lower .95 upper .95

factor(trtgrp)2 1.11 0.900 0.935 1.32

factor(trtgrp)3 1.28 0.783 1.079 1.51

Rsquare= 0.008 (max possible= 1 )

Likelihood ratio test= 8.12 on 2 df, p=0.0173

Wald test = 8.16 on 2 df, p=0.0169

Score (logrank) test = 8.19 on 2 df, p=0.0167

Multiple regression

• In the published paper, the model included all covariates in the previous list

Fitting it in R

reg3 <- coxph(st ~ factor(trtgrp) + liverny + numbersites + pain0c + pskar2c + proml + probs + highgrade + logpsa0 + logalkp0c + hemecenter + psadtmonthcat)

reg4 <- coxph(st ~ factor(trtgrp) + liverny + numbersites +

pain0c + pskar2c + proml + probs + highgrade + logpsa0 +

> reg3

Call:

coxph(formula = st ~ factor(trtgrp) + liverny + numbersites +

pain0c + pskar2c + proml + probs + highgrade + logpsa0 +

coef exp(coef) se(coef) z p

factor(trtgrp)2 0.1230 1.131 0.1099 1.12 2.6e-01

factor(trtgrp)3 0.3784 1.460 0.1070 3.54 4.0e-04

liverny 0.4813 1.618 0.2168 2.22 2.6e-02

numbersites 0.4757 1.609 0.1430 3.33 8.8e-04

pain0c 0.3708 1.449 0.0925 4.01 6.1e-05

pskar2c 0.3167 1.373 0.1339 2.37 1.8e-02

proml 0.3132 1.368 0.1125 2.78 5.4e-03

probs 0.2568 1.293 0.0991 2.59 9.5e-03

highgrade 0.1703 1.186 0.0922 1.85 6.5e-02

logpsa0 0.1549 1.168 0.0312 4.96 7.0e-07

logalkp0c 0.2396 1.271 0.0483 4.96 7.0e-07

hemecenter -0.1041 0.901 0.0351 -2.96 3.1e-03

psadtmonthcat -0.0884 0.915 0.0430 -2.05 4.0e-02

Likelihood ratio test=205 on 13 df, p=0 n=641 (365 observations deleted due to missingness)

>

proportional?

• recall that we are making the strong assumption of proportional hazards for each covariate

• we can investigate this to some extent via graphical displays

• but, limited for quantitative variables
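One commonly used check is `cox.zph()` in the `survival` package, which works for quantitative covariates as well. The slides do not say which displays were used, so this is a sketch of one option rather than the lecture's method; the `lung` data and covariates are stand-ins.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Test of proportional hazards for each covariate, plus a global test.
zph <- cox.zph(fit)
print(zph$table)
```

Calling `plot(zph)` then draws smoothed scaled Schoenfeld residuals against time, one panel per covariate; a roughly flat curve is consistent with proportional hazards for that covariate.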

“Local” Tests

• Testing individual coefficients

• But, more interestingly, testing sets of coefficients

• Example:

• testing the psa variables

• testing treatment group (3 categories)

• Same as previous:

• Wald test

• Likelihood ratio

• Score test

TAX327

reg5 <- coxph(st ~ liverny + numbersites +

pain0c + pskar2c + proml + probs + highgrade + logpsa0 + logalkp0c + hemecenter + factor(psadtmonthcat))

lrt.trt <- 2*(reg4$loglik[2] - reg5$loglik[2])  # LRT statistic for treatment group

p.trt <- 1-pchisq(lrt.trt, 2)  # 2 df: two treatment coefficients

# to compare, you need to have the same dataset
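The "same dataset" requirement matters because `coxph()` silently drops rows with missing covariates, so nested models can end up fitted to different subjects. A self-contained sketch of a local likelihood ratio test that guards against this is given below; it uses the `lung` data from the `survival` package as a stand-in, and the choice of covariates is illustrative only.

```r
library(survival)

# Keep only complete cases so both models use the same subjects.
d <- na.omit(lung[, c("time", "status", "sex", "age", "ph.ecog")])

full    <- coxph(Surv(time, status) ~ sex + age + ph.ecog, data = d)
reduced <- coxph(Surv(time, status) ~ sex, data = d)

# Local likelihood ratio test of H0: the age and ph.ecog coefficients are both 0.
lrt <- 2 * (full$loglik[2] - reduced$loglik[2])
df  <- length(coef(full)) - length(coef(reduced))
p   <- 1 - pchisq(lrt, df)

print(c(LRT = lrt, df = df, p = p))
```

Comparing `full$n` with `reduced$n` is a quick way to confirm that both fits used the same number of observations.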