  1. Controlling for individual covariates in CRTs with a small number of clusters and a dichotomous outcome Andrew Thomson LSH&TM

  2. Content of talk • Potential approaches • Theoretical issues behind 2 such approaches • Choice of effect measure • Confidence Interval construction • Bias • Conclusions from this • Simulation results to assess methodology

  3. 2 ‘main’ approaches • CRTs pose unique challenges for analysis, due to the presence of between-cluster variability (within-cluster correlation); the clusters themselves are independent • Model or estimate the BCV / WCC • Calculate a cluster summary measure, adjusting for individual covariates

  4. Modelling the BCV • Random effects logistic regression • Frequentist approach fitted using Gaussian quadrature, MQL or PQL • Bayesian approach – choice of prior? • Generalized estimating equations • Choice of correlation matrix • Small sample corrections • Not considered further in this talk

  5. Calculating a summary measure • For unadjusted analyses this is fine • For cluster-level covariates, one can regress cluster-level prevalences (or a transform thereof) against cluster-level covariates • One could aggregate individual-level covariates to the cluster level, but: • Ecological fallacy • No one ‘perfect’ way to summarise the information

  6. 2 relevant methods • Standard logistic regression gives confidence intervals that are too narrow, and p-values that are too small • Bennett et al.: method was for rates; fit a Poisson regression and use a t-test on the residuals • Gail et al.: fit a logistic regression, calculate cluster-level residuals from it, and use a permutation test to test for significance

  7. Issues with these approaches • How well does the method for the analysis of incidence rates extend to proportions? What difficulties do we face with logistic as opposed to Poisson regression? • Permutation tests are prohibitively computer-intensive (for the size of simulation study I wish to perform!). Can the methods of Gail be extended to the Wilcoxon test, which is more feasible?

  8. The t-test • Fit a Poisson regression to the data, including all covariates except intervention • Calculate an expected number of responses per cluster • Calculate the SMR residual (observed / expected) • Replace all rates in the calculation of the rate ratio, and the associated confidence interval, with the SMR residuals
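A minimal sketch of these steps, using hypothetical data with a single binary covariate. For one categorical covariate, indirect standardisation with pooled stratum rates is exactly what a saturated Poisson regression excluding the intervention term would fit, so the regression step is replaced by that shortcut here:

```python
import numpy as np

# Hypothetical data: 4 clusters (2 per arm), one binary covariate.
# Columns: n in stratum 0, cases in stratum 0, n in stratum 1, cases in stratum 1.
clusters = np.array([
    [100, 10, 100, 20],   # control
    [100, 10, 100, 20],   # control
    [100,  5, 100, 10],   # intervention
    [100,  5, 100, 10],   # intervention
], dtype=float)
arm = np.array([0, 0, 1, 1])          # 0 = control, 1 = intervention

n = clusters[:, [0, 2]]               # denominators by covariate stratum
y = clusters[:, [1, 3]]               # cases by covariate stratum

# Pooled stratum rates, ignoring the intervention (the saturated-Poisson fit).
pooled_rate = y.sum(axis=0) / n.sum(axis=0)

# Expected cases per cluster, then the SMR residual O/E.
expected = (n * pooled_rate).sum(axis=1)
observed = y.sum(axis=1)
smr = observed / expected

# Equal-weight point estimate of the rate ratio from the residuals
# (a t-test would then be applied to the SMR residuals between arms).
rate_ratio = smr[arm == 1].mean() / smr[arm == 0].mean()   # 0.5 for these data
```

With equal cluster sizes, as here, this equal-weight residual estimate coincides with the estimate from a Poisson regression that includes the intervention term, as the next slide notes.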

  9. The t-test • This approach gives equal weight to each cluster • One can instead fit a Poisson regression including the intervention effect to get a point estimate for the rate ratio giving equal weight to each observation – note that with equal cluster sizes, the 2 approaches are the same.

  10. A potential source of bias • Fitting a regression without including covariates will bias your estimate of the treatment effect • Similarly, fitting a regression without including the treatment effect will bias your estimate of the covariate effect • A potential solution is to fit the covariate in the regression, but ignore it in the estimation – not considered – yet!

  11. Effect measures • Risk difference, risk ratio or the odds ratio can be used • The SMR (ratio) residuals estimate the risk ratio • The difference (observed − expected) residuals estimate the risk difference • I cannot find a residual that estimates the odds ratio • C.I. construction for the OR is problematic

  12. The Odds ratio C.I. • There is no sensible way of using the residuals in C.I. construction due to the troublesome denominator

  13. Summary • Perform logistic regression, calculate the expected number of cases per cluster • Calculate the residuals • Calculate the risk ratio and risk difference and associated C.I.s, giving equal weight to each cluster • Not considering weighting yet

  14. Non-parametric approach • Calculate residuals • Calculate the mean difference in residuals between the two arms • Perform a permutation test, re-assigning residuals over all possible permutations • See whether the originally calculated difference is significant
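The permutation steps above can be sketched as follows; the residual values are hypothetical, and all C(6, 3) = 20 re-assignments of clusters to arms are enumerated exhaustively:

```python
from itertools import combinations

def permutation_pvalue(resid_control, resid_treat):
    """Two-sided exact permutation test on the difference in mean residuals."""
    pooled = list(resid_control) + list(resid_treat)
    n_c = len(resid_control)
    observed = (sum(resid_treat) / len(resid_treat)
                - sum(resid_control) / n_c)
    hits = total = 0
    # Re-assign residuals over all possible allocations of clusters to arms.
    for ctrl_idx in combinations(range(len(pooled)), n_c):
        ctrl = [pooled[i] for i in ctrl_idx]
        trt = [pooled[i] for i in range(len(pooled)) if i not in ctrl_idx]
        diff = sum(trt) / len(trt) - sum(ctrl) / len(ctrl)
        total += 1
        if abs(diff) >= abs(observed) - 1e-12:
            hits += 1
    return hits / total

# Hypothetical residuals for 3 control and 3 intervention clusters.
p = permutation_pvalue([0.10, 0.12, 0.08], [-0.05, -0.07, -0.03])  # p = 0.1
```

Exhaustive enumeration is cheap for the small numbers of clusters considered here, but the number of allocations grows combinatorially, which is why the talk treats permutation tests as prohibitive inside a large simulation study.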

  15. C.I. Construction • Add δ to each residual in the control arm • Permute and find the significance • Find values of δ such that they give significance levels of 0.025 and 0.975 respectively • These 2 values of δ give the upper and lower limits of a 95% C.I. for the risk difference
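One way to sketch this inversion over a grid of δ values (hypothetical residuals, 4 clusters per arm; with only 70 allocations the achievable significance levels are coarse, so this is a 'closest to 95%' interval rather than an exact one):

```python
from itertools import combinations

def upper_tail_p(ctrl, trt):
    """Exact one-sided permutation p: P(permuted mean diff >= observed diff)."""
    pooled = ctrl + trt
    n_c, n_all = len(ctrl), len(ctrl) + len(trt)
    obs = sum(trt) / len(trt) - sum(ctrl) / n_c
    hits = total = 0
    for idx in combinations(range(n_all), n_c):
        c_sum = sum(pooled[i] for i in idx)
        t_sum = sum(pooled) - c_sum
        diff = t_sum / (n_all - n_c) - c_sum / n_c
        total += 1
        if diff >= obs - 1e-12:
            hits += 1
    return hits / total

def permutation_ci(ctrl, trt, alpha=0.05):
    """C.I. for the difference by inverting the shifted permutation test:
    δ stays in the interval while neither tail of the test rejects."""
    obs = sum(trt) / len(trt) - sum(ctrl) / len(ctrl)
    grid = [obs + k / 200 for k in range(-80, 81)]   # δ grid around the estimate
    kept = []
    for delta in grid:
        shifted = [c + delta for c in ctrl]          # add δ to each control residual
        p = upper_tail_p(shifted, trt)
        if alpha / 2 < p < 1 - alpha / 2:
            kept.append(delta)
    return min(kept), max(kept)

# Hypothetical residuals, 4 clusters per arm.
ctrl = [0.02, 0.05, -0.01, 0.03]
trt = [-0.04, -0.06, -0.02, -0.05]
lo, hi = permutation_ci(ctrl, trt)
```

The grid width and spacing here are arbitrary choices for illustration; a bisection search on each endpoint would be more precise.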

  16. Extension to Wilcoxon • The approach works exactly the same, except it replaces the permutation test with the Wilcoxon • It is theoretically possible to extend this to SMR residuals • However, rather than an ‘additive’ δ, we need a ‘multiplicative’ δ • This is fine, as long as we have no SMR residuals which are 0. Which I will have…
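A sketch of the exact rank-sum version, assuming no tied residuals; the same statistic can be used inside the δ-shift C.I. construction by re-ranking the shifted residuals:

```python
from itertools import combinations

def wilcoxon_exact_p(ctrl, trt):
    """Exact two-sided Wilcoxon rank-sum p-value by full enumeration,
    usable in place of the mean-difference permutation statistic."""
    pooled = sorted(ctrl + trt)
    ranks = {v: i + 1 for i, v in enumerate(pooled)}   # assumes no ties
    obs = sum(ranks[v] for v in trt)                   # observed rank sum
    n_t, n_all = len(trt), len(pooled)
    mid = n_t * (n_all + 1) / 2                        # null expectation of the rank sum
    hits = total = 0
    # Every subset of ranks is equally likely under the permutation null.
    for combo in combinations(range(1, n_all + 1), n_t):
        total += 1
        if abs(sum(combo) - mid) >= abs(obs - mid):
            hits += 1
    return hits / total

# Same hypothetical residuals as before: 3 control, 3 intervention clusters.
p = wilcoxon_exact_p([0.10, 0.12, 0.08], [-0.05, -0.07, -0.03])  # p = 0.1
```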

  17. Advantages • Computationally feasible • Can ‘compare’ the risk difference results of the t-test and the Wilcoxon test • Still have the ‘preferred’ risk ratio estimate from the t-test

  18. Disadvantages • We may not be able to get exact 95% C.I.s using a non-parametric approach. • Choose the C.I. that we can calculate that is closest to 95% • Further work will look at the other approaches. These always estimate the odds ratio. How do we compare, say, bias and mean square error if they are on different scales?

  19. Simulation Study – Outcome Measures • Size (when IV effect = 0) • Coverage (when IV effect ≠ 0) • Power (defined as C.I. not including 0 / 1) • Bias in point estimate • Mean square error of point estimate
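Given arrays of simulated point estimates and C.I.s, the outcome measures above could be computed roughly as below (the function name and interface are illustrative, not the author's code):

```python
import numpy as np

def sim_metrics(estimates, ci_low, ci_high, true_effect, null_value=0.0):
    """Summarise one batch of simulated trials: coverage, power/size,
    bias and mean square error of the point estimate."""
    estimates = np.asarray(estimates, dtype=float)
    ci_low = np.asarray(ci_low, dtype=float)
    ci_high = np.asarray(ci_high, dtype=float)
    covered = (ci_low <= true_effect) & (true_effect <= ci_high)
    # "Power" = proportion of C.I.s excluding the null; when the true effect
    # equals the null, the same quantity is the size of the test.
    rejected = (null_value < ci_low) | (null_value > ci_high)
    return {
        "coverage": covered.mean(),
        "power": rejected.mean(),
        "bias": estimates.mean() - true_effect,
        "mse": ((estimates - true_effect) ** 2).mean(),
    }

# Illustrative batch of 2 simulated trials with true risk difference 0.2.
metrics = sim_metrics([0.1, 0.3], [0.0, 0.25], [0.2, 0.35], true_effect=0.2)
```

For a risk ratio the null value would be 1 rather than 0, matching the slide's "C.I. not including 0 / 1".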

  20. Parameters - 1 • Use the logistic distribution: • α = logit(0.02) or logit(0.2) • β1 = 0 or log(0.5); β2 = 0 or log(3) • Variance distributions chosen to be Normal, Double Exponential, or Uniform • Choose variance s.t. k = 0.2 or 0.4
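One simulated arm under these parameters might look like the sketch below, with a Normal random effect on the logit scale. The value of sigma_u is an arbitrary placeholder: in the actual study the random-effect variance would be chosen (e.g. numerically) so that the between-cluster coefficient of variation k equals 0.2 or 0.4, and the Double Exponential or Uniform distributions would be substituted in the same place:

```python
import numpy as np

def simulate_arm(n_clusters, m, alpha, beta1, sigma_u, rng):
    """One arm of a simulated CRT: cluster prevalences from a logistic model
    with a Normal(0, sigma_u^2) cluster random effect, then Binomial outcomes.
    beta1 = 0 gives the control arm."""
    u = rng.normal(0.0, sigma_u, n_clusters)
    logit_p = alpha + beta1 + u
    p = 1.0 / (1.0 + np.exp(-logit_p))       # true cluster prevalences
    cases = rng.binomial(m, p)               # observed cases per cluster
    return p, cases

rng = np.random.default_rng(2024)
# alpha = logit(0.2), beta1 = log(0.5); sigma_u = 0.5 is NOT calibrated to k.
p, cases = simulate_arm(1000, 500, alpha=np.log(0.2 / 0.8),
                        beta1=np.log(0.5), sigma_u=0.5, rng=rng)
k_hat = p.std() / p.mean()                   # empirical coefficient of variation
```

Simulating many clusters and checking k_hat, as above, is one simple way to calibrate sigma_u to a target k.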

  21. Parameters - Cluster size • Fixed cluster sizes based on ‘likely’ designs – estimated from sample size formula • For each cluster size m, also consider the effect of variable cluster size • In each arm, choose cluster sizes that range uniformly from m/2 to 3m/2, rounding as necessary • E.g. for 4 clusters per arm of size 1700, choose values 850, 1417, 1983, 2550
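Reading 'range uniformly' as evenly spaced between m/2 and 3m/2, which reproduces the numbers in the example, the sizes can be generated as:

```python
import numpy as np

def variable_cluster_sizes(m, clusters_per_arm):
    """Cluster sizes evenly spaced from m/2 to 3m/2, rounded to integers;
    the mean size stays equal to m."""
    return np.round(np.linspace(m / 2, 3 * m / 2, clusters_per_arm)).astype(int)

# Matches the example on the slide: 850, 1417, 1983, 2550.
sizes = variable_cluster_sizes(1700, 4)
```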

  22. Covariates • No consensus amongst previous studies as to how to simulate these • Bennett et al. chose 60% of individuals in the control arm and 40% in the intervention arm having a dichotomous covariate, with rate ratio 3 • Unrealistic? Nixon & Thompson provided an approach where one selects ‘extreme’ scenarios which ‘could happen’ by randomisation – used here.

  23. Results! Large imbalance in favour of intervention arm K=0.2 – normal distribution – prevalence = 0.2

  24. Small imbalance in favour of intervention arm K=0.2 – normal distribution – prevalence = 0.2

  25. Small imbalance in favour of Control arm K=0.2 – normal distribution – prevalence = 0.2

  26. Large imbalance in favour of Control arm K=0.2 – normal distribution – prevalence = 0.2

  27. Conclusions • We get a biased estimate – the direction of bias depends on the covariate value • Bias is smaller when the covariate is more common in the control arm – possibly due to ‘expectation issues’ • Will the ‘solution’ help this? • When the bias is positive, the risk ratio gives better coverage. When it is negative, the risk difference does. Why? • Coverage and size are generally good

  28. Conclusions • If anything, the test is slightly anti-conservative – this is not good • However, it may be less ‘anti-conservative’ than other measures • Wilcoxon is varied – conservative? • However, there is a complex relationship between bias and coverage • In general, there is worse coverage using the non-parametric approach • This is true for other variance distributions (not shown)

  29. Large imbalance in favour of intervention arm K=0.2 – normal distribution – prevalence = 0.2 MSE is always 0.001 for risk difference

  30. Small imbalance in favour of control arm K=0.2 – normal distribution – prevalence = 0.2 MSE is always 0.001 for risk difference

  31. Conclusions • T-test is more powerful than Wilcoxon • Perhaps due to conservative / anti-conservative issues? • Large imbalances affect the coverage of the t-test much more than small ones. • We may improve the t-test by using weights based on cluster size • Power depends greatly on the covariate choice • The nominal power level is between the 2 values obtained

  32. Further work… • Bias correction • Weighted t-test – also calculate a weighted point estimate for the Wilcoxon test • Wilcoxon for risk ratios? Depends on 0s • In general the t-test is ‘ok’. • Modelling approaches: are they better?

  33. Thanks Questions?
