Generalized Estimating Equations: Insights and Simulation Studies

Andrew Thomson on Generalised Estimating Equations (and simulation studies)

Topics Covered • What are GEE? • Relationship with robust standard errors • Why they are not as complicated as they appear • How simulation does (or does not) resolve the differences between GEE approaches

Issues… • My results are questionable (thanks to Richard…) • Not shown in their entirety • But – Agree with other studies • Fixed cluster size is definitely correct

A simple example • Consider simple uncorrelated linear regression, e.g. height on weight • Minimize the sum of squares

Simple example II • Differentiate with respect to each parameter and set = 0 • In general, if we have p covariates then minimizing the sum of squares is the same as solving p estimating equations
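The link between least squares and estimating equations can be checked numerically. The sketch below is illustrative only: the height and weight values are made up, and the function names are hypothetical. It fits y = a + b*x in closed form and then evaluates the two estimating equations (one per parameter) at the fitted values, where both should be zero up to rounding.

```python
def solve_estimating_equations(x, y):
    """Least-squares fit of y = a + b*x.

    Setting the derivatives of sum((y - a - b*x)^2) to zero gives
        sum(y - a - b*x)       = 0
        sum(x * (y - a - b*x)) = 0
    whose closed-form solution is computed here.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


def estimating_equations(a, b, x, y):
    """Evaluate the two estimating equations at (a, b)."""
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return sum(resid), sum(xi * r for xi, r in zip(x, resid))


# Made-up data for illustration only (weight in kg, height in cm).
weight = [60.0, 65.0, 70.0, 80.0, 90.0]
height = [160.0, 168.0, 171.0, 180.0, 185.0]

a, b = solve_estimating_equations(weight, height)
u0, u1 = estimating_equations(a, b, weight, height)
print(a, b, u0, u1)
```

With p covariates the same idea gives p equations; GEE keeps this structure and changes the weighting.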

Extensions • Non-linear regression (logistic) • Weighting, based on the correlation of the results

Surprisingly – Not that bad • For each cluster, Dj is a 2 x mij matrix

A is an mij x mij matrix with diagonal elements • Independence – Identity matrix • Exchangeable – 1s on the diagonal, rho everywhere else • Unadjusted studies -

So what is DjTVj ? • Independence – Control • Independence - IV • Exch Control • Exch IV

Missing Out Some Algebra • Independence. Estimate • And estimate OR as • Exch -

Simple Interpretation • Independence gives equal weight to each observation • Exchangeable gives each cluster an inverse-variance weight (the variance depends on cluster size and rho) • No obvious working correlation matrix which gives equal weight to each cluster

Note on Simulation • Used to make inference about a method's behaviour when its theoretical properties are unclear • Simulator has choice over • Parameters varied • Output measured • These should answer relevant questions

Relevance for simulation studies • Equal cluster sizes give the same point estimate • Any potential benefits of one approach over the other in terms of precision (measured by MSE) cannot be found • Simulation studies should always consider the variable cluster size case
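A small numerical check of the equal-cluster-size point, as a sketch under stated assumptions: a single arm with no covariates, a fixed (not estimated) rho, and the standard result that an exchangeable working correlation weights each cluster mean by m / (1 + (m - 1) * rho), where m is the cluster size. The data and the function name are made up for illustration.

```python
def arm_proportion(events, sizes, rho):
    """Weighted proportion for one arm; rho = 0 is the independence case.

    Assumes the exchangeable cluster weight m / (1 + (m - 1) * rho),
    with rho treated as known rather than estimated.
    """
    weights = [m / (1.0 + (m - 1.0) * rho) for m in sizes]
    means = [e / m for e, m in zip(events, sizes)]
    return sum(w * p for w, p in zip(weights, means)) / sum(weights)


# Made-up cluster data for illustration only.
equal_sizes = [20, 20, 20, 20]
equal_events = [2, 5, 8, 11]
varied_sizes = [5, 10, 25, 40]
varied_events = [1, 2, 10, 20]

equal_ind = arm_proportion(equal_events, equal_sizes, 0.0)
equal_exch = arm_proportion(equal_events, equal_sizes, 0.1)
varied_ind = arm_proportion(varied_events, varied_sizes, 0.0)
varied_exch = arm_proportion(varied_events, varied_sizes, 0.1)

print(equal_ind, equal_exch)    # agree when all clusters have the same size
print(varied_ind, varied_exch)  # differ when cluster sizes vary
```

With equal sizes every cluster gets the same weight under either structure, so the two point estimates coincide and a simulation cannot separate the approaches on MSE.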

Unadjusted studies • What outcome (OR, RR, RD) are we interested in measuring? • What weights do we use for each cluster? • Does the estimating procedure, e.g. confidence interval construction, have the right size?

Estimating the Variance • Done using robust standard errors • F is a matrix which depends on V and D • is estimated by • Independence is identical to robust standard errors • Criticism of GEE is also criticism of RSE

Problems and solutions • is biased downwards for small samples (< 40 clusters), so p-values are too small • We “know” what this bias is (function of D and V). Let's call it H • We replace with • Basically changing the filling of our sandwich

CI Construction • Wald Test • Independence • Exchangeable • Bias Corrected • Score Test (adjusted score test): evaluate the score equations at H0 to obtain a χ2 statistic.

More on the score test • Score test is conservative • Using bias correction will make it worse • Multiply χ2 statistic by J / (J-1) • CI construction is done using the bisection algorithm
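The bisection step can be sketched on a much simpler problem than the clustered one. The example below is not the adjusted GEE score test: as a stand-in it inverts the ordinary score test for a single binomial proportion, finding the two null values at which the statistic equals the 5% critical value of a χ2 distribution with 1 degree of freedom. The function names and the data (30 events out of 100) are hypothetical.

```python
def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


CHI2_CRIT = 3.841458820694124  # 95th percentile of chi-square with 1 df


def score_stat(p0, events, n):
    """Score statistic for H0: p = p0 with a single binomial sample (stand-in only)."""
    p_hat = events / n
    return (p_hat - p0) ** 2 / (p0 * (1.0 - p0) / n)


events, n = 30, 100  # made-up data
p_hat = events / n


def g(p0):
    """Distance of the statistic from the critical value; CI limits are its roots."""
    return score_stat(p0, events, n) - CHI2_CRIT


lower = bisect_root(g, 1e-9, p_hat)
upper = bisect_root(g, p_hat, 1.0 - 1e-9)
print(lower, upper)
```

In the clustered setting each evaluation of the statistic would need the model refitted under the null value, which is where the J / (J-1) adjustment would be applied before comparing with the critical value.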

Results! - Size (5% Nominal)

Power • H0 is not true. • Simulation studies tend to use beta-binomial distribution to simulate • Common rho (?) • If size is above nominal, power will be inflated as well. If they have the same size, does MSE have an effect?
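One way to generate beta-binomial cluster data is sketched below, using only the Python standard library. It assumes the usual parameterisation in which each cluster's event probability is drawn from Beta(a, b) with a = pi(1 - rho)/rho and b = (1 - pi)(1 - rho)/rho, so the mean is pi and the intracluster correlation is rho. The function name, seed, cluster-size range and parameter values are illustrative choices, not the settings used in the study.

```python
import random


def beta_binomial_arm(sizes, pi, rho, rng):
    """Simulate event counts for one arm with mean pi and a common rho."""
    a = pi * (1.0 - rho) / rho
    b = (1.0 - pi) * (1.0 - rho) / rho
    events = []
    for m in sizes:
        p_j = rng.betavariate(a, b)  # cluster-specific event probability
        events.append(sum(rng.random() < p_j for _ in range(m)))
    return events


rng = random.Random(2024)  # fixed seed so the run is reproducible
sizes = [rng.randint(5, 40) for _ in range(2000)]  # variable cluster sizes
events = beta_binomial_arm(sizes, pi=0.3, rho=0.05, rng=rng)
overall = sum(events) / sum(sizes)
print(overall)
```

A power simulation would call this once per arm with different values of pi, fit each method, and record the proportion of rejections; the sketch uses the same rho in every cluster, which is the "common rho" assumption queried above.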

Power results • In general above nominal. • Due to incorrect size • Naïve > Ind > Exch > B.C = Score • This result is expected and surprising at the same time. Score and B.C actually attain the nominal level • Considered later

Adjusted studies • Very few have been done ( 2.5) • Beta-binomial distribution is not amenable to including covariates • Cluster level covariate – same argument applies for the fixed / variable cluster size issue • Results are identical

Why is the adjusted score powerful? • The score test is just better • Power is based on p-values, rather than on CIs containing 1. It is possible to have a p-value that is significant while the confidence interval contains 1 • Score statistic not derived for all data sets due to model fitting

Fitting the models • R – various libraries (gee, geese, geepack). No score test. Crashes • STATA – xtgee – no score test • SAS – Proc Genmod. Score test. No score test CI construction • S-Plus – code from authors (allegedly)

Convergence • Depends on number of clusters • 15 – 20 clusters 100% convergence • 10 clusters 99.7% convergence • 4 – 6 clusters 99% convergence • Score test – lose even more in SAS • 15 – 20 clusters lose another 0.5% • 4 – 6 clusters lose another 1%

Conclusions • If you wish to use GEE then the adjusted score test is the (only?) appropriate way for a small number of clusters • This is perhaps questionable • The most complicated model to fit in terms of code.

What Should Simulation Do? • Reflect what you’ll see in practice • Variable cluster size • Include individual level covariates (ideally imbalanced) • Look not only at size but power (and coverage) • Measure MSE for no IV cases • Sensitivity to departures from assumptions

Number of Studies that do this • 0 • Mine does. • Perhaps ‘luck’ rather than judgement • Designed it 2 years ago • Decided 2 months ago that it was actually quite good

‘Luck’ • 1 supervisor, 2 advisors • One advisor suggested MSE • The other was adamant I did sensitivity analysis • Richard obviously made an outstanding contribution. • Something of a consortium approach

Data sharing • Given this – might be useful to have data files available online • Use these for any further analysis methods that may become available • Server space? Interactivity? • Results?

Thank You
