Stephen Fisher, Jane Holmes, Nicky Best, Sylvia Richardson

Combining individual and aggregate data to improve estimates of ethnic voting in Britain in 2001 and 2005
Stephen Fisher, Jane Holmes, Nicky Best, Sylvia Richardson
Department of Sociology, University of Oxford
Department of Epidemiology and Biostatistics, Imperial College, London
http://www.bias-project.org.uk

Outline
• Question of interest
• Model we will use
• Analysis

Target analysis
[Diagram: three analyses and the data each one uses]
• Individual-level analysis: individual outcome yij, individual exposure xij, aggregate exposure Zi, Xi
• Ecological regression: aggregate outcome Yi, aggregate exposure Zi, Xi
• Hierarchical Related Regression (HRR): individual outcome yij, individual exposure xij, aggregate outcome Yi, aggregate exposure Zi, Xi

A decline in ethnic minority support for Labour?
From 1974 to 2001 around 80% of ethnic minorities voted Labour.
Between 2001 and 2005 there were
• Islamic terrorist attacks
• US and UK led invasions of Afghanistan and Iraq
• Heightened security and suspicion of non-whites
• Unlawful detention of foreign terror suspects
• Convictions of British soldiers for Iraqi prisoner abuse
These and other events are thought to have undermined support for Labour among ethnic minorities.
On the other hand, the harsh stance on immigration in the Conservative 2005 election campaign may have alienated ethnic minority voters.

A decline in Muslim support for Labour?
Initially
• We found that the gap in Labour vote between whites and non-whites narrowed between 2001 and 2005.
• Results were presented at PSA 2009.
• The audience found this interesting, but really wanted to know whether the same was true of Muslims.
So
• We tested whether the gap in Labour vote between Muslims and non-Muslims narrowed between 2001 and 2005.

Individual-level model
British Election Study post-election survey (BES)
• Cross-sectional survey carried out after every general election
For subject j in constituency i,
• yij = voted Labour (1) / didn’t vote Labour (0)
• xij = Muslim (1) / non-Muslim (0)
$y_{ij} \sim \mathrm{Bernoulli}(p_{ij}), \qquad \operatorname{logit}(p_{ij}) = \alpha_i + \beta x_{ij}$
• $p_{ij}$ = probability subject j votes Labour
• $\beta$ = log odds ratio of a Muslim voting Labour compared with a non-Muslim
• $\alpha_i$ = area-level random effect
But of the 1,898 subjects with validated data, only 20 are Muslim.
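A minimal Python sketch of the individual-level model on this slide. The names `alpha_i` (area-level random effect) and `beta` (Muslim log odds ratio) are labels chosen here, since the original symbols did not survive in the transcript, and the parameter values are made up for illustration.

```python
import math

def expit(eta):
    """Inverse logit link: e^eta / (1 + e^eta)."""
    return 1.0 / (1.0 + math.exp(-eta))

def vote_probability(alpha_i, beta, x_ij):
    """Probability that subject j in constituency i votes Labour."""
    return expit(alpha_i + beta * x_ij)

def individual_log_lik(alpha_i, beta, records):
    """Bernoulli log-likelihood for one constituency.

    records is a list of (x_ij, y_ij) pairs: Muslim indicator, Labour vote.
    """
    total = 0.0
    for x_ij, y_ij in records:
        p_ij = vote_probability(alpha_i, beta, x_ij)
        total += math.log(p_ij if y_ij == 1 else 1.0 - p_ij)
    return total

# Illustrative (made-up) values: exp(beta) is the Muslim vs non-Muslim odds ratio.
alpha_i, beta = -0.3, 1.2
print(vote_probability(alpha_i, beta, 0), vote_probability(alpha_i, beta, 1))
print(individual_log_lik(alpha_i, beta, [(0, 1), (0, 0), (1, 1)]))
```

With only 20 Muslim respondents, very few records have x_ij = 1, so the survey alone carries little information about `beta`.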

Aggregate data
However, we have data at the aggregate level for the entire population
• 2001 Census data on % who are Muslim
• Number of people who vote Labour in each constituency, from general election results
Data viewed as a 2x2 table. For constituency i:
• yi = number who vote Labour
• ni = number who are eligible to vote
• xi = number who are Muslim

Ecological bias
Standard analysis of these data will probably lead to biased results.
Bias in ecological studies can be caused by:
• Confounding
  • Confounders can be area-level (between-area) or individual-level (within-area)
  • Remedy: include control variables and/or random effects in the model
• Non-linear covariate-outcome relationship, combined with within-area variability of the covariate
  • No bias if the covariate is constant within the area (contextual effect)
  • Bias increases as within-area variability increases
  • … unless models are refined to account for this hidden variability
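The second source of bias can be illustrated numerically. The sketch below assumes a logistic individual-level model with a binary covariate and compares the true area-level probability (an average over individuals) with a naive ecological model that plugs the area proportion straight into the link function. The function names and the values of `alpha` and `beta` are hypothetical.

```python
import math

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def true_area_probability(alpha, beta, xbar):
    """Average of individual probabilities when a fraction xbar has x = 1."""
    return xbar * expit(alpha + beta) + (1.0 - xbar) * expit(alpha)

def naive_area_probability(alpha, beta, xbar):
    """Naive ecological model: apply the link to the area mean of x."""
    return expit(alpha + beta * xbar)

def gap(alpha, beta, xbar):
    return naive_area_probability(alpha, beta, xbar) - true_area_probability(alpha, beta, xbar)

alpha, beta = -0.5, 2.0  # made-up values for illustration only
for xbar in (0.0, 0.1, 0.5, 0.9, 1.0):
    within_var = xbar * (1.0 - xbar)  # within-area variance of a binary covariate
    print(f"xbar={xbar:.1f}  within-area var={within_var:.2f}  gap={gap(alpha, beta, xbar):+.4f}")
```

The gap vanishes when everyone in the area has the same covariate value (xbar of 0 or 1) and, for these values, is largest near xbar = 0.5, where the within-area variance peaks.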

Improving ecological inference
Alleviate bias associated with within-area covariate variability
Data at area level, for constituency i:
• Area-level outcome yi = number of people who vote Labour
• Area-level predictor = proportion who are Muslim
Then yi ~ Binomial(ni, pi)
• where the area-level probability pi is calculated by integrating the individual-level probabilities given by the individual-level model with respect to the within-area joint distribution fi(x) of all individual-level predictors
• $p_i = \int p_{ij}(x)\, f_i(x)\, dx$
pi is the average group-level probability (of voting Labour)
pij(x) is the individual-level probability given covariates x
fi(x) is the distribution of covariate x within area i

The model for a single binary covariate
Consider a single binary covariate x, e.g. Muslim/non-Muslim
fi(x) is the proportion of individuals with x = 1 in each area, i.e. the proportion Muslim in each constituency
Individual-level model
• $p_{ij} = g(\alpha_i + \beta x_{ij})$, where $g(\eta) = e^{\eta} / (1 + e^{\eta})$
• $p_{ij} = g(\alpha_i)$ if person j is non-Muslim
• $p_{ij} = g(\alpha_i + \beta)$ if person j is Muslim
Integrated group-level model
• $p_i = \bar{x}_i \, g(\alpha_i + \beta) + (1 - \bar{x}_i) \, g(\alpha_i)$
  (prob. of being Muslim × prob. Muslim votes Labour + prob. of being non-Muslim × prob. non-Muslim votes Labour)
• $\bar{x}_i$ = proportion Muslim in constituency i (mean of xij)
• pi = average probability (proportion) of voting Labour in area i
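As a rough sketch, the integrated group-level model and the binomial likelihood for the aggregate data could be coded as follows. The names `alpha_i`, `beta` and `prop_muslim` are illustrative labels, and the counts are invented rather than taken from the census or election results.

```python
import math

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def area_probability(alpha_i, beta, prop_muslim):
    """Integrated probability of voting Labour in area i: a mixture of the
    Muslim and non-Muslim individual-level probabilities."""
    return prop_muslim * expit(alpha_i + beta) + (1.0 - prop_muslim) * expit(alpha_i)

def binomial_log_lik(y_i, n_i, p_i):
    """log P(y_i | n_i, p_i) for y_i ~ Binomial(n_i, p_i)."""
    log_choose = math.lgamma(n_i + 1) - math.lgamma(y_i + 1) - math.lgamma(n_i - y_i + 1)
    return log_choose + y_i * math.log(p_i) + (n_i - y_i) * math.log(1.0 - p_i)

# Made-up constituency: 60,000 electors, 21,000 Labour votes, 10% Muslim.
p_i = area_probability(alpha_i=-0.7, beta=1.5, prop_muslim=0.10)
print(p_i, binomial_log_lik(21000, 60000, p_i))
```

Because `area_probability` is built from the individual-level parameters, the aggregate likelihood carries information about the same `beta` that appears in the individual-level model.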

Hierarchical Related Regression
• The parameters of the aggregate model have been derived from an underlying individual-level model
• So the exposure-outcome relationship is assumed to be the same in both the aggregate data and the individual-level data
• This means that the individual and aggregate data can be used simultaneously to make inference on the underlying individual-level model
• The likelihood for the combined data is simply the product of the likelihoods of each set of data
• This combined model is termed a hierarchical related regression (HRR) (Jackson, Best and Richardson, 2006)
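The product-of-likelihoods idea can be sketched in Python as a sum of log-likelihoods sharing the same parameters. This is a simplified illustration, not the authors' implementation: it treats the area effects as given numbers, leaves out the random-effects distribution and any priors, and uses invented data. The names `alphas`, `beta` and the dictionary keys are hypothetical.

```python
import math

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def individual_log_lik(alpha_i, beta, records):
    """Bernoulli log-likelihood of survey records (x_ij, y_ij) in one area."""
    total = 0.0
    for x_ij, y_ij in records:
        p_ij = expit(alpha_i + beta * x_ij)
        total += math.log(p_ij if y_ij == 1 else 1.0 - p_ij)
    return total

def aggregate_log_lik(alpha_i, beta, y_i, n_i, prop_muslim):
    """Binomial log-likelihood of the area count, using the integrated p_i."""
    p_i = prop_muslim * expit(alpha_i + beta) + (1.0 - prop_muslim) * expit(alpha_i)
    log_choose = math.lgamma(n_i + 1) - math.lgamma(y_i + 1) - math.lgamma(n_i - y_i + 1)
    return log_choose + y_i * math.log(p_i) + (n_i - y_i) * math.log(1.0 - p_i)

def hrr_log_lik(alphas, beta, areas):
    """Combined log-likelihood: aggregate part plus individual part, summed
    over areas, with alpha_i and beta shared between the two parts."""
    total = 0.0
    for alpha_i, area in zip(alphas, areas):
        total += aggregate_log_lik(alpha_i, beta, area["y"], area["n"], area["prop_muslim"])
        total += individual_log_lik(alpha_i, beta, area["survey"])
    return total

# Invented data for two constituencies; "survey" holds (x_ij, y_ij) pairs.
areas = [
    {"y": 21000, "n": 60000, "prop_muslim": 0.02, "survey": [(0, 1), (0, 0), (0, 0)]},
    {"y": 30000, "n": 65000, "prop_muslim": 0.25, "survey": [(1, 1), (0, 0), (1, 1), (0, 1)]},
]
alphas = [-0.6, -0.4]
print(hrr_log_lik(alphas, 1.0, areas))
```

An area with no survey respondents still contributes through its aggregate term, which is how the census and election data compensate for the small number of Muslims in the survey.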

Recap
Question of interest
• How do Muslims vote? And did they change their voting behaviour between the 2001 and 2005 general elections?
• i denotes constituency, j denotes subject within a constituency

Proportion of electorate who voted Labour in 2001 and 2005, by constituency

Analyses
To start, various models are fit to the 2001 general election only
• Simple model with only an individual Muslim effect
• Add a contextual effect of Muslim as well as an individual effect
• Add an interaction term
• Include socio-economic status as a confounder
  • Partly motivated by the apparent interaction
  • Socio-economic status coded as manual/non-manual

More than one individual-level binary covariate
For the integrated group-level model, when we have more than one binary covariate we need to know the cross-classification of individuals between covariate categories within each area, e.g. the number of Muslims who have a manual job
Then the average probability of voting Labour in area i is
$p_i = \sum_{x_{ij},\, z_{ij}} p_{ij}(x_{ij}, z_{ij}) \, p(x_{ij}, z_{ij})$
Estimate p(xij, zij) by the proportion in area i with covariates xij, zij
• Census does not contain these cross-classifications
• Estimate by the product of the 2 marginals (Lasserre et al., 2000)
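A short sketch of the product-of-marginals approximation for two binary covariates. It assumes the individual-level model simply adds a second term `gamma * z` for manual occupation. That coefficient name, the function names and the numbers are hypothetical, and the slide does not give the exact model form.

```python
import math
from itertools import product

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def joint_from_marginals(prop_x, prop_z):
    """Approximate the within-area cross-classification of two binary
    covariates by the product of their marginal proportions."""
    return {
        (x, z): (prop_x if x else 1.0 - prop_x) * (prop_z if z else 1.0 - prop_z)
        for x, z in product((0, 1), repeat=2)
    }

def area_probability(alpha_i, beta, gamma, joint):
    """Average the individual-level probabilities over the four cells."""
    return sum(w * expit(alpha_i + beta * x + gamma * z) for (x, z), w in joint.items())

# Made-up area: 10% Muslim, 40% in manual occupations.
joint = joint_from_marginals(prop_x=0.10, prop_z=0.40)
print(joint)
print(area_probability(alpha_i=-0.5, beta=1.0, gamma=0.6, joint=joint))
```

The product form treats the two covariates as independent within each area, which is the price of not having the cross-classified census counts.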

Odds ratio of voting Labour for Muslims = 9.45 (3.20, 19.81)

Comparison of voting behaviour in 2001 and 2005
What we are really interested in is whether Muslims changed their voting behaviour between the 2001 and 2005 general elections
• Individual model for 2001 election
• Individual model for 2005 election

Results – odds ratios

Conclusions
• Muslims are more likely to vote Labour than non-Muslims
• Muslims did significantly change their voting behaviour between 2001 and 2005
  • In 2005 they were less likely to support Labour than in 2001
• We need to find and include more individual-level data on Muslims in our analysis

References
Jackson, C. H., Best, N. G. and Richardson, S. (2006). Improving ecological inference using individual-level data. Statist. Med., 25, 2136-2159.
Lasserre, V., Guihenneuc-Jouyaux, C. and Richardson, S. (2000). Biases in ecological studies: utility of including within-area distribution of confounders. Statist. Med., 19, 45-59.
