
Challenges and strategies when exploiting data on ethnicity from social survey datasets

Paul Lambert, University of Stirling


Presentation Transcript


  1. Challenges and strategies when exploiting data on ethnicity from social survey datasets. Paul Lambert, University of Stirling. Talk presented to the NCRM seminar ‘What is ethnicity? What methods best capture it?’, part of the NCRM series ‘Promoting methodological innovation and capacity building in research on ethnicity’, University of Essex, 14th May 2010. This work draws upon materials from the GEMDE project, a component of DAMES (www.dames.org.uk), an ESRC-funded research Node working on ‘Data Management through e-Social Science’.

  2. Summary of claims • Well known challenges exploiting survey measures of ethnicity ..our response is usually too conservative.. • Better ‘data management’ could/should allow us to get much more from data • Take account of more precise ethnic differences • Longitudinal/cross-national comparisons • Complex multivariate models, interaction effects • We have something to offer here: ‘GEMDE’

  3. …why is working with ethnicity data in surveys so hard…? - It’s sparse - It’s collinear (e.g. to age) - It’s dynamic (cf. comparative research)

  4. Data includes: • Generic & specialist studies collecting ethnic ‘referents’: ‘ethnic identity’; nationality; parents’ nationality; country of birth; language spoken; religion; ‘race’ • National research: • Most countries have evolving standard definitions of ethnic groups, though not all surveys follow them • Some surveys cover large numbers from many/all groups • Most surveys only have sparse representation of most groups • Comparative research (international/longitudinal): • Seen as highly problematic in many fields except immigration studies • Lambert, P.S. (2005). Ethnicity and the Comparative Analysis of Contemporary Survey Data. In J. H. P. Hoffmeyer-Zlotnik & J. Harkness (Eds.), Methodological Aspects in Cross-National Research (pp. 259-277). Mannheim: ZUMA-Nachrichten Spezial 11.

  5. He said that ‘our response is usually too conservative’? I’m not conservative! Social theory is dynamic, fluid, ‘intersectional’, but representative empirical analysis struggles to engage with its terms. Empirical studies are bivariate; descriptive; use low numbers of groups & normalising assumptions. This is ‘conservative’ because.. • Administrative pressure to reify descriptive groups • Analyses simplify, or ignore, rather than incorporate, extra information on ethnic locations (e.g. language, religion) • Analytical results tend to be easily anticipated (basic descriptions, ignoring complex collinear contexts)

  6. 2) Data management for categorical data • Principal social survey datum • Basis of most social research reports/analyses/comparisons • It’s rich and complex • We’re often interested in very fine levels of detail / difference • We usually recode categories in some way for analysis • …how categorical data is managed is of great consequence to the results of analysis… Choices about recoding, boundaries, contrasts made [e.g. RAE analysis: Lambert & Gayle 2009] Management itself influences analytical approaches

  7. EFFNATIS sample (1999): Subjective ethnic identity

  8. UK EFFNATIS survey (1999) [Heckmann et al 2001]; [Penn & Lambert 2009]

  9. A ‘data management’ contribution? • Preserve information on what was done with categorical data • Communicate information on what should/could be done

  10. Standardizing categorical data • ‘Standardization’ refers to treating variables for the purposes of analysis, in order to aid comparison between variables • {In the terminology of survey research analysts} 1. Arithmetic standardization to re-scale metric values [$z_i = (x_i - \bar{x}) / sd$] 2. Ex-ante or Ex-post harmonisation [during data production, or adaptation after the event] 3. Measurement or Meaning/Functional equivalence [Much comparative research flounders on the apparent impossibility of measurement equivalence and lack of options for functional equivalence, e.g. Van Deth, 2003] ‘One size doesn’t fit all so we can’t go on’
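The arithmetic standardisation listed as option 1 above can be sketched in a few lines of Python. This is only an illustration: it assumes that ‘sd’ means the sample standard deviation (the slide does not say), and the function name `standardise` and the example values are hypothetical.

```python
from statistics import mean, stdev

def standardise(values):
    """Return z-scores, (x_i - mean) / sd, for a list of metric values.

    Assumption: sd is the sample standard deviation (n - 1 denominator).
    """
    m = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(x - m) / sd for x in values]

# Hypothetical metric values from a single survey context
incomes = [12.0, 15.0, 18.0, 21.0, 24.0]
z = standardise(incomes)
```

Because the scores are expressed relative to the mean and spread of their own context, values standardised separately within two surveys can be compared as relative positions rather than as raw amounts.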

  11. Meaning equivalence • For categorical data, equivalence for comparisons is often best approached in terms of meaning equivalence (because of non-linear relations between categories and shifting underlying distributions) (even if measurement equivalence seems possible) • Arithmetic standardisation offers a convenient form of meaning equivalence by indicating relative position within the structure defined by the current context • For categorical data, this can be achieved/approximated by scaling categories in one or more dimensions of difference

  12. ‘Effect proportional scaling’ using parents’ occupational advantage

  13. What was that then? • We can represent categories through positions on a scale • In turn, we can use position in the dimension as a category score which then plugs into a further analysis (e.g. regression main and interaction effects) ..Some options for data on ethnicity.. • Stereotyped Ordered Logistic Regression (SOR) models, summarize dimensions of difference according to regression predictor values [e.g. Lambert and Penn, 2001] • Geometric data analysis for distances between people, or things [cf. Prandy, 1979; Bennett et al., 2009] • Assign category scores by hand (a priori or by selected average)
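One reading of ‘assign category scores ... by selected average’ is to score each category by the mean of a reference variable among its members, and then to carry that score into a later model as a single metric term. The sketch below illustrates that reading only; it is not the procedure behind the figures in this talk, and the group labels, the `parent_adv` variable and its values are all invented.

```python
from collections import defaultdict
from statistics import mean

def category_scores(records, category_key, value_key):
    """Score each category by the mean of value_key among its members."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[category_key]].append(rec[value_key])
    return {cat: mean(vals) for cat, vals in grouped.items()}

# Hypothetical survey records: labels and advantage values are invented
records = [
    {"group": "A", "parent_adv": 40.0},
    {"group": "A", "parent_adv": 50.0},
    {"group": "B", "parent_adv": 30.0},
    {"group": "B", "parent_adv": 34.0},
    {"group": "C", "parent_adv": 61.0},
]

scores = category_scores(records, "group", "parent_adv")

# Attach the score to every record so that it can enter a regression
# as one metric predictor instead of a set of dummy variables.
for rec in records:
    rec["group_score"] = scores[rec["group"]]
```

A score built this way uses one parameter where a full set of category dummies would use several, which is the parsimony argument; the cost is that the ordering and spacing of categories are fixed by the chosen reference variable.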

  14. Is scaling useful? ..sometimes.. • Intrinsically revealing as an exploratory exercise • Parsimonious functional form in explanatory modelling • Esp. if ethnicity is a control variable • If interaction effects are considered • If a story of a linear functional form is persuasive (e.g. exponential increase)

  15. Predicting poor subjective health, BHPS w15

  16. What we do and what we ought to do Research applications tend to select a single simplifying collinear categorisation of a concept • Due to coordinated instructions [e.g. Blossfeld et al. 2006] • Due to perceived lack of available alternatives • Due to perceived convenience To make statistical analyses more robust we should… • Operationalise and deploy various scalings and arithmetic measures • Try out various categorisations and explore their distributional properties • … and keep a replicable trail of all these activities..
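Trying out several categorisations and inspecting their distributions can be done with very little code. The following is a minimal sketch under stated assumptions: the detailed codes `g1` to `g5`, the two alternative recodes and the sparseness threshold are all hypothetical, and the `trail` list simply stands in for whatever replicable log a project keeps.

```python
from collections import Counter

# Hypothetical detailed categories as recorded for ten respondents
detailed = ["g1", "g1", "g2", "g3", "g3", "g3", "g4", "g5", "g1", "g3"]

# Two hypothetical alternative recodes of the same detailed variable
recodes = {
    "two_group": {"g1": "P", "g2": "Q", "g3": "P", "g4": "Q", "g5": "Q"},
    "three_group": {"g1": "X", "g2": "Y", "g3": "X", "g4": "Y", "g5": "Z"},
}

def apply_recode(values, mapping):
    """Map each detailed code to its recoded category."""
    return [mapping[v] for v in values]

def sparse_categories(values, min_n):
    """Return the sorted categories that have fewer than min_n cases."""
    counts = Counter(values)
    return sorted(cat for cat, n in counts.items() if n < min_n)

# Keep a record of every categorisation tried, with its distribution
trail = []
for name, mapping in recodes.items():
    recoded = apply_recode(detailed, mapping)
    trail.append({
        "recode": name,
        "counts": dict(Counter(recoded)),
        "sparse": sparse_categories(recoded, min_n=3),
    })
```

Each entry in `trail` records which recode was applied and what it did to the cell sizes, so the choice between categorisations can be revisited later.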

  17. 3) Some contributions from DAMES • 3 themes in DAMES ought, in our perspective, to help here • Replicability / transparency • Plurality of approaches • Ease of access (to off-putting operations)

  18. Replicability / transparency • Document your own recodes • Access somebody else’s recodes • Identify commonly used recodes (& use them..!)
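Documenting a recode so that somebody else can access and reuse it amounts to storing the mapping together with a little metadata. The sketch below is one possible shape for such a record, not a GEMDE or DAMES format: the field names, the variable names and the mapping are hypothetical.

```python
import json

# Hypothetical recode record: every name and category here is invented
recode_record = {
    "name": "example_three_group",
    "source_variable": "ethnic_detailed",
    "target_variable": "ethnic_3",
    "mapping": {"g1": "X", "g2": "Y", "g3": "X", "g4": "Y", "g5": "Z"},
    "note": "Illustrative recode only.",
}

REQUIRED_FIELDS = {"name", "source_variable", "target_variable", "mapping"}

def save_recode(record):
    """Serialise a recode record to JSON text with stable key order."""
    return json.dumps(record, indent=2, sort_keys=True)

def load_recode(text):
    """Parse a recode record and check that the required fields are present."""
    record = json.loads(text)
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"recode record lacks fields: {sorted(missing)}")
    return record

text = save_recode(recode_record)
restored = load_recode(text)
```

A plain-text record of this kind can be kept alongside analysis scripts or deposited in a shared resource, which is what makes a commonly used recode identifiable as such.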

  19. Plurality of approaches • Diminishing excuses for not trying out multiple operationalisations…

  20. Making complex things easier • Organising complex categorical data • Labelling, recoding, etc • Effect proportional scaling • Standardisation • Interaction terms

  21. GESDE: Grid Enabled Specialist Data Environments • Facilities for collecting together, and distributing, specialist data resources • Occupations: GEODE project began 2005 • Education and Ethnicity: GEEDE and GEMDE began Feb. 2008 • Capacity building aims: improving use of measures of these concepts by • improving access to relevant information • providing training / advice on good practice

  22. GEODE: Organising and distributing specialist data resources (on occupations)

  23. The GEODE model for GEMDE? • Occupational Information Resources • Occupational Unit Groups

  24. Our approach to GEMDE • ….A service for MUGs and MIRs… • Define/register ‘Minority Unit Groups’ • Define/register ‘Minority Information Resources’ • Explore data resources and obtain help in approaching analysis of complex, sparse data

  25. What's a MIR? • 'Minority Information Resource'. • This is our own terminology. By a MIR, we mean any piece of information which supplies systematic data on a minority unit group (MUG) classification. We've used this term to be deliberately similar to the phrase 'Occupational Information Resources' that we used on GEODE • E.g. summary statistical data about the categories from and documentation or information • E.g. recodings which have been used in a particular study • Social scientists are not in general aware of the existence of MIRs (cf. wide use of popular Occupational Information Resources). In GEMDE we seek to publicise little-known resources and promote their uptake: we argue that better communication and dissemination of MIRs is in fact an important step towards better scientific practice of replication and standardisation of research. • In our terms, every MIR necessarily links to a MUG (but not every MUG has a MIR).
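The relationship stated in the last bullet (every MIR links to a MUG, but a MUG may have no MIR) can be expressed as a small data model. This is a hypothetical sketch for illustration only: the class names, fields and registry behaviour are assumptions and do not describe how the GEMDE portal is implemented.

```python
from dataclasses import dataclass, field

@dataclass
class MUG:
    """A minority unit group classification (hypothetical structure)."""
    name: str
    categories: list

@dataclass
class MIR:
    """A minority information resource; it must name an existing MUG."""
    title: str
    mug_name: str

@dataclass
class Registry:
    mugs: dict = field(default_factory=dict)
    mirs: list = field(default_factory=list)

    def register_mug(self, mug):
        self.mugs[mug.name] = mug

    def register_mir(self, mir):
        # Enforce the rule that every MIR links to a registered MUG
        if mir.mug_name not in self.mugs:
            raise KeyError(f"unknown MUG: {mir.mug_name}")
        self.mirs.append(mir)

    def mirs_for(self, mug_name):
        """Return the MIRs linked to a MUG (possibly an empty list)."""
        return [m for m in self.mirs if m.mug_name == mug_name]

registry = Registry()
registry.register_mug(MUG("mug_a", ["g1", "g2", "g3"]))
registry.register_mug(MUG("mug_b", ["h1", "h2"]))
registry.register_mir(MIR("Recode used in an example study", "mug_a"))
```

Here `mug_b` is registered without any MIR, which the model allows, while an attempt to register a MIR against an unregistered MUG is rejected.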

  26. The GEMDE prototype: ‘Liferay portal’ with access to MUGs and MIRs • Current facilities • Shibboleth access • Deposit MUGs/MIRs • Search/browse deposited resources • Feedback on resources (user ratings) • Still to come • Additional guest access • Review live data (e.g. pooled LFS records) • Expert and user quality ratings => …development over 2010...

  27. Screenshot here!

  28. Data used • Department for Education and Employment. (1997). Family and Working Lives Survey, 1994-1995 [computer file]. Colchester, Essex: UK Data Archive [distributor], SN: 3704. • Heckmann, F., Penn, R. D., & Schnapper, D. (Eds.). (2001). Effectiveness of National Integration Strategies Towards Second Generation Migrant Youth in a Comparative Perspective - EFFNATIS. Bamberg: European Forum for Migration Studies, University of Bamberg. • Inglehart, R. (2000). World Values Surveys and European Values Surveys 1981-4, 1990-3, 1995-7 [Computer file] (Vol. 2000). Ann Arbor, MI: Institute for Social Research [Producer]; Inter-university Consortium for Political and Social Research [Distributor]. • Li, Y., & Heath, A. F. (2008). Socio-Economic Position and Political Support of Black and Ethnic Minority Groups in the United Kingdom, 1972-2005 [computer file]. 2nd Edition. Colchester, Essex: UK Data Archive [distributor], SN: 5666. • University of Essex, & Institute for Social and Economic Research. (2009). British Household Panel Survey: Waves 1-17, 1991-2008 [computer file], 5th Edition. Colchester, Essex: UK Data Archive [distributor], March 2009, SN 5151.

  29. References • Blossfeld, H. P., Mills, M., & Bernardi, F. (Eds.). (2006). Globalization, Uncertainty and Men's Careers: An International Comparison. Cheltenham: Edward Elgar. • Bennett, T., Savage, M., Silva, E. B., Warde, A., Gayo-Cal, M., Wright, D., et al. (2009). Culture, Class, Distinction. London: Routledge. • Lambert, P. S., & Gayle, V. (2009). Data management and standardisation: A methodological comment on using results from the UK Research Assessment Exercise 2008. Stirling: University of Stirling, Technical paper 2008-3 of the Data Management through e-Social Science research Node (www.dames.org.uk). • Lambert, P. S., & Penn, R. D. (2001). SOR models and Ethnicity data in LIS and LES: Country by Country Report. Syracuse University, Syracuse, New York 13244-1020: Luxembourg Income Study Paper No. 260, Maxwell School of Citizenship and Public Affairs. • Penn, R. D., & Lambert, P. S. (2009). Children of International Migrants in Europe: Comparative Perspectives. Basingstoke: Palgrave. • Prandy, K. (1979). Ethnic discrimination in employment and housing. Ethnic and Racial Studies, 2(1), 66-79. • Simpson, L., & Akinwale, B. (2006). Quantifying Stability and Change in Ethnic Group. Manchester: University of Manchester, CCSR Working Paper 2006-05. • van Deth, J. W. (2003). Using Published Survey Data. In J. A. Harkness, F. J. R. van de Vijver & P. P. Mohler (Eds.), Cross-Cultural Survey Methods (pp. 329-346). New York: Wiley.
