
Review and consultation: Next steps in supporting data on ethnicity

Review and consultation: Next steps in supporting data on ethnicity. DAMES workshop on ‘Data on ethnicity in social survey research’, 28th January 2010, University of Stirling. Some preliminary comments: E-Social Science; Challenges/principles; Ethnicity research agendas





Presentation Transcript


  1. Review and consultation: Next steps in supporting data on ethnicity DAMES workshop on ‘Data on ethnicity in social survey research’, 28th January 2010, University of Stirling

  2. Some preliminary comments: • E-Social Science • Challenges/principles • Ethnicity research agendas • Further comments/discussions/questions

  3. i) What makes this ‘e-Social Science’? Attention to data management in the context of: • Standards setting • Metadata • Portal framework: Liferay portal to various DAMES resources; iRODS system for ‘GE*DE’ specialist data; controlled data access under security limits • Use of workflows

  4. ‘Data Management’: ‘the tasks associated with linking related data resources, with coding and re-coding data in a consistent manner, and with accessing related data resources and combining them within the process of analysis’ […the DAMES Node..] • Usually performed by social scientists (post-release) • Most overt in quantitative survey data analysis • Usually a substantial component of the work process • Here we differentiate it from archiving / controlling the data itself

  5. ‘Data Management through e-Social Science’ • DAMES – www.dames.org.uk • ESRC Node funded 2008-2011 • Aim: Useful social science provisions • Specialist data topics – occupations; educational qualifications; ethnicity; social care; health • Mainstream packages and accessible resources • Engage with existing provisions (e.g. ESDS; CESSDA) • Programme of case studies and provisions – more later

  6. ‘The significance of data management for social survey research’ • Data management is a major component of the social survey research workload • Pre-release manipulations performed by distributors / archivists • Coding measures into standard categories; Dealing with missing records • Post-release manipulations performed by researchers • Re-coding measures into simple categories • All serious researchers perform extended post-release management (and have the scars to show for it) • We do have existing tools, facilities and expert experience to help us…but we don’t make a good job of using them efficiently or consistently • So the ‘significance’ of DM is about how much better research might be if we did things more effectively…

  7. Data Management through e-Social Science: www.dames.org.uk

  8. E.g. of GEODE: Organising and distributing specialist data resources (on occupations)

  9. Challenges/principles: Data manipulation skills and inertia • I would speculate that around 80% of applications using key variables don’t consult the literature and evaluate alternative measures, but choose the first convenient and/or accessible variable in the dataset • Data supply decisions (‘what is on the archive version’) are critical • Much of the explanation lies with a lack of confidence in data manipulation / linking data • Too many under-used resources – cf. www.esds.ac.uk

  10. Software issues • Stata seems to be the superior package for secondary survey data analysis: • Advanced data management and data analysis functionality • Supports easy evaluation of alternative measures (e.g. est store) • Culture of transparency of programming/data manipulation • Problems… • Not available to all users • Not easily incorporated in generic services

  11. Variables and functional form. Functional form = the way in which measures are arithmetically incorporated in quantitative analysis • With occupations, education, ethnicity, and elsewhere, we tend to be too willing to make simplifying categorisations • Multiple categorisations are possible • As are scaling approaches – better suited to complex analytical procedures
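To make the point about functional form concrete, the sketch below enters one categorical measure into an analysis in three different ways: as a dichotomy, as a set of dummy variables, and as a scale score. It is only an illustration; the category names (`cat_a`, `cat_b`, `cat_c`) and the score values are hypothetical placeholders, not taken from any real classification or scaling scheme.

```python
# Hypothetical categories and scale scores, for illustration only.
CATEGORIES = ["cat_a", "cat_b", "cat_c"]
SCORES = {"cat_a": 30.0, "cat_b": 45.0, "cat_c": 62.5}


def as_dichotomy(value, reference="cat_a"):
    """Simplest functional form: reference category versus everything else."""
    return 0 if value == reference else 1


def as_dummies(value, categories=CATEGORIES):
    """One 0/1 indicator per category, dropping the first as the reference."""
    return [1 if value == c else 0 for c in categories[1:]]


def as_score(value, scores=SCORES):
    """Scaling approach: replace the category with a single metric score."""
    return scores[value]


# The same observed value under each functional form.
observed = "cat_b"
forms = {
    "dichotomy": as_dichotomy(observed),
    "dummies": as_dummies(observed),
    "score": as_score(observed),
}
```

The dichotomy discards the distinction between `cat_b` and `cat_c`, the dummies keep it at the cost of extra parameters, and the score keeps it in a single term, which is the trade-off the slide refers to.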

  12. Good habits: Keep clear records of DM activities • Reproducible (for self) • Replicable (for all) • Paper trail for whole lifecycle (cf. Dale 2006; Freese 2007) • In survey research, this means using clearly annotated syntax files (e.g. SPSS/Stata) • Syntax examples: www.longitudinal.stir.ac.uk
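One way to picture a paper trail is a script in which every data management step is named and its effect recorded. The sketch below is a minimal illustration in Python rather than SPSS or Stata syntax; the helper `apply_step`, the variable names and the example records are all hypothetical.

```python
import json


def apply_step(rows, description, func, log):
    """Run one data management step and record what it did to the case count."""
    result = func(rows)
    log.append({"step": description, "rows_in": len(rows), "rows_out": len(result)})
    return result


# Hypothetical survey extract: one record per respondent.
rows = [
    {"pid": 1, "age": 34},
    {"pid": 2, "age": None},
    {"pid": 3, "age": 17},
]

log = []
rows = apply_step(
    rows,
    "drop cases with missing age",
    lambda rs: [r for r in rs if r["age"] is not None],
    log,
)
rows = apply_step(
    rows,
    "keep adults (age >= 18)",
    lambda rs: [r for r in rs if r["age"] >= 18],
    log,
)

# The log can be saved alongside the output file as a record of the steps taken.
trail = json.dumps(log, indent=2)
```

Because each step is a named function call in a script, the same sequence can be re-run by the original researcher (reproducible) or by anyone else with the input file (replicable).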

  13. Principle: Use existing standards and previous research • Variable operationalisations: Use recognised recodes / standard classifications • NSI harmonisation standards (e.g. ONS) • Cross-national standards [Hoffmeyer-Zlotnik & Wolf 2003; Harkness et al. 2005; Jowell et al. 2007] • Research reviews [e.g. Shaw et al. 2007] • Common vs. best practices (e.g. dichotomisations) • Use reproducible recodes / classifications (paper trail) • Other data file manipulations • Missing data treatments • Matching data files (finding the right data)
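A reproducible recode can be sketched as a lookup table that is kept separate from the code applying it, so the mapping itself can be documented, cited and reused. The example below is an assumption-laden illustration: the detailed codes, the broad groups `A` and `B`, and the missing-value codes are invented placeholders, and a real recode would take its mapping from a published harmonisation standard.

```python
# Hypothetical mapping from detailed source codes to broad groups.
RECODE = {11: "A", 12: "A", 21: "B", 22: "B"}

# Hypothetical codes that the source file uses for missing answers.
MISSING_CODES = {-9, -8}


def recode(values, mapping, missing_codes):
    """Recode values with a lookup table and report any codes it does not cover."""
    recoded, unmapped = [], []
    for v in values:
        if v in missing_codes:
            recoded.append(None)      # declared missing in the source
        elif v in mapping:
            recoded.append(mapping[v])
        else:
            recoded.append(None)      # not covered by the lookup table
            unmapped.append(v)
    return recoded, sorted(set(unmapped))


groups, unmapped_codes = recode([11, 12, 21, -9, 99], RECODE, MISSING_CODES)
```

Returning the unmapped codes rather than silently dropping them keeps the decision about unexpected values visible in the paper trail.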

  14. Principle: Do something, not nothing • We currently put much more effort into data collection and data analysis, and neglect data manipulation • Survey research – the influence of ‘what was on the archive version’ …In my experience, a common reason people did not do more DM was that they were frightened to…

  15. Principle: Support linking data. Complex data (complex research) is distributed across different files. In surveys, use key linking variables for... • One-to-one matching SPSS: match files /file="file1.sav" /file="file2.sav" /by=pid. Stata: merge pid using file2.dta • One-to-many matching (‘table distribution’) SPSS: match files /file="file1.sav" /table="file2.sav" /by=pid. Stata: merge pid using file2.dta • Many-to-one matching (‘aggregation’) SPSS: aggregate outfile="file3.sav" /break=pid /meaninc=mean(income). Stata: collapse (mean) meaninc=income, by(pid) • Many-to-many matches • Related cases matching
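The three main matching operations on this slide can also be sketched outside SPSS and Stata. The plain-Python example below is an illustration under stated assumptions: the files are replaced by small in-memory lists, the variables `hid`, `trust` and `region` and the helper functions are hypothetical, and the aggregation is by household rather than by `pid` so that each group has more than one case.

```python
from collections import defaultdict

# Hypothetical person-level files and a household-level lookup file.
persons = [
    {"pid": 1, "hid": 10, "income": 200},
    {"pid": 2, "hid": 10, "income": 300},
    {"pid": 3, "hid": 11, "income": 500},
]
attitudes = [{"pid": 1, "trust": 4}, {"pid": 2, "trust": 2}, {"pid": 3, "trust": 5}]
households = [{"hid": 10, "region": "Scotland"}, {"hid": 11, "region": "Wales"}]


def index_unique(rows, key):
    """Index rows by key, refusing duplicate keys so a bad match fails loudly."""
    index = {}
    for row in rows:
        if row[key] in index:
            raise ValueError(f"duplicate {key}={row[key]!r}")
        index[row[key]] = row
    return index


def match(master, lookup, key):
    """Attach lookup columns to each master row; unmatched rows are kept unchanged."""
    by_key = index_unique(lookup, key)
    return [{**row, **by_key.get(row[key], {})} for row in master]


def aggregate_mean(rows, key, value, out_name):
    """Collapse rows to one record per key, holding the mean of value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return [{key: k, out_name: sum(v) / len(v)} for k, v in groups.items()]


# One-to-one: the key must be unique in both files, so check the master too.
index_unique(persons, "pid")
one_to_one = match(persons, attitudes, "pid")

# One-to-many ('table distribution'): each household row is spread over its members.
one_to_many = match(persons, households, "hid")

# Many-to-one ('aggregation'): mean income per household.
household_means = aggregate_mean(persons, "hid", "income", "meaninc")
```

Checking key uniqueness before matching is a design choice rather than a requirement: it turns a silently wrong merge into an immediate error.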

  16. Challenges: Agreeing about variable constructions • Unresolved debates about optimal measures and variables • Especially in comparative research, e.g. across time or between countries • In DAMES, we have particular interests in: • Longitudinal comparability (http://www.longitudinal.stir.ac.uk/variables/) • Scaling / scoring categories to achieve ‘meaning equivalence’ or ‘specific measures’

  17. Challenges: Incentivising documentation / replicability • There is little to press researchers to document DM better, but much to press them not to • Make DM and its documentation easier? • Reward documentation (e.g. citations)?

  18. iii) Ethnicity research agendas. Our impression: • More data on more referents • Controlled access to data • Increasing recognition of intergenerational change • Mixed identities • Other views…?

  19. Further comments/ discussion/ questions • …..
