
Data Collection, Harmonisation and Storage (An international perspective)


Presentation Transcript


  1. Data Collection, Harmonisation and Storage (An international perspective) • Jon Johnson (CLS, Senior Database Manager) • CLS is an ESRC Resource Centre based at the Institute of Education

  2. Contents • Introduction • Survey Data ‘production line’ • Data Management Compared • National Longitudinal Surveys • PSID and HRS (USA) • MCS, NCDS and BCS70 (UK) • LISS Panel (Netherlands) • Management strategies compared • Storage, maintenance and output • Meta Data Standards • New Requirements

  3. Introduction • In November 2008 CLS (MCS, NCDS, BCS70) and ULSC (BHPS, Understanding Society) were commissioned by the ESRC, as part of Objective 5 of the Survey Resources Network, to: • Examine potential efficiencies in data management processes, particularly in relation to data management software; • Examine the use of cutting-edge data collection methods for longitudinal surveys carried out at CLS/ULSC. • A wide-ranging review of the Survey Data Process was completed and submitted to the ESRC in November 2009. www.cls.ioe.ac.uk

  4. Survey Data ‘production line’

  5. Data Management Compared • Studies use various strategies to cope with the complex data flows of survey collection, management and dissemination: • Highly Integrated: National Longitudinal Surveys (USA) • Partnership: PSID and HRS (USA) • Contracted: MCS, NCDS, BCS70, BHPS and USoc (UK) • Loosely Integrated: LISS Panel (Netherlands) • Final report will be available from http://surveynet.ac.uk/sms/introduction.asp

  6. National Longitudinal Surveys (USA) • Over more than two decades the NLS has developed in-house software to capture the survey. • More recently they have integrated this into a turnkey solution where the storage of the survey is itself a mirror of the data collection instrument. • Based on a highly normalised Oracle database, a snapshot of the data is auto-processed and made available to researchers on a “create your own dataset” basis, and then turned into standard flat datasets for use by researchers. • Ref: http://www.chrr.ohio-state.edu/
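The step from a highly normalised store to a flat, researcher-chosen dataset can be illustrated with a minimal sketch. It is not the NLS implementation: it uses an in-memory SQLite database in place of Oracle, and the table name `response`, its columns, the variable names and the function `build_flat_dataset` are all hypothetical.

```python
import sqlite3

def build_flat_dataset(conn, variables):
    """Pivot long (respondent, variable, value) rows into one flat row per respondent."""
    placeholders = ",".join("?" for _ in variables)
    rows = conn.execute(
        "SELECT respondent_id, variable, value FROM response "
        f"WHERE variable IN ({placeholders})",
        list(variables),
    ).fetchall()
    flat = {}
    for respondent_id, variable, value in rows:
        # Start each respondent with every requested variable set to missing (None).
        row = flat.setdefault(
            respondent_id,
            {"respondent_id": respondent_id, **{v: None for v in variables}},
        )
        row[variable] = value
    return [flat[rid] for rid in sorted(flat)]

# Hypothetical normalised storage: one row per respondent per variable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE response (respondent_id INTEGER, variable TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO response VALUES (?, ?, ?)",
    [
        (1, "sex", "F"), (1, "income_w1", "21000"), (1, "income_w2", "23500"),
        (2, "sex", "M"), (2, "income_w1", "18000"),
    ],
)

# A researcher picks only the variables they need.
flat_rows = build_flat_dataset(conn, ["sex", "income_w2"])
for row in flat_rows:
    print(row)
```

Respondent 2 has no `income_w2` value, so the flat row carries `None` for it; a real system would need an explicit missing-value convention.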

  7. PSID and HRS (USA) • Both the Panel Study of Income Dynamics (PSID) and the Health and Retirement Study (HRS) utilise the in-house resources of the Survey Research Centre, which provides survey data collection resources primarily to studies based at the University of Michigan. • Survey instrument design is closely linked both to the PI and data management teams, using Blaise for data collection. • Data is prepared internally using SAS and processed into packaged datasets for download from PSID and also from ICPSR. • Ref: http://psidonline.isr.umich.edu/ and http://hrsonline.isr.umich.edu/

  8. MCS, NCDS and BCS70 (UK) • CLS is responsible for specification of the instruments and data output, which is implemented by a third party survey organisation. • Data is further processed within CLS using SIR and provided to researchers as packaged datasets for download from the ESDS Data Archive. • Meta-data is harvested from the CAI instrumentation and held in an SQL database for generation of HTML web pages directly from DDI 2.0 XML. • Ref: http://www.cls.ioe.ac.uk and http://www.cls.ioe.ac.uk/datadictionary
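As a rough illustration of generating web pages from DDI codebook XML, the sketch below turns a small hand-written fragment into HTML. The fragment uses DDI Codebook-style element names (`codeBook`, `dataDscr`, `var`, `labl`, `catgry`, `catValu`) but is heavily simplified; the variable `SEX`, its labels and the function `codebook_to_html` are invented for the example and do not describe the CLS data dictionary.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified, hand-written DDI Codebook-style fragment (assumed, not real CLS metadata).
SAMPLE = """<codeBook><dataDscr>
<var name="SEX"><labl>Sex of cohort member</labl>
<catgry><catValu>1</catValu><labl>Male</labl></catgry>
<catgry><catValu>2</catValu><labl>Female</labl></catgry>
</var>
</dataDscr></codeBook>"""

def codebook_to_html(xml_text):
    """Render each <var> as a heading, its label, and a list of category codes."""
    root = ET.fromstring(xml_text)
    parts = []
    for var in root.iter("var"):
        name = escape(var.get("name", ""))
        label = escape(var.findtext("labl", default="").strip())
        parts.append(f"<h2>{name}</h2><p>{label}</p><ul>")
        for cat in var.findall("catgry"):
            value = escape(cat.findtext("catValu", default="").strip())
            text = escape(cat.findtext("labl", default="").strip())
            parts.append(f"<li>{value}: {text}</li>")
        parts.append("</ul>")
    return "\n".join(parts)

page = codebook_to_html(SAMPLE)
print(page)
```

Real DDI instances carry far more structure (question text, universes, file descriptions), so a production generator would more likely use XSLT or a dedicated DDI library.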

  9. LISS Panel (Netherlands) • The LISS Panel is primarily a web-based survey, which uses a layer over Blaise with a dedicated survey instrument programming section closely linked to the survey design team. • Data is produced from Blaise, managed in SPSS and provided as prepared datasets for researchers to download from LISS. • A separate SQL metadata database, based on DDI 3.0, is used to provide navigation and generate the codebook etc. • Ref: http://www.lissdata.nl/lissdata/Homec

  10. Management strategies compared • All studies face the same challenges: • Complex data • Data description handling • Management of meta-data • Myriad audiences • Longitudinal consistency • Resource constraints • Re-purposing of data

  11. All in one basket approach • NLS • NHANES

  12. Data and Meta-data separated • LISS / PSID / HRS • MCS / NCDS / BCS / BHPS / USoc

  13. Storage, maintenance, output • Cleaning your data • Cohort data continually evolves • 2-3% of people mis-report sex • Interviewers mis-key data • Data entry clerks mis-key data • Respondents mis-understand questions • Outputting and deriving data • Synchronizing changes, derivations and internal consistency (e.g. geographical identifiers), and outputting in the best format for research, is a function best done by DB staff
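One simple cleaning check suggested by the points above is to compare a supposedly fixed characteristic, such as sex, across sweeps and flag respondents whose reports disagree. The sketch below assumes records arrive as `(person_id, wave, sex)` tuples; that layout, the sample values and the function name `flag_inconsistent_sex` are hypothetical.

```python
def flag_inconsistent_sex(records):
    """Return {person_id: sorted (wave, sex) reports} for people whose reports differ."""
    by_person = {}
    for person_id, wave, sex in records:
        by_person.setdefault(person_id, []).append((wave, sex))
    flagged = {}
    for person_id, reports in by_person.items():
        # More than one distinct value across waves means a mis-report or mis-key somewhere.
        if len({sex for _, sex in reports}) > 1:
            flagged[person_id] = sorted(reports)
    return flagged

# Invented sample: person 102 is recorded as "M" in wave 2 only.
records = [
    (101, 1, "M"), (101, 2, "M"),
    (102, 1, "F"), (102, 2, "M"), (102, 3, "F"),
]
flagged = flag_inconsistent_sex(records)
print(flagged)  # {102: [(1, 'F'), (2, 'M'), (3, 'F')]}
```

The check only flags cases for review; deciding which report is correct still needs other evidence, such as names or earlier administrative records.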

  14. Meta Data Standards • The Data Documentation Initiative (DDI) has emerged as the front runner to form the basis of an international standard • Existing foothold is limited • Lacks sufficient support for longitudinal studies • Provides at least a minimum of data which would enable international cross-cohort data discovery • Can we establish a ‘Dublin Core’ for longitudinal / birth cohort surveys?

  15. New Requirements • Video / audio • Genetics • Web capture e.g. social networks • Paper Archives • Record Linkage • Biological measures • Data security (ISO 27001) • Disclosure control

  16. Any questions? • Institute of Education, University of London, 20 Bedford Way, London WC1H 0AL • Tel +44 (0)20 7612 6000 • Fax +44 (0)20 7612 6126 • Email info@ioe.ac.uk • Web www.ioe.ac.uk
