
Prof K. Schürer: Director, UK Data Archive; President, CESSDA; Project Co-ordinator, EC FP7/ESFRI PPP

Social Science: Implementation and Management of a National Initiative - the example of the UK Data Archive. Prof K. Schürer: Director, UK Data Archive; President, CESSDA; Project Co-ordinator, EC FP7/ESFRI PPP. 1st African Digital Curation Conference, Pretoria, 12-13 February 2008.


Presentation Transcript


  1. Social Science: Implementation and Management of a National Initiative - the example of the UK Data Archive. Prof K. Schürer: Director, UK Data Archive; President, CESSDA; Project Co-ordinator, EC FP7/ESFRI PPP. 1st African Digital Curation Conference, Pretoria, 12-13 February 2008

  2. UKDA history & overview • Archive established in 1968 (as ‘Data Bank’) • Funded by (then) SSRC to provide a service to UK HE sector • Initial focus on government survey data • New distributed service established 1 Jan. 2003 • Economic and Social Data Service (ESDS) • Mixed data types and formats • Specialist Qualidata unit and History Data Service • Still predominantly funded to provide service for HE/FE sectors • ESRC, JISC, University of Essex • Project funding (EC, JISC, MRC, AHRC, etc.) • Since 2005 designated as ‘Place of Deposit’ by TNA • c. 60 staff – mixture of staff expertise/skills

  3. UKDA now runs a family of services • ESDS • Census Programme Registration Service and Portal • History Data Service • HistPop • Rural Environment and Land Use Data Support Service • Nesstar Support Service Plus R&D projects: • SToRE • DeXT • CESSDA-PPP …

  4. Specialist data services • ESDS Government • ESDS International • ESDS Longitudinal • ESDS Qualidata Greater emphasis on: • value-added data and documentation • enhanced resource discovery • improved delivery services • support and training for the secondary use of data for research, learning and teaching • outreach and promotion

  5. Data RIs can work • 5,000+ datasets in the collection • 250+ new datasets (and new versions) are added each year • c. 50,000 orders for data per year • c. 115,000 online sessions • c. 48,000 registered users • c. 17,000,000 page hits and c. 950,000 PDF downloads last year • Gateway to other collections (CESSDA, ICPSR)

  6. Who produces social science data? • Government agencies • HE/FE sector • Private sector • Within HE/FE, not just ESRC-funded work • MRC, NERC, AHRC, Wellcome, Leverhulme, Rowntree • Increasing number of large digitisation projects • JISC, NOF • Increasing tendency for government agencies to contract out survey work to private sector (NatCen) • University sector tends not to get Government contracts • Devolution • Local Government

  7. Types of data • Quant and Quali • Surveys • Censuses • Administrative data • Also increasing amounts of ‘non-survey’ type data • Images • Sound • Video • Mixed media

  8. Selection guided by ‘use’ & ‘users’ • UKDA seeks to identify and acquire material within broad areas • Discipline coverage: at the broadest level, data and other electronic resources relating to society, in particular data about individuals or groups of individuals. This includes strategic social science and economic datasets, e.g. unemployment statistics, major household surveys. • Geographic coverage: data across a broad geographic range, focusing on the United Kingdom and cross-national datasets but including material from other countries where appropriate, in particular where these provide opportunities for comparative research, e.g. European data. • Temporal coverage: there are no restrictions on temporal coverage, although pre-1945 accessions are acquired through the History Data Service. • Time series and panel data: data are sought which create or add to a time series and/or panel survey. • Thematic coverage: to create a coherent body of materials relating to a particular discipline or field of enquiry, e.g. health.

  9. But do you need to keep everything? • Curation needs to be informed by selection and appraisal • Resource allocation • Rights issues / technical issues • Short term vs long term • Aggregative value • Need to estimate the costs of NOT preserving something

  10. ESRC award holders • Have been required to offer data for access/curation for over 20 years (more recently for MRC) • Some reluctance • Need to work with carrots and sticks • Move to data management life-cycle approach • Move to self-archiving

  11. Preservation in outline • Standard directory structure for complete dataset • Everything in one place • Consistent structure makes precisely locating information easy • Makes caching of specific information types simple • Makes future migration to other systems and formats easier • Data and documentation stored in portable format • Ability to freely and intelligently read on many platforms • Easier conversion to required format • Easier migration to new portable format

  12. Note and Read files • Study Number: data format files (SPSS exp, SAS, SIR); original deposited format • mrdoc: machine-readable document files (PDF, Word, ASCII) • Processing information and control files
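The standard per-study layout outlined in slides 11 and 12 could be sketched roughly as follows. This is an illustration under stated assumptions, not the UKDA's actual convention: only the `mrdoc` folder name comes from the slides, while the other subdirectory names, the `note.txt` and `read.txt` file names, the function `create_study_layout`, and the study number `1234` are hypothetical.

```python
import tempfile
from pathlib import Path

# Only "mrdoc" is named in the slides; every other name here is an assumption.
STUDY_SUBDIRS = {
    "data": "data format files (e.g. SPSS, SAS, SIR)",
    "original": "files in the original deposited format",
    "mrdoc": "machine-readable documentation (PDF, Word, ASCII)",
    "processing": "processing information and control files",
}


def create_study_layout(root: Path, study_number: int) -> Path:
    """Create one directory per study, with the same subdirectories every time."""
    study_dir = root / str(study_number)
    for name in STUDY_SUBDIRS:
        (study_dir / name).mkdir(parents=True, exist_ok=True)
    # Hypothetical placeholders for the "Note" and "Read" files.
    (study_dir / "note.txt").touch()
    (study_dir / "read.txt").touch()
    return study_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        study = create_study_layout(Path(tmp), 1234)
        for entry in sorted(study.iterdir()):
            kind = "dir " if entry.is_dir() else "file"
            print(kind, entry.name)
```

Because every study shares the same structure, a script can locate a given type of information (for example, all documentation under `mrdoc`) without per-study logic, which is the benefit slide 11 attributes to a consistent layout.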

  13. Media • Paper based • Punched card, paper tape or manuscripts • Magnetic • Various reel to reel and cartridge based (QIC) • Optical • e.g. WORM, CD-ROM • Storage environment, age and quality of original material • Rescue methods and services

  14. Multi-copies, multi-formats, multi-media, multi-places • Two copies on separate media in main system • Up to 10 different versions of each individual file in the shadow area • Read-only CD-ROM copy with error checking • Complete off-site near-line copy of all data with high-level security protection • Tape monitoring and refresh strategy • Front end copy to reduce load on main system
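One way to check that several copies of a file still agree is to compare checksums, sketched below. The slides do not say which error-checking method the archive uses; SHA-256, the function names `file_digest` and `find_divergent_copies`, and the file names in the demo are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_divergent_copies(reference: Path, copies: List[Path]) -> List[Path]:
    """Return the copies whose content differs from the reference file."""
    expected = file_digest(reference)
    return [copy for copy in copies if file_digest(copy) != expected]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        main_copy = base / "main.dat"       # hypothetical main-system copy
        offsite_copy = base / "offsite.dat" # hypothetical off-site copy
        damaged_copy = base / "cdrom.dat"   # hypothetical copy with a flipped byte
        main_copy.write_bytes(b"survey data")
        offsite_copy.write_bytes(b"survey data")
        damaged_copy.write_bytes(b"survey dat4")
        for bad in find_divergent_copies(main_copy, [offsite_copy, damaged_copy]):
            print("copy differs from reference:", bad.name)
```

A check like this only detects divergence; deciding which copy is correct still needs a trusted reference, such as a digest recorded when the data were first ingested.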

  15. Standards & Security • BS7799 - Information security • Machine room conforms to main fire and environmental control standards • Conforms to BS5588 parts 3 and 9, BS5839 parts 1, 2 and 3, BS5306 part 4 and BS7083 • Conformity to BS6266, BS4783 parts 4, 5 and 7 • BS5454 store room

  16. But – curation and long-term preservation should not (cannot?) happen in isolation

  17. Challenges • Legal issues • Move toward greater openness and transparency • Freedom of Information Act, 2000 • Yet greater concerns with confidentiality (Data Protection Act, 1998) • Statistics Act (approved researcher status) • Data issues • Confidentiality – secure data service • Ethical issues (RECs) • ‘grey’ data/publications (research outputs)

  18. Challenges #2 • Technical issues • e-Science and data grid • Institutional repositories • ‘Self-archiving’ (UKDA-Store) • Facebook/Google generation • Political issues • RCUK, OECD statements on research outputs • Resource issues • Who pays?

  19. General Aims of the CESSDA Research Infrastructure PPP • The focus of this project will be a major upgrade of the CESSDA RI to ensure that European Social Science and Humanities (SSH) researchers have access to, and gain support for, data resources they require to conduct research of the highest quality, irrespective of the location of either researcher or data within the European Research Area (ERA). • The project will also improve the CESSDA RI so that member organisations are able to transcend the limitations of their national resources through the creation of a common platform, mission and stronger form of integration in which expertise is genuinely pooled, shared and applied in a co-ordinated pan-European experience. • This project will facilitate the delivery of a fully-integrated data archive infrastructure for the SSH, allowing seamless, permanent access to as many data holdings across Europe as possible.

  20. Current Project Partners: ADP, Slovenia; ADPSS, Italy; CIS, Spain; CNRS-RQ, France; DANS, Netherlands; DDA, Denmark; DISC, Sweden; EKKE, Greece; FORS, Switzerland; FSD, Finland; GESIS, Germany; NSD, Norway; RODA, Romania; SDA, Czech Republic; TARKI, Hungary; UK Data Archive, United Kingdom; WISDOM, Austria. Other CESSDA members: CEPS/INSTEAD, Luxembourg; ESSDA, Estonia; ISSDA, Ireland; Belgium; Portugal. [Map showing countries which are CESSDA members]

  21. Associated Partners: • Europe • LSZDA (Latvian Databank of Social Sciences) • Academy of Sciences, Riga • DBSR (Bank of Social Data) • Institute of Sociology, Academy of Sciences • Russian Sociological Data Archive • Social Science Data Archive at REGLO • Slovak Archive of Social Data • Archive of the Institute for Sociology at the Slovak Academy of Sciences • Archive of Sociological Data, Warsaw • Rudjer Boskovic Institute, University of Zagreb • University of Belgrade, Serbia • Institute of Sociology of the National Academy of Sciences of Ukraine • Kyiv National Taras Shevchenko University, Ukraine • KIIS (Kiev International Institute of Sociology), Databank • CPIJM, Centre for Political Studies and Public Opinion Research • University of Ss. Cyril and Methodius • North America • ICPSR, Inter-university Consortium for Political and Social Research, Michigan, USA • SSHRC, Canada • Australia • ASSDA, Australian Social Science Data Archive, Canberra

  22. Work Package Descriptions and their Beneficiaries • WP1: Management and Co-ordination (UKDA) • WP2: Dissemination Management (UKDA) • WP3: Defining the Strategic, Financial, Governance and Legal Framework (UKDA) • WP4: Controlled Vocabularies (FSD)

  23. Work Package Descriptions and their Beneficiaries • WP5: Developing the CESSDA RI one-stop-shop Portal (NSD) • WP6: Strengthening the CESSDA RI (RODA) • WP7: Widening the CESSDA RI (GESIS) • WP8: Enhancement of data and metadata infrastructures for the CESSDA RI (GESIS)

  24. Work Package Descriptions and their Beneficiaries • WP9: Deepening the CESSDA RI by building an infrastructure for content harmonisation and conversion (GESIS) • WP10: Data collection, dissemination and access issues (CNRS-RQ) • WP11: Investigating the potential of grid technologies (UKDA) • WP12: Technical Support for the Preparatory Phase (NSD)

  25. [Diagram of how the work packages relate: WP1 (Management and Co-ordination), WP2 (Dissemination), WP3 to WP11, WP12 (Technical), and WP13 & WP14 (External)]

  26. Thank you
