
UKOLN is supported by:

Building a Data Repository to Meet an Institution’s Needs. University of Bath – JISC – Research360. blogs.bath.ac.uk/research360. Catherine Pink, Data Scientist, UKOLN. Open Repositories 2012.





Presentation Transcript


  1. Building a Data Repository to Meet an Institution’s Needs. University of Bath – JISC – Research360 (blogs.bath.ac.uk/research360). Catherine Pink, Data Scientist, UKOLN. Open Repositories 2012.

  2. Existing Research Infrastructure
     • Applied science and engineering research focus
     • ‘Small Science’ in collaboration with industry
     • Publications Repository – ‘Opus’
       • EPrints, >28,000 journal papers & theses
     • Research Information Management (CRIS)
       • Pure
       • Links finance, publications, HR and postgraduate student databases
     • File storage for current research data
       • Unstructured, short term storage
       • Data not accessible by third parties

  3. Why build an Institutional Data Repository?
     • Demand for access to publicly funded research
       • “Science as an open enterprise” – Royal Society 2012
       • “Innovation and Research Strategy for Growth” – UK government 2011
     • UK funding councils’ data policy
       • RCUK Common Principles on Data Policy
       • EPSRC expectations for data (compliance by May 2015)
     • Ability to respond to UK HEI assessment exercises
       • Research Excellence Framework (REF) 2014 and 2020
       • Linking funding with research outputs and impact

  4. Enable researchers to comply with data policies: Funder, publisher & institutional
     • Requires archive of data
       • Publication in data journals
       • Deposit in a disciplinary data repository
       • Publish on researchers’ own websites
     • ‘Bridge the gap’ where no other data repository is suitable
     • At Bath: Evaluate a range of data repository options
       • Modify existing EPrints or Pure? Use an external solution, e.g. Dataflow? Build a bespoke new system?
       • Plan for long term needs or short term solutions?

  5. Manage the University’s research data assets
     • Link research inputs to research outputs
     • Link publications to supporting data
     • At Bath: Embed data repository in existing infrastructure
       • Integrate with CRIS & publications repository
       • As the CRIS develops, will the data repository be superseded?
     • Maintain a register of research data held elsewhere
     • Maintain a register of non-digital research data
     • At Bath: Enable deposit of metadata stubs
       • Can we capture metadata from external data repositories? (Metadata crosswalk)
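
The metadata crosswalk mentioned on this slide can be pictured as a field-by-field mapping from an external repository's record onto a local metadata stub. The following is a minimal sketch only: the field names on both sides, the `CROSSWALK` table and the `to_metadata_stub` helper are hypothetical and do not describe the schema Bath actually adopted.

```python
# Hypothetical crosswalk: external field name -> local stub field name.
# Neither side reflects a real schema; both are illustrative assumptions.
CROSSWALK = {
    "dc:title": "title",
    "dc:creator": "creators",
    "dc:identifier": "external_identifier",
    "dc:publisher": "held_at",
}


def to_metadata_stub(external_record: dict) -> dict:
    """Map an external record onto a local stub for data held elsewhere."""
    stub = {"data_held_locally": False}
    unmapped = {}
    for field, value in external_record.items():
        local_name = CROSSWALK.get(field)
        if local_name is None:
            # Keep fields the crosswalk does not cover so nothing is silently lost.
            unmapped[field] = value
        else:
            stub[local_name] = value
    stub["unmapped"] = unmapped
    return stub
```

Keeping the unmapped fields alongside the stub is a design choice in this sketch: it lets a curator see what the crosswalk missed before deciding whether to extend it.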

  6. Enable data to be discoverable, intelligible & reusable
     • At Bath: Developing web interface
       • Inward facing for data deposit
       • Outward facing for data searching
     • At Bath: Developing a core set of mandatory metadata
       • What schema to use/adapt?
       • Harvest metadata from the CRIS, enabling researchers to focus on descriptive metadata (title, summary, keywords)
       • How to capture sufficient detail to enable re-use? Metadata? Accompanying file? Data publication?
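
One way to read the "harvest from the CRIS, let researchers add the descriptive part" idea is as a merge followed by a completeness check. The sketch below assumes a hypothetical core set of mandatory fields and hypothetical record shapes; the slides leave the actual schema open.

```python
# Assumed core set of mandatory fields; the real set is not given in the slides.
MANDATORY_FIELDS = ("title", "summary", "keywords", "creators", "funder")


def build_deposit_record(cris_record: dict, descriptive: dict) -> tuple[dict, list[str]]:
    """Merge CRIS-harvested fields with researcher-supplied descriptive fields.

    Returns the merged record and the list of mandatory fields still missing or empty.
    """
    record = dict(cris_record)
    record.update(descriptive)  # researcher input overrides harvested values
    missing = [name for name in MANDATORY_FIELDS if not record.get(name)]
    return record, missing
```

A deposit form built this way would only need to prompt for whatever appears in `missing`.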

  7. Ensure data can have Impact
     • Data must be persistent and citable
     • Enable researchers to gain recognition for data publication and reuse
     • At Bath: The repository will generate persistent URLs for archived data
     • At Bath: A recommended data citation will be produced for each dataset
     • At Bath: Link with DataCite to produce Digital Object Identifiers (DOIs)
       • What format should the institutional component of the DOI take?
       • If/when to mint DOIs for embargoed data?
       • How to ensure that multiple DOIs are not issued for data deposited elsewhere?
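
A recommended data citation can be assembled mechanically from the deposit metadata. The sketch below assumes a creator, year, title, publisher, identifier ordering; the exact format Bath chose is not stated in the slides, and the `format_data_citation` helper is hypothetical.

```python
def format_data_citation(creators: list[str], year: int, title: str,
                         publisher: str, doi: str) -> str:
    """Build a citation string in an assumed creator/year/title/publisher/DOI order."""
    names = "; ".join(creators)
    # The DOI is shown as a resolvable https://doi.org/ link.
    return f"{names} ({year}): {title}. {publisher}. https://doi.org/{doi}"
```

For example, `format_data_citation(["Smith, J."], 2012, "Example dataset", "University of Bath", "10.1234/example")` uses a placeholder DOI that is not a registered identifier.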

  8. Retain data for mandated periods
     • Should all data be retained forever?
     • UK funders vary in their requirements for duration of data archive
     • At Bath: Use the repository to facilitate compliance
     • At Bath: Developing data disposal guidelines
       • Need to capture publication date and retention periods in metadata
       • Need to log the date of third party access to data
       • When retention periods expire, is deletion automatic or flagged for review?
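
The retention questions on this slide amount to a date calculation over the metadata. The sketch below assumes, purely for illustration, that the retention clock runs from the later of the publication date and the last logged third-party access, and it flags expired datasets for review instead of deleting them. Both choices are assumptions, as are the field names.

```python
from datetime import date
from typing import Optional


def retention_expiry(published: date, last_access: Optional[date],
                     retention_years: int) -> date:
    """Expiry date, counted from the later of publication and last third-party access."""
    start = max(d for d in (published, last_access) if d is not None)
    try:
        return start.replace(year=start.year + retention_years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return start.replace(year=start.year + retention_years, day=28)


def datasets_due_for_review(datasets: list[dict], today: date) -> list[str]:
    """Return ids of datasets whose retention period has expired (flag, do not delete)."""
    return [
        d["id"]
        for d in datasets
        if retention_expiry(d["published"], d.get("last_access"), d["retention_years"]) <= today
    ]
```

Storing `retention_years` per dataset reflects the point that funders differ in how long they require data to be kept.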

  9. Protect the interests of subjects of research
     • Living individuals covered under the UK Data Protection Act (1998)
     • Names and personal details must be removed before data can be published
     • At Bath: Data deposit will include a check box to confirm published data has been anonymized
     • At Bath: Restrict access to underlying data for published metadata
       • How to capture/publish conditions for access to data?
     • At Bath: Investigating how to archive consent forms with the data they accompany
       • Can these be digitised?
       • How to secure access to them?

  10. Protect the interests of our research partners
     • Ability to publish collaborative research data determined by project specific contracts
     • Some data may require embargo periods to enable commercialisation of research
     • It may be necessary to prevent publication of metadata if the potential for commercialisation would be damaged by its release
     • At Bath: Investigating the specific requirements of our industrial partners
       • How to select whether data and/or metadata can be published during data deposit
       • Can we manage access of restricted data so that only key researchers and their external collaborators can (re)use it?
       • Can we automate production of licences for re-use if data ownership is set out in specific contracts?
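
The deposit-time choices raised here (publish data, publish metadata, embargo, named collaborators) can be expressed as a small access check. This is a sketch under assumed field names (`publish_metadata`, `publish_data`, `embargo_until`, `authorised_users`), not a description of the access model Bath implemented.

```python
from datetime import date


def can_view_metadata(dataset: dict) -> bool:
    """Metadata is public only if the depositor allowed it at deposit time."""
    return dataset["publish_metadata"]


def can_access_data(dataset: dict, user: str, today: date) -> bool:
    """Decide whether a user may access the underlying data on a given day."""
    if user in dataset["authorised_users"]:
        # Key researchers and named external collaborators always have access.
        return True
    if not dataset["publish_data"]:
        return False
    embargo_end = dataset.get("embargo_until")
    # Public access opens once any embargo has ended.
    return embargo_end is None or today >= embargo_end
```

Keeping the metadata and data decisions separate mirrors the slide's point that even metadata may need to be withheld for commercially sensitive projects.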

  11. What can’t the institutional data repository do?
     • No ability to query individual data points
     • No ability to align multiple datasets
     • No peer review of data quality
       • Should we issue disclaimers with published data?
     • Difficulty handling ‘big data’

  12. Find out more: www.ukoln.ac.uk/projects/research360 blogs.bath.ac.uk/research360
