
UK Digital Curation Centre : enabling research data management at the coalface

UK Digital Curation Centre : enabling research data management at the coalface. Dr Liz Lyon Associate Director DCC / Director UKOLN University of Bath, UK. Overview. Moving data across boundaries : structural science Managing data in institutions : emerging DCC tools


Presentation Transcript


  1. UK Digital Curation Centre : enabling research data management at the coalface Dr Liz Lyon Associate Director DCC / Director UKOLN University of Bath, UK

  2. Overview • Moving data across boundaries : structural science • Managing data in institutions : emerging DCC tools • Making data count : publication and attribution

  3. “Bridging the chasm” between the local laboratory bench and large-scale facilities, e.g. the DIAMOND synchrotron • Develop Integrated Information Model • Use cases and inter-disciplinary pilots • Cost-benefit analysis: before and after • http://www.ukoln.ac.uk/projects/I2S2/

  4. Structural Sciences Infrastructure

  5. An Idealised Scientific Research Activity Lifecycle Model [diagram labels] Scholarly Knowledge | Write Proposal | Publications Database | Publish Research | Research Concept and/or Experiment Design | Citations, References | Discover, Access, Validate, Reuse & Repurpose Data | (include DMP) | Research Outputs | Papers, articles, presentations, reports | Peer-review Proposal | Peer Review | IPR, Embargo & Access Control | Comments, annotations, ratings etc. | Prepare Manuscript | Archive, Preservation & Curation (OAIS conformant; Representation Information etc.) | Start Project | User registration data; Instrument allocation data etc. | Prepare Supplementary Data | Documentation, Metadata & Storage (Reference, Provenance, Context, Calibration etc.) | Acquire Sample | Results Data | Processed Data | Derived Data | Raw Data | Risk assessment data; other sample data | Write Usage Report | Interpret & Analyse Results Data | Process & Analyse Derived Data | Check & Clean Raw Data | Conduct Experiment | Generate, Create, & Collect Raw Data | Appraisal & Quality Control | Programs (generate customised software) KEY: Research Activity | Administrative Activity | Information Flow | Curation Activity | Publication Activity

  6. Existing work : mappings and gaps • Bibliographic records (FRBR, SWAP) • Research Management (CERIF?) • DC, Ontologies • Curation (OAIS, PREMIS?) • Data Management and Provenance (CSMD, OPM?) • PROCESS • Software descriptions (??) (Slide : Brian Matthews, STFC)

  7. Integrated Information Model • Focus on Open Methodology • Develop Data Model • Join up to other Data Model work : • OreChem • Data Conservancy • Linked data approach • http://www.ukoln.ac.uk/projects/I2S2/

  8. Requirements Analysis Report “…it is apparent that the greatest need is for a robust data management infrastructure which supports each researcher in capturing, storing, managing and working with all the data generated during an experiment. Internal sharing of research data amongst collaborating scientists … is also a primary concern as is a requirement for access to research data in the long run so that a researcher … can return to and validate the results well into the future.”

  9. INCREMENTAL Project • Institutional perspective : Scoping study • Creating & organising data • Storage and access • Back-up • Preservation • Sharing and re-use

  10. “While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. … using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.” The majority of people felt that some form of policy or guidance was needed… Incremental Project Report, June 2010 (image: http://www.flickr.com/photos/mattimattila/3003324844/)

  11. Emerging funder requirements

  12. • Data types, formats, standards, capture • Ethics and Intellectual Property • Access, sharing and re-use • Short-term storage & data management • Deposit & long-term preservation • Adherence and review

  13. DMP Online Currently updating Version 2.0 Version 3.0 summer 2010 http://www.dcc.ac.uk/dmponline

  14. Making DMPs work : the start of a long process… • Embed DMPs in research lifecycles / activity model as the norm • Code of Conduct for Research • Assess & review DMPs (not just the science content of proposals) • Educate reviewers (DCC guidance for social science in prep) • Manage compliance • Infrastructure to share DMPs • Analyse cost-benefits

  15. An Idealised Scientific Research Activity Lifecycle Model (same diagram as slide 5)

  16. Incentives? Data citation, credit, metrics, attribution

  17. Complexity : what are we citing? • Journal Article • Workflow • Visualisation • Model • Data • Annotation • Concept • Attribution granularity : Macro, Micro / Nano

  18. Integrative genomics • Gene expression & clinical traits data in Sage Commons • Genome-Wide Association Studies (GWAS) • Large-scale predictive network models of disease • Co-expression and Bayesian (probabilistic graph) networks • Complex data analysis pipelines

  19. Large-scale predictive network models of disease • Sage Pipeline • Multiple datasets • Visualise: Cytoscape • Workflow: Taverna

  20. Functionality? How do we cite? • Persistent identification - URIs • Identifier-agnostic framework • Resilient resolution service • Multi-directional linking e.g. to peer-reviewed paper, to datasets • Version control, provenance
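The citation functions listed on slide 20 can be illustrated with a small sketch. This is not the DCC's framework or any existing service: the class names (`Identifier`, `CitableItem`, `CitationRegistry`) and the example.org URIs are hypothetical, chosen only to show how scheme-tagged identifiers, a resolver that degrades gracefully, two-way links between a paper and its dataset, and a version/provenance pointer might fit together.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Identifier:
    """Scheme-tagged identifier, so no single scheme is privileged (assumption)."""
    scheme: str  # hypothetical examples: "uri", "doi", "handle"
    value: str


@dataclass
class CitableItem:
    identifier: Identifier
    kind: str  # e.g. "paper" or "dataset"
    version: int = 1
    derived_from: Optional[Identifier] = None  # provenance pointer to the prior version
    links: Set[Identifier] = field(default_factory=set)


class CitationRegistry:
    """Toy in-memory stand-in for a resolution service (hypothetical)."""

    def __init__(self) -> None:
        self._items: Dict[Identifier, CitableItem] = {}

    def register(self, item: CitableItem) -> None:
        self._items[item.identifier] = item

    def resolve(self, identifier: Identifier) -> Optional[CitableItem]:
        # Unknown identifiers return None rather than raising.
        return self._items.get(identifier)

    def link(self, a: Identifier, b: Identifier) -> None:
        # Record the link in both directions (paper -> data and data -> paper).
        self._items[a].links.add(b)
        self._items[b].links.add(a)

    def new_version(self, old: Identifier, new: Identifier) -> CitableItem:
        # A new version gets its own identifier and points back to its source.
        previous = self._items[old]
        item = CitableItem(new, previous.kind, previous.version + 1, derived_from=old)
        self.register(item)
        return item


registry = CitationRegistry()
paper_id = Identifier("uri", "http://example.org/paper/1")
data_v1 = Identifier("uri", "http://example.org/dataset/1")
data_v2 = Identifier("uri", "http://example.org/dataset/1/v2")
registry.register(CitableItem(paper_id, "paper"))
registry.register(CitableItem(data_v1, "dataset"))
registry.link(paper_id, data_v1)
revised = registry.new_version(data_v1, data_v2)
```

Keeping the scheme separate from the value is one way to stay identifier-agnostic: the registry never inspects the scheme, so any persistent identifier type could be registered alongside the others.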

  21. Take homes… • Infrastructure : seamless & cost-effective • Open Methodology : emerging Data Model • Researchers need help with data management • Data Management Plans : DCC DMP online tool • We need to incentivise data management • Citation Framework : assure credit & attribution

  22. Thank you… Chicago Mart Plaza, 6-8 December 2010 www.dcc.ac.uk

