What is research data and why manage it? An introduction to the issues and drivers, benefits and funder requirements. Martin Donnelly, Digital Curation Centre, University of Edinburgh. University of Stirling, 25 March 2013.


What is research data and why manage it?

An introduction to the issues and drivers, benefits and funder requirements

Martin Donnelly

Digital Curation Centre

University of Edinburgh

University of Stirling

25 March 2013

Running order
  • DEFINITIONS
  • DRIVERS
  • RULES AND (IN)EQUATIONS

- Group Exercise (30 mins)


Digital Curation Centre, est. 2004

  • Three partners: Edinburgh, Glasgow and Bath
  • Primary funder is JISC
  • Helping to build capacity, capability and skills in data management and curation across the UK’s higher education research community
  • DCC Phase 3 Business Plan

www.dcc.ac.uk

What is Research Data?

…whatever is produced in research or evidences its outputs

What Kinds of Data?

  • Facts
  • Statistics
  • Qualitative
  • Quantitative
  • Unpublished research outputs
  • Discipline specific

A Data Gift?

“Data underpins our economy and our society - data about how much is being spent and where, data about how schools, hospitals and police are performing, data about where things are and data about the weather.”

Tim Berners-Lee, Director of W3C.

What is Research Data Management?

“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”

Data management is a part of good research practice

Data is (usually) central to the process
  • The six datacentric phases of the research lifecycle

http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox-a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih

Proliferation

Data...

Data...

http://www.flickr.com/photos/thinkmulejunk/352387473/

http://www.flickr.com/photos/charleswelch/3597432481//

http://www.flickr.com/photos/wasp_barcode/4793484478/

http://www.flickr.com/photos/usfsregion5/4546851916//


Why manage HE research data?

  • Research integrity (defend findings)
  • Research impact (linking data and publication, making data citable)
  • Supports / enables reuse, which keeps funders happy
  • Maximises value and increases ROI, which keeps govt happy
  • Helps to meet regulatory requirements
  • Can control costs (via capacity planning etc)
Attitudes / approaches
  • The term “research data” means different things to different people in HE
  • Researchers may care enormously about their data, so much so that they worry about it going out into the world on its own
  • Others (e.g. those with responsibility for compliance) may worry about it not going out into the world, or going out when it shouldn’t / underdressed
  • Some may not recognise the relevance of ‘data’ in what they do…

“Data sharing was more readily discussed by early career researchers.”

“While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. … using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.”

INCREMENTAL Project

Open to all? Case studies of openness in research

Choices are made according to context, with degrees of openness reached according to:

  • The kinds of data to be made available
  • The stage in the research process
  • The groups to whom data will be made available
  • On what terms and conditions it will be provided

Default position of most:

  • YES to protocols, software, analysis tools, methods and techniques
  • NO to making research data content freely available to everyone

Angus Whyte, RIN/NESTA, 2010


The data deluge

“Surfing the Tsunami”

Science: 11 February 2011


Public good

  • Preservation
  • Discovery
  • Confidentiality
  • First use
  • Recognition
  • Public funding
RCUK Policy and Code of Conduct on the Governance of Good Research Conduct

Unacceptable research conduct includes mismanagement or inadequate preservation of data and/or primary materials, including failure to:

  • keep clear and accurate records of the research procedures followed and the results obtained, including interim results;
  • hold records securely in paper or electronic form;
  • make relevant primary data and research evidence accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 yrs (in some cases 20 yrs or longer);
  • manage data according to the research funder’s data policy and all relevant legislation;
  • wherever possible, deposit data permanently within a national collection.

Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation.


April 2011 - EPSRC Letter to VCs

  • EPSRC expects all those institutions it funds:
  • to develop a roadmap that aligns their policies and processes with EPSRC’s expectations by 1st May 2012
  • to be fully compliant with these expectations by 1st May 2015

http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx


Institutional Policies

http://www.dcc.ac.uk/resources/policy-and-legal

Government pressure…

6.9 The Research Councils expect the researchers they fund to deposit published articles or conference proceedings in an open access repository at or around the time of publication. But this practice is unevenly enforced. Therefore, as an immediate step, we have asked the Research Councils to ensure the researchers they fund fulfil the current requirements. Additionally, the Research Councils have now agreed to invest £2 million in the development, by 2013, of a UK ‘Gateway to Research’. In the first instance this will allow ready access to Research Council funded research information and related data but it will be designed so that it can also include research funded by others in due course. The Research Councils will work with their partners and users to ensure information is presented in a readily reusable form, using common formats and open standards.

http://www.bis.gov.uk/assets/biscore/innovation/docs/i/11-1387-innovation-and-research-strategy-for-growth.pdf


Making Public Data Accessible

“We have opened up much public data already, but need to go much further in making this data accessible. We believe publicly funded research should be freely available. We have commissioned independent groups of academics and publishers to review the availability of published research, and to develop action plans for making this freely available”

The Open Data Institute (ODI) will be the first of its kind, a pioneering centre of innovation, driven by the UK Government’s Open Data policy


Data for Impact

  • Research Excellence Framework (REF) measures researcher contributions and their impact
  • Has struggled to extend its breadth beyond paper-based metrics
  • Researchers are wary of spending time on activity that doesn’t count towards the REF
  • REF panels now allow submission of “a substantial, coherent and widely admired data set or research resource”

Data Citation

  • Data access raises visibility
  • Data with DOI = citable research output
  • Data citations are good for researchers
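
To make the "data with DOI" point concrete, here is a minimal Python sketch of how a dataset's metadata might be turned into a citation string. The `Dataset` fields, the "Creator (Year): Title. Publisher. DOI" layout and every value in the example are illustrative assumptions, not something taken from the slides; a real repository or publisher may prescribe a different format.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Minimal citation metadata for a dataset (all example values are made up)."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI, without the resolver prefix


def format_citation(ds: Dataset) -> str:
    """Return a 'Creator (Year): Title. Publisher. DOI-URL' style citation."""
    creators = "; ".join(ds.creators)
    return f"{creators} ({ds.year}): {ds.title}. {ds.publisher}. https://doi.org/{ds.doi}"


example = Dataset(
    creators=["Example, A.", "Sample, B."],
    year=2013,
    title="Hypothetical survey responses",
    publisher="Example University Data Repository",
    doi="10.5555/example-dataset",  # placeholder DOI, not a real record
)
citation = format_citation(example)
print(citation)
```

The DOI is what lets the citation resolve to the dataset itself, so the data can be referenced and counted in the same way as a paper.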

STORAGE

MANAGEMENT

Greenhouse = storage

DATA

Horticulture = management


MANAGEMENT

SHARING

Rule 1. Don’t Share It All

But! You generally need a reason NOT to share, e.g.

  • Commercial interests
  • Ethical concerns
  • Data Protection Act
Various factors at play…
  • Law(s) of the land(s) (FOI, DPA)
  • Government pressure
  • Funder policies (and expectations)
  • Publisher policies
  • Institutional policies
  • Disciplinary norms
  • Ethical considerations
  • Commercial interests / partnerships
Rule 2. Don’t Keep It All

Why not?

1. We probably can’t afford the costs of storage: increasing volumes outpace declining storage hardware costs

and

2. We probably can’t afford the time it will take to ensure it remains accessible/discoverable

According to: John Gantz and David Reinsel (2011), Extracting Value from Chaos, http://www.emc.com/digital_universe
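
The first reason can be illustrated with simple arithmetic. The Python sketch below assumes a data volume that grows by a fixed fraction each year while the price per terabyte falls by a fixed fraction. The function name and all the numbers are hypothetical and chosen only to show the mechanism; they are not figures from the cited report.

```python
def yearly_storage_costs(initial_tb, cost_per_tb, growth, price_decline, years):
    """Return total storage cost per year when volume grows and unit price falls.

    growth and price_decline are annual fractions, e.g. 0.5 means +50% volume
    per year and 0.2 means -20% cost per terabyte per year.
    """
    costs = []
    volume, price = initial_tb, cost_per_tb
    for _ in range(years):
        costs.append(volume * price)
        volume *= 1 + growth
        price *= 1 - price_decline
    return costs


# Illustrative numbers only: 100 TB at 50 per TB, volume +50%/yr, price -20%/yr.
costs = yearly_storage_costs(100, 50.0, growth=0.5, price_decline=0.2, years=5)
for year, cost in enumerate(costs):
    print(f"year {year}: {cost:,.0f}")
```

With these assumed rates the bill is multiplied by (1 + 0.5) × (1 − 0.2) = 1.2 every year, so it keeps rising even though each terabyte gets cheaper. The total only falls when the price decline outpaces the growth in volume.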


“Keeping 2018’s data in S3 would cost the entire global GDP”

http://blog.dshr.org/2012/05/lets-just-keep-everything-forever-in.html

How to decide?
  • Relevance to Mission – including any legal/funder requirement to retain the data beyond its immediate use.
  • Scientific or Historical Value – significance and relationship to publications etc.
  • Uniqueness – can it be found elsewhere / if we don’t preserve it, who will?
  • Potential for Redistribution – quality / IP / ethical concerns are addressed.
  • Non-Replicability – either impossible to replicate (e.g. atmospheric or social science data) or not financially viable.
  • Economic Case – costs of managing and preserving the resource stack up well against potential future benefits.
  • Full Documentation – surrounding / contextual information necessary to facilitate future discovery, access, and reuse is adequate.

How to Appraise & Select Research Data for Curation

Angus Whyte, Digital Curation Centre, and Andrew Wilson, Australian National Data Service (2010)
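
One way to keep an appraisal against these seven criteria on record is sketched below in Python. The `appraise` helper and the example answers are hypothetical. The guide itself does not prescribe a scoring scheme, so this only records which criteria have been judged as met and which still need attention.

```python
CRITERIA = [
    "Relevance to Mission",
    "Scientific or Historical Value",
    "Uniqueness",
    "Potential for Redistribution",
    "Non-Replicability",
    "Economic Case",
    "Full Documentation",
]


def appraise(answers):
    """Split the seven criteria into those met and those not met.

    answers maps a criterion name to True/False; a missing criterion counts as
    not met so that gaps in the appraisal stay visible.
    """
    unknown = set(answers) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    met = [c for c in CRITERIA if answers.get(c, False)]
    unmet = [c for c in CRITERIA if not answers.get(c, False)]
    return met, unmet


# Hypothetical dataset: unique and well documented, but no economic case made yet.
met, unmet = appraise({
    "Relevance to Mission": True,
    "Uniqueness": True,
    "Non-Replicability": True,
    "Full Documentation": True,
})
print(f"met {len(met)} of {len(CRITERIA)}; still to address: {', '.join(unmet)}")
```

Treating unanswered criteria as unmet is a deliberate choice in this sketch: it pushes the appraiser to address every heading explicitly before deciding to keep or discard the data.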

All Together: Institutional Engagements

With funding from HEFCE we’re:

  • Working intensively with c. 20 HEIs to increase RDM capability
    • 60 days of effort per HEI drawn from a mix of DCC staff
    • Deploy DCC and external tools, approaches and best practice
  • Support varies based on what each institution wants/needs
    • Institution agrees a schedule of work with the DCC, and each assigns a primary contact / programme manager
  • Lessons and examples to be shared with the community

www.dcc.ac.uk/community/institutional-engagements

IE activities

Piloting tools

Assessing needs

RDM roadmaps

Policy implementation

Policy development


Data Management Planning: roles and responsibilities for data across the research lifecycle

Group Exercise

Martin Donnelly and Jonathan Rans

Digital Curation Centre

University of Edinburgh

University of Stirling

25 March 2013

DMP Checklist Headings

§1: Introduction and Context

§2: Data Types, Formats, Standards and Capture Methods

§3: Ethics and Intellectual Property

§4: Access, Data Sharing and Re-use

§5: Short-Term Storage and Data Management

§6: Deposit and Long-Term Preservation

§7: Resourcing

§8: Adherence and Review

§9: Agreement/Ratification by Stakeholders

§10: Annexes

Checklist for a Data Management Plan (Donnelly and Jones)
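
As a rough illustration of how the ten headings could serve as a working template, the Python sketch below renders a plain-text outline and lists the sections still to be written. The helper names `render_dmp` and `missing_sections` and the draft content are hypothetical; the checklist itself does not define any tooling.

```python
DMP_HEADINGS = [
    "Introduction and Context",
    "Data Types, Formats, Standards and Capture Methods",
    "Ethics and Intellectual Property",
    "Access, Data Sharing and Re-use",
    "Short-Term Storage and Data Management",
    "Deposit and Long-Term Preservation",
    "Resourcing",
    "Adherence and Review",
    "Agreement/Ratification by Stakeholders",
    "Annexes",
]


def render_dmp(responses):
    """Render a plain-text DMP outline; unanswered sections are marked TODO."""
    lines = []
    for number, heading in enumerate(DMP_HEADINGS, start=1):
        lines.append(f"§{number}: {heading}")
        lines.append(f"    {responses.get(heading, 'TODO')}")
    return "\n".join(lines)


def missing_sections(responses):
    """List headings that have no response yet, in checklist order."""
    return [h for h in DMP_HEADINGS if not responses.get(h)]


# Hypothetical, partially completed plan.
draft = {"Resourcing": "Storage costs included in the grant budget."}
outline = render_dmp(draft)
todo = missing_sections(draft)
print(outline)
```

Keeping the headings in one ordered list means the outline and the list of gaps always follow the checklist's section numbering.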

Group exercise (20 minutes)

In groups of 4 or 5:

  • Select one of the DMP Checklist headings, and brainstorm all the stakeholders you think might be involved (and how/why) – be specific!
  • Remember to think of different stages of research: pre-award, in-project, post-project
  • We’ll have a short reporting/discussion session at the end

SECTIONS

§1: Introduction and Context

§2: Data Types, Formats, Standards and Capture Methods

§3: Ethics and Intellectual Property

§4: Access, Data Sharing and Re-use

§5: Short-Term Storage and Data Management

§6: Deposit and Long-Term Preservation

§7: Resourcing

§8: Adherence and Review

§9: Agreement/Ratification by Stakeholders

§10: Annexes

Notes

N.B.

  • There are no ‘right’ or ‘wrong’ answers
  • All research projects are different
  • The DMP will depend upon the nature of the research AND the context (funder, domain, institution(s) etc)
  • DMPs are metadata and communication tools
QUESTIONS AND CONTACTS

For more information:

  • Visit http://www.dcc.ac.uk
  • Email martin.donnelly@ed.ac.uk
  • Twitter @mkdDCC

This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland License.

CREDITS

Images:

Slide 3 (Definitions) – http://www.flickr.com/photos/dougbelshaw/

Slide 11 (Feet up) – http://www.flickr.com/photos/chaparral/

Slide 14 (Driver) – http://www.flickr.com/photos/rpmarks/

Slide 26 (Equations) – http://www.flickr.com/photos/billburris/

Slide 28 (Greenhouse) – http://www.flickr.com/photos/mykl/

Thanks also to DCC colleagues for their slides:

Kevin Ashley, Liz Lyon, Graham Pryor, Sarah Jones, Marieke Guy