Changing data landscape

Changing data landscape. Sarah Jones, Digital Curation Centre, Glasgow, sarah.jones@glasgow.ac.uk, Twitter: @sjDCC. HKUST RDM services workshop, 19-20 March 2019, Hong Kong.


Presentation Transcript


  1. Changing data landscape Sarah Jones Digital Curation Centre, Glasgow sarah.jones@glasgow.ac.uk Twitter: @sjDCC HKUST RDM services workshop, 19-20 March 2019, Hong Kong

  2. What is the Digital Curation Centre? “a centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community” www.dcc.ac.uk

  3. What is happening at the coal face? Image CC-SA-ND by Bill Dickinson www.flickr.com/photos/skynoir/8270436894

  4. How many researchers make data open? • 79% of researchers have made data openly available (The State of Open Data 2017, Digital Science; 2300 respondents worldwide) • Only 1 in 10 provides their research data as open data for the public (Researchers and their data (2015), eInfrastructures Austria; 3026 Austrian respondents) • 68% of researchers already share data or expect to do so in future (Jisc DAF studies (2016); 1185 UK respondents) • 64% agree that they are willing to share their data (Open Data: the researcher perspective (2017), Elsevier; 1162 respondents worldwide) Unlabelled chart values: 29%, 32%, 18%, 21%

  5. How do researchers share data? • Less than 15% publish data in a repository. (Elsevier: Open Data - the researcher perspective) • “When asked where they have published data, most commonly respondents had done so as an appendix to an article (just over 30%) with a data repository close behind (just under 30%) and 20% having published in a data journal.” (Digital Science: The State of Open Data) • Over half only allow access on request. 54% share data by using external storage devices or email. (eInfra Austria: Researchers and their data) • Of 13 methods stated, the top 4 options for currently sharing data were: emailing data files (65%); cloud service, e.g. Dropbox, Google Drive (59%); portable storage (35%); supplementary data (20%). Formal repository (public / institutional): c.12% (Jisc DAF studies)

  6. Opinions on sharing “While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. ..... using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.” “Data sharing was more readily discussed by early career researchers.” Incremental project http://eprints.gla.ac.uk/54623/3/54623.pdf

  7. Why do researchers share data? “For more than half of the researchers, the most attractive incentives for sharing their data were increased visibility and impact, new cooperation opportunities, recognition in professional circles, as well as their contributions being regarded as scientific output.” eInfra Austria: researchers and their data Digital Science: The State of Open Data Jisc DAF studies

  8. Data storage and loss 17% of respondents had lost data More than one-third had experienced data loss. Strong preference to store on business/private computer, external hard drive & usb eInfra Austria: researchers and their data 36% had experienced loss and 83% of this was due to physical storage media Digital Science: The State of Open Data Jisc DAF studies

  9. Wellcome OA compliance rates

  10. Sharing of microarray data • Increase from c.5-35% in under a decade • Best-practice guidelines for sharing microarray data are fairly mature • Two centralized databases have emerged • Unusually strong data sharing requirements in some journals  Piwowar, H. (2011) Who Shares? Who Doesn't? Factors Associated with Openly Archiving Raw Research Data. PLOS One https://doi.org/10.1371/journal.pone.0018657

  11. Awareness of OS & initiatives European Commission (OSPP) Open Science Policy Platform. (2017) Providing researchers with the skills and competencies they need to practise Open Science. Report of the Working Group on Education and Skills under Open Science, doi: 10.2777/121253

  12. Greater impact Mandates Better science € OS advocacy How does this even help me or my career? Deadlines paperwork PRESSURE Too big to email… Dropbox? Not enough storage RDM issues

  13. Language is a barrier… Respondents mentioned 40 terms which were unclear to them in the European Commission DMP. “Researchers are not familiar with the following terms/phrases : Metadata, standards for metadata/data, ontologies, mapping with ontologies, interoperability, ... . All the ICT jargon” “With the help from Swedish National Data Service we could clarify many questions. Without this help we would not be able to finish the DMP.” Grootveld et al. (2018). OpenAIRE and FAIR Data Expert Group survey about Horizon 2020 template for Data Management Plans http://doi.org/10.5281/zenodo.1120245

  14. Drivers for change Image CC-BY-SA-ND by David D Wang https://www.flickr.com/photos/30326117@N08/3475108362

  15. Why are research data important to unis? “If an institution spent A$10 million on data, what would be the return? The answer is: more publications; an increased citation count; more grants; greater profile; and more collaboration.” Dr Ross Wilkinson, ANDS www.ariadne.ac.uk/issue72/oar-2013-rpt

  16. Research data: institutional crown jewels? http://www.flickr.com/photos/lifes__too_short__to__drink__cheap__wine/4754234186

  17. Data driven discovery Citizen science projects & public engagement Old weather project models climate change: Data for research, not from research

  18. Why make data available?

  19. Sharing leads to breakthroughs “It was unbelievable. It's not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.” Dr John Trojanowski, University of Pennsylvania www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0 • ...and increases the speed of discovery

  20. Validation of results “It was a mistake in a spreadsheet that could have been easily overlooked: a few rows left out of an equation to average the values in a column. The spreadsheet was used to draw the conclusion of an influential 2010 economics paper: that public debt of more than 90% of GDP slows down growth. This conclusion was later cited by the International Monetary Fund and the UK Treasury to justify programmes of austerity that have arguably led to riots, poverty and lost jobs.” www.guardian.co.uk/politics/2013/apr/18/uncovered-error-george-osborne-austerity

  21. Cut down on academic fraud • Stapel – 55 publications – “fictitious data” www.nature.com/news/2011/111101/full/479015a.html

  22. Benefits for you: sharing data increases citations! • Want evidence? • Piwowar, Vision – 9% (microarray data) • Drachen, Dorch, et al – 25-40%, astronomy • Gleditsch, et al – doubling to trebling (international relations) • Open Data Citation Advantage • http://sparceurope.org/open-data-citation-advantage

  23. Increased use and economic benefit The case of NASA Landsat satellite imagery of the Earth’s surface: Up to 2008: • Sold through the US Geological Survey for US$600 per scene • Sales of 19,000 scenes per year • Annual revenue of $11.4 million Since 2009: • Freely available over the internet • Google Earth now uses the images • Transmission of 2,100,000 scenes per year • Estimated to have created value for the environmental management industry of $935 million, with direct benefit of more than $100 million per year to the US economy • Has stimulated the development of applications from a large number of companies worldwide http://earthobservatory.nasa.gov/IOTD/view.php?id=83394&src=ve

  24. Don’t undervalue research data

  25. Research data policy changes Image CC-BY-NC-SA by Tom Magllery www.flickr.com/photos/lwr/13442910354

  26. Data policy trends • Proliferation of policies • Make the landscape easier for researchers to navigate • More harmonisation needed • Clarifications needed when requirements conflict • Emphasis on data sharing more than RDM. Increasingly ‘open’ and ‘FAIR’ rhetoric • Research data policies often ‘aspirational’ and high-level • Need for more group guidelines and practical procedures • More researcher input when developing services & infrastructure

  27. Move towards openness Slide from Giulia Ajmone Marsan, Directorate for Science, Technology and Innovation, OECD

  28. Science as an open enterprise “Much of the remarkable growth of scientific understanding in recent centuries is due to open practices; open communication and deliberation sit at the heart of scientific practice.” Royal Society report calls for ‘intelligent openness’ whereby data are accessible, intelligible, assessable and usable. https://royalsociety.org/policy/projects/science-public-enterprise/Report

  29. G8UK - Endorses OA Open Data Charter Policy Paper 18 June 2013 “To the greatest extent and with the fewest constraints possible publicly funded scientific research data should be open, while at the same time respecting concerns in relation to privacy, safety, security and commercial interests, whilst acknowledging the legitimate concerns of private partners.” G8 Science Ministers Statement (June 2013)

  30. Harmonisation UKRI principles & concordat

  31. Science Europe policy harmonisation • Voluntary alignment of RDM policies among funders in Europe • Core Requirements for DMPs • Criteria to select Trusted Repositories • Published a framework to support research communities in setting up protocols for the collection and management of data within specified disciplinary domains  • Hope Domain Data Protocols (DDPs) will become ‘DMP template’ for a given domain https://www.scienceeurope.org/policy/policy-areas/research-data

  32. Ultimately funders expect: • data management plans • timely release of data (once patents are filed or on (acceptance for) publication) • open data sharing (minimal or no restrictions if possible) • FAIR data (documented and reusable) • preservation of data (typically 5-10+ years if of long-term value)

  33. Increasing harmonisation & coordination Image CC-BY-SA-ND by Fabrice Denis Photography https://www.flickr.com/photos/fabricedenisphotography/36062765374

  34. Global Open Science Commons data commons

  35. European Open Science Cloud • Key Messages http://doi.org/10.2777/1524

  36. Proposed EOSC governance structure [Diagram. Bodies: Governance Board (MS/AC delegates and the European Commission); Executive Board (European stakeholder organisations and individual experts) with Working Groups; Stakeholders Forum (users, service providers, public sector, industry, SMEs, etc.); Extended Coalition of Doers (EU-funded projects, nationally-funded projects and initiatives, other projects and initiatives); EOSCSecretariat.eu Coordination and Support Action. Arrow labels: advise on, steer, and contribute to the implementation; reviews, endorses, orients, proposes, monitors, reports, interact, supports, supports/coordinates.]

  37. What is the RDA? International member based organisation with more than 7,900 members globally representing 137 countries RDA is building the social and technical bridges that enable open sharing of data Vision: researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society

  38. 7500 members & 137 countries in 5 years!

  39. The university dimension Image CC-BY-SA by Dawn Manser www.flickr.com/photos/dawnmanser/3532598208

  40. Annual RDM survey issued by DCC • 60 UK Higher Education Institutions responded to DCC survey 2015, of 132 invited • Research-active institutions well represented [Chart: research income as % of total, by income range percentiles split into 3 groups across all 161 HEIs: 77%, 20%, 3%] Briefing and links to data: http://www.dcc.ac.uk/survey2015

  41. Who has what in place? (% indicating ‘rolling out’ or ‘embedding’) [Chart. Service areas: Policy and strategy; Business planning; Data Mgmt Planning *; Data cataloguing; Managing active data; Data preservation; Governing access & reuse; Skills training & consultancy. Values as extracted: 87%, 13%, 50%, 38%, 40%, 18%, 22%, 63%.] * referred to ‘access & storage systems’ in survey

  42. Components of RDM services www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

  43. Thanks for listening! DCC guidance, tools and case studies: www.dcc.ac.uk/resources Follow us on twitter: @digitalcuration and #ukdcc
