DataCite—Making Datasets Citable Jan Brase DataCite

What if scientific data would be citable?

High visibility of the data.

Easy re-use and verification.

Scientific reputation for the collection and documentation of content (Citation Index)

Encouraging the Brussels declaration on STM publishing

Avoiding duplications

Motivation for new research


DOI names for citations

  • URLs are not persistent
  • (e.g. Wren JD: URL decay in MEDLINE- a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5).

Digital Object Identifiers (DOI names) offer a solution

  • Most widely used identifier for scientific articles
  • Researchers, authors, publishers know how to use them
  • Put datasets on the same playing field as articles

  • Dataset
  • Yancheva et al (2007). Analyses on sediment of Lake Maar. PANGAEA.
  • doi:10.1594/PANGAEA.587840
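To make the citation above actionable, a DOI name can be turned into a link by prefixing it with a DOI resolver address. The following is a minimal sketch, assuming the public proxy at https://doi.org/ is used as the resolver; the helper name `doi_to_url` is hypothetical and not part of any DataCite tool.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy is used as the resolver.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi_name: str) -> str:
    """Turn a DOI name such as 'doi:10.1594/PANGAEA.587840' into a URL."""
    name = doi_name.strip()
    # Drop the optional 'doi:' scheme prefix used in citations.
    if name.lower().startswith("doi:"):
        name = name[4:]
    # Every DOI name starts with '10.' and has a prefix/suffix separator.
    if not name.startswith("10.") or "/" not in name:
        raise ValueError(f"not a DOI name: {doi_name!r}")
    # Keep '/' readable and percent-encode characters unsafe in a URL path.
    return DOI_RESOLVER + quote(name, safe="/")


print(doi_to_url("doi:10.1594/PANGAEA.587840"))
```

The same function works for the article and dataset DOI names alike, which is the point of putting both on the same playing field.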
How to achieve this?

Science is global

it needs global standards

Global workflows

Cooperation of global players

Science is carried out locally

By local scientists

Being part of local infrastructures

Having local funders


Global consortium carried by local institutions

focused on improving the scholarly infrastructure around datasets and other non-textual information

focused on working with data centres and organisations that hold content

Providing standards, workflows and best-practice

Initially, but not exclusively, based on the DOI system

Founded December 1st 2009 in London

DataCite members

1. Technische Informationsbibliothek (TIB)

2. Canada Institute for Scientific and Technical Information (CISTI)

3. California Digital Library, USA

4. Purdue University, USA

5. Office of Scientific and Technical Information (OSTI), USA

6. Library of TU Delft, The Netherlands

7. Technical Information Center of Denmark

8. The British Library

9. ZB Med, Germany

10. ZBW, Germany

11. Gesis, Germany

12. Library of ETH Zürich

13. L’Institut de l’Information Scientifique et Technique (INIST), France

14. Swedish National Data Service (SND)

15. Australian National Data Service (ANDS)

16. Conferenza dei Rettori delle Università Italiane (CRUI)

17. National Research Council of Thailand (NRCT)

DataCite members

Affiliated members:

1. Digital Curation Centre (UK)

2. Microsoft Research

3. Interuniversity Consortium for Political and Social Research (ICPSR)

4. Korea Institute of Science and Technology Information (KISTI)

5. Beijing Genomics Institute (BGI)


7. Harvard University Library


What type of data are we talking about?

Earthquake events => doi:10.1594/GFZ.GEOFON.gfz2009kciu

Climate models => doi:10.1594/WDCC/dphase_mpeps

Sea bed photos => doi:10.1594/PANGAEA.757741

Distributed samples => doi:10.1594/PANGAEA.51749

Medical case studies => doi:10.1594/eaacinet2007/CR/5-270407

Computational model => doi:10.4225/02/4E9F69C011BC8

Audio record => doi:10.1594/PANGAEA.339110

Grey Literature => doi:10.2314/GBV:489185967

Videos => doi:10.3207/2959859860

Anything that is the foundation of further research is research data

Data is evidence

DataCite in 2013

Over 1,700,000 DOI names registered so far

DataCite Metadata schema published (in cooperation with all members)

DataCite MetadataStore

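DataCite search

Searchterm: * (all records)

Searchterm: uploaded:[NOW-7DAY TO NOW] (records uploaded in the last seven days)

Searchterm: relatedIdentifier:* (records that link to any related identifier)

Searchterm: relatedIdentifier:issupplementto\:10.1029* (records that supplement an item whose DOI name starts with 10.1029)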
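The search terms on this slide use a field:value query syntax with range and wildcard expressions. The following is a minimal sketch of how such terms could be placed into a request URL. The endpoint `SEARCH_BASE` and the parameter names `q` and `rows` are assumptions for illustration only; the slides do not give the address or parameters of the search service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint: the slides do not state the search service URL.
SEARCH_BASE = "https://search.example.org/api"


def search_url(term: str, rows: int = 10) -> str:
    """Build a query URL; 'q' and 'rows' are assumed parameter names."""
    # urlencode escapes characters such as ':', '[', '\\' and '*' so that
    # the search term reaches the server unchanged.
    return SEARCH_BASE + "?" + urlencode({"q": term, "rows": rows})


TERMS = [
    "*",
    "uploaded:[NOW-7DAY TO NOW]",
    "relatedIdentifier:*",
    r"relatedIdentifier:issupplementto\:10.1029*",
]

for term in TERMS:
    url = search_url(term)
    # Decode the URL again to show that the term survives the round trip.
    decoded = parse_qs(urlsplit(url).query)["q"][0]
    print(decoded == term, url)
```

Note the raw string for the last term: the backslash escapes the colon inside the field value, so it has to reach the query parser literally.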
OAI and statistics

OAI Harvester

DataCite statistics (resolution and registration)
DataCite Content Service

Service for displaying DataCite metadata

Different formats (BibTeX, RIS, RDF, etc.)

Content negotiation (through MIME type)

Access through the DOI proxy

First implemented by CNRI and CrossRef
Content negotiation

Optimized for machine-to-machine (m2m) communication using the Accept header of the HTTP protocol

curl -L -H "Accept: MIME_TYPE"

Try out a shortcut in any web browser
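The same request can be made from a script. The following is a minimal sketch using only the Python standard library, assuming the DOI proxy at https://doi.org/ and the MIME types listed in `FORMATS`; both are assumptions, since the slide names the formats but not the exact type strings or address. The function names are hypothetical.

```python
import urllib.request

# Assumed MIME types for the formats named in the slides.
FORMATS = {
    "bibtex": "application/x-bibtex",
    "ris": "application/x-research-info-systems",
    "rdf": "application/rdf+xml",
    "citation": "text/x-bibliography",
}


def negotiation_request(doi: str, fmt: str) -> urllib.request.Request:
    """Prepare a request for one DOI name in one metadata format."""
    url = "https://doi.org/" + doi  # assumption: public DOI proxy
    # The Accept header tells the server which representation is wanted.
    return urllib.request.Request(url, headers={"Accept": FORMATS[fmt]})


def fetch(doi: str, fmt: str, timeout: float = 10.0) -> str:
    """Send the request; urlopen follows redirects, like 'curl -L'."""
    req = negotiation_request(doi, fmt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request needs no network access, so it is safe to inspect.
req = negotiation_request("10.5524/100005", "bibtex")
print(req.full_url, req.get_header("Accept"))
```

Calling `fetch("10.5524/100005", "bibtex")` would perform the actual lookup; what comes back depends on the registration agency serving that DOI name.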
Resolving to the citation

Li, J; Zhang, G; Lambert, D; Wang, J (2011): Genomic data from Emperor penguin. GigaScience.
Resolving to the RDF metadata

<rdf:RDF xmlns:rdf="" xmlns:owl="" xmlns:j.0="">
  <rdf:Description rdf:about="">
    <j.0:identifier>10.5524/100005</j.0:identifier>
    <j.0:creator>Li, J</j.0:creator>
    <j.0:creator>Zhang, G</j.0:creator>
    <j.0:creator>Wang, J</j.0:creator>
    <owl:sameAs>doi:10.5524/100005</owl:sameAs>
    <owl:sameAs>info:doi/10.5524/100005</owl:sameAs>
    <j.0:publisher>GigaScience</j.0:publisher>
    <j.0:creator>Lambert, D</j.0:creator>
    <j.0:date>2011</j.0:date>
    <j.0:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</j.0:title>
  </rdf:Description>
</rdf:RDF>
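A client can read such a response with any XML parser. The following is a minimal sketch using the Python standard library. The namespace URIs and the `rdf:about` value are assumptions, because the transcript shows them empty: the standard RDF and OWL namespaces are used, and Dublin Core terms are assumed for the `j.0` prefix. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (the transcript shows them as empty strings).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"
DC_NS = "http://purl.org/dc/terms/"  # assumption for the 'j.0' prefix

# Shortened sample; the rdf:about value is also an assumption.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:owl="{OWL_NS}" xmlns:j.0="{DC_NS}">
  <rdf:Description rdf:about="https://doi.org/10.5524/100005">
    <j.0:identifier>10.5524/100005</j.0:identifier>
    <j.0:creator>Li, J</j.0:creator>
    <j.0:creator>Zhang, G</j.0:creator>
    <j.0:creator>Wang, J</j.0:creator>
    <owl:sameAs>doi:10.5524/100005</owl:sameAs>
    <j.0:publisher>GigaScience</j.0:publisher>
    <j.0:creator>Lambert, D</j.0:creator>
    <j.0:date>2011</j.0:date>
    <j.0:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</j.0:title>
  </rdf:Description>
</rdf:RDF>"""


def summarize(rdf_xml: str) -> dict:
    """Extract the main descriptive fields from one rdf:Description."""
    root = ET.fromstring(rdf_xml)
    desc = root.find(f"{{{RDF_NS}}}Description")
    return {
        "identifier": desc.findtext(f"{{{DC_NS}}}identifier"),
        # findall returns creators in document order.
        "creators": [c.text for c in desc.findall(f"{{{DC_NS}}}creator")],
        "date": desc.findtext(f"{{{DC_NS}}}date"),
        "title": desc.findtext(f"{{{DC_NS}}}title"),
        "publisher": desc.findtext(f"{{{DC_NS}}}publisher"),
    }


info = summarize(SAMPLE)
print(info["identifier"], info["creators"], info["date"])
```

RDF statements are unordered, so the creator sequence in such a response need not match the author order of the formatted citation; a client that needs author order should not rely on this serialization.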
Example of use

This allows persistent identification of RDF statements!

Implemented for all of the more than 45 million CrossRef and DataCite DOI names

Example of use: DOI Citation Formatter

2012: STM, CrossRef and DataCite Joint Statement

  1. To improve the availability and findability of research data, the Signers encourage authors of research papers to deposit researcher-validated data in trustworthy and reliable Data Archives.
  2. The Signers encourage Data Archives to enable bi-directional linking between datasets and publications by using established and community-endorsed unique persistent identifiers such as database accession codes and DOIs.
  3. The Signers encourage publishers and data archives to make visible or increase visibility of these links from publications to datasets and vice versa.


The dataset:

Storz, D et al. (2009):

Planktic foraminiferal flux and faunal composition of sediment trap L1_K276 in the northeastern Atlantic.

Is supplement to the article:

Storz, David; Schulz, Hartmut; Waniek, Joanna J; Schulz-Bull, Detlef; Kucera, Michal (2009): Seasonal and interannual variability of the planktic foraminiferal flux in the vicinity of the Azores Current.

Deep-Sea Research Part I-Oceanographic Research Papers, 56(1), 107-124,

Thank you!

See you


19th – 20th

in Washington DC