Digital Object Identifiers for EOSDIS data
HDF Workshop, April 17, 2012
John Moses, ESDIS, John.email@example.com
Assessment of identification schemes
• Study by the ESIP Cluster on Preservation and Stewardship in 2009
• Use cases assessed: Unique Identifier, Unique Locator, Citable Locator, Scientifically Unique ID
• Adapted from Duerr, R. E., et al. 2011 (submitted). On the utility of identification schemes for digital Earth science data: An assessment and recommendations. Earth Science Informatics.
Digital Object Identifier for EOS products
• The DOI® system and the Handle System provide an Internet resolution service for unique and persistent identifiers of digital objects
• The Internet infrastructure components are owned by the International DOI Foundation (IDF), www.doi.org
• A DOI consists of a two-part alphanumeric string
  • doi:[prefix]/[suffix]; for example, doi:10.5067/123
  • Prefix: 10 identifies the DOI registry; 5067 is the registrant code
  • Suffix: the alphanumeric string 123 uniquely identifies the data item
• The purpose of assigning DOIs to EOSDIS products is to provide a permanent data identifier for citation in publications
• ESIP citation guideline using a DOI:
  • Doe, J. and R. Roe. 2001. The FOO Data Set. Version 2.3. The FOO Data Center. http://dx.doi.org/10.xxxx/notfoo.547983. Accessed 1 May 2011.
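The two-part structure described above can be sketched in a few lines of Python. This is only an illustration, not an official DOI parser: the names `ParsedDoi`, `parse_doi`, and `resolver_url` are hypothetical, and the sketch assumes the suffix needs no percent-encoding when placed in a URL.

```python
from typing import NamedTuple


class ParsedDoi(NamedTuple):
    directory: str   # "10", which identifies the DOI registry
    registrant: str  # e.g. "5067"
    suffix: str      # e.g. "123"; may itself contain "/" characters


def parse_doi(text: str) -> ParsedDoi:
    """Split 'doi:10.5067/123' (or a bare '10.5067/123') into its parts."""
    value = text.strip()
    if value.lower().startswith("doi:"):
        value = value[4:].strip()
    # Only the first "/" separates prefix from suffix.
    prefix, slash, suffix = value.partition("/")
    if not slash or not suffix:
        raise ValueError(f"missing '/suffix' in DOI: {text!r}")
    directory, dot, registrant = prefix.partition(".")
    if directory != "10" or not dot or not registrant:
        raise ValueError(f"prefix must look like '10.NNNN': {text!r}")
    return ParsedDoi(directory, registrant, suffix)


def resolver_url(doi: ParsedDoi) -> str:
    """Build a link in the http://dx.doi.org/ form used in the citation example."""
    return f"http://dx.doi.org/{doi.directory}.{doi.registrant}/{doi.suffix}"


parsed = parse_doi("doi: 10.5067/123")
print(parsed)
print(resolver_url(parsed))
```

Splitting on the first "/" only means a structured suffix such as Aura/HIRDLS/data1 stays in one piece.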
Implementing DOIs for EOSDIS
• Develop ops concept through pilot processes
• Guidelines for DOI suffix, location & citation information
• Request, assign, monitor DOIs, location & citation metadata
• Add DOIs to DAAC product citation web pages
• Embed DOIs into product metadata at next reprocessing
• HIRDLS, GLAS, AMSR-E data providers are in final reprocessing
• Add DOIs to GCMD and ECHO through metadata updates
• Add DOI metadata to NTRS for searchable documentation
• Set up metrics collection from journal citation reports
Implementation in Interoperable Architectures
[Figure: Metadata flows in NASA Earth Science Data Systems. Diagram labels: provenance collection, Provenance Services, NASA Technical Reports Server, DOI tools.]
Attributes for embedding DOIs
• Framework structures in HDF and netCDF
  • HDF: global attribute name and value versus naming an identifier group (which would allow discovery of identifier types)
  • ECS CoreMetadata: Product Specific Attributes in the AdditionalAttributes group section
  • netCDF: file-level attribute names “Id” and “naming authority”
• Consider attribute names for the DOI value
  • There is an advantage to having two parts: a key code to indicate that this is an identifier, and a namespace that indicates the type/application of the DOI; e.g., that it applies at the data product level (i.e., has the same value for all granules/files of the series, a series identifier)
• Hypothetical DOI example
  • Attribute name: identifier_product_DOI
  • Attribute value: 10.5067/Aura/HIRDLS/data1
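To make the two-part naming idea concrete, here is a minimal Python sketch. A plain dictionary stands in for a file's global attributes so the example runs without an HDF or netCDF library. The helper names `doi_attribute_name` and `find_identifiers` and the "title" attribute are hypothetical. The underscore-separated layout is an assumption drawn from the hypothetical example above.

```python
def doi_attribute_name(scope: str, key_code: str = "identifier",
                       id_type: str = "DOI") -> str:
    """Compose an attribute name such as 'identifier_product_DOI'.

    key_code marks the attribute as an identifier; scope and id_type form
    the namespace (what the identifier applies to, and its type).
    """
    return f"{key_code}_{scope}_{id_type}"


def find_identifiers(attrs: dict, key_code: str = "identifier") -> dict:
    """Discover identifier attributes by their key code.

    Returns a mapping of (scope, id_type) -> value.
    """
    found = {}
    for name, value in attrs.items():
        parts = name.split("_")
        if len(parts) == 3 and parts[0] == key_code:
            found[(parts[1], parts[2])] = value
    return found


# A dict standing in for the global attributes of one granule (hypothetical).
global_attrs = {
    "title": "Example granule",
    doi_attribute_name("product"): "10.5067/Aura/HIRDLS/data1",
}

print(find_identifiers(global_attrs))
```

Because the value applies at the product level, every granule in the series would carry the same attribute. With a real file, the same name and value could be written through the library's attribute interface, for example `h5py`'s `File.attrs` or `netCDF4`'s `Dataset.setncattr`.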
DOI Registration and Guidelines
• A DOI will be assigned to each EOSDIS standard data product
• The DOI subscription holder (ESDIS) will provide location & citation metadata to the DOI subscription provider (CDL EZID) and will be notified when the DOI has been registered
• Ideally we want one DOI per data item, but the registry does not preclude multiple registrations of similar data
• New DOI metadata can be uploaded as frequently as desired
  • Typically when location or citation information changes
• A major new version of the data product would be assigned a new DOI. DOIs of old versions that are no longer available would have updated locators that point to the new version (with explanation)
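The versioning rule in the last bullet can be sketched with an in-memory stand-in for the registry. This is an illustration under stated assumptions, not the ESDIS or EZID workflow: `DoiRecord`, `release_major_version`, the DOIs, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DoiRecord:
    target: str      # landing page URL the DOI resolves to
    note: str = ""   # explanation shown when the DOI has been redirected


def release_major_version(registry: dict, old_doi: str,
                          new_doi: str, new_target: str) -> None:
    """Register a new DOI for a major version and retarget the old one.

    Assumes the old version is no longer available, as in the guideline.
    """
    if new_doi in registry:
        raise ValueError(f"{new_doi} is already registered")
    registry[new_doi] = DoiRecord(target=new_target)
    old = registry[old_doi]
    old.target = new_target
    old.note = f"Superseded by {new_doi}; the earlier version is no longer available."


registry = {"10.5067/FOO/v2": DoiRecord(target="https://example.org/foo/v2")}
release_major_version(registry, "10.5067/FOO/v2",
                      "10.5067/FOO/v3", "https://example.org/foo/v3")
print(registry["10.5067/FOO/v2"])
```

The old DOI is never deleted. It keeps resolving, which is the point of a persistent identifier.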
Guidelines for DOI suffix
• The DOI itself should be a relatively short string so that users can read it from printed material or a display and key it into a browser with minimal error
• The DOI suffix (ASCII characters with no spaces):
  • Should be a descriptive name with a domain-specific structure that reflects the science data product contents
  • Should have some recognition by the research community, such as a semantic name or acronym, e.g., instrument/platform/campaign/investigation name or measurement parameter
  • Should help readers distinguish between a published paper and a dataset
  • Should not contain an organizational reference that is subject to change (i.e., publisher, archive, owner)
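The mechanical parts of these guidelines (ASCII, no spaces, short, no organizational names) can be checked automatically. The sketch below is one possible reading. The length limit of 40 characters and the list of organizational terms are assumptions chosen for illustration, since the guideline only says "relatively short". `check_suffix` is a hypothetical helper.

```python
import re

MAX_SUFFIX_LENGTH = 40                    # assumed limit, not from the guideline
ORG_TERMS = {"nasa", "esdis", "daac"}     # hypothetical organizational names


def check_suffix(suffix: str, max_length: int = MAX_SUFFIX_LENGTH,
                 org_terms: set = ORG_TERMS) -> list:
    """Return a list of guideline problems; an empty list means none found."""
    problems = []
    if not suffix:
        problems.append("suffix is empty")
    if not suffix.isascii():
        problems.append("suffix contains non-ASCII characters")
    if any(ch.isspace() for ch in suffix):
        problems.append("suffix contains whitespace")
    if len(suffix) > max_length:
        problems.append(f"suffix is longer than {max_length} characters")
    # Compare each path-like token against the organizational terms.
    tokens = {token.lower() for token in re.split(r"[/._-]", suffix)}
    hits = sorted(tokens & org_terms)
    if hits:
        problems.append("suffix contains organizational reference: " + ", ".join(hits))
    return problems


print(check_suffix("Aura/HIRDLS/data1"))
print(check_suffix("NASA/my data"))
```

The remaining guidelines (community recognition, distinguishing a dataset from a paper) call for human judgment and are left out of the check.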
Member Institute using DataCite (RA): California Digital Library and EZID
• EZID is a service that gives researchers a way to manage persistent identifiers for datasets, files, and resources of all types
• The service is available via a machine-to-machine programming interface (an API) and as a web user interface
• Core functions:
  • Create a persistent identifier: DOI
  • Add object location (URL landing page, separate from citation)
  • Add citation metadata (DataCite repository; mandatory fields shown below)
    • Creator (person or organization)
    • Title (long name of dataset)
    • Publisher (holder of the data, i.e., the organization making it available)
    • Publication Year (year when the data was, or will be, first available)
  • Update object location
  • Update object metadata
Registration Agent: DataCite
• DataCite established a scientific data application with the IDF
• The service is run by an open-membership organization of government and educational libraries, focused on improving the scholarly infrastructure around datasets
• Most appropriate RA because of its focus on working with data centers to assign persistent identifiers to datasets, leveraging the Digital Object Identifier (DOI) infrastructure
• United States Member Institutes
  • California Digital Library (Founding Member)
    • Recommended subscription provider because of bulk pricing and EZID Web/API services
  • Office of Scientific and Technical Information, US Department of Energy (new Member, Dec 2010)
  • Purdue University Libraries (Member)
  • Interuniversity Consortium for Political and Social Research, ICPSR (Associate Member)
  • Microsoft Research (Associate Member)
• TIB: German National Library of Science and Technology