Persistent identifiers – an Overview

Juha Hakala, The National Library of Finland, 2011-02-01

Presentation Transcript


  1. Persistent identifiers – an Overview. Juha Hakala, The National Library of Finland, 2011-02-01

  2. Traditional identifiers • Traditional (bibliographic) identifiers are systems such as ISBN (International Standard Book Number) that provide unique and persistent identification for certain types of resources (books, serials, etc.) • They were designed for printed resources before the Internet existed, so the match with digital resources and the Web may be a forced one • These identifiers are well-established international standards with relatively clear roles • It is not always clear how to apply them to e-resources, except that the identified resources themselves should be persistent

  3. Persistent identifiers (PIDs) • A new category of identifiers that are actionable on the Internet; that is, they enable persistent linking (resolution) to the resource or to a surrogate such as a bibliographic description of the resource • Most PIDs are also “traditional” identifiers • When using a DOI, one can identify a book with a DOI and an embedded ISBN, or with a DOI and a local ID string • URN is the only exception to this; URNs must include a traditional identifier • URN namespaces inherit the rules of the traditional identifier used; there is no need to discuss the scope of the URN itself
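
As an illustration of a URN that embeds a traditional identifier, the sketch below validates an ISBN-13 check digit and wraps the number in the `urn:isbn:` namespace. The function names are hypothetical, and dropping the hyphens from the ISBN is an assumption made here for simplicity, not a rule stated in the slides.

```python
def isbn13_is_valid(isbn: str) -> bool:
    """Return True if the string is a 13-digit ISBN with a correct check digit."""
    compact = isbn.replace("-", "").replace(" ", "")
    if len(compact) != 13 or not (compact.isascii() and compact.isdigit()):
        return False
    # ISBN-13 weights alternate 1, 3, 1, 3, ...; the weighted sum of all
    # thirteen digits (check digit included) must be a multiple of 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(compact))
    return total % 10 == 0


def isbn_to_urn(isbn: str) -> str:
    """Build a URN that carries the ISBN as its namespace-specific string."""
    if not isbn13_is_valid(isbn):
        raise ValueError(f"not a valid ISBN-13: {isbn!r}")
    # Assumption: hyphens are dropped so that one book yields one URN string.
    return "urn:isbn:" + isbn.replace("-", "").replace(" ", "")


print(isbn_to_urn("978-0-306-40615-7"))
```

Because the URN is derived from the ISBN, whatever ISBN assignment rules say about what gets a number applies to the URN as well.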

  4. Traditional versus persistent identifiers • Assigning a traditional identifier such as ISBN is (should be?) a controlled process with precise rules • What is identified, by whom • Assigning a PID such as ARK may or may not be a controlled process, and the rules of application may be vague • Sometimes the rules are different: • A book must have just one ISBN, but it may have two PIDs (for instance, ARK and DOI) • The National Library of Finland uses Handles in its DSpace system, but URN is the “official” identifier of these resources

  5. Recommendations • Conflicts between the two identifier groups should be avoided at all costs • If a traditional identifier can be assigned to the resource, use that identifier as a part of the PID • It follows that PIDs that cannot (easily) incorporate traditional identifiers may cause problems • Any identifier (traditional / PID) should have explicit implementation guidelines • If no general guidelines exist, rules must be developed locally; such rules should eventually be aligned at the level of the PID community

  6. Persistent identifiers and the Web: Cool URIs • From the library point of view, cool URIs (URLs) are not proper identifiers at all • The same resource may be available from many URLs • Over time, different resources or variant versions of the same resource may be available at the same URI • There is absolutely no control over cool URI assignment • A user cannot know if a URI is cool or not (most of them aren’t) • Instead, cool URIs are just shelf marks • What is a realistic time frame for cool URI persistence? • Cool URIs can support only resolution; persistent identifiers can be more versatile in this respect • Match with the current / future long-term preservation systems

  7. Services provided by PIDs • Basic question: what services do we need? • Some examples: • Find all locations (URLs) related to the PID • Find bibliographic metadata related to the PID • Retrieve the preservation commitment of the owning organization (concerning the resource at hand) • There is no overall framework / context within which to design the resolution services • Each PID provides a slightly different set
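
The three example services can be pictured as separate lookups against one PID. The following is a minimal in-memory sketch, not the interface of any actual PID system; `PidRecord`, `PidResolver`, the method names, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PidRecord:
    """Everything this hypothetical resolver knows about one PID."""
    locations: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)
    preservation_commitment: Optional[str] = None


class PidResolver:
    """Toy resolver offering one method per service listed on the slide."""

    def __init__(self) -> None:
        self._records: dict[str, PidRecord] = {}

    def register(self, pid: str, record: PidRecord) -> None:
        self._records[pid] = record

    # Each lookup raises KeyError if the PID has not been registered.
    def find_locations(self, pid: str) -> list[str]:
        return list(self._records[pid].locations)

    def find_metadata(self, pid: str) -> dict[str, str]:
        return dict(self._records[pid].metadata)

    def preservation_commitment(self, pid: str) -> Optional[str]:
        return self._records[pid].preservation_commitment


resolver = PidResolver()
resolver.register(
    "urn:isbn:9780306406157",
    PidRecord(
        locations=["https://example.org/book", "https://mirror.example.org/book"],
        metadata={"title": "Example title"},
        preservation_commitment="permanent",  # made-up value for illustration
    ),
)
print(resolver.find_locations("urn:isbn:9780306406157"))
```

A plain URL corresponds to a single entry in `locations`; the extra methods are where a PID system can offer more than resolution.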

  8. PID-based services in the future • Theoretical basis could be twofold: • Functional Requirements for Bibliographic Records (FRBR) model: work, expression, manifestation • Current theory and practice of long-term preservation based on the migration strategy (and a long tail of manifestations for each work) • This means it must be possible, for instance, to: • Find all works related to the work at hand • Find all expressions related to the work at hand • Find all manifestations of the work at hand • Find out differences between these manifestations

  9. PID-based services in the future (2) • It should also be possible to • Find out who is preserving the resource • Retrieve the rights metadata related to the resource • Retrieve the preservation metadata related to the resource • Retrieve the most original version (the oldest preserved manifestation) of the resource • Retrieve the latest (and supposedly the easiest to use) manifestation of the resource • …
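
Under a migration strategy, each resource accumulates manifestations over time, so "most original" and "latest" become queries over creation dates. The sketch below assumes each manifestation records such a date; the class, the field names, and the PID strings are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Manifestation:
    pid: str           # hypothetical PID string
    file_format: str
    created: date      # assumed to be recorded when the file is ingested or migrated


def oldest_manifestation(items: list[Manifestation]) -> Manifestation:
    """The most original preserved version."""
    return min(items, key=lambda m: m.created)


def latest_manifestation(items: list[Manifestation]) -> Manifestation:
    """The most recent (and supposedly easiest to use) version."""
    return max(items, key=lambda m: m.created)


history = [
    Manifestation("example-pid:m2", "PDF/A", date(2009, 3, 1)),
    Manifestation("example-pid:m1", "WordPerfect", date(1994, 6, 15)),
    Manifestation("example-pid:m3", "EPUB", date(2010, 11, 20)),
]
print(oldest_manifestation(history).pid, latest_manifestation(history).pid)
```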

  10. Example: qualitative social scientific data set • The work itself should be described; one metadata element should be the PID • Expressions (translations to other languages) should have their own PIDs, linked to the work-level record • There may be multiple manifestations (relational database, Excel table, etc.) of each expression; each one should have its own PID, and there should be links to the work / expressions • In this environment, it would make sense to provide links to the work, and let users choose the most appropriate manifestation • Choice of the language, file format, etc.
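
The data set example maps onto a small three-level structure in which every level carries its own PID and the user's choice of language and file format selects one manifestation. This is a sketch under assumptions: the class layout, the `choose_manifestation` helper, and all PID strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Manifestation:
    pid: str
    file_format: str


@dataclass
class Expression:
    pid: str
    language: str
    manifestations: list[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    pid: str
    title: str
    expressions: list[Expression] = field(default_factory=list)


def choose_manifestation(work: Work, language: str, file_format: str) -> Optional[Manifestation]:
    """Start from the work-level record and pick the manifestation the user asked for."""
    for expression in work.expressions:
        if expression.language != language:
            continue
        for manifestation in expression.manifestations:
            if manifestation.file_format == file_format:
                return manifestation
    return None  # no manifestation in that language and format


survey = Work(
    pid="example-pid:work",
    title="Interview data set",
    expressions=[
        Expression("example-pid:fi", "fi", [
            Manifestation("example-pid:fi-sql", "relational database"),
            Manifestation("example-pid:fi-xls", "Excel table"),
        ]),
        Expression("example-pid:en", "en", [
            Manifestation("example-pid:en-xls", "Excel table"),
        ]),
    ],
)
chosen = choose_manifestation(survey, "en", "Excel table")
print(chosen.pid if chosen else "not available")
```

Links only need to point at the work-level PID; the selection step then takes the user to a specific manifestation.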

  11. Recommendations (2) • Services supported by PID systems need a facelift • Many systems were designed 10+ years ago, when digital object management systems were still in their infancy • Upgrades must be done in a non-destructive manner (existing implementations must be compliant with the new version) • All aspects of PID systems should be standardized • Some PIDs (e.g. ARK and PURL) have never reached a standard status, and at best only one part of the system (identifier syntax) has been published as a standard • More (and better) open source implementations are needed

  12. Conclusion • There will be multiple PIDs in existence in the future (just like there are now) • Once a system has been chosen, you cannot give it up • PID supporters and cool URI proponents will most likely continue talking past one another for quite some time, but: • Given the time frame over which national libraries & archives must preserve resources (centuries) and the technical complexity of this task, cool URIs fall short of the requirements in several ways; instead, PIDs must be used • PID systems are to some extent “work in progress”
