slide1

DOI SYSTEM: OVERVIEW

International DOI Foundation

slide2

doi>

Outline / Key concepts

  • Origins of the DOI System
  • Current position of DOI System activities
  • Persistence
  • Actionable identification
  • Interoperability
  • System components
  • Standardisation
  • DOI System applications
slide3

Further reading

DIGITAL OBJECT IDENTIFIER (DOI®) SYSTEM

Article in: Encyclopedia of Library and Information Sciences, third edition (Taylor & Francis, forthcoming)

http://www.doi.org/overview/070710-Overview.pdf

slide4

The DOI System

  • DOI (Digital Object Identifier) System: www.doi.org
  • Initially developed from the publishing industry but now wider
    • a non-profit collaboration to develop infrastructure for persistent identification and management of content
    • Approx 2000 user organisations (through agencies)
    • CrossRef (scholarly publishers); EC; science data; major ISBN agencies; etc.
  • Currently being standardised in ISO (TC46/SC9)
    • the home of ISBN etc “content identifiers”
  • One application of the Handle System®
    • adds to it additional features – social and technical infrastructure, policies, metadata management
    • focus on one area of interest (content/intellectual property)
    • offers a specific data model based on indecs (discussed later)
    • DOI System technology equally applicable for parties and licences
slide5

1966: ISBN began “identification numbering”

  • “In 1965 the largest British book wholesaler WH Smith announced their intention to move their wholesaling and stock distribution operation to a purpose built warehouse in Swindon [in 1967]. To aid efficiency they would install a computer, and this would necessitate the giving of numbers to all books held in stock…”
  • “The idea of numbering books is not new. One British publishing house has been giving numbers to its books for nearly a hundred years. What is an entirely new concept, however, is that numbers should be given to all books; that these numbers should be unique and non-changeable; and that they should be allocated according to a standard system…”
    • (David Whitaker, The Bookseller, May 27 1967)
slide6

ISO continues “identification numbering”

http://www.collectionscanada.ca/iso/tc46sc9/

Information and Documentation - Identification and Description

ISO 2108 International Standard Book Numbering (ISBN)

ISO 3297 International Standard Serial Number (ISSN)

ISO 3901 International Standard Recording Code (ISRC)

ISO 10444 International Standard Technical Report Number (ISRN)

ISO 10957 International Standard Music Number (ISMN)

ISO 15706 International Standard Audiovisual Number (ISAN)

ISO 15707 International Standard Musical Work Code (ISWC)

ISO Project 20925 Version identifier for Audiovisual Works (V-ISAN)

ISO Project 21047 International Standard Text Code (ISTC)

ISO Project 27729 International Standard Name Identifier

ISO Project 26324 Digital Object Identifier System

  • Trend towards identifiers of abstract entities
  • All ISO TC46/SC9 identifiers now carry mandatory structured metadata
    • to specify the item identified (either from the start, or when revised)
slide7

Web-related identifiers

  • URI, URL and URN
    • Not sophisticated enough alone for content management
    • Additional techniques: PURLs, RDF, SW, ARK, N2T, Handle, etc
  • Related standards:
  • Open URL
    • A syntax to create web-transportable packages of metadata and/or identifiers about an information object
    • Not an identifier, but a complementary technology for appropriate redirection of identifier resolution
    • in use with URLs, Digital Object Identifiers (DOI names)
  • "info" URI Registry
    • Turn legacy identifiers (e.g. info:lccn/2002022641) into URIs
    • IETF RFC 4452: The "info" URI Scheme for Information Assets with Identifiers in Public Namespaces. http://info-uri.info/
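
The example above, info:lccn/2002022641, has the shape "info:" + namespace + "/" + identifier. As a minimal sketch that assumes only this shape (the function names are illustrative and do not come from any registry tooling), the two parts could be separated and recombined like this:

```python
def parse_info_uri(uri: str) -> tuple[str, str]:
    """Split an "info" URI into (namespace, identifier)."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "info":
        raise ValueError(f"not an info URI: {uri!r}")
    # Only the first "/" separates namespace from identifier.
    namespace, slash, identifier = rest.partition("/")
    if not slash or not namespace or not identifier:
        raise ValueError(f"expected info:<namespace>/<identifier>, got {uri!r}")
    return namespace, identifier


def make_info_uri(namespace: str, identifier: str) -> str:
    """Express a legacy identifier in the "info" URI form."""
    return f"info:{namespace}/{identifier}"


namespace, identifier = parse_info_uri("info:lccn/2002022641")
print(namespace, identifier)
```

The sketch does no namespace-specific normalisation; whether a given namespace needs any is outside what this slide states.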

Note: DOI System is not designed ONLY for the web, but it is the current most common digital environment.

slide8

Terminology: the over-used term “identifier”

“Identifier” as numbering schemes

  • Registries
  • Normally central control, commitment
  • Examples: ISBN, EAN bar codes, IANA, ITU phone numbering plans etc
  • Normally focus on attributes (metadata)

“Identifier” as syntax specifications

  • Normally little central control
  • e.g. URI (URL); MPEG-21 DII
  • Few structured attributes, low barriers to entry
  • Some more structured than others: e.g. URN, info URI

Other confusions:

  • Some practical systems use both schemes and specifications
  • Representations and interactions between different schemes and specifications:
    • e.g. an ISBN can be expressed as a URL, as an EAN bar code, a DOI name, etc
  • Identifier as “system” versus as a “unique label”
  • Schemes begin to be used for things outside scope
slide9

1995: Armati Report

  • Information Identification - a report to STM publishers (Mar 95)
  • Uniform File Identifiers - a report to AAP publishers (Oct 95)
  • “..need to unify in one scheme music, audiovisual, document management, internet engineering, digital libraries, copyright registration and object based software” [i.e. web was not the focus]
  • “..maximise utility of digital objects; enable core interoperability; enable integration of disparate sourced data; ability to trace ownership to manage rights”
  • requirements:
    • protect legacy investments
    • enable interoperability
    • provide link between digital and physical
    • maintain privacy of users
    • have persistence
    • standard syntax
    • global scalability
    • global uniqueness
    • global meaning
  • Led to launch of DOI System initiative (AAP committee, Uniform File Identifier)
slide10

(1) DOI System: development in three tracks

[Diagram labels: Single redirection (persistent identifier); Metadata; Multiple resolution; Initial implementation; Full implementation; Activity tracking; Other efforts, standards, etc]

A continuing development activity

slide11

(2) Creation of an organisation

[Diagram labels: International DOI Foundation; Operating Federation; Agencies; members; Clients; Key driver: spend on development; Key driver: cost reduction]

slide12

Current DOI System activity (Oct 2007)

  • Source: http://dx.doi.org/10.1000/127 (restricted access)
slide13

Current strategy

  • Focus on enabling current RAs to generate more DOI names
  • New RAs in new areas
  • Social infrastructure development (RA policies)
  • Business model:
    • IDF to RA: incentive scheme with large discounts per DOI name for large numbers of registrations, e.g. 25% -> 90%+
    • RA to C: IDF has no role in this
slide14

Persistence

  • “It is intended that the lifetime of a [persistent identifier] be permanent. That is, the [persistent identifier] will be globally unique forever, and may well be used as a reference to a resource well beyond the lifetime of the resource it identifies or of any naming authority involved in the assignment of its name.”
  • [Persistent Identifier] = URN in IETF RFC 1737: Functional Requirements for Uniform Resource Names. (http://www.ietf.org/rfc/rfc1737.txt)
  • Persistence is more a matter of social issues than technical solutions
  • Technology can assist.
slide15

Persistent identifier applications

  • ISSUES
  • What are we identifying with this identifier? [content not just bits]
  • What are we resolving to from this identifier?
  • What, if any, explicit metadata are we making available?
  • How will the cost of providing the infrastructure be met?
  • THEMES
  • Identification of entities of all forms
    • To be used in variety of contexts
  • Appropriate use of metadata at appropriate level
    • Development of ontology tools to describe entity relationships
  • Persistent → Interoperable → Precise → Automation → Logic
slide16

Persistent identifier applications

  • DOI name = Digital Object Identifier Name
  • An implemented identifier system
  • Packaged system of components
  • Principles of persistent identification including semantically consistent interoperation
  • Implemented identifier systems
    • actionable labels following a specification
    • e.g. Bar code system, DOI System
    • “if you use this system, then the label IS actionable”
    • Packaged system offering label + tools + business model
    • A packaged system is not essential, but is convenient
slide17

[Diagram: DOI System components: Syntax, Resolution, Data Model, Policies]

slide18

DOI name syntax can include any existing identifier, formal or informal, of any entity

  • An identifier “container” e.g.
    • 10.1234/5678
    • 10.5678/978-0-7645-4889-4
    • 10.2224/2007-01-0verview-DOI
  • NISO standard Z39.84
  • First class object: name
    • Not “intelligent” as a label
    • Cannot tell what it is from looking at the DOI name
  • Redirection through resolution
slide19

[Diagram: links to Content labelled URL or DOI; other labels: Assigner, Content, DOI directory]

slide20

Resolution allows a DOI name to link to any & multiple pieces of current data

  • Resolve from DOI name to data
    • initially Location (URL) – persistence
  • May be to multiple data:
    • Multiple locations
    • Metadata
    • Services
    • Extensible
  • Uses the Handle System
    • Implementing URI/URN concept
    • Advantages of granularity, scalability, administrative delegation, security, etc
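
For the simplest case, resolution to a single location, a DOI name is commonly handed to a web resolver as part of a URL; this deck itself cites http://dx.doi.org/10.1000/127. The sketch below only builds such a URL and makes no network request. The helper name `doi_to_url` and the percent-encoding choice are assumptions made for illustration.

```python
from urllib.parse import quote

# Resolver address taken from the link cited elsewhere in this deck.
RESOLVER = "http://dx.doi.org/"


def doi_to_url(doi_name: str, resolver: str = RESOLVER) -> str:
    """Build a resolver URL for a DOI name, percent-encoding unsafe characters."""
    # Keep "/" literal so the prefix/suffix separator survives.
    return resolver + quote(doi_name, safe="/")


print(doi_to_url("10.1000/127"))
print(doi_to_url("10.1000/ISBN 0764548891"))
```

Following the resulting link (for example in a browser) is what triggers the redirect to the currently registered location.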

slide21

Why do we need “metadata”?

  • Having an identifier alone doesn’t help – we want to know “what is this thing that’s identified?”
    • we want to know precisely
    • precisely enough for automation
  • There’s lots of metadata already, which should be (re-)used
  • People use different schemes: need to map from one scheme to another (e.g. does “owner” in scheme A mean “owner” in scheme B?)
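
The "owner" question can be made concrete with a small sketch: each scheme maps its own terms onto a shared set of concepts, and a term translates only when both schemes point at the same concept. Everything here (the two schemes, the "hub:" concept names, the `translate` function) is hypothetical and is not drawn from any real dictionary.

```python
from typing import Optional

# Hypothetical vocabularies: local term -> shared concept.
SCHEME_A = {"owner": "hub:rights_holder", "title": "hub:title"}
SCHEME_B = {"owner": "hub:physical_possessor", "name": "hub:title"}


def translate(term: str, source: dict, target: dict) -> Optional[str]:
    """Return the target-scheme term with the same meaning, or None."""
    concept = source.get(term)
    if concept is None:
        return None
    for target_term, target_concept in target.items():
        if target_concept == concept:
            return target_term
    return None


print(translate("title", SCHEME_A, SCHEME_B))  # same concept, different label
print(translate("owner", SCHEME_A, SCHEME_B))  # same label, different concept
```

Matching on the shared concept rather than on the label is what keeps two different meanings of "owner" from being silently merged.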
slide22

DOI System data model

  • The underlying model of how data within the DOI System relates to other data
  • Two components
    • Data Dictionary + DOI Application Profile Framework
  • Data Dictionary
    • Provides tool for precise description of entity through metadata (and mapping to other schemes).
  • DOI Application Profile framework.
    • Provides means of relating entities: grouping entities and expressing relationships
    • A mechanism for grouping DOI names with similar properties
  • DOIs, APs, and DOI System services built using these:
    • have many-to-many relationships: expressed through multiple resolution (handle)
    • may have precise descriptions: expressed through metadata in Data Dictionary
slide23

Application Profile (AP) Framework

  • Entities are identified by DOI names
  • The properties of groups of DOI names are defined as APs
  • APs have one or more Services
  • Services have definitions

[Diagram: DOI names (965, 876, 456, 453, 784, 369, 908) grouped into Application Profiles, each linked to Service Instances and Service Definitions]

slide24

Application Profile (AP) Framework

[Diagram repeated from the previous slide, with one Application Profile, Service Instance and Service Definition highlighted (labels 453 and 784)]

  • New APs and services may be created or made available
  • One change to an AP to affect all DOI names within that AP
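
The grouping idea can be sketched with a small data structure: services are attached to the profile, not to individual DOI names, so one change to the profile reaches every name in it. This is an illustrative model only; the class, the profile name and the service names are assumptions, not the IDF's data model.

```python
from dataclasses import dataclass, field


@dataclass
class ApplicationProfile:
    name: str
    services: set[str] = field(default_factory=set)
    doi_names: set[str] = field(default_factory=set)


def services_for(doi_name: str, profiles: list[ApplicationProfile]) -> set[str]:
    """Collect the services of every profile that contains this DOI name."""
    found: set[str] = set()
    for profile in profiles:
        if doi_name in profile.doi_names:
            found |= profile.services
    return found


# Hypothetical profile grouping two of the example names used in this deck.
ap = ApplicationProfile(
    name="example-profile",
    services={"landing-page"},
    doi_names={"10.1234/5678", "10.1234/OPOCE_presentation"},
)

# One change to the AP affects all DOI names within it.
ap.services.add("citation-lookup")
print(sorted(services_for("10.1234/5678", [ap])))
print(sorted(services_for("10.1234/OPOCE_presentation", [ap])))
```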
slide25

Metadata tools:

    • a data dictionary to define
    • a grouping mechanism to relate
  • Necessary for interoperability
    • “Enabling information that originates in one context to be used in another in ways that are as highly automated as possible”.
  • Able to use existing metadata
    • Mapped using standard dictionary
    • can describe any entity at any level of granularity

<indecs>

Data Dictionary + DOI AP framework

slide26

DOI System policies allow any business model for practical implementations

  • Implementation through IDF
    • Governance and agreed scope, policy, “rules of the road” , central tools (dictionary, resolution mechanism)
    • Cost-recovery (self-sustaining)
  • Registration agencies (“franchise”)
    • Each can develop own applications
    • Use in “own brand” ways appropriate for their community
    • Examples: CrossRef, OPOCE
slide27

Costs

  • For an everyday user:
    • Free: any DOI name may be resolved by anyone
    • No obligations
  • For an assigner:
    • Must work through a Registration Agency
    • Cost depends on application: DOI registration is bundled in
      • e.g. CrossRef – crosslinking of citations: for a publisher, from $275 per year (2008)
  • For a Registration Agency:
    • Must be a full RA member of the International DOI Foundation
    • Fees based on volume
  • Developing, managing, implementing, standardising, etc:
    • Paid for by International DOI Foundation (open to anyone)
slide28

More than an identifier…

Identify

DOI name syntax can include any existing identifier, formal or informal, of any entity, e.g.

  • 10.2341/0-7645-4889-1
  • 10.5678/978-0-7645-4889-4
  • 10.1000/ISBN 0764548891
  • 10.1234/OPOCE_presentation
  • 10.2224/2007-1-29-CENDI-DOI

Describe

DOI name metadata can be of any type, standard or proprietary, e.g.

  • OnixForBooks
  • OnixForSerials
  • IEEE/LOM
  • MARC
  • Dublin Core
  • Proprietary scheme

(but if you want to interoperate with anyone else in the DOI System network, you map to the <indecs> Data Dictionary (iDD))

Resolve

Handle resolution technology allows you to access any kind of Service associated with your DOI name.

  • A package of services is defined for an Application Profile
  • These services depend on metadata

slide29

Standardisation of DOI System (ISO TC46/SC9)

  • DOI System as ISO TC46 standard: entire DOI System
  • Refer to component tools (Handle System, Data Dictionary, etc) as informative references
  • Aim to separate existing “DOI Handbook” into formal standard (ISO) and operating manual (IDF)
  • Show that DOI System supports (does not compete with) other TC46/SC9 “identifiers”: offers option of adding Internet actionability, interoperability, in a standard way
  • Draft now finalised
  • Supporting materials (response to comments, FAQ) available
  • 2008 standard?
  • Recent overview article is based on ISO draft:
    • http://www.doi.org/overview/070710-Overview.pdf
  • DOI Handbook to be revised
slide30

DOI System applications

  • The main use of the DOI System is not simply to register an identifier
  • It is to make use of the identifier in a SERVICE offered to users
  • E.g. CrossRef provides a bibliographic citation look-up service, pre- and post-production, across hundreds of publishers
  • It uses DOI names as one part of its service
  • It has become a de-facto requirement for academic publishing
slide31

Application issues

  • Multiple services may exist for an identifier
    • Don’t assume only monopoly services
    • One service may be definitive; some may be better than others
  • Multiple identifiers
    • Need to distinguish abstractions, representations, compound objects
    • Relation of DOI names to other identifiers (Bookland DOIs etc)
  • Interoperability becomes more important as an economic feature when there are multiple services or multiple uses – which there will be eventually
    • Don’t design only for today
  • Common frameworks for naming and meaning (to do all this) become important when services cut across silos; across media; from different sources; etc
    • Indecs–based approach (like ONIX etc)
  • Multiple resolution: returns multiple results in response to a request (e.g. a choice, an automated service)
    • need some way of grouping and ordering those results, e.g. Handle value typing
slide32

DOI names work with existing identifier schemes

  • General case
  • ISO standardisation of DOI System
    • “A DOI name is not intended as a replacement for other identifier schemes, but when used with them may enhance the identification functionality provided by those systems with additional functionality…”
  • Incorporate the other identifier into the DOI name syntax

and/or

  • Record the other identifier in the DOI name metadata.
  • Each scheme retains its autonomy but works together
  • ISBN and ISSN have already agreed options
slide33

DOIs can be used to define and declare

  • What does this DOI identify (precisely)?
    • For interoperable uses: use in services outside the control of the assigner
  • Metadata scheme already worked out
    • Kernel plus Application Profiles (extensions)
  • Standard ways of declaring simple metadata
    • e.g. for Open URL uses
    • Interoperability is a key aspect which will tip requirements
slide34

DOI names to define the entity

  • Suppose I have here a pdf version of Defoe’s “Robinson Crusoe” issued by Norton. I find an identifier – is it of:
    • All works by Daniel Defoe?
    • The work “Robinson Crusoe”?
    • The Norton edition of “Robinson Crusoe”?
    • The pdf version of the Norton edition of…. ?
    • The pdf version of…held on this server…?
  • Most digital objects of interest have compound form, simultaneously embodying several referents.
    • Multiple identifiers may be necessary (like music CDs)
  • Identifiers assigned in one context may be encountered, and may be re-used, in another place or time - without consulting the assigner. You can’t assume that your assumptions made on assignment will be known to someone else.
slide35

DOI names to express relationships

  • DOI name of one item may be related to DOI name of another
  • Through multiple resolution, metadata, Application Profiles…
  • Example: A DOI name of a work could resolve to several available formats, languages, etc.

[Diagram: Article (DOI name 12345) related to Chinese version (DOI name 56789)]

slide36

DOI names for “non-traditional” entities

  • Examples:
  • Scientific data
    • TIB (Registration Agency) is an example
  • Biological nomenclature
    • disambiguation and extension of the current taxonomy models: Names-4-Life: (IDF member)
  • Clinical Trials
    • identifying specific trials and sub-sets of items
    • UK project currently using DOI names on pilot basis
slide37

DOI names for “new” traditional entities

  • Example:
  • Book fragments – tables, figures, chapters, exercises
  • Interactive e-books
  • Some may use other identifiers which could become DOI names;
  • Some may be in scope but not yet widely used (e.g. ISBNs for Chapters);
  • Others may require new DOI names
  • Book Industry Study Group (BISG) working on this
  • Others:
  • Nature “precedings”; Scirus “topic pages”; some blogs?
slide38

DOI name multiple resolution

Significant benefit of Handle System:

  • Resolve from one DOI name to several different results
  • One-to-many linkage
  • Resolution request would give:
    • all results, or
    • all results of one type
  • Need a framework to build these applications on: group similar uses so that the results are predictable and can be used across applications
  • DOI Application Profile framework
  • Handle System “data value typing”
  • CrossRef to use for e.g. location-dependent resolution
  • Other business cases?
  • Could express relationships (ISTC to ISBNs etc)
slide39

Handles resolve to typed data

Handle       Data type   Index   Handle data
10.123/456   URL         1       http://acme.com/….
             URL         2       http://a-books.com/….
             DLS         9       acme/repository
             HS_ADMIN    100     acme.admin/jsmith
             XYZ         12      1001110011110

Rules for data type construction: www.handle.net/overviews/types.html
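
The record shown on this slide can be modelled as a list of typed values, which makes the two request styles ("all results" or "all results of one type") easy to see. This is a minimal in-memory sketch and not the Handle System protocol or client API. The `HandleValue` class and `resolve` function are illustrative, and the truncated URLs are copied as they appear on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HandleValue:
    value_type: str
    index: int
    data: str


# Values copied from the slide's example record for handle 10.123/456.
RECORDS = {
    "10.123/456": [
        HandleValue("URL", 1, "http://acme.com/…."),
        HandleValue("URL", 2, "http://a-books.com/…."),
        HandleValue("DLS", 9, "acme/repository"),
        HandleValue("HS_ADMIN", 100, "acme.admin/jsmith"),
        HandleValue("XYZ", 12, "1001110011110"),
    ]
}


def resolve(handle: str, value_type: Optional[str] = None) -> list[HandleValue]:
    """Return all values for a handle, or only those of one data type."""
    values = RECORDS.get(handle, [])
    if value_type is None:
        return list(values)
    return [v for v in values if v.value_type == value_type]


print(len(resolve("10.123/456")))
print([v.index for v in resolve("10.123/456", "URL")])
```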

slide40

DOI name contextual resolution

  • Resolve DOI name with some additional information to give results depending on context
  • Open URL: see e.g. http://www.crossref.org/03libraries/16openurl.html
    • Resolve to same content at different location (by user)
  • Full contextual resolution: Handle System can do this (DVIA)
    • Resolve to different content (by user)
    • Of interest re licensing etc but not yet part of DOI System
  • Steps in evolution:
    • URLs: not useful for long term management
    • naming and resolution: “get me the right thing”
    • contextual resolution: “get me the thing that is right for me” (e.g. “that I have access rights for”)
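
The simpler case above, the same content at a different location depending on the user, can be sketched as a lookup that takes a context hint and falls back to a default. The record, the context labels and the example.* URLs are all hypothetical. The sketch shows the idea only, not how CrossRef, OpenURL resolvers or the Handle System implement it.

```python
from typing import Optional

# Hypothetical record: one DOI name, several copies of the same content.
LOCATIONS = {
    "10.1234/5678": {
        "default": "http://publisher.example/article/5678",
        "library-a": "http://mirror.library-a.example/5678",
    }
}


def resolve_in_context(doi_name: str, context: Optional[str] = None) -> str:
    """Pick the copy that suits the requester, else the default location."""
    copies = LOCATIONS[doi_name]
    if context is not None and context in copies:
        return copies[context]
    return copies["default"]


print(resolve_in_context("10.1234/5678"))
print(resolve_in_context("10.1234/5678", "library-a"))
```

Resolving to different content per user (for example by access rights) would need more than a location table, which fits the note that this is not yet part of the DOI System.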
slide41

DOI name tools

Several DOI Name Tools have been developed, from a variety of sources

http://www.doi.org/tools.html

  • Such as plug-ins, e.g. Adobe Acrobat plug-in

  • At different stages of development or use
slide42

DOI SYSTEM: OVERVIEW

International DOI Foundation