
the OAI Protocol for Metadata Harvesting an update

the OAI Protocol for Metadata Harvesting an update. Herbert Van de Sompel, Los Alamos National Laboratory – Research Library.


Presentation Transcript


  1. the OAI Protocol for Metadata Harvesting an update Herbert Van de Sompel Los Alamos National Laboratory – Research Library

  2. The Open Archives Initiative has been set up to create a forum to discuss and solve matters of interoperability between preprint solutions, as a way to promote their global acceptance. Paul Ginsparg, Rick Luce & Herbert Van de Sompel

  3. Luce * Van de Sompel * Ginsparg

  4. 2 core motivations • as a systems librarian: change the system • as a researcher: find (technical) ways to facilitate the change

  5. as a systems librarian • optimizing the output • the input is far from optimal [diagram labels: P U B D I S L I B A R]

  6. eprint systems • xxx e-print archive • (Physics - 1991 - Los Alamos - Ginsparg) • RePEc • (Economy - Surrey U - Krichel) • NCSTRL • (Computer Science - Cornell U - Lagoze) • NDLTD • (Theses - Virginia Tech - Fox) • CogPrints • (Cognitive Sciences - Southampton U - Harnad)

  7. as a researcher • eprints are an attractive building block in the ongoing transformation of scholarly communication • but: interoperability could increase the impact of e-prints: • amongst e-print solutions • with building blocks that implement other functions of scholarly communication • with the established communication system

  8. UPS Prototype: eprints discovery • 1999: Van de Sompel, Krichel, Nelson • results: • insights regarding how un-interoperable the systems were • a cross-repository searching and linking service • recommendations to the Santa Fe meeting: • data provider / service provider model • metadata harvesting • simplicity

  9. evolution towards OAI-PMH v.2.0 • Santa Fe Convention [02/2000] • OAI-PMH 1.0 [01/2001] • OAI-PMH 2.0 [06/2002]

  10. (columns: Santa Fe convention | OAI-PMH v.1.0/1.1 | OAI-PMH v.2.0)
      nature:    experimental | experimental | stable
      verbs:     Dienst | OAI-PMH | OAI-PMH
      requests:  HTTP GET/POST | HTTP GET/POST | HTTP GET/POST
      responses: XML | XML | XML
      transport: HTTP | HTTP | HTTP
      metadata:  OAMS | unqualified Dublin Core | unqualified Dublin Core
      about:     eprints | document like objects | resources
      model:     metadata harvesting | metadata harvesting | metadata harvesting

  11. OAI-PMH model [diagram: harvester (service provider) sends Requests to repository (data provider), which returns Replies; label: 6 OAI-PMH]

  12. OAI-PMH model: harvester (service provider), repository (data provider) • Supporting protocol requests: • Identify • ListMetadataFormats • ListSets • Harvesting protocol requests: • ListRecords • ListIdentifiers • GetRecord
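As a minimal sketch of how a harvester might issue the six requests listed above, the snippet below builds HTTP GET URLs with the Python standard library. The base URL `http://repository.example.org/oai` and the helper name `build_request` are assumptions made for illustration; `verb`, `metadataPrefix` and `from` are request arguments defined by OAI-PMH v.2.0.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs, grouped as on the slide.
SUPPORTING_VERBS = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING_VERBS = ("ListRecords", "ListIdentifiers", "GetRecord")


def build_request(base_url, verb, args=None):
    """Return an HTTP GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in SUPPORTING_VERBS + HARVESTING_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(args or {})
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL, used only for illustration.
BASE_URL = "http://repository.example.org/oai"

identify_url = build_request(BASE_URL, "Identify")
list_records_url = build_request(
    BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc", "from": "2002-06-01"}
)

print(identify_url)
print(list_records_url)
```

The arguments are passed as a dictionary because `from` is a reserved word in Python and cannot be used as a keyword argument.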

  13. OAI-PMH model: harvester (service provider), repository (data provider) • Records • Identifier • Datestamp • Set
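To make the identifier, datestamp and set concepts concrete, here is a small sketch that reads them from a record header in an OAI-PMH v.2.0 style XML response. The sample record, its identifier `oai:repository.example.org:0001` and the function name `parse_header` are invented for illustration; only the element names and the namespace come from the protocol.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH v.2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample record; the values are made up for this example.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:repository.example.org:0001</identifier>
    <datestamp>2002-06-14</datestamp>
    <setSpec>physics</setSpec>
  </header>
</record>"""


def parse_header(record_xml):
    """Extract identifier, datestamp and set membership from one record."""
    record = ET.fromstring(record_xml)
    header = record.find("oai:header", OAI_NS)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
        "sets": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
    }


header_info = parse_header(SAMPLE_RECORD)
print(header_info)
```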

  14. federated services [diagram: A&I, image, FTXT, OPAC, e-print]

  15. metadata harvesting via OAI-PMH [diagram: A&I, image, OPAC, e-print, FTXT; harvester; metadata; FTXT]
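The harvesting arrangement on this slide can be illustrated with a toy, in-memory sketch: several data providers expose metadata, and one service provider collects it into a single index and searches across it. The class names, record fields and sample titles are all invented for this illustration, and no network or real OAI-PMH exchange is involved.

```python
class DataProvider:
    """Toy stand-in for a repository that exposes metadata records."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def list_records(self):
        # Stands in for answering a ListRecords request.
        return list(self._records)


class ServiceProvider:
    """Toy harvester that builds one index from many data providers."""

    def __init__(self):
        self.index = []

    def harvest(self, provider):
        for record in provider.list_records():
            self.index.append({"source": provider.name, **record})

    def search(self, term):
        term = term.lower()
        return [r for r in self.index if term in r["title"].lower()]


# Made-up sample data for two of the repository types named on the slide.
eprints = DataProvider(
    "e-print", [{"identifier": "eprint-1", "title": "Quantum gravity notes"}]
)
opac = DataProvider(
    "OPAC",
    [
        {"identifier": "opac-7", "title": "Introduction to gravity"},
        {"identifier": "opac-8", "title": "Medieval maps"},
    ],
)

service = ServiceProvider()
for provider in (eprints, opac):
    service.harvest(provider)

hits = service.search("gravity")
print([(h["source"], h["identifier"]) for h in hits])
```

The point of the sketch is that the search runs on the harvested copy of the metadata, so the service provider does not query each repository at search time.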

  16. metadata harvesting via OAI-PMH [diagram: A&I, image, FTXT, e-print, OPAC; metadata: Author, Title, Abstract, Identifier]
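A hedged sketch of how the four fields on this slide could be expressed as unqualified Dublin Core in the OAI-PMH v.2.0 `oai_dc` container follows. The mapping of Author to `dc:creator` and Abstract to `dc:description` is a common convention assumed here, not something the slide states, and the sample values and the helper name `to_oai_dc` are invented.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

# Assumed mapping from the slide's field names to Dublin Core elements.
FIELD_TO_DC = {
    "Author": "creator",
    "Title": "title",
    "Abstract": "description",
    "Identifier": "identifier",
}


def to_oai_dc(fields):
    """Build an oai_dc element from a dict of slide-style field names."""
    ET.register_namespace("oai_dc", OAI_DC)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC}}}{FIELD_TO_DC[name]}")
        child.text = value
    return root


# Made-up sample values.
record = to_oai_dc(
    {
        "Author": "A. Researcher",
        "Title": "An example e-print",
        "Abstract": "A short abstract.",
        "Identifier": "http://repository.example.org/eprint/1",
    }
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```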

  17. issue solved? • no, just a tiny part of the technical challenges to support discovery • many more technical issues • even more non-technical issues

  18. issue solved? technical [diagram: A, R; interoperable grid; awareness, certification, rewarding, registration, archiving]

  19. issue solved? non-technical • I am happy to leave those to you • but: even for non-technological issues, part of the answer might be found in applying technology

  20. indicators of adoption of OAI-PMH • data providers • service providers • tools • structural support

  21. data providers • 49 registered repositories [11/2001] • 65 registered repositories [03/2002] • 5+ million records • many unregistered repositories

  22. service providers • Arc : cross-searching of registered repositories [Old Dominion U] • [ http://arc.cs.odu.edu ] • OLAC: cross-searching of Language Archive Community repositories • http://www.language-archives.org/index.html

  23. service providers • Scirus scientific search engine [Elsevier] • [ http://www.scirus.com ] • my.OAI : user-tailorable cross-searching of registered repositories [FS Consulting, Inc.] • [http://www.myoai.com] • growing interest from web search engines

  24. OAI-PMH tools • Repository Explorer: interactive exploration of repositories [Virginia Tech] • [ http://www.purl.org/NET/oai_explorer ] • eprints.org: generic OAI-PMH compliant repository software [U of Southampton] • [ http://www.eprints.org ] • ALCME repository and harvester software [OCLC] • [ http://alcme.oclc.org/index.html ]

  25. OAI-PMH flies: structural support • Metadata Harvesting Initiative of the Mellon Foundation • NSDL (NSF funded) • UK FAIR call for proposals to support disclosure of institutional assets (papers, learning materials, etc.) • Institute for Museum and Library Services • several EC projects exploring/supporting usage of OAI-PMH: TEL, Leaf, Cyclades, OA Forum

  26. OAI-PMH flies: and also … • Australian Museums Online & CIMI : OAI conference • NIMH white paper on data archiving for Animal Cognition Research • Library of Congress • National Library of Canada • OCLC thesis database • Illinois State Library Catalogue

  27. future • OAI • OAI-PMH • communities • adoption

  28. the OAI-PMH • release of OAI-PMH v.2.0 [06/2002] • no backwards compatibility with v.1.0/1.1 • stable • migration process for registered repos • ? formal standardization ? • ? SOAP version ~ web services framework [SOAP, WSDL, UDDI] ?

  29. communities • proliferation of community-specific add-ons for: • collection & set level metadata • expressive metadata formats (e.g. qualified DC XML Schema) • shared set-structures • machine readable rights (about the metadata)

  30. adoption • evolution • from talking about OAI-PMH • to talking about projects that use OAI-PMH • to talking about projects and failing to mention they use OAI-PMH • => OAI-PMH becomes part of the infrastructure

  31. I just wanted to report what I consider an OAI success. I discovered that RLG had harvested records for two of the American Memory collections I had made available and integrated them into their Cultural Materials Initiative service without the need for a single e-mail or phone call. They reported that it was working very well for them. [Caroline Arms, Library of Congress]

  32. http://www.openarchives.org openarchives@openarchives.org

  33. the OAI: not really an organization • Executive: Carl Lagoze & Herbert Van de Sompel • 2000 – 2002 funding from CNI and DLF • Steering Committee • Technical Committee: • protocol revision & stabilization • Alpha testers

  34. OAI-tech • US representatives: Thomas Krichel (Long Island U), Jeff Young (OCLC), Tim Cole (U of Illinois at Urbana-Champaign), Hussein Suleman (Virginia Tech), Simeon Warner (Cornell U), Michael Nelson (NASA), Caroline Arms (LoC), Muhammad Zubair (Old Dominion U), Steven Bird (U Penn.) • European representatives: Andy Powell (Bath U. & UKOLN), Mogens Sandfaer (DTV), Thomas Baron (CERN), Les Carr (U of Southampton)

  35. OAI-PMH 2.0 alpha testers (1/2) • The British Library • Cornell U. -- NSDL project & e-print arXiv • Ex Libris • FS Consulting Inc -- harvester for my.OAI • Humboldt-Universität zu Berlin • InQuirion Pty Ltd, RMIT University • Library of Congress • NASA • OCLC

  36. OAI-PMH 2.0 alpha testers (2/2) • Old Dominion U. -- ARC, DP9 • U. of Illinois at Urbana-Champaign • U. of Southampton -- OAIA, CiteBase, eprints.org • UCLA, Johns Hopkins U., Indiana U., NYU -- sheet music collection • UKOLN, U. of Bath -- RDN • Virginia Tech -- repository explorer
