Reference Linking in Project Euclid

Reference Linking in Project Euclid, with some thoughts on the preservation of digital collections. A presentation at the Workshop on Linking and searching in distributed digital libraries, University of Michigan, Ann Arbor, University Library, March 19, 2002. William R. Kehoe

Presentation Transcript


  1. Reference Linking in Project Euclid
     …with some thoughts on the preservation of digital collections.
     A presentation at the Workshop on Linking and searching in distributed digital libraries, University of Michigan, Ann Arbor, University Library, March 19, 2002
     William R. Kehoe, wrk1@cornell.edu
     Digital Library and Information Technologies, Cornell University Library

  2. Overview
     • Context – what is Project Euclid?
     • Requirements – the constraints for the reference linking system
     • Implementation – some design views
     • Next Steps – our plans for the future
     • Preservation – thinking long-term about digital collections
     William R. Kehoe, Digital Library and Information Technology, Cornell University Library

  3. What is Project Euclid?
     • A partnership of independent publishers of mathematics and statistics journals
     • Publishers provide born-digital versions of their print journals.
     • http://projecteuclid.org

  4. Reference Linking: two viewpoints
     • The publisher's point of view
         • Links to multiple resources add value to the electronic version.
         • MR numbers, CrossRef DOIs, and web links are included in the reference when we find them.
     • The library's point of view
         • The appropriate copy problem: does a link lead to a copy for which the library has viewing/distribution rights?
         • Is the copy an authentic representation of the original?
     • Project Euclid represents publishers.

  5. Purpose
     References in article files are made available as links on HTML abstract pages.
     [Diagram: <<PDF>> to <<HTML>>]

  6. Requirements
     • Automatic processing
     • Extensibility to multiple reference styles
     • Extensibility to multiple input formats
     • Low-cost maintenance
     • High accuracy

  7. Implementation
     [Diagram: a <<PDF>> article (title, author and affiliation, abstract, body, references) is processed into an <<XML>> file with the same parts.]
     Stages: Conversion, Extraction, Parsing, Look-up, Creating Links, Storing

  8. Conversion
     The converter is Derek Noonburg's "pdftotext" utility: http://www.foolabs.com/xpdf/home.html
     [Diagram: a <<PDF>> article (title, author and affiliation, abstract, body, references) passes through the Converter and becomes a <<Text>> file with the same parts.]
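A minimal sketch of the conversion step, assuming `pdftotext` is installed and on the PATH. The slides do not show how the utility was invoked, so the function names and the default output path here are illustrative only; the sketch uses the basic `pdftotext <PDF-file> <text-file>` form.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path):
    """Build the argument list for: pdftotext <PDF-file> <text-file>."""
    return ["pdftotext", str(pdf_path), str(txt_path)]


def convert_pdf_to_text(pdf_path, txt_path=None):
    """Run pdftotext and return the path of the text file it wrote.

    Assumption: the text file sits next to the PDF unless a path is given.
    """
    pdf_path = Path(pdf_path)
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # check=True raises CalledProcessError if the converter fails.
    subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)
    return txt_path
```

Keeping the command construction separate from the call makes it easy to log or test the exact invocation without running the converter.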

  9. Conversion/Extraction activity diagram

  10. Extraction
      A fragment of the Perl module that extracts the references from the text version of an article
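The Perl fragment itself is not part of this transcript. As a rough illustration of what such an extraction step might do, here is a Python sketch under stated assumptions: the reference list follows the last line reading "References" or "Bibliography", and each entry begins with a bracketed label such as `[Ar]`. Both conventions are assumptions for this example, not details taken from the original module.

```python
import re

# Assumed heading line that introduces the reference list.
HEADING = re.compile(r"^[ \t]*(References|Bibliography)[ \t]*$",
                     re.IGNORECASE | re.MULTILINE)
# Assumed entry label, e.g. "[Ar] " at the start of a line.
LABEL = re.compile(r"^\[[^\]]+\]\s")


def extract_references(article_text):
    """Return the reference entries found after the last heading line."""
    headings = list(HEADING.finditer(article_text))
    if not headings:
        return []
    tail = article_text[headings[-1].end():]
    references = []
    for line in tail.splitlines():
        line = line.strip()
        if not line:
            continue
        if LABEL.match(line) or not references:
            references.append(line)          # a new entry starts here
        else:
            references[-1] += " " + line     # continuation of the previous entry
    return references
```

A real extractor would also need to handle page headers, footnotes, and words hyphenated across lines, which this sketch ignores.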

  11. Parsing
      Method Factory: getMRNum(), getYear(), getDOI(), getTitle(), getJournal(), … more …
      Object view: a Reference with the elements MRNum, LinkedString, String, Year, DOI, Title, Journal

  12. Parsing
      Each element of a Reference is extracted by a subroutine customized for how the element appears in a particular journal style.
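One way to picture the method-factory idea is a table that maps a journal style and an element name to a small extractor function. The sketch below is not the original Perl design; the style names `style_a` and `style_b` and the two year patterns are hypothetical and chosen only to show why a per-style routine is needed.

```python
import re


def _year_in_parens(ref):
    """Style where the year appears as '(1989)'."""
    m = re.search(r"\((\d{4})\)", ref)
    return m.group(1) if m else None


def _year_trailing(ref):
    """Style where the year is the last token, e.g. '..., 1994.'"""
    m = re.search(r"(\d{4})\.?\s*$", ref)
    return m.group(1) if m else None


# Hypothetical registry: journal style -> element name -> extractor.
EXTRACTORS = {
    "style_a": {"year": _year_in_parens},
    "style_b": {"year": _year_trailing},
}


def get_extractor(style, element):
    """Factory: return the extractor for one element in one style."""
    try:
        return EXTRACTORS[style][element]
    except KeyError:
        raise ValueError(f"no extractor for {element!r} in style {style!r}") from None


def parse_reference(ref, style, elements=("year",)):
    """Apply the style's extractors and collect the results in a dict."""
    return {name: get_extractor(style, name)(ref) for name in elements}
```

Applying the wrong style to a reference gives a wrong answer rather than an error (the trailing-year routine would read "1000" from "(1989), 993-1000."), which is why the style has to be known per journal.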

  13. Look-up
      Query
      • |IEEE Trans. Automat. Control|chang||||1994||||Stability, queue length and delay of deterministic and stochastic queue
      • |SIAM J. Control Optim.|Dupuis||||1989||||
      Result set
      • 0018-9286|IEEE Trans. Automat. Control|Chang|39|5|913|1994|||95b:90029|Stability, queue length, and delay of deterministic and stochastic queueing networks.
      • |SIAM J. Control Optim.|Dupuis||||1989||||
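The query and result lines on this slide are pipe-delimited records with eleven positions. The sketch below builds and parses such lines. The field names are assumptions inferred from the sample values (the slide does not name them), and positions 8 and 9 are empty in every sample, so they are left as unknown placeholders.

```python
# Assumed field order, inferred from the sample lines on the slide.
FIELDS = ("issn", "journal", "author", "volume", "issue", "page",
          "year", "unknown_8", "unknown_9", "mr_number", "title")


def build_query(**known):
    """Return a pipe-delimited query line; missing fields stay empty."""
    unexpected = set(known) - set(FIELDS)
    if unexpected:
        raise ValueError(f"unknown fields: {sorted(unexpected)}")
    return "|".join(known.get(name, "") for name in FIELDS)


def parse_result(line):
    """Split a result line into a dict keyed by the assumed field names."""
    # maxsplit keeps any '|' inside the final title field intact.
    values = line.split("|", len(FIELDS) - 1)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))
```

Under this reading, a result whose `mr_number` is still empty, as in the second sample line, would be a reference for which no MR number was returned.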

  14. Link Creation
      An HTML anchor tag is inserted into the reference string and saved to an XML file. The User Interface module later uses the linkedString element when creating an Article Abstract page on the fly, so it doesn't have to know how to create the link.
      • <string>[Ar] V. ARNOLD, A-graded algebras and continued fractions, Comm. Pure Appl. Math. 42 (1989), 993-1000.</string>
      • <linkedString>[Ar] V. ARNOLD, A-graded algebras and continued fractions, Comm. Pure Appl. Math. 42 (1989), 993-1000. <a href="http://www.ams.org/mathscinet-getitem?mr=90h:32025" target="_blank">MR 90h:32025</a></linkedString>
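The link-creation step can be sketched as a small function that appends an anchor tag to the reference string. The URL prefix and the anchor's attributes are copied from the example on this slide; the function name and the choice to return the string unchanged when no MR number is available are assumptions for illustration.

```python
from html import escape
from urllib.parse import quote

# URL prefix as it appears in the slide's example link.
MATHSCINET_URL = "http://www.ams.org/mathscinet-getitem?mr="


def make_linked_string(ref_string, mr_number):
    """Append a MathSciNet anchor to a reference string.

    If no MR number was found, the reference is returned unchanged.
    The reference text itself is passed through as-is.
    """
    if not mr_number:
        return ref_string
    href = MATHSCINET_URL + quote(mr_number, safe=":")
    return (f'{ref_string} <a href="{escape(href)}" target="_blank">'
            f"MR {escape(mr_number)}</a>")
```

Because the finished markup is stored with the reference, the page-rendering code only has to emit the stored string.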

  15. Storing
      Stored as an XML file:
      <referenceList>
        <reference>
          <refString></refString>
          <linkedString></linkedString>
          <title></title>
          <journal></journal>
          … more elements …
        </reference>
        <reference>
          … elements …
        </reference>
      </referenceList>
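A minimal sketch of writing such a file with Python's standard `xml.etree.ElementTree`, assuming each reference is held as a dict keyed by the element names shown on the slide. Only the four named elements are written, since the slide elides the rest. Note one difference from the slide's example: assigning the linked string as element text causes the anchor tag to be escaped on output and restored as plain text on reading, rather than stored as nested markup.

```python
import xml.etree.ElementTree as ET

# Element names taken from the slide; the elided ones are not guessed.
ELEMENTS = ("refString", "linkedString", "title", "journal")


def build_reference_list(references):
    """Build a <referenceList> tree from a list of dicts."""
    root = ET.Element("referenceList")
    for ref in references:
        node = ET.SubElement(root, "reference")
        for name in ELEMENTS:
            ET.SubElement(node, name).text = ref.get(name, "")
    return root


def write_reference_list(references, path):
    """Serialize the reference list to an XML file."""
    tree = ET.ElementTree(build_reference_list(references))
    tree.write(path, encoding="utf-8", xml_declaration=True)
```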

  16. Display
      An element in an XML file provides an HTML link on the article's abstract page, which links to a MathSciNet page.

  17. Next Steps
      • More journals
      • Adding DOIs to the abstract page
      • Conversion from LaTeX files
      • Digitized back issues

  18. Addendum on Digital Preservation
      • Libraries and others are considering ways to preserve our digital resources for the long term.
      • One possible solution is the LOCKSS system (Lots of Copies Keep Stuff Safe).
      • Another solution is to preserve the metadata needed to describe and reconstruct a collection while preserving and providing access to the data files.
      The Consultative Committee for Space Data Systems has published a Reference Model for an Open Archival Information System (OAIS). Many of the people working with digital collections in the library and archive world are using this model to plan for long-term preservation.

  19. Archival Information Package
      From the Reference Model for an Open Archival Information System (OAIS)
      [Diagram: an Archival Information Package comprises Content Information and Preservation Description Information. Content Information comprises a Data Object (a << file >> Digital Object) and Representation Information. Preservation Description Information comprises Reference Information, Provenance Information, Context Information, and Fixity Information.]
      Most digital collections contain some form of the objects in blue. OAIS-compliant systems also contain the metadata objects in yellow.
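To make the package structure concrete, here is a small Python sketch that mirrors the object names in the diagram. The field types and the use of a SHA-256 digest as Fixity Information are assumptions for illustration; the OAIS reference model describes information categories and does not prescribe a data format or a checksum algorithm.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    data_object: bytes                                   # the digital object itself
    representation_information: list = field(default_factory=list)


@dataclass
class PreservationDescriptionInformation:
    reference: str      # identifier for the content
    provenance: str     # origin and custody history
    context: str        # relationship to other material
    fixity: str         # integrity check value (assumed: "sha256:<hex>")


@dataclass
class ArchivalInformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation


def sha256_fixity(data):
    """Compute the fixity value used in this sketch."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_fixity(aip):
    """Return True if the stored fixity value matches the data object."""
    return aip.pdi.fixity == sha256_fixity(aip.content.data_object)
```

A periodic run of `verify_fixity` over stored packages is one simple way an archive could detect silent corruption of its data files.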

  20. OAIS Functional Model
      From the Reference Model for an Open Archival Information System (OAIS)

  21. For More Information…
      • Project Euclid: http://projecteuclid.org
      • MR Batch Lookup: http://www.ams.org/mrlookup-support/technical_help.html#http
      • Consultative Committee for Space Data Systems: http://www.ccsds.org
      • Reference Model for an Open Archival Information System (OAIS): http://www.ccsds.org/documents/pdf/CCSDS-650.0-R-2.pdf
      • LOCKSS: http://lockss.stanford.edu
