
Exploring IR Technologies

IR Workshop: Managing Scholarly Assets in Institutional Repositories: Sharing Experiences Among JULAC Libraries. 24 February 2006, HKUST Library. Exploring IR Technologies. Ki Tat LAM, Head of Library Systems, The Hong Kong University of Science and Technology Library, lblkt@ust.hk.



  1. IR Workshop: Managing Scholarly Assets in Institutional Repositories: Sharing Experiences Among JULAC Libraries. 24 February 2006, HKUST Library. Exploring IR Technologies. Ki Tat LAM, Head of Library Systems, The Hong Kong University of Science and Technology Library, lblkt@ust.hk

  2. Contents • DSpace Software • SRW/U, Usage statistics, OpenURL • Cross-Searching Technologies • Search engines – Google • OAI-PMH - OAIster, Scirus, HKIR • HKIR • Standardization • Author names; subjects; document types; metadata schema • Document deposition versus linking • Research Assessment Exercise

  3. DSpace Software • Jointly created by MIT Libraries and Hewlett-Packard Company [http://www.dspace.org/] • Open source software – released since 2002 • Adopted by HKUST Library for its IR since February 2003 [http://repository.ust.hk/] • Also adopted for HKUST’s Digital University Archives – migrated to DSpace in October 2004 [http://archives.ust.hk/]

  4. DSpace Software [cont.] • HKUST’s Electronic Journals Online searching service will soon be migrated to DSpace [http://lbapps.ust.hk/ej/] • Adopted by CUHK for its IR (known as SiR) since mid-2004 [http://dspace.lib.cuhk.edu.hk/] • Adopted by CityU for its IR since 2005 [http://dspace.cityu.edu.hk/] • Will be adopted by HKIEd for building its IR

  5. IR Software and Services • Open Source Software • DSpace • GNU EPrints • Fedora • See OSI Guide to Institutional Repository Software [http://www.soros.org/openaccess/software/] • Commercial Software • VITAL from VTLS Inc. – powered by Fedora • DigiTool from Ex Libris • Symposia from Innovative Interfaces Inc.

  6. IR Software and Services [cont.] • Commercial Hosting Services • Digital Commons from ProQuest – powered by the bepress platform

  7. DSpace at HKUST As of 19 February 2006: • Home URL: http://repository.ust.hk/ • IR Software: DSpace Version 1.3.2 • System Software: Fedora Core 4 Linux; Tomcat 5.0; JDK 1.4.2 • Server: Intel Pentium 4 3GHz; 3GB RAM; 80GB hard disk • Content: 2231 documents from 42 departments • Usage: Documents have been accessed 74,467 times since October 2004

  8. DSpace at HKUST • Customizations • Document submission form • Add item form • CJK support • Authentication and authorization • SRW/U interface • Collection and Usage statistics • OpenURL linking

  9. DSpace at HKUST [cont.] • SRW/U Interface • Search and Retrieval for the Web (or by URL) • Base URL: [http://repository.ust.hk/SRW/search/DSpace] • Alternative way of searching the repository - using standard web services • Allows search service providers to issue a federated search to various IRs and deliver the search results in their own GUI interface

  10. Response to the following SRW search request: http://repository.ust.hk/SRW/search/DSpace?query=dc.creator+%3D+%22ip+nancy%22&operation=searchRetrieve&maximumRecords=1&startRecord=1...

  11. XSLT-converted response to the following SRW search request: http://repository.ust.hk/SRW/search/DSpace?query=dc.creator+%3D+%22ip+nancy%22&operation=searchRetrieve&maximumRecords=1&startRecord=1...
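
The request on slides 10 and 11 can be assembled programmatically. The following is a minimal sketch that reproduces only the parameters visible on the slides (the slide URLs are truncated, so any further parameters are not shown here). The function name `build_sru_url` is an illustrative assumption, not part of the SRW/U interface itself.

```python
from urllib.parse import urlencode

# Base URL of the SRW/U interface, as given on slide 9.
BASE_URL = "http://repository.ust.hk/SRW/search/DSpace"


def build_sru_url(base_url, cql_query, maximum_records=1, start_record=1):
    """Build a searchRetrieve request URL from a CQL query string.

    Only the parameters visible on the slides are included; a real
    client may need to send additional ones.
    """
    params = {
        "query": cql_query,
        "operation": "searchRetrieve",
        "maximumRecords": maximum_records,
        "startRecord": start_record,
    }
    # urlencode percent-encodes '=' and '"' and turns spaces into '+'.
    return base_url + "?" + urlencode(params)


url = build_sru_url(BASE_URL, 'dc.creator = "ip nancy"')
print(url)
```

Because the query is plain CQL carried in a URL, a federated search service could issue the same request to several repositories and merge the XML responses in its own interface.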

  12. DSpace at HKUST [cont.] • Size of the Repository [http://repository.ust.hk/dspace/dbstat.jsp] • Compiles in real time the number of items, collections and communities in the Repository • Top 20 Most Accessed Documents [http://repository.ust.hk/dspace/top20.jsp] • Compiled monthly from the Tomcat web access logs • Excludes access by most robots
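
One way such a monthly "top 20" list could be compiled is sketched below. This is not the HKUST script; it assumes the Tomcat access log is written in the combined log format (so the user agent is available), that document downloads live under a `/dspace/bitstream/` path, and that robots can be recognised by a few user-agent substrings. The sample log lines and paths are invented for illustration.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumed heuristics: robot user-agent substrings and the document path prefix.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")
DOCUMENT_PREFIX = "/dspace/bitstream/"


def top_documents(lines, n=20):
    """Count successful document requests, skipping likely robots."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a combined-format line
        if match.group("status") != "200":
            continue  # count only successful requests
        if not match.group("path").startswith(DOCUMENT_PREFIX):
            continue  # ignore pages that are not document downloads
        agent = match.group("agent").lower()
        if any(hint in agent for hint in ROBOT_HINTS):
            continue  # exclude access by (most) robots
        counts[match.group("path")] += 1
    return counts.most_common(n)


# Invented sample lines for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Feb/2006:10:00:00 +0800] "GET /dspace/bitstream/123/1/a.pdf HTTP/1.1" 200 1234 "-" "Mozilla/4.0"',
    '10.0.0.2 - - [01/Feb/2006:10:05:00 +0800] "GET /dspace/bitstream/123/1/a.pdf HTTP/1.1" 200 1234 "-" "Mozilla/5.0"',
    '10.0.0.3 - - [01/Feb/2006:10:06:00 +0800] "GET /dspace/bitstream/123/2/b.pdf HTTP/1.1" 200 999 "-" "Mozilla/4.0"',
    '10.0.0.4 - - [01/Feb/2006:10:07:00 +0800] "GET /dspace/bitstream/123/2/b.pdf HTTP/1.1" 200 999 "-" "Googlebot/2.1"',
    '10.0.0.5 - - [01/Feb/2006:10:08:00 +0800] "GET /dspace/index.jsp HTTP/1.1" 200 500 "-" "Mozilla/4.0"',
]

ranking = top_documents(SAMPLE_LOG)
print(ranking)
```

Substring matching on the user agent catches only robots that identify themselves, which is consistent with the slide's wording that "most" robots are excluded.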

  13. DSpace at HKUST [cont.] • OpenURL • All documents deposited in the HKUST IR must meet the open access criterion • Two solutions to link to non-open access documents were explored: • Direct linking to the documents as found in the library-subscribed databases • OpenURL for Link Resolver • OpenURL approach was adopted because: • More persistent than vendor-provided URLs • Independent of which databases are subscribed to locally

  14. DSpace at HKUST [cont.] • One disadvantage of the OpenURL approach – what if the in-house link resolver fails to find a target link? e.g. • Host of the document is not OpenURL capable • Database is not subscribed to by the library • Target is not profiled by the local link resolver • Developed a data entry interface to assist in the construction of OpenURLs • Demonstration: • Sample item with OpenURL • Staff interface for OpenURL construction
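
To make the idea of OpenURL construction concrete, here is a minimal sketch that builds an OpenURL 0.1-style link from citation fields. The resolver address, the ISSN and the other citation values are placeholders, and `build_openurl` is a hypothetical helper; it does not describe how the HKUST staff interface is implemented.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, **fields):
    """Build an OpenURL 0.1-style link from key/value citation fields.

    Empty fields are dropped so the resolver only sees known metadata.
    """
    params = {key: value for key, value in fields.items() if value}
    return resolver_base + "?" + urlencode(params)


# Placeholder resolver address and citation values (assumptions).
link = build_openurl(
    "http://resolver.example.edu/openurl",
    genre="article",
    issn="1234-5678",
    volume="12",
    issue="",  # unknown, so it is omitted from the link
    spage="345",
    date="2005",
)
print(link)
```

The link carries only a description of the article, not a vendor address, which is why it stays valid when the library changes the platform through which the article is supplied.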

  15. Click on this image to launch HKUST’s WebBridge link resolver to locate the published version. Document deposited in the Repository is a pre-published version

  16. Click on this link to retrieve the article hosted on Elsevier’s ScienceDirect platform

  17. Click on this link to view the full-text of this article

  18. Build OpenURL Edit Item View Item OpenURL constructed

  19. Check INNOPAC for bib record and then auto-insert the ISSNs to the form Click this link to test the OpenURL Click this button to create this OpenURL fragment

  20. Cross-Searching IRs • Cross-searching approaches • If the IR site is open for robot access, documents are very likely available in major search engines, such as Google and Yahoo. • Indexing services harvest IR metadata using the OAI-PMH protocol: • OAIster from University of Michigan [http://oaister.umdl.umich.edu/] • Scirus from Elsevier [http://www.scirus.com/] • HKIR – an experimental system by HKUST Library [http://lbapps.ust.hk/hkir/]

  21. Document indexed by Google

  22. Document indexed by Google Scholar

  23. Document indexed by OAIster

  24. Click this link to search HKUST IR on Scirus

  25. Draft Only Scirus search results page will look like this

  26. Cross-Searching IRs [cont.] • OAI-PMH • A protocol developed by the Open Archives Initiative for harvesting metadata from distributed repositories • Most IR software, including DSpace, is OAI-PMH capable • Indexing services such as OAIster are OAI data harvesters • IRs are OAI data providers

  27. OAI-PMH’s XML output in response to a “GetRecord” request. Metadata in Unqualified Dublin Core metadata schema (oai_dc). OAI-PMH “GetRecord” request by URL: http://repository.ust.hk/dspace-oai/request?verb=GetRecord& ... 1783.1/1805
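
The GetRecord exchange can be sketched as follows. The base URL is the one shown on the slide; the record identifier, the sample XML response and the helper names `get_record_url` and `parse_dc` are illustrative assumptions (the slide's own identifier is truncated and is not reproduced here). The namespaces are those defined by OAI-PMH 2.0 and unqualified Dublin Core.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_BASE = "http://repository.ust.hk/dspace-oai/request"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord request; all three arguments are required by OAI-PMH."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)


def parse_dc(xml_text):
    """Collect the unqualified Dublin Core fields of a GetRecord response."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root.findall(".//oai_dc:dc/*", NS):
        name = element.tag.split("}", 1)[1]  # strip the namespace URI
        fields.setdefault(name, []).append((element.text or "").strip())
    return fields


# Invented response for illustration; a harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record>
    <header><identifier>oai:repository.example.org:123/456</identifier></header>
    <metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>An Example Title</dc:title>
        <dc:creator>Author, Example</dc:creator>
        <dc:identifier>http://hdl.example.org/123/456</dc:identifier>
        <dc:identifier>Example Journal, v. 1, 2005, p. 1-10</dc:identifier>
      </oai_dc:dc>
    </metadata>
  </record></GetRecord>
</OAI-PMH>
""".strip()

request_url = get_record_url(OAI_BASE, "oai:repository.example.org:123/456")
record = parse_dc(SAMPLE_RESPONSE)
print(request_url)
print(record)
```

Note that the two `dc:identifier` values arrive without any qualifier, so the harvester cannot tell from the schema alone which one is a URI and which is a citation.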

  28. HKIR • HKIR - an experimental system developed by the HKUST Library to demonstrate the features of harvesting and cross-searching the scholarly and research output from the Hong Kong UGC funded institutions [http://lbapps.ust.hk/hkir/] • Powered by the DSpace software • Equipped with OCLC’s OAIHarvester2 software for harvesting OAI metadata from IRs

  29. HKIR [cont.] • Databases harvested (as of 22 Feb 2006): • CUHK SiR [70 records] • CityU Institutional Repository [425 records] • HKUST Electronic Theses [1,681 records] • HKUST Institutional Repository [2,126 records] • HKU Theses Online [13,583 records]

  30. Possible add-on to aid UGC’s research assessment exercise

  31. A sample HKIR record. Click on this link to go to the record in CUHK’s IR. This record was harvested from CUHK’s IR and it is in their Fine Arts collection

  32. A sample HKIR record showing fields labeled in qualified Dublin Core elements

  33. HKIR supports OpenURLs harvested from local IRs

  34. HKIR [cont.] • Standardization Issues • Author names standardization • Subject analysis • Free vocabulary versus thesaurus • Adopt same thesaurus among institutions? • Document types • Adopt same set of definitions among institutions? • Metadata schema • Adopt same metadata schema? • Use oai_dc schema for OAI harvesting?

  35. Author names standardization: Author name assigned by HKUST; Author name assigned by CityU

  36. Document types assigned to the same article are different

  37. HKIR [cont.] • Problem on loading harvested oai_dc metadata • oai_dc is the most popular metadata schema used by OAI data provider tools, e.g. • Virginia Tech’s VTOAI - used by HKUST and HKU in their Theses databases • OCLC’s OAICat - used by DSpace • oai_dc does not support qualified Dublin Core • The qualified DC fields stored in local DSpace have to be scaled down to simple DC when exporting records to OAI harvesters

  38. HKIR [cont.] • Mapping metadata back to qualified DC for loading to HKIR is challenging • Need to develop an HKIR version of the schema that takes qualified DC

  39. Metadata in oai_dc schema as received by the OAI harvester: dc:identifier.citation in local IR; dc:identifier.uri in local IR; dc:identifier.openurl in local IR
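
The loss of qualifiers described on slides 37 to 39 can be illustrated with a short sketch. Field names use DSpace's dotted form (for example `dc.identifier.uri`); the sample values, the key list and the guessing rules in `guess_identifier_qualifier` are assumptions made for illustration and are not the HKIR loader.

```python
def to_simple_dc(qualified):
    """Scale qualified DC down to simple DC by dropping the qualifier."""
    simple = {}
    for field, values in qualified.items():
        element = field.split(".")[1]  # "dc.identifier.uri" -> "identifier"
        simple.setdefault(element, []).extend(values)
    return simple


# Assumed hints that a URL is an OpenURL rather than a plain link.
OPENURL_KEYS = ("url_ver=", "genre=", "issn=", "rft.")


def guess_identifier_qualifier(value):
    """Heuristically guess which qualifier a harvested dc:identifier had."""
    if value.startswith(("http://", "https://")):
        query = value.partition("?")[2]
        if any(key in query for key in OPENURL_KEYS):
            return "openurl"
        return "uri"
    return "citation"  # anything that is not a URL is treated as a citation


# Invented record for illustration.
local_record = {
    "dc.title": ["An Example Title"],
    "dc.identifier.citation": ["Example Journal, v. 1, 2005, p. 1-10"],
    "dc.identifier.uri": ["http://hdl.example.org/123/456"],
    "dc.identifier.openurl": [
        "http://resolver.example.edu/openurl?genre=article&issn=1234-5678"
    ],
}

harvested = to_simple_dc(local_record)
guesses = [guess_identifier_qualifier(v) for v in harvested["identifier"]]
print(harvested["identifier"])
print(guesses)
```

Guessing of this kind is fragile, for instance when a citation happens to contain a URL, which is one reason a harvesting schema that carries qualified DC directly is the more robust option.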

  40. HKIR [cont.] • Document deposition and linking • Deposit all open access documents to the local IRs • If published version is in restricted access, then deposit the pre-published version and provide a link to the published version • Use OpenURL for linking as long as the document is in a database that can be reached via link resolvers • Otherwise, add the vendor-specific link to the metadata record

  41. HKIR [cont.] • Research Assessment Exercise (RAE) • Assess the quality of the research output of the academic staff • Assist in assessing the research fund allocation to the funded institutions • UGC is conducting RAE 2006 [http://www.ugc.edu.hk/eng/ugc/publication/prog/rae/rae.htm] • Each eligible academic staff submits a maximum of six publications • Assessed by subject panels

  42. HKIR [cont.] • High potential of utilizing the cross-institutional repository to assist academic staff to submit items and prepare reports • Go electronic – no longer need to collect submissions in printed format • IRRA (Institutional Repositories & Research Assessment) - a project that supports RAE through IRs, for the UK RAE in 2008 [http://irra.eprints.org/] • Developing software for EPrints and DSpace to facilitate RAE tasks • DSpace version to be available in summer 2006

  43. HKIR [cont.] • If we have a cross-institutional repository for Hong Kong IRs, then we may consider adding support for RAE to the system • Next round of UGC RAE is in 2011 or 2012

  44. Sample screen from an IR showing users selecting items for RAE submission [source: http://irra.eprints.org/software/bronze/eprints.html]

  45. Thank You!
