Maintaining the integrity of e-book titles in CityU library catalogue. 7th HKIUG, 12 Dec 2006, HKUST. Joanna Pong, Philip Wong, Run Run Shaw Library, City University of Hong Kong.


Maintaining the integrity of e-book titles in CityU library catalogue

7th HKIUG, 12 Dec 2006, HKUST

Joanna Pong, Philip Wong

Run Run Shaw Library

City University of Hong Kong

Table of Contents
  • Growth of e-books in CityU
  • Duplication problems
  • Attempted solutions
  • Effective Solutions
  • De-duplication jobs
  • Benefits and limitations

Maintaining the integrity of e-book titles in CityU library catalogue, 7th HKIUG, 2006

1. Growth of e-books in CityU
  • E-book collection contains English e-books, Chinese e-books & e-theses
  • From 2001: NetLibrary (around 200 titles)
  • To Oct 2006: > 200,000 titles

    • English e-books: > 87,000 titles
    • Chinese e-books: > 45,000 titles
    • e-theses: > 70,000 titles

1. Growth of e-books in CityU (cont’d)
  • Acquisition of e-books from 2001 onwards

1. Growth of e-books in CityU (cont’d)

Total > 200,000 titles

1. Growth of e-books in CityU (cont’d)
  • Major e-book collections

1. Growth of e-books in CityU (cont’d)
  • E-theses

1. Growth of e-books in CityU (cont’d)
  • Consortial acquisition of e-books
    • Digital Dissertation Consortium – since 2005
    • Apabi D-Lib Consortium – since 2006
    • NetLibrary Super E-book Consortium – since 2006
  • New consortia
    • Electronic Resources Academic Library Link (ERALL), a JULAC project on collective e-book collection development

1. Growth of e-books in CityU (cont’d)
  • Growth of e-book usage (from CGI logs)
    • Usage showed an upward trend

2. Duplication problems

The variety of e-book collections and the large number of titles created cataloguing problems.

A major problem: title duplication

  • Loading records supplied by different vendors results in title duplication
  • More e-book titles, more title duplication
    • same title from different collections
    • same title from same collection

2. Duplication problems (cont’d)
  • Duplication from different collections

2. Duplication problems (cont’d)
  • Duplication from the same collection
    • NetLibrary collection
      • Titles purchased by CityU since 2001
      • Titles acquired via Super-ebook Consortium


2. Duplication problems (cont’d)
  • Same title from NetLibrary acquired in different periods

2. Duplication problems (cont’d)
  • Duplication from the same collection (cont’d)
    • UMI e-theses
      • Titles purchased by CityU since 2002
      • Titles acquired via Digital Dissertation Consortium
      • Titles in ProQuest Database

2. Duplication problems (cont’d)

  • Same UMI e-thesis title acquired in different periods

3. Attempted solutions
  • Single record approach in cataloguing
    • We apply a single-record approach to all e-versions of the same title
    • Applied to e-books and e-journals

3. Attempted solutions (cont’d)
  • Duplication control in e-journals
    • CityU applied and modified BU’s program to merge e-journal titles from aggregator databases

3. Attempted solutions (cont’d)
  • Duplication control through manual methods
    • For e-books, our previous solutions were:
        • Manual checking
        • Headings reports – duplicate call numbers
        • Loading through match field 001 – identify duplicate records
        • Fixing duplicates as they were encountered
    • Workable only while the number of titles remains small

3. Attempted solutions (cont’d)
  • Duplication control through customized load profiles
    • The first attempt to automate the procedure
    • Utilized the local load profiles and translation table in INNOPAC to merge 2 sets of NetLibrary titles
      • Super E-book Consortium titles purchased in 2006
      • NetLibrary titles purchased since 2001
      • 2,206 titles were found duplicated

3. Attempted solutions (cont’d)
  • Duplication control through customized load profiles (cont’d)
    • Using load profiles is not a complete solution
      • Cannot match multiple tags (cannot match tag 020 against tag 024)
      • Cannot match selected sets (cannot exclude print titles)
      • Cannot merge multiple records automatically; must output for manual checking to decide the master record

4. Effective Solutions
  • Cataloguing worked with Systems to run de-duplication and merging of records
  • Requirements: the solution should
    • be easy to apply
    • fit into the existing workflow
    • be flexible enough to handle e-book batches of different sizes
    • allow prompt or ad hoc loading of records if necessary

4. Effective Solutions (cont’d)
  • Scope of de-duplication
    • Include English e-books and e-theses
      • e-books: 88,000 records
      • e-theses: 70,000 records

4. Effective Solutions (cont’d)
  • Scope of de-duplication (cont’d)
    • Exclude Chinese e-books because
      • CityU so far only has one Chinese e-book collection, Apabi.
      • Vendor supplied unique records when we joined the Apabi D-Lib consortium (no duplication with previously purchased titles)
      • We will also handle Chinese e-books if we acquire other Chinese e-book collections in the future

4. Effective Solutions (cont’d)
  • What fields to match?
    • E-books
      • Match ISBN – a relatively reliable tag
      • Match on major MARC tags – the INN-Reach 110 Match Key
    • UMI e-theses
      • Use UMI number for matching
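
As a rough illustration of matching on a single identifier, the sketch below groups records by a key and keeps only the groups that contain more than one record. The record shape (a dict with `id` and `umi` entries) and the sample values are made up for the example; they are not the format or data used in the CityU program.

```python
from collections import defaultdict

def duplicate_groups(records, key_of):
    """Group record ids by match key and keep only keys shared by 2+ records."""
    groups = defaultdict(list)
    for record in records:
        key = key_of(record)
        if key:  # records without a key cannot be matched
            groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Hypothetical e-thesis records; the UMI numbers are invented for illustration.
theses = [
    {"id": "b1", "umi": "AAT 0000001"},
    {"id": "b2", "umi": "AAT 0000002"},
    {"id": "b3", "umi": "AAT 0000001"},
    {"id": "b4", "umi": None},
]
print(duplicate_groups(theses, lambda r: r["umi"]))
```

The same function could be reused for e-books by passing a different `key_of`, for example one that returns a normalized ISBN.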

4. Effective Solutions (cont’d)
  • How to merge?
    • Set the one with the earliest Create Date as the master record
    • Add reproduction note (tag 533), name of book collection (tag 773) and URL link (tag 856) of the duplicate record(s) to the master record
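
The merge rule above can be sketched as follows. This is a minimal sketch, assuming each record is a dict holding a creation date and a mapping from MARC tag to a list of field values; that shape is hypothetical and is not INNOPAC's export format.

```python
from datetime import date

CARRIED_TAGS = ("533", "773", "856")  # reproduction note, collection name, URL

def merge_duplicates(records):
    """Merge duplicate records into the one with the earliest create date."""
    ordered = sorted(records, key=lambda r: r["created"])
    first = ordered[0]
    master = {
        "created": first["created"],
        "fields": {tag: list(values) for tag, values in first["fields"].items()},
    }
    for duplicate in ordered[1:]:
        for tag in CARRIED_TAGS:
            target = master["fields"].setdefault(tag, [])
            for value in duplicate["fields"].get(tag, []):
                if value not in target:  # avoid repeating identical fields
                    target.append(value)
    return master

# Made-up sample records for illustration only.
older = {"created": date(2001, 5, 1),
         "fields": {"245": ["Title"], "856": ["http://a.example/1"]}}
newer = {"created": date(2006, 3, 1),
         "fields": {"245": ["Title"], "773": ["Collection B"],
                    "856": ["http://b.example/9"]}}
merged = merge_duplicates([newer, older])
```

Only the three listed tags are copied from the duplicates; all other fields of the master record are left as they were.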

4. Effective Solutions (cont’d)
  • Matching algorithm of ISBN
    • Print ISBN vs. e-book ISBN
      • Some records come with print ISBN, some with e-book ISBN, some with both
      • Both types are used for matching
    • Different tags to store ISBN
      • 020 $a, $z
      • 024 (1st indicator 3) $a, $z
      • 776 $z
      • All the above are used for matching
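
A minimal sketch of collecting ISBN candidates from the tags listed above. It assumes each field is a `(tag, first_indicator, subfields)` tuple with subfields given as `(code, value)` pairs; this representation is an assumption made for the example, and every sample value except the 0415191327 ISBN and its 13-digit form is invented.

```python
def isbn_candidates(fields):
    """Collect raw ISBN strings from 020 $a/$z, 024 (1st indicator 3) $a/$z and 776 $z."""
    found = []
    for tag, ind1, subfields in fields:
        if tag == "020":
            codes = {"a", "z"}
        elif tag == "024" and ind1 == "3":
            codes = {"a", "z"}
        elif tag == "776":
            codes = {"z"}
        else:
            continue  # not a tag that carries an ISBN for matching
        found.extend(value for code, value in subfields if code in codes)
    return found

# Hypothetical record fields for illustration.
sample_fields = [
    ("020", " ", [("a", "0415191327"), ("z", "0415191335")]),
    ("024", "3", [("a", "9780415191326")]),
    ("024", "1", [("a", "012345678905")]),   # 1st indicator is not 3: ignored
    ("776", " ", [("z", "0415191343"), ("t", "Print version")]),
    ("245", "1", [("a", "Title")]),
]
print(isbn_candidates(sample_fields))
```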

4. Effective Solutions (cont’d)
  • Matching algorithm of ISBN (cont’d)
    • 13-digit ISBN vs. 10-digit ISBN
      • From 1 Jan 2007, ISBNs are 13 digits long
      • Some publishers already used 13-digit ISBNs before that date
      • From 12 Nov 2006, OCLC moves 13-digit ISBNs to tag 020
      • 13-digit ISBNs with the prefix “978” may have 10-digit equivalents; these are converted to 10 digits for matching
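
The 13-to-10-digit conversion follows the standard ISBN arithmetic: drop the “978” prefix and the old check digit, then recompute the ISBN-10 check digit modulo 11. The sketch below shows that calculation; the function name is illustrative, and it deliberately does not verify the incoming 13-digit check digit.

```python
def isbn13_to_isbn10(isbn13):
    """Return the ISBN-10 equivalent of a 978-prefixed ISBN-13, or None."""
    digits = isbn13.replace("-", "")
    if len(digits) != 13 or not digits.isdigit() or not digits.startswith("978"):
        return None  # e.g. 979-prefixed ISBNs have no 10-digit form
    core = digits[3:12]  # the nine digits shared by both forms
    # ISBN-10 check digit: weights 10 down to 2, result modulo 11.
    total = sum((10 - i) * int(d) for i, d in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))

print(isbn13_to_isbn10("9780415191326"))  # 0415191327
```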

4. Effective Solutions (cont’d)
  • Matching algorithm of ISBN (cont’d)
    • ISBN with “noise”
      • Some ISBNs include a note enclosed in parentheses
      • Do not use an ISBN for matching if the text inside the parentheses indicates that the ISBN is for a set, a series, a volume, etc.
        • e.g. “0415191327 (series : International library of psychology)”
      • Hint: look for the keywords “set” and “series”, and compare with tag 440 and tag 830
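
One possible way to screen out such ISBNs is sketched below. The keyword list, the regular expressions and the `series_titles` parameter (standing in for the titles found in tags 440 and 830) are assumptions for illustration, not the rules of the actual program.

```python
import re

_ISBN_WITH_NOTE = re.compile(r"^\s*([0-9Xx-]+)\s*(?:\((.*)\))?\s*$")
_SKIP_WORDS = re.compile(r"\b(set|series|volume|vol)\b")

def usable_isbn(raw, series_titles=()):
    """Return a cleaned ISBN, or None if its note marks a set/series/volume."""
    match = _ISBN_WITH_NOTE.match(raw)
    if not match:
        return None
    isbn = match.group(1).replace("-", "").upper()
    note = (match.group(2) or "").lower()
    if _SKIP_WORDS.search(note):
        return None
    # Also reject notes that repeat a series title from tag 440 or 830.
    if any(title and title.lower() in note for title in series_titles):
        return None
    return isbn

print(usable_isbn("0415191327 (series : International library of psychology)"))
print(usable_isbn("0415191327 (hbk.)"))
```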

4. Effective Solutions (cont’d)
  • Matching algorithm of the 110 Match Key
    • To guard against mismatches by ISBN, construct an additional match key based on the INN-Reach 110 Match Key:
      • Title + Gen. Media + Pub. Year + Pagination + Edition + Publisher + Type of Record + Title Part + Title Number
      • The key is constructed and then normalized
      • Refer to the INN-Reach documentation for details
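
A rough sketch of building and normalizing such a key is shown below. The element order follows the list above, but the dictionary field names and the normalization (lower-casing, removing diacritics and punctuation) are assumptions; the real INN-Reach normalization rules are in its documentation and are not reproduced here.

```python
import re
import unicodedata

# Element order taken from the slide; the dictionary keys are hypothetical.
KEY_ELEMENTS = ("title", "gen_media", "pub_year", "pagination", "edition",
                "publisher", "record_type", "title_part", "title_number")

def normalize(text):
    """Lower-case, strip diacritics and collapse punctuation (assumed rules)."""
    text = unicodedata.normalize("NFKD", text or "").lower()
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text).strip()

def match_key(record):
    """Join the normalized elements into one comparable string."""
    return "|".join(normalize(record.get(name, "")) for name in KEY_ELEMENTS)

# Made-up records that differ only in punctuation and case.
rec_a = {"title": "Introduction to Cataloguing /", "pub_year": "2005.",
         "publisher": "Example Press,", "edition": "2nd ed."}
rec_b = {"title": "introduction to cataloguing", "pub_year": "2005",
         "publisher": "EXAMPLE PRESS", "edition": "2nd ed"}
print(match_key(rec_a) == match_key(rec_b))
```

This normalization keeps only Latin letters and digits, so it would not suit Chinese-language records.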

5. De-duplication jobs
  • Initial clean-up
  • Regular de-duplication

5. De-duplication jobs (cont’d)
  • Initial clean-up
    • One-time job to de-duplicate records that had already been loaded
    • 6,063 duplicate records (7.2%) were found among 84,756 English e-book titles
    • The program was fine-tuned after the initial clean-up

5. De-duplication jobs (cont’d)
  • Regular de-duplication
    • Once every month
    • Flexibility
      • Depends on the number of titles loaded and the urgency of loading the records
      • Clean-up before loading vs. clean-up after loading

5. De-duplication jobs (cont’d)
  • Regular de-duplication (cont’d)
    • Procedures
      • Output e-book records from catalogue
      • Run de-duplication program to match with vendor records
      • Overlay records in catalogue with merged records
      • If the vendor records have already been loaded: delete the duplicate vendor records from the catalogue
      • Otherwise: insert the new vendor records into the catalogue
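
The branching in this procedure can be sketched as a small planning function. It assumes the existing catalogue records (excluding the vendor batch itself) are indexed by match key and that each vendor record already has a key; the names and record ids are hypothetical.

```python
def plan_load(catalogue_keys, vendor_records, already_loaded):
    """Decide which records to overlay, delete or insert.

    catalogue_keys: dict of match key -> id of the existing catalogue record
    vendor_records: dict of vendor record id -> match key
    already_loaded: True if the vendor batch is already in the catalogue
    """
    plan = {"overlay": [], "delete": [], "insert": []}
    for vendor_id, key in vendor_records.items():
        master_id = catalogue_keys.get(key)
        if master_id is not None:
            # Duplicate: the existing record is overlaid with the merged record.
            if master_id not in plan["overlay"]:
                plan["overlay"].append(master_id)
            if already_loaded:
                plan["delete"].append(vendor_id)
        elif not already_loaded:
            # New title that is not yet in the catalogue.
            plan["insert"].append(vendor_id)
    return plan

catalogue = {"key-1": "b100", "key-2": "b200"}
vendor = {"v1": "key-1", "v2": "key-9"}
print(plan_load(catalogue, vendor, already_loaded=False))
print(plan_load(catalogue, vendor, already_loaded=True))
```

Returning a plan rather than changing records directly would let staff review the batch before anything is overlaid or deleted.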


5. De-duplication jobs (cont’d)
  • Flow chart (figure not reproduced): vendor records and INNOPAC records feed into Match & Merge; merged master records are overlaid, duplicated records are deleted, and new records are inserted into INNOPAC

5. De-duplication jobs (cont’d)
  • De-duplication results
    • Initial clean-up of e-books

5. De-duplication jobs (cont’d)
  • De-duplication results
    • Initial clean-up of e-books (cont’d)

Distribution of titles merged from 2 records

5. De-duplication jobs (cont’d)
  • De-duplication results
    • Initial clean-up of e-books (cont’d)
      • Among titles duplicated within the same collection, some records direct users to different e-books; this problem is more serious in ebrary.
      • The program was fine-tuned with an added condition:
        • When two matched records have the same CGI script (i.e. belong to the same collection) but different book IDs, do not merge them; flag them for review
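
The added condition might look like the sketch below. It assumes the 856 URL consists of a CGI script path plus a query parameter carrying the book ID; the parameter name `bookid` and the example URLs are invented, since vendor URL formats differ.

```python
from urllib.parse import urlsplit, parse_qs

def merge_decision(url_a, url_b, id_param="bookid"):
    """Return "review" for same script but different book IDs, else "merge"."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    same_script = (a.netloc.lower(), a.path) == (b.netloc.lower(), b.path)
    id_a = parse_qs(a.query).get(id_param)
    id_b = parse_qs(b.query).get(id_param)
    if same_script and id_a != id_b:
        return "review"  # same collection, possibly different e-books
    return "merge"

# Hypothetical URLs for two records that matched on ISBN or match key.
print(merge_decision("http://example.org/cgi-bin/reader?bookid=1001",
                     "http://example.org/cgi-bin/reader?bookid=1002"))
```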

5. De-duplication jobs (cont’d)
  • De-duplication results (cont’d)
    • Initial clean-up of e-theses

5. De-duplication jobs (cont’d)
  • De-duplication results
    • Initial clean-up of e-theses (cont’d)

Distribution of titles merged from 2 records

(DDC = Digital Dissertation Consortium)

More than 4,000 DDC & ProQuest records had already been de-duplicated by a manual process (using the 001 field) before the initial clean-up.

6. Benefits and limitations
  • Benefits
    • A single record for all versions of the same e-book or e-thesis title maintains integrity in the library catalogue
    • Saves much staff time and manual effort
    • The method is applicable to other e-resources
    • Meets a management need: generating duplication statistics
    • Can be used to match existing e-book collections against e-book titles offered by potential vendors, supporting e-book collection development

6. Benefits and limitations (cont’d)
  • Limitations
    • Depends on data in vendor-supplied records
      • Incorrect match and merge in case of incorrect or incomplete data
      • Chinese e-book records
        • Brief bibliographic data
        • Lack of standardization in transcription
        • Difficult to construct reliable match-key
        • Sometimes lack of ISBNs

Maintaining the integrity of e-book titles in CityU library catalogue

Thank You!

Joanna Pong

E-mail: lbjoanpg@cityu.edu.hk

Philip Wong

E-mail: lbphilip@cityu.edu.hk
