Z-Books: Hunting Down Zombie Ebooks Hiding in your Catalog

Kathryn Lybarger (@zemkat) • OVGTSL 2013 (#ovgtsl2013) • May 17, 2013

Presentation Transcript


  1. Z-Books: Hunting Down Zombie Ebooks Hiding in your Catalog • Kathryn Lybarger • @zemkat • OVGTSL 2013 • #ovgtsl2013 • May 17, 2013

  2. Cataloging ebooks

  3. Success!

  4. Except sometimes…

  5. Or even worse…

  6. Zombies?

  7. These ebooks look normal

  8. Until someone looks too closely • “Requires a subscription” • “Please login” • “Purchase for $30” • “Page not found” error • “Currently unavailable”

  9. Then the screaming starts

  10. Nobody wants that!

  11. Not just dead? • Dead links not so bad … if they are not in the catalog • Our patrons hate LOST books in the catalog • Zombies are more disappointing

  12. Strategy: • Make sure zombies don’t get into the catalog in the first place • Watch for news of recently turned • Hunt down the ones that are already in there

  13. URLs may be bad initially • May be a typo • Book not actually on the vendor site yet • Record may have NO URL

  14. Bad DOI • Not registered yet • Registered incorrectly • Maybe points TWO places!

  15. URLs may be modified • May contain proxy prefix • May be institution specific • May have session information
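
A minimal sketch of how a modified URL might be cleaned before it goes into the catalog. The proxy prefix and the session parameter names below are assumptions made up for illustration; substitute whatever your own proxy and vendors actually use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values: replace with your institution's proxy prefix and
# the session-style parameter names your vendors really add.
PROXY_PREFIX = "https://proxy.example.edu/login?url="
SESSION_PARAMS = {"sessionid", "sid", "jsessionid"}


def clean_url(url: str) -> str:
    """Return the URL without the proxy prefix or session parameters."""
    if url.startswith(PROXY_PREFIX):
        url = url[len(PROXY_PREFIX):]
    parts = urlsplit(url)
    # Keep every query parameter except the session-style ones.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )
```

Institution-specific pieces of a URL (a customer ID in the path, for example) vary too much by vendor to handle generically, so this sketch leaves them alone.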

  16. Provider neutral records • Old standard: one record per provider. To catalog: use that record • New standard: all e-versions on one record. To catalog: use that record, then delete all URLs that don’t apply

  17. Ebook links in print books • Some print book records have URLs • 856 42 “Related Resource” • May sneak in through fast copy or batch cataloging

  18. Spot some bad URLs • Query the catalog for distinct hosts • In Voyager: SELECT DISTINCT ELINK_INDEX.URL_HOST FROM ELINK_INDEX WHERE ELINK_INDEX.RECORD_TYPE = 'B';
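
If the URLs are available as a plain list (exported from any system, not only Voyager), the same idea can be sketched in Python. Counting each host is an addition to the slide's query, made on the assumption that a host appearing only once or twice is more likely to be a typo; the function name is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_counts(urls):
    """Count how many URLs point at each host, ignoring unparseable entries."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url.strip()).hostname  # lowercased; None if no host
        if host:
            counts[host] += 1
    return counts
```

Sorting the result by count, smallest first, puts the most suspicious hosts at the top of the list.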

  19. Catch them before they come in • Verify one by one • Do they have notes indicating they’re bad? • Run list through a link checker

  20. Just keep new ones out? • Not sufficient • Good links may die • Nobody may tell you

  21. Vendor announcements • E-mail, RSS feeds • Often interspersed with ads or news • Do not always mention deletions

  22. Vendor data for deletions • Some vendors release “deleted” lists • You may have to check the web site • Even dig for them

  23. Current status data only • Some vendors will provide a list of what they currently have • Changes not highlighted • Download periodically
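
When a vendor only publishes its current holdings, the changes have to be worked out by comparing two downloads. A minimal sketch, assuming each snapshot is a text file with one title (or identifier) per line; the function name is hypothetical.

```python
def compare_snapshots(previous_lines, current_lines):
    """Return (removed, added) entries between two one-per-line snapshots."""
    previous = {line.strip() for line in previous_lines if line.strip()}
    current = {line.strip() for line in current_lines if line.strip()}
    removed = sorted(previous - current)  # candidates for zombies
    added = sorted(current - previous)    # new titles to consider cataloging
    return removed, added
```

Both arguments can be open file objects, for example last month's and this month's download. Anything in `removed` is worth checking against the catalog.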

  24. Useful tool: vimdiff • Free and open source (charityware) • Available on Unix, Mac • Available on Windows (Cygwin)

  25. Vimdiff in action

  26. Some vendor data is less accessible • Examples: • MARC blob • “Whatever’s on the web site” • Watch for announcements? • Download / overlay periodically?

  27. Convert data to text • MARC -> .mrk text (MarcEdit) • Web site: find A-Z title list page, then download / extract list • Compare text (vimdiff)

  28. How to extract? • Different per web site • Script (gather) • Download A-Z page • Find lines with book titles • Delete everything but the title • Compare to last month’s copy
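
The extraction steps above can be sketched in Python as an alternative to grep and sed. The marker `class="book-title"` is purely an assumption for illustration; every site marks up its title list differently, so the pattern has to be adapted per vendor. The page itself would first be downloaded (with curl, for instance) and passed in as text.

```python
import html
import re

# Hypothetical marker: adjust to whatever identifies title lines on the
# vendor's A-Z page.
TITLE_LINE = re.compile(r'class="book-title"')
TAG = re.compile(r"<[^>]+>")


def extract_titles(page_text):
    """Keep only lines that hold a book title, stripped down to the title."""
    titles = []
    for line in page_text.splitlines():
        if TITLE_LINE.search(line):
            # Delete the markup, decode entities such as &amp;, trim spaces.
            title = html.unescape(TAG.sub("", line)).strip()
            if title:
                titles.append(title)
    return titles
```

Writing the result to a dated text file each month gives a copy to compare against the previous one.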

  29. Unix tools • vim / vimdiff – editor • curl – download web pages • grep – search file contents • sed – reformat files • Available in Windows through Cygwin

  30. Hunting in the catalog • Necessary maintenance • Links can go bad • (Sometimes whole platforms!)

  31. Link checking • Many link checkers available • They check for codes: • Good? • Forbidden? • Not Found?
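
A minimal sketch of what a code-based link checker does, using only the Python standard library. The category names are taken from the slide; the function names and the "other" and "unreachable" categories are assumptions added so that every outcome has a label.

```python
import urllib.error
import urllib.request


def classify_status(code):
    """Map an HTTP status code onto the slide's three categories."""
    if 200 <= code < 300:
        return "good"
    if code == 403:
        return "forbidden"
    if code == 404:
        return "not found"
    return "other"


def check_link(url, timeout=10.0):
    """Request the URL and classify the response (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        return classify_status(err.code)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return "unreachable"
```

Some servers reject HEAD requests, so a real checker may need to fall back to GET for those hosts.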

  32. Codes aren’t everything • A table of contents is a good page • A bad DOI can be fixed • Effective method differs by vendor

  33. Humans are better at this • Instructions might be complicated: • Go to the web page • Open up one of the chapters • Make sure it is a PDF, not an order form

  34. Normac • MARC Normalizer and Access Checker • Free, open source software • Available from GitHub

  35. Normalize MARC • Only include URLs for the vendor you want • Delete URLs with a proxy prefix
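
The two normalization rules can be illustrated on a plain list of URLs. This is a sketch under stated assumptions, not Normac's actual implementation; the function name and parameters are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_urls(urls, vendor_host, proxy_prefix):
    """Keep only un-proxied URLs that point at the chosen vendor."""
    kept = []
    for url in urls:
        if url.startswith(proxy_prefix):
            continue  # rule 2: delete URLs with a proxy prefix
        host = urlsplit(url).hostname or ""
        # rule 1: keep the vendor's host and its subdomains only
        if host == vendor_host or host.endswith("." + vendor_host):
            kept.append(url)
    return kept
```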

  36. Access Check • Zombies look different on each site – specify • Load in MARC or list of URLs • Check access according to rules
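
Because a zombie page often returns a perfectly good status code, the check has to look at the page content. A minimal sketch of per-site rules, assuming the page text has already been downloaded; the host name and phrase lists are hypothetical examples based on the messages shown earlier in the talk, and this is not a description of how Normac itself works.

```python
# Hypothetical per-site rules: phrases that signal "no access" on that site.
RULES = {
    "vendor.example.com": [
        "requires a subscription",
        "please login",
        "purchase for",
    ],
}
# Phrases checked on every site.
DEFAULT_PHRASES = ["page not found", "currently unavailable"]


def zombie_signs(host, page_text):
    """Return the warning phrases found in the page (empty list if none)."""
    text = page_text.lower()
    phrases = RULES.get(host, []) + DEFAULT_PHRASES
    return [phrase for phrase in phrases if phrase in text]
```

An empty result only means no known warning phrase was seen; it does not prove the full text is really accessible.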

  37. Is it really a zombie? • Or does it just look that way to you? • Maybe your subscription changed?

  38. If you’re sure… • (Remove them from your catalog) • Contact the vendor • Modify WorldCat master record

  39. Dead links in WorldCat • Leave them in! • Make 856 second indicator blank • $z This electronic address not available when searched on [Date]
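
The 856 change can be sketched on a single line of MarcEdit mnemonic (.mrk) text, where a blank indicator is written as a backslash. The function name is hypothetical, the placement of $z at the end of the field is an assumption, and the date text is supplied by the caller.

```python
import re

# Matches a .mrk line such as: =856  40$uhttp://vendor.example.com/book/1
FIELD_856 = re.compile(r"^=856  (.)(.)(.*)$")


def mark_dead_link(line, date_text):
    """Blank the second indicator of an 856 line and append the $z note."""
    match = FIELD_856.match(line)
    if not match:
        return line  # not an 856 field: leave it untouched
    ind1, _ind2, subfields = match.groups()
    note = f"$zThis electronic address not available when searched on {date_text}"
    return f"=856  {ind1}\\{subfields}{note}"
```

The caller decides which 856 line is the dead one; the function does not check the URL itself and expects the line without its trailing newline.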

  40. Then what? • OCLC WorldShare Metadata Collection Manager? • Separate database of dead links?

  41. Any questions?

  42. Contact Me • Kathryn Lybarger • @zemkat • Kathryn.Lybarger@uky.edu • Problem Cataloger: http://pc.blog.zemows.org/ • GitHub: http://github.com/zemkat
