Data Mining Library Collection Silos: Print Books and E-books in Library Collections


Lynn Silipigni Connaway, Ed O’Neill, Chandra Prabha, Brian Lavoie

Collection Assessment
  • Why assess collections?
    • Provide data for member libraries for decision-making
      • Description of the collection
        • Identify specific subject areas
          • Determine collection age
          • Rate of growth
          • Strengths and weaknesses
      • Overlap/gap analysis
      • Identify last copy
      • Useful information
        • Outside funding
        • Library collection comparisons
        • Remote storage decisions
        • Collection development and management
        • Identify role of non-ARL libraries
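
The overlap/gap and last-copy ideas above can be sketched with plain set operations. This is only an illustration, not the presenters' method: the `holdings` mapping (title identifier to the set of holding-library symbols), the function name, and the library symbols are all hypothetical.

```python
def compare_holdings(holdings, library, peer):
    """Compare one library's titles with a peer's.

    holdings: dict mapping a title id to the set of library symbols
    that hold it (a hypothetical, simplified stand-in for holdings data).
    """
    mine = {title for title, libs in holdings.items() if library in libs}
    theirs = {title for title, libs in holdings.items() if peer in libs}
    return {
        # Titles held by both libraries.
        "overlap": mine & theirs,
        # Titles the peer holds that this library lacks.
        "gap": theirs - mine,
        # Titles for which this library is the only holder.
        "last_copy": {title for title in mine if len(holdings[title]) == 1},
    }


# Example with made-up identifiers.
sample = {
    "t1": {"A", "B"},
    "t2": {"A"},
    "t3": {"B"},
    "t4": {"B", "C"},
}
result = compare_holdings(sample, "A", "B")
```

In this toy data, library A shares one title with B, lacks two titles that B holds, and is the sole holder of one title.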
WorldCat as a Collection
  • World’s largest bibliographic database
    • July 1, 2003 = 50 million+ records
    • 1 billion holdings
  • Ideal source for data-mining
  • Characteristics of WorldCat
    • Age
    • Subject, using NATC
    • Holdings by type of library
      • ARL
      • Academic, non-ARL
      • Public
      • School
      • Special
WorldCat as a Collection
  • Use of MARC data elements in WorldCat
    • Types of materials
    • Library holdings to determine audience levels
  • Collection assessment and collection use
    • Unique titles
    • Analyze and compare aggregate holdings for libraries
    • Identify print books (p-books) and electronic books (e-books)
Study Objective
  • Digital materials constitute increasing proportion of library collections
  • Effective strategies for integrating print and digital materials within a library collection
    • Eliminate redundancies
    • Meet user expectations
  • Data-mining increasingly important to support collection management decisions
    • WorldCat
      • World’s largest bibliographic database
      • Ideal as source for data-mining
  • Data-mine WorldCat in order to examine characteristics of p-books and e-books
Rationale
  • Collection management
    • Development
    • Cooperation
    • Deselection
    • Preservation
  • Space allocation and management
  • Meet user expectations
  • Services for off-site users
  • Migration from print to digital
  • Convenient access
    • 24/7 access
    • Desk-top delivery
Scope
  • WorldCat
    • July 1, 2003 = 50 million+ records
    • 1 billion holdings
  • Digital Items
  • Books
    • Print (p-book)
    • Digital (e-book)
Strategy
  • Identify digital items
  • Identify digital items with at least one other manifestation in WorldCat
    • FRBRize database
      • Work
        • Distinct intellectual or artistic expression
        • Cluster works in WorldCat
      • Manifestation
        • Physical embodiment of a work
  • Identify digital items with p-book equivalents
    • Assumption
      • If digital items have p-book equivalents, then digital items are e-books
    • Identify publishers and publication dates
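
The strategy above can be sketched as a grouping step followed by a filter. This is a minimal illustration under stated assumptions, not the study's actual pipeline: each record is a plain dict whose `work_id`, `is_p_book`, and `is_digital` keys are hypothetical and assumed to have been computed beforehand (for example by a work-clustering step and by the working definitions of p-book and digital item).

```python
from collections import defaultdict


def ebooks_with_print_equivalents(records):
    """Return ids of digital items that share a work with a p-book.

    Each record is a dict with hypothetical keys:
    'id', 'work_id', 'is_p_book', 'is_digital'.
    """
    # Cluster manifestations (records) under their work.
    by_work = defaultdict(list)
    for rec in records:
        by_work[rec["work_id"]].append(rec)

    ebooks = []
    for manifestations in by_work.values():
        # Keep digital items only when the same work also has a p-book.
        if any(m["is_p_book"] for m in manifestations):
            ebooks.extend(m["id"] for m in manifestations if m["is_digital"])
    return ebooks


# Example with made-up records.
records = [
    {"id": "r1", "work_id": "w1", "is_p_book": True, "is_digital": False},
    {"id": "r2", "work_id": "w1", "is_p_book": False, "is_digital": True},
    {"id": "r3", "work_id": "w2", "is_p_book": False, "is_digital": True},
]
found = ebooks_with_print_equivalents(records)
```

Here `r2` qualifies because work `w1` also has a print manifestation, while `r3` is a digital item with no p-book equivalent and is left out.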
Need to Determine
  • Comparison of p-books and e-books
    • What is a book?
    • What is a p-book?
    • What is an e-book?
    • What is a digital item?
    • How do we extend p-book criteria to digital world?
What is a Digital Item?
  • Working definition of digital item
    • Computer file
    • OR Electronic resource
    • OR Appropriate 856 field
      • Indicates electronic location or access
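
As a rough sketch, the three-way OR in this working definition can be written as a predicate. The record is a plain dict with hypothetical keys, and the mapping to MARC coding is an assumption: type of record `m` for computer file, form of item `s` for electronic, and any populated 856 field counted as "appropriate" (the slides do not say which 856 fields were accepted).

```python
def is_digital_item(record):
    """Return True if a record meets any criterion of the working definition.

    record: dict with hypothetical keys 'type', 'form_of_item', 'field_856'.
    """
    # Assumed coding: MARC type of record 'm' = computer file.
    is_computer_file = record.get("type") == "m"
    # Assumed coding: MARC form of item 's' = electronic.
    is_electronic_resource = record.get("form_of_item") == "s"
    # Simplification: any non-empty 856 (electronic location and access) counts.
    has_856 = bool(record.get("field_856"))
    return is_computer_file or is_electronic_resource or has_856


# Example with made-up records.
print_only = {"type": "a", "form_of_item": " "}
print_with_link = {"type": "a", "form_of_item": " ",
                   "field_856": ["http://example.org/item"]}
```

Note that under this definition a print record carrying an 856 link is treated as a digital item, which is one reason the later categorization of digital items is difficult.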
What is a P-book?
  • No consensus for definition of a book
    • Text (type = a) and monograph (bib level = m)
      • Broadsides?
      • Pamphlets?
      • Government documents?
      • Children’s books?
      • Microforms?
    • Authoritative Definitions
      • UNESCO
          • Nonperiodical literary publication consisting of > 49 pages, covers excluded
      • ANSI
          • Publications consisting of > 49 pages
          • Hard covers
      • US Postal Service (publication)
          • Publications > 24 pages
A P-book IS:
  • Based on UNESCO definition
  • Working definition of a p-book
    • Printed on paper (excludes microform)
    • Language material
    • Monograph
    • Physical description
    • Form of item = regular or large print
    • Title does not include a GMD
    • Substantial length (> 49 pages; > 25 to include juvenile titles)
    • Excludes manuscripts (dissertations and theses)
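
The working definition above reads naturally as a conjunction of tests. The sketch below is illustrative only: the dict keys, the `is_thesis` flag, and the reading of "> 25 to include juvenile titles" as a lower page threshold are assumptions, as is the mapping to MARC codes (type `a` for language material, bibliographic level `m` for monograph, form of item blank or `d` for regular or large print).

```python
MIN_PAGES = 49           # "> 49 pages"
MIN_PAGES_JUVENILE = 25  # "> 25 to include juvenile titles"


def is_p_book(record, include_juvenile=False):
    """Return True if a record meets every criterion of the working definition.

    record: dict with hypothetical keys 'type', 'bib_level', 'form_of_item',
    'gmd', 'pages', 'is_thesis'.
    """
    threshold = MIN_PAGES_JUVENILE if include_juvenile else MIN_PAGES
    pages = record.get("pages")  # taken from the physical description
    return (
        record.get("type") == "a"                     # language material
        and record.get("bib_level") == "m"            # monograph
        and record.get("form_of_item") in (" ", "d")  # regular or large print
        and not record.get("gmd")                     # title has no GMD
        and pages is not None                         # physical description present
        and pages > threshold                         # substantial length
        and not record.get("is_thesis", False)        # excludes dissertations/theses
    )


# Example with made-up records.
novel = {"type": "a", "bib_level": "m", "form_of_item": " ", "pages": 320}
picture_book = dict(novel, pages=32)
microfilm = dict(novel, form_of_item="a")
```

With these samples, the 320-page record passes, the 32-page record passes only when `include_juvenile=True`, and the microform record is excluded.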
What is an E-book?
  • Difficult to define e-book
    • Digital version of p-book (straightforward)
    • New conceptual views of a book in digital environment
  • Assumption
    • P-book is well-defined
    • If digital item has manifestation as a p-book, then digital item must also be a book
    • Consider only e-books and p-books that have an equivalent in the other format; ignore e-books that have no print equivalent
An E-book IS:
  • E-Book = Electronic (Digital) + Book
  • Definition of e-Book:
    • Digital equivalents of p-books
    • New conceptual definitions of books in digital environment
WorldCat Record Analysis
  • P-book records = 24,048,235 (48% of WC)
  • Digital item records = 795,630 (about 1.6% of WC)
    • Web sites
      • Collections of interlinked, Web-accessible materials residing at a single location on the Internet
    • Documents
      • Various forms of electronic documents
      • E-books with no p-book equivalents and no minimum page requirements
        • Book chapters
        • Broadsides
        • Brochures
        • Pamphlets
    • Reprints
      • E-books with p-book equivalents = 76,375 (about 0.15% of WC)
WorldCat Record Analysis
  • Digital item records (continued)
    • Interactive learning objects
      • Computer programs offering self-contained, interactive tutorial or educational experience
    • Software
      • Computer programs for creating and manipulating information
    • Serials
      • Journals
      • Proceedings
    • Images
    • Theses
    • Other (2 records)
      • Computer game
      • Raw data file
Publishers of Digital Items With P-Book Equivalents in WorldCat
  • Approximately 15,000 unique publishers
  • Approximately 150 publishers with > 25 records
  • Top 10 publishers
    • Institute of Electrical and Electronics Engineers (IEEE)
    • National Bureau of Economic Research
    • US Government Printing Office
    • Springer
    • Inter-University Consortium for Political and Social Research
    • PowerKids Press
    • University of Virginia Library
    • MIT Press
    • Microsoft
    • Broderbund Software and Books
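
Counts like those above (unique publishers, publishers with more than 25 records, top 10) could be tallied along the following lines. This is a sketch with assumptions: the input is a hypothetical list of publisher strings, one per record, and the lowercase-and-strip normalization is a deliberately crude stand-in for the name reconciliation that real catalog data would need.

```python
from collections import Counter


def publisher_summary(publisher_names, min_records=25, top_n=10):
    """Summarize publisher frequency across records.

    publisher_names: iterable of publisher strings, one per record
    (hypothetical input; real data would need stronger normalization).
    """
    # Crude normalization so trivially different spellings are merged.
    counts = Counter(name.strip().lower() for name in publisher_names)
    return {
        "unique": len(counts),
        # Publishers with more than `min_records` records.
        "above_threshold": sum(1 for n in counts.values() if n > min_records),
        # Most frequent publishers, highest count first.
        "top": [name for name, _ in counts.most_common(top_n)],
    }


# Example with made-up record counts.
names = ["MIT Press"] * 30 + ["Springer"] * 26 + ["Microsoft"] * 3 + [" mit press "]
summary = publisher_summary(names)
```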
Discussion of Analysis
  • Small number of
    • E-books with p-book equivalents
    • Publishers with > 25 records for e-books with p-book equivalents
  • Recent publication dates for e-books with p-book equivalents
  • More Web sites than documents or reprints
  • Difficult to identify and categorize digital items
    • Inconsistent cataloging policies and practices for digital items
    • Inconsistent definitions for types of digital items
Future Research
  • Establish accepted criteria for defining an e-book independent of p-books
  • Identify and compare type of library holdings and NATC subjects for p-books and e-books
    • Identify electronic collection silos
  • Continue to collect these data to compare for trends
  • Identify types of content/materials that are better suited for either print or digital environment
Questions and Discussion

connawal@oclc.org

oneill@oclc.org