CS 502: Computing Methods for Digital Libraries


Lecture 9

Conversion to Digital Formats

Anne Kenney, Cornell University Library

What are Digital Images?
  • Electronic snapshots taken of a scene or scanned from documents
  • sampled and mapped as a grid of dots or picture elements (pixels)
  • each pixel is assigned a tonal value (black, white, grays, colors), represented in binary code
  • the binary code is stored or reduced (compressed)
  • the code is read and interpreted to create an analog version
Four Scanning Methods

[Figure: the four scanning methods]

Digital Image Quality is Governed By:
  • resolution and threshold
  • bit depth
  • image enhancement
  • color management
  • compression
  • system performance
  • operator judgment and care
Resolution
  • determined by number of pixels used to represent the image
  • expressed in dots per inch (dpi)--actually dots/sq. inch
  • increasing resolution increases level of detail captured and geometrically increases file size (with the square of the dpi)
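
The geometric growth in file size with resolution can be checked with simple arithmetic. The sketch below is illustrative only: the function names are made up for this example, the 8 x 10 inch page and 8-bit depth are assumed inputs, and file-format headers and row padding are ignored.

```python
def pixel_count(width_in, height_in, dpi):
    """Total pixels for a page of the given size (in inches) scanned at `dpi`."""
    return round(width_in * dpi) * round(height_in * dpi)


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size in bytes (no headers, no row padding)."""
    return pixel_count(width_in, height_in, dpi) * bits_per_pixel // 8


# Assumed example: an 8 x 10 inch page captured as 8-bit grayscale.
for dpi in (200, 300, 600):
    size = uncompressed_bytes(8, 10, dpi, 8)
    print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
```

Tripling the resolution from 200 to 600 dpi multiplies the pixel count, and so the uncompressed size, by nine.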
Effects of Resolution

[Figure: sample scans at 600 dpi, 300 dpi, and 200 dpi]

Threshold Setting in Bitonal Scanning

defines the point on a scale from 0 to 255 at which gray values will be interpreted either as black or white
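
A minimal sketch of thresholding, assuming the common convention that gray values below the threshold become black (0) and all others become white (1). The slides do not say which side of the threshold maps to which tone, and `to_bitonal` is a name invented for this example.

```python
def to_bitonal(gray_rows, threshold):
    """Convert rows of 8-bit gray values (0-255) to 1-bit values.

    Assumed convention: value < threshold -> 0 (black), otherwise 1 (white).
    """
    return [[0 if value < threshold else 1 for value in row] for row in gray_rows]


# One row of sample gray values, converted at the two thresholds shown in the slides.
sample = [[10, 59, 60, 99, 100, 200]]
at_60 = to_bitonal(sample, 60)
at_100 = to_bitonal(sample, 100)
print(at_60)
print(at_100)
```

Raising the threshold from 60 to 100 turns the mid-dark grays (60 and 99 here) from white to black, which is why the setting changes how thick or broken scanned strokes appear.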

Effects of Threshold

[Figure: bitonal scans at threshold = 60 and threshold = 100]

Bit Depth
  • number of bits used to represent each pixel, typically 8 bits or more per channel
  • representing 256 (2^8) levels for grayscale and 16.7 million (2^24) levels for color
  • example: 8-bit grayscale pixel

00000000 = black

11111111 = white
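
The level counts follow directly from the number of bits. A short sketch (the function name is illustrative):

```python
def tonal_levels(bits_per_pixel):
    """Number of distinct values a pixel of the given bit depth can take."""
    return 2 ** bits_per_pixel


for bits in (1, 8, 24):
    print(f"{bits}-bit: {tonal_levels(bits):,} levels")

# The two extremes of an 8-bit grayscale pixel, written as binary codes.
black = format(0, "08b")
white = format(255, "08b")
print(black, white)
```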

Bit Depth
  • increasing bit depth increases the level of gray or color information that can be represented and arithmetically increases file size
  • affects resolution requirements
Image Enhancement
  • can be used to improve image capture
  • use raises concerns about fidelity and authenticity

Effects of Filters

[Figure: scans with no filters used and with maximum enhancement]

Compression
  • reduces file size for processing, storage, transmission, and display
  • image quality may be affected by the compression techniques used and the level of compression applied
Compression Variables
  • lossless versus lossy compression
  • proprietary vs. open schemes
  • level of industry support
  • bitonal vs. gray/color
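
The lossless/lossy distinction can be shown with a toy example. This is a sketch under stated assumptions: Python's `zlib` (DEFLATE) stands in for a lossless scheme and is not one of the image schemes named in these slides, and the bit-dropping `quantize` function is a deliberately crude, hypothetical stand-in for lossy compression, not how JPEG works.

```python
import zlib

# Eight sample 8-bit pixel values.
pixels = bytes([12, 13, 200, 201, 202, 90, 91, 255])

# Lossless: decompressing returns exactly the original bytes.
restored = zlib.decompress(zlib.compress(pixels))


def quantize(data, keep_bits=4):
    """Toy lossy step: keep only the top `keep_bits` bits of each value."""
    shift = 8 - keep_bits
    return bytes((value >> shift) << shift for value in data)


# Lossy: the discarded low-order bits cannot be recovered afterwards.
approx = quantize(pixels)

print(restored == pixels)
print(list(approx))
```

With 4 bits kept, each value is off by at most 15 gray levels, but the original values are gone for good; that is the trade a lossy scheme makes for a smaller file.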
Common Compression Schemes
  • bitonal
    • ITU Group 4: lossless
    • JBIG (ISO 11544): lossless
    • CPC: lossy
    • DigiPaper
  • grayscale/color
    • LZW: lossless
    • JPEG: lossy
    • Kodak Image Pac: “visually lossless”
    • Fractal and Wavelet compression
Effects of JPEG Compression

[Figure: 300 dpi, 8-bit grayscale image as uncompressed TIFF and with JPEG 18.5:1 compression]

Compression Observations
  • the richer the file, the more efficient and sustainable the compression
  • the more complex the image, the poorer the compression
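
The second observation, that complex images compress poorly, can be illustrated with a general-purpose lossless compressor. This sketch assumes `zlib` as a stand-in (it is not one of the image schemes listed earlier) and uses synthetic data: a flat white region versus random noise.

```python
import random
import zlib


def compression_ratio(data):
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


# A blank white area: 10,000 identical pixel values.
flat = bytes([255]) * 10_000

# A maximally "complex" area: 10,000 random pixel values (seeded for repeatability).
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(10_000))

print(f"flat:  {compression_ratio(flat):.1f}:1")
print(f"noisy: {compression_ratio(noisy):.2f}:1")
```

The flat region shrinks to a tiny fraction of its size, while the noise barely compresses at all. Real page images fall between these extremes.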
Equipment used and its performance over time
  • scanners offer wide range of capabilities to capture detail, dynamic range, and color
  • scanners with same stated functionality can produce different results
  • calibration, age of equipment, and environment affect quality
Equipment used and its performance over time
  • attributes and capabilities of monitor and/or printer are also factors
  • assess quality visually and computationally
    • use targets
    • control QC environment
    • increasing availability of software to assess resolution, tone, color, artifacts
Image Capture:

Create digital objects rich enough to be useful over time in the most cost-effective manner.

How to determine what’s good enough?
  • Connoisseurship of document attributes
  • Objective characterizations
  • Translation between analog and digital
    • measurement to scanning requirement to corresponding image metrics
    • e.g., detail size → resolution → MTF (modulation transfer function)
    • tonal range → bit depth → signal-to-noise ratio
Case Study
  • Brittle Books--printed text, use of metal type, commercial publishers, objective measurement, use of Quality Index from micrographics
  • 600 dpi 1-bit capture adequately preserves informational content of text-based materials
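
The slides do not give the Quality Index formula. The sketch below assumes the bitonal form commonly cited in Cornell's digital benchmarking literature, QI = (dpi x 0.039 x h) / 3, where h is the height in millimeters of the smallest significant lowercase character and 0.039 converts millimeters to inches. Treat the formula, the 1 mm character height, and the reading of QI 8 as high quality as assumptions to check against that literature.

```python
MM_TO_INCHES = 0.039  # rounded conversion factor used in the assumed formula


def quality_index_bitonal(dpi, char_height_mm):
    """Assumed bitonal Quality Index: (dpi * 0.039 * h) / 3.

    This formula is not stated in the slides; it is an assumption.
    """
    return dpi * MM_TO_INCHES * char_height_mm / 3


# Assumed example: 1 mm lowercase characters captured at several resolutions.
for dpi in (200, 300, 600):
    print(f"{dpi} dpi -> QI {quality_index_bitonal(dpi, 1.0):.1f}")
```

Under these assumptions, 600 dpi on 1 mm type gives a QI of about 7.8, close to the value of 8 usually treated as high quality, which is consistent with the case study's conclusion.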
Ensuring Full Informational Capture: “No More, No Less”

[Figure: image quality and utility, marking the desired point of capture]


Create One Scan To Serve Multiple Uses
  • Derive alternative formats/approaches to meet current and future information needs
  • Base “derivative” requirements on document attributes, technical infrastructure, user requirements, and cost
  • Understand technical links affecting presentation and utility of derivatives
User Requirements
  • completeness
  • legibility
  • speed of delivery
  • “cooked” files
Derivatives from a Digital Master
  • the richer the image, the better the derivative
    • a derivative from a rich file is superior in quality to one from a poorer scan
    • the richer the image, the better the image processing

[Figure: an 8” x 10” document compared with a monitor of 800 x 600 pixels]
  • document at 200 dpi: 1,600 x 2,000 pixels
  • document at 100 dpi: 800 x 1,000 pixels
  • document at 60 dpi: 480 x 600 pixels

Compression/File Format Comparison for Derivative Files

[Figure panels: GIF compressed, 6:1 (NARA); JPEG compressed, 20:1 (LC); TIFF uncompressed]

Alternatives for Displaying Oversize Images
  • File formats and compression schemes that support multi-resolution image delivery, e.g., wavelet compression, GridPix, Flashpix
  • User tools for representing scale (Blake Project ImageSizer, a Java applet) and for improving image quality
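
Multi-resolution delivery generally relies on storing or deriving the same image at several sizes so that a viewer can request only the level it needs. The sketch below computes a generic halving pyramid. It is an illustration of the idea only and does not reproduce the actual file layout of Flashpix, GridPix, or any wavelet format; `pyramid_levels` and the 256-pixel stopping size are assumptions made for this example.

```python
def pyramid_levels(width, height, min_side=256):
    """List image sizes from full resolution down, halving each step,
    until the longer side is at most `min_side` pixels."""
    levels = [(width, height)]
    while max(levels[-1]) > min_side:
        w, h = levels[-1]
        levels.append((max(1, w // 2), max(1, h // 2)))
    return levels


# Assumed example: a 1,600 x 2,000 pixel master image.
levels = pyramid_levels(1600, 2000)
for w, h in levels:
    print(f"{w} x {h}")
```

A viewer showing a thumbnail would fetch the smallest level, and zooming in would move it up the list toward the full-resolution master.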
Recommendations Coalescing
  • Intent of conversion drives decisions
    • issues of access considered at conversion
    • notion of long-term utility and cross-institutional resources gaining ground
  • Access images will change with:
    • changing user needs and capabilities
    • changes in technologies: file formats, technical infrastructure, compression, web browsers, processing programs, scaling routines