CS 502: Computing Methods for Digital Libraries

Lecture 9

Conversion to Digital Formats

Anne Kenney, Cornell University Library


What are Digital Images?

  • Electronic snapshots taken of a scene or scanned from documents

  • sampled and mapped as a grid of dots or picture elements (pixels)

  • each pixel is assigned a tonal value (black, white, grays, colors), represented in binary code

  • the code is stored, often in reduced (compressed) form

  • the code is read and interpreted to create an analog version


Four Scanning Methods

  • Bitonal

  • Grayscale

  • Special Treatment

  • Color


Digital Image Quality is Governed By:

  • resolution and threshold

  • bit depth

  • image enhancement

  • color management

  • compression

  • system performance

  • operator judgment and care


Resolution

  • determined by number of pixels used to represent the image

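  • expressed in dots per inch (dpi), a linear measure; the pixel count scales with dots per square inch

  • increasing resolution increases the level of detail captured and increases file size with the square of the dpi (doubling the dpi quadruples the pixel count)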
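
The effect of resolution on file size can be checked with a little arithmetic. This is a minimal sketch, assuming an uncompressed 8" x 10" page at 8 bits per pixel; the page size, bit depth, and function name are illustrative choices, not values taken from this slide.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed image size in bytes for a page scanned at the given dpi."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

# Assumed example page: 8" x 10", 8-bit grayscale.
for dpi in (200, 300, 600):
    size = uncompressed_bytes(8, 10, dpi, 8)
    print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
```

Going from 300 dpi to 600 dpi doubles the pixels along each side, so the pixel count and the uncompressed size grow by a factor of four.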


Effects of Resolution

[Figure: sample images at 600 dpi, 300 dpi, and 200 dpi]


Threshold Setting in Bitonal Scanning

defines the point on a scale from 0 to 255 at which gray values will be interpreted either as black or white
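
The threshold rule can be sketched in a few lines. This assumes 0 is black and 255 is white, and that a value equal to the threshold is treated as white; both the function name and the tie-breaking choice are assumptions, and real scanner software may use the opposite convention.

```python
def to_bitonal(gray_values, threshold):
    """Map 8-bit gray values (0 = black, 255 = white) to 0 (black) or 1 (white)."""
    return [0 if v < threshold else 1 for v in gray_values]

# Hypothetical row of gray values, converted at two example thresholds.
row = [10, 59, 60, 99, 100, 200]
print(to_bitonal(row, 60))    # [0, 0, 1, 1, 1, 1]
print(to_bitonal(row, 100))   # [0, 0, 0, 0, 1, 1]
```

Under this convention, raising the threshold turns more of the mid-gray values black, which thickens dark strokes and can fill in fine detail.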


Effects of Threshold

[Figure: bitonal sample images at threshold = 60 and threshold = 100]


Bit Depth

  • number of bits used to represent each pixel, typically 8 bits or more per channel

  • representing 256 (2^8) levels for grayscale and 16.7 million (2^24) levels for color

  • example: 8-bit grayscale pixel

    00000000 = black

    11111111 = white
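
The level counts follow directly from the number of bits. A minimal sketch, with an illustrative function name:

```python
def tonal_levels(bits_per_pixel):
    """Number of distinct values a pixel of this bit depth can take."""
    return 2 ** bits_per_pixel

print(tonal_levels(1))    # 2 (bitonal)
print(tonal_levels(8))    # 256 (grayscale)
print(tonal_levels(24))   # 16777216 (color, 8 bits for each of 3 channels)

# The two extremes of an 8-bit grayscale pixel, read as binary numbers.
print(int("00000000", 2), int("11111111", 2))   # 0 255
```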


Bit Depth

  • increasing bit depth increases the levels of gray or color information that can be represented and increases file size in direct proportion (arithmetically)

  • affects resolution requirements
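
To see what a lower bit depth does to tonal information, an 8-bit value can be reduced to fewer bits and scaled back for display. This is a sketch of one simple uniform quantization; the function name and the rounding choices are assumptions, not a method described in the lecture.

```python
def requantize(value, bits):
    """Reduce an 8-bit gray value (0-255) to `bits` bits, then rescale to 0-255."""
    levels = 2 ** bits
    index = value * levels // 256        # which of the `levels` bins the value falls in
    return index * 255 // (levels - 1)   # representative value on the 0-255 scale

# Every possible 8-bit input collapses onto a handful of output tones at 3 bits.
distinct_3bit = sorted({requantize(v, 3) for v in range(256)})
print(len(distinct_3bit), distinct_3bit)
```

At 3 bits only 8 distinct tones survive, so smooth gradients break into visible bands; at 8 bits the function returns its input unchanged.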


Effects of Grayscale on Image Quality

[Figure: sample images at 3-bit gray and 8-bit gray]


Image Enhancement

  • can be used to improve image capture

  • its use raises concerns about fidelity and authenticity


Effects of Filters

[Figure: sample images with no filters used and with maximum enhancement]


Image Editing


Compression

  • reduces file size for processing, storage, transmission, and display

  • image quality may be affected by the compression techniques used and the level of compression applied


Compression Variables

  • lossless versus lossy compression

  • proprietary vs. open schemes

  • level of industry support

  • bitonal vs. gray/color
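
The defining property of lossless compression is that decompression returns exactly the original data. The sketch below uses Python's standard `zlib` module (DEFLATE) purely as a stand-in; it is not one of the image-specific schemes covered in this lecture, and the sample data is hypothetical.

```python
import zlib

# Hypothetical stand-in for one scan line of 8-bit gray values:
# light background with a short dark stroke in the middle.
original = bytes([200] * 500 + [30] * 20 + [200] * 480)

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))
print(restored == original)   # lossless: every byte comes back unchanged
```

A lossy scheme would fail the equality check by design, trading exact recovery for a smaller file.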


Common Compression Schemes

  • bitonal

    • ITU Group 4: lossless

    • JBIG (ISO 11544): lossless

    • CPC: lossy

    • DigiPaper

  • grayscale/color

    • LZW: lossless

    • JPEG: lossy

    • Kodak Image Pac: “visually lossless”

    • Fractal and Wavelet compression


Effects of JPEG Compression

[Figure: 300 dpi, 8-bit grayscale sample shown as uncompressed TIFF and with JPEG 18.5:1 compression]


Compression Observations

  • the richer the file, the more efficient and sustainable the compression

  • the more complex the image, the poorer the compression
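
The link between image complexity and compression can be illustrated with two extreme inputs. This sketch again uses `zlib` as a generic lossless compressor, with a uniform byte string standing in for a blank page and random bytes standing in for maximally complex content; both inputs are assumptions made for illustration.

```python
import random
import zlib

random.seed(0)   # fixed seed so the run is repeatable
n = 10_000
flat = bytes([255] * n)                                  # one tone throughout
noisy = bytes(random.randrange(256) for _ in range(n))   # no repeated structure

def ratio(data):
    """Compression ratio: original size divided by compressed size."""
    return len(data) / len(zlib.compress(data))

print(f"flat:  {ratio(flat):.1f}:1")
print(f"noisy: {ratio(noisy):.2f}:1")
```

The uniform input shrinks dramatically, while the random input barely compresses at all; real page images fall somewhere between the two.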


Equipment used and its performance over time

  • scanners offer wide range of capabilities to capture detail, dynamic range, and color

  • scanners with same stated functionality can produce different results

  • calibration, age of equipment, and environment affect quality


Equipment used and its performance over time

  • attributes and capabilities of monitor and/or printer are also factors

  • assess quality visually and computationally

    • use targets

    • control QC environment

    • increasing availability of software to assess resolution, tone, color, artifacts


Image Capture:

Create digital objects rich enough to be useful over time in the most cost-effective manner.


How to determine what’s good enough?

  • Connoisseurship of document attributes

  • Objective characterizations

  • Translation between analog and digital

    • measurement to scanning requirement to corresponding image metrics

    • e.g., detail sizeresolution MTF

    • tonal range bit depth signal-to-noise ratio


Case Study

  • Brittle Books--printed text, use of metal type, commercial publishers, objective measurement, use of Quality Index from micrographics

  • 600 dpi 1-bit capture adequately preserves informational content of text-based materials
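
The slide names the Quality Index but does not give a formula. The sketch below assumes the bitonal form associated with Cornell's digital benchmarking work, QI = (dpi x 0.039 x h) / 3, where h is the height in millimetres of the smallest lowercase character. The formula, the constant names, and the 1 mm example are all assumptions brought in for illustration and should be checked against the benchmarking literature before use.

```python
MM_TO_INCH = 0.039      # approximate inches per millimetre, as in the assumed formula
BITONAL_DIVISOR = 3     # assumed allowance for bitonal (1-bit) capture

def quality_index(dpi, char_height_mm):
    """Assumed bitonal Quality Index for the smallest character on the page."""
    return dpi * MM_TO_INCH * char_height_mm / BITONAL_DIVISOR

# Hypothetical case: 600 dpi capture of text whose smallest character is 1 mm high.
print(round(quality_index(600, 1.0), 1))   # 7.8
```

Under these assumptions, 600 dpi bitonal capture of 1 mm characters yields a QI just under 8, which is consistent with the slide's claim that such capture is adequate for text.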


Ensuring Full Informational Capture: “No More, No Less”

[Figure: chart relating image quality and utility to cost, with the desired point of capture marked]


Create One Scan To Serve Multiple Uses

  • Derive alternative formats/approaches to meet current and future information needs

  • Base “derivative” requirements on document attributes, technical infrastructure, user requirements, and cost

  • Understand technical links affecting presentation and utility of derivatives


User Requirements

  • completeness

  • legibility

  • speed of delivery

  • “cooked” files


Derivatives from a Digital Master

  • the richer the image, the better the derivative

    • a derivative from a rich file is superior in quality to one from a poorer scan

    • the richer the image, the better the image processing


[Figure: an 8” x 10” document compared with a monitor at three resolutions]

  • document: 8” x 10”, 200 dpi (1,600 x 2,000 pixels)

  • monitor: 800 x 600 pixels

  • document at 100 dpi: 800 pixels x 1,000 pixels

  • document at 60 dpi: 480 pixels x 600 pixels
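
The pixel dimensions on this slide follow from multiplying the page size in inches by the dpi. A minimal sketch that reproduces them and finds the highest dpi at which the whole page fits the monitor; the function names are illustrative.

```python
def pixel_size(width_in, height_in, dpi):
    """Pixel dimensions of a page of the given size at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)

def max_dpi_to_fit(width_in, height_in, screen_w, screen_h):
    """Highest dpi at which the whole page fits on screen without scrolling."""
    return min(screen_w / width_in, screen_h / height_in)

print(pixel_size(8, 10, 200))            # (1600, 2000)
print(pixel_size(8, 10, 100))            # (800, 1000)
print(pixel_size(8, 10, 60))             # (480, 600)
print(max_dpi_to_fit(8, 10, 800, 600))   # 60.0
```

At 100 dpi the page fills the monitor's width but needs vertical scrolling; at 60 dpi the full page is visible, at the cost of legibility.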


Compression/File Format Comparison for Derivative Files

[Figure: sample derivative images in three formats]

  • TIFF, uncompressed

  • GIF, compressed 6:1 (NARA)

  • JPEG, compressed 20:1 (LC)


Alternatives for Displaying Oversize Images

  • File formats and compression schemes that support multi-resolution image delivery, e.g., wavelet compression, GridPix, Flashpix

  • User tools for representing scale (Blake Project ImageSizer, Java applet) and for improving image quality
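
Multi-resolution delivery generally means storing or deriving several sizes of one image so a viewer can request the size it needs. The sketch below builds a generic halving pyramid; it does not describe the internals of wavelet compression, GridPix, or Flashpix, and the function name and stopping rule are assumptions.

```python
def pyramid_levels(width, height, screen_w, screen_h):
    """Halve the image repeatedly until it fits the screen; return every level's size."""
    levels = [(width, height)]
    while width > screen_w or height > screen_h:
        width, height = max(1, width // 2), max(1, height // 2)
        levels.append((width, height))
    return levels

# A 1,600 x 2,000 pixel master delivered to an 800 x 600 monitor.
print(pyramid_levels(1600, 2000, 800, 600))
# [(1600, 2000), (800, 1000), (400, 500)]
```

The smallest level gives an overview that fits the screen, and the larger levels can be fetched when the user zooms in.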


Recommendations Coalescing

  • Intent of conversion drives decisions

    • issues of access considered at conversion

    • notion of long-term utility and cross-institutional resources gaining ground

  • Access images will change with:

    • changing user needs and capabilities

    • changes in technologies: file formats, technical infrastructure, compression, web browsers, processing programs, scaling routines

