

So you want to start a digital library?

A presentation by Tom, Hugh, and Noel

Digital Libraries in focus
  • UC Berkeley Digital Library Project
  • The Perseus Project
  • The Digital Scriptorium
  • The William Blake Archive
Berkeley D-Lib Overview
  • Very much a test bed: emphasis on developing technologies for the digital library, not so much focus on building a coherent, fully-functional library (so far…)
  • technology-focused
  • Contents
Perseus Project
  • “The Perseus Project is an evolving digital library of resources for the study of the ancient world and beyond. Collaborators initially formed the project to construct a large, heterogeneous collection of materials, textual and visual, on the Archaic and Classical Greek world…Recent expansion into Latin texts and tools and Renaissance materials has served to add more coverage within Perseus and has prompted the project to explore new ways of presenting complex resources for electronic publication.”
  • (inter)connection-focused
  • Starting Points
The Digital Scriptorium
  • The Digital Scriptorium is basically the extension into cyberspace of Duke’s Rare Book, Manuscript, and Special Collections Library.
  • collection-focused
  • Projects
The William Blake Archive
  • “…the Blake Archive was conceived as an international public resource that would provide unified access to major works of visual and literary art that are highly disparate, widely dispersed, and more and more often severely restricted as a result of their value, rarity, and extreme fragility.”
  • (single) book-focused
      • Does one thing really well
  • The texts
Components of a Digital Library

STORAGE

MANAGEMENT

DELIVERY

formatting

metadata/history

search capabilities

  • useful/meaningful metadata can increase usability
  • must be standardized
  • multiple ways of accessing the data (entry-points)
  • need for multiple formats
  • digital library should maintain detailed records of object history

archiving

browsing

collections

  • predefined structure to the data (cf. collections)
  • on-site, system-wide, both?
  • arbitrary groupings of data

user interaction

  • ability to re-view the data

accessibility

  • users of differing physical and mental capabilities must have access to the library
Components of a Digital Library

SERVER: STORAGE, SEARCHING, DELIVERY, USING, BROWSING, MANAGEMENT

CLIENT: MANAGEMENT, USING, DELIVERY, SEARCHING, BROWSING

UC Berkeley’s Digital Library Project

STORAGE

MANAGEMENT

DELIVERY

formatting

metadata/history

search capabilities

  • addresses and implements multiple search techniques; results vary
  • represents a test-bed of info and archiving best-practices
  • metadata standards defined, some implemented

archiving

collections

browsing

  • discrete, disconnected collections
  • addresses and implements multiple searching techniques

user interaction

  • experimental tools in text, image, GIS, etc. (buggy)

Informix Universal Server. Database backend.

DBI. Perl module for web cgi access to databases.

AMASS Storage software. From Emass/ADIC. "Transforms offline storage into direct access mass storage."

Cheshire II Search Engine. In-house search engine project.

accessibility

  • information-overkill
  • reliance on Java = not universally accessible
The Perseus Project

STORAGE

MANAGEMENT

DELIVERY

formatting

metadata/history

search capabilities

  • standardized to the Web
  • metadata embedded with texts, images
  • multiple access points (via both texts and objects)
  • only basic formats available

archiving

collections

browsing

  • offers numerous predefined collections
  • further file formats retained

user interaction

accessibility

  • easily navigable site
  • not approved by Bobby

UNAVAILABLE TO THE PUBLIC.

Duke’s Digital Scriptorium

STORAGE

MANAGEMENT

DELIVERY

formatting

metadata/history

search capabilities

  • multiple Web-centric formats available
  • metadata via SGML/HTML
  • basic metadata search capabilities (limited by SGML)

archiving

collections

browsing

  • offers useful predefined collections (“canned searches”)
  • masters not retained, only JPEG format used
  • discrete, disconnected collections
  • includes history behind data

user interaction

DynaWeb. From Inso. A tool that allows searches through structured SGML documents and translates from SGML to HTML on-the-fly.

SGML. Using the Encoded Archival Description DTD.

Webinator. From Thunderstone. Used to index the various static HTML pages in the Scriptorium. Also used to index the Duke Papyrus Collection.

accessibility

  • easily navigable site
  • Bobby-approved
The William Blake Archive

STORAGE

MANAGEMENT

DELIVERY

formatting

metadata/history

search capabilities

  • multiple, standard formats, most available from the site
  • metadata retained on every region of every image
  • text and image-based searches (both based on metadata)

archiving

collections

browsing

  • Works-in-Progress area allows for collaborative CM
  • the limited scope limits passive collection-browsing
  • TIFF originals retained

user interaction

  • INote software allows for individual image markup

accessibility

  • easily navigable site
  • not approved by Bobby

DynaWeb. SGML. Java Applets (ImageSizer, INote).

Finding Things in the Digital Library

Analog Library

Catalog / keyword search

Browsing

Special collections (varied / unique finding aids)

~ ~ ~ ~

Digital Library

Metadata-based searches

Virtual collections (varied finding aids)

Content-based (exploitive) searches
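
The difference between a metadata-based search and a content-based search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LibraryObject` class, its field names, and the two sample records are hypothetical and are not taken from any of the four projects.

```python
from dataclasses import dataclass


@dataclass
class LibraryObject:
    """Hypothetical digital-library object: catalog metadata plus full text."""
    identifier: str
    metadata: dict
    content: str = ""


def metadata_search(objects, field_name, value):
    """Return identifiers whose metadata field equals the value (case-insensitive)."""
    wanted = value.lower()
    return [o.identifier for o in objects
            if str(o.metadata.get(field_name, "")).lower() == wanted]


def content_search(objects, term):
    """Return identifiers whose full text contains the term (case-insensitive)."""
    wanted = term.lower()
    return [o.identifier for o in objects if wanted in o.content.lower()]


# Hypothetical sample records, loosely modelled on an archival finding aid.
OBJECTS = [
    LibraryObject("lease-156", {"type": "lease"},
                  "Lease of a half part of the messuage with its collieries."),
    LibraryObject("minutes-156a", {"type": "minutes"},
                  "Minutes of consultation about the lease; refers to a book of surveys."),
]

if __name__ == "__main__":
    print(metadata_search(OBJECTS, "type", "lease"))
    print(content_search(OBJECTS, "lease"))
```

The metadata search only sees what a cataloguer recorded, while the content search also finds the minutes that merely mention the lease. That is the trade-off the slide contrasts.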

Finding Examples
  • Using metadata (Blake Images)
  • Browsing (Perseus Texts)
  • By collection (Digital Scriptorium)
  • Using content
    • Berkeley Cheshire II (Documents)
    • Berkeley Cheshire II Tilebars (Documents)
    • Other media types (images, video, audio)
  • Helping the user distinguish (or not)
    • Berkeley (what am I really searching against)
    • Perseus search tools (metadata-based with pointers to content-based options)
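
The TileBars entry in the list above refers to a display that shows where in a document each query term occurs, not just whether it occurs. The sketch below is a rough text-only approximation of that idea, not the Cheshire II implementation: the function names, the fixed number of segments, and the shading characters are all assumptions made for illustration.

```python
import re


def tilebar(text, terms, segments=4):
    """Return one row per term holding the term's count in each text segment."""
    words = re.findall(r"[a-z]+", text.lower())
    size = max(1, -(-len(words) // segments))  # ceiling division
    chunks = [words[i:i + size] for i in range(0, len(words), size)]
    # Pad so that every row has exactly `segments` tiles.
    chunks += [[] for _ in range(segments - len(chunks))]
    return {term: [chunk.count(term.lower()) for chunk in chunks]
            for term in terms}


def render(rows, shades=" .:#"):
    """Draw each row as a bar of tiles; denser characters mean more hits."""
    lines = []
    for term, counts in rows.items():
        tiles = "".join(shades[min(c, len(shades) - 1)] for c in counts)
        lines.append(f"{term:>10} |{tiles}|")
    return "\n".join(lines)


# Hypothetical document text.
TEXT = "coal mill coal lease lease lease term mill"

if __name__ == "__main__":
    print(render(tilebar(TEXT, ["coal", "lease"])))
```

A reader scanning such bars can tell at a glance whether two terms cluster in the same part of a document, which a flat relevance score does not show.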
Texts in Berkeley D-Lib
  • Multivalent documents
    • “Multivalent documents (MVD) represent an open, extensible, network-centric document model.”
      • Enable high functionality for scanned page images. E.g., in a scanned page image “enlivened” by MVD, you can select and paste text, highlight matching search terms, and perform a variety of other manipulations, such as sorting a table in a scanned image.
      • Support distributed annotations. With MVD, annotations of many sorts can be made by any user on any supported document type.
      • Generate alternative views of components of documents. For example, MVD lenses allow a different view of a region of a screen. A magnification lens will magnify a region; an “OCR lens” will show what an OCR process produces for that region.
      • Alternative selection. Instead of just selecting text, you can choose to have the selection modified in particular ways.
Texts in Perseus
  • Homer’s Iliad 1.1-32 (in Greek)
  • Homer’s Iliad 1.1-32 (with links to Perseus’ morphology parsing tool)
  • Homer’s Iliad 1.1-32 (with links to lemmas in the online lexicon)
  • Homer’s Iliad 1.1-32 (in Beta Code)
  • Homer’s Iliad 1.1-32 (in English, with links to searchable terms)
The Digital Scriptorium
  • Metadata:
    • EAD, which has 145 tags.
    • EAD is designed to describe hierarchical collections. An EAD file contains components (<c></c>), which can contain other components nested within them (<c01><c02></c02></c01>).
An Example of EAD

<c03 level="item"><did>

<unitid id="SHE-156">156.</unitid>

<unitdate normal="16650404">4 April 17 Chas. II [1665]</unitdate><note><p><list>

<item>(1) <persname authfilenumber="957702">George Shepperd</persname> of the <geogname authfilenumber="NT0526">town and county of Newcastle upon Tine</geogname>, gent.</item>

<item>(2) <persname authfilenumber="23549">Anne Carr (n&eacute;e Franks)</persname> of <geogname authfilenumber="SS0032">South Sheiles</geogname> in the county of Durham, widow.</item></list> Lease by (1) to (2) of his half part of the messuage in <geogname authfilenumber="PO0016">Pockerley</geogname> in the county of Durham with its <subject authfilenumber="c56">collieries and coalmines</subject>, and a fulling mill.<lb>

Term: 1 month from <date normal="16650331">31 March 1665</date>.<lb>

Consideration: &pound;10.<lb>

Signed: (1 ). Seal: red wax, papered, on parchment tag.</p></note>

<physdesc><extent>Parchment. 1m.</extent></physdesc>

<unitloc loctype="container">114/5-1</unitloc></did>

<c04 level="item"><did>

<unitid id="SHE-156a">156. (a)</unitid>

<unitdate normal="16650414">14 April 17 Chas. II [1665]</unitdate><note><p>

Attached to 156:<lb>

Minutes of consultation with <persname authfilenumber="68239"> cousin Nan</persname> about above agreement.<lb> Refers to a book of surveys called <title render="italic">The Book of Pockerley</title> created in <date normal="162203xx">March 1622</date>.<lb>

See <ref target="SHE-2056">no. 2056 below</ref> for letter containing description of this meeting.</p></note>

<physdesc><extent>Paper. 1f.</extent></physdesc>

<unitloc loctype="container">114/5-2</unitloc></did></c04></c03>

Rendered into plain text:

156. 4 April 17 Chas. II [1665]

(1) George Shepperd of the town and county of Newcastle upon Tine, gent.

(2) Anne Carr (née Franks) of South Sheiles in the county of Durham, widow.

Lease by (1) to (2) of his half part of the messuage in Pockerley in the county of Durham with its collieries and coalmines, and a fulling mill.

Term: 1 month from 31 March 1665.

Consideration: £10.

Signed: (1 ). Seal: red wax, papered, on parchment tag.

Parchment. 1m.

[114/5-1]

156. (a) 14 April 17 Chas. II [1665]

Attached to 156:

Minutes of consultation with "cousin Nan" about above agreement.

Refers to a book of surveys called The Book of Pockerley created in March 1622.

See no. 2056 below for letter containing description of this meeting.

Paper. 1f.

[114/5-2]
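
The rendering above follows the nesting of EAD components, and that walk can be sketched in Python. This is a minimal illustration, not the Scriptorium's DynaWeb pipeline. It assumes a simplified, well-formed XML stand-in for the SGML fragment (the `<note>` text is left out and no SGML-only constructs such as the unclosed `<lb>` remain), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the EAD fragment shown earlier.
# Assumption: the <note> content is omitted to keep the sketch short.
SAMPLE = """
<c03 level="item">
  <did>
    <unitid id="SHE-156">156.</unitid>
    <unitdate normal="16650404">4 April 17 Chas. II [1665]</unitdate>
    <physdesc><extent>Parchment. 1m.</extent></physdesc>
    <unitloc loctype="container">114/5-1</unitloc>
  </did>
  <c04 level="item">
    <did>
      <unitid id="SHE-156a">156. (a)</unitid>
      <unitdate normal="16650414">14 April 17 Chas. II [1665]</unitdate>
      <physdesc><extent>Paper. 1f.</extent></physdesc>
      <unitloc loctype="container">114/5-2</unitloc>
    </did>
  </c04>
</c03>
"""


def is_component(elem):
    """True for <c> and for numbered components such as <c01> or <c04>."""
    tag = elem.tag
    return tag == "c" or (len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit())


def render(component, depth=0):
    """Return plain-text lines for a component and its nested components."""
    indent = "  " * depth
    lines = []
    did = component.find("did")
    if did is not None:
        heading = " ".join(did.findtext(name, "").strip()
                           for name in ("unitid", "unitdate"))
        lines.append(indent + heading.strip())
        extent = did.findtext("physdesc/extent")
        if extent:
            lines.append(indent + extent.strip())
        location = did.findtext("unitloc")
        if location:
            lines.append(f"{indent}[{location.strip()}]")
    # Recurse into nested components, one indent level deeper.
    for child in component:
        if is_component(child):
            lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(ET.fromstring(SAMPLE.strip()))))
```

Because the hierarchy is carried by the markup itself, the same file could also be rendered as HTML or indexed for search without re-keying the data.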

Texts in the Blake Archive
  • Also, essentially, multivalent documents.
    • Though with much stricter bounds than the Berkeley MVDs.
  • They, too, use SGML markup to describe their archive.

<component type="figure" location="D">
  <characteristic>shepherd</characteristic>
  <characteristic>male</characteristic>
  <characteristic>young</characteristic>
  <characteristic>short hair</characteristic>
  <characteristic>tights</characteristic>
  <characteristic>standing</characteristic>
  <characteristic>contrapposto</characteristic>
  <characteristic>looking</characteristic>
  <illusobjdesc>A young, short-haired male shepherd in tights stands in contrapposto, watching his grazing flock of sheep--perhaps looking at the sheep that lifts its head toward him. He holds a crook in his left hand; his purse is visible near his right knee.</illusobjdesc>
</component>
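
Markup like the component above is what makes metadata-based image searching possible. The following Python sketch shows one way such a search could work; it is not the Blake Archive's own search code. The `<illustration>` wrapper element, the second component, and the `find_components` function are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: components are wrapped in a hypothetical <illustration> element,
# and the second component is invented so the search has something to exclude.
SAMPLE = """
<illustration>
  <component type="figure" location="D">
    <characteristic>shepherd</characteristic>
    <characteristic>male</characteristic>
    <characteristic>young</characteristic>
    <characteristic>standing</characteristic>
  </component>
  <component type="animal" location="C">
    <characteristic>sheep</characteristic>
    <characteristic>grazing</characteristic>
  </component>
</illustration>
"""

ROOT = ET.fromstring(SAMPLE.strip())


def find_components(root, *wanted):
    """Return (type, location) for components carrying every wanted characteristic."""
    wanted_set = {w.lower() for w in wanted}
    hits = []
    for comp in root.iter("component"):
        have = {(c.text or "").strip().lower()
                for c in comp.findall("characteristic")}
        if wanted_set <= have:  # all requested terms are present
            hits.append((comp.get("type"), comp.get("location")))
    return hits


if __name__ == "__main__":
    print(find_components(ROOT, "shepherd", "young"))
```

Since each hit carries a `location` value, a search result can point to a region of the image and not only to the image as a whole.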

Possibilities for texts
  • Full markup = very powerful finding/linking capabilities
    • Text Encoding Initiative (~400 tags!)
    • Perseus is an example of how fully marked-up texts can be used.
What Else?
  • Georeferences
  • Contextual finding/browsing
  • Intelligent full-text searching
Imagery

ISSUES

  • storage
  • management
  • delivery (searching, browsing, interaction)
Imagery: Best-Practices Example

The Blake Archive

storage of multiple resolutions and TIFF originals

TIFF v. JPEG

metadata applied to images regionally

As a result, searching is improved.

It further allows for interactive programs like INote, a regional metadata assignment program, used by contributors (thus far) to enhance this metadata store.

Imagery: Further Issues?

The Perseus Project

While Perseus does archive larger image versions, the images that are accessible on the Web are useful only as peripheral learning aids, not learning tools in themselves.

Perseus has strong searching tools for text and has applied this paradigm to its imagery. This creates very powerful and useful metadata binding to the image object. But can we do more?

Unacceptable for research.

Imagery: Interesting delivery?

Image searching via pattern recognition.

Berkeley’s Blobworld: http://elib.cs.berkeley.edu/photos/blobworld/

Geographic data in the digital library
  • Tools for using geodata
    • Perseus Atlas
    • Berkeley GIS viewer
  • Tools for searching with geodata or relating it to other objects
    • Bueller?
    • Bueller?

Searching / relating geodata
  • Interactive map: select feature(s) by browsing or query, get access to “related” objects in the collection
  • Pick a non-geodata object, use GIS & full-text searches in background to “lookup” potentially related objects (geodata and/or not)
  • Plot features found in non-geodata source on an interactive map
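
The first idea in the list above, selecting a map region and getting back related objects, can be sketched as a simple bounding-box query. This is a minimal sketch under stated assumptions: the class names are hypothetical, the coordinates are approximate, and a real GIS would use a spatial index and handle regions that cross the 180° meridian, which this code does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoObject:
    """Hypothetical collection object tagged with a point location."""
    identifier: str
    lat: float
    lon: float


@dataclass(frozen=True)
class BoundingBox:
    """Map region selected by the user, in decimal degrees."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, obj):
        return (self.south <= obj.lat <= self.north
                and self.west <= obj.lon <= self.east)


def objects_in_region(objects, box):
    """Return identifiers of all objects whose point falls inside the box."""
    return [o.identifier for o in objects if box.contains(o)]


# Hypothetical records; coordinates are approximate.
COLLECTION = [
    GeoObject("lease-durham", 54.78, -1.58),
    GeoObject("deed-newcastle", 54.98, -1.61),
    GeoObject("vase-athens", 37.98, 23.73),
]

if __name__ == "__main__":
    north_east_england = BoundingBox(south=54.5, west=-2.0, north=55.2, east=-1.0)
    print(objects_in_region(COLLECTION, north_east_england))
```

The reverse direction (start from a text, extract place names, and plot them) needs the same georeferenced records plus a gazetteer lookup, which is outside this sketch.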
Lessons learned
  • What kind of digital library (libraries) do we want?
    • Repository & access for multiple more-or-less discrete collections?
    • Cutting-edge test bed for cool DL technologies?
    • “Working library” to support a set of defined needs (research, teaching, outreach)?
    • A set of tools, resources & expertise to allow units and divisions to assemble one or more of the above?
    • Hybrid?
  • Clearly defined mission, capabilities, features and institutional home/support are keys to successful implementation (Blake)
  • The storage/management/delivery model will underpin whatever choices we make
    • How does each candidate vendor “solution” map onto this model?
    • How much customization / interconnectedness / extensibility is possible?
  • Mission before selection? Compromise on features is inevitable but fraught with risk