
Document management (aka ‘digital libraries’)

The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve Jones, Te Taka Keegan, Annika Hinze.


Presentation Transcript


  1. Document management (aka ‘digital libraries’)
     The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve Jones, Te Taka Keegan, Annika Hinze

  2. Our work includes…
     • Document management
     • Content management
     • Metadata management
     • Multimedia documents
     • Alerting and event notification support
     • OCR-ing services
     • Document & collection visualization
     • User needs analysis
     • Text mining
     • Automatic metadata extraction

  3. Greenstone software
     • ‘digital library’ construction, use, and maintenance software
     • Developed at Waikato (www.greenstone.org)
     • Open Source
     • Widely used internationally (UNESCO, FAO, Texas A&M Uni, Kyrgyz Republic, …)
     Digital library: a collection of digital objects (text, video, audio) along with methods for access and retrieval [user], and for selection, organisation, and maintenance [librarian]

  4. Greenstone software features
     Groups: Collections, Documents, Access, Importing, Distributing
     • “Library” = set of separate collections; “Collection” = set of separate documents
     • Multigigabyte collections
     • Hierarchical document model
     • Multimedia: picture, voice, music, video collections
     • Multi-language documents: Unicode throughout
     • Multi-language interfaces: French, Chinese, Arabic …
     • Web browser or CD-ROM
     • Searching: full-text and fielded, ranked or boolean
     • Browsing: hierarchical indexes created from metadata
     • Metadata: Dublin Core + collection-specific extensions
     • Plugins: different document types and metadata specifications
     • Classifiers: create browsing indexes (collection editor decides)
     • Compression techniques throughout: uses MG
     • Distributed collections: coming soon, with CORBA
     • Open-source software: free, extensible
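
The idea of classifiers building browsing indexes from metadata can be illustrated with a minimal Python sketch. It is not Greenstone code: the field names (`dc.Title`, `dc.Subject`), the sample documents, and both functions are hypothetical, chosen only to show how one metadata field can drive one browsing structure.

```python
from collections import defaultdict

# Hypothetical documents carrying Dublin Core-style metadata fields.
documents = [
    {"id": "d1", "dc.Title": "Farming in the Tropics", "dc.Subject": "Agriculture"},
    {"id": "d2", "dc.Title": "Water and Health", "dc.Subject": "Health"},
    {"id": "d3", "dc.Title": "Food Storage", "dc.Subject": "Agriculture"},
]


def classify_by_field(docs, field):
    """Group document ids under each distinct value of one metadata field."""
    index = defaultdict(list)
    for doc in docs:
        value = doc.get(field)
        if value is not None:
            index[value].append(doc["id"])
    return {value: sorted(ids) for value, ids in sorted(index.items())}


def classify_by_initial(docs, field):
    """Group document ids under the first letter of a field (an A-Z list)."""
    index = defaultdict(list)
    for doc in docs:
        value = doc.get(field, "")
        if value:
            index[value[0].upper()].append(doc["id"])
    return {letter: sorted(ids) for letter, ids in sorted(index.items())}


subject_index = classify_by_field(documents, "dc.Subject")
title_index = classify_by_initial(documents, "dc.Title")
print(subject_index)
print(title_index)
```

In this sketch the collection editor's decision amounts to choosing which function to apply to which field.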

  5. Greenstone supports: multilanguage documents

  6. Greenstone supports: hierarchically structured documents (example shown: a book)

  7. Greenstone supports: collection design, maintenance (shown: designing a collection with the Gatherer)

  8. Greenstone supports: a wide (and growing) set of file formats
     • DOC
     • PDF
     • XLS
     • LaTeX
     • Refer
     • MARC
     • …
     • highly extensible through ‘plugin’ mechanism
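
A plugin mechanism of this general kind can be sketched as a registry that maps file extensions to importer functions. The sketch below is an assumption-laden illustration in Python, not Greenstone's plugin API: the `plugin` decorator, the `PLUGINS` registry, and both importers are hypothetical, and the Refer importer handles only simple `%X value` lines.

```python
from pathlib import Path

# Hypothetical registry: lower-cased file extension -> importer function.
PLUGINS = {}


def plugin(*extensions):
    """Decorator that registers an importer for one or more extensions."""
    def register(func):
        for ext in extensions:
            PLUGINS[ext.lower()] = func
        return func
    return register


@plugin(".txt")
def import_text(path, raw):
    """Treat the file as UTF-8 plain text with no metadata."""
    return {"source": path, "format": "text", "text": raw.decode("utf-8")}


@plugin(".refer")
def import_refer(path, raw):
    """Parse simple Refer-style lines such as '%A Author' and '%T Title'."""
    metadata = {}
    for line in raw.decode("utf-8").splitlines():
        if line.startswith("%") and len(line) > 3:
            metadata.setdefault(line[1], []).append(line[3:].strip())
    return {"source": path, "format": "refer", "metadata": metadata}


def import_document(path, raw):
    """Dispatch to the importer registered for the file's extension."""
    ext = Path(path).suffix.lower()
    handler = PLUGINS.get(ext)
    if handler is None:
        raise ValueError(f"no plugin registered for {ext!r}")
    return handler(path, raw)


record = import_document("sample.refer", b"%A A. Author\n%T A Title\n")
print(record["metadata"])
```

Supporting a new format then means registering one more function, with no change to `import_document`.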

  9. Mobile document access
     • handheld information access
     • browsing methods for varying screen sizes
     • studies on search behaviour (on- and off-line)
     • support for non-text documents (FunkyZoom views of maps, images)

  10. Browsing and exploration: hierarchical phrase index
     • What’s in this collection?
     • Is it any good?
     • What coverage for topic X?
     • My query returned too much/little, what now?
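
One way to picture a hierarchical phrase index is as a table of phrase counts in which a user starts from a word and drills down into the longer phrases that contain it. The Python sketch below assumes plain whitespace tokenisation and simple n-gram counting; it is not the algorithm used in Greenstone, and the function names and sample texts are hypothetical.

```python
from collections import Counter


def phrase_counts(texts, max_len=3):
    """Count every word n-gram of length 1..max_len across the texts."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def expansions(counts, phrase, min_count=2):
    """Phrases one word longer than `phrase` that contain it, most frequent first."""
    key = tuple(phrase.lower().split())
    n = len(key)
    hits = [
        (" ".join(p), c)
        for p, c in counts.items()
        if len(p) == n + 1 and c >= min_count and (p[:n] == key or p[1:] == key)
    ]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


texts = [
    "forest management plans",
    "sustainable forest management",
    "forest management in practice",
    "forest fires",
]
counts = phrase_counts(texts)
print(expansions(counts, "forest"))
print(expansions(counts, "forest management", min_count=1))
```

Browsing the expansions of a topic word gives a rough answer to "what coverage for topic X?" without issuing a query.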

  11. Recent and proposed projects
     • Making documents mobile: moving between large online collections and a PDA
     • Text mining: extracting quality metadata from legacy documents
     • User needs analysis: what sort of documents do a given set of users require, and how can the collection be managed?
     • Visualization: making it easy to ‘see’ what’s in a collection, and supporting effective browsing

  12. Recent and proposed projects
     • Multi-language collections: tailoring a document collection interface and interaction mechanisms to the language of its users
     • Alerting services: bringing potentially useful documents to the user’s attention, without overwhelming them
     • Supporting unusual users: collections for the physically disabled, illiterate or semi-literate, children, …
     • Audio and image collections: novel browsing and searching mechanisms

  13. Recent and proposed projects
     • Storage and searching: developed highly efficient techniques for storing, indexing, and searching text documents; implemented in Greenstone, but portable to other document management software
     • Usability analysis: how easy is it to use your current document collection? How can access be improved?
     • And a host of wacky and cool things: collaging document collections, music retrieval systems, ‘aerial’ views of documents, …
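
The difference between boolean and ranked searching over an indexed text collection can be shown with a small, uncompressed inverted index. This Python sketch is only an illustration under stated assumptions: it is not MG or any Greenstone component, it uses whitespace tokenisation, and the tf × idf weighting is one common choice rather than the scheme the slides refer to. All names and sample documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def boolean_and(index, query):
    """Return the ids of documents that contain every query term."""
    postings = [set(index.get(term, {})) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


def ranked(index, query, n_docs):
    """Order documents by summed tf * idf over the query terms, best first."""
    scores = defaultdict(float)
    for term in query.lower().split():
        posting = index.get(term, {})
        if not posting:
            continue
        idf = math.log(1 + n_docs / len(posting))
        for doc_id, tf in posting.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    "a": "digital library software",
    "b": "library of music",
    "c": "digital music retrieval for the digital library",
}
index = build_index(docs)
print(boolean_and(index, "digital library"))
print(ranked(index, "digital music", len(docs)))
```

The boolean query returns an unordered set of exact matches, while the ranked query also returns partial matches, ordered by score.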
