
European Metadata Initiatives: The METAe Metadata Engine



  1. European Metadata Initiatives: The METAe Metadata Engine Simon Tanner Higher Education Digitisation Service http://heds.herts.ac.uk

  2. Overview
     • Introduction to HEDS.
     • Current metadata contexts in Europe.
     • METAe - The Metadata Engine Project
     • Project summary
     • Project objectives
     • Description of work
     • Benefits

  3. Introduction to HEDS
     • HEDS provides advice, consultancy and a complete production service for digitization and digital library development.
     • Recent projects include:
     • Rekeying and tagging 35 million characters in Anthropology
     • 17th century Trade Directories
     • British newsreel scripts from the 1940s
     • Transparencies - artwork, manuscripts, stained glass
     • Photographic prints and postcards - local history collections
     • Microfilm: manuscripts, political pamphlets
     • Consultancy: The British Library, Oxford University, New Opportunities Fund applicants.

  4. Current metadata contexts: SCHEMAS: Forum for Metadata Schema Implementors
     http://www.schemas-forum.org/
     “SCHEMAS will inform schema implementers about the status and proper use of new and emerging metadata standards. The project will support development of good-practice guidelines for the use of standards in local implementations. It will investigate how metadata registries can support these aims.”

  5. Current metadata contexts: RSLP Collection Description
     http://www.ukoln.ac.uk/metadata/rslp/
     “Based on a thorough modelling of collections and their catalogues, the project will develop a collection description metadata schema and associated syntax using the Resource Description Framework (RDF). We will develop a simple Web-based tool in order that projects can describe their collections and prototype a search service.”

  6. Current metadata contexts: CEDARS: CURL Exemplars for Digital Archives
     http://www.curl.ac.uk/projects/cedars.html
     “There is a pressing need for a strategy for digital preservation… the CEDARS project aims to address the strategic, methodological and practical issues and will provide guidance for libraries in best practice for digital preservation.”
     CEDARS are identifying the descriptive metadata elements that should be gathered to maximize the continued accessibility of digital resources.

  7. Presentation of an EU project within the 5th Framework Programme http://meta-e.uibk.ac.at/

  8. Project summary
     • To make the digital conversion of printed material:
        • more reliable in terms of digital preservation
        • more cost-effective in terms of automation
        • more attractive in terms of user-friendliness and accessibility.
     • METAe will develop a software package to extensively automate and improve the generation of metadata.
     http://meta-e.uibk.ac.at/ Simon Tanner, HEDS

  9. Project summary
     • The goals will be achieved by applying new technologies for character, layout and document recognition.
     • The METAe package will convert the captured information into XML documents.
     • XML files serve as a basis for various applications, such as: new XML search engines, navigation tools, electronic books, audio books, or the automated production of HTML, XHTML, PDF or PS files.
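The idea of converting captured information into XML and then deriving other formats from it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`page`, `title`, `paragraph`, `page_number`), the sample text and both function names are hypothetical and are not taken from the METAe package or its schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical recognition output: (layout label, text) pairs for one page.
# Labels and element names are illustrative, not the METAe schema.
recognised = [
    ("title", "Chapter One"),
    ("paragraph", "It was a bright cold day."),
    ("page_number", "7"),
]

def to_xml(blocks):
    """Wrap each recognised block in an element named after its layout label."""
    page = ET.Element("page")
    for label, text in blocks:
        ET.SubElement(page, label).text = text
    return page

def to_html(page):
    """Derive a simple HTML fragment from the XML (one possible rendering)."""
    tag_for = {"title": "h1", "paragraph": "p"}
    body = ET.Element("div")
    for child in page:
        if child.tag in tag_for:  # blocks without an HTML mapping are skipped
            ET.SubElement(body, tag_for[child.tag]).text = child.text
    return ET.tostring(body, encoding="unicode")

page = to_xml(recognised)
xml_text = ET.tostring(page, encoding="unicode")
html_text = to_html(page)
```

Because the XML keeps the layout labels, the same file could feed a different renderer (for example one that keeps page numbers) without repeating the recognition step.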

  10. Participants: co-ordinator & technical partners
     • Co-ordinator: Leopold-Franzens-Universität, Innsbruck (A)
     • Institut für Angewandte Informatik, University of Linz (A)
     • Mitcom Neue Medien GmbH (G)
     • CCS Compact Computer Systeme (G)
     • Dipartimento di Sistemi e Informatica, University of Florence (I)
     • Scuola Normale Superiore, Centro di Ricerche Informatiche per i Beni Culturali (I)

  11. Participants: library & research partners
     • Universidad de Alicante (S)
     • Friedrich-Ebert-Stiftung (G)
     • Cornell University Library (USA)
     • Bibliothèque nationale de France (F)
     • The National Library of Norway (N)
     • Biblioteca Statale A. Baldini (I)
     • Karl-Franzens-Universität Graz (A)
     • Higher Education Digitisation Service HEDS (UK)

  12. Project objectives
     • Introduction of layout and document analysis as a key technology in future digitisation software.
     • Development of capturing and conversion tools for the automated recording and generation of administrative and descriptive metadata.
     • Development of an omnifont OCR-engine specialised in processing old European typefaces of the 19th century („Fraktur“, Gothic fonts).

  13. Project objectives
     • Evaluation of digital preservation standards (e.g. XML, EAD, TEI or ISO 12083)
     • Development of an XML search engine for tagged full texts and images.
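What searching tagged full text buys over plain-text search can be shown with a small Python sketch: a query can be restricted to one kind of element, such as titles. The sample document, its element names and the `search` function are assumptions made for illustration and do not describe the search engine planned by the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged full text; element names are illustrative only.
SAMPLE = """<document>
  <title>Die Reise nach Wien</title>
  <paragraph>Die Reise begann im Winter.</paragraph>
  <paragraph>Wien war noch fern.</paragraph>
</document>"""

def search(xml_text, term, tag=None):
    """Return (tag, text) for elements whose text contains term.

    If tag is given, only elements with that tag are searched.
    Matching is case-insensitive substring matching.
    """
    root = ET.fromstring(xml_text)
    term = term.lower()
    elements = root.iter(tag) if tag else root.iter()
    return [
        (el.tag, el.text)
        for el in elements
        if el.text and term in el.text.lower()
    ]

everywhere = search(SAMPLE, "reise")
titles_only = search(SAMPLE, "reise", tag="title")
```

Here `everywhere` holds the title and the first paragraph, while `titles_only` holds only the title, which is the kind of distinction an untagged text file cannot express.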

  14. Description of work
     1. Input module for scanning and importing existing metadata
     2. OCR-engine specialised in typefaces of the 19th century
     3. Document analysis module
     4. Page layout analysis module
     5. Rules and controlled vocabulary for automated recognition process
     6. Conversion module assembling an XML document containing all recognised metadata
     7. Export module for the XML enriched document and the scanned image
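The module list above can be read as a pipeline in which each stage enriches one shared document record. The Python sketch below shows only that data flow. Every name in it (`Document`, `RULES`, the stage functions, the file name and the sample text) is hypothetical, the OCR stage is a stand-in returning fixed text, and document analysis and page layout analysis are merged into one rule-based step for brevity.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class Document:
    image_name: str                                # reference to the scanned image
    metadata: dict = field(default_factory=dict)   # imported metadata
    blocks: list = field(default_factory=list)     # (label, text) pairs

def import_existing(doc, record):
    """Input module: merge metadata that already exists, e.g. a catalogue record."""
    doc.metadata.update(record)
    return doc

def run_ocr(doc):
    """OCR stand-in: a real engine would read the image; fixed text is used here."""
    texts = ["Die Reise", "Es war einmal", "12"]
    doc.blocks = [("unlabelled", t) for t in texts]
    return doc

# Rules for automated recognition: (test, label) pairs, first match wins.
RULES = [
    (lambda i, text: i == 0, "title"),               # first block on the page
    (lambda i, text: text.isdigit(), "page_number"),
]

def analyse_layout(doc):
    """Layout analysis: label each block using RULES, defaulting to paragraph."""
    doc.blocks = [
        (next((lab for test, lab in RULES if test(i, text)), "paragraph"), text)
        for i, (_, text) in enumerate(doc.blocks)
    ]
    return doc

def convert(doc):
    """Conversion module: assemble one XML tree holding all recognised metadata."""
    root = ET.Element("document", image=doc.image_name)
    meta = ET.SubElement(root, "metadata")
    for key, value in sorted(doc.metadata.items()):
        ET.SubElement(meta, key).text = value
    body = ET.SubElement(root, "body")
    for label, text in doc.blocks:
        ET.SubElement(body, label).text = text
    return root

def export(doc):
    """Export module: return the XML text together with the image reference."""
    return ET.tostring(convert(doc), encoding="unicode"), doc.image_name

doc = import_existing(Document("page_0001.tif"), {"author": "Anon", "year": "1850"})
xml_out, image_out = export(analyse_layout(run_ocr(doc)))
```

Keeping the rules in a separate table rather than inside the analysis function mirrors item 5 of the list, where rules and vocabulary are a component of their own.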

  15. http://meta-e.uibk.ac.at/ Simon Tanner, HEDS

  16. Benefits
     1. Reduce the need for manual post-processing of scanned content.
     2. Produce a rich output, with metadata on all levels: administrative, structural and format metadata.
     3. Offer new possibilities for successful long-term preservation.
     4. New ways to enhance access, re-use and multi-versioning.
     5. Selective and distributed correction of OCR'd content.
     6. Benefits for the visually disabled and also in scenarios of functional disability.

  17. European Metadata Initiatives: The METAe Metadata Engine Simon Tanner Higher Education Digitisation Service Email: heds@herts.ac.uk http://heds.herts.ac.uk
