Ya-ning Arthur Chen, Feng-chien Chung Computing Centre, Academia Sinica 11 April, ISGC 2008

Presentation Transcript


  1. Ya-ning Arthur Chen, Feng-chien Chung Computing Centre, Academia Sinica 11 April, ISGC 2008 A hybrid approach of digital long term preservation to institutional repositories - A case study of DSpace/SRB Integration

  2. Outline • Background of MAAT • From Website to Institutional Repository • Long Term Preservation & OAIS • The Hybrid Approach • Future

  3. MAAT – Background • The Metadata Architecture & Application Team (MAAT) was established in 2002 to carry out metadata research and provide services in support of the National Digital Archives Program (NDAP) in Taiwan • To date, MAAT has supported over 80 digital library projects of the Taiwan E-Learning & Digital Archive Program (TELDAP, formerly NDAP)

  4. MAAT – Motivation • A number of documents have been created and can be categorized into • questionnaires, • work sheets, • meeting records, • metadata mapping tables, • system specifications, best practices of metadata standards, technical reports, research papers, briefings, and tutorial materials. • Most documents of the MAAT website are arranged in a static manner.

  5. MAAT Website http://www.sinica.edu.tw/~metadata Academia Sinica

  6. MAAT - Consideration 1 • Document management and repository • Over 1,000 documents and URL links have been arranged and served at the MAAT website. • The MAAT website needs an effective document management system. • Access control • The MAAT website still lacks access control for its documents.

  7. MAAT - Consideration2 • Workflow reengineering • the MAAT website adopts a centralized model to maintain documents and website arrangement. • This model is very complicated and labor-intensive, and the overhead cost is very high. • Usage Statistics Report

  8. MAAT - Challenge • Too many publications, • Too much change (that is, various document versions), • Too many contributors, and • Too many institutions.

  9. Implementation Level • Static Website • Phase 1: from website to IR • Institutional Repository

  10. DSpace - Features • Captures • Digital research material in any format • Directly from creators (e.g. faculty) • Large-scale, stable, managed long-term storage • Describes • Descriptive metadata (Dublin Core) • Technical metadata (file size, format…) • Rights metadata (licenses, Creative Commons…) • Distributes • Via WWW, with necessary access control • Preserves • Persistent ID and Handle • Bitstream format registry

  11. DSpace - Data Model

  12. MAAT – Content 1 • Content Type • Supported Projects (documents from the projects we support) • Publications and Activities (documents of publications and activities) • Project Management (project management related, restricted documents) • Research & Development (restricted documents) • 48 communities, 110 collections, 783 items • Document Format • User uploaded: 794 PDF files, 446 MS Word files, 59 MS PowerPoint slides, 27 XML files, 17 JPEG images, 16 HTML files, 7 MS Excel files, and others • System generated: over 1900 plain text files (mainly DSpace license files)…

  13. MAAT – Content 2 • Access Method • DSpace user browse and search interface • Search engines (Google, Yahoo, etc.) • OAI-PMH harvesting
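
As an illustration of the OAI-PMH access method, the sketch below builds a ListRecords request URL of the kind a harvester would send. The `verb`, `metadataPrefix` and `set` parameters come from the OAI-PMH protocol; the base URL and the helper name `list_records_url` are assumptions made for the example, since the slides do not give the repository's OAI endpoint.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real OAI-PMH base URL of the repository
# is not stated in the slides.
BASE_URL = "http://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc (unqualified Dublin Core) is the metadata format every
    OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Restrict harvesting to one set, if the repository defines sets.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL)
print(url)
```

A harvester would fetch this URL and keep requesting further pages for as long as the response carries a resumption token.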

  14. MAAT DSpace http://pl11.sinica.edu.tw:8080/dspace/index.jsp

  15. DSpace - Consideration • The Need for Extending DSpace Storage Capabilities • The number of documents grows so fast that a very large storage solution is required • The Lack of a Risk Management Mechanism • Reliable backup and disaster recovery systems are not included in the default DSpace installation

  16. Implementation Level • Static Website • Phase 1: from website to IR • Institutional Repository • Phase 2: from IR to Long Term Preservation • Institutional Repository + Grid

  17. DSpace/SRB Approach 1 • In 2004, NARA (with NSF/NPACI) funded a project aimed at integrating DSpace and SRB to • allow DSpace to use the data grid as a storage layer • permit the exchange of authentic documents between them • NARA Proposal & Participants • San Diego Supercomputer Center (SDSC) • Member of the National Partnership for Advanced Computational Infrastructure (NPACI), an NSF-sponsored program • MIT Libraries • UC San Diego Libraries (UCSD) • Hewlett Packard Laboratories (HP) • National Archives and Records Administration (NARA)

  18. DSpace/SRB Approach 2 • In DSpace, there can be multiple bitstream stores; each of these can be either traditional storage or SRB storage. • Both traditional and SRB bitstream stores are specified by configuration parameters in dspace.cfg.
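
The idea of several numbered bitstream stores, some traditional and some on SRB, can be sketched as a small registry. This is a conceptual model only, not DSpace code: the class names, the `incoming` parameter and the paths are all hypothetical, and the real settings live in dspace.cfg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitstreamStore:
    number: int    # index of the store in the configuration
    kind: str      # "traditional" or "srb"
    location: str  # directory path or SRB collection (hypothetical values)


class StoreRegistry:
    """Keeps all configured stores and remembers which one receives new files."""

    def __init__(self, stores, incoming):
        self._stores = {store.number: store for store in stores}
        if incoming not in self._stores:
            raise ValueError(f"no store configured with number {incoming}")
        self._incoming = incoming

    def for_new_bitstream(self):
        # Assumption of this sketch: new uploads go to one designated store.
        return self._stores[self._incoming]

    def for_existing_bitstream(self, store_number):
        # Existing files are read from whichever store they were written to.
        return self._stores[store_number]


registry = StoreRegistry(
    stores=[
        BitstreamStore(0, "traditional", "/dspace/assetstore"),
        BitstreamStore(1, "srb", "/examplezone/home/dspace/assetstore"),
    ],
    incoming=1,
)
print(registry.for_new_bitstream().kind)
print(registry.for_existing_bitstream(0).location)
```

Keeping the store number with each file is what lets older content stay on traditional storage while new content is written to the data grid.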

  19. Examination of DSpace/SRB • An Open Archival Information System (OAIS) is intended to preserve information for access and use by a Designated Community

  20. OAIS Functional Model

  21. Workflow

  22. OAIS Functional Model…Again (diagram labels) • DSpace Submit Interface • DSpace Batch Import • DSpace Ingest • DSpace RDBMS & SRB MCAT • SRB Mass Storage • DSpace User Interface • DSpace & SRB Administration

  23. Producer, Management and Consumer • Producer • DSpace may play the ingest role: it receives the SIP from the producer and generates the AIP for management and storage • Management • SRB may play the role of receiving the AIP, storing and managing the data, and supplying the AIP for access • Consumer • DSpace may play the role of processing the access request and generating the proper DIP for dissemination • (Diagram labels: DSpace Submit Interface, DSpace Batch Import, DSpace Ingest, DSpace RDBMS & SRB MCAT, SRB Mass Storage, DSpace User Interface; SIP, AIP, DIP)
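
The SIP, AIP and DIP hand-offs described on this slide can be sketched as three small functions. This is a toy model under stated assumptions, not the DSpace/SRB implementation: the field names, the SHA-256 fixity check and the in-memory `archive` dictionary standing in for SRB storage are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:  # Submission Information Package, from the producer
    producer: str
    filename: str
    content: bytes


@dataclass(frozen=True)
class AIP:  # Archival Information Package, kept in storage
    identifier: str
    producer: str
    filename: str
    content: bytes
    checksum: str


@dataclass(frozen=True)
class DIP:  # Dissemination Information Package, sent to the consumer
    identifier: str
    filename: str
    content: bytes


def ingest(sip, identifier):
    """Ingest role (DSpace in the slides): turn a SIP into an AIP."""
    checksum = hashlib.sha256(sip.content).hexdigest()
    return AIP(identifier, sip.producer, sip.filename, sip.content, checksum)


def store(archive, aip):
    """Management role (SRB in the slides): keep the AIP."""
    archive[aip.identifier] = aip


def access(archive, identifier):
    """Access role (DSpace in the slides): check fixity, then build a DIP."""
    aip = archive[identifier]
    if hashlib.sha256(aip.content).hexdigest() != aip.checksum:
        raise ValueError("stored content no longer matches its checksum")
    return DIP(aip.identifier, aip.filename, aip.content)


archive = {}
sip = SIP("MAAT", "report.pdf", b"example bytes")
store(archive, ingest(sip, "demo/1"))
dip = access(archive, "demo/1")
print(dip.filename, len(dip.content))
```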

  24. Archives arrangement • Logical archives structure: • DSpace allows multi-level communities and a single level of collections • Archival principles • Principle of provenance • Principle of respect des fonds • Physical files arrangement: • SRB mass storage technology
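
The logical structure above (communities nested to any depth, collections at one level only) can be sketched as a small tree. The class and function names are illustrative, not DSpace's own, and "Example Project" with its collections is a made-up entry.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Collection:
    # Collections hold items only; they cannot contain other collections.
    name: str
    items: List[str] = field(default_factory=list)


@dataclass
class Community:
    # Communities may nest to any depth and may hold collections.
    name: str
    subcommunities: List["Community"] = field(default_factory=list)
    collections: List[Collection] = field(default_factory=list)


def collection_paths(community, prefix=()):
    """Yield the full community path leading to each collection."""
    path = prefix + (community.name,)
    for collection in community.collections:
        yield path + (collection.name,)
    for sub in community.subcommunities:
        yield from collection_paths(sub, path)


root = Community(
    "Supported Projects",
    subcommunities=[
        Community(
            "Example Project",  # hypothetical project name
            collections=[
                Collection("Metadata Mapping Tables", ["item-1"]),
                Collection("Meeting Records"),
            ],
        )
    ],
)
paths = list(collection_paths(root))
for p in paths:
    print(" / ".join(p))
```

Recording the full community path for each collection is one way to keep the originating project visible, in the spirit of the principle of provenance.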

  25. Future 1 • Best Practice & SOP for DSpace/SRB integration • Deeper check against the activities of OAIS • Preservation planning and policy • Monitor the service requirements of Producer/Management/Consumer and emerging technology; develop an archival strategy and migration plan

  26. Future 2 • Feasibility Evaluation • Migrate from SRB to other advanced technologies, such as SRM, iRODS… • Adopt a metadata approach to enhance digital preservation, such as PREMIS and METS (e.g. structural map, behavior section…)

  27. Thank You
