Dr. Martin Halbert MetaArchive Cooperative Wednesday, December 3, 2008

Comparison of Strategies and Policies for Building Distributed Digital Preservation Infrastructure: Initial Findings from the MetaArchive Cooperative. Dr. Martin Halbert MetaArchive Cooperative Wednesday, December 3, 2008 International Digital Curation Conference Edinburgh, Scotland.


Presentation Transcript


  1. Comparison of Strategies and Policies for Building Distributed Digital Preservation Infrastructure: Initial Findings from the MetaArchive Cooperative Dr. Martin Halbert MetaArchive Cooperative Wednesday, December 3, 2008 International Digital Curation Conference Edinburgh, Scotland

  2. Overview • Needs of cultural memory organizations (CMOs) for digital preservation infrastructure that led to the creation of MetaArchive • Framing comparison of some major digital preservation efforts and service offerings • Common distributed digital preservation (DDP) strategies • Findings from the MetaArchive Cooperative about DDP cooperatives

  3. Cultural Memory Organizations (CMOs) • Small to medium-sized libraries • Small research institutes • Historical associations • Archives • Museums • NOT enormous national agencies (US LoC, UK BL) • Organizations responsible for institutional memory / research assets of their communities • Culture here means any resource of primary research value, for humanities, science, or other scholarship

  4. Gaps in Digital Preservation Efforts • 66% of cultural heritage institutions (academic libraries, archives, art museums, public libraries, and other similar kinds of institutions) report that no one is responsible for digital preservation activities • 30% of all archives have been backed up one time or not at all Source: 2005 NEDCC Survey by Bishoff and Clareson

  5. The Problem • CMOs are rapidly digitizing or acquiring local digital archives with long term value for both scholarly and public research purposes • Yet CMO professionals most often lack affordable and scalable DP infrastructures • This lack of access to effective means for long term preservation of digital content is aggravated by a lack of consensus on DP issues and professional roles and responsibilities

  6. Digital Curation/Preservation: An Emerging Field • Historically CMOs have been responsible for preservation of institutional memory • CMO administrators and funders are uncertain about how to carry out these responsibilities in the digital age • No consensus in CMOs on roles, best practices, or priorities in digital preservation • Many competing frameworks and assumptions brought forward from external groups and practitioners seeking to create this new field

  7. What led to MetaArchive? • Planning meetings by a group of US librarians and archivists in 2002-2003 on concerns about preserving digital archives • Felt that we needed to do something practical to help each other preserve our data • Not based on studies, just the observation of our anxieties about doing something together to keep our (expensive) digital materials preserved and viable

  8. The Need for Collaborative Approaches “The increased number and diversity of those concerned with digital preservation—coupled with the current general scarcity of resources for preservation infrastructure—suggests that new collaborative relationships that cross institutional and sector boundaries could provide important and promising ways to deal with the data preservation challenge. These collaborations could potentially help spread the burden of preservation, create economies of scale needed to support it, and mitigate the risks of data loss.” - The Need for Formalized Trust in Digital Repository Collaborative Infrastructure NSF/JISC Repositories Workshop (April 16, 2007)

  9. MetaArchive A distributed digital preservation cooperative for digital archives • Established in 2003 under the auspices of and with funding from the National Digital Information Infrastructure and Preservation Program (NDIIPP) of the US Library of Congress • A functioning DDP network using/building open source software • Organized as an incorporated nonprofit cooperative of libraries and other cultural memory organizations • Sustained by organizational membership fees, a cooperative agreement with the US LoC, and other sponsored funding • Provides training and models for other groups to establish similar distributed digital preservation networks • Fosters broader awareness of digital preservation issues • Designed to address “in-the-trenches” needs of CMOs after environmental scans of other options

  10. Comparison of Selected Digital Preservation Efforts • National Scientific Research Agency Efforts • PubMed Central Efforts in US and UK • Social Science Dataset Archives (UK DA, US ICPSR) • Big-Science Agency Efforts (UKRDS, NSF DataNET) • Cross-Disciplinary National Efforts • US NDIIPP • UK PLANETS • Non-Governmental E-Journal DP Efforts • LOCKSS • Portico

  11. Differences and Variations • Variation evident in understanding of what constitutes digital curation/preservation (scope, practices, priorities) • Relative differences in prescriptivity and degree of centralization (top-down vs. bottom-up planning) between UK and US • Many specific differences in preservation and access aims and technologies

  12. Similar Patterns • Emphasis on collaboration between groups to accomplish digital curation/preservation • Exploration of new professional roles, expertise, models, and best practices • Virtually all efforts examined embrace distributed digital preservation strategies • Most programs (then and now) do not directly address the needs of CMOs

  13. Distributed Digital Preservation Strategies • Digital curation/preservation starts with secure and distributed bit-preservation & good metadata • Technology for secure replication: Many good DDP options (we use a private LOCKSS network) • Collaboration for digital curation/preservation • Provides a framework for systematically exploring new data curation lifecycle roles for CMOs to carry out their core responsibility for curating institutional memory materials • Cooperative strategies for sustaining distributed digital preservation infrastructures
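
The idea of secure, distributed bit-preservation on slide 13 can be illustrated with a small sketch. It is not the LOCKSS polling protocol and not MetaArchive's implementation; it only assumes that several nodes each hold a copy of the same file and that a copy whose SHA-256 digest disagrees with the strict majority is treated as damaged and replaced. The function names (`digest`, `audit_replicas`, `repair`) and the node names are hypothetical.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as the fixity value of one copy."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(replicas: dict[str, bytes]) -> tuple[str, list[str]]:
    """Compare the copies held at different nodes (hypothetical helper).

    Returns the digest held by a strict majority of nodes and the sorted
    names of nodes whose copy disagrees with it. Raises ValueError when
    there are no replicas or no strict majority exists.
    """
    if not replicas:
        raise ValueError("no replicas to audit")
    digests = {node: digest(data) for node, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority digest; manual review needed")
    damaged = sorted(node for node, d in digests.items() if d != winner)
    return winner, damaged


def repair(replicas: dict[str, bytes]) -> dict[str, bytes]:
    """Return a new mapping in which damaged copies are replaced by a majority copy."""
    winner, damaged = audit_replicas(replicas)
    good = next(data for data in replicas.values() if digest(data) == winner)
    return {node: (good if node in damaged else data) for node, data in replicas.items()}


if __name__ == "__main__":
    # Three hypothetical nodes; the copy at node_c has suffered a changed byte.
    copies = {"node_a": b"thesis", "node_b": b"thesis", "node_c": b"thesiz"}
    print(audit_replicas(copies)[1])
```

With only two nodes that disagree there is no majority, so the sketch refuses to guess. This is one reason replication across three or more independent sites is useful for this kind of audit.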

  14. MetaArchive Phase I (2004-2007) • Developed a functioning network for distributed digital preservation (DDP) used by institutions with shared subject domain focus for mutual benefit • Developed this technical solution for DDP based on a reuse of LOCKSS technology, in the form of a separate network with higher capacity nodes • Created a conspectus database to capture collection-level preservation metadata pre-ingest • Created an administrative nonprofit corporation as an independent legal entity for membership agreements • Now preserving via DDP more than 650 collections from many different organizations

  15. Collection Variety Collections include: • Images • Text files • Multimedia files • Datasets • Program executables

  16. MetaArchive Membership • 11 institutions currently: • Emory, GA Tech, Auburn, VA Tech, FSU, Louisville, Hull, Rice, Boston College, Folger, and US Library of Congress • Doubled in size of membership within past year, plan to double again in next 12 months • Now undertaking strategic alliances with other membership organizations to provide DDP services (NDLTD)

  17. Catalytic Efforts • Host workshops in distributed digital preservation strategies • Instructing new MetaArchive members in network processes • Advise other groups considering DDP approaches • Advised/assisted in creation of two additional DDPNs: • Alabama • Arizona

  18. MetaArchive Phase II (2007-2010) • Established additional distributed archives • African Diaspora • Electronic Theses and Dissertations • Early modern literature • New software tools for enhanced conspectus, interoperability with grid-computing, format migration services • Became international with addition of Hull University in UK • Upcoming DDP workshops • Plan to double in size each year (on average) for this period, to reach a robust cooperative size • With funding from NHPRC will provide consulting and outreach services on the MetaArchive model for DDP services

  19. Membership Levels • Contributing Member Sites are institutions that need to preserve digital content, and therefore decide to contribute digital content into the preservation network. The preservation network acts for the common good to preserve the at-risk content submitted by the contributing sites. Contributing sites may also be preservation sites. • Preservation Member Sites are responsible for the basic ongoing activity of preserving digital content. At a minimum, every preservation site must include responsible staff and a node server of the relevant preservation network. Preservation sites collectively comprise a preservation network. • Sustaining Member Sites are responsible for the steering committee of the cooperative and for technical development of the computer systems that enable the preservation network. Sustaining sites may also be preservation sites and/or contributing sites.
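
The membership rules on slide 19 can be read as a small data model: a site may hold several levels at once, and a preservation site needs both responsible staff and a node server. The following is a minimal sketch under that reading, not a MetaArchive system. The class, field, and function names are hypothetical, and the site names are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum, auto


class Level(Enum):
    """The three membership levels named on the slide."""
    CONTRIBUTING = auto()
    PRESERVATION = auto()
    SUSTAINING = auto()


@dataclass
class MemberSite:
    """One member institution; a site may hold several levels at once."""
    name: str
    levels: set[Level] = field(default_factory=set)
    has_responsible_staff: bool = False
    has_node_server: bool = False

    def problems(self) -> list[str]:
        """List the unmet minimum requirements for a preservation site."""
        issues: list[str] = []
        if Level.PRESERVATION in self.levels:
            if not self.has_responsible_staff:
                issues.append("preservation site lacks responsible staff")
            if not self.has_node_server:
                issues.append("preservation site lacks a node server")
        return issues


def preservation_network(sites: list[MemberSite]) -> list[str]:
    """Names of the sites that validly act as preservation nodes."""
    return [
        s.name
        for s in sites
        if Level.PRESERVATION in s.levels and not s.problems()
    ]


if __name__ == "__main__":
    sites = [
        MemberSite("Site A", {Level.CONTRIBUTING, Level.PRESERVATION}, True, True),
        MemberSite("Site B", {Level.CONTRIBUTING}),
        MemberSite("Site C", {Level.PRESERVATION, Level.SUSTAINING}, True, False),
    ]
    print(preservation_network(sites))
```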

  20. Individual Roles • Program Managers are leaders that accept responsibility for coordinating the activities of a digital preservation network. • Data Wranglers are programmers and other technically adept workers that prepare local digital archives for ingestion into a preservation network. • System Administrators are staff members that maintain individual preservation node servers of the relevant preservation network. • Selectors are staff that identify and prioritize content to be preserved. They will most often be knowledgeable concerning the content of an institution’s digital archives, and may have been the same individuals that originally created or acquired the archives.

  21. Findings: Why DDP Cooperatives? • Enables collaborative pooling of resources (staff, expertise areas, technology, infrastructure, funds) • Also allows institutions to retain ownership individually of their part of the infrastructure, expertise, and operations • Defuses competitive jockeying between CMOs; no one institution is the primary leader to which the others sign agreements • Allows for decentered ongoing operations as individual institutions may join or leave • Flexible; cooperatives can be assembled quickly without onerous new overhead, by leveraging sunk costs in existing institutions • Nonprofit organization promotes trust by other institutions from public sector

  22. Questions and Answers • Some contacts: • Martin Halbert (MetaArchive President, Emory representative) mhalber@emory.edu • Tyler Walters (MetaArchive Treasurer, GA Tech representative) tyler.walters@library.gatech.edu • Katherine Skinner (MetaArchive Executive Director) kskinne@emory.edu • Martha Anderson (LoC Program Officer) mande@loc.gov
