
Co-leads: Wendy Pradt Lougee & Richard Luce NSF Digital Data Workshop September 2006

Infrastructure Breakout: What capacities should we build now to manage data and migrate it over the future generations of technologies, standards, formats, and institutions?


Presentation Transcript


  1. Infrastructure Breakout: What capacities should we build now to manage data and migrate it over the future generations of technologies, standards, formats, and institutions? Co-leads: Wendy Pradt Lougee & Richard Luce NSF Digital Data Workshop September 2006

  2. Process • Articulation of assumptions/principles for data infrastructure • OAIS model: framework for exploring data infrastructure • Recommendations

  3. Assumptions • Infrastructure = technology, people, instruments, data • Tension: infrastructure requires a shared layer and needs to be discipline agnostic, while supporting discipline requirements • Culture prevails; must address incentives and process • Coordination problem - the universe will be split up • Both federated and central models; incentivize and leverage local investment • Collaboration is a critical model • The role of the library in the curatorial function, e.g., governance, standards, collection development, privacy, etc. • Imperative of stakeholder representation

  4. Assumptions • Infrastructure is a shared common good - data is a public good • Infrastructure must: • Support multiple representations of data • Support layering and building / reuse of components • Support appropriate security systems, privacy, and confidentiality, and enable trusted relationships

  5. Structuring the Problem • Functions (OAIS model): • Ingest & submission • Access • Archive • Management (preservation & storage)

  6. OAIS Reference Model http://public.ccsds.org/publications/archive/650x0b1.pdf

  7. • Receiving Submission Information Packages • Quality assurance • Generating Archival Information Packages • Extracting descriptive information • Coordinating updates http://public.ccsds.org/publications/archive/650x0b1.pdf
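The ingest steps on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not something from the workshop or the OAIS specification: the class names `SubmissionPackage` and `ArchivalPackage`, the `ingest` function, the SHA-256 fixity check used for quality assurance, and the in-memory `catalog` dictionary are all hypothetical choices made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SubmissionPackage:
    """Hypothetical SIP: content plus producer-supplied metadata."""
    producer: str
    title: str
    payload: bytes
    declared_sha256: str


@dataclass
class ArchivalPackage:
    """Hypothetical AIP: content plus fixity and descriptive information."""
    payload: bytes
    sha256: str
    descriptive_info: dict = field(default_factory=dict)


def quality_check(sip: SubmissionPackage) -> bool:
    """Quality assurance (assumed here to be a checksum comparison)."""
    return hashlib.sha256(sip.payload).hexdigest() == sip.declared_sha256


def ingest(sip: SubmissionPackage, catalog: dict) -> ArchivalPackage:
    """Receive a SIP, check it, generate an AIP, and update the catalog."""
    if not quality_check(sip):
        raise ValueError("submission failed fixity check")
    digest = hashlib.sha256(sip.payload).hexdigest()
    # Extract descriptive information from the submission.
    description = {
        "title": sip.title,
        "producer": sip.producer,
        "size_bytes": len(sip.payload),
    }
    aip = ArchivalPackage(sip.payload, digest, description)
    # Coordinate the update: record the description under the content digest.
    catalog[digest] = description
    return aip
```

A producer would build a `SubmissionPackage` with the checksum of its own data and pass it to `ingest` along with the shared catalog; a real archive would replace the dictionary with a data-management service and add format validation beyond fixity.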

  8. • Monitoring the environment of the OAIS • Providing recommendations to ensure that information stored in the OAIS remains accessible • Data migration • Monitoring technology environment

  9. OAIS REFERENCE MODEL • Populating, maintaining & accessing descriptive information which identifies archive holdings & administrative data used to manage the archive • Database updates • Reports

  10. OAIS REFERENCE MODEL • Soliciting and negotiating submission agreements with producers • Auditing submissions to ensure they meet archive standards • Configuration management of system hardware and software

  11. OAIS REFERENCE MODEL • Supporting users in determining the existence, description, location and availability of information stored in the OAIS • Allowing users to request and receive information products

  12. Recommendations • Build capacity and models • Policy infrastructure to create culture of sharing • Research initiatives • Education

  13. Capacity and Models Support creation of collaboratories that model essential infrastructure for data-enabled research. • Scaled proposals: small, medium, large • Collaborations among stakeholders • Instantiate data models, technical and organizational architectures • Discipline-based or cross-disciplinary • Incorporate sustainability plan

  14. Policy Create policy to ensure contribution of research data as shared research asset, enabling reuse in new research contexts - shared public goods concept • Incentives for researchers to contribute data to collaborative environments (deposit) • Support structures and training • Encourage archive-ready data and objects with appropriate metadata and formats

  15. Research Invest in problem-based research that will fuel collaborative data environments. • Interoperable data models • Specification of IP and access rights • Security and trust • Collaboration tools • Cross-disciplinary discovery • Automatic generation of metadata

  16. Education Stimulate the development of expert data curators and an informed scientific community. • Partner with IMLS to support new programs in data science/curation. • Develop scientist capacity and culture to contribute to data collaboratives. • Next-generation scientists and information specialists
