
Creating a pragmatic pan-European framework for permanent access to the records of science



Presentation Transcript


  1. Creating a pragmatic pan-European framework for permanent access to the records of science. Dr. Peter Tindemans, Chairman, Task Force Permanent Access. Peter Tindemans, Geneva, 10-07-06

  2. Summary • 3 ICT infrastructures: networks, high-performance computing, GRIDs • A 4th one will affect science as profoundly: an ‘infrastructure’ to provide long-term preservation of and access (“P&A”) to Records of Science (and digital heritage in general) • What does this look like? • How should we build it? • Major stakeholders from science, libraries and archives offer their strategic commitment to national governments and the EU to create sufficient momentum in 3-5 years

  3. Overview • Background: including Records of Science in Digital Heritage: Task Force Permanent Access • The problem and range of technical solutions required • High-level, strategic, pragmatic approach: need and essence • What does the ‘European Digital Infrastructure for Preservation of and Access to Records of Science’ look like? • Alliance in the making • Financial challenge • Some issues

  4. 1. Background • Documents (+ images): libraries (+ NASA, ESA) • Audiovisual media, cultural heritage: broadcasting organisations, museums, ... • Scientific and operational data: labs, communities, scientists, service providers, ... • Hence ‘curation’ • Digital libraries, digital archives, digital repositories • Preservation or perennial access

  5. Background in terms of process and getting political attention • Political attention for preservation focused on cultural heritage and on libraries (legal deposit) • Recently, inclusion of records of science in digital cultural heritage evolved from ‘records of history of science’ to ‘records of science in operation’; this concerns not just scope, but also nature of records: ‘data’ next to ‘documents’ (and other cultural physical artefacts) • Particular culmination point: EU Conference “Permanent Access to the Records of Science” (National Library of the Netherlands KB, Netherlands EU Presidency), 1 November 2004, The Hague • Participants agreed on the need to create a European infrastructure for long-term preservation of and permanent access to records of science. The KB was urged to create a Task Force.

  6. Composition Task Force • Bertil Andersson, Chief Executive, European Science Foundation; • Lynne Brindley, Chief Executive, The British Library; • Wim van Drimmelen, Director General, Koninklijke Bibliotheek; • Norbert Kroo, Secretary-General, Hungarian Academy of Sciences; • Wolffried Stucky, professor, Institute of Applied Informatics and Formal Description Methods, Karlsruhe University, curator, Max Planck Institute of Computer Science, Germany; • Malcolm Read, Executive Secretary, Joint Information Systems Committee, UK; • Vincenzo Beruti, ESA/ESRIN; • John Wood, Chief Executive, Council for the Central Laboratory of the Research Councils, UK; • Peter Hendriks, Board, Springer Science and Business Media, Executive Board, International Association of Scientific, Technical and Medical Publishers; • Tomas Lidman, Director General, The National Archives of Sweden; • Peter Tindemans, chair, on behalf of the Koninklijke Bibliotheek. (Composition reflects ‘data’ and ‘documents’)

  7. 2. Problem • Science angle: • Individual scientist • Maintaining and accessing databases built up by individuals: e.g. Maddison database on GDP • Requirements of journals and funders with regard to supplementary or original data • New cultural paradigm challenges individuals, universities, funders, etc. • Large research organisations and communities (CERN, ESA, ...): volume of data • European social sciences data archives • Libraries and archives angle (“perennial storage”): • Acidification threatens paper; obsolescence and volume explosion jeopardise digital heritage • Other cultural heritage organisations: • Similar to libraries and archives

  8. Dimensions of the problem • Technical • Digital data unstable • Perishable • Migration (or other techniques to ensure permanent accessibility if software and hardware change) • Constant care and intervention • Interoperability • Volume • Economic • Cost estimates • Business model • Public good nature • Relation to Open Access • Build into normal R&D funding model • Digital rights/access management • Organisational, including ‘data model of world’ (producing and (re-)using data)

  9. Range of solutions required: R&D&D programme for curation in general, preservation in particular • Storing Petabytes to 100s of Petabytes to Exabytes, surviving changes in hardware and software technologies, retrieving information • Standardised approach to describe information (metadata) and management of information as successive ‘virtualisation’ layers (hardware, data, knowledge, workflows, trust, management) to enable fully automated, distributed solutions • Complex dynamic datasets and databases • Legal solutions for digital access and rights management • Economic business models based on value-chain analysis and public-good aspects • Technical tools, e.g. to overcome ‘museum of old ICT technologies’ and cumbersome migration

  10. Digital preservation methods suggested (Thibodeau, 2002)

  11. What has happened? • Documents (+images): research libraries (esp. US), deposit libraries (esp. Europe: BL, KB) • Data: individual labs, scientists, networks of archives • Some national efforts: UK, e.g. JISC, Digital Preservation Coalition (+ Digital Curation Centre), Germany (NESTOR) emerging, Netherlands • Some EU-funded projects: scattered, focus often on co-ordination • KB: back-up arrangements with several large scientific publishers • Some global co-operation, some efforts at standardisation

  12. 3. High-level, strategic, pragmatic approach: need • Increasing awareness among experts, and sometimes institutions, about the size and complexity of the problem of P&A • Many projects, standards, good practices, etc. But: • No recognition in Europe that preserving and making accessible the digital heritage on very long time scales is a strategic issue for • Organisations (with few exceptions) • Governments • as well as many private sector parties • No financing mechanism • In the USA since 2002: National Digital Information Infrastructure and Preservation Program: Library of Congress working together with NSF, research libraries, archives etc.; 100 M$ to start with; recently NARA got 300 M$ (emphasis still on documents)

  13. High-level approach: essence • Make ‘digital heritage’ stakeholders understand at ‘board level’ the economic and cultural importance of P&A for their strategic development • Involve public and private parties: essential to find business model based on • private and user interests and cost allocations, • public infrastructure: important ‘public good’ aspect • Adopt non-technical ‘model of world’ as basis for the ‘infrastructure’ • Adopt practical way ahead • Where is highest impact possible? • Involve initially not too few, not too many stakeholders • Connect to ongoing activities: don’t replace, but integrate responsibilities

  14. a. Highest impact: focus on Records of Science, taken in broad sense. Two worlds: • Cultural heritage • Begins to include digital heritage • UNESCO; ‘memory institutions’: archives, deposit libraries, museums • Science = ‘records of history of science’ • Politically increasingly visible: UNESCO, EU • Records of science • S&T in digital age: ‘data’ next to ‘documents’; small part to spill into traditional archives • ‘Science’ = S + T; NSE+BMS+SSH; large-scale data collection for operational services and science (meteorology, GIS, census, …); experiments + observations + simulations + surveys + census and poll + history records; data also includes ‘enriched’ and ‘curated’ data: “knowledge preservation”. Gearing up best done by focusing on ‘Records of Science’ • Greatest momentum: • Inherent needs of scientific community and organisations • High ‘specific mass’ (including financial mass) • Covers broad field • Academic and deposit libraries, scientific publishers straddle two worlds • Archives linked to e.g. historical, social and economic sciences

  15. b. Not too few, not too many • Common European approaches: • ‘call for tender’ for projects, • All-inclusive approach: all stakeholders from 25 member states plus Commission, resolutions, communications, agency, …. • Instead focus on critical mass of stakeholders and focused action, i.e. • Emphasis on preservation (though preservation cannot be separated from building digital collections) • Aim to create ‘infrastructure’ • Aim to create growing consensus among and conditions for ‘communities’ and organisations and their particular preservation projects.

  16. 4. Model of the world: framework of conditions and rules of conduct • ‘Communities’ produce science (particle physics; social sciences; astronomy/space science; geophysics/oceanography/earth sciences/earth observation; ...), are different, but have similar structural elements to house the “Record of Science” • In some disciplines a short-term role for individual researchers • ‘Laboratories’ • Specialised data providers • Specialised publishers or web-based archives • Specialised research libraries • A cross-cutting horizontal structure exists too: • Scientific publishers, multidisciplinary open archives • Academic research libraries • Deposit libraries • Conventional archives • All are digital archives or repositories in the digital world

  17. [Diagram] Communities A, B and C, each comprising: labs; special data providers; special publishers; special research libraries; community-specific provisions. Horizontal layer across the communities: general scientific publishers, general open archives, academic research libraries, deposit libraries, conventional archives. Diagram label: cross-disciplinary, cross-community conditions, mechanisms and provisions.

  18. Transform into framework (‘infrastructure’) of real-life organisations and operating conditions (for interoperability and collaboration) • Identify set of core physical digital archives in limited number of initial communities, and in horizontal layer (“critical mass” and ‘high specific mass’ are essential criteria) • These must be OAIS-compliant to ensure proper archiving, interoperability and long-term preservation • Framework for metadata, framework for persistent identifiers, and a number of registries • Cost-effective preservation methods and services must be available • Common framework of principles and guidelines for management of access and rights (underlying the technical tools to implement this framework) • Financial mechanism for developing and testing implementation tools, techniques and services • a. Certification service providers, accredited according to b. a common European accreditation mechanism

  19. 5. An Alliance in the making. Aims: • Establish wide consensus on framework (‘infrastructure’) for LTPA (long-term preservation and access); initial focus on science • Accelerate significantly creation of its main building blocks • Work with national governments and EU to strengthen European strategies, policies and their implementation • Strengthen role of European parties world-wide • Articulate and maintain ongoing R&D&D programme • 3-5 years • A “Rolling Stone”

  20. Tasks • Assisting communities: initial core set and others • Enhancing and consolidating consensus on the building blocks of the ‘infrastructure’ • Helping establish European funding mechanism • Helping establish European accreditation mechanism • Liaising with national governments and EU • Promoting sustainable business models • Raising awareness: funding bodies, professional societies, universities, …

  21. Core Alliance Partners • European Science Foundation • Some of most active libraries: British Library, KB • Some major scientific organisations: ESF, ESA, CERN, EMBL (EIROFORUM), CCLRC, Max Planck Gesellschaft, CESSDA are among those approached • Association of Scientific, Technical and Medical Publishers • Some major national archives • JISC • ‘National coalitions’ for P&A, where they exist: UK, Germany, … • Corporate associate members (e.g. ICT industry): ‘Customer-contractor’ principle

  22. Strengthening emerging consensus; building on what is being done • Conceptualisation and standardisation, e.g. • OAIS • Dublin Core Metadata Initiative (but still very much library/document-oriented) • Draft Audit Checklist for Certification of Trusted Digital Repositories (RLG, NARA plus European experts) • Practical development and implementation, e.g. • Several EU-funded projects (but too much focus on co-ordination); important new ones: DRIVER, CASPAR (with e.g. CCLRC, ESA-ESRIN) • Strong national projects (but in few countries only); e.g. DARE (Netherlands) • Public-private agreements (e.g. libraries and publishers) • Audit and Certification of Digital Archives Project (CRL) to test-audit 3 archives

  23. 6. Financial model • Need to create European, but strongly distributed infrastructure; • Need to make Europe a visible, strong partner in global efforts. Therefore: • Partners continue current efforts and investments • Partners contribute to establish small European organisation to co-ordinate Alliance efforts • ‘100 M€’ for the real action for European funding mechanism • not to be disbursed by Alliance; • for developments in communities the Alliance will work with; • to create the enabling conditions. • Leveraging national and further European funding • (total central + decentralised funding needed to build this infrastructure is much, much higher than 100 M€)

  24. Practicalities about the Alliance • Members: leading national or international organisations • Strategic allies: national coalitions or competence networks; commercial companies or vendors • Board; Director and some staff • Office in Brussels (at ESF’s COST office?) • Budget 3 years: 1.8 M€ • Per partner: ~ 75 k€

  25. Workplan • Year 1 • Interface EU (FP7), ESFRI, national organisations • Facilitate information sharing about preservation approaches and support infrastructure (standards, authentication, registries, metadata capture mechanisms, ...) • Gathering cost information • Involvement in ongoing drafting of archive certification standard • Identifying resources for science-driven interoperability as potential basis for automated interoperability • Year 2 (apart from continuing interfacing) • Shared persistent identifiers scheme • Prototype interoperable search and discovery tools supported by common data models • Certification standard ready for submission to ISO; preliminary work on accreditation organisation • Some alignment of operating practices and use of Digital Rights Management and Authentication and Authorisation systems • Year 3 (apart from continuing interfacing) • Prototyping and testbed activities to put some into production use (e.g. applications to find and combine data and relevant publication material, supported by shared catalogues and data models, single sign-on access to non-public data, etc.) • Co-operation on large-scale storage solutions (exabyte) • Finalise business model for accreditation system • Year 4+ (only after evaluation) • Advanced development of interoperable virtualisation layers

  26. 7. Some Issues • ‘Raw’ data, so far much focus on documents • Model of world primarily based on international communities, or on national approaches (cf. networking with NRENs, connected via GEANT) • ALLIANCE to be set up as corporation or e.g. as consortium • How to get EU involved in a strategic (not individual project-based) and internally co-ordinated approach?
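
Slide 18 calls for a framework for persistent identifiers and a number of registries, and the Year 2 workplan on slide 25 lists a shared persistent identifiers scheme. The presentation does not specify how such a scheme would work. As a minimal sketch of the general idea only, the Python below shows a registry that keeps an identifier stable while the location of the archived object changes, for example after a migration between archives. The class `PidRegistry`, its method names, the `pid:example/...` identifier format and the `.example` URLs are all hypothetical illustrations, not anything defined in the slides.

```python
class PidRegistry:
    """Hypothetical in-memory registry: persistent identifier -> current location."""

    def __init__(self):
        # Maps each identifier string to the current location of its object.
        self._locations = {}

    def register(self, pid, location):
        # An identifier is assigned once and never reused for another object.
        if pid in self._locations:
            raise ValueError(f"identifier already registered: {pid}")
        self._locations[pid] = location

    def relocate(self, pid, new_location):
        # The object has moved; only the location changes, not the identifier.
        if pid not in self._locations:
            raise KeyError(pid)
        self._locations[pid] = new_location

    def resolve(self, pid):
        # Look up where the object identified by `pid` currently lives.
        return self._locations[pid]


if __name__ == "__main__":
    registry = PidRegistry()
    registry.register("pid:example/0001", "https://archive-a.example/objects/0001")
    # Simulate moving the object to another archive.
    registry.relocate("pid:example/0001", "https://archive-b.example/objects/0001")
    print(registry.resolve("pid:example/0001"))
```

Citations that use the identifier keep working after the move, because only the registry entry is updated. A real scheme would also need persistence, governance and agreement among the participating archives, which this sketch does not attempt to model.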
