Mirko Albani, Vincenzo Beruti, Eugenia Forcada, Graeme Mason (European Space Agency)

ESA-ESRIN Earth Science Infrastructure requirements for Long Term Data Preservation and Access: the ESA approach. Mirko Albani, Vincenzo Beruti, Eugenia Forcada, Graeme Mason (European Space Agency), Raffaele Guarino (Cap-Gemini)


Presentation Transcript


  1. ESA-ESRIN Earth Science Infrastructure requirements for Long Term Data Preservation and Access: the ESA approach. Mirko Albani, Vincenzo Beruti, Eugenia Forcada, Graeme Mason (European Space Agency), Raffaele Guarino (Cap-Gemini). mirko.albani@esa.int, vincenzo.beruti@esa.int, eugenia.forcada@esa.int, graeme.mason@esa.int, raffaele.guarino@capgemini.com. APA Conference, Helsinki, November 22nd 2010

  2. Presentation outline structure • The Scientific Data e-infrastructure in Europe: the High Level Expert Group Vision • The Earth Science Data Preservation and Access challenges • ESA Earth Observation LTDP Framework: one step ahead for a tangible solution • The ESA EO LTDP Preparatory Programme • FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • LAST (Long term data Archive Study on new Technologies) • Conclusions

  3. The High Level Expert Group on Scientific Data. The High Level Expert Group (HLEG) on Scientific Data was created by the European Commission’s Directorate-General for Information Society and Media to prepare a “vision 2030” for the evolution of the e-infrastructure for scientific data. The group was composed of specialists with competence in the digital libraries domain from the following categories: • Memory organisations (libraries, archives, museums) • Authors, publishers and content providers • ICT industry (e.g. search engines, technology providers) • Scientific and research organisations and academia, including ESA. The HLEG delivered the “vision 2030” report to the European Commission in October 2010. The vision focuses on a scientific e-infrastructure that supports seamless access, use, re-use, and trust of data.

  4. The Vision into the future. Data themselves become “the infrastructure”: a valuable asset on which science, technology, the economy and society can advance. The availability and open accessibility of reliable scientific data and associated knowledge would enable: • Researchers to find, access and process the data they need, confident in their ability to use and understand the data and to evaluate the degree to which the data can be trusted. • Producers of data to benefit from opening it to broad access, thanks to the return they can have (e.g. public organizations may see a rise in available funding as funding bodies gain confidence that their investments in research are paying back extra dividends to society, through increased use and re-use of publicly generated data). • Industry to have the appropriate returns. • Policy makers to make decisions based on solid evidence and to monitor the impacts of these decisions.

  5. Open questions • How can we make sure that data are collected together with the information necessary to reuse them and make them understandable in the future? • How can we get researchers, or individuals, to contribute to the global data set? • How do we overcome the problems of diversity: heterogeneity of data, but also of backgrounds and data-sharing cultures in the scientific community? • How do we deal with the diversity of data repositories and access rules, within or between disciplines, and within or across national borders? • How can we trust the data (e.g. authenticity) and guarantee their integrity? • How can we decide what to preserve? • How can we fund the preservation of our information resources in the long term?

  6. Presentation outline structure • The Scientific Data e-infrastructure in Europe: the High Level Expert Group Vision • The Earth Science Data Preservation and Access challenges • ESA Earth Observation LTDP Framework: one step ahead for a tangible solution • ESA EO LTDP Preparatory Programme • FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • LAST (Long term data Archive Study on new Technologies) • Conclusions

  7. Earth Science requirements breakdown

  8. Earth Science: long time series of observations

  9. Earth Science Data Preservation and Access: the challenges. Diversity in: • communities that generate and use the data (space agencies, universities and research centres, private users, etc.) • data formats and types, and which data and associated information are preserved • ways of analysing, sharing and handling data • methods and conditions for data discovery and access • approaches for data integrity, quality, authenticity, etc. Sustainability of Earth Science data preservation is not guaranteed in Europe. Earth Science is a very scattered domain: a microcosm within the wider science domain, but fully representative of its challenges and open points. Earth Science can be a pathfinder for the wider science domain.

  10. Earth Science domain: the needs. Aiming at the creation of an infrastructure that allows: • Interoperability and sharing of existing infrastructures for the preservation of and access to Earth Science data. • Sustainability of the data infrastructure through the availability of funding mechanisms (public good only?) that enable all to contribute as well as to benefit, through an increased return on investment. • Availability of common policies and approaches for the preservation of and access to the data. The solution requires joining forces: an international framework for collaboration and cooperation.

  11. What do we mean by infrastructure? • Does the infrastructure include at least: • Organizations' mandates • Cooperation guidelines and policies, agreements, common plans • Legal and financial aspects • Standards and procedures • Cost commitments, sustainability • Data to be preserved and accessed (awareness, quality, reliability, trust, knowledge) and related metadata • Operational data repositories, data preservation, migration • Researchers' requirements • Technical physical resources (i.e. hardware, software, networking, web visibility and access, etc.) • Maintenance, operation, evolution • There are major differences between the HLEG Vision 2030 and the e-IRG Roadmap 2010, both referring to an e-infrastructure?

  12. Earth Science Data Preservation and Access Infrastructure: the issues. Considering the Earth Science LTDP Framework as a living entity, the following issues need to be addressed: • What kind of environment is necessary for it to come into existence? • How should it be nurtured? • How will it be self-sustained? • How could we know if its survival is at risk?

  13. Earth Science Infrastructure Vision for Data Preservation and Access: the living ecosystem and sustained business model [diagram]. Life-cycle stages: Birth, Parenting, Nurturing, Self-sustained. Items listed on the diagram: • Wide benefits • Economical and technical sustainability • Mutual data enrichment • More value added services • Continuous improvement • From rules to structured processes • Investment • Infrastructure enhancements • Auditing and validation of LTDP • Continuous improvement of LTDP • Individual entity adopts (de-facto standard) • Extended rules • Training • Stakeholders participating in building • First evolution of infrastructure (tools and technologies) • State-of-the-art • Basic rules • Basic infrastructures • Minimal investments • EO data provider vision. Change management labels: Starting point; Acceptance of new LTDP culture and ownership; Continuous support and keep motivation; Transition. User/stakeholder approach labels: Poorly reactive, Reactive, Interactive, Proactive.

  14. Presentation outline structure The Scientific Data e-infrastructure in Europe: the High Level Expert Group Vision The Earth Science Data Preservation and Access challenges European Earth Observation LTDP Framework: one step ahead for a tangible solution ESA EO LTDP Preparatory Programme • FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • LAST (Long term data Archive Study on new Technologies) Conclusions

  15. ESA Earth Observation data preservation and access: background and challenge • ESA’s EO archives: • Content of EO data archives is extending from a few years to several decades. • Very valuable data representing long scientific time-series for a large number of applications. • ~150 TB archived in the early ’90s, more than 4 PB archived today and over 27 PB expected in the next 10 years (Level 0 data). • A factor of 4 applies if Level 1-2 products are also considered. • The same trend is expected in the archives of other European EO satellite owners.
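
As a rough illustration of the volumes quoted on slide 15, the sketch below computes the compound annual growth rates those figures imply. It is not part of the presentation. The 19-year span taken for "early '90s to today" (1991 to 2010), the decimal conversion 1 PB = 1000 TB, the reading of 27 PB as the total archive size after 10 years, and the helper name `cagr` are all assumptions made for this example.

```python
def cagr(start, end, years):
    """Compound annual growth rate implied by growing from start to end over the given years."""
    return (end / start) ** (1.0 / years) - 1.0


TB_PER_PB = 1000  # assumption: decimal units

archive_early_90s_tb = 150           # "~150 TB archived in early '90s"
archive_2010_tb = 4 * TB_PER_PB      # "more than 4 PB archived today"
archive_2020_tb = 27 * TB_PER_PB     # "over 27 PB expected in the next 10 years"

# Assumption: "early '90s" is taken as 1991, so 19 years up to 2010.
past_rate = cagr(archive_early_90s_tb, archive_2010_tb, years=19)
future_rate = cagr(archive_2010_tb, archive_2020_tb, years=10)

# Level 0 volume scaled by the quoted factor of 4 for Level 1-2 products.
with_products_pb = 4 * archive_2020_tb / TB_PER_PB

print(f"Implied past growth:      {past_rate:.1%} per year")
print(f"Implied projected growth: {future_rate:.1%} per year")
print(f"Projected volume including Level 1-2 products: {with_products_pb:.0f} PB")
```

Under these assumptions the archive grew by roughly 19% per year in the past and is projected to grow by roughly 21% per year, reaching about 108 PB once Level 1-2 products are included. The exact rates shift with the start year chosen for "early '90s".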

  16. ESA EO Archives Data volume expected trend

  17. European Earth Observation LTDP Framework Birth: one step ahead for a tangible solution. To support the huge amount of scientific data needed by Earth Sciences, ESA is coordinating an initiative to set up a European Earth Observation LTDP framework, considering a possible extension to Earth Science data. ESA has initially targeted the LTDP framework at Earth Observation space-sensed data and has launched the ESA Long Term Data Preservation (LTDP) Programme for an initial period of three years (2009-2011).

  18. European Earth Observation LTDP Framework context • ESA has been working for years on setting up an international EO LTDP framework for collaboration and cooperation in the Earth Observation domain. • Interoperability among existing infrastructures is becoming a reality in the context of the Global Monitoring for Environment and Security (GMES) project. • None of the European organizations can do this alone; cooperation is mandatory and beneficial for all. • The value of the whole is higher than the sum of the individual entities’ values. To be done: • Involve the full Earth Observation data value chain (including researchers, individuals, scientists, value adders). • Identify an approach to guarantee the sustainability of the existing data infrastructure. • Extend the framework to the wider Earth Science domain. • Define an architecture for a pan-European Earth Science data infrastructure based on existing and new infrastructures.

  19. European EO LTDP Framework: Goal and Principles. Goal: to preserve the European, and Canadian, EO space data sets for an unlimited time-span, ensuring and facilitating their accessibility and usability while respecting the individual entities’ data policies. Principles: • To preserve the huge amount of scientific data we need an LTDP community supported by interacting organizations. • This LTDP community has to preserve the data and provide access services of value to customers, who are themselves members. • Over time, LTDP members will evolve their capabilities and roles, and will tend to align themselves with the directions collectively defined. • New perspectives and collective growth of the different actors.

  20. European Agencies joining forces • A leading group of European Agencies owning EO mission archives have joined forces in the LTDP Working Group to address the LTDP challenge together since 2008. • Under the Ground Segment Coordination Body control. • ESA consultation with its Member States. • Consultations with all European EO mission and data archive owners (http://earth.esa.int/gscb/ltdp/LTDP_Agenda.html).

  21. European EO LTDP Framework parenting. A collaborative framework open to all possible members: • With distributed and heterogeneous entities and facilities. • To reach a harmonized infrastructure for the preservation of and access to the European EO space data. • Sustained through a multilateral cooperation with multiple funding sources from at least the European EO data owners. Each European EO space data owner contributes: • With their ideas and expertise and possibly their infrastructure. • By progressively adhering to commonly defined and agreed Guidelines and Procedures, using each one’s resources in a more efficient manner to achieve the end goal. Progressive implementation based on a stepwise approach (short, mid, long-term activities).

  22. The LTDP international context • ESA membership • Participation in several projects • Standards (e.g. OAIS) • WGISS LTDP task • A GEO subtask is under approval for a long term data preservation strategy • From space to in-situ data

  23. LTDP Framework ESA major achievements • Promoted the creation of the European LTDP Framework in Earth Observation, and defined the major cooperation areas and the related implementation plan • Implementation of the basic rules, “The guidelines”, published reflecting the consensus of the European EO data providers • Defined the initial data set to be preserved, including the related glossary • Leading the coordination of LTDP in the Earth Observation domain with the European partners: - Information flow (workshops, web site, conferences) - Monitoring the EO user needs and context for EO - Sharing technology development information - Participation in several EC initiatives (Caspar, Genesis DR, Parse, AParse, etc.) • Achieved the approval of an ESA-funded EO LTDP preparatory programme for ESA missions, during ESA C-MIN 2008, and in a broader scope exploiting also cooperation with EC DG INFOSO

  24. Presentation outline structure The Scientific Data e-infrastructure in Europe: the High Level Expert Group Vision The Earth Science Data Preservation and Access challenge ESA LTDP Earth Observation Framework: one step ahead for a tangible solution ESA EO LTDP Preparatory Programme • FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • LAST (Long term data Archive Study on new Technologies) Conclusions

  25. ESA LTDP Preparatory Programme 2009-2011 Work Plan [diagram: LTDP Workplan 2009 - 2011]. • European LTDP Framework Coordination: complete, maintain & evolve the LTDP Common Guidelines (involving CEOS / GEO); create a European collaborative technical framework; analyse & coordinate with other international LTDP projects (e.g. funded by EC). • European LTDP Programme Preparation (activities for period 2012 onwards): technical content (ESA archives, European LTDP Framework); cost; funding options (ESA general and mandatory budgets, EC contributions, national budgets); schedule. • Analysis & Studies: consolidate user requirements; define data set to be archived beyond telemetry; evaluate impact of latest technologies. • Implementation (ESA archives): knowledge preservation beyond data archiving; integrity of archives; data security; interoperability and standardization; archives exploitation. Starting from: Earth Observation LTDP programme approved at ESA for the period 2009-2011. To achieve: Earth Science LTDP programme approved at ESA for the period 2009-2011.

  26. ESA LTDP Programme Workplan: LTDP Workplan 2009-2011 • Consolidate user requirements • Define Earth Science datasets composition • Assess European capabilities • Review current LTDP guidelines • Analysis and studies • FIRST • LAST • European LTDP framework coordination • Participation in European initiatives • LTDP programme preparation • Implementation (ESA archives) • Technologies for data preservation • Maturity level of such technologies • Gap analysis of ESA practices • Risks concerned with technologies

  27. Presentation outline structure The Scientific Data e-infrastructure in Europe: the High Level Expert Group Vision The Earth Science Data Preservation and Access challenge ESA LTDP Earth Observation Framework: one step ahead for a tangible solution ESA EO LTDP Preparatory Programme • FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • LAST (Long term data Archive Study on new Technologies) Conclusions

  28. FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • Objectives: • Collection and definition of requirements and needs from Earth Science users in terms of long term preservation of Earth Science data, products and related information. • Definition of the composition of the Earth Science data set that should be preserved in the long term. • Preliminary assessment of the current capabilities of European archives and provision of recommendations for the set-up of a European LTDP Framework. • Kicked off in June 2010 with a duration of 8 months. • User requirements collected through a questionnaire and interviews. • Involvement of international experts, GEO Communities of Practice and User Interface Committee.

  29. Study methodology Project’s headlines and streams • LTDP/FIRST • Looking at source documents • Performing a survey • Analysing similar initiatives • Consolidate user requirements • Identify domains and applications • Identify concerned needs • Perform a survey / cross check • Define Earth Science datasets composition • Define dimensions for assessment • Identify owners, providers and other roles • Capture and assess • Assess European capabilities • Analyse missing, TBD and uncompleted with respect to requirements / needs • Propose updating • Propose way forward for the LTDP Framework • Review current LTDP guidelines

  30. Sources of requirements • gcos-138.pdf • GCOS Climate observation needs (PR 2004/2008, et al.) • GMES Services: Geoland2, MyOcean, SAFER, MACC • GMES/GISC in-situ data • Climate Modelling User Group, Deliverable 1.2, Requirement Baseline Document • CCI EO Data requirements • ECV/GCOS 107 • European Strategy Forum on Research Infrastructure • http://inspire.jrc.ec.europa.eu/index.cfm [Diagram labels: Feedback analysis process; Requirements; GCOS; GMES; CCI; ESFRI; Inspire/JRC; Comparison with similar initiatives]

  31. LAST (Long term data Archive Study on new Technologies) • Objectives: • Study the different archiving technologies that are mature for operation in the short and mid term, or that will be available in the long term, and that are best able to satisfy the requirements of ESA and EO data providers in terms of digital information preservation. • Kicked off in May 2010 with a duration of 12 months. • Activities: • Requirements Definition and Due Diligence • Archiving Technology Survey • Testing and Benchmarking • Technology workshop held in Madrid in July 2010 with several participants from international space agencies. • Technical web site implemented: http://ltdpts.eo.esa.int/, to be used as a reference site for archiving technology and LTDP technical issues (e.g. forums, etc.).

  32. Presentation structure The Scientific Data e-infrastructure in Europe: the High Level Expert Group Vision The Earth Science Data Preservation and Access challenge ESA LTDP Earth Observation Framework: one step ahead for a tangible solution ESA EO LTDP Programme • FIRST (Definition of LTDP User Requirements and Preserved Data Set Composition) study • LAST (Long term data Archive Study on new Technologies) Conclusions

  33. ESA LTDP programme in brief • Preservation of Earth Science data in the long term is a major challenge in Europe. • LTDP cooperation activities aimed at guaranteeing the preservation of European data are ongoing in the Earth Observation domain under ESA coordination. • ESA is already active: • In applying the EO LTDP Common Guidelines to its own missions. • In implementing essential enhancements in its facilities focusing on data preservation and enhancement of data access. • In preparing an LTDP programme proposal for the period beyond 2012 to be presented to its Delegations for approval. • ESA plans to extend the LTDP Initiative to the wider Earth Science domain, enlarging the current Earth Observation scenario with key stakeholders, to guarantee preservation, access and exploitation of Earth Science data and information for the benefit of European users and citizens.

  34. Users’ view: major elements • Major generic considerations resulting from several user surveys (FIRST, PARSE, ESA User Service, etc.): • Preservation is a common and widely recognized need to enable present and future Earth Science research and application activities • “Preserve all forever” is the mantra for the users • A user-centered vision should be applied rather than a provider-centered one • Preservation and future re-use of data pave the way to further knowledge growth • Harmonization among different actors and providers should be a primary objective • Access to data and services should be simple and free of charge • Security issues are a matter for providers/owners and should impose minimal effort on the user side • Needs for time-series durations/windows differ between applications and uses • Periodic reprocessing of data is a necessity

  35. Infrastructure availability • An infrastructure for Earth Science data preservation is needed • An infrastructure for data preservation and access would positively impact daily work: • it would solve some of the identified concerns and barriers • it could improve data preservation, access and data distribution systems • it would allow trying new data assimilation methods and applying models • it would allow the virtualisation of data location and maintenance • it would allow taking advantage of computational resources not accessible otherwise • APA could play a fundamental role in building up the infrastructure: generation of a generic scalable model?

  36. The operational challenge: from simple data repository to an operational set-up • The implementation of a harmonized infrastructure will progressively lead, in the long term, to an “operational” Earth Science data preservation & access system that requires at least: • Clear mandates for all partners • Coordinated approach between all partners and adoption of common policies / guidelines • Commonality and harmonization of services • Standardization at different levels • Adequate systems infrastructure implementation, availability and maintenance • Sharing of interoperable resources, networking, computing power • Long term sustainability

  37. END Thank you
