
EUDAT Towards a European Collaborative Data Infrastructure

Damien Lecarpentier – CSC, IT Center for Science, Finland

ISC’11, Hamburg, 20 June 2011


Outline of the talk

  • EUDAT concept
  • EUDAT consortium
  • EUDAT service approach
  • Expected benefits and challenges of a CDI
EUDAT Key facts and objectives
  • Initiative funded through FP7 e-Infrastructure Call 9 (WP11): INFRA-2011-1.2.2: Data infrastructure for e-Science (November 2010)
  • Call 9 Objective: “Establish a persistent and robust service infrastructure for scientific data in Europe that responds to the need of data-intensive Science of 2020”
  • Budget 43 M€
  • EUDAT selected for funding (three-year project)
    • Official starting date: 1st October 2011
    • Biggest budget of the call: 9.3 M€ EC Grant
    • Total Budget: 16.3 M€
  • Consortium
    • 23 partners representing 13 countries
    • 15 user communities from a wide range of disciplines (Biomed, Earth Science, Climate, SSH, etc.)
  • Targets
  • EUDAT objective: “To deliver a Collaborative Data Infrastructure (CDI) with the capacity and capability for meeting researchers’ needs in a flexible and sustainable way, across geographical and disciplinary boundaries.”
  • The infrastructure must be Collaborative
  • The infrastructure must be driven by researchers’ needs
  • The infrastructure must be sustainable yet flexible
  • The infrastructure must be pan-European
  • The infrastructure must be multi-disciplinary

The current data infrastructure landscape: challenges and opportunities

  • Long history of data management in Europe: several existing data infrastructures dealing with established and growing user communities (e.g., ESO, ESA, EBI, CERN)
  • New Research Infrastructures are emerging and are also trying to build data infrastructure solutions to meet their needs (CLARIN, EPOS, ELIXIR, ESS, etc.)
  • A large number of projects providing excellent data services (EURO-VO, GENESI-DR, Geo-Seas, HELIO, IMPACT, METAFOR, PESI, SEALS, etc.)
  • However, most of these infrastructures and initiatives address primarily the needs of a specific discipline and user community
  • Challenges
  • Compatibility, interoperability, and cross-disciplinary research
  • Data growth in volume and complexity (the so-called “data tsunami”)
    • strong impact on costs threatening the sustainability of the infrastructure
  • Opportunities
  • Potential synergies do exist: although disciplines have different ambitions, they have common basic needs and requirements that could be matched with generic pan-European services supporting multiple communities and ensuring greater interoperability.
  • Strategy needed at pan-European level
Towards a Collaborative Data Infrastructure

Source: HLEG report, p. 31

  • EUDAT will focus on building this generic data infrastructure layer and offer a trusted domain for long-term data preservation, accompanied by related services to store, identify, authenticate and mine these data.
  • This needs to be done in close collaboration with the communities
    • Core services must match the requirements of the communities
    • Community services can also be incorporated into the common data service infrastructure when they are of use to other communities.

The EUDAT Communities (by field)

  • EUDAT targets all scientific disciplines (discipline neutral):
    • To capture and identify cross-discipline requirements
    • To involve the scientists of all the communities in shaping the infrastructure and its services

EUDAT Services Activities – Iterative Design

  • EUDAT’s Services activity is concerned with identification of the types of data services needed by the European research communities, delivering them through a federated data infrastructure and supporting their users
  • 1. Capturing communities' requirements (WP4)
    • Services to be deployed must be based on user communities' needs
    • Strong engagement and collaboration with user communities (EUDAT communities and beyond) to capture requirements
  • 2. Building the services (WP5)
    • User requirements must be matched with available technologies
    • Need to identify:
      • available technologies and tools to develop the required services (technology appraisal)
      • gaps and market failures that should be addressed by EUDAT research activities
    • Services must be designed, built and tested in a pre-production test bed environment and made available to WP4 for evaluation by their users
  • 3. Deploying the services and operating the federated infrastructure (WP6)
    • Services must be deployed on the EUDAT infrastructure and made available to users, with interfaces for cross-site, cross-community operation
    • Reliability, 24h/7d availability and accessibility of the shared services, with operational security, data integrity and compliance with stakeholder requirements and policies.
EUDAT core services

Core services are the building blocks of EUDAT's Common Data Infrastructure, mainly included in the bottom layer of data services.

  • Fundamental Core Services
    • Long-term preservation
    • Persistent identifier service
    • Data access and upload
    • Workspaces
    • Web execution and workflow services
    • Single Sign On (federated AAI)
    • Monitoring and accounting services
    • Network services
  • Extended Core Services (community-supported)
    • Joint metadata service
    • Joint data mining service

There is no need to match the needs of all communities at the same time; addressing a group of communities can be very valuable, too.


Service Model Approach and Generic Collaboration

Generic Service Model

  • Fundamental Core Services meet strongly overlapping service requirements
  • Extended Core Services are mainly community-supported; community requirements typically overlap between some disciplines

Collaboration between Teams

  • Fundamental Core Services are operated and supported by an Operations Team which collaborates across the participating centres.
  • Extended Core Services and other joint multi-disciplinary services must be community-supported; the requirements overlap between a specific subset of disciplines

EUDAT Timeline

[Timeline figure: 1st, 2nd, 3rd and 4th User Forums; first services available; service deployment]

Expected benefits of a Collaborative Data Infrastructure

  • Enabling multi-disciplinary data intensive research and collaboration
    • Development of common services supporting research communities
      • Support to existing scientific communities’ infrastructures
      • Support to smaller communities through access to sophisticated services
    • Inter-disciplinary collaboration and exploitation of synergies between communities
      • Communities from different disciplines working together to build services
      • Data sharing between disciplines
    • Collaboration with other large-scale infrastructure
      • European e-Infrastructures: Géant, PRACE, EGI, etc.
      • Global initiatives in the US, Japan, Australia, etc.
  • Ensuring wide access to and preservation of data in a sustainable way
    • A robust generic infrastructure capable of handling the scale and complexity of data that will be generated over the next 10-20 years
      • Greater access to existing data and better management of data for the future
      • Increased security by managing multiple copies in geographically distant locations
    • Put Europe in a competitive position for important data repositories of world-wide relevance
  • Economies of scale and cost-efficiency
    • Shared resources and work are less costly
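
The benefit listed above of keeping multiple copies in geographically distant locations only pays off if the copies can be checked against each other. The following is a minimal Python sketch of such a check, assuming SHA-256 checksums; the site names and the `find_corrupt_replicas` helper are hypothetical illustrations, not part of any EUDAT service described in this talk.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def find_corrupt_replicas(replicas: dict[str, bytes], expected_checksum: str) -> list[str]:
    """Return the (sorted) names of sites whose copy does not match the checksum.

    `replicas` maps a hypothetical site name to the bytes stored there.
    """
    return sorted(
        site
        for site, data in replicas.items()
        if sha256_hex(data) != expected_checksum
    )


# Checksum recorded when the data set was first deposited.
original = b"observation,value\n1,42\n"
checksum = sha256_hex(original)

# Three copies at hypothetical sites; one has been damaged.
replicas = {
    "site-a": original,
    "site-b": original,
    "site-c": b"observation,value\n1,4\n",  # simulated corruption
}

corrupt = find_corrupt_replicas(replicas, checksum)
print(corrupt)  # ['site-c']
```

A damaged copy found this way could be restored from any site whose checksum still matches, which is what makes geographically separate replicas useful for preservation.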

Challenges and Opportunities

  • Delivering high-level multi-disciplinary data services
    • Achieving a high level of interoperability in the context of diversity of data, research disciplines and practices
      • Need to strongly involve the different communities in the design and evaluation of services
      • EUDAT as a platform to discuss interoperability issues (along with other initiatives, e.g., DAITF)
  • Building trust among stakeholders
    • Trust between service providers and users but also between the researchers and disciplines themselves
    • Trust in the EUDAT infrastructure, the data deposited and collected, data integrity
  • Ensuring the sustainability of the infrastructure
    • Providing a framework and a plan to ensure the continuity of services beyond the immediate funding window, through the setting up of a sustainable entity
      • Funding and business models
      • Partnerships (new communities, industry, etc.) and governance models

The beginning of a long journey…

“Do the difficult things while they are easy and do the great things while they are small. A journey of a thousand miles must begin with a single step.”

Lao Tzu


How to get in touch with EUDAT?

Kimmo Koski, CSC - IT Center for Science

EUDAT Project Coordinator

Peter Wittenburg, Max Planck Institute for Psycholinguistics at Nijmegen (MPI-PL)

EUDAT Scientific Coordinator

Damien Lecarpentier, CSC - IT Center for Science

EUDAT Project Manager

  • EUDAT@ISC’11
    • BoF session on “e-Infrastructure for science in Europe”, on Tuesday 21 June, 14:30-15:15, Hall B
    • Partners’ booths at ISC:
      • CSC #146
      • BSC #114
      • DKRZ #140
      • EPCC #152