E-Science and Grid

e-Science and Grid: The VL-e approach



e-Science and Grid The VL-e approach. L.O. (Bob) Hertzberger Computer Architecture and Parallel Systems Group Department of Computer Science Universiteit van Amsterdam [email protected] Background information experimental sciences. Experiments become increasingly more complex


Presentation Transcript


e-Science and Grid: The VL-e approach

L.O. (Bob) Hertzberger

Computer Architecture and Parallel Systems Group, Department of Computer Science, Universiteit van Amsterdam

[email protected]


Background information: experimental sciences

  • Experiments are becoming increasingly complex

    • Driven by detector developments

      • Resolution increases

      • Automation & robotization increases

  • This results in an increase in the amount and complexity of data

  • Something has to be done to harness this development

    • Virtualization of experimental resources: e-Science


The application data crisis

  • Scientific experiments start to generate lots of data

    • Medical imaging (fMRI): ~1 GByte per measurement (day)

    • Bio-informatics queries: 500 GByte per database

    • Satellite world imagery: ~ 5 TByte/year

    • Current particle physics: 1 PByte per year

    • LHC physics (2007): 10-30 PByte per year

  • Data is often very distributed
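To put the volumes listed above on a common scale, they can be converted into average sustained data rates. The following is a minimal sketch, assuming decimal units (1 GByte = 10^9 bytes), a 365-day year and evenly spread data taking; none of these assumptions come from the slides, and the function name is illustrative.

```python
# Rough conversion of the quoted data volumes into average sustained rates.
# Assumptions (not from the slides): decimal units (1 GByte = 10**9 bytes)
# and a 365-day year with continuous, evenly spread data taking.

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

GBYTE = 10**9
TBYTE = 10**12
PBYTE = 10**15

# (label, bytes produced, period in seconds)
SOURCES = [
    ("fMRI, 1 GByte per day", 1 * GBYTE, SECONDS_PER_DAY),
    ("Satellite imagery, 5 TByte per year", 5 * TBYTE, SECONDS_PER_YEAR),
    ("Particle physics, 1 PByte per year", 1 * PBYTE, SECONDS_PER_YEAR),
    ("LHC, 10 PByte per year (lower bound)", 10 * PBYTE, SECONDS_PER_YEAR),
    ("LHC, 30 PByte per year (upper bound)", 30 * PBYTE, SECONDS_PER_YEAR),
]


def average_rate_mbyte_per_s(volume_bytes: int, period_s: int) -> float:
    """Average rate in MByte/s needed to move volume_bytes within period_s."""
    return volume_bytes / period_s / 10**6


if __name__ == "__main__":
    for label, volume, period in SOURCES:
        rate = average_rate_mbyte_per_s(volume, period)
        print(f"{label}: {rate:.3f} MByte/s")
```

Under these assumptions 1 PByte per year corresponds to roughly 32 MByte/s around the clock, which is one way to see why distribution and replication of data become an issue in their own right.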


Paradigm shift in life science

  • Past experiments were hypothesis-driven

    • Evaluate hypothesis

    • Complement existing knowledge

  • Present experiments are data-driven

    • Discover knowledge from large amounts of data

      • Apply statistical techniques



The what of e-Science

  • e-Science is the application domain “Science” of Grid & Web

    • More than only coping with the data explosion

    • A multi-disciplinary activity combining human expertise & knowledge between:

      • A particular domain scientist

      • An ICT scientist

  • e-Science demands a different approach to experimentation because the computer is an integrated part of the experiment

    • The consequence is a radical change in design for experimentation

  • e-Science should apply and integrate Web/Grid methods where and whenever possible


    Grid and Web Services Convergence

    [Figure: convergence of Grid and Web services. Grid track: GT1, GT2, OGSI, WSRF. Web track: HTTP, WSDL, WS-*, WSDL 2, WSDM. Annotations: "Started far apart in apps & tech"; "Have been converging".]

    The definition of the Web Services Resource Framework (WSRF) makes an explicit distinction between the “service” and the stateful entities the service acts upon, i.e. the “resources”

    This means that the Grid and Web communities can move forward on a common base

    Ref: Foster
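The distinction between a service and the stateful resources it acts upon can be illustrated with a small sketch. This is not the WSRF API: `CounterService`, `CounterResource` and the key scheme are hypothetical names chosen only to show the pattern of a service whose every call names the resource it should operate on.

```python
# Illustrative only: this is NOT the WSRF API. CounterService, CounterResource
# and the resource keys are hypothetical names chosen for this sketch.
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass
class CounterResource:
    """A stateful entity: the state lives here, not in a client session."""
    value: int = 0


class CounterService:
    """A service whose operations act on a resource identified by a key.

    The service keeps a table of resources but no per-client conversation
    state; each call names the resource it should act upon.
    """

    def __init__(self) -> None:
        self._resources: Dict[str, CounterResource] = {}

    def create(self) -> str:
        """Create a new resource and return the key that identifies it."""
        key = str(uuid.uuid4())
        self._resources[key] = CounterResource()
        return key

    def add(self, key: str, amount: int) -> int:
        """Act upon the resource identified by key; return its new value."""
        resource = self._resources[key]
        resource.value += amount
        return resource.value

    def destroy(self, key: str) -> None:
        """End the lifetime of a resource explicitly."""
        del self._resources[key]
```

Two clients holding different keys see independent state, while the service interface itself stays the same for both.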



    Grid service ‘offerings’

    • Capability to run programs and scripts on remote sites on demand

    • Ability to exchange and replicate large bulk-data sets

    • Replica location services for files based on logical names

    • Job monitoring using a distributed relational information system

    • Resource brokering and transparent access to remote facilities

    • Management of user groups, roles and access rights
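As an illustration of replica location by logical name, the sketch below maps one logical file name to several physical copies. It is a hypothetical in-memory catalogue, not the interface of any real grid replica service; the `lfn:` names, the `gsiftp://` URLs on example.org hosts and the substring-based site preference are all assumptions made for this example.

```python
# Hypothetical in-memory replica catalogue; real grid replica location
# services have their own APIs, which are not modelled here.
from collections import defaultdict
from typing import Dict, List, Optional


class ReplicaCatalogue:
    """Maps a logical file name to the physical locations of its copies."""

    def __init__(self) -> None:
        self._replicas: Dict[str, List[str]] = defaultdict(list)

    def register(self, logical_name: str, physical_url: str) -> None:
        """Record that a copy of logical_name exists at physical_url."""
        if physical_url not in self._replicas[logical_name]:
            self._replicas[logical_name].append(physical_url)

    def lookup(self, logical_name: str) -> List[str]:
        """Return all known physical locations (empty list if unknown)."""
        return list(self._replicas.get(logical_name, []))

    def pick(self, logical_name: str, preferred_site: str) -> Optional[str]:
        """Prefer a replica whose URL mentions preferred_site (crude match),
        otherwise fall back to the first registered replica."""
        replicas = self.lookup(logical_name)
        for url in replicas:
            if preferred_site in url:
                return url
        return replicas[0] if replicas else None


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    name = "lfn:/vle/demo/run42.dat"
    catalogue.register(name, "gsiftp://se.site-a.example.org/data/run42.dat")
    catalogue.register(name, "gsiftp://se.site-b.example.org/data/run42.dat")
    print(catalogue.pick(name, "site-b"))
```

The point of the indirection is that applications refer only to the logical name, so copies can be added or moved without changing the application.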



    Relation to European Grid infrastructures

    • Common European e-Infrastructure middleware (EGEE) for core grid services

    • Based on successful EU DataGrid, CrossGrid, and LCG software suite

    • Already deployed worldwide on an O(100)-site production facility

    • Support through EGEE Regional Operations Centre (SARA and NIKHEF)

      EGEE: Enabling Grids for E-science in Europe (EU FP6)



    Levels of Grid abstraction

    Semantic/Knowledge Web/Grid

    Information Web/Grid

    Data Grid

    Computational Grid



    e-Science Objectives

    • It should enhance the scientific process by:

      • Stimulating collaboration by sharing data & information

        • Improve re-use of data & information

      • Combining data and information from different modalities

        • Sensor data & information fusion

      • Realizing the combination of real-life & (model-based) simulation experiments

    • It should result in:

      • Computer-aided support for rapid prototyping of ideas

        • Stimulate the creativity process

    • It should realize that by creating & applying:

      • New computing methodologies and an infrastructure stimulating this

    • We try to do this via the Virtual Lab for e-Science (VL-e) project



    Virtual Lab for e-Science research Philosophy

    • Multidisciplinary research & development of related ICT infrastructure

    • Generic application support

      • Application cases are drivers for computer & computational science and engineering research


    VL-e project

    [Figure: the VL-e project. Application domains: Data Intensive Science/HEP, Bio-Informatics, Medical Diagnosis & Imaging, Bio-Diversity, Food Informatics, Dutch Telescience. Other labels: VL-e Application Oriented Services; Management of comm. & computing; Grid/Web Services; Harness multi-domain distributed resources.]



    Virtual Lab for e-Science research Philosophy

    • Multidisciplinary research and development of related ICT infrastructure

    • Generic application support

      • Application cases are drivers for computer & computational science and engineering research

      • Problem solving partly generic and partly specific

      • Re-use of components via generic solutions whenever possible


    Application pull

    [Figure: application pull. Three applications, each with an "Application Specific Part" and a "Potential Generic part". Other labels: Virtual Laboratory; Application Oriented Services; Management of comm. & computing; Grid/Web Services; Harness multi-domain distributed resources.]



    Generic e-Science aspects

    • Virtual Reality Visualization & user interfaces

    • Imaging

    • Modeling & Simulation

      • Interactive Problem Solving

    • Data & information management

      • Data modeling

      • Dynamic workflow management

    • Content (knowledge) management

      • Semantic aspects

      • Metadata modeling

        • Ontologies

    • Wrapper technology

    • Design for Experimentation



    Virtual Lab for e-Science research Philosophy

    • Multidisciplinary research and development of related ICT infrastructure

    • Generic application support

      • Application cases are drivers for computer & computational science and engineering research

      • Problem solving partly generic and partly specific

      • Re-use of components via generic solutions whenever possible

    • Rationalization of the experimental process, including the experimental pipeline

      • Reproducible & comparable


    Issues for a reproducible scientific experiment

    [Figure: the experimental pipeline and its metadata. Stages and data products: experiment, acquisition, raw data, processing, processed data, presentation, interpretation.]

    • Acquisition: sensors, amplifiers, imaging devices, …

    • Processing: conversion, filtering, analyses, simulation, …

    • Presentation: visualization, animation, interactive exploration, …

    • Metadata: parameter settings, calibrations, protocols; parameters/settings, algorithms, intermediate results, software packages

    Rationalization of the experiment and processes via protocols

    Much of this is lost when an experiment is completed.
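One way to keep this information from being lost is to store a small metadata record next to the data. The sketch below is only an illustration under assumptions: `ExperimentRecord` and its field names are hypothetical and are not part of VL-e; they simply mirror the metadata categories named on the slide (protocol, parameter settings, calibrations, software packages, processing steps) and add a checksum of the raw data.

```python
# Hypothetical metadata record for one experiment run. The class and field
# names are assumptions for illustration; they are not a VL-e data model.
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ExperimentRecord:
    protocol: str                     # name/version of the protocol followed
    parameters: Dict[str, float]      # acquisition and processing settings
    calibrations: Dict[str, float]    # calibration constants in effect
    software: Dict[str, str]          # package name -> version used
    processing_steps: List[str] = field(default_factory=list)
    raw_data_sha256: str = ""         # fingerprint tying the record to data

    def attach_raw_data(self, raw: bytes) -> None:
        """Store a SHA-256 checksum of the raw data, not the data itself."""
        self.raw_data_sha256 = hashlib.sha256(raw).hexdigest()

    def to_json(self) -> str:
        """Serialize with sorted keys so that records are easy to compare."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)


if __name__ == "__main__":
    record = ExperimentRecord(
        protocol="demo-protocol-v1",
        parameters={"gain": 2.0},
        calibrations={"offset": 0.1},
        software={"filter-tool": "1.0"},
    )
    record.processing_steps.append("conversion")
    record.attach_raw_data(b"example raw bytes")
    print(record.to_json())
```

Two runs can then be compared by comparing their serialized records, which is a precondition for calling results reproducible and comparable.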


    Scientific Workflow Management Systems in an e-Science environment

    [Figure: position of a Scientific Workflow Management System (SWMS). Labels: Domain specific Applications; SWMS; High level workflow services; User support; Engine; Knowledge; Information; e-Science framework; Computing tasks; Data management; Generic Grid middleware; Grid infrastructure.]

    • Functionalities:

      • Automating experiment routines;

      • Rapid prototyping of experimental computing systems;

      • Hiding integration details between resources;

      • Managing experiment lifecycle;

    • Crossing different layers of middleware to manage:

      • Data;

      • Computing;

      • Information;

      • Knowledge.
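The first functionality, automating experiment routines, can be sketched as a tiny workflow engine that runs each step once all steps it depends on have finished. This is a minimal illustration and not the VL-e workflow system: `run_workflow`, the step names and the toy computations are assumptions; only `graphlib.TopologicalSorter` is a real Python standard-library class (Python 3.9 or later).

```python
# Minimal dependency-ordered workflow runner; an illustration only, not the
# VL-e workflow engine. Step names and computations below are hypothetical.
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List


def run_workflow(
    tasks: Dict[str, Callable[..., Any]],
    dependencies: Dict[str, List[str]],
) -> Dict[str, Any]:
    """Run every task once, after all tasks it depends on have finished.

    dependencies maps a task name to the ordered list of tasks whose
    results are passed to it as positional arguments.
    """
    results: Dict[str, Any] = {}
    # static_order() yields each step only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = [results[dep] for dep in dependencies.get(name, [])]
        results[name] = tasks[name](*inputs)
    return results


# Hypothetical three-step routine: acquire, filter, analyse.
EXAMPLE_TASKS: Dict[str, Callable[..., Any]] = {
    "acquire": lambda: [3.0, 1.0, 2.0],
    "filter": lambda raw: [x for x in raw if x >= 2.0],
    "analyse": lambda kept: sum(kept) / len(kept),
}
EXAMPLE_DEPENDENCIES: Dict[str, List[str]] = {
    "filter": ["acquire"],
    "analyse": ["filter"],
}

if __name__ == "__main__":
    print(run_workflow(EXAMPLE_TASKS, EXAMPLE_DEPENDENCIES))
```

Keeping the step definitions separate from the dependency description is what lets the same routine be rerun, altered or recorded as part of the experiment lifecycle.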



    Virtual Lab for e-Science research Philosophy

    • Multidisciplinary research and development of related ICT infrastructure

    • Generic application support

      • Application cases are drivers for computer & computational science and engineering research

      • Problem solving partly generic and partly specific

      • Re-use of components via generic solutions whenever possible

    • Rationalization of experimental process

      • Reproducible & comparable

    • Two research experimentation environments

      • Proof of concept for application experimentation

      • Rapid prototyping for computer & computational science experimentation


    The VL-e infrastructure

    [Figure: the VL-e infrastructure. Three environments: VL-e Experimental Environment, VL-e Certification Environment, VL-e Proof of Concept Environment. Application specific service: Medical Application, Telescience, Bio ASP Application. Potential Generic service & Virtual Lab. services: Virtual Lab. rapid prototyping (interactive simulation), VL-software, Virtual Laboratory. Grid & Network Services: Additional Grid Services (OGSA services), Grid Middleware, Network Service (lambda networking), Surfnet. Other labels: "Test & Cert." (three times), "Compatibility".]



    Infrastructure for Applications

    • Applications are a driving force of the PoC

    • Experience shows applications value stability

    • Foster two-way interaction to make this happen



    VL-e PoC environment

    • Latest certified stable software environment of core grid and VL-e services

    • Core infrastructure built around clusters and storage at SARA and NIKHEF (‘production’ quality)

      • Good basis for Tier-1

    • Controlled extension to other platforms and distributions

    • On the user end: install the needed servers (user interface systems, storage elements for data disclosure, grid-secured DB access)

    • Focus on stability and scalability



    Hosted services for VL-e

    • Key services and resources are offered centrally for all applications in VL-e

    • Mass data and number crunching on the large resources at SARA

    • Storage for data replication & distribution

    • Persistent ‘strategic’ storage on tape

    • Resource brokers, resource discovery, user group management



    Why such a complex scheme?

    • “software is part of the infrastructure”

    • stability of core software needed to develop the new scientific applications

    • enable distributed systems management (who runs what version when?)

    “the grid is one big error amplifier”

    “computers make mistakes like humans, only much, much faster”


    Building a scalable infrastructure

    With good code, stable releases & support you can build large working systems, useful to science


    Conclusions

    • e-Science is a lot more than trying to cope with the data explosion alone

    • Implementation of e-Science systems requires further rationalization and standardization of the experimentation process

    • e-Science success demands the realization of an environment allowing

      • application driven experimentation &

      • rapid dissemination of feedback on these new methods

    • We try to do that via the development of a Proof of Concept

    • Good basis for HEP Tier-1

