V gisc simdat project a virtual gisc
This presentation is the property of its rightful owner.
Sponsored Links
1 / 24

V-GISC/SIMDAT project – a Virtual GISC PowerPoint PPT Presentation


  • 80 Views
  • Uploaded on
  • Presentation posted in: General

V-GISC/SIMDAT project – a Virtual GISC. Alfred Hofstadler, Matteo Dell’Acqua ECMWF. Project History:. May 2002: Thirteenth session WMO Regional Association VI: “…agreed that the concept of a Virtual GISC had merit…” June 2002: V-GISC in RA-VI Kick-off Meeting

Download Presentation

V-GISC/SIMDAT project – a Virtual GISC

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


V gisc simdat project a virtual gisc

V-GISC/SIMDAT project – a Virtual GISC

Alfred Hofstadler, Matteo Dell’Acqua

ECMWF


Project history

Project History:

  • May 2002: Thirteenth session WMO Regional Association VI: “…agreed that the concept of a Virtual GISC had merit…”

  • June 2002: V-GISC in RA-VI Kick-off Meeting

    • Partners: DWD, Meteo France, UK Met-Office, EUMETSAT, ECMWF

    • Steering Group + 4 working groups: Policy, Data, Communications, Dissemination/Acquisition

  • 2003: SIMDAT project proposal submitted to EU

  • 1 September 2004: contract with EU is signed

  • October 2004: V-GISC steering group decides to move V-GISC development into the SIMDAT project

  • November 2004: SIMDAT Kick-off meeting

    • 4 V-GISC working groups are mapped onto SIMDAT working groups: Virtual Organisation, Ontologies, GRID Infrastructure, Access to Distributed Data

  • February 2005: First (V-)GISC-demonstrator at CBS


Simdat introduction

SIMDAT - Introduction

  • Data Grids for Process and Product Development using Numerical Simulation and Knowledge Discovery

  • 4 years project funded by the EU

    • Contract with EU was signed on 1 September 2004

  • SIMDAT focuses on 4 applications

    • Product design in automotive and aerospace

    • Process design in pharmacology

    • Service provision in meteorology

  • Objective of SIMDAT is to use data grid technology to resolve a complex problem for each of the 4 applications

  • Budget of 11 M € of which 10.5% for meteorological activity

  • 320 men/month taking into account EU funding and the contribution from the partners


Simdat strategy

SIMDAT - Strategy

  • 7 Grid-technology areas have been identified to achieving SIMDAT objectives

  • Integrated Grid infrastructure offering basic services to applications

  • Access to data distributed on Grid sites

  • Management of Virtual Organisation

  • Ontologies

  • Integration of analysis services

  • Workflows

  • Knowledge Services

Phase 1: Connectivity

Phase 2: Interoperability

Phase 3: Knowledge

.Virtual Data Repository

. Introduction of grid technologies research

.Deployment of Grid infrastructure with particular attention to data transport and management

. Distributed DB access

Workflows for next-generation aggregated knowledge capture, discovery and mining


Meteorology application project aims

Meteorology application : Project Aims

  • 5 partners: DWD, Meteo-France, UK Met Office, EUMETSAT and ECMWF

    • 3 “potential” GISCs : DWD, Meteo-France, UK Met Office

    • 2 DCPCs : ECMWF, EUMETSAT

  • Instead of each National Met Service having a GISC (Global Information System Centre)

  • The V-GISC will be seen as a normal GISC and will fulfil the WMO Information System technical requirements

  • The project will build the foundations of the V-GISC by developing an infrastructure that brings together the data of the partners and provides access to the distributed meteorological databases


Meteorology application project aims1

Meteorology application : Project Aims

  • A complex problem: To build a Virtual GISC, an integrated and scalable framework for the collection and sharing of distributed data that will offer:

    • A single view of meteorological information which is distributed amongst the 5 partners

    • Improve visibility and access to meteorological data through a comprehensive discovery service based on metadata development

    • Offer a variety of reliable reliable delivery services (routine dissemination of and collection of data)

    • Provide a global access control policy managed by the partners and integrated into their existing security infrastructure

    • Quality of services, reliability and security

    • Processing services and shared data manipulation facilities

  • The software developed within the project will be made available to WMO


Grid technology

GRID Technology

  • Grid technology will be used

    • To connect the diverse data sources and create a Virtual Database

    • To enable flexible, secure collaboration through virtual organisation

  • Data Grid technology presents an architectural framework that aims to provide access to distributed data in a simple,secure, reliable and scalable manner from a widely distributed set of computers and across various administrative boundaries

  • The essential characteristics of a Data Grid are:

    • Reference a dataset by a unique identifier

    • Discover dataset by attributes

    • Track multiple copies of a single file, and ultimately locate the "nearest" copy

    • Move files from one point on the grid to another point (push, pull and third party copy)

  • The domain of the V-GISC is an ideal candidate to exploit such a framework


V gisc infrastructure

Grid infrastructure for sharing data

Monitoring

Logging

Control

Error tracking

Security

AuthenticationAuthorization

Audit

Interoperability interfaces for data/metadata exchange

Management

User registration

DB admin

Catalogue admin

Interface to offer a single view of the data

- Discovery facilities

- Request/Subscription

Dissemination/acquisition mechanisms

mechanisms to synchronise metadata

V-GISC infrastructure


Meteo requirements

Meteo requirements


V gisc conceptual view

V-GISC Conceptual view

  • Through the Distributed Portal users searches for and retrieves data, subscribe to services subject to authentication and authorization

  • The Virtual Database Service provides a single view of partners databases


V gisc conceptual view1

V-GISC Conceptual view

  • Virtual Database

    • Provide the unified view of all the shared datasets through a distributed catalogue

    • Maintain the distributed catalogue amongst the partners using synchronization mechanisms

    • Provide interfaces with the legacy databases

    • Implement data replication mechanisms

    • Preserve the integrity of the data

  • Access Facilities

    • Collection & Dissemination services that support secure, efficient and reliable transport mechanisms

    • Quality of Service (QoS): Traffic Prioritization, Queuing mechanisms, Scheduling

    • Discovery service by browsing the catalogue or using a keyword search engine

    • Interactive and batch interfaces

  • VO

    • Security Services (CA, AuthN, AuthZ, Audit,…)

    • Users management

    • Data policy management

    • Monitoring and control


V gisc distributed architecture

V-GISC Distributed Architecture

  • V-GISC node is installed on each partner site

    • All the nodes are interconnected through a dedicated secure communication channel; The Database Communication Layer (DCL)

    • All the nodes exchange messages through the DCL

  • The architecture is decentralized

    • No central point where all the nodes are declared

    • No single point of failure

  • The network of nodes is self-organized

    • The network dynamically accepts new nodes and is aware of node disconnections

    • The network organizes its topology and indicates to the entering new nodes their position within the network

    • No manual intervention on the nodes to accepts new peers


V gisc distributed architecture1

V-GISC Distributed Architecture


V gisc node

V-GISC Node

  • Each node maintains a copy of the global catalogue describing data available through the V-GISC

    • The catalogue synchronization is done using the DCL

  • Each node maintains a cache used to replicate data and to efficiently serve the users

  • A node is interfaced with the local legacy databases

  • A node has a Web Portal for interactive access

  • A node has a Grid/Web Service Portal for batch access and integration of the V-GISC in a bigger Grid

  • A node implement all services offered by the V-GISC


V gisc node functional design

V-GISC Node - Functional Design


Demonstrator functional view

Demonstrator – Functional View

  • To deploy a flexible infrastructure on top of which the Virtual Information Centre can be built

  • To use Grid technologies to federate databases located on partners site

  • To show to the user a unique view of data sets stored by at least 3 partners

  • To get a first implementation of the catalogue based on WMO core metadata

  • To offer first VO security services


Demonstrator design

Demonstrator - Design

  • 3 main components to build the virtual database: Data Repository, Catalogue Node and Portal

    • installed on each partner site and interconnected through a dedicated secure connection channel

  • Data Repository

    • Interface to the partners databases

    • Offers metadata information to describe, search, locate data

    • Offers interface to retrieve data from the associated local databases

  • Catalogue Node

    • Maintains the catalogue and ensures synchronisation

    • Harvests metadata and requests data from the data Repository

    • Ingests data and maintains the cache of the V-GISC

    • Serves clients: Portal or other Nodes

    • Monitors the execution of the requests

  • Distributed Portal

    • Offers interface to search/browse the V-GISC catalogue


Demonstrator architectural choices

Demonstrator - Architectural Choices

  • Grid Architecture that can accept any kind of Grid Technology

    • Free to choose any grid middleware (OGSA-DAI, GRIA, Glite, GT4) and pick the best component of each middleware that meets the V-GISC requirement

  • Catalogue Node built on a J2EE component framework

    • Solid framework used in production environment

    • Includes different services such as persistency, monitoring, configuration, etc

    • The framework can be seen as a kernel of components where it is easy to add services such as Grid services or Web services

  • Catalogue duplicated and synchronized on each site

    • To have a fast discovery (browse & search phase) phase

    • To have a reliable system (client redirection to another node in case of problems)


Demonstrator architecture

Demonstrator - Architecture


Demonstrator deployment

Demonstrator - Deployment


Problems and lessons learned 1

Problems and lessons learned - 1

  • Grid Middleware

    • Technology not mature for production environment

    • Middleware still evolving toward standards (WSRF, WSI, …)

  • Access to distributed data

    • No efficient and robust transport mechanism

    • No mechanism to duplicate and synchronize data

    • Difficult to ensure data integrity on huge data volumes

    • OGSA-DAI is promising, easy to understand and use


Problems and lessons learned 2

Problems and lessons learned - 2

  • Ontology / Metadata

    • Meteorological metadata are described using XML WMO-CORE metadata Profile

      • Metadata description larger than the data

      • Same information repeated in all metadata records  Unnecessary information is circulating over the network

      • Large metadata records slowing down the Database hosting the catalogue

    • Universal request language was not a solution to the virtual database problem

  • VO

    • No standard tools to manage users and data policies

    • No standard security policies


What s next

What’s next

  • Finalise the Connectivity phase (by M18/Mar 2006)

    • Connect EUMETSAT to the Grid (M12-M15/Sep-Dec 2005)

    • Enhance the architecture (M13-M18/Oct 2005-Mar 2006)

    • Implement Registration Authority (M16-M17/Jan-Feb 2006)

    • Improve metadata model (M13-M16/Oct 2005-Jan 2006)

    • Enhance distributed portal (M14-M16/Nov 2005-Jan 2006)

  • Introduce acquisition of data (M18-M24/Mar-Sep 2006)

  • Develop subscription service (M20-M28/May 2006-Jan 2007)

  • Start developing the Virtual Organisation

    • Monitoring and management of the system (M18-M24/Mar-Sep 2006)

    • User management and data access control (M24-M30/Sep 2006-Mar 2007)

  • Develop the discovery mechanism (M20-M25/May-Oct 2006)

  • Start testing with other potential GISC

    • Japan and Australia have expressed interest in joining the SIMDAT project


Global view coordination effort

Global View : Coordination Effort

  • Metadata

  • Request-reply mechanism

  • Exchange of catalogues

  • Definition on what data should be available and to whom

  • Virtual Organisation

  • Standardisation of services

  • Quality of Service

  • Security


  • Login