myGrid

Architectural issues in a bioinformatics Grid

http://www.mygrid.org.uk

Luc Moreau, University of Southampton, UK


Overview

  • Bioinformatics background

  • myGrid facts

  • Service oriented architecture

  • Architectural issues

    • Notification service

    • Grid component model

    • Service directory

  • Conclusions


Bioinformatics & Genomics

  • Large amounts of data

  • Highly heterogeneous

    • Data types

    • Data forms

    • Community

  • Highly complex and inter-related

  • Volatile


Bioinformatics Data

  • Descriptive as well as numeric

  • Literature

  • Analogy/knowledge-based

Text Extraction


Bioinformatics Analysis

  • Different algorithms

    • BLAST, FASTA, pSW

  • Different implementations

    • WU-BLAST, NCBI-BLAST

  • Different service providers

    • NCBI, EBI, DDBJ


The Human Genome Project

  • The HGP will make available potentially thousands of targets for

    • Understanding biology & genetics

    • Drug discovery

    • Diagnostics

  • Many genes will be linked with diseases

    • Cancer

    • HIV

    • Parkinson’s

    • Asthma

    • Malaria

    • Autoimmune (arthritis)

    • Cardiovascular

    • Antibacterial & antifungal


Drug Discovery


In silico experimentation

  • Discovery of resources and tools, staging of operations, sharing of results

  • Process is as important as outcome

  • Science is dynamic – change happens

  • Scientific discovery is personal & global

  • Provenance and history


myGrid

  • EPSRC funded pilot project

  • Generic middleware within application setting

  • 36 months within a 42-month performance period

  • Start 1st October

    • 16 full-time post docs altogether

    • 6 DTA studentships

    • 1 technical project manager

    • 1 system manager

    • 1 secretarial post


myGrid consortium

  • Scientific Team

    • Biologists and Bioinformaticians

    • GSK, AZ, Merck KGaA, Manchester, EBI

  • Technical Team

    • Manchester, Southampton, Newcastle, Sheffield, EBI, Nottingham

    • IBM, SUN

    • GeneticXchange

    • Network Inference, Epistemics Ltd


myGrid outcomes

  • e-Scientists

    • Bioinformatics demonstrator (on cold carp)

  • Developers

    • myGrid-in-a-Box developers kit

    • Integrating some existing bioinformatics tools with myGrid


Architectural Issues

  • Notification service


Vision

  • Asynchronous delivery and persistence of messages

  • Topics can be created and discovered on the fly

  • Subscribers can subscribe to topics, publishers can publish messages on a given topic

  • Peer to peer network of notification services

  • Topology can be re-organized to enhance reliability

  • Subscribers and publishers can negotiate over QoS
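
As a rough illustration of the first three points, the following is a minimal in-memory sketch, not the myGrid implementation. The class and method names (`NotificationService`, `subscribe`, `publish`, `collect`) and the topic string are hypothetical. Topics come into existence on first use, and messages are stored per subscriber so that delivery does not require the subscriber to be connected at publication time.

```python
from collections import defaultdict, deque


class NotificationService:
    """Toy topic-based notification service (illustrative only)."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # topic -> subscriber ids
        self.mailboxes = defaultdict(deque)   # subscriber id -> pending messages

    def topics(self):
        """Discover the topics that currently exist."""
        return sorted(self.subscribers)

    def subscribe(self, subscriber_id, topic):
        # Subscribing to an unknown topic creates it on the fly.
        self.subscribers[topic].add(subscriber_id)

    def publish(self, topic, message):
        # Messages are stored per subscriber, so delivery is asynchronous:
        # the subscriber need not be connected when the message is published.
        for subscriber_id in self.subscribers[topic]:
            self.mailboxes[subscriber_id].append((topic, message))

    def collect(self, subscriber_id):
        """Hand over, and clear, everything pending for one subscriber."""
        pending = list(self.mailboxes[subscriber_id])
        self.mailboxes[subscriber_id].clear()
        return pending


service = NotificationService()
service.subscribe("alice", "blast/results")
service.publish("blast/results", "job 17 finished")
service.publish("blast/results", "job 18 finished")
print(service.topics())
print(service.collect("alice"))
```

Persistence here is only an in-process queue; a real service would back the mailboxes with durable storage.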


A notification service instance

[Diagram. Labels: Publisher, Publisher stub, Publisher delegator, Subscriber delegator, Subscriber stub, Subscriber, notifications, QoS]


Federated notification services

[Diagram. Labels: hubs Hub-1 to Hub-3; notification services NS-1-1, NS-1-2, NS-1-3, NS-2-1, NS-2-2, NS-3-1, NS-3-2; publishers P-1-1-1, P-1-1-2, P-1-3-1, P-1-3-2, P-2-2-1, P-2-2-2, P-3-1-1, P-3-2-1; subscribers S-1-1-1, S-2-1-1, S-3-1-1]

  • Strong communication links between hubs

  • Efficient data replication

  • Simple notification routing
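
One way to picture simple routing over hubs is sketched below. It assumes, without confirmation from the slides, that every hub is directly linked to every other hub and that each notification service belongs to exactly one hub. The classes `Hub` and `NotificationService` and the topic name are hypothetical; only the labels such as NS-1-1 and S-2-1-1 are borrowed from the diagram.

```python
class Hub:
    """A hub links its local notification services and peers with other hubs."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.services = []

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def route(self, topic, message, from_peer=False):
        for service in self.services:
            service.deliver(topic, message)
        # Forward once to every peer hub; peers never forward again,
        # so a notification crosses at most one hub-to-hub link.
        if not from_peer:
            for peer in self.peers:
                peer.route(topic, message, from_peer=True)


class NotificationService:
    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.subscribers = {}   # topic -> list of subscriber names
        self.delivered = []     # log of (subscriber, topic, message)
        hub.services.append(self)

    def subscribe(self, subscriber, topic):
        self.subscribers.setdefault(topic, []).append(subscriber)

    def publish(self, topic, message):
        self.hub.route(topic, message)

    def deliver(self, topic, message):
        for subscriber in self.subscribers.get(topic, []):
            self.delivered.append((subscriber, topic, message))


hub1, hub2, hub3 = Hub("Hub-1"), Hub("Hub-2"), Hub("Hub-3")
hub1.connect(hub2)
hub1.connect(hub3)
hub2.connect(hub3)

ns11 = NotificationService("NS-1-1", hub1)
ns21 = NotificationService("NS-2-1", hub2)
ns31 = NotificationService("NS-3-1", hub3)

ns21.subscribe("S-2-1-1", "carp")
ns31.subscribe("S-3-1-1", "carp")
ns11.publish("carp", "new sequence available")
print(ns21.delivered)
print(ns31.delivered)
```

With a fully connected set of hubs, the single forwarding step is enough to reach every notification service without loops; a sparser hub topology would need duplicate detection or a spanning tree.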


QoS Negotiation Protocol


Current status

  • Push and pull messaging

  • Topic, message and publisher filters

  • WSDL interface

  • Workflow interaction

  • Integration with MySQL, OpenJMS, Tomcat and Axis

  • Federated service (undergraduate project)

  • QoS negotiation (PhD work underway)

  • OGSA compliance
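
The difference between push and pull messaging, combined with filtering, can be sketched as follows. This is an illustrative model under assumed names (`Notification`, `Subscription`, `offer`, `pull`), not the service's WSDL interface. A filter is any predicate over the topic, publisher and message body; a subscription with a callback is served by push, and one without is queued until the subscriber pulls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Notification:
    topic: str
    publisher: str
    body: str


@dataclass
class Subscription:
    accepts: Callable[[Notification], bool]                      # the filter
    push_to: Optional[Callable[[Notification], None]] = None     # None means pull
    queue: List[Notification] = field(default_factory=list)

    def offer(self, notification):
        if not self.accepts(notification):
            return
        if self.push_to is not None:
            self.push_to(notification)        # push: deliver immediately
        else:
            self.queue.append(notification)   # pull: hold until asked

    def pull(self):
        pending, self.queue = self.queue, []
        return pending


pushed = []
push_sub = Subscription(
    accepts=lambda n: n.topic == "blast" and n.publisher == "EBI",
    push_to=pushed.append,
)
pull_sub = Subscription(accepts=lambda n: "error" in n.body)

for n in [
    Notification("blast", "EBI", "job 1 done"),
    Notification("blast", "NCBI", "job 2 done"),
    Notification("fasta", "EBI", "error in job 3"),
]:
    push_sub.offer(n)
    pull_sub.offer(n)

print([n.body for n in pushed])
print([n.body for n in pull_sub.pull()])
```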


Experimentation

  • Windows and Unix platforms with Tomcat 4.0.5, Axis beta 3.0, OpenJMS 0.7.2 and MySQL 3.23.51

  • Aggregation test with 500 topics, 2,000 subscribers, 2,000 publishers, 10,000 registered subscriptions and 10,000 notifications

  • 72 hours non-stop subscribing/publishing with the above populations


Architectural Issues

  • Notification service

  • Grid component model


Grid Component Model

The myGrid framework is a component model for flexible, simple and future-proof deployment and use of services on the Grid.


Problems Addressed

  • For service developers and deployers:

    • Ease of development of sophisticated services by separation of concerns and re-use of third party functionality.

    • Consistent distribution of functionality over a set of services, e.g. access control, support for fault-tolerance.

    • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.
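
Separation of concerns through nesting might look like the sketch below, in which the service contains only its own logic and each container adds one concern around it. All names (`Container`, `AccessControl`, `Retry`, `SequenceService`, the `invoke` signature) are hypothetical and chosen for illustration; the actual myGrid container interfaces are not shown in the slides.

```python
class Container:
    """Wraps an inner component and forwards every call to it."""

    def __init__(self, inner):
        self.inner = inner

    def invoke(self, operation, caller, *args):
        return self.inner.invoke(operation, caller, *args)


class AccessControl(Container):
    """Adds access control without touching the wrapped component."""

    def __init__(self, inner, allowed):
        super().__init__(inner)
        self.allowed = set(allowed)

    def invoke(self, operation, caller, *args):
        if caller not in self.allowed:
            raise PermissionError(f"{caller} may not call {operation}")
        return super().invoke(operation, caller, *args)


class Retry(Container):
    """Adds simple fault tolerance by retrying transient failures."""

    def __init__(self, inner, attempts):
        super().__init__(inner)
        self.attempts = max(1, attempts)

    def invoke(self, operation, caller, *args):
        last_error = None
        for _ in range(self.attempts):
            try:
                return super().invoke(operation, caller, *args)
            except ConnectionError as error:
                last_error = error
        raise last_error


class SequenceService:
    """The service proper; fails once to simulate a transient fault."""

    def __init__(self):
        self.failures_left = 1

    def invoke(self, operation, caller, *args):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError("transient failure")
        if operation == "reverse":
            return args[0][::-1]
        raise ValueError(f"unknown operation {operation}")


service = AccessControl(Retry(SequenceService(), attempts=3), allowed={"alice"})
print(service.invoke("reverse", "alice", "GATTACA"))
```

Because every container exposes the same `invoke` method as the component it wraps, the same access-control or retry container can be reused across services, which is the consistency the second bullet asks for.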


Problems Addressed

  • For service clients:

    • Development of service clients that are not limited by the range of standards known at deployment time.

    • Control over how service operations are invoked, so that they can make use of the most suitable protocols supported by a service.

    • Provision of a standard client interface hiding the differences in deployment philosophy that each middleware technology brings.

    • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.
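
The client-side goals can be illustrated with a small sketch in which protocol handling is pluggable. Everything here is an assumption made for the example: the `ServiceClient` class, the dictionary describing a service, and the two stand-in binding functions, which only format a string rather than perform a real call. The client picks the first protocol in its own preference order that the service also offers, and a new binding can be registered after the client has been deployed.

```python
def call_over_soap(endpoint, operation, argument):
    # Stand-in for a real SOAP call.
    return f"SOAP {endpoint} {operation}({argument})"


def call_over_rmi(endpoint, operation, argument):
    # Stand-in for a real RMI call.
    return f"RMI {endpoint} {operation}({argument})"


class ServiceClient:
    """Uniform client interface over interchangeable protocol bindings."""

    def __init__(self, bindings, preferences):
        self.bindings = dict(bindings)        # protocol name -> call function
        self.preferences = list(preferences)  # most suitable first

    def choose_protocol(self, offered):
        for protocol in self.preferences:
            if protocol in offered and protocol in self.bindings:
                return protocol
        raise LookupError("no protocol in common with the service")

    def invoke(self, service, operation, argument):
        protocol = self.choose_protocol(service["protocols"])
        call = self.bindings[protocol]
        return call(service["endpoint"], operation, argument)


blast = {"endpoint": "blast.example.org", "protocols": {"soap", "rmi"}}

client = ServiceClient({"soap": call_over_soap}, preferences=["soap"])
print(client.invoke(blast, "search", "GATTACA"))

# A binding unknown at deployment time is added later and preferred.
client.bindings["rmi"] = call_over_rmi
client.preferences.insert(0, "rmi")
print(client.invoke(blast, "search", "GATTACA"))
```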


Nested Component Model


Framework


Current Status

  • Startpoints for Web Services

  • Deployment within nested containers

  • Facades for exposing EJBs as Web Services

  • Performance tests


Current Implementation


Current Work

  • Automated deployment in nested containers

  • Definition of containers for deployment-time configuration

  • Using containers to provide minimal functionality of OGSA Grid Services

  • Startpoints for EJBs, Grid Services


Experimentation

  • Our experiments have shown that nesting in our containers is not costly compared to method invocation and nested inner classes

  • The cost of calling EJBs via the Web Service façade comes mostly from the use of SOAP, and the consequential requirement for conversion to/from objects


Architectural Issues

  • Notification service

  • Grid component model

  • Service directory


Service Directory Views

  • Multiple service directories will co-exist (IBM, Microsoft, EBI, local institutions)

  • Need to attach metadata to service directory entries

  • Metadata is personal to the scientist: trust, perceived QoS, ontological description

  • Need for a mechanism to allow scientists to add their metadata and to make it available to other users as a “regular service directory”.
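
Since the views were still in the design phase, the following is only a guess at the idea, with hypothetical names throughout (`Directory`, `View`, `annotate`, `find`, and the sample entries). A view sits in front of one or more directories, merges a scientist's personal metadata into the entries it returns, and offers the same `find` operation as a directory, so another user can query it, or build a further view on it, as if it were a regular service directory.

```python
class Directory:
    """A plain service directory: name -> free-text description."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def find(self, keyword):
        return [
            {"name": name, "description": description}
            for name, description in sorted(self.entries.items())
            if keyword.lower() in description.lower()
        ]


class View:
    """Personal metadata layered over directories, queried like a directory."""

    def __init__(self, owner, sources):
        self.owner = owner
        self.sources = list(sources)
        self.annotations = {}   # service name -> metadata dict

    def annotate(self, name, **metadata):
        self.annotations.setdefault(name, {}).update(metadata)

    def find(self, keyword):
        results = []
        for source in self.sources:
            for entry in source.find(keyword):
                merged = dict(entry)
                merged["metadata"] = {
                    **entry.get("metadata", {}),
                    **self.annotations.get(entry["name"], {}),
                }
                results.append(merged)
        return results


public = Directory({
    "ncbi-blast": "BLAST sequence search",
    "wu-blast": "BLAST sequence search, WU implementation",
})

mine = View("alice", [public])
mine.annotate("wu-blast", trust="high", perceived_qos="fast")

# A colleague builds a view on top of the first view.
theirs = View("bob", [mine])
theirs.annotate("ncbi-blast", trust="medium")

for entry in theirs.find("blast"):
    print(entry["name"], entry["metadata"])
```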


Views: status

  • Currently in design phase

    • Use cases in the process of being finalized

    • Preliminary specification of interfaces

    • More work is needed on policy languages

  • Design to be finalized by end of January

  • First prototype of core functionality 4 months later


Conclusions

  • More architectural issues being addressed

    • Security (GSI, RBAC), but where is the community going?

    • Fault tolerance


  • Workflow enactment

    • WSFL compatible enactment engine

    • Support for fault tolerance, checkpointing, migration

    • Editor


Conclusions

  • 4-month development cycle with “integration fest”

  • Our roadmap is based on a layered organisation of functionality


myGrid in Southampton

  • Luc Moreau, Michael Luck, David De Roure

  • Terry Payne, Keith Decker

  • Simon Miles, Juri Papay, John Dickman, Xiaojian Liu, Claudia di Napoli, Vijay Dialani, Richard Lawley


www.mygrid.org.uk
