myGrid - PowerPoint PPT Presentation



myGrid: Architectural issues in a bioinformatics Grid (http://www.mygrid.org.uk). Luc Moreau, University of Southampton, UK. Overview: bioinformatics background; myGrid facts; service-oriented architecture; architectural issues (notification service, Grid component model, service directory).

Presentation Transcript

myGrid

Architectural issues in a bioinformatics Grid

http://www.mygrid.org.uk

Luc Moreau,

University of Southampton, UK


Overview

  • Bioinformatics background

  • myGrid facts

  • Service oriented architecture

  • Architectural issues

    • Notification service

    • Grid component model

    • Service directory

  • Conclusions


Bioinformatics & Genomics

  • Large amounts of data

  • Highly heterogeneous

    • Data types

    • Data forms

    • Community

  • Highly complex and inter-related

  • Volatile


Bioinformatics Data

  • Descriptive as well as numeric

  • Literature

  • Analogy/ knowledge-based

Text Extraction


Bioinformatics Analysis

  • Different algorithms

    • BLAST, FASTA, pSW

  • Different implementations

    • WU-BLAST, NCBI-BLAST

  • Different service providers

    • NCBI, EBI, DDBJ


The Human Genome Project

  • The HGP will make available potentially thousands of targets for:

    • Understanding biology & genetics

    • Drug discovery

    • Diagnostics

  • Many genes will be linked with diseases:

    • Cancer

    • HIV

    • Parkinson’s

    • Asthma

    • Malaria

    • Autoimmune (arthritis)

    • Cardiovascular

    • Antibacterial & antifungal



In silico experimentation

  • Discovery of resources and tools, staging of operations, sharing of results

  • Process is as important as outcome

  • Science is dynamic – change happens

  • Scientific discovery is personal & global

  • Provenance and history


Overview

  • Bioinformatics background

  • myGrid facts

  • Service oriented architecture

  • Architectural issues

    • Notification service

    • Grid component model

    • Service directory

  • Conclusions


myGrid

  • EPSRC funded pilot project

  • Generic middleware within application setting

  • 36 months within a 42-month performance period

  • Start 1st October

    • 16 full-time post docs altogether

    • 6 DTA studentships

    • 1 technical project manager

    • 1 system manager

    • 1 secretarial post


myGrid consortium

  • Scientific Team

    • Biologists and Bioinformaticians

    • GSK, AZ, Merck KGaA, Manchester, EBI

  • Technical Team

    • Manchester, Southampton, Newcastle, Sheffield, EBI, Nottingham

    • IBM, SUN

    • GeneticXchange

    • Network Inference, Epistemics Ltd


myGrid outcomes

  • e-Scientists

    • Bioinformatics demonstrator (on cold carp)

  • Developers

    • myGrid-in-a-Box developers' kit

    • Integrating some existing bioinformatics tools with myGrid


Overview

  • Bioinformatics background

  • myGrid facts

  • Service oriented architecture

  • Architectural issues

    • Notification service

    • Grid component model

    • Service directory

  • Conclusions



Architectural Issues

  • Notification service


Vision

  • Asynchronous delivery and persistence of messages

  • Topics can be created and discovered on the fly

  • Subscribers can subscribe to topics, publishers can publish messages on a given topic

  • Peer to peer network of notification services

  • Topology can be re-organized to enhance reliability

  • Subscribers and publishers can negotiate over QoS
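As a rough illustration of the first three points (asynchronous delivery, persistence, and topics created and discovered on the fly), here is a minimal in-memory sketch. It is not the myGrid notification service: the class name `NotificationService`, its methods, and the topic name are all assumptions made for this example, and "persistence" is reduced to an in-memory log.

```python
from collections import defaultdict, deque


class NotificationService:
    """Toy topic-based broker (hypothetical): store first, deliver later."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> subscriber callbacks
        self.log = defaultdict(list)          # topic -> every message published
        self.pending = deque()                # (topic, message) awaiting delivery

    def create_topic(self, topic):
        # Topics can be created on the fly; touching the dicts registers them.
        self.subscribers[topic]
        self.log[topic]

    def topics(self):
        # Topic discovery: list every known topic.
        return sorted(self.log)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Returns immediately: the message is logged and queued, not delivered.
        self.log[topic].append(message)
        self.pending.append((topic, message))

    def deliver_pending(self):
        # Drain the queue and hand each message to the topic's subscribers.
        delivered = 0
        while self.pending:
            topic, message = self.pending.popleft()
            for callback in self.subscribers[topic]:
                callback(topic, message)
                delivered += 1
        return delivered


ns = NotificationService()
ns.create_topic("workflow/status")
received = []
ns.subscribe("workflow/status", lambda t, m: received.append((t, m)))
ns.publish("workflow/status", "BLAST job finished")
before = list(received)      # still empty: publish does not deliver
count = ns.deliver_pending()
```

Separating `publish` from `deliver_pending` is what makes delivery asynchronous in this sketch; a real service would drain the queue from a background worker and keep the log in a database.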


A notification service instance

[Figure: a notification service instance. Labels: notifications, subscriber, subscriber stub, subscriber delegator, publisher, publisher stub, publisher delegator, QoS.]


Federated notification services

[Figure: federated notification services. Labels: hubs Hub-1, Hub-2, Hub-3; notification services NS-1-1, NS-1-2, NS-1-3, NS-2-1, NS-2-2, NS-3-1, NS-3-2; publishers P-1-1-1, P-1-1-2, P-1-3-1, P-1-3-2, P-2-2-1, P-2-2-2, P-3-1-1, P-3-2-1; subscribers S-1-1-1, S-2-1-1, S-3-1-1.]

  • Strong communication links between hubs

  • Efficient data replication

  • Simple notification routing
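One way to picture "simple notification routing" across hubs is sketched below. The slides do not describe the routing algorithm, so this assumes a fully connected mesh of hubs in which a hub forwards a notification once to every peer and peers never forward again. The classes `Hub` and `NotificationService`, the `connect` helper, and the topic name are all hypothetical; only the labels (Hub-1, NS-2-1, S-2-1-1, ...) are borrowed from the figure.

```python
class Hub:
    """Hypothetical hub: fans notifications out to its services and peers."""

    def __init__(self, name):
        self.name = name
        self.services = []  # notification services attached to this hub
        self.peers = []     # other hubs (assumed fully connected)

    def route(self, topic, message, from_peer):
        for service in self.services:
            service.deliver_local(topic, message)
        if not from_peer:
            # Forward exactly once; peers do not re-forward, so no loops.
            for peer in self.peers:
                peer.route(topic, message, from_peer=True)


class NotificationService:
    """Hypothetical notification service attached to one hub."""

    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.subscribers = {}  # topic -> list of subscriber names
        self.delivered = []    # (subscriber, topic, message) records
        hub.services.append(self)

    def subscribe(self, subscriber, topic):
        self.subscribers.setdefault(topic, []).append(subscriber)

    def deliver_local(self, topic, message):
        for subscriber in self.subscribers.get(topic, []):
            self.delivered.append((subscriber, topic, message))

    def publish(self, topic, message):
        # A publisher only talks to its own service; the hub does the rest.
        self.hub.route(topic, message, from_peer=False)


def connect(hubs):
    # Assumption: every hub is linked to every other hub.
    for hub in hubs:
        hub.peers = [other for other in hubs if other is not hub]


hub1, hub2, hub3 = Hub("Hub-1"), Hub("Hub-2"), Hub("Hub-3")
connect([hub1, hub2, hub3])
ns11 = NotificationService("NS-1-1", hub1)
ns21 = NotificationService("NS-2-1", hub2)
ns31 = NotificationService("NS-3-1", hub3)
ns21.subscribe("S-2-1-1", "genome/update")
ns31.subscribe("S-3-1-1", "genome/update")
ns11.publish("genome/update", "new release")  # e.g. sent by publisher P-1-1-1
```

With this layout a notification crosses at most one hub-to-hub link, which is where the "strong communication links between hubs" would matter.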



Current status

  • Push and pull messaging

  • Topic, message and publisher filters

  • WSDL interface

  • Workflow interaction

  • Integration with MySQL, OpenJMS, Tomcat and Axis

  • Federated service (undergraduate project)

  • QoS negotiation (PhD work underway)

  • OGSA compliance
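The first two items (push and pull messaging; topic, message and publisher filters) can be sketched as follows. This is an illustrative model only, not the service's WSDL interface: `Notification`, `Subscription`, their fields, and the sample data are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Notification:
    topic: str
    publisher: str
    body: str


@dataclass
class Subscription:
    """Hypothetical subscription: three filters plus a push or pull mode."""

    topic: str                                                # topic filter
    publisher: Optional[str] = None                           # publisher filter
    predicate: Optional[Callable[[Notification], bool]] = None  # message filter
    push: Optional[Callable[[Notification], None]] = None     # push callback
    mailbox: List[Notification] = field(default_factory=list)  # pull buffer

    def matches(self, n):
        if n.topic != self.topic:
            return False
        if self.publisher is not None and n.publisher != self.publisher:
            return False
        return self.predicate is None or self.predicate(n)

    def accept(self, n):
        if not self.matches(n):
            return False
        if self.push is not None:
            self.push(n)            # push: the service calls the subscriber
        else:
            self.mailbox.append(n)  # pull: hold until the subscriber asks
        return True

    def pull(self):
        items, self.mailbox = self.mailbox, []
        return items


pushed = []
push_sub = Subscription(topic="blast", push=pushed.append)
pull_sub = Subscription(topic="blast", publisher="EBI",
                        predicate=lambda n: "done" in n.body)

for n in [Notification("blast", "EBI", "job 1 done"),
          Notification("blast", "NCBI", "job 2 done"),
          Notification("blast", "EBI", "job 3 running")]:
    push_sub.accept(n)
    pull_sub.accept(n)

pulled = pull_sub.pull()
```

Here the push subscriber sees all three notifications as they arrive, while the pull subscriber later collects only the one that passes both its publisher and message filters.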


Experimentation

  • Windows and Unix platforms with Tomcat 4.0.5, Axis beta 3.0, OpenJMS 0.7.2 and MySQL 3.23.51

  • Aggregation test with 500 topics, 2,000 subscribers, 2,000 publishers and 10,000 registered subscriptions, 10,000 notifications

  • 72 hours non-stop subscribing/publishing with the above populations


Architectural Issues

  • Notification service


Architectural Issues

  • Notification service

  • Grid component model


Grid Component Model

The myGrid framework is a component model for flexible, simple and future-proof deployment and use of services on the Grid.


Problems Addressed

  • For service developers and deployers:

    • Ease of development of sophisticated services by separation of concerns and re-use of third party functionality.

    • Consistent distribution of functionality over a set of services, e.g. access control, support for fault-tolerance.

    • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.
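To make "separation of concerns" concrete, the sketch below wraps a service in nested containers, each adding one concern (access control, retry-based fault tolerance) without touching the service code. The slides do not show the framework's API, so every name here (`Container`, `AccessControlContainer`, `RetryContainer`, `FlakyAlignmentService`, the `invoke` signature) is hypothetical.

```python
class FlakyAlignmentService:
    """Stand-in for a remote service that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures

    def invoke(self, operation, caller, payload):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("transient failure")
        return f"{operation}({payload}) for {caller}"


class Container:
    """Wraps an inner component: a service or another container."""

    def __init__(self, inner):
        self.inner = inner

    def invoke(self, operation, caller, payload):
        return self.inner.invoke(operation, caller, payload)


class AccessControlContainer(Container):
    """Adds access control; the wrapped component knows nothing about it."""

    def __init__(self, inner, allowed):
        super().__init__(inner)
        self.allowed = set(allowed)

    def invoke(self, operation, caller, payload):
        if caller not in self.allowed:
            raise PermissionError(f"{caller} may not call {operation}")
        return super().invoke(operation, caller, payload)


class RetryContainer(Container):
    """Adds simple fault tolerance by retrying on connection errors."""

    def __init__(self, inner, attempts=3):
        super().__init__(inner)
        self.attempts = max(1, attempts)

    def invoke(self, operation, caller, payload):
        last_error = None
        for _ in range(self.attempts):
            try:
                return super().invoke(operation, caller, payload)
            except ConnectionError as error:
                last_error = error
        raise last_error


service = AccessControlContainer(
    RetryContainer(FlakyAlignmentService(failures=1), attempts=3),
    allowed={"alice"},
)
result = service.invoke("align", "alice", "ACGT")

try:
    service.invoke("align", "mallory", "ACGT")
    denied = False
except PermissionError:
    denied = True
```

Because each container exposes the same `invoke` method as the service it wraps, the same wrappers could be applied consistently across a whole set of services.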


Problems Addressed

  • For service clients:

    • Development of service clients that are not limited by the range of standards known at deployment time.

    • Control over how service operations are invoked, so that they can make use of the most suitable protocols supported by a service.

    • Provision of a standard client interface hiding the differences in deployment philosophy that each middleware technology brings.

    • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.
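The client-side goals can be illustrated with a small sketch: one call interface, protocol handlers that can be registered after the client is written, and selection of the most preferred protocol a service supports. This is not the myGrid client API; `ServiceClient`, the description format, the endpoint URLs, and the fake handlers are assumptions, and no real SOAP or RMI traffic is involved.

```python
class ServiceClient:
    """Hypothetical client with one call interface over pluggable protocols."""

    def __init__(self):
        self.handlers = {}    # protocol -> callable(endpoint, operation, args)
        self.preference = []  # protocol names, most preferred first

    def register(self, protocol, handler, prefer=False):
        # Handlers can be added at any time, including after deployment.
        self.handlers[protocol] = handler
        if protocol in self.preference:
            self.preference.remove(protocol)
        if prefer:
            self.preference.insert(0, protocol)
        else:
            self.preference.append(protocol)

    def choose(self, description):
        # Pick the most preferred protocol that the service also supports.
        for protocol in self.preference:
            if protocol in description["protocols"]:
                return protocol
        raise LookupError("no protocol shared by client and service")

    def call(self, description, operation, *args):
        protocol = self.choose(description)
        endpoint = description["protocols"][protocol]
        return self.handlers[protocol](endpoint, operation, args)


def fake_soap(endpoint, operation, args):
    # Placeholder: a real handler would build and send a SOAP request.
    return ("soap", endpoint, operation, args)


def fake_rmi(endpoint, operation, args):
    # Placeholder for a binary protocol added later.
    return ("rmi", endpoint, operation, args)


# Made-up service description advertising two access protocols.
description = {"protocols": {"soap": "http://example.org/blast",
                             "rmi": "rmi://example.org/blast"}}

client = ServiceClient()
client.register("soap", fake_soap)
first = client.call(description, "search", "ACGT")

client.register("rmi", fake_rmi, prefer=True)  # learned after deployment
second = client.call(description, "search", "ACGT")
```

The calling code is identical before and after the second handler is registered; only the protocol chosen underneath changes.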




Current Status

  • Startpoints for Web Services

  • Deployment within nested containers

  • Facades for exposing EJBs as Web Services

  • Performance tests



Current Work

  • Automated deployment in nested containers

  • Definition of containers for deployment-time configuration

  • Using containers to provide minimal functionality of OGSA Grid Services

  • Startpoints for EJBs, Grid Services


Experimentation

  • Our experiments have shown that nesting in our containers is not costly compared to method invocation and nested inner classes

  • The cost of calling EJBs via the Web Service façade comes mostly from the use of SOAP, and the consequential requirement for conversion to/from objects


Architectural Issues

  • Notification service

  • Grid component model


Architectural Issues

  • Notification service

  • Grid component model

  • Service directory


Service Directory Views

  • Multiple service directories will co-exist (IBM, Microsoft, EBI, local institutions)

  • Need to attach metadata to service directory entries

  • Metadata is personal to the scientist: trust, perceived QoS, ontological description

  • Need for a mechanism to allow scientists to add their metadata and to make it available to other users as a “regular service directory”.
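Since the views were still in the design phase, the following is only a sketch of the idea: a view sits over several directories, lets a scientist attach personal metadata to entries, and answers queries through the same interface as a plain directory. `ServiceDirectory`, `DirectoryView`, the `find` signature, and the sample entries and metadata are all assumptions made for illustration.

```python
class ServiceDirectory:
    """Hypothetical plain directory: service name -> description."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def find(self, **criteria):
        results = []
        for name, description in sorted(self.entries.items()):
            entry = dict(description, name=name)
            if all(entry.get(k) == v for k, v in criteria.items()):
                results.append(entry)
        return results


class DirectoryView:
    """Hypothetical view: same find() interface, plus personal metadata."""

    def __init__(self, directories):
        self.directories = directories
        self.metadata = {}  # service name -> personal annotations

    def annotate(self, name, **metadata):
        self.metadata.setdefault(name, {}).update(metadata)

    def find(self, **criteria):
        results = []
        for directory in self.directories:
            for entry in directory.find():
                # Merge the scientist's annotations into the public entry.
                merged = {**entry, **self.metadata.get(entry["name"], {})}
                if all(merged.get(k) == v for k, v in criteria.items()):
                    results.append(merged)
        return results


# Made-up entries for two co-existing directories.
ebi = ServiceDirectory({"WU-BLAST": {"task": "alignment", "host": "EBI"}})
ncbi = ServiceDirectory({"NCBI-BLAST": {"task": "alignment", "host": "NCBI"}})

my_view = DirectoryView([ebi, ncbi])
my_view.annotate("WU-BLAST", trust="high")

everything = my_view.find(task="alignment")
trusted = my_view.find(task="alignment", trust="high")

# A colleague can treat the view as a regular directory and layer on top.
colleague_view = DirectoryView([my_view])
shared = colleague_view.find(trust="high")
```

Because a view offers the same `find` method as a directory, it can itself be published and wrapped by other views, which is the "regular service directory" property the last bullet asks for.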


Views: status

  • Currently in design phase

    • Use cases in the process of being finalized

    • Preliminary specification of interfaces

    • More work is needed on policy languages

  • Design to be finalized by end of January

  • First prototype of core functionality 4 months later


Overview

  • Bioinformatics background

  • myGrid facts

  • Service oriented architecture

  • Architectural issues

    • Notification service

    • Grid component model

    • Service directory

  • Conclusions


Conclusions

  • More architectural issues being addressed

    • Security (GSI, RBAC), but where is the community going?

    • Fault tolerance


  • Workflow enactment

    • WSFL compatible enactment engine

    • Support for fault tolerance, checkpointing, migration

    • Editor


Conclusions

  • 4-month development cycle with “integration fest”

  • Our roadmap is based on a layered organisation of functionality


myGrid in Southampton

  • Luc Moreau, Michael Luck, David De Roure

  • Terry Payne, Keith Decker

  • Simon Miles, Juri Papay, John Dickman, Xiaojian Liu, Claudia di Napoli, Vijay Dialani, Richard Lawley