myGrid. Architectural issues in a bioinformatics Grid http://www.mygrid.org.uk Luc Moreau, University of Southampton, UK. Overview. Bioinformatics background myGrid facts Service oriented architecture Architectural issues Notification service Grid component model Service directory

myGrid

Architectural issues in a bioinformatics Grid

http://www.mygrid.org.uk

Luc Moreau, University of Southampton, UK

Overview
  • Bioinformatics background
  • myGrid facts
  • Service oriented architecture
  • Architectural issues
    • Notification service
    • Grid component model
    • Service directory
  • Conclusions
Bioinformatics & Genomics
  • Large amounts of data
  • Highly heterogeneous
    • Data types
    • Data forms
    • Community
  • Highly complex and inter-related
  • Volatile
Bioinformatics Data
  • Descriptive as well as numeric
  • Literature
  • Analogy/knowledge-based

Text Extraction

Bioinformatics Analysis
  • Different algorithms
    • BLAST, FASTA, pSW
  • Different implementations
    • WU-BLAST, NCBI-BLAST
  • Different service providers
    • NCBI, EBI, DDBJ
The Human Genome Project
  • The HGP will make available potentially thousands of targets for
    • Understanding biology & genetics
    • Drug discovery
    • Diagnostics
  • Many genes will be linked with diseases
    • Cancer
    • HIV
    • Parkinson’s
    • Asthma
    • Malaria
    • Autoimmune (arthritis)
    • Cardiovascular
    • Antibacterial & antifungal
In silico experimentation
  • Discovery of resources and tools, staging of operations, sharing of results
  • Process is as important as outcome
  • Science is dynamic – change happens
  • Scientific discovery is personal & global
  • Provenance and history
Overview
  • Bioinformatics background
  • myGrid facts
  • Service oriented architecture
  • Architectural issues
    • Notification service
    • Grid component model
    • Service directory
  • Conclusions
myGrid
  • EPSRC funded pilot project
  • Generic middleware within application setting
  • 36 months within a 42-month performance period
  • Start 1st October
    • 16 full-time post docs altogether
    • 6 DTA studentships
    • 1 technical project manager
    • 1 system manager
    • 1 secretarial post
myGrid consortium
  • Scientific Team
    • Biologists and Bioinformaticians
    • GSK, AZ, Merck KGaA, Manchester, EBI
  • Technical Team
    • Manchester, Southampton, Newcastle, Sheffield, EBI, Nottingham
    • IBM, SUN
    • GeneticXchange
    • Network Inference, Epistemics Ltd
myGrid outcomes
  • e-Scientists
    • Bioinformatics demonstrator (on cold carp)
  • Developers
    • myGrid-in-a-Box developers kit
    • Integrating some existing bioinformatics tools with myGrid
Overview
  • Bioinformatics background
  • myGrid facts
  • Service oriented architecture
  • Architectural issues
    • Notification service
    • Grid component model
    • Service directory
  • Conclusions
Architectural Issues
  • Notification service
Vision
  • Asynchronous delivery and persistence of messages
  • Topics can be created and discovered on the fly
  • Subscribers can subscribe to topics, publishers can publish messages on a given topic
  • Peer to peer network of notification services
  • Topology can be re-organized to enhance reliability
  • Subscribers and publishers can negotiate over QoS
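
The topic-based model in this vision can be illustrated with a minimal in-memory sketch. It is not the myGrid implementation: the class `NotificationService`, its method names, and the topic string `blast/results` are all assumptions made for illustration, and an in-process queue stands in for real message persistence.

```python
from collections import defaultdict, deque


class NotificationService:
    """Toy topic-based notification service (all names are illustrative)."""

    def __init__(self):
        self.topics = set()
        self.subscribers = defaultdict(set)
        # One queue of undelivered messages per (topic, subscriber);
        # a real service would keep these in durable storage.
        self.queues = defaultdict(deque)

    def create_topic(self, topic):
        self.topics.add(topic)

    def discover_topics(self, prefix=""):
        return sorted(t for t in self.topics if t.startswith(prefix))

    def subscribe(self, topic, subscriber_id):
        if topic not in self.topics:
            raise KeyError(f"unknown topic: {topic}")
        self.subscribers[topic].add(subscriber_id)

    def publish(self, topic, message):
        # Publishing to an unknown topic creates it on the fly.
        self.create_topic(topic)
        for subscriber_id in self.subscribers[topic]:
            self.queues[(topic, subscriber_id)].append(message)

    def collect(self, topic, subscriber_id):
        """Return, then clear, everything queued for this subscriber."""
        queue = self.queues[(topic, subscriber_id)]
        messages = list(queue)
        queue.clear()
        return messages


service = NotificationService()
service.create_topic("blast/results")
service.subscribe("blast/results", "alice")
service.publish("blast/results", "job 17 finished")
service.publish("blast/results", "job 18 finished")
pending = service.collect("blast/results", "alice")
```

Delivery is asynchronous in the sense that the publisher returns as soon as the message is queued; the subscriber collects it later, whether or not it was connected at publication time.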
A notification service instance

[Diagram labels: Publisher, Publisher stub, Publisher delegator, Subscriber delegator, Subscriber stub, Subscriber, notifications, QoS]
Federated notification services

[Diagram labels: hubs Hub-1, Hub-2, Hub-3; notification services NS-1-1, NS-1-2, NS-1-3, NS-2-1, NS-2-2, NS-3-1, NS-3-2; publishers P-1-1-1, P-1-1-2, P-1-3-1, P-1-3-2, P-2-2-1, P-2-2-2, P-3-1-1, P-3-2-1; subscribers S-1-1-1, S-2-1-1, S-3-1-1]
  • Strong communication links between hubs
  • Efficient data replication
  • Simple notification routing
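
One way to picture simple routing over strongly linked hubs is sketched below. The routing rule is an assumption, not something the slides specify: each hub delivers to its local notification services and forwards once to its peer hubs, which is loop-free only because the hubs in this example are fully connected. `Hub` and `LocalService` are hypothetical names; the instance names reuse labels from the diagram.

```python
class Hub:
    """Hypothetical hub linking local notification services to peer hubs."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.services = []

    def link(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def route(self, topic, message, origin, from_peer=False):
        # Deliver to every local service except the one that published.
        for service in self.services:
            if service is not origin:
                service.deliver(topic, message)
        # Forward only messages that originated locally, so a message
        # crosses at most one hub-to-hub link (assumes a full mesh).
        if not from_peer:
            for peer in self.peers:
                peer.route(topic, message, origin, from_peer=True)


class LocalService:
    """Hypothetical notification service attached to a single hub."""

    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.subscriptions = {}  # topic -> list of subscriber inboxes
        hub.services.append(self)

    def subscribe(self, topic, inbox):
        self.subscriptions.setdefault(topic, []).append(inbox)

    def deliver(self, topic, message):
        for inbox in self.subscriptions.get(topic, []):
            inbox.append(message)

    def publish(self, topic, message):
        self.deliver(topic, message)
        self.hub.route(topic, message, origin=self)


hub1, hub2, hub3 = Hub("Hub-1"), Hub("Hub-2"), Hub("Hub-3")
hub1.link(hub2)
hub1.link(hub3)
hub2.link(hub3)

ns11 = LocalService("NS-1-1", hub1)
ns21 = LocalService("NS-2-1", hub2)
ns31 = LocalService("NS-3-1", hub3)

inbox_s211, inbox_s311 = [], []
ns21.subscribe("genome/updates", inbox_s211)
ns31.subscribe("genome/updates", inbox_s311)
ns11.publish("genome/updates", "new release")
```

Both subscribers receive the notification exactly once, although neither is attached to the service where it was published.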
Current status
  • Push and pull messaging
  • Topic, message and publisher filters
  • WSDL interface
  • Workflow interaction
  • Integration with MySQL, OpenJMS, Tomcat and Axis
  • Federated service (undergraduate project)
  • QoS negotiation (PhD work underway)
  • OGSA compliance
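
The difference between push and pull messaging, and the effect of topic, publisher and message filters, can be sketched as follows. The `Notification` and `Subscription` classes and their fields are assumptions for illustration and do not reflect the service's WSDL interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Notification:
    topic: str
    publisher: str
    body: str


@dataclass
class Subscription:
    topic: str                                   # topic filter
    publisher: Optional[str] = None              # publisher filter
    message_filter: Optional[Callable[[str], bool]] = None
    # Push mode when a callback is given; pull mode otherwise.
    callback: Optional[Callable[[Notification], None]] = None
    pending: List[Notification] = field(default_factory=list)

    def matches(self, n: Notification) -> bool:
        if n.topic != self.topic:
            return False
        if self.publisher is not None and n.publisher != self.publisher:
            return False
        if self.message_filter is not None and not self.message_filter(n.body):
            return False
        return True

    def offer(self, n: Notification) -> None:
        if not self.matches(n):
            return
        if self.callback is not None:
            self.callback(n)        # push: deliver immediately
        else:
            self.pending.append(n)  # pull: hold until asked

    def pull(self) -> List[Notification]:
        messages, self.pending = self.pending, []
        return messages


pushed = []
push_sub = Subscription("blast", callback=pushed.append)
pull_sub = Subscription(
    "blast", publisher="EBI", message_filter=lambda body: "error" in body
)

for n in [
    Notification("blast", "NCBI", "job done"),
    Notification("blast", "EBI", "error: timeout"),
    Notification("fasta", "EBI", "error: bad input"),
]:
    push_sub.offer(n)
    pull_sub.offer(n)

pulled = pull_sub.pull()
```

The push subscriber sees both `blast` notifications as they arrive; the pull subscriber sees only the one that passes all three filters, and only when it asks.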
Experimentation
  • Windows and Unix platforms with Tomcat 4.0.5, Axis beta 3.0, OpenJMS 0.7.2 and MySQL 3.23.51
  • Aggregation test with 500 topics, 2,000 subscribers, 2,000 publishers, 10,000 registered subscriptions and 10,000 notifications
  • 72 hours non-stop subscribing/publishing with the above populations
Architectural Issues
  • Notification service
  • Grid component model
Grid Component Model

The myGrid framework is a component model for flexible, simple and future-proof deployment and use of services on the Grid.

Problems Addressed
  • For service developers and deployers:
    • Ease of development of sophisticated services by separation of concerns and re-use of third party functionality.
    • Consistent distribution of functionality over a set of services, e.g. access control, support for fault-tolerance.
    • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.
Problems Addressed
  • For service clients:
    • Development of service clients that are not limited by the range of standards known at deployment time.
    • Control over how service operations are invoked, so that they can make use of the most suitable protocols supported by a service.
    • Provision of a standard client interface hiding the differences in deployment philosophy that each middleware technology brings.
    • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.
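
The client-side goals above (a standard interface, control over which protocol is used, and no fixed set of protocols at deployment time) might be sketched as follows. This is an assumption-laden illustration rather than the framework's actual API: `Binding`, `StubBinding` and `ServiceClient` are hypothetical names, and the protocol labels `soap` and `rmi` are placeholders for whatever a service supports.

```python
from abc import ABC, abstractmethod


class Binding(ABC):
    """One way of reaching a service operation (hypothetical abstraction)."""

    protocol = ""

    @abstractmethod
    def invoke(self, operation, **arguments):
        ...


class StubBinding(Binding):
    """Stands in for a real protocol binding; it only records the call."""

    def __init__(self, protocol):
        self.protocol = protocol

    def invoke(self, operation, **arguments):
        return (self.protocol, operation, arguments)


class ServiceClient:
    """Uniform client interface over whatever bindings are registered."""

    def __init__(self, bindings=()):
        self._bindings = {b.protocol: b for b in bindings}

    def register(self, binding):
        # New protocols can be added after the client has been deployed.
        self._bindings[binding.protocol] = binding

    def invoke(self, operation, preferred=(), **arguments):
        # Try the caller's preferred protocols in order.
        for protocol in preferred:
            if protocol in self._bindings:
                return self._bindings[protocol].invoke(operation, **arguments)
        if not self._bindings:
            raise RuntimeError("no binding available")
        # Otherwise fall back to the first registered binding.
        fallback = next(iter(self._bindings.values()))
        return fallback.invoke(operation, **arguments)


client = ServiceClient([StubBinding("soap")])
first = client.invoke("align", preferred=["rmi", "soap"], query="ACGT")

client.register(StubBinding("rmi"))
second = client.invoke("align", preferred=["rmi", "soap"], query="ACGT")
```

The calling code is identical in both invocations; only the set of registered bindings changes which protocol carries the request.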
Current Status
  • Startpoints for Web Services
  • Deployment within nested containers
  • Facades for exposing EJBs as Web Services
  • Performance tests
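
Nested containers can be pictured as wrappers that each add one concern, such as access control or fault tolerance, around an inner component. The sketch below is illustrative only: the `Container` base class, the two concrete containers and `FlakyAlignmentService` are hypothetical, and the retry policy is an assumed stand-in for fault-tolerance support.

```python
class Container:
    """Hypothetical container: wraps an inner component and intercepts calls."""

    def __init__(self, inner):
        self.inner = inner

    def call(self, operation, caller, *args):
        return self.inner.call(operation, caller, *args)


class AccessControlContainer(Container):
    def __init__(self, inner, allowed):
        super().__init__(inner)
        self.allowed = set(allowed)

    def call(self, operation, caller, *args):
        if caller not in self.allowed:
            raise PermissionError(f"{caller} may not call {operation}")
        return super().call(operation, caller, *args)


class RetryContainer(Container):
    def __init__(self, inner, attempts=3):
        super().__init__(inner)
        self.attempts = attempts

    def call(self, operation, caller, *args):
        last_error = None
        for _ in range(self.attempts):
            try:
                return super().call(operation, caller, *args)
            except ConnectionError as error:
                last_error = error
        raise last_error


class FlakyAlignmentService:
    """Innermost component; fails once to exercise the retry container."""

    def __init__(self):
        self.calls = 0

    def call(self, operation, caller, *args):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transient failure")
        return f"{operation} ok for {caller}"


service = FlakyAlignmentService()
deployed = AccessControlContainer(RetryContainer(service), allowed={"alice"})
result = deployed.call("blast", "alice", "ACGT")
```

The service itself contains no access-control or retry logic; both concerns are applied by nesting, so the same containers could wrap any other component that exposes `call`.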
Current Work
  • Automated deployment in nested containers
  • Definition of containers for deployment-time configuration
  • Using containers to provide minimal functionality of OGSA Grid Services
  • Startpoints for EJBs, Grid Services
Experimentation
  • Our experiments have shown that nesting in our containers is not costly compared to method invocation and nested inner classes
  • The cost of calling EJBs via the Web Service façade comes mostly from the use of SOAP, and the consequential requirement for conversion to/from objects
Architectural Issues
  • Notification service
  • Grid component model
  • Service directory
Service Directory Views
  • Multiple service directories will co-exist (IBM, Microsoft, EBI, local institutions)
  • Need to attach metadata to service directory entries
  • Metadata is personal to the scientist: trust, perceived QoS, ontological description
  • Need for a mechanism to allow scientists to add their metadata and to make it available to other users as a “regular service directory”.
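
A view of this kind can be sketched as a wrapper that aggregates several directories, attaches a scientist's own metadata to their entries, and answers the same query as a regular directory. The views were still being designed at the time of the talk, so everything here is an assumption: `ServiceDirectory`, `DirectoryView`, the `find` method and the `trust` and `perceived_qos` fields are illustrative names only.

```python
class ServiceDirectory:
    """Stand-in for any existing service directory (illustrative only)."""

    def __init__(self, entries):
        self._entries = dict(entries)  # service name -> description

    def find(self, keyword):
        return [
            {"name": name, "description": description}
            for name, description in sorted(self._entries.items())
            if keyword.lower() in description.lower()
        ]


class DirectoryView:
    """Same find() interface, enriched with one scientist's metadata."""

    def __init__(self, directories, owner):
        self.directories = list(directories)
        self.owner = owner
        self._metadata = {}  # service name -> personal metadata

    def annotate(self, name, **metadata):
        self._metadata.setdefault(name, {}).update(metadata)

    def find(self, keyword, min_trust=0.0):
        results = []
        for directory in self.directories:
            for entry in directory.find(keyword):
                metadata = self._metadata.get(entry["name"], {})
                # Entries without a trust annotation count as trust 0.0.
                if metadata.get("trust", 0.0) >= min_trust:
                    results.append({**entry, "metadata": metadata})
        return results


ebi = ServiceDirectory({
    "ebi-blast": "BLAST sequence search",
    "ebi-fasta": "FASTA sequence search",
})
local = ServiceDirectory({"lab-blast": "In-house BLAST search"})

view = DirectoryView([ebi, local], owner="alice")
view.annotate("ebi-blast", trust=0.9, perceived_qos="fast")
view.annotate("lab-blast", trust=0.4)

trusted = view.find("blast", min_trust=0.5)
everything = view.find("blast")
```

Because the view answers `find(keyword)` like any other directory, another user could place it in their own list of directories and build a further view on top of it.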
Views: status
  • Currently in design phase
    • Use cases in the process of being finalized
    • Preliminary specification of interfaces
    • More work is needed on policy languages
  • Design to be finalized by end of January
  • First prototype of core functionality 4 months later
Overview
  • Bioinformatics background
  • myGrid facts
  • Service oriented architecture
  • Architectural issues
    • Notification service
    • Grid component model
    • Service directory
  • Conclusions
Conclusions
  • More architectural issues being addressed
    • Security (GSI, RBAC), but where is the community going?
    • Fault tolerance
  • Workflow enactment
    • WSFL compatible enactment engine
    • Support for fault tolerance, checkpointing, migration
    • Editor
Conclusions
  • 4-month development cycle with “integration fest”
  • Our roadmap is based on a layered organisation of functionality
myGrid in Southampton
  • Luc Moreau, Michael Luck, David De Roure
  • Terry Payne, Keith Decker
  • Simon Miles, Juri Papay, John Dickman, Xiaojian Liu, Claudia di Napoli, Vijay Dialani, Richard Lawley