World-wide in silico drug discovery against neglected and emerging diseases on grid infrastructures

Nicolas Jacq

HealthGrid Association, France

Credit: WISDOM initiative


Content

  • Overview of the WISDOM application

  • Deployment on the EGEE grid and experience

  • Conclusion

Jacq, 16.04.2007


WISDOM

  • WISDOM (http://wisdom.healthgrid.org/)

    • Developing new drugs for neglected and emerging diseases with a particular focus on malaria.

    • Reduced R&D costs and accelerated R&D for emerging and neglected diseases

  • Three large calculations:

    • WISDOM-I (Summer 2005)

    • Avian Flu (Spring 2006)

    • WISDOM-II (Autumn 2006)

In silico drug discovery presents unique challenges for Information Technologists and computer scientists

[Figure: drug discovery pipeline. Labels: Drug discovery; Clinical phases (I-III); In silico drug discovery]

Simplified virtual screening process by docking

Successful examples

  • rapid,

  • cost effective…

    But there are limitations

  • Need for CPU and storage

Docking: predict how small molecules bind to a receptor of known 3D structure

Grid-enabled high throughput virtual screening by docking

  • 1 to 30 min per docking

  • A few MB per output

  • 100 CPU years, 1 TB

  • Challenges: speed up the process and manage the data

  • Large scale deployment on grid infrastructure

[Figure: millions of chemical compounds and a few target structures as input to the docking software]

Example: In silico drug discovery on avian flu

  • The goal is to study in silico the impact of selected point mutations on the efficiency of existing drugs and to find new potential drugs

  • A collaboration of 5 grid projects: Auvergrid, BioinfoGrid, EGEE-II, Embrace, TWGrid

  • Significant parameters:

    • 1 docking software: Autodock

    • 8 conformations of the target (N1 neuraminidase)

    • 300,000 selected compounds

    • 105 CPU years to dock all compounds against all target conformations

  • Timescale:

    • First contacts: March 1st 2006

    • kick-off: April 1st 2006

    • Duration: 6 weeks
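
The parameters above can be cross-checked with a little arithmetic. The sketch below is not part of the original slides: the function name is hypothetical, and it assumes that every compound is docked against every conformation and that a year has 365.25 days.

```python
# Back-of-the-envelope check of the avian flu figures quoted above.
# Assumptions (not from the slides): 365.25-day year, one docking per
# (conformation, compound) pair, CPUs kept busy for the whole period.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def docking_budget(conformations, compounds, cpu_years, weeks):
    """Return (number of dockings, mean CPU minutes per docking, mean busy CPUs)."""
    dockings = conformations * compounds
    minutes_per_docking = cpu_years * MINUTES_PER_YEAR / dockings
    avg_cpus = cpu_years * 365.25 / (weeks * 7)
    return dockings, minutes_per_docking, avg_cpus


dockings, minutes, cpus = docking_budget(8, 300_000, 105, 6)
print(f"{dockings:,} dockings")
print(f"{minutes:.1f} CPU minutes per docking on average")
print(f"{cpus:.0f} CPUs busy on average over 6 weeks")
```

Under these assumptions the mean cost comes out at roughly 23 CPU minutes per docking, which sits inside the 1 to 30 min range quoted earlier, and finishing in 6 weeks needs on the order of 900 CPUs busy at all times.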

[Figure labels: H5, N1. Credit: Y-T Wu]

Results

Example: In silico results from avian flu data challenge

  • 5 out of 6 known effective inhibitors can be identified in the first 15% of the ranking and in the first 5% reranked (2,250 compounds)

    • Enrichment = 5.5 and 111 (<1 in most cases)

  • Most known effective inhibitors lose their affinity in binding with a mutated target
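
The enrichment values quoted above can be reproduced with the usual definition of the enrichment factor: the fraction of known actives recovered divided by the fraction of the ranking inspected. The sketch below assumes that this is the definition used on the slide, that "the first 5% reranked" means 5% of the top 15%, and that the library is the 300,000 compounds of the avian flu deployment. The function name is hypothetical.

```python
# Enrichment factor sketch. Assumed definition (the slide does not spell it out):
# (actives found / actives total) / (fraction of the ranked library inspected).
def enrichment_factor(actives_found, actives_total, fraction_screened):
    """Ratio of the active recovery rate to the fraction of the ranking inspected."""
    if not 0 < fraction_screened <= 1:
        raise ValueError("fraction_screened must be in (0, 1]")
    return (actives_found / actives_total) / fraction_screened


# 5 of 6 known inhibitors in the first 15% of the ranking.
first_pass = enrichment_factor(5, 6, 0.15)
# Same 5 of 6, now within the first 5% of that 15% subset after reranking.
reranked = enrichment_factor(5, 6, 0.15 * 0.05)
# Size of the reranked subset, assuming a 300,000-compound library.
subset_size = round(300_000 * 0.15 * 0.05)

print(f"first pass: {first_pass:.2f}")
print(f"reranked:   {reranked:.1f}")
print(f"subset:     {subset_size} compounds")
```

With these inputs the first pass gives about 5.56 (reported as 5.5 on the slide), the reranked subset gives about 111, and the subset size matches the 2,250 compounds quoted.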

[Figure: results for the original type and the E119A mutated type. Labels: GNA 11.5%, GNA 2.4%, 15% cut-off]

Example: In vitro results from avian flu data challenge

  • Experimental assay confirms 7 actives out of 123 purchased “potential hits” (interacting complexes with higher affinities and proper docked poses), which proved the usefulness of our work.

Content

  • Overview of the WISDOM application

  • Deployment on the EGEE grid and experience

  • Conclusion

Requirements for a large scale deployment on grid

  • Adaptation of the application to the grid

  • Access to a large infrastructure providing maintained resources

  • Use of a production system providing automated and fault-tolerant job and file management

Adaptation of the application to the grid

  • The applications are not designed for grid computing.

  • The application code cannot be modified.

  • A common strategy is to split the application into shorter tasks

  • License management for commercial software is not yet adapted for large infrastructure
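
The splitting strategy mentioned above can be illustrated with a minimal sketch. It is not the WISDOM implementation: the function name, the compound identifiers and the batch size are hypothetical, and the docking program itself is treated as an unmodified black box that receives one batch per grid job.

```python
from itertools import islice


def split_into_tasks(compound_ids, compounds_per_task):
    """Yield consecutive batches of compound identifiers, one batch per grid job."""
    if compounds_per_task < 1:
        raise ValueError("compounds_per_task must be at least 1")
    iterator = iter(compound_ids)
    while True:
        batch = list(islice(iterator, compounds_per_task))
        if not batch:
            return
        yield batch


# Hypothetical identifiers; a real library would hold hundreds of thousands.
library = [f"cmpd{i:06d}" for i in range(10)]
tasks = list(split_into_tasks(library, 4))
for number, batch in enumerate(tasks):
    print(number, batch)
```

Each batch can then be docked independently, so the unmodified application only ever sees a small compound file.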

Access to a large infrastructure (1/3)

  • A resource estimation is needed before the deployment

  • The application package requires installation (and testing)

  • An efficient and responsive user support of the infrastructure is required

Access to a large infrastructure (2/3): the EGEE infrastructure

  • EGEE added value:

    • Large computing and storage resources (>30,000 CPUs, 50 PB)

    • 24 hours a day availability of resources

    • User support

    • Job and Data Management

    • Information and Monitoring

    • Security

  • Limitations for life science applications

    • Short jobs

    • Data confidentiality

    • Reliability of services

[Figure: Real Time Monitor]

Access to a large infrastructure (3/3): Biomedical Virtual Organization status

  • Biomed VO leader: V. Breton

  • ~80 participants, see http://egeena4.lal.in2p3.fr

  • Three active subgroups

    • Medical imaging (J. Montagnat)

    • Bioinformatics (C. Blanchet)

    • Drug discovery (V. Breton)

  • Biomedical VO manager: Y. Legré

  • See http://cic.in2p3.fr (VO information, publication of data challenge…)

  • 1 VOMS server, 1 LFC, +20 RBs

  • +100 CEs, +8,000 CPUs (but many users)

  • +110 SEs, ~Tens of TB available on disk

  • 27 countries

Use of a production system

  • Managing thousands of jobs and files by hand is a labor-intensive task

    • Job preparation, submission and monitoring, output retrieval, failure identification and resolution, job resubmission…

  • The rate of submitted jobs must be carefully monitored

    • In order to avoid Resource Brokers overload

    • In order to efficiently use the resources

  • The amount of transferred data affects grid performance

    • The data must be installed on the grid

    • Storing subsets of the database instead of large unique compound files

  • Grid processing introduces significant delays

    • The submitted jobs must be sufficiently long in order to reduce the impact of this middleware overhead
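
The trade-off between job length and middleware overhead can be made concrete with a small sketch. The 20-minute overhead and the 90% efficiency target below are illustrative assumptions, not figures from the slides, and the function names are hypothetical. Only the 1 and 30 minute docking times come from the range quoted earlier.

```python
# Fraction of a job's wall-clock time spent on useful docking work, assuming
# a fixed per-job middleware overhead (submission, scheduling, file transfer).
def efficiency(dockings_per_job, minutes_per_docking, overhead_minutes):
    useful = dockings_per_job * minutes_per_docking
    return useful / (useful + overhead_minutes)


def min_dockings_per_job(minutes_per_docking, overhead_minutes, target):
    """Smallest number of dockings per job that reaches the target efficiency."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    count = 1
    while efficiency(count, minutes_per_docking, overhead_minutes) < target:
        count += 1
    return count


OVERHEAD_MINUTES = 20  # hypothetical value, for illustration only
for minutes in (1, 30):
    needed = min_dockings_per_job(minutes, OVERHEAD_MINUTES, 0.9)
    print(f"{minutes} min per docking: at least {needed} dockings per job")
```

The shorter the individual docking, the more dockings have to be grouped into one job before the fixed overhead stops dominating.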

Use of a production system

  • Other production systems from HEP experiments on EGEE

    • The ATLAS production system - The ATLAS experiment

    • BOSS and CRAB - The CMS experiment

    • AliEn - The ALICE experiment

    • DIRAC - The LHCb experiment

    • DIANE - CERN

    • Ganga, a user interface

    • GridICE and MonALISA, two monitoring services for users

Schema of the WISDOM production environment

[Figure: schema of the WISDOM production environment. Labels: User Interface; WISDOM production system (submits the jobs, checks job status, resubmits); WMS; DMS; CEs & WNs; SEs; FlexX job; FLEXlm license; inputs (structure file, compounds file); outputs (output file, statistics, docking information); local server; HealthGrid server; web site; WISDOM DB; output DB]

A huge international effort for WISDOM-II

Significant contributions from EELA, EUMedGRID and EUChinaGRID

Over 420 CPU years in 10 weeks

A record throughput of 100,000 docked compounds per hour

WISDOM calculations used FlexX from BioSolveIT (6k free, floating licenses)

Origin of failures during the WISDOM-I deployment

Grid success rate: 63%, after subtracting license server and WISDOM failures

Success rates of the deployments

  • WISDOM-I

    • User success rate: 46%

      • License server is a bottleneck

    • Grid success rate: 63%

      • Heterogeneous and dynamic nature of the grid

        • Power cut, air conditioning, misconfiguration, overload…

      • Stress usage

      • Automatic job (re)submission (“sink-hole” effect)

  • WISDOM against avian flu

    • Grid success rate: 80%

      • Constant and slower job submission flow

      • Manual control of resubmission process

      • WISDOM fault-tolerance improved

      • Grid reliability improved (Workload Management System)
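
To get a feel for what these success rates mean for the resubmission load, the sketch below estimates the mean number of submissions needed per completed job. It assumes that every attempt succeeds independently with the same probability, so the count follows a geometric distribution with mean 1/p. That is a simplification: failures on a real grid are correlated, as the "sink-hole" effect above shows. The function name is hypothetical.

```python
# Rough estimate only: independent attempts with a constant success probability.
def expected_submissions(success_rate):
    """Mean number of submissions until a job succeeds (geometric distribution)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1 / success_rate


# Success rates taken from the slide above.
rates = {
    "WISDOM-I, user success rate": 0.46,
    "WISDOM-I, grid success rate": 0.63,
    "Avian flu, grid success rate": 0.80,
}
for label, rate in rates.items():
    print(f"{label}: {expected_submissions(rate):.2f} submissions per completed job")
```

Under this simplification, moving from a 46% to an 80% success rate cuts the expected submissions per completed job from a little over two to 1.25.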

Content

  • Overview of the WISDOM application

  • Deployment on the EGEE grid and experience

  • Conclusion

Summary (1/2)

  • The experiments demonstrated that grid infrastructures have a tremendous capacity to mobilize very large CPU resources for well-targeted goals over a significant period of time

    • First large scale deployment of a life sciences application on a grid infrastructure

  • The deployments have been a very useful experience in identifying the limitations and bottlenecks of the EGEE infrastructure and middleware

  • Reliability is still the major issue for the WISDOM production system and the EGEE middleware

  • Large scale deployment still requires grid expertise

Summary (2/2)

  • The WISDOM data challenge has demonstrated that collaborative production grids can be used for steps in the drug discovery process

    • First production of biochemical results on a grid infrastructure

  • The impact has significantly raised the interest of the research community on malaria.

  • Output data collection and presentation require improvements to speed up the post-docking analysis

    • Storage of output metadata from the jobs in a relational database

    • Access to this database and to the docking output files is required
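
One possible shape for such a metadata store is sketched below with Python's built-in sqlite3 module. The table layout, column names, file locations and scores are all hypothetical placeholders chosen for illustration. They are not the WISDOM schema and not real docking results. The query also assumes that a lower score means a better predicted binding.

```python
import sqlite3


def create_store():
    """Create an in-memory database with a hypothetical docking metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE docking_result (
            job_id      TEXT NOT NULL,
            target      TEXT NOT NULL,
            compound_id TEXT NOT NULL,
            score       REAL NOT NULL,
            output_file TEXT NOT NULL,
            PRIMARY KEY (target, compound_id)
        )
        """
    )
    return conn


def record_results(conn, rows):
    """Insert (job_id, target, compound_id, score, output_file) tuples."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO docking_result VALUES (?, ?, ?, ?, ?)", rows)


def top_hits(conn, target, limit):
    """Return the best-scoring compounds for a target (lower score assumed better)."""
    cursor = conn.execute(
        "SELECT compound_id, score, output_file FROM docking_result "
        "WHERE target = ? ORDER BY score ASC LIMIT ?",
        (target, limit),
    )
    return cursor.fetchall()


conn = create_store()
record_results(conn, [  # placeholder values, not real results
    ("job-001", "target-A", "cmpd000001", -21.4, "placeholder/job-001.out"),
    ("job-001", "target-A", "cmpd000002", -30.2, "placeholder/job-001.out"),
    ("job-002", "target-A", "cmpd000003", -12.7, "placeholder/job-002.out"),
])
print(top_hits(conn, "target-A", 2))
```

Keeping the location of each docking output file next to its score lets the post-docking analysis go from a ranked list straight to the files it needs.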

Thank you

  • To all members of the WISDOM collaboration for their contribution to the project

  • To all grid nodes which committed resources and allowed the success of the initiative

  • To all projects which supported the initiative by providing either computing resources or manpower to develop the WISDOM environment

  • To BioSolveIT for offering up to 6,000 free licenses of FlexX
