High-Throughput Virtual Molecular Docking: Hadoop Implementation of AutoDock4 on a Private Cloud. The Second International Emerging Computational Methods for the Life Sciences Workshop ACM International Symposium on High Performance Distributed Computing June 8, 2011, San Jose, CA.

High-Throughput Virtual Molecular Docking: Hadoop Implementation of AutoDock4 on a Private Cloud

The Second International Emerging Computational Methods for the Life Sciences Workshop

ACM International Symposium on High Performance Distributed Computing

June 8, 2011, San Jose, CA

Sally R. Ellingson

Graduate Research Assistant

Center for Molecular Biophysics, UT/ORNL

Department of Genome Science and Technology, UT

Scalable Computing and Leading Edge Innovative Technologies (IGERT)

Dr. Jerome Baudry

PhD Advisor

Center for Molecular Biophysics, UT/ORNL

Department of BCMB, UT


Ultimate Goal: Reduce the time and cost of discovering novel drugs


Outline

  • Virtual Molecular Docking

    • Novel Drug Discovery

    • Virtual high-throughput screenings (VHTS)

  • Cloud Computing

    • Advantages for VHTS

    • Kandinsky

    • Hadoop (MapReduce)

  • AutoDockCloud

    • Current Implementation

    • Future Implementations


Virtual Molecular Docking

  • Given a receptor (protein) and a ligand (small molecule), predict:

    • Bound conformations

      • Search algorithm to explore conformational space

    • Binding affinity

      • Force field to evaluate energetics
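
The two ingredients listed here, a search algorithm and a scoring function, can be illustrated with a deliberately tiny sketch. This is not AutoDock4's search method or force field: the one-dimensional "pose" coordinate, the function names `toy_energy` and `random_search`, and the energy form are all assumptions made up for illustration.

```python
import random
from typing import Tuple


def toy_energy(pose: float) -> float:
    """Hypothetical scoring function: lower is better, minimum at pose = 2.0."""
    return (pose - 2.0) ** 2 - 5.0


def random_search(n_trials: int, seed: int = 0) -> Tuple[float, float]:
    """Sample candidate poses at random and keep the lowest-energy one."""
    rng = random.Random(seed)
    best_pose = rng.uniform(-10.0, 10.0)
    best_energy = toy_energy(best_pose)
    for _ in range(n_trials):
        pose = rng.uniform(-10.0, 10.0)   # explore the conformational space
        energy = toy_energy(pose)         # evaluate the energetics
        if energy < best_energy:
            best_pose, best_energy = pose, energy
    return best_pose, best_energy
```

A real docking engine searches over translations, rotations, and torsions with a far more capable optimizer, but the division of labor is the same: the search proposes poses and the scoring function ranks them.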


AutoDock4 Virtual Docking Engine

http://autodock.scripps.edu/wiki/AutoDock4


Novel Drug Discovery

Human HDAC4

HA3 crystal structure

ZINC03962325


Virtual High-Throughput Screening (VHTS)


VHTS with AutoDock4


Potential Advantages of Cloud Computing for VHTS

  • Affordable access to compute resources (especially for small labs and classrooms).

  • Easy-to-use interface, accessible through the web for non-computer experts; software is maintained by experts.

  • Resources scale with the size of the screening.


Kandinsky: Private Cloud Platform at ORNL

  • Kandinsky, the Systems Biology Knowledgebase Computer, sponsored by the Office of Biological and Environmental Research in the DOE Office of Science

  • 68 nodes × 16 cores/node = 1088 cores

  • 20 Gbps InfiniBand interconnect

  • Designed to support Hadoop applications and gain an understanding of the MapReduce paradigm.

  • 57 nodes for MapReduce tasks

  • 1 tasktracker per node

  • 10 map and 6 reduce tasks per node (16 tasks per node)

  • 570 map tasks and 342 reduce tasks can run simultaneously on Kandinsky
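
The capacity figures on this slide follow from simple multiplication. The short check below reproduces them; the helper name `concurrent_slots` and the constant names are illustrative and do not come from the slides.

```python
def concurrent_slots(nodes: int, slots_per_node: int) -> int:
    """Number of tasks that can run at once across identical nodes."""
    return nodes * slots_per_node


# Figures taken from the slide.
TOTAL_NODES = 68
CORES_PER_NODE = 16
MAPREDUCE_NODES = 57
MAP_SLOTS_PER_NODE = 10
REDUCE_SLOTS_PER_NODE = 6

total_cores = concurrent_slots(TOTAL_NODES, CORES_PER_NODE)
map_slots = concurrent_slots(MAPREDUCE_NODES, MAP_SLOTS_PER_NODE)
reduce_slots = concurrent_slots(MAPREDUCE_NODES, REDUCE_SLOTS_PER_NODE)

print(total_cores, map_slots, reduce_slots)
```

Note that 10 map slots plus 6 reduce slots fill all 16 cores of a node, which is why the slide counts 16 tasks per node.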


Hadoop

  • Scalable

  • Economical

  • Efficient

  • Reliable

    http://hadoop.apache.org/common/docs/current/api/overview-summary.html


MapReduce: Programming Paradigm Used by Hadoop

people.apache.org
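
For readers new to the paradigm, the following is a minimal in-memory sketch of the map, shuffle, and reduce phases using the classic word-count example. It runs in a single process and does not use Hadoop at all; `run_mapreduce`, `word_mapper`, and `sum_reducer` are illustrative names, not Hadoop APIs.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def word_mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word, 1


def sum_reducer(key: str, values: List[int]) -> int:
    """Reduce phase: combine all values that share a key."""
    return sum(values)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Apply the mapper, group values by key (shuffle), then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


counts = run_mapreduce(["a b a", "b c"], word_mapper, sum_reducer)
print(counts)
```

In Hadoop the map calls run in parallel on different nodes and the framework performs the grouping step; the user supplies only the map and reduce functions.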


Current AutoDockCloud Implementation

input = file names needed for each docking

map(input)
{
    copy input files to local working directory;
    run AutoDock4 locally;
    copy result file to HDFS;
}

* Pre-docking set-up and post-docking analysis are currently done manually.

* No reduce function is currently used.
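
The three steps of the map function could be sketched in Python roughly as follows. The slides do not show how AutoDockCloud actually invokes these tools, so this is an assumption-laden sketch: the file names and HDFS paths are hypothetical, `hadoop fs -get` and `hadoop fs -put` are the standard HDFS shell copy commands, and `autodock4 -p <dpf> -l <dlg>` is the usual AutoDock4 command line. The command runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[List[str]], None]


def docking_commands(
    hdfs_inputs: Sequence[str], dpf_name: str, dlg_name: str, hdfs_out_dir: str
) -> List[List[str]]:
    """Build the shell commands for one docking, in execution order."""
    # 1. Copy each input file from HDFS to the local working directory.
    commands = [["hadoop", "fs", "-get", path, "."] for path in hdfs_inputs]
    # 2. Run AutoDock4 locally on the docking parameter file.
    commands.append(["autodock4", "-p", dpf_name, "-l", dlg_name])
    # 3. Copy the result (docking log) file back to HDFS.
    commands.append(["hadoop", "fs", "-put", dlg_name, hdfs_out_dir])
    return commands


def subprocess_runner(command: List[str]) -> None:
    """Default runner: execute the command and raise on a non-zero exit."""
    subprocess.run(command, check=True)


def run_docking(
    hdfs_inputs: Sequence[str],
    dpf_name: str,
    dlg_name: str,
    hdfs_out_dir: str,
    runner: Runner = subprocess_runner,
) -> None:
    """Map-side work for a single docking: fetch inputs, dock, store result."""
    for command in docking_commands(hdfs_inputs, dpf_name, dlg_name, hdfs_out_dir):
        runner(command)
```

Passing `runner=print` shows the commands without executing them, which is a convenient dry run before submitting a full screening.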


Current AutoDockCloud Implementation

ER agonist screening from DUD as benchmark

450× speed-up with 570 available map slots on Kandinsky, the private cloud at ORNL
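
One way to put the reported speed-up in context is to divide it by the number of map slots, which gives the fraction of ideal linear scaling achieved. The slide does not state the baseline or how the speed-up was measured, so the calculation below is only a back-of-the-envelope reading of the two numbers; `parallel_efficiency` is an illustrative helper name.

```python
def parallel_efficiency(speedup: float, slots: int) -> float:
    """Fraction of ideal linear speed-up achieved with the given slot count."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    return speedup / slots


REPORTED_SPEEDUP = 450.0  # from the slide
MAP_SLOTS = 570           # from the slide

efficiency = parallel_efficiency(REPORTED_SPEEDUP, MAP_SLOTS)
print(f"{efficiency:.1%} of ideal linear scaling")
```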


Current AutoDockCloud Implementation

[Figure: Docking enrichment plot for ER agonist using AutoDockCloud and DUD. Axes: percent of known ligands found; percent of ranked database.]
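
An enrichment plot of this kind can be computed from a ranked list in which each entry is flagged as a known ligand or a decoy. The sketch below shows the general calculation under that assumption; it is not the script used for the figure, and `enrichment_curve` is an illustrative name.

```python
from typing import List, Sequence, Tuple


def enrichment_curve(ranked_is_active: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (percent of ranked database, percent of known ligands found) points.

    `ranked_is_active` lists the database from best to worst docking score;
    True marks a known ligand, False marks a decoy.
    """
    total = len(ranked_is_active)
    n_active = sum(ranked_is_active)
    if total == 0 or n_active == 0:
        raise ValueError("need a non-empty ranking with at least one known ligand")
    points = [(0.0, 0.0)]
    found = 0
    for rank, is_active in enumerate(ranked_is_active, start=1):
        if is_active:
            found += 1
        points.append((100.0 * rank / total, 100.0 * found / n_active))
    return points


curve = enrichment_curve([True, False, True, False])
print(curve)
```

A curve that rises well above the diagonal means known ligands are concentrated near the top of the ranking, which is the behavior a useful screening protocol should show.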


Future AutoDockCloud Implementation

input = ligand file from chemical compound database

map(input)
{
    create pdbqt (AutoDock input file) from input;
    run AutoDock4 locally;
    find best scoring ligand structure;
    save structure to HDFS;
    return <score, ligand>;
}

reduce(<score, ligand>)
{
    sort;
    return ranked_database;
}

* Pre-docking and post-docking will be automated and distributed.

* Lower total I/O requirements.
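
The planned map and reduce steps can be mimicked in plain Python to show the data flow. In this sketch the docking itself is replaced by precomputed, made-up pose energies (in AutoDock, lower binding energy means a better score); `map_ligand`, `reduce_rank`, and the ligand names are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

ScoredLigand = Tuple[float, str]


def map_ligand(ligand_id: str, pose_energies: Sequence[float]) -> ScoredLigand:
    """Map step stand-in: keep only the best (lowest) energy for one ligand."""
    return min(pose_energies), ligand_id


def reduce_rank(scored: Iterable[ScoredLigand]) -> List[ScoredLigand]:
    """Reduce step: sort all <score, ligand> pairs, best score first."""
    return sorted(scored)


# Made-up energies standing in for AutoDock4 output.
docked: Dict[str, List[float]] = {
    "ligA": [-5.1, -7.3],
    "ligB": [-9.0, -2.0],
    "ligC": [-6.5],
}

ranked_database = reduce_rank(map_ligand(name, e) for name, e in docked.items())
print(ranked_database)
```

Because each map call returns a single small pair instead of a full result file, only the scores need to travel to the reducer, which is consistent with the lower I/O noted on the slide.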


Future Plans

  • Incorporate additional docking engines

    • AutoDock Vina

      • Less I/O

      • More efficient and accurate algorithm

      • No charge information needed

  • Deploy on a commercial cloud (EC2)

  • Develop a web interface



Questions/Comments

Acknowledgements

  • Dr. Jerome Baudry (advisor)

  • Center for Molecular Biophysics, UT/ORNL

  • Genome Science and Technology, UT

  • Scalable Computing and Leading Edge Innovative Technologies (IGERT)

  • Avinash Kewalramani, ORNL

  • ECMLS and HPDC organizers and participants

