Nimrod-G and Virtual Lab Tools for Data Intensive Computing on Grid: Drug Design Case Study

Rajkumar Buyya

Melbourne, Australia






Contents
  • Introduction
  • Resource Management challenges
  • Nimrod-G Toolkit
    • SPMD/Parameter-Study Creation Tools
    • Grid enabling Drug Design Application
    • Nimrod-G Grid Resource Broker
  • Scheduling Experiments on World Wide Grid
  • Conclusions
A Typical Grid Environment and Players

[Figure: a Grid environment and its players; legible label: Resource Broker]

Grid Characteristics
  • Heterogeneous
    • Resource Types: PC, WS, Clusters
    • Resource Architecture: CPU Arch, OS
    • Applications: CPU/IO/message intensive
    • Users and Owners Requirements
    • Access Price: different for different users, resources and time.
    • Availability: varies from time to time.
  • Distributed
    • Resources
    • Ownership
    • Users
    • Each has its own (private) policies and objectives.
  • This closely resembles the heterogeneity and decentralization present in “human economies” (democratic and capitalist world).
  • Hence, we propose the use of “economics” as a metaphor for resource management and scheduling. It regulates supply and demand for resources and offers resource owners an incentive to contribute resources to the Grid.
Grid Tools for Handling
  • Computational Economy
  • Data Locality
  • Resource Allocation & Scheduling
  • Uniform Access
  • System Management
  • Resource Discovery
  • Network Management
  • Application Development


Nimrod-G: Grid Resource Broker
  • A resource broker for managing, steering, and executing task farming (parametric sweep/SPMD model) applications on Grid based on deadline and computational economy.
  • Based on users’ QoS requirements, our Broker dynamically leases services at runtime depending on their quality, cost, and availability.
  • Key Features
    • A single window to manage & control experiment
    • Persistent and Programmable Task Farming Engine
    • Resource Discovery
    • Resource Trading
    • Scheduling & Predictions
    • Generic Dispatcher & Grid Agents
    • Transportation of data & results
    • Steering & data management
    • Accounting
Parametric Processing
  • Multiple Runs, Same Program, Multiple Data
  • Killer Application for the Grid!

[Figure: “Magic Engine for Manufacturing Humans!” Courtesy: Anand Natrajan, University of Virginia]
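
The "multiple runs, same program, multiple data" pattern can be sketched in a few lines of Python. This is only an illustration of task farming on one machine, not how Nimrod-G dispatches jobs: `simulate` is a hypothetical stand-in for the application, and the parameter values are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(params):
    """Hypothetical stand-in for one run of the application."""
    x, y = params
    return x * x + y


def run_sweep(param_sets, max_workers=4):
    """Run the same function once per parameter set; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate, param_sets))


# Every combination of two made-up parameters becomes one independent run.
param_sets = [(x, y) for x in range(3) for y in (10, 20)]
results = run_sweep(param_sets)
```

Because the runs do not depend on one another, they can be farmed out to any available worker, which is what makes this class of application a natural fit for a Grid.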

Sample P-Sweep Applications
  • Bioinformatics: Drug Design / Protein Modelling
  • Combinatorial Optimization: Meta-heuristic parameter estimation
  • Ecological Modelling: Control Strategies for Cattle Tick
  • Sensitivity experiments on smog formation
  • Data Mining
  • Electronic CAD: Field Programmable Gate Arrays
  • High Energy Physics: Searching for Rare Events
  • Computer Graphics: Ray Tracing
  • Finance: Investment Risk Analysis
  • VLSI Design: SPICE Simulations
  • Civil Engineering: Building Design
  • Network Simulation
  • Crash Simulation
  • Aerospace: Wing Design

Virtual Drug Design: Data Intensive Computing on Grid
  • A Virtual Laboratory for “Molecular Modelling for Drug Design” on Peer-to-Peer Grid.
  • It provides tools for examining millions of chemical compounds (molecules) in the Protein Data Bank (PDB) to identify those having potential use in drug design.
  • In collaboration with:
    • Kim Branson, Structural Biology, Walter and Eliza Hall Institute (WEHI)

Dock input file

Molecule to be screened: S_1.mol2 (the ligand_atom_file entry)

score_ligand yes
minimize_ligand yes
multiple_ligands no
random_seed 7
anchor_search no
torsion_drive yes
clash_overlap 0.5
conformation_cutoff_factor 3
torsion_minimize yes
match_receptor_sites no
random_search yes
. . . . . .
. . . . . .
maximum_cycles 1
ligand_atom_file S_1.mol2
receptor_site_file ece.sph
score_grid_prefix ece
vdw_definition_file parameter/vdw.defn
chemical_definition_file parameter/chem.defn
chemical_score_file parameter/chem_score.tbl
flex_definition_file parameter/flex.defn
flex_drive_file parameter/flex_drive.tbl
ligand_contact_file dock_cnt.mol2
ligand_chemical_file dock_chm.mol2
ligand_energy_file dock_nrg.mol2

Parameterize Dock input file (use Nimrod Tools: GUI/language)

Molecule to be screened: ${ligand_number}.mol2 (the ligand_atom_file entry)

score_ligand $score_ligand
minimize_ligand $minimize_ligand
multiple_ligands $multiple_ligands
random_seed $random_seed
anchor_search $anchor_search
torsion_drive $torsion_drive
clash_overlap $clash_overlap
conformation_cutoff_factor $conformation_cutoff_factor
torsion_minimize $torsion_minimize
match_receptor_sites $match_receptor_sites
random_search $random_search
. . . . . .
. . . . . .
maximum_cycles $maximum_cycles
ligand_atom_file ${ligand_number}.mol2
receptor_site_file $HOME/dock_inputs/${receptor_site_file}
score_grid_prefix $HOME/dock_inputs/${score_grid_prefix}
vdw_definition_file vdw.defn
chemical_definition_file chem.defn
chemical_score_file chem_score.tbl
flex_definition_file flex.defn
flex_drive_file flex_drive.tbl
ligand_contact_file dock_cnt.mol2
ligand_chemical_file dock_chm.mol2
ligand_energy_file dock_nrg.mol2
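
To make the placeholder mechanism concrete, here is a minimal Python sketch that fills a few of the parameterized lines for one job. It imitates the idea with the standard library's `string.Template`; it is not Nimrod's own substitution code. The job values, the ligand number 42, and the decision to leave `$HOME` unresolved (on the assumption that it is expanded later on the execution node) are illustrative choices.

```python
from string import Template

# A few lines taken from the parameterized Dock input file.
DOCK_BASE = """\
score_ligand $score_ligand
random_seed $random_seed
ligand_atom_file ${ligand_number}.mol2
receptor_site_file $HOME/dock_inputs/${receptor_site_file}
"""


def substitute(template_text, values):
    """Fill known placeholders; unknown ones (here $HOME) are left untouched."""
    return Template(template_text).safe_substitute(values)


# Hypothetical parameter values for a single job.
job_values = {
    "score_ligand": "yes",
    "random_seed": 7,
    "ligand_number": 42,
    "receptor_site_file": "ece.sph",
}
dock_run = substitute(DOCK_BASE, job_values)
print(dock_run)
```

Each job gets its own copy of the input file, differing only in the substituted values, so the Dock program itself needs no modification.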

Create Dock Plan File: 1. Define Variables and Their Values

Molecules to be screened: the ligand_number parameter (1 to 2000)

parameter database_name label "database_name" text select oneof "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc" "asinex_pre" "available_chemicals_directory" "inter_bioscreen_s" "inter_bioscreen_n" "inter_bioscreen_n_300" "inter_bioscreen_n_500" "biomolecular_research_institute" "molecular_science" "molecular_diversity_preservation" "national_cancer_institute" "IGF_HITS" "aldrich_300" "molecular_science_500" "APP" "ECE" default "aldrich_300";
parameter score_ligand text default "yes";
parameter minimize_ligand text default "yes";
parameter multiple_ligands text default "no";
parameter random_seed integer default 7;
parameter anchor_search text default "no";
parameter torsion_drive text default "yes";
parameter clash_overlap float default 0.5;
parameter conformation_cutoff_factor integer default 5;
parameter torsion_minimize text default "yes";
parameter match_receptor_sites text default "no";
parameter random_search text default "yes";
. . . . . .
. . . . . .
parameter maximum_cycles integer default 1;
parameter receptor_site_file text default "ece.sph";
parameter score_grid_prefix text default "ece";
parameter ligand_number integer range from 1 to 2000 step 1;
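
The effect of these declarations can be sketched in Python: parameters with a single default stay fixed, while the ranged `ligand_number` produces one job per value. The helper `expand_jobs` is hypothetical, only a subset of the parameters is shown, and the sketch assumes the range `from 1 to 2000` includes both endpoints.

```python
def expand_jobs(fixed, sweep_name, start, stop, step=1):
    """Yield one parameter dict per value of the swept parameter.

    Assumes the range is inclusive of both start and stop.
    """
    for value in range(start, stop + 1, step):
        job = dict(fixed)          # copy the fixed defaults
        job[sweep_name] = value    # add this job's swept value
        yield job


# Defaults copied from the plan file above (subset only).
fixed = {
    "database_name": "aldrich_300",
    "score_ligand": "yes",
    "random_seed": 7,
    "clash_overlap": 0.5,
    "receptor_site_file": "ece.sph",
    "score_grid_prefix": "ece",
}
jobs = list(expand_jobs(fixed, "ligand_number", 1, 2000))
print(len(jobs), jobs[0]["ligand_number"], jobs[-1]["ligand_number"])
```

Under that assumption the plan describes 2000 independent docking jobs, one per molecule.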

Create Dock Plan File: 2. Define the Tasks That Jobs Need to Do

task nodestart
copy ./parameter/vdw.defn node:.
copy ./parameter/chem.defn node:.
copy ./parameter/chem_score.tbl node:.
copy ./parameter/flex.defn node:.
copy ./parameter/flex_drive.tbl node:.
copy ./dock_inputs/get_molecule node:.
copy ./dock_inputs/dock_base node:.
endtask

task main
node:substitute dock_base dock_run
node:substitute get_molecule get_molecule_fetch
node:execute sh ./get_molecule_fetch
node:execute $HOME/bin/dock.$OS -i dock_run -o dock_out
copy node:dock_out ./results/dock_out.$jobname
copy node:dock_cnt.mol2 ./results/dock_cnt.mol2.$jobname
copy node:dock_chm.mol2 ./results/dock_chm.mol2.$jobname
copy node:dock_nrg.mol2 ./results/dock_nrg.mol2.$jobname
endtask


Use Nimrod-G

Submit & Play!

A Nimrod/G Monitor

[Figure: Nimrod/G monitor showing Legion hosts and Globus hosts; Bezek is in both the Globus and Legion domains.]

Adaptive Scheduling Algorithms

[Figure: scheduling cycle; legible steps:]
  • Discover Resources
  • Establish Rates
  • Compose & Schedule
  • Distribute Jobs
  • Evaluate & Reschedule
  • Meet requirements? (Remaining Jobs, Deadline, & Budget)
  • Discover More Resources
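
One way to read the scheduling cycle above is as a loop that re-plans at fixed intervals. The Python sketch below is a simplified, cost-minimising illustration under stated assumptions, not the Nimrod-G algorithm itself: the `Resource` fields, the fixed completion rates, the per-job prices, and the 15-minute interval are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    jobs_per_hour: float   # assumed completion rate ("establish rates")
    price_per_job: float   # assumed access price in G$


def select_resources(resources, jobs_left, hours_left):
    """Add the cheapest resources until their combined rate meets the deadline."""
    needed_rate = jobs_left / hours_left
    chosen, rate = [], 0.0
    for r in sorted(resources, key=lambda r: r.price_per_job):
        if rate >= needed_rate:
            break
        chosen.append(r)
        rate += r.jobs_per_hour
    return chosen


def run_experiment(resources, jobs, deadline_hours, budget, interval=0.25):
    """Re-plan every `interval` hours; return (jobs_left, elapsed_hours, spent)."""
    elapsed, spent = 0.0, 0.0
    while jobs > 0 and elapsed < deadline_hours:
        chosen = select_resources(resources, jobs, deadline_hours - elapsed)
        for r in chosen:  # distribute jobs for this interval
            done = min(jobs, int(r.jobs_per_hour * interval))
            affordable = int((budget - spent) // r.price_per_job)
            done = min(done, affordable)  # never exceed the budget
            jobs -= done
            spent += done * r.price_per_job
        elapsed += interval  # evaluate progress, then reschedule
    return jobs, elapsed, spent


resources = [Resource("cheap", 40, 10), Resource("fast", 120, 30)]
jobs_left, elapsed, spent = run_experiment(
    resources, jobs=100, deadline_hours=2.0, budget=3000
)
print(jobs_left, elapsed, spent)
```

In this toy run the expensive resource is used only while the cheap one alone cannot meet the deadline, then dropped once the remaining work fits. A time-minimising strategy would instead use every affordable resource in each interval.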

Scheduling Experiment on World Wide Grid Testbed

[Figure: WW Grid testbed map; legible site labels: AEI/Germany, Lecce/Italy]

Deadline and Budget Constrained Scheduling Experiment
  • Workload:
    • 165 jobs, each needing 5 minutes of CPU time
  • Deadline: 2 hrs; budget: 396000 units
  • Strategy: minimise time or minimise cost
  • Execution cost under each strategy:
    • Optimise Cost: 115200 (G$) (finished in 2 hrs)
    • Optimise Time: 237000 (G$) (finished in 1.25 hrs)
    • In this experiment, the time-optimised run cost roughly double (about 2.06 times) the cost-optimised run.
    • Users can now trade off time against cost.
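
The trade-off can be checked with a few lines of arithmetic. The figures below are the ones reported above; the derived quantities (cost ratio, time saved, share of budget, extra cost per hour saved) are simple calculations added for illustration, and they treat the budget units and G$ as the same unit, which is an assumption.

```python
jobs, cpu_minutes_per_job = 165, 5
budget = 396_000

# (cost in G$, completion time in hours) as reported for each strategy
runs = {"cost-optimised": (115_200, 2.0), "time-optimised": (237_000, 1.25)}

total_cpu_hours = jobs * cpu_minutes_per_job / 60
cost_c, time_c = runs["cost-optimised"]
cost_t, time_t = runs["time-optimised"]

cost_ratio = cost_t / cost_c                    # how much more the fast run costs
time_saving = 1 - time_t / time_c               # fraction of wall-clock time saved
budget_share = {name: cost / budget for name, (cost, _) in runs.items()}
extra_per_hour_saved = (cost_t - cost_c) / (time_c - time_t)

print(f"total CPU time: {total_cpu_hours} h")
print(f"cost ratio: {cost_ratio:.2f}, time saved: {time_saving:.1%}")
print({name: f"{share:.1%}" for name, share in budget_share.items()})
print(f"extra cost per hour saved: {extra_per_hour_saved:.0f} G$")
```

Both runs stay within the budget; the time-optimised run spends about twice as much to finish 45 minutes earlier.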

World Wide Grid (WWG)

[Figure: map of the WWG testbed. Region labels: North America, South America. Middleware label: Globus +. Legible site labels:]
  • UVa: Linux cluster
  • UD: Linux cluster
  • UTK: Linux cluster
  • Monash Uni.: Linux cluster
  • Tokyo I-Tech.
  • ETL, Tsukuba
  • ZIB/FUB: T3E/Mosix
  • Cardiff: Sun E6500
  • Paderborn: HPCLine
  • Lecce: Compaq SC
  • CNR: Cluster
  • Calabria: Cluster
  • CERN: Cluster
  • Poznan: SGI/SP2
  • Chile: Cluster
  • Other labels whose site is not recoverable: Solaris WS, Linux cluster

Conclusions
  • P2P and Grid computing are emerging as a next-generation computing platform for solving large-scale problems through the sharing of geographically distributed resources.
  • Resource management is a complex undertaking, as systems need to be adaptive, scalable, competitive,…, and driven by QoS.
  • We proposed a framework based on “computational economies” and discussed several economic models for resource allocation and for regulating supply and demand for resources.
  • Scheduling experiments on the World Wide Grid demonstrate the Nimrod-G broker's ability to dynamically lease or rent services at runtime based on their quality, cost, and availability, depending on consumers' QoS requirements.
  • Easy-to-use tools for composing applications to run on the Grid are essential for attracting the application community and getting it on board.
  • An economic paradigm for QoS-driven resource management is essential to push P2P/Grids into mainstream computing!
Download Software & Information
  • Nimrod & Parametric Computing:
  • Economy Grid & Nimrod/G:
  • Virtual Laboratory/Virtual Drug Design:
  • Grid Simulation (GridSim) Toolkit (Java based):
  • World Wide Grid (WWG) testbed:
    • Looking for new volunteers to grow 
      • Please contact me to barter your & our machines!
  • Want to build on our work/collaborate:
    • Talk to me now or email: