
NFF. Leeds Institute of Molecular Medicine. University of Leeds. 2009

Using NGS to run a massive in silico drug screening against RAS. Dr. Narcis Fernandez-Fuentes, RCUK Academic Fellow, Computational Biology Group, Section of Experimental Therapeutics, University of Leeds.


Presentation Transcript


  1. Using NGS to run a massive in silico drug screening against RAS. Dr. Narcis Fernandez-Fuentes, RCUK Academic Fellow, Computational Biology Group, Section of Experimental Therapeutics, University of Leeds

  2. Overview
     • Biological and therapeutic relevance of the RAS family
     • How to run a large in silico screening using NGS infrastructure (my version)
     • Making files available to several NGS cores: SRB
     • Keeping NGS cores ‘busy’
     • Copying back your stuff
     • Benefits for my research

  3. RAS proteins are small GTPases that function as signaling switches that control normal cell growth and differentiation.
     [Figure: the RAS switch between active Ras-GTP and inactive Ras-GDP. Taken from Downward J, Nature Reviews Cancer 3, 11-22, 2003]

  4. In silico screening
     Inputs: library of chemical compounds (drugs) and RAS protein (receptor)
     1. Docking. Software: Autodock, Glide, Gold, …
     2. Scoring. Software: x-Score, eHiTs, …
     [Figure labels: PLCe, PI3K, RAL, RAF, P120GAP]

  5. A few numbers…
     • Time required per docking run:
       • ~5 hours
       • 2.5M hours
       • 102,000 days
       • 56 years
     • Receptor
       • RAS structure
       • Protein rigid during docking
       • Mg was kept during screening
     • Drug libraries
       • NCIDS set: 1,990 (~140,000)
       • ZINC7 0.8 Lead-like set: 128,085
       • ZINC7 0.8 Drug-like set: 83,331
       • ZINC7 0.8 Fragment-like set: 62,175
       • Timtec library ‘Actiprobe 25K’: 53,298
       • Chembridge ‘EXPRESS pick’: 156,268
       • ∑ = 487,147 (~ >3 millions)
     • Autodock parameters
       • 25,000,000 energy evaluations
       • 200 LGA runs
       • 300 initial population
       • Flexible ligand representation
     • Files required per docking run:
       • Receptor file (protein): ~500 KB
       • Ligand file (drug): 100 KB-2 MB
       • Parameters file: ~200 KB
       • Grid file: ~105 KB
       • N(*) interaction map files: ~700 KB
       • It generates 1 output file
       • On average 12 files = 5.9M files
       (*) Where N = number of different atom types
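The totals on this slide can be rechecked with a few lines of integer arithmetic. The sketch below only uses figures from the slide and assumes every run takes the quoted 5 hours on a single CPU, one run after another. Under that assumption the hours, days and file count come out close to the slide's values, but the years come out near 278 rather than 56, so the 56-year figure presumably rests on an assumption the slide does not state (about five runs in parallel would give it, but that is a guess). The six listed library sizes also sum to 485,147, which is 2,000 less than the stated total.

```shell
#!/bin/bash
# Figures copied from the slide; integer arithmetic only.
compounds=487147      # total number of compounds stated on the slide
hours_per_run=5       # approximate time per docking run
files_per_run=12      # average number of files per docking run

# Sum of the six library sizes as listed on the slide.
listed_sum=$(( 1990 + 128085 + 83331 + 62175 + 53298 + 156268 ))

# Serial estimate: one CPU, one run after another (an assumption).
total_hours=$(( compounds * hours_per_run ))
total_days=$(( total_hours / 24 ))
total_years=$(( total_days / 365 ))
total_files=$(( compounds * files_per_run ))

echo "listed library sizes sum to: $listed_sum"
echo "serial CPU hours: $total_hours"
echo "serial days:      $total_days"
echo "serial years:     $total_years"
echo "files to manage:  $total_files"
```

With these inputs the script reports 2,435,735 hours, 101,488 days, 278 years and 5,845,764 files.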

  6. I. Moving data from local machine to NGS
     II. Submitting jobs: keeping NGS computers ‘busy’
     III. Retrieving results
     [Diagram labels: SRB - Leeds; ngs.leeds.ac.uk, ngs.rl.ac.uk, ngs.wmin.ac.uk, ngs.oerc.ox.ac.uk]

  7. I. Moving data from local machine to NGS (SRB - Leeds)
     1. Generate all needed files locally
        ~/DL (LL, FL, NCIDS, TimTec, Chembridge)
        ~/DL/RECEPTOR (protein, parameter, grid, maps, etc.)
        ~/DL/COMPOUNDS (compound files)
        ~/DL/list (text file):
          comp1 param_file map_file1 map_file2 …
          comp2 param_file map_file1 map_file2 …
          …
     2. Transfer files to SRB: accessible to any NGS cluster
        • Need an SRB account
        • Install the SRB client on your local machine
        • Using the SRB command: # Srsync -r DL s:DL
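The list file described on this slide could be generated along the following lines. This is a minimal sketch, not the author's script: the function name `make_list` is hypothetical, and the file naming is an assumption based on the submission script shown later in the deck (ligands as `<id>.pdbqt`, parameter files as `<id>_<receptor>.dpf`) plus the assumption that the interaction maps are the `*.map` files in the receptor directory.

```shell
#!/bin/bash
# make_list COMPOUNDS_DIR RECEPTOR_DIR RECEPTOR_NAME
# Prints one line per compound: ligand file, its parameter file, then the map files.
# Assumed naming (hypothetical): ligands are <id>.pdbqt, parameter files are
# <id>_<receptor>.dpf, and map files are RECEPTOR_DIR/*.map.
make_list() {
    local compounds_dir="$1" receptor_dir="$2" receptor="$3"
    local maps="" m f id
    for m in "$receptor_dir"/*.map; do
        [ -e "$m" ] || continue            # no map files: the glob stayed literal
        maps="$maps $(basename "$m")"
    done
    for f in "$compounds_dir"/*.pdbqt; do
        [ -e "$f" ] || continue            # no ligand files
        id=$(basename "$f" .pdbqt)
        echo "${id}.pdbqt ${id}_${receptor}.dpf${maps}"
    done
}
```

A call such as `make_list ~/DL/COMPOUNDS ~/DL/RECEPTOR 1ras > ~/DL/list` would then write the list before the `Srsync` step.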

  8. II. Submitting jobs: keeping NGS computers ‘busy’ (SRB - Leeds)
     1. A different list file was copied to each NGS cluster
     2. A Perl script was used to monitor the queue and submit new jobs (cron job: 5 min):
          qstat
          if ‘Q’ state jobs
            quit
          else
            Sinit
            n_jobs_to_submit = (n_slots - n_jobs_R)
            open (LIST)  # list
            while i < n_jobs_to_submit
              Smv comp(i)
              write submission script
              qsub
              i++
            update(LIST)
            close(LIST)
            Sexit
          exit
     Important! Have a valid proxy certificate, otherwise SRB will fail: upload your own (myproxy)
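The decision logic in the pseudocode above can be sketched as three small shell functions. This is an illustration under stated assumptions, not the author's Perl script: the function names are hypothetical, and `count_state` assumes the default PBS/Torque `qstat` layout with two header lines and the job state in the fifth column. The SRB calls (`Sinit`, `Smv`, `Sexit`) and `qsub` are left out so that the logic can be checked without a cluster.

```shell
#!/bin/bash
# count_state STATE: read qstat output on stdin and count jobs in that state.
# Assumes the default PBS/Torque layout: two header lines, state in column 5.
count_state() {
    awk -v s="$1" 'NR > 2 && $5 == s { n++ } END { print n + 0 }'
}

# jobs_to_submit SLOTS RUNNING QUEUED: submit nothing while jobs are still
# waiting in the queue, otherwise fill the free slots (n_slots - n_jobs_R).
jobs_to_submit() {
    local slots="$1" running="$2" queued="$3"
    local n=$(( $slots - $running ))
    if [ "$queued" -gt 0 ] || [ "$n" -lt 0 ]; then
        n=0
    fi
    echo "$n"
}

# take_from_list N LISTFILE: print the first N entries and remove them from the
# list, so that the next cron run starts where this one stopped.
take_from_list() {
    local n="$1" list="$2"
    [ "$n" -gt 0 ] || return 0
    head -n "$n" "$list"
    tail -n "+$(( $n + 1 ))" "$list" > "$list.tmp" && mv "$list.tmp" "$list"
}
```

In a cron-driven script the output of `qstat` would be piped into `count_state R` and `count_state Q`, the two counts passed to `jobs_to_submit` together with the number of slots, and each line returned by `take_from_list` turned into one submitted job.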

  9. Crontab

       ngs0655@ngs:~> crontab -l
       ## CHECK QUEUE AND SUBMIT NEW JOBS
       */5 * * * * cd /gpfs/scratch/ngs0655/DOCKINGS_5; perl check_queue_cron.pl list /gpfs/scratch/ngs0655/DOCKINGS_5 queue.log > /dev/null 2>&1
       ## CLEAN LOGS
       30 * * * * cd /gpfs/scratch/ngs0655/DOCKINGS_5; \ls *.o* > list.log; perl clean_logs.pl < list.log > /dev/null 2>&1
       ## COPY FINISHED JOBS TO MY LOCAL MACHINE
       10 * * * * perl /gpfs/scratch/ngs0655/DOCKINGS_5/clean_dir.pl > /dev/null 2>&1

     Example of a submission script

       #!/bin/bash
       #
       #PBS -S /bin/bash
       #PBS -j oe
       #PBS -N pradera
       #PBS -l walltime=15:00:00
       scratchdir=/gpfs/scratch/ngs0655/DOCKINGS_3
       input=08074014_1ras.dpf
       output=08074014_1ras.dlg
       compress=08074014_1ras.dlg.gz
       drug=08074014.pdbqt
       echo STARTED $(date)
       echo $scratchdir $input $output $compress
       # Stop if the scratch directory is missing, so nothing is deleted elsewhere
       cd $scratchdir || exit 1
       /usr/ngs/AUTODOCK_4_0_1 -p $input -l $output
       gzip $output
       mv -f $compress ./FINISHED
       rm -f $drug $input
       echo FINISHED $(date)
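The "write submission script" step of the queue monitor could be a small template function like the one below. This is a sketch, not the author's code: `write_job_script` is a hypothetical name, the file naming follows the example script on this slide, and the job name `dock_<id>` is an arbitrary choice made for illustration.

```shell
#!/bin/bash
# write_job_script ID RECEPTOR SCRATCHDIR: print a PBS script for one compound.
# File naming follows the example script (<id>.pdbqt, <id>_<receptor>.dpf/.dlg);
# the job name "dock_<id>" is a hypothetical choice.
write_job_script() {
    local id="$1" receptor="$2" scratch="$3"
    local base="${id}_${receptor}"
    # Unquoted heredoc: ${...} is filled in now, \$(date) is left for run time.
    cat <<EOF
#!/bin/bash
#PBS -S /bin/bash
#PBS -j oe
#PBS -N dock_${id}
#PBS -l walltime=15:00:00
echo STARTED \$(date)
cd "${scratch}" || exit 1
/usr/ngs/AUTODOCK_4_0_1 -p ${base}.dpf -l ${base}.dlg
gzip ${base}.dlg
mv -f ${base}.dlg.gz ./FINISHED
rm -f ${id}.pdbqt ${base}.dpf
echo FINISHED \$(date)
EOF
}
```

For example, `write_job_script 08074014 1ras /gpfs/scratch/ngs0655/DOCKINGS_3 > job.sh` followed by `qsub job.sh` would queue the compound used in the example script.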

  10. III. Retrieving results
      1. Create a public/private key pair with an empty passphrase between the NGS cores and my local machine: this allows scp to copy data without typing the password
      2. A Perl script was used to monitor finished tasks and transfer files (cron job: 30 min):
           cd output_dir
           ls *.dlg.gz > output_files
           open (output_files)
           scp file_n to local_machine
           exit

        #!/usr/bin/perl -w
        # MONITOR OUTPUT DIRECTORY AND IF THERE ARE ANY FILES TRANSFER THEM TO MY
        # LOCAL MACHINE USING A PUBLIC/PRIVATE PASSPHRASELESS KEY
        use strict;
        my $scp = "/usr/bin/scp";
        # REMOTE DIR
        my $remotedir = "narcis\@imm-pc2171.leeds.ac.uk:/scratch/data/LEAD-LIKE";
        # LOCAL DIRECTORY
        my $localdir = "/gpfs/scratch/ngs0655/DOCKINGS_3/FINISHED";
        opendir(CONF, $localdir) or die "Cannot open $localdir: $!";  # READ OUTPUT DIR
        my @conffiles = readdir(CONF);
        closedir(CONF);
        foreach my $file (@conffiles) {
            if ($file =~ /\.dlg\.gz$/) {  # GOOD ONE
                my $file2 = $localdir . "/" . $file;
                # TRANSFER FILE, AND DELETE AFTERWARDS ONLY IF THE COPY SUCCEEDED
                if (system("$scp -rp -P 22 $file2 $remotedir") == 0) {
                    unlink $file2;
                }
            }
        }

  11. • April 2008
        • Got my NGS account
        • Initial tests, setting up scripts, etc.
      • May 2008 – December 2008
        • In silico screening
      • January 2009 onwards
        • In silico screening done
        • Score hits and select most promising binders
        • Starting experimental validation
      [Timeline axis: April 2008 to April 2009, by month]
      As of today: 203 compounds have already been tested (SPR), resulting in 18 validated binders

  12. • Leeds Institute of Molecular Medicine: Prof. Terry Rabbitts, Dr. Tomo Tanaka, Dr. David Perez, Dr. Donna Petch
      • School of Chemistry: Prof. Peter Johnson, Dr. Colin Fishwick, Jayakanth Kankanala (JK)
      • ngs@oxford: Matteo Turilli
      • ngs@leeds: Shiv Kaushal, Jason Lander
      • Faculty of Biological Science: Prof. Steven Homans, Richard Malham
      • ngs@westminster: Thierry Delaitre
