

DockCrunch and Beyond... The future of receptor-based virtual screening

Bohdan Waszkowycz, Tim Perkins & Jin Li

Protherics Molecular Design Ltd, Macclesfield, UK


Outline

  • Structure-based virtual screening

    • an achievable (and possibly useful) tool for drug discovery

    • the DockCrunch validation study

  • Protherics’ experience since DockCrunch

    • methods: making VS a routine task

    • analysis: getting the most from your data

    • the future (and beyond)



Virtual Screening

[Diagram; surviving labels: molecular docking; screen smaller focused libraries]

Why Use Molecular Docking?

  • Most detailed representation of binding site

    • overcomes simplifications of pharmacophores

    • identify both conservative and novel solutions

    • impetus for de novo design/optimisation

  • Broad range of analyses applicable

    • diverse scoring/selection criteria

  • Quality/throughput of available methods

    • good enough, despite technical limitations


  • Validation study for large-scale virtual screening

    • flexible ligand/rigid receptor docking

    • PRO_LEADS docking code using ChemScore scoring function

    • 1.1M druglike ACD-SC compounds

    • dock versus oestrogen receptor (agonist and antagonist structures)

    • collaboration with SGI

Oestradiol:Oestrogen Receptor Complex

[Figure panels: agonist receptor; antagonist receptor]

DockedEnergy Profiles

  • Achieve good separation in terms of predicted binding affinity

DockCrunch Results

  • Demonstrated technical feasibility

    • 1.1M compounds docked in 6 days on a 64-processor Origin

    • implemented automated pre- and post-processing

  • Demonstrated potential for lead identification

    • successful discrimination of seeded known hits

    • activity for 21 out of 37 assayed compounds

    • ER binding affinities down to Ki = 7 nM

    • novel non-steroidal chemistries

Since DockCrunch...

  • VS established as a routine CAMD task:

    • 2.2M structures docked in DockCrunch

    • 1.5M docked versus in-house target

    • 2.5M docked to date in external contracts

      • project 1: 0.25M Dec 2000

      • project 2: 0.25M Jan 2001

      • project 3: 1M Feb 2001

      • project 4: 1M March-April 2001

      • project 5: 0.5M to do in May...

      • diverse targets/databases/project objectives

Virtual Screening within Prometheus

  • Database preparation

    • e.g. salt removal, protonation

  • Database pre-filtering

    • select drug-like profile

  • Receptor-ligand docking

    • predict binding mode/affinity

  • Graphical browsing, subset selection
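The preparation and pre-filtering stages can be illustrated with a minimal Python sketch. It is not the Prometheus implementation. Salt removal is approximated by keeping the largest dot-separated SMILES component, and the drug-like profile uses Lipinski-style thresholds as a stand-in, because the slides do not state the actual criteria. The `Compound` record, its property fields (assumed to be precomputed for the parent structure) and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    smiles: str
    mol_weight: float   # assumed precomputed for the parent structure
    logp: float
    h_donors: int
    h_acceptors: int


def strip_salts(smiles: str) -> str:
    """Crude salt removal: keep the longest dot-separated SMILES component."""
    return max(smiles.split("."), key=len)


def is_druglike(c: Compound) -> bool:
    """Lipinski-style thresholds, used here only as an illustrative profile."""
    return (c.mol_weight <= 500 and c.logp <= 5
            and c.h_donors <= 5 and c.h_acceptors <= 10)


def prepare(compounds):
    """Strip counter-ions, then keep only compounds matching the profile."""
    kept = []
    for c in compounds:
        cleaned = Compound(strip_salts(c.smiles), c.mol_weight, c.logp,
                           c.h_donors, c.h_acceptors)
        if is_druglike(cleaned):
            kept.append(cleaned)
    return kept
```

A production pipeline would use a cheminformatics toolkit for fragment handling and protonation rather than string operations.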


  • Tabu search + extended ChemScore function

    • robust prediction of binding free energy

    • 85% success rate achieved across diverse test set

  • Pre-calculated grids for energies/neighbour lists

    • defines extent of binding site

    • automatically/graphically defined

  • Selection of PRO_LEADS docking protocol

    • use standard protocol across all receptors

    • specific constraints or modified energy terms available if desired
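The tabu search idea named above can be sketched on a toy problem. This is a generic, minimal tabu search in Python, not the PRO_LEADS code and not the ChemScore function: the "energy" is a hypothetical one-dimensional lookup table, moves are steps to adjacent indices, and the tabu list simply remembers recently visited positions. The point it illustrates is that accepting the best non-tabu move, even when it goes uphill, lets the search leave a local minimum.

```python
from collections import deque


def tabu_search(energy, start, neighbours, n_steps=200, tabu_size=10):
    """Move to the best non-tabu neighbour each step; remember the best state."""
    current = start
    best, best_e = start, energy(start)
    tabu = deque([start], maxlen=tabu_size)  # recently visited states
    for _ in range(n_steps):
        # Aspiration rule: a tabu state is allowed if it beats the best so far.
        allowed = [c for c in neighbours(current)
                   if c not in tabu or energy(c) < best_e]
        if not allowed:
            break
        current = min(allowed, key=energy)  # may be an uphill move
        tabu.append(current)
        if energy(current) < best_e:
            best, best_e = current, energy(current)
    return best, best_e


# Toy landscape (hypothetical): local minimum at index 2, global minimum at 7.
landscape = [5, 3, 1, 2, 4, 3, 0, -2, -1, 3, 6]
best_index, best_energy = tabu_search(
    energy=lambda i: landscape[i],
    start=0,
    neighbours=lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)],
)
# A plain downhill walk from index 0 would stop at index 2;
# here the search ends with best_index == 7 and best_energy == -2.
```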

Example of Grid Definition

cAMP-dependent kinase (1YDS)

contact surface coloured by lipophilicity

Docking Throughput

  • Standard protocols take 1–5 mins/ligand

    • e.g. typical VS run at ~4 min for 3M tabu steps

    • 250k compounds/week on a 100-processor Linux cluster (VA Linux, 750 MHz PIII)

  • PLUNDER script for parallelization

    • automatic processing of ligand batches

    • balances processor workload

    • works across heterogeneous architectures

    • supplies running time statistics

    • handles hardware failures
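The throughput figures above can be checked with simple arithmetic, and the workload balancing can be illustrated with a small simulation. The Python sketch below is not the PLUNDER script. It assumes one common approach, in which each ligand batch goes to whichever processor becomes free first, and all names, batch sizes and relative speeds are hypothetical.

```python
import heapq


def weekly_throughput(n_processors, minutes_per_ligand):
    """Ligands docked per week if every processor runs continuously."""
    minutes_per_week = 7 * 24 * 60
    return n_processors * minutes_per_week / minutes_per_ligand


def dispatch(batch_costs, speeds):
    """Give each batch to the processor that becomes free first.

    batch_costs: work units per batch; speeds: work units per minute for
    each processor. Returns the finish time (minutes) of every processor.
    """
    free_at = [(0.0, p) for p in range(len(speeds))]
    heapq.heapify(free_at)
    finish = [0.0] * len(speeds)
    for cost in batch_costs:
        t, p = heapq.heappop(free_at)   # earliest-available processor
        t += cost / speeds[p]
        finish[p] = t
        heapq.heappush(free_at, (t, p))
    return finish


# 100 processors at ~4 min/ligand gives 252,000 ligands per week,
# consistent with the quoted 250k compounds/week.
per_week = weekly_throughput(100, 4)

# Two processors, the second twice as fast: both finish after 16 minutes.
finish_times = dispatch([4] * 12, [1.0, 2.0])
```

Because faster processors return for new batches more often, they automatically take a larger share of the work, which is one way to cope with heterogeneous hardware.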

Data Analysis and Subset Selection

  • Intrinsic problems of scoring functions:

    • cannot parameterize all critical interactions

    • try to take account of induced fit effects

    • calibrated only versus good binders

    • ignore co-operativity in binding

  • When applied to random datasets:

    • predicted affinity is typically normally distributed

    • overestimates binding affinity of random set

  • Consequence: energy alone is not ideal for subset selection

Achieving Better Selection

  • Need to supplement scoring function

    • consensus scoring schemes

  • Explore more fundamental descriptors of receptor:ligand complementarity

    • capture characteristics of diverse receptor types

    • assess deficiencies of existing scoring functions

    • use as simple filters or as pseudo energy terms
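One simple form of consensus scoring is to rank compounds under each scoring function separately and then average the ranks. The slides do not say which consensus scheme was used, so the Python sketch below is only an illustration; the score tables and compound names are hypothetical, and lower (more negative) scores are assumed to be better.

```python
def ranks(scores):
    """Rank compounds by score; rank 1 is the lowest (best) score."""
    order = sorted(scores, key=scores.get)
    return {name: i + 1 for i, name in enumerate(order)}


def consensus_rank(score_tables):
    """Order compounds by their mean rank across several scoring functions."""
    per_function = [ranks(table) for table in score_tables]
    names = list(score_tables[0])
    mean_rank = {n: sum(r[n] for r in per_function) / len(per_function)
                 for n in names}
    return sorted(names, key=lambda n: (mean_rank[n], n))


# Three hypothetical scoring functions applied to the same three compounds.
function_a = {"c1": -40.0, "c2": -35.0, "c3": -20.0}
function_b = {"c1": -8.0, "c2": -9.5, "c3": -3.0}
function_c = {"c1": -30.0, "c2": -10.0, "c3": -25.0}
ordering = consensus_rank([function_a, function_b, function_c])
# ordering == ["c1", "c2", "c3"]: c1 is ranked first by two functions
# and second by the third, so it has the best mean rank.
```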

Enrichment Rates

Effect of different selection criteria for ER set for recovery of seeded compounds
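Recovery of seeded compounds is commonly summarised as an enrichment factor: the hit rate in the selected subset divided by the hit rate in the whole database. The Python sketch below shows that standard calculation. The plotted values from the slide are not recoverable, so the data here are purely hypothetical.

```python
def enrichment_factor(ranked_ids, known_actives, fraction):
    """Hit rate in the top `fraction` of a ranked list relative to the
    hit rate of the whole list."""
    n_total = len(ranked_ids)
    n_selected = max(1, round(n_total * fraction))
    hits = sum(1 for c in ranked_ids[:n_selected] if c in known_actives)
    # (hits / n_selected) / (n_actives / n_total), rearranged
    return (hits * n_total) / (n_selected * len(known_actives))


# Hypothetical ranking of 1000 compounds with 10 seeded actives,
# 5 of which land in the top 1%.
actives = {f"active{i}" for i in range(10)}
ranked = ([f"active{i}" for i in range(5)]
          + [f"decoy{i}" for i in range(5)]
          + [f"active{i}" for i in range(5, 10)]
          + [f"decoy{i}" for i in range(5, 990)])
ef_top_1_percent = enrichment_factor(ranked, actives, 0.01)
# (5 / 10) / (10 / 1000) = 50
```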

Requirements for Analysis Package

  • VS generates huge data output

    • want to be able to browse through entire dataset

  • Real-time navigation of large datasets

    • graphing property distributions

    • selections based on property filters

    • browsing of 3D models within selections

    • initiating additional property calculations

    • data transformations

    • writing subset/reports


Approach to Analysis

  • 1. Preliminary exploration

    • browse property distributions

    • comparisons with known ligands

  • 2. Initial elimination of poor structures

    • DockedEnergy, component energies

    • DE corrected for size/functionality

    • receptor:ligand steric complementarity

    • polar/lipophilic surface complementarity
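Docking energies tend to become more favourable as ligands get larger, so a size correction helps when comparing compounds. The slides do not say how DockedEnergy was corrected. The Python sketch below assumes one simple option: fit a straight line of energy against heavy-atom count and use the residual as the corrected value. The function name and the choice of heavy-atom count as the size measure are assumptions.

```python
def size_corrected(energies, heavy_atoms):
    """Residual of each energy from a least-squares line of energy vs size.

    A negative residual means the compound scores better than expected
    for its size.
    """
    n = len(energies)
    mean_x = sum(heavy_atoms) / n
    mean_y = sum(energies) / n
    sxx = sum((x - mean_x) ** 2 for x in heavy_atoms)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(heavy_atoms, energies))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(heavy_atoms, energies)]
```

Other corrections, such as energy per heavy atom, would fit the same place in the workflow.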

Approach to Analysis

  • 3. Further filtering to define focused subsets

    • tighter 2D property filters

    • clustering by 2D chemistry

    • presence of key 3D binding interactions

      • specific H-bonds, specific lipophilic contacts, pocket occupancy, volume overlap with reference ligand/fragment, etc.

    • similarity/diversity of 3D binding mode

      • 3D similarity descriptors

    • final ranking by DockedEnergy or hybrid energy/complementarity scoring function

DockedEnergy vs Size

Complementarity Space

ER and FXa datasets

Addressing More Difficult Cases - COX2

Knowns show clustering in property space despite modest DockedEnergy

Improvements in Docking Function

  • original docking function

    • some misdocked knowns

  • new docking function

    • more consistent docking

    • positive shift in random energies

Comparison of filters in subset selection

[Figure labels: 2D filters, 87% pass; energy filters, 37% pass; complementarity filters, 22% pass]

  • Initial filtering to ~10%

    • energy filters

    • complementarity

    • 2D properties

  • Selection of final ~1% subset

    • 3D structural features

    • preferred binding motifs

    • 2D/3D diversity
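Comparing filters in this way amounts to measuring how many compounds each filter passes on its own and how many survive all of them together. The Python sketch below shows that bookkeeping on a hypothetical four-compound table. The field names and thresholds are invented for illustration and are not the criteria used in the study.

```python
def apply_filters(records, filters):
    """Return each filter's individual pass fraction and the records
    that pass every filter."""
    n = len(records)
    pass_rates = {name: sum(1 for r in records if test(r)) / n
                  for name, test in filters.items()}
    survivors = [r for r in records
                 if all(test(r) for test in filters.values())]
    return pass_rates, survivors


# Hypothetical docking results and thresholds.
records = [
    {"id": "a", "energy": -45, "complementarity": 0.8, "mol_weight": 350},
    {"id": "b", "energy": -20, "complementarity": 0.9, "mol_weight": 300},
    {"id": "c", "energy": -50, "complementarity": 0.4, "mol_weight": 420},
    {"id": "d", "energy": -42, "complementarity": 0.7, "mol_weight": 610},
]
filters = {
    "2D properties": lambda r: r["mol_weight"] <= 500,
    "energy": lambda r: r["energy"] <= -40,
    "complementarity": lambda r: r["complementarity"] >= 0.6,
}
pass_rates, survivors = apply_filters(records, filters)
# Each filter alone passes 3 of 4 compounds, but only "a" passes all three.
```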


  • Established VS as a routine CAMD task

    • focused software development

    • achieved success in drug discovery projects

  • VS is more than a black box

    • data mining is worthwhile

    • explore receptor-ligand complementarity to achieve good subset selection and point towards better scoring functions

Future Directions for VS

  • Exploit expanding computing resource

    • improved docking/scoring functions

    • improved receptor representations

  • Broader application of VS

    • evaluation of druggability of early targets

    • screening of very large virtual libraries

    • routine screening across protein families

    • DMPK issues

Thanks to:

  • Tim Perkins

  • Martin Harrison

  • Richard Sykes

  • Carol Baxter

  • Richard Hall

  • Chris Murray

  • David Frenkel

  • Jin Li

  • David Sheppard


