Developing & Benchmarking Large-scale Docking (LSD) Pipeline

Niu Huang, 02/17/2004

LSD pipeline: Model Building (ModBase/PDB), Binding Site Refinement (PLOP/Modeller), LigBase, Post-docking Refinement (PLOP), Ligand Docking (DOCK3.5.5.4), Central Database System. Where are we now?

Presentation Transcript

LSD pipeline

Model Building

Binding Site Refinement

Post-docking Refinement

Ligand Docking

Central Database System

Docking pipeline

Target Protein






Test case (from J. Med. Chem., McGovern & Shoichet, 2003)

Expert vs automated docking

  • Enrichment plots comparing the performance of an expert (dark blue), the automated procedure (magenta, referred to as Test10), and random enrichment (black).
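An enrichment plot of this kind can be built from a ranked hit list. The sketch below is illustrative only and is not the pipeline's own code: it assumes lower (more negative) docking scores are better, and the scores and ligand flags are made-up toy data.

```python
def enrichment_curve(scores, is_ligand):
    """Return (fraction of database screened, fraction of known ligands found) points.

    scores: docking scores; lower is assumed to be better (an assumption).
    is_ligand: parallel list of booleans marking the known ligands.
    """
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    total = len(ranked)
    n_ligands = sum(1 for _, flag in ranked if flag)
    if total == 0 or n_ligands == 0:
        raise ValueError("need at least one compound and one known ligand")
    points = [(0.0, 0.0)]
    found = 0
    for rank, (_, flag) in enumerate(ranked, start=1):
        if flag:
            found += 1
        points.append((rank / total, found / n_ligands))
    return points


# Toy data (hypothetical): 8 compounds, 3 of them known ligands.
scores = [-50.0, -45.0, -40.0, -35.0, -30.0, -25.0, -20.0, -15.0]
is_ligand = [True, False, True, False, False, False, True, False]
curve = enrichment_curve(scores, is_ligand)
```

Random enrichment corresponds to the diagonal of this plot; a curve that rises above the diagonal early indicates that known ligands are ranked ahead of decoys.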

DHFR cont. 1

DHFR cont. 2

Using a focused set of spheres appears to be essential for reducing the noise caused by an inaccurate scoring function that favors wrong docking poses. The problem is alleviated by using only the spheres that fill the hot-spot region.
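One simple way to focus a sphere set is a distance filter around reference points in the hot-spot region. The slides do not say how the focused set was actually generated, so the sketch below is an assumption: `focus_spheres`, the 2.0 Å cutoff, and the coordinates are all hypothetical.

```python
import math


def focus_spheres(sphere_centers, hot_spot_points, cutoff=2.0):
    """Keep sphere centers within `cutoff` angstroms of any hot-spot point.

    hot_spot_points could be, for example, atom positions of a bound ligand
    (an assumption; the slides do not define the hot spot this way).
    """
    return [
        center
        for center in sphere_centers
        if any(math.dist(center, point) <= cutoff for point in hot_spot_points)
    ]


# Toy coordinates (hypothetical), in angstroms.
spheres = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (10.0, 0.0, 0.0)]
hot_spot = [(0.5, 0.0, 0.0)]
focused = focus_spheres(spheres, hot_spot)
```

Only the spheres near the reference point survive, so ligand orientations are matched in the hot-spot region rather than across the whole pocket.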

DHFR cont. 3

[Figure panels: Test1 and Test10, each showing docked ligands and top-scored MDDR decoys]

Case analysis (Aldose Reductase)

The conformational flexibility of the binding site, as implicated by crystal structures, appears to contribute to the poor enrichment. However, it may also be due to other factors, such as the lack of a protein desolvation penalty in the scoring function.

* Structure, 1997, 5:601-612

AR cont. 1

  • The correlation coefficients between electrostatic energy and total energy, and between vdW energy and total energy, are 0.74 and 0.66, respectively, for the docked ligands, and 0.62 and -0.33 for the top 500 docked decoys. Clearly, the electrostatic interaction is far too favorable and dominates the interaction energy score for the docked decoys, which might be remedied by including a protein desolvation penalty.
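The energy component analysis above amounts to a Pearson correlation between each energy term and the total score. A minimal sketch follows, assuming the total is simply the sum of the electrostatic and vdW terms; the energies are made-up toy values, not the AR results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy per-pose energies (hypothetical values).
elec = [-30.0, -22.0, -15.0, -8.0, -5.0]
vdw = [-10.0, -14.0, -9.0, -12.0, -11.0]
total = [e + v for e, v in zip(elec, vdw)]

r_elec = pearson(elec, total)  # how strongly electrostatics tracks the total
r_vdw = pearson(vdw, total)    # how strongly vdW tracks the total
```

When one term correlates with the total much more strongly than the other, as the electrostatic term does in this toy set, the ranking is effectively driven by that term alone.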

PARP cont. 1

[Figure panels: docked ligands and top-scored MDDR decoys]

Case analysis (AChE)

  • Poor enrichment (5.0% of the database must be screened to find 25% of the known ligands) appears to be caused by the large number of improbable docking poses. The AChE binding cavity is large, with many waters and more than one clear binding region in the pocket. No direct hydrogen bonds between the ligand and the protein have been observed, only water-bridged hydrogen bonds, which makes this a particularly hard case to dock to. (Jacobsson, JMC, 2004)

  • Can we do something about it to improve our docking for such cases?
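The enrichment figure quoted for AChE (the percentage of the database that must be screened to recover 25% of the known ligands) can be computed from a ranked hit list. The sketch below is illustrative: it assumes lower scores rank first, and the 100-compound data set is invented for the example.

```python
import math


def percent_db_to_find(scores, is_ligand, fraction=0.25):
    """Percent of the ranked database needed to recover `fraction` of known ligands."""
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    n_ligands = sum(1 for _, flag in ranked if flag)
    if n_ligands == 0:
        raise ValueError("no known ligands in the data set")
    # Require at least one ligand even for very small fractions.
    needed = max(1, math.ceil(fraction * n_ligands))
    found = 0
    for rank, (_, flag) in enumerate(ranked, start=1):
        if flag:
            found += 1
            if found >= needed:
                return 100.0 * rank / len(ranked)
    raise ValueError("fraction must not exceed 1.0")


# Toy data (hypothetical): 100 compounds, known ligands at ranks 5, 20, 50, 80.
toy_scores = [float(i) for i in range(1, 101)]
ligand_ranks = {5, 20, 50, 80}
toy_flags = [i in ligand_ranks for i in range(1, 101)]
pct_25 = percent_db_to_find(toy_scores, toy_flags, fraction=0.25)
pct_50 = percent_db_to_find(toy_scores, toy_flags, fraction=0.50)
```

Smaller values mean better enrichment; under random ranking, about 25% of the database would be needed to find 25% of the ligands.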

Case analysis (Thrombin)

Multiple binding sub-sites? Does this have anything to do with the way the dockable database is generated and the way spheres are matched?

Preliminary Conclusion

  • A fully automated docking procedure and a consistent parameter set for grid generation, docking and scoring appear to perform well across all the tested systems.

  • Cofactors, iron and structural waters involved in ligand binding must be carefully inspected, as must the protonation states of amino acid residues in the binding site.

  • “larger binding pocket, more extensive sampling – INDOCK.3” is required (validated by the DHFR, TS, thrombin and GART test sets).

  • Docking spheres and DelPhi spheres can be generated using different schemes. A focused set of matching spheres was shown to be critical for systems like DHFR, TS and GART, which indicates that hot-spot information in the binding pocket will be important for directing docking.

  • Careful interpretation of docking results (energy component analysis) should be performed routinely to identify likely sources of error.

High quality test sets

Enrichment data sets (known ligands and decoys datasets)

  • Susan test set

  • Enolase test set

  • NCTR ER data set: 232 diverse compounds covering a 10^6-fold range in a validated ER competitive binding assay, and NCTR AR data set: 202 diverse compounds (Tong, 2001)

  • McMaster DHFR data set (

  • Compumine ERalpha, MMP3, AChE and fXa data sets (

Docking and scoring test sets (experimental structures and binding affinities)

  • CCDC/Astex validation test set: 308 crystal complexes (

  • X-CScore dock set: 100 crystal complexes and binding affinities (Wang et al., 2003)


  • What is the most important factor, and possibly the second most important, that if fixed would improve the enrichment?

  • For each improvement that could be made, give your estimate of what should be done, how much effort it would take, and the likelihood of improvement.

  • Look closely at the active-site residues (ionization and protonation states), and use the top decoy compounds to identify the residues that contribute to overestimation of the docking energy.
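The last suggestion, using top decoys to find residues that inflate the score, could be approached by averaging per-residue energy contributions over the top-scoring decoy poses. The sketch below assumes such a per-residue decomposition is available, which the slides do not state; the function name, residue labels, and energies are hypothetical.

```python
from collections import defaultdict


def rank_residues_by_mean_energy(per_pose_contributions):
    """Rank residues by mean interaction energy over a set of poses.

    per_pose_contributions: one dict per pose mapping residue label -> energy.
    A residue missing from a pose is treated as contributing zero.
    Returns (residue, mean energy) pairs, most favorable first.
    """
    n_poses = len(per_pose_contributions)
    if n_poses == 0:
        raise ValueError("need at least one pose")
    totals = defaultdict(float)
    for pose in per_pose_contributions:
        for residue, energy in pose.items():
            totals[residue] += energy
    means = {residue: total / n_poses for residue, total in totals.items()}
    return sorted(means.items(), key=lambda item: item[1])


# Toy decomposition for three top decoy poses (hypothetical labels and values).
decoy_poses = [
    {"RES1": -12.0, "RES2": -3.0},
    {"RES1": -10.0, "RES3": -2.0},
    {"RES1": -14.0, "RES2": -3.0},
]
ranked_residues = rank_residues_by_mean_energy(decoy_poses)
```

A residue that is consistently the most favorable contributor across decoys, like RES1 in the toy data, is a natural candidate for checking its ionization or protonation state.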


  • John @ Shoichet

  • CK @ Jacobson

  • Ursula & Eswar @ Sali