
2nd Joint Sheffield Conference on Chemoinformatics: Computational Tools for Lead Discovery


Presentation Transcript


  1. 2nd Joint Sheffield Conference on Chemoinformatics: Computational Tools for Lead Discovery. Flexsim-R: A new 3D descriptor for combinatorial library design and in-silico screening

  2. Outline • Introduction • The Flexsim-R Methodology • Validation • Conclusion and Outlook

  3. Introduction What is Flexsim-R? Flexsim-R calculates 3D descriptors for reagents, based on the virtual affinity fingerprint idea

  4. Motivation to develop Flexsim-R
     • Reagent-based descriptors are important for:
       • combinatorial library design
       • virtual screening experiments
       • bioisosteric replacements
       • rational augmentation of the in-house reagent pool
     • For large combinatorial libraries, product-based descriptor calculation is often not feasible -> possible solution: reagent-based product selection (e.g. by a GA)
     • Descriptor calculation should be fast and automatable
     • Descriptor should be related to experimental affinity data
     • Encouragement by virtual affinity fingerprint methods

  5. In-vitro Affinity Fingerprints Terrapin's Affinity Fingerprint Approach (Kauvar et al., Chemistry & Biology, 1995, 2, 107-118): Molecular similarity is defined by in-vitro binding patterns ("Affinity Fingerprints") of a ligand set (L) in reference binding assays (A) [Figure: binding matrix of ligands L1-L6 against reference assays A1-A8]

  6. Virtual Affinity Fingerprints (VAF) Terrapin's in-vitro screening in diverse reference assays is simulated • by computational docking into a reference panel of protein pockets (Docksim, Flexsim-X) • by computational fitting onto a reference panel of small molecules (Flexsim-S) (Briem and Lessel, Perspectives in Drug Discovery and Design, 20 (2000) 231-244)

  7. The Flexsim-R Method

  8. The Flexsim-R Method Problems with Rgroups in conventional VAF approaches: • Rgroups tend to be smaller than "drug-like" molecules • Alignment rule by common core attachment point gets lost Solution: Core-constrained multiple-site docking [Figure label: Protein pocket]

  9. The Flexsim-R Method Components of core-constrained multiple-site docking: 1. Rgroup Set 2. Common Core 3. Protein Binding Pockets

  10. The Flexsim-R Method First step: • Docking of common core group with FlexX • Multiple (e.g. 50 best) solutions are stored • RMS threshold can be applied to prevent clustering
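
The RMS threshold mentioned on this slide can be illustrated with a small sketch. The slides do not say how FlexX applies the threshold, so the greedy rule below (keep a pose only if it is at least `rms_threshold` away from every pose already kept) is an assumption, and the function names `rmsd` and `select_diverse_poses` are hypothetical. Poses are taken to be coordinate lists with identical atom order in the pocket frame, so no superposition is performed.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with identical atom order."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))

def select_diverse_poses(ranked_poses, rms_threshold=2.0, max_poses=50):
    """Greedily keep the best-ranked poses that are at least rms_threshold apart.

    Assumption: ranked_poses is already sorted best score first.
    """
    kept = []
    for pose in ranked_poses:
        if all(rmsd(pose, other) >= rms_threshold for other in kept):
            kept.append(pose)
        if len(kept) == max_poses:
            break
    return kept

# Toy example: three one-atom "poses"; the second lies within 2.0 of the first.
poses = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
selected = select_diverse_poses(poses, rms_threshold=2.0)
```

In this toy run the second pose is discarded because it is only 1.0 away from the first, which is the kind of clustering the threshold is meant to prevent.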

  11. The Flexsim-R Method Example: Thrombin active site with 50 best FlexX solutions of hydantoin (RMS threshold = 2.0)

  12. The Flexsim-R Method Second step: • Docking of core group + Rgroup with FlexX • Pre-stored core positions serve as reference • FlexX scores are stored in descriptor matrix [Figure: descriptor matrix for one protein pocket, with Rgroups R1, R2, R3, ... as rows and core positions Core Pos1, Core Pos2, ... as columns; example scores shown include 15.5 (R1), 11.2 (R2), 21.7 (R3), 15.7, 22.0 and 13.5]
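
As a rough illustration of the second step, the descriptor matrix for one pocket can be thought of as one row per Rgroup and one column per pre-stored core position. The sketch below is not the Flexsim-R code: `build_descriptor_matrix` is a hypothetical helper, and `score_fn` is a placeholder for whatever call returns the docking score of core + Rgroup at a given core position (no FlexX interface is assumed here).

```python
def build_descriptor_matrix(rgroups, core_positions, score_fn):
    """Return {rgroup: [score at each stored core position]} for one pocket."""
    return {
        rgroup: [score_fn(rgroup, position) for position in core_positions]
        for rgroup in rgroups
    }

# Stand-in scorer so the sketch runs; a real run would query the docking program.
def toy_score(rgroup, core_position):
    return float(len(rgroup) * 10 + core_position)

matrix = build_descriptor_matrix(
    ["R1", "R10"], core_positions=[0, 1, 2], score_fn=toy_score
)
```

Because the core positions are computed once and reused for every Rgroup, each column of the matrix refers to the same placement of the core across all rows.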

  13. The Flexsim-R Method

  14. The Flexsim-R Method

  15. The Flexsim-R Method

  16. The Flexsim-R Method Multiple protein pockets -> Concatenated descriptor matrix [Figure: descriptor matrix with rows R1, R2, R3, ... and one block of core positions C1, C2, C3 per pocket (Pocket 1, Pocket 2, Pocket 3)]

  17. The Flexsim-R Method Multiple core attachment points -> Concatenated descriptor matrix [Figure: descriptor matrix with rows R1, R2, R3, ... and one block of core positions C1, C2, C3 per attachment point (X1, X2, X3, X4)]

  18. The Flexsim-R Method Example: Hydantoin Core 4 attachment points * 7 protein pockets * 50 FlexX solutions -> descriptor vector length = 1,400
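
The vector length quoted above follows from concatenating one column per combination of attachment point, protein pocket and core position. A minimal sketch of that bookkeeping is shown below; the column ordering and the label format are assumptions made for illustration only.

```python
from itertools import product

ATTACHMENT_POINTS = 4
PROTEIN_POCKETS = 7
CORE_POSITIONS = 50

def column_labels(n_points, n_pockets, n_positions):
    """One label per descriptor column: (attachment point, pocket, core position)."""
    return [
        (f"X{x}", f"Pocket{p}", f"C{c}")
        for x, p, c in product(
            range(1, n_points + 1),
            range(1, n_pockets + 1),
            range(1, n_positions + 1),
        )
    ]

labels = column_labels(ATTACHMENT_POINTS, PROTEIN_POCKETS, CORE_POSITIONS)
vector_length = len(labels)  # 4 * 7 * 50
```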

  19. The Flexsim-R Method Test set for method development and evaluation: • Rgroups: 20 natural amino acids • Core groups: [structures shown only as images] • 7 protein pockets: 1dwc, 1eed, 1pop, 2tsc, 3cla, 3dfr, 5ht2 (model)

  20. Correlation Analysis
      • Analyses were performed to check correlation between:
        • different protein pockets
        • different cores
        • different attachment points
      • Analyses are based on Euclidean distance matrices for all 190 pairwise amino acid vector combinations
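
The figure of 190 is the number of unordered pairs among 20 amino acids (20 × 19 / 2). The sketch below shows one plausible reading of the analysis: compute the 190 pairwise Euclidean distances separately for two descriptor blocks (for example two pockets) and correlate the two distance lists. How the distance matrices were actually compared is not stated on the slide, so this comparison step and the toy data are assumptions.

```python
import math
from itertools import combinations

def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair of descriptor vectors."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data: 20 "amino acid" vectors per block; block_b is block_a scaled by 2,
# so the two distance lists are perfectly correlated.
block_a = [[float(i), float(i * i % 7)] for i in range(20)]
block_b = [[2 * v for v in vec] for vec in block_a]

dist_a = pairwise_distances(block_a)
dist_b = pairwise_distances(block_b)
n_pairs = len(dist_a)
r = pearson_r(dist_a, dist_b)
```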

  21. Correlation Analysis • Correlation matrix of protein pockets: (hydantoin core, all 4 attachment points)

  22. Correlation Analysis • Correlation matrix of core groups: (all 7 protein pockets, all attachment points)

  23. Correlation Analysis • Correlation matrix of attachment points: (hydantoin core, all 7 protein pockets)

  24. Correlation Analysis Reduction of descriptor vector length (dimensionality):
      • No PCA was performed, since we want to get information about the most uncorrelated descriptor columns
      • Instead, an elimination method was applied:
        • the complete pairwise correlation matrix is calculated
        • all pairs of columns with a correlation coefficient (r) above a user-defined threshold (e.g. 0.7) are considered for elimination
        • from each correlating pair, the column that is better described by multiple linear regression of the remaining descriptors is eliminated
        • the resulting matrix contains no pair of columns with a correlation coefficient above the threshold
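
The elimination procedure can be sketched with NumPy as follows. Several details are not given on the slide and are assumptions here: pairs are processed one at a time starting with the most strongly correlated, the absolute value of r is compared with the threshold, the "remaining descriptors" used for the regression exclude both members of the pair, and no column is constant. The names `r_squared` and `eliminate_correlated` are hypothetical.

```python
import numpy as np

def r_squared(target, predictors):
    """R^2 of an ordinary least-squares fit of target on predictors plus intercept."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    total = target - target.mean()
    return 1.0 - float(residual @ residual) / float(total @ total)

def eliminate_correlated(matrix, threshold=0.7):
    """Return indices of the columns kept after pairwise-correlation elimination."""
    kept = list(range(matrix.shape[1]))
    while len(kept) > 1:
        corr = np.abs(np.corrcoef(matrix[:, kept], rowvar=False))
        np.fill_diagonal(corr, 0.0)
        i, j = (int(v) for v in np.unravel_index(np.argmax(corr), corr.shape))
        if corr[i, j] <= threshold:
            break  # no remaining pair exceeds the threshold
        # Assumption: regress each pair member on the columns outside the pair.
        others = [k for pos, k in enumerate(kept) if pos not in (i, j)]
        scores = {
            pos: r_squared(matrix[:, kept[pos]], matrix[:, others])
            for pos in (i, j)
        }
        # Drop the member that the other descriptors explain better.
        del kept[max(scores, key=scores.get)]
    return kept

# Toy data: x0 and x1 are uncorrelated; x2 = 2*x0 + x1 correlates with x0 (r ~ 0.89).
x0 = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
x1 = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
data = np.column_stack([x0, x1, 2 * x0 + x1])
kept_columns = eliminate_correlated(data, threshold=0.7)
```

In the toy data the third column is partly explained by `x1` while `x0` is not, so the third column is the one removed. Unlike PCA, the surviving columns are original descriptor columns, which matches the stated aim of identifying the most uncorrelated ones.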

  25. Correlation Analysis Example: hydantoin core, all 7 proteins, all 4 attachment points [Figure labels: Descriptor set 1, Descriptor set 2, Descriptor set 3]

  26. Correlation Analysis Thrombin with three most information-rich core positions

  27. Descriptor Validation
      • Five peptide datasets, taken from the literature (Refs. in Matter, H., J. Peptide Res. 52 (1998) 305-314)
      • Product descriptors are generated by concatenation of the respective reagent descriptors
      • Validation by PLS analysis:
        • leave-one-out (LOO) and leave-random-groups-out (LRGO) cross-validation
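
The two mechanical parts of this validation, concatenating reagent descriptors into product descriptors and computing a leave-one-out q², can be sketched as below. This is not the PLS analysis used in the talk: ordinary least squares is swapped in as a stand-in model so the example needs only NumPy, LRGO is not shown, and the residue vectors and activities are invented toy values. q² is computed as 1 - PRESS / SS, with SS taken about the mean of all observed values.

```python
import numpy as np
from itertools import product

def product_descriptor(sequence, reagent_descriptors):
    """Concatenate the reagent descriptor vectors of a product's building blocks."""
    return np.concatenate([reagent_descriptors[residue] for residue in sequence])

def least_squares_fit_predict(X_train, y_train, X_test):
    """Stand-in model (ordinary least squares with intercept), NOT PLS."""
    design = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def loo_q2(X, y, fit_predict):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS."""
    predictions = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        train = np.arange(len(y)) != i  # boolean mask leaving out sample i
        predictions[i] = fit_predict(X[train], y[train], X[i:i + 1])[0]
    press = float(np.sum((y - predictions) ** 2))
    ss = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - press / ss

# Toy reagent descriptors (two columns per residue) and all nine dipeptides.
reagent_descriptors = {
    "A": np.array([1.0, 0.0]),
    "G": np.array([0.0, 1.0]),
    "V": np.array([2.0, 1.0]),
}
dipeptides = ["".join(pair) for pair in product("AGV", repeat=2)]
X = np.array([product_descriptor(p, reagent_descriptors) for p in dipeptides])
y = X @ np.array([1.0, -2.0, 0.5, 3.0])  # invented, exactly linear activity
q2 = loo_q2(X, y, least_squares_fit_predict)
```

Because the toy activity is an exact linear function of the descriptors, the stand-in model predicts every left-out product correctly and q² comes out at essentially 1; real data would give lower values.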

  28. Descriptor Validation • Datasets:

  29. Descriptor Validation: Results Leave-random-groups-out (LRGO) results: [Chart for datasets ACE, BIT, BRA, ENK, BR9]

  30. Summary
      • Flexsim-R is a novel virtual affinity fingerprint method that calculates meaningful 3D descriptors for reagents
      • High correlation between different cores and attachment points
      • For 3 out of 5 validation sets, significant cross-validated q² values could be obtained
      • Rgroup alignment problem is tackled inherently
      • Flexsim-R calculations are fast and can be automated easily:
        • only clipped reagent structures are required
        • core positions need to be calculated only once

  31. Outlook • More validation sets have to be tested (e.g. a "real-life" combichem dataset) • Is there a set of descriptors that works well for different datasets? • Integration into the Boehringer Ingelheim library design and virtual screening workflow

  32. Acknowledgements • Alexander Weber (Boehringer Ingelheim/University of Marburg) • Andreas Teckentrup (Boehringer Ingelheim) • Hans Matter (Aventis) • BMBF for financial support
