
Shuxing Zhang, Alexander Golbraikh and Alex Tropsha The Laboratory for Molecular Modeling

Development of Novel Geometrical Chemical Descriptors and Their Application to the Prediction of Ligand-Protein Binding Affinity. Shuxing Zhang, Alexander Golbraikh and Alex Tropsha The Laboratory for Molecular Modeling School of Pharmacy University of North Carolina at Chapel Hill


Presentation Transcript


  1. Development of Novel Geometrical Chemical Descriptors and Their Application to the Prediction of Ligand-Protein Binding Affinity. Shuxing Zhang, Alexander Golbraikh and Alex Tropsha. The Laboratory for Molecular Modeling, School of Pharmacy, University of North Carolina at Chapel Hill. November 28, 2014

  2. Problem: Given a protein-ligand complex, predict the ligand binding affinity.

  3. Knowledge-based (Statistical) Potentials
     • Two-body potentials
       PMF: Muegge, I.; Martin, Y. C. J. Med. Chem. 1999, 42, 791-804
       BLEEP: Mitchell, J. B.; Laskowski, R. A.; Alex, A.; Thornton, J. M. J. Comp. Chem. 1999, 20, 1165-1176
       DrugScore: Gohlke, H.; Hendlich, M.; Klebe, G. J. Mol. Biol. 2000, 295, 337-356
       SMoG: DeWitte, R. S.; Shakhnovich, E. I. J. Am. Chem. Soc. 1996, 118, 11733-11744
       SMoG2001: Ishchenko, A. V.; Shakhnovich, E. I. J. Med. Chem. 2002, 45, 2770-2780
     • Four-body contact potential (by Jun Feng)

  4. Full Atom-based Delaunay tessellation of Protein-ligand Interface (5HVP)

  5. Three Types of Tetrahedra at the Protein-ligand Interface
     • RRRL: formed by 3 receptor atoms and 1 ligand atom
     • RRLL: formed by 2 receptor atoms and 2 ligand atoms
     • RLLL: formed by 1 receptor atom and 3 ligand atoms
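The three interfacial classes above can be illustrated with a short sketch. It assumes the Delaunay tessellation has already been computed elsewhere and is available as 4-tuples of atom indices; the names `classify_tetrahedron`, `interfacial_tetrahedra` and `atom_side` are hypothetical and do not come from the slides.

```python
def classify_tetrahedron(vertex_labels):
    """Return 'RRRL', 'RRLL' or 'RLLL', or None if the tetrahedron is not interfacial.

    vertex_labels: four labels, each 'R' (receptor atom) or 'L' (ligand atom).
    """
    labels = list(vertex_labels)
    if len(labels) != 4 or any(v not in ("R", "L") for v in labels):
        raise ValueError("expected exactly four labels, each 'R' or 'L'")
    n_receptor = labels.count("R")
    if n_receptor in (0, 4):
        return None  # all-ligand or all-receptor: not part of the interface
    return "R" * n_receptor + "L" * (4 - n_receptor)


def interfacial_tetrahedra(tetrahedra, atom_side):
    """Keep only tetrahedra that mix receptor and ligand atoms.

    tetrahedra: iterable of 4-tuples of atom indices (assumed DT output).
    atom_side:  dict mapping atom index -> 'R' or 'L'.
    """
    found = []
    for tet in tetrahedra:
        kind = classify_tetrahedron([atom_side[i] for i in tet])
        if kind is not None:
            found.append((tet, kind))
    return found


# Toy input: atoms 0, 1, 2, 6 belong to the receptor; atoms 3, 4, 5 to the ligand.
atom_side = {0: "R", 1: "R", 2: "R", 6: "R", 3: "L", 4: "L", 5: "L"}
tetrahedra = [(0, 1, 2, 3), (0, 1, 3, 4), (0, 3, 4, 5), (0, 1, 2, 6)]
found = interfacial_tetrahedra(tetrahedra, atom_side)
for tet, kind in found:
    print(tet, kind)
```

The all-receptor tetrahedron `(0, 1, 2, 6)` is dropped because it does not touch the ligand.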

  6. Earlier work: Four-Body Statistical Contact Scoring Function Based on Delaunay Tessellation

  7. Correlation between experimental and calculated binding free energy for PMF dataset using four-body scoring function

  8. Comparison of Current Scoring Functions

  9. Multiple CG descriptors of the protein-ligand interface and their correlation with ligand affinity
     • Define the ligand-receptor interface by means of Delaunay tessellation (DT)
     • Calculate chemical descriptors for nearest-neighbor atom quadruplets
     • Use a statistical data modeling approach to correlate descriptors with affinity

  10. Descriptors derived from atomic electronegativity
      • µ: electronegativity (chemical potential) of atoms
      • Q: partial charges on atoms
      • η: hardness kernel

  11. Atom Type Definition Based on EN Values. There are 554 possible interfacial quadruplet composition types. After processing 517 complexes, 100 were found to occur with high frequency (at least 50 times).
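A minimal sketch of the frequency filter described on this slide: count how often each quadruplet composition type occurs across all complexes and keep those seen at least 50 times. The label format (`"C_R"` for a receptor carbon, `"O_L"` for a ligand oxygen) and the function names are assumptions made for illustration; the slides do not specify how atom types are encoded.

```python
from collections import Counter


def composition_type(vertex_types):
    """Order-independent key for the four vertex atom types of one tetrahedron."""
    return tuple(sorted(vertex_types))


def frequent_composition_types(complexes, min_count=50):
    """Count composition types over all complexes and keep the frequent ones.

    complexes: iterable of complexes, each a list of tetrahedra, each tetrahedron
               a sequence of four atom-type labels (hypothetical encoding).
    min_count: occurrence threshold; the slide quotes "at least 50 times".
    """
    counts = Counter()
    for tetrahedra in complexes:
        for vertex_types in tetrahedra:
            counts[composition_type(vertex_types)] += 1
    return {ctype: n for ctype, n in counts.items() if n >= min_count}


# Toy data with a lowered threshold so the effect is visible.
toy_complexes = [
    [("C_R", "N_R", "S_L", "O_L"), ("C_R", "C_R", "C_L", "O_L")],
    [("O_L", "S_L", "N_R", "C_R")],  # same type as the first one, different order
]
kept = frequent_composition_types(toy_complexes, min_count=2)
print(kept)
```

Sorting the four labels makes the key independent of vertex order, so the same chemical composition is always counted under one type.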

  12. Descriptor Calculation
      [Figure: example tetrahedron with vertices C_R (2.5), N_R (3.0), S_L (2.4), O_L (3.4)]
      • m: m-th tetrahedral composition type
      • j: vertex of a tetrahedron
      • n: number of tetrahedra of the m-th composition type
      Thus, there are 100 descriptors for each protein-ligand complex.
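The slide's formula did not survive in the transcript. One reading consistent with the symbol definitions (m, j, n) is that the descriptor for composition type m is the sum of the vertex EN values over all n tetrahedra of that type. The sketch below implements that reading only; it is an assumption, not the authors' confirmed formula. The EN values are the four numbers visible on the slide, and the label format and function name are hypothetical.

```python
# EN values as they appear in the slide's example tetrahedron.
EN = {"C": 2.5, "N": 3.0, "S": 2.4, "O": 3.4}


def descriptor_vector(tetrahedra, composition_types):
    """One descriptor per composition type for a single complex.

    Assumed formula: D_m = sum over the n tetrahedra of type m of the sum of
    the EN values of their four vertices j.

    tetrahedra:        list of 4-tuples of labels such as "C_R" (element, side).
    composition_types: ordered list of sorted 4-tuples defining the columns.
    """
    column = {ctype: k for k, ctype in enumerate(composition_types)}
    descriptors = [0.0] * len(composition_types)
    for vertex_types in tetrahedra:
        ctype = tuple(sorted(vertex_types))
        if ctype in column:  # rare types outside the selected set are ignored
            descriptors[column[ctype]] += sum(
                EN[label.split("_")[0]] for label in vertex_types
            )
    return descriptors


types = [
    tuple(sorted(("C_R", "N_R", "S_L", "O_L"))),
    tuple(sorted(("C_R", "C_R", "C_L", "O_L"))),
]
one_complex = [
    ("C_R", "N_R", "S_L", "O_L"),
    ("O_L", "C_R", "S_L", "N_R"),
    ("N_R", "N_R", "N_L", "O_L"),  # not in `types`, so it is skipped
]
vector = descriptor_vector(one_complex, types)
print(vector)
```

With the 100 frequent composition types as columns, each complex would map to a 100-element vector of this kind.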

  13. Flowchart of Novel Descriptor Generation
      264 complexes
      → Process files and assign atom types based on EN values
      → Define the interaction interface with DT and record all interfacial tetrahedra
      → Classify interfacial tetrahedra into composition types and calculate their EN values (descriptors)
      → Correlate with binding

  14. Data Modeling: {Binding affinity} = K̂{descriptor diversity}
      Structure    Binding      CG Descriptors
      Comp. 1      Value 1      D1  D2  D3  D4  ...
      Comp. 2      Value 2      "   "   "   "
      Comp. 3      Value 3      "   "   "   "
      ...          ...          ...
      Comp. 264    Value 264    "   "   "   "
      Goal: establish correlations between the descriptors and binding affinity that are capable of predicting the binding of novel complexes.

  15. Diversity of the dataset: 264 Complexes, 33 families

  16. Data Modeling Workflow
      264 complexes
      → Randomly exclude 24 complexes as an external set
      → Split the remaining 240 into multiple training sets and multiple test sets
      → Build models with kNN and variable selection
      → Only accept models that have q2 > 0.6, R2 > 0.6, etc.
      → Y-randomization
      → Validate predictive models with the randomly selected external sets (24)
      → Binding prediction
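The data partitioning in this workflow can be sketched as follows. The slide does not say how the 240 complexes were divided, so a plain random split is used here as an assumption, and the test-set size of 49 is only an illustrative value. Y-randomization is shown as a simple shuffle of the affinity values; all function names are hypothetical.

```python
import random


def split_dataset(ids, n_external, n_test, seed=0):
    """Randomly set aside an external set, then split the rest into training/test.

    A random split is an assumption; the slide only says the data were split.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    external = shuffled[:n_external]
    test = shuffled[n_external:n_external + n_test]
    training = shuffled[n_external + n_test:]
    return training, test, external


def y_randomize(affinities, seed=0):
    """Return the affinities in shuffled order for a Y-randomization control.

    Models rebuilt on shuffled affinities should score poorly; if they do not,
    the original model's statistics may be due to chance correlation.
    """
    rng = random.Random(seed)
    shuffled = list(affinities)
    rng.shuffle(shuffled)
    return shuffled


complex_ids = list(range(264))
training, test, external = split_dataset(complex_ids, n_external=24, n_test=49, seed=1)
print(len(training), len(test), len(external))
```

Repeating the call with different seeds gives the multiple training and test sets that the workflow mentions.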

  17. k Nearest Neighbor (kNN) with Variable Selection
      1. Randomly select a subset of descriptors (a hypothetical descriptor pharmacophore).
      2. Leave out one complex from the training set and calculate the distance between the eliminated complex and all remaining complexes (in the original 100 descriptor space).
      3. Find the k nearest neighbors in the training set.
      4. Predict the binding affinity of the eliminated complex by weighted kNN using the identified k nearest neighbors.
      5. Repeat steps 2-4 N times, leaving out each complex in turn.
      6. Calculate the predictive ability (q2) of the model.
      7. Repeat from step 1 under simulated annealing (SA), N times.
      8. Select acceptable models (with q2 > 0.6).
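A minimal sketch of the inner loop of this procedure: leave-one-out prediction by weighted kNN for one fixed descriptor subset, followed by the q2 calculation. Several details are assumptions rather than the authors' method: distances are Euclidean and computed on the selected descriptors, neighbors are weighted by inverse distance, and q2 is taken as 1 - PRESS/SS. The simulated annealing search over subsets is omitted.

```python
import math


def distance(a, b, selected):
    """Euclidean distance restricted to the selected descriptor indices."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in selected))


def predict_weighted_knn(query, train_X, train_y, selected, k):
    """Inverse-distance-weighted mean of the k nearest neighbors' affinities."""
    neighbors = sorted(
        (distance(query, x, selected), y) for x, y in zip(train_X, train_y)
    )[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in neighbors]  # small term avoids 1/0
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / sum(weights)


def loo_q2(X, y, selected, k):
    """Leave-one-out cross-validated q2 = 1 - PRESS / total sum of squares."""
    predictions = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        predictions.append(predict_weighted_knn(X[i], rest_X, rest_y, selected, k))
    mean_y = sum(y) / len(y)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / total


# Toy data: descriptor 0 tracks affinity, descriptor 1 is unrelated noise.
X = [[1.0, 9.0], [1.1, 0.0], [2.0, 8.0], [2.1, 1.0], [3.0, 9.5], [3.1, 0.4]]
y = [1.0, 1.1, 2.0, 2.1, 3.0, 3.1]
q2_informative = loo_q2(X, y, selected=[0], k=1)
q2_noise = loo_q2(X, y, selected=[1], k=1)
print(q2_informative > 0.6, q2_noise > 0.6)
```

A variable selection search would call `loo_q2` for many candidate subsets and keep those that pass the q2 > 0.6 threshold quoted on the slide.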

  18. Correlation of Actual ~ Predicted Binding Affinity for 49 Test Set Complexes

  19. Correlation of Actual ~ Predicted Binding Affinity for 24 Complexes with Best Model

  20. Comparison of Current Scoring Functions

  21. Conclusions
      • Novel geometrical chemical descriptors have been developed.
      • These simple yet fundamental descriptors can be used to predict binding affinity using correlation approaches, and they have high predictive power for diverse ligand-protein structures.
      • The statistical models can be used for fast and accurate scoring of complexes resulting from docking studies.
