
John Mitchell; James McDonagh; Neetika Nath

John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson. RF-Score: a Machine Learning Scoring Function for Protein-Ligand Binding Affinities. Ballester, P.J. & Mitchell, J.B.O. (2010) Bioinformatics 26, 1169-1175.


Presentation Transcript


  1. John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson

  2. RF-Score: a Machine Learning Scoring Function for Protein-Ligand Binding Affinities • Ballester, P.J. & Mitchell, J.B.O. (2010) Bioinformatics 26, 1169-1175

  3. Calculating the affinities of protein-ligand complexes: • For docking • For post-processing docking hits • For virtual screening • For lead optimisation • For 3D QSAR • Within series of related complexes • For any general complex • Absolute (hard!) • Relative • A difficult, unsolved problem.

  4. Three existing approaches … 1. Force fields

  5. Three existing approaches … 2. Empirical Functions

  6. Three existing approaches … 2. Empirical Functions

  7. Three existing approaches … 3. Knowledge based

  8. How knowledge-based scoring functions have worked … • P-L complexes from PDB • Assign atoms to types • Find histograms of type-type distances • Convert to an ‘energy’ • Add up the energies from all P-L atom pairs
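The five steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular published scoring function: the bin width, the cutoff, the pseudocount, the pooled reference distribution and the function names `pair_histograms`, `pair_energies` and `score_complex` are all assumptions made for the example. Atom types are reduced to element symbols, and energies are in units of kT.

```python
import math
from collections import defaultdict

BIN_WIDTH = 1.0  # Å, hypothetical choice
CUTOFF = 6.0     # Å, hypothetical choice
N_BINS = int(CUTOFF / BIN_WIDTH)


def pair_histograms(complexes):
    """Histogram protein-ligand distances per (protein type, ligand type).

    complexes: list of (protein_atoms, ligand_atoms); each atom is
    (type, (x, y, z)).
    """
    hist = defaultdict(lambda: [0] * N_BINS)
    for protein, ligand in complexes:
        for p_type, p_xyz in protein:
            for l_type, l_xyz in ligand:
                d = math.dist(p_xyz, l_xyz)
                if d < CUTOFF:
                    hist[(p_type, l_type)][int(d / BIN_WIDTH)] += 1
    return hist


def pair_energies(hist, pseudo=1.0):
    """Convert each histogram to an 'energy' (in kT) per distance bin.

    The reference state here is the distribution pooled over all type
    pairs, which is only one of several possible choices.
    """
    pooled = [sum(h[b] for h in hist.values()) + pseudo for b in range(N_BINS)]
    pooled_total = sum(pooled)
    table = {}
    for pair, h in hist.items():
        pair_total = sum(h) + pseudo * N_BINS
        table[pair] = [
            -math.log(((h[b] + pseudo) / pair_total) / (pooled[b] / pooled_total))
            for b in range(N_BINS)
        ]
    return table


def score_complex(protein, ligand, table):
    """Add up the energies of all protein-ligand atom pairs within the cutoff."""
    total = 0.0
    for p_type, p_xyz in protein:
        for l_type, l_xyz in ligand:
            d = math.dist(p_xyz, l_xyz)
            if d < CUTOFF and (p_type, l_type) in table:
                total += table[(p_type, l_type)][int(d / BIN_WIDTH)]
    return total
```

Distances that occur more often for a given type pair than in the pooled reference come out with a negative (favourable) energy, and rarely seen distances come out positive.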

  9. This conversion of the histogram into an energy function uses a “reverse Boltzmann” methodology. • Thus it “assumes” that the atoms of protein and ligand are independent particles in equilibrium at temperature T. • For a variety of reasons, these are poor assumptions …

  10. Molecular connectivity: atom-atom distances are miles from being independent. • Excluded volume effects. • No physical basis for assuming such an equilibrium. • Changes in structure with T are small and not like those implied by the Boltzmann distribution.
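For reference, the "reverse Boltzmann" conversion criticised above is usually written in a form like the one below. The notation is ours and does not come from the slides: $g_{ij}(r)$ is the observed distance distribution for protein atom type $i$ and ligand atom type $j$, $g_{\mathrm{ref}}(r)$ is a reference distribution (the choice of which varies between methods), $t(\cdot)$ maps an atom to its type, and $r_{pl}$ is the distance between protein atom $p$ and ligand atom $l$.

```latex
E_{ij}(r) = -k_{B} T \,\ln\!\left[\frac{g_{ij}(r)}{g_{\mathrm{ref}}(r)}\right],
\qquad
E_{\mathrm{total}} = \sum_{p \in \mathrm{protein}} \; \sum_{l \in \mathrm{ligand}} E_{t(p)\,t(l)}\!\left(r_{pl}\right)
```

The first expression is where the equilibrium assumption enters: it treats the observed frequencies as if they were Boltzmann populations at temperature $T$.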

  11. We thought about this … … and wrote a paper saying “It’s not true, but it sort of works”

  12. We thought about this … … and wrote a paper saying “It’s not true, but it sort of works”

  13. Then we had a better idea – could we dispense with the reverse Boltzmann formalism?

  14. Instead of assuming a formula that relates the distance distribution to the binding free energy … • … use machine learning to learn the relationship from known structures and binding affinities.

  15. Instead of assuming a formula that relates the distance distribution to the binding free energy … • … use machine learning to learn the relationship from known structures and binding affinities. • And persuade someone to pay for it!
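To learn from structures, each complex first has to become a fixed-length vector of numbers. One simple way to do this, sketched below, is to count protein-ligand atom pairs of each element combination within a distance cutoff. The element lists, the 12 Å cutoff and the names `FEATURE_NAMES` and `count_features` are assumptions made for illustration; the slides do not specify the descriptors.

```python
import math
from itertools import product

# Hypothetical element sets and cutoff, chosen for illustration only.
PROTEIN_ELEMENTS = ("C", "N", "O", "S")
LIGAND_ELEMENTS = ("C", "N", "O", "F", "P", "S", "Cl", "Br", "I")
CUTOFF = 12.0  # Å

# One feature per (protein element, ligand element) combination.
FEATURE_NAMES = [f"{p}.{l}" for p, l in product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS)]


def count_features(protein, ligand, cutoff=CUTOFF):
    """Return pair counts in FEATURE_NAMES order.

    protein, ligand: lists of (element, (x, y, z)) tuples.
    Pairs involving elements outside the two sets are ignored.
    """
    counts = dict.fromkeys(FEATURE_NAMES, 0)
    for p_el, p_xyz in protein:
        for l_el, l_xyz in ligand:
            key = f"{p_el}.{l_el}"
            if key in counts and math.dist(p_xyz, l_xyz) < cutoff:
                counts[key] += 1
    return [counts[name] for name in FEATURE_NAMES]
```

Each complex with a measured affinity then contributes one (feature vector, affinity) pair to the training set, and no functional form linking distances to energy has to be assumed.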

  16. Random Forest Predicted binding affinity

  17. Random Forest ● Introduced by Breiman and Cutler (2001) ● Development of Decision Trees (Recursive Partitioning): ● Dataset is partitioned into consecutively smaller subsets ● Each partition is based upon the value of one descriptor ● The descriptor used at each split is selected so as to optimise splitting ● Bootstrap sample of N objects chosen from the N available objects with replacement
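Two of the ingredients listed on this slide, the bootstrap sample and the choice of the descriptor that optimises a split, can be sketched as follows. The function names and the use of the summed squared error as the split criterion are assumptions for the example. A full Random Forest also restricts each split to a random subset of the descriptors, which this sketch leaves out.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw N objects from the N available, with replacement."""
    n = len(rows)
    return [rows[rng.randrange(n)] for _ in range(n)]


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows):
    """Find the single-descriptor split that minimises the summed SSE.

    rows: list of (descriptors, target) pairs.
    Returns (descriptor_index, threshold, cost), or None if no split exists.
    """
    best = None
    for j in range(len(rows[0][0])):
        values = sorted({x[j] for x, _ in rows})
        # Candidate thresholds lie midway between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for x, y in rows if x[j] <= threshold]
            right = [y for x, y in rows if x[j] > threshold]
            cost = sse(left) + sse(right)
            if best is None or cost < best[2]:
                best = (j, threshold, cost)
    return best
```

Applying `best_split` recursively to the two resulting subsets gives the recursive partitioning described on the slide.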

  18. The Random Forest is just a forest of randomly generated decision trees … … whose outputs are averaged to give the final prediction
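The averaging step can be illustrated with a deliberately tiny "forest". In this sketch every tree is a one-split stump on a randomly chosen descriptor, trained on its own bootstrap sample. The stump, the split at the sample mean and the names `fit_stump`, `fit_forest` and `predict` are simplifications assumed for the example, not how a production Random Forest grows its trees.

```python
import random
from statistics import mean


def fit_stump(rows, rng):
    """Fit a one-split 'tree' on a randomly chosen descriptor.

    rows: list of (descriptors, target) pairs.
    Returns (descriptor_index, threshold, left_value, right_value).
    """
    j = rng.randrange(len(rows[0][0]))
    threshold = mean(x[j] for x, _ in rows)
    left = [y for x, y in rows if x[j] <= threshold]
    right = [y for x, y in rows if x[j] > threshold]
    overall = mean(y for _, y in rows)
    # Fall back to the overall mean if one side of the split is empty.
    return (j, threshold,
            mean(left) if left else overall,
            mean(right) if right else overall)


def fit_forest(rows, n_trees=50, seed=0):
    """Train each stump on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rows[rng.randrange(len(rows))] for _ in rows]
        forest.append(fit_stump(sample, rng))
    return forest


def predict(forest, x):
    """Average the outputs of all trees to give the final prediction."""
    return mean(left if x[j] <= threshold else right
                for j, threshold, left, right in forest)
```

Each stump on its own is a crude predictor; the averaging in `predict` is what smooths out the differences between the bootstrap samples.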

  19. Building RF-Score PDBbind 2007

  20. Building RF-Score PDBbind 2007

  21. Validation results: PDBbind set • Following the method of Cheng et al., JCIM 49, 1079 (2009) • Independent test set: PDBbind core 2007, 195 complexes from 65 clusters
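Agreement between predicted and measured affinities on a held-out test set is often summarised with Pearson's correlation coefficient and the root-mean-square error. The slide does not say which statistics were reported, so the choice of metrics below is an assumption, and `pearson_r` and `rmse` are illustrative names.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


def rmse(predicted, measured):
    """Root-mean-square error between predictions and measurements."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)
```

Both functions should be applied only to complexes that were not used in training, which is the point of keeping the test set independent.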

  22. Validation results: PDBbind set • RF-Score outperforms competitor scoring functions, at least on our test • RF-Score is available for free from our group website

  23. John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson
