John Mitchell; James McDonagh; Neetika Nath

John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson. RF-Score: a Machine Learning Scoring Function for Protein-Ligand Binding Affinities. Ballester, P.J. & Mitchell, J.B.O. (2010) Bioinformatics 26, 1169-1175.

    Presentation Transcript
    1. John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson

    2. RF-Score: a Machine Learning Scoring Function for Protein-Ligand Binding Affinities • Ballester, P.J. & Mitchell, J.B.O. (2010) Bioinformatics 26, 1169-1175

    3. Calculating the affinities of protein-ligand complexes: • For docking • For post-processing docking hits • For virtual screening • For lead optimisation • For 3D QSAR • Within series of related complexes • For any general complex • Absolute (hard!) • Relative • A difficult, unsolved problem.

    4. Three existing approaches … 1. Force fields

    5. Three existing approaches … 2. Empirical Functions

    6. Three existing approaches … 2. Empirical Functions

    7. Three existing approaches … 3. Knowledge based

    8. How knowledge-based scoring functions have worked … • P-L complexes from PDB • Assign atoms to types • Find histograms of type-type distances • Convert to an ‘energy’ • Add up the energies from all P-L atom pairs
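
The five steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any published scoring function: the atom representation (a type label plus coordinates), the bin width, the cutoff, the pooled reference distribution, and the function names `pair_bins`, `build_energy_table` and `score_complex` are all assumptions made for the example.

```python
import math
from collections import Counter

def pair_bins(protein_atoms, ligand_atoms, bin_width, cutoff):
    """Yield (protein type, ligand type, distance bin) for each P-L pair within the cutoff."""
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            if d < cutoff:
                yield p_type, l_type, int(d // bin_width)

def build_energy_table(complexes, bin_width=1.0, cutoff=6.0, kT=1.0):
    """Histogram type-type distances over many complexes, then convert them to 'energies'."""
    hist = Counter()
    for protein_atoms, ligand_atoms in complexes:
        hist.update(pair_bins(protein_atoms, ligand_atoms, bin_width, cutoff))

    # Assumed reference state: the distance distribution pooled over all type pairs.
    pair_totals, bin_totals = Counter(), Counter()
    for (p, l, b), n in hist.items():
        pair_totals[(p, l)] += n
        bin_totals[b] += n
    total = sum(hist.values())

    energies = {}
    for (p, l, b), n in hist.items():
        observed = n / pair_totals[(p, l)]   # how often this type pair falls in bin b
        reference = bin_totals[b] / total    # how often any pair falls in bin b
        energies[(p, l, b)] = -kT * math.log(observed / reference)
    return energies

def score_complex(protein_atoms, ligand_atoms, energies, bin_width=1.0, cutoff=6.0):
    """Add up the tabulated energies over all P-L atom pairs; unseen bins contribute 0."""
    return sum(energies.get(key, 0.0)
               for key in pair_bins(protein_atoms, ligand_atoms, bin_width, cutoff))
```

A real implementation would also need a chemically sensible atom typing scheme and a volume correction for the distance bins, both omitted here.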

    9. This conversion of the histogram into an energy function uses a “reverse Boltzmann” methodology. • Thus it “assumes” that the atoms of protein and ligand are independent particles in equilibrium at temperature T. • For a variety of reasons, these are poor assumptions …
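
For reference, the reverse Boltzmann conversion is commonly written in the following form. The symbols are chosen here for illustration and do not come from the slides: g_ij(r) is the observed distance distribution for protein atom type i and ligand atom type j, g_ref(r) is a reference distribution, and t(p), t(l) are the types of protein atom p and ligand atom l at separation r_pl.

```latex
E_{ij}(r) = -k_{B} T \,\ln\!\left( \frac{g_{ij}(r)}{g_{\mathrm{ref}}(r)} \right),
\qquad
E_{\mathrm{total}} = \sum_{p \in \mathrm{protein}} \; \sum_{l \in \mathrm{ligand}} E_{t(p)\,t(l)}(r_{pl})
```

The temperature T enters only through this assumed Boltzmann form, which is where the equilibrium assumption comes in.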

    10. Molecular connectivity: atom-atom distances are miles from being independent. • Excluded volume effects. • No physical basis for assuming such an equilibrium. • Changes in structure with T are small and not like those implied by the Boltzmann distribution.

    11. We thought about this … … and wrote a paper saying “It’s not true, but it sort of works”

    12. We thought about this … … and wrote a paper saying “It’s not true, but it sort of works”

    13. Then we had a better idea – could we dispense with the reverse Boltzmann formalism?

    14. Instead of assuming a formula that relates the distance distribution to the binding free energy … • … use machine learning to learn the relationship from known structures and binding affinities.

    15. Instead of assuming a formula that relates the distance distribution to the binding free energy … • … use machine learning to learn the relationship from known structures and binding affinities. • And persuade someone to pay for it!
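
To learn the relationship from data, each complex first has to be turned into a fixed-length descriptor vector. The slides do not spell out the descriptors, so the sketch below makes an assumption for illustration: it counts protein-ligand element pairs lying within a distance cutoff. The element lists, the 12 Å cutoff, and the names `FEATURE_PAIRS` and `contact_count_features` are hypothetical choices, not details taken from the presentation.

```python
import math
from itertools import product

# Assumed element sets and cutoff, for illustration only.
PROTEIN_ELEMENTS = ["C", "N", "O", "S"]
LIGAND_ELEMENTS = ["C", "N", "O", "F", "P", "S", "Cl", "Br", "I"]
FEATURE_PAIRS = list(product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS))  # 4 x 9 = 36 features
CUTOFF = 12.0  # angstroms

def contact_count_features(protein_atoms, ligand_atoms, cutoff=CUTOFF):
    """Return one count per (protein element, ligand element) pair within the cutoff.

    Each atom is given as (element symbol, (x, y, z)).
    """
    counts = dict.fromkeys(FEATURE_PAIRS, 0)
    for p_el, p_xyz in protein_atoms:
        for l_el, l_xyz in ligand_atoms:
            key = (p_el, l_el)
            if key in counts and math.dist(p_xyz, l_xyz) < cutoff:
                counts[key] += 1
    return [counts[pair] for pair in FEATURE_PAIRS]
```

Vectors of this kind, paired with measured binding affinities, form the training table that a regression model can learn from, with no energy formula assumed in advance.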

    16. Random Forest Predicted binding affinity

    17. Random Forest ● Introduced by Breiman and Cutler (2001) ● Development of Decision Trees (Recursive Partitioning): ● Dataset is partitioned into consecutively smaller subsets ● Each partition is based upon the value of one descriptor ● The descriptor used at each split is selected so as to optimise splitting ● Bootstrap sample of N objects chosen from the N available objects with replacement

    18. The Random Forest is just a forest of randomly generated decision trees … … whose outputs are averaged to give the final prediction
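
The ingredients on slides 17 and 18 (recursive partitioning, bootstrap sampling, averaging) fit into a short pure-Python sketch of a regression forest. It is a toy written for readability, not the implementation used for RF-Score. Considering only a random subset of `n_try` descriptors at each split is standard random forest practice rather than something stated on the slides, and all function names are hypothetical.

```python
import random
from statistics import mean

def sse(ys):
    """Sum of squared deviations from the mean: the quantity each split tries to reduce."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows, targets, features):
    """Pick the (descriptor, threshold) giving the lowest total SSE, or None if no split exists."""
    best, best_err = None, float("inf")
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if left and right and sse(left) + sse(right) < best_err:
                best, best_err = (f, t), sse(left) + sse(right)
    return best

def build_tree(rows, targets, n_try, rng):
    """Recursive partitioning: split on one descriptor at a time until the subset is pure."""
    if len(set(targets)) == 1:
        return mean(targets)                      # leaf: predicted value
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, targets, features)
    if split is None:
        return mean(targets)
    f, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], n_try, rng),
            build_tree([r for r, _ in right], [y for _, y in right], n_try, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf, following the split at each internal node."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def build_forest(rows, targets, n_trees=25, n_try=1, seed=0):
    """Grow each tree on a bootstrap sample: N objects drawn from N with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """The forest prediction is the average of the individual tree outputs."""
    return mean(predict_tree(tree, row) for tree in forest)
```

For real work a tested library implementation is the better choice; the point here is only that each bullet on the slides maps to a few lines of code.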

    19. Building RF-Score PDBbind 2007

    20. Building RF-Score PDBbind 2007

    21. Validation results: PDBbind set • Following the method of Cheng et al., JCIM 49, 1079 (2009) • Independent test set: PDBbind core 2007, 195 complexes from 65 clusters

    22. Validation results: PDBbind set • RF-Score outperforms competitor scoring functions, at least on our test • RF-Score is available for free from our group website
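
Comparing scoring functions on an independent test set comes down to comparing predicted and measured affinities for the held-out complexes. The transcript does not say which statistics were reported, so the sketch below assumes two common choices, the Pearson correlation coefficient and the root-mean-square error. The function names are hypothetical.

```python
import math

def pearson_r(predicted, measured):
    """Pearson correlation between predicted and measured affinities."""
    n = len(predicted)
    mp = sum(predicted) / n
    mm = sum(measured) / n
    cov = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_m = sum((m - mm) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)

def rmse(predicted, measured):
    """Root-mean-square error, in the same units as the affinities."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)
```

Both functions must be applied to complexes that were not used for training; otherwise the numbers say little about predictive performance.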

    23. John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson