John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson

1 John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson

2 RF-Score: a Machine Learning Scoring Function for Protein-Ligand Binding Affinities Ballester, P.J. & Mitchell, J.B.O. (2010) Bioinformatics 26, 1169-1175

3

4 Calculating the affinities of protein-ligand complexes: ● For docking ● For post-processing docking hits ● For virtual screening ● For lead optimisation ● For 3D QSAR ● Within series of related complexes ● For any general complex ● Absolute (hard!) ● Relative
A difficult, unsolved problem.

5 Three existing approaches … 1. Force fields

6 Three existing approaches … 2. Empirical Functions

7 Three existing approaches … 2. Empirical Functions

8 Three existing approaches … 3. Knowledge based

9 How knowledge-based scoring functions have worked … ● P-L complexes from PDB ● Assign atoms to types ● Find histograms of type-type distances ● Convert to an ‘energy’ ● Add up the energies from all P-L atom pairs
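The steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the code behind any published scoring function: the function names, the 1 Å bin width, the 6 Å cutoff, the value of kT and the choice of reference state (the pooled distance histogram over all type pairs) are all assumptions made for the example.

```python
import math
from collections import defaultdict

KT = 0.593        # kcal/mol near 298 K (assumed temperature)
BIN_WIDTH = 1.0   # histogram bin width in Angstrom (assumed)
MAX_DIST = 6.0    # distance cutoff in Angstrom (assumed)


def bin_index(distance):
    """Map a distance to its histogram bin."""
    return int(distance // BIN_WIDTH)


def build_potentials(training_pairs):
    """Turn (protein_type, ligand_type, distance) observations into energies.

    The reference state is the pooled histogram over all type pairs,
    which is one simple choice among several (an assumption here).
    """
    n_bins = int(MAX_DIST / BIN_WIDTH)
    counts = defaultdict(lambda: [0] * n_bins)
    reference = [0] * n_bins
    for p_type, l_type, dist in training_pairs:
        if dist < MAX_DIST:
            b = bin_index(dist)
            counts[(p_type, l_type)][b] += 1
            reference[b] += 1
    ref_total = sum(reference)

    potentials = {}
    for key, hist in counts.items():
        total = sum(hist)
        energies = []
        for b in range(n_bins):
            if hist[b] == 0 or reference[b] == 0:
                energies.append(0.0)  # no data in this bin: treat as neutral
            else:
                observed = hist[b] / total
                expected = reference[b] / ref_total
                # "reverse Boltzmann": E = -kT * ln(observed / expected)
                energies.append(-KT * math.log(observed / expected))
        potentials[key] = energies
    return potentials


def score_complex(potentials, complex_pairs):
    """Add up the pair 'energies' for one protein-ligand complex."""
    total = 0.0
    for p_type, l_type, dist in complex_pairs:
        if dist < MAX_DIST and (p_type, l_type) in potentials:
            total += potentials[(p_type, l_type)][bin_index(dist)]
    return total
```

Type pairs seen more often than the reference at a given distance receive a negative (favourable) energy, and pairs seen less often receive a positive one.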

10

11

12 ● This conversion of the histogram into an energy function uses a “reverse Boltzmann” methodology. ● Thus it “assumes” that the atoms of protein and ligand are independent particles in equilibrium at temperature T. ● For a variety of reasons, these are poor assumptions …

13 ● Molecular connectivity: atom-atom distances are miles from being independent. ● Excluded volume effects. ● No physical basis for assuming such an equilibrium. ● Changes in structure with T are small and not like those implied by the Boltzmann distribution.

14 We thought about this … … and wrote a paper saying “It’s not true, but it sort of works”

15 We thought about this … … and wrote a paper saying “It’s not true, but it sort of works”

16 Then we had a better idea – could we dispense with the reverse Boltzmann formalism?

17 ● Instead of assuming a formula that relates the distance distribution to the binding free energy … … use machine learning to learn the relationship from known structures and binding affinities.

18 ● Instead of assuming a formula that relates the distance distribution to the binding free energy … … use machine learning to learn the relationship from known structures and binding affinities. ● And persuade someone to pay for it!
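To learn from structures, each complex first has to become a fixed-length vector of numbers. The slides do not spell out the descriptors used, so the sketch below shows just one simple possibility: counting protein-ligand element pairs that lie within a distance cutoff. The element lists, the 12 Å cutoff and the function name are assumptions made for illustration.

```python
import math
from itertools import product

PROTEIN_ELEMENTS = ["C", "N", "O", "S"]             # assumed element set
LIGAND_ELEMENTS = ["C", "N", "O", "S", "F", "Cl"]   # assumed element set
CUTOFF = 12.0                                       # Angstrom (assumed)


def contact_features(protein_atoms, ligand_atoms, cutoff=CUTOFF):
    """Count element pairs closer than `cutoff`.

    Each atom is given as (element, (x, y, z)). The result is a list with
    one count per (protein element, ligand element) pair, in a fixed order,
    so every complex yields a vector of the same length.
    """
    pairs = list(product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS))
    counts = {pair: 0 for pair in pairs}
    for p_el, p_xyz in protein_atoms:
        for l_el, l_xyz in ligand_atoms:
            if (p_el, l_el) in counts and math.dist(p_xyz, l_xyz) < cutoff:
                counts[(p_el, l_el)] += 1
    return [counts[pair] for pair in pairs]
```

Vectors like this, paired with measured binding affinities, are the kind of training data a regression model can learn from without any assumed energy formula.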

19 Random Forest Predicted binding affinity

20 Random Forest ● Introduced by Breiman and Cutler (2001) ● Development of Decision Trees (Recursive Partitioning): ● Dataset is partitioned into consecutively smaller subsets ● Each partition is based upon the value of one descriptor ● The descriptor used at each split is selected so as to optimise splitting ● Bootstrap sample of N objects chosen from the N available objects with replacement

21 ● The Random Forest is just a forest of randomly generated decision trees … … whose outputs are averaged to give the final prediction
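The ideas on the last two slides (recursive partitioning, choosing the best descriptor at each split, bootstrap sampling and averaging) fit into a small pure-Python sketch. It is a toy regression forest for illustration only, not the implementation used for RF-Score. All function names and defaults (`n_trees`, `n_try`, `min_size`) are assumptions, and sampling a random subset of descriptors at each split is the usual random-forest convention rather than something stated on the slides.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets, feature_ids):
    """Pick the (descriptor, threshold) that minimises the squared error."""
    best, best_err = None, float("inf")
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows})[1:]:
            left = [t for row, t in zip(rows, targets) if row[f] < threshold]
            right = [t for row, t in zip(rows, targets) if row[f] >= threshold]
            err = sse(left) + sse(right)
            if err < best_err:
                best, best_err = (f, threshold), err
    return best


def grow_tree(rows, targets, n_try, rng, min_size=2):
    """Recursively partition the data; a leaf stores the mean target."""
    if len(rows) < min_size or len(set(targets)) == 1:
        return mean(targets)
    # Only a random subset of descriptors is considered at each split.
    feature_ids = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, targets, feature_ids)
    if split is None:  # chosen descriptors are constant in this subset
        return mean(targets)
    f, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[f] < threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[f] >= threshold]
    return (f, threshold,
            grow_tree([r for r, _ in left], [t for _, t in left], n_try, rng, min_size),
            grow_tree([r for r, _ in right], [t for _, t in right], n_try, rng, min_size))


def predict_tree(node, row):
    """Follow the splits down to a leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node


def grow_forest(rows, targets, n_trees=50, n_try=1, seed=0):
    """Grow each tree on a bootstrap sample: N of N objects, with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in idx],
                                [targets[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Average the outputs of all trees to give the final prediction."""
    return mean(predict_tree(tree, row) for tree in forest)
```

For real work a tested library implementation would be the sensible choice; the sketch is only meant to make the bootstrap, split and average steps concrete.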

22 Building RF-Score PDBbind 2007

23 Building RF-Score PDBbind 2007

24 Validation results: PDBbind set ● Following the method of Cheng et al., JCIM 49, 1079 (2009) ● Independent test set: PDBbind core 2007, 195 complexes from 65 clusters

25 Validation results: PDBbind set ● RF-Score outperforms competitor scoring functions, at least on our test ● RF-Score is available for free from our group website

26 John Mitchell; James McDonagh; Neetika Nath; Rob Lowe; Richard Marchese Robinson

