Published by Lynne Hunter. Modified over 6 years ago.
1
Generating, Maintaining, and Exploiting Diversity in a Memetic Algorithm for Protein Structure Prediction
Mario Garza-Fabre, Shaun M. Kandathil, Julia Handl, Joshua Knowles, Simon C. Lovell
Presentation by Michiel Braat, Hugo Heemskerk, Kambiz Sekandar and Matthijs de Wachter
2
Protein structure prediction
– Applicable in medicine
– We have: the amino acid sequence
– We want: a 3D model of the protein
– Not the same as the dynamic process of protein folding
3
Folds (image by Thomas Splettstoesser, own work, CC BY-SA 3.0)
4
Problems of protein structure prediction
1) Combinatorial explosion
2) Difficult to explore a diverse set of protein folds
3) The energy function of protein configurations is
3A) Deceptive
3B) Inaccurate
5
The problems in GA terms
1) Combinatorial explosion ⇒ Rugged fitness landscape
2) Difficult to explore a diverse set of protein folds ⇒ Loss of diversity
3) The energy function of protein configurations is
3A) Deceptive ⇒ Deceptive fitness function
3B) Inaccurate ⇒ Inaccurate fitness function
6
Solutions
– Genetic local search (memetic algorithm)
– Specialised genetic operators
– Generalised stochastic ranking
– Conformational diversity measures (to tell apart compact structures with different folds)
7
Protein structure construction algorithms
– Homologous proteins (global similarity): a hard problem
– Fragment-assembly (local similarity): more recent, seems more promising
– Fragment-assembly turns protein structure prediction into a combinatorial optimisation problem
– It performs worse for larger proteins and for proteins with many self-touching segments
8
Fragment-assembly
– Divide the target protein into amino acid fragments
– Match, then extract, fragments from known proteins
– Recombine the fragments with an optimisation scheme
– Generates a low-resolution model
– Key advantage: no prior similar proteins required
9
Rosetta heuristic
– This is the local search algorithm used
– Uses the fragment-assembly model of the protein as its base model
– Varies backbone torsion angles (rotations of the “protein skeleton”)
10
Rosetta-based memetic algorithm (RMA)
– Rosetta as the local search strategy
– Genetic operators use problem-specific knowledge (about secondary structures)
– Ranked selection over parents + offspring
– The energy is the only evaluation criterion (for now...)
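The overall loop of a memetic algorithm (a genetic algorithm whose individuals are refined by local search, with ranked survival over parents plus offspring) can be sketched as follows. This is only an illustrative sketch, not the RMA itself: the toy problem (minimising a sum of squares), the random parent selection, and every function name and parameter below are assumptions made for the example. In the RMA, the local search would be Rosetta and the energy would be the protein energy function.

```python
import random

def memetic_algorithm(init, energy, local_search, crossover, mutate,
                      pop_size=10, generations=20, rng=None):
    """Generic memetic loop: variation, then local search, then ranked survival."""
    rng = rng or random.Random()
    # Every individual is refined by local search before entering the population.
    population = [local_search(init(rng), rng) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            a, b = rng.sample(population, 2)  # assumption: uniform parent choice
            child = mutate(crossover(a, b, rng), rng)
            offspring.append(local_search(child, rng))
        # Survival: rank parents and offspring together by energy, keep the best.
        population = sorted(population + offspring, key=energy)[:pop_size]
    return min(population, key=energy)

# --- Hypothetical toy problem standing in for the protein energy landscape ---
def toy_energy(x):
    return sum(v * v for v in x)

def toy_init(rng):
    return [rng.uniform(-5, 5) for _ in range(5)]

def toy_local_search(x, rng, steps=20):
    best = list(x)
    for _ in range(steps):
        cand = [v + rng.gauss(0, 0.1) for v in best]
        if toy_energy(cand) < toy_energy(best):  # accept improvements only
            best = cand
    return best

def toy_crossover(a, b, rng):
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:]

def toy_mutate(x, rng):
    y = list(x)
    y[rng.randrange(len(y))] += rng.gauss(0, 0.5)
    return y

best = memetic_algorithm(toy_init, toy_energy, toy_local_search,
                         toy_crossover, toy_mutate, rng=random.Random(1))
print(round(toy_energy(best), 4))
```

Because survivors are taken from the union of parents and offspring, the best energy in the population can never get worse from one generation to the next.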
11
RMA - Variation
– 2-point crossover on loops
– Loop locations are based on secondary structure predictions
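A loop-restricted 2-point crossover might look like the sketch below. The representation is an assumption made for illustration: each parent is a list with one entry per residue (for example a pair of torsion angles), and the predicted secondary structure is a string with "L" marking loop residues. The function name and the fallback behaviour when fewer than two loop residues exist are also hypothetical.

```python
import random

def two_point_loop_crossover(parent_a, parent_b, ss, rng=random):
    """Swap the segment between two cut points, both chosen at loop residues."""
    loops = [i for i, s in enumerate(ss) if s == "L"]
    if len(loops) < 2:
        # Assumption: with fewer than two loop residues, return copies unchanged.
        return list(parent_a), list(parent_b)
    i, j = sorted(rng.sample(loops, 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

# Toy usage: letters stand in for per-residue torsion angles.
ss = "LHHHLLEEEL"
a, b = ["a"] * len(ss), ["b"] * len(ss)
print(two_point_loop_crossover(a, b, ss, rng=random.Random(0)))
```

Placing both cut points in loops means a helix or strand is never split between the two parents, which is the kind of problem-specific knowledge the slide refers to.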
12
RMA - Mutation
– Mutation by fragment insertion
– Only applied to amino acid residues that are part of a loop
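Fragment insertion limited to loop residues could be sketched as below. Several details are assumptions, not taken from the slides: the fragment library is a hypothetical dict mapping a start position to candidate fragments of length `frag_len`, the insertion window is required to lie entirely inside a loop, and loop residues are marked "L" in the secondary structure string.

```python
import random

def loop_fragment_mutation(conformation, ss, fragment_library,
                           frag_len=3, rng=random):
    """Overwrite one window of per-residue angles with a library fragment.

    Assumption: the whole window must consist of loop residues ("L"), and
    every library fragment has exactly frag_len entries.
    """
    starts = [s for s in range(len(ss) - frag_len + 1)
              if s in fragment_library
              and all(c == "L" for c in ss[s:s + frag_len])]
    if not starts:
        return list(conformation)  # nothing eligible: return an unchanged copy
    s = rng.choice(starts)
    fragment = rng.choice(fragment_library[s])
    mutated = list(conformation)
    mutated[s:s + frag_len] = list(fragment)
    return mutated

# Toy usage: (phi, psi) pairs per residue; library values are made up.
ss = "HHHLLLLHHH"
conformation = [(-60.0, -45.0)] * len(ss)
library = {
    0: [[(9.0, 9.0)] * 3],   # ignored: window 0..2 is helix, not loop
    3: [[(1.0, 1.0)] * 3],
    4: [[(2.0, 2.0)] * 3],
}
print(loop_fragment_mutation(conformation, ss, library, rng=random.Random(0)))
```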
13
Energy evaluation vs RMSD
– The conformation with the best energy is not always the one closest to the real structure!
– RMSD measures how closely a conformation corresponds to the real structure of the protein
– RMSD: root mean square deviation (distances between secondary structures)
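As a reminder of what a root mean square deviation computes, here is a minimal sketch over two equal-length lists of 3D points. It assumes the two structures are already superimposed and that points correspond one to one; the coordinates are made-up toy values, and which points the presented study actually compares is not specified beyond the slide's description.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between corresponding (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy usage: the model differs from the reference in one point, by distance 1.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
print(rmsd(native, model))  # sqrt(1/3), about 0.577
```

A lower RMSD means the model is closer to the reference; an RMSD of 0 means the compared points coincide.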
14
RMA vs Rosetta
– 1000 local searches
– 30 different proteins
[Figure: results for protein 1enh; Rosetta = blue, RMA = red]
15
Genetic Operator and Exploration
Influence of the specific genetic operators on the exploration of the different folds. Experiment with:
– no operators
– normal 2-point crossover and normal Rosetta mutations
– original RMA
– original RMA with wrong secondary structure information
16
Genetic Operator and secondary structure
– Protein: 1enh
– 3 secondary structures
– Axes: distances between structures
– Darker red = more exploration
17
How to deal with inaccuracies
– Energy and RMSD are not always correlated
– Diversity is used as a stand-in for RMSD, which cannot be computed without knowing the real structure
18
How to deal with inaccuracies
– Stochastic ranking for dealing with 2 criteria
– The algorithm is based on a bubble-sort-like procedure
– Comparisons are based on probabilities
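A bubble-sort-like stochastic ranking over two criteria can be sketched as follows. The details are assumptions made for illustration: each adjacent pair is compared by energy (lower is better) with probability `rho` and by diversity (higher is better) otherwise, and the function and parameter names are hypothetical. The generalised procedure used in the presented work may differ.

```python
import random

def stochastic_rank(items, energy, diversity, rho=0.5, rng=random):
    """Rank items with repeated adjacent comparisons on a randomly chosen criterion."""
    ranked = list(items)
    n = len(ranked)
    for _ in range(n):  # at most n sweeps, as in bubble sort
        swapped = False
        for k in range(n - 1):
            a, b = ranked[k], ranked[k + 1]
            if rng.random() < rho:
                out_of_order = energy(a) > energy(b)        # compare by energy
            else:
                out_of_order = diversity(a) < diversity(b)  # compare by diversity
            if out_of_order:
                ranked[k], ranked[k + 1] = b, a
                swapped = True
        if not swapped:
            break  # a full sweep without swaps ends the procedure early
    return ranked

# Toy usage: each item is a made-up (energy, diversity) pair.
items = [(3.0, 0.1), (1.0, 0.9), (2.0, 0.5), (0.5, 0.2)]
print(stochastic_rank(items, lambda t: t[0], lambda t: t[1],
                      rho=0.5, rng=random.Random(0)))
```

In this sketch, `rho = 1` reduces to a plain sort by energy and `rho = 0` to a plain sort by diversity; values near 0.5 mix the two, so an item with poor energy but high diversity still has a chance to rank well.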
19
Experimental Results
– Three different values for the stochastic ranking parameter were analysed: ρ ∈ {.45, .5, .55}
– These were compared with Rosetta, with RMA using energy-based selection, and with each other
20
Experimental Results
Legend: R = Rosetta; E = RMA with energy-based selection; S = RMA with stochastic selection, ρ ∈ {.55, .5, .45}
– Stochastic ranking reduced selection pressure
– All forms of RMA, except ρ = .45, outperformed Rosetta
21
Experimental Results
Legend: R = Rosetta; E = RMA with energy-based selection; S = RMA with stochastic selection, ρ ∈ {.55, .5, .45}
– Considering structural diversity increased the likelihood of the RMA reaching and preserving more native-like conformations
– ρ = .5 seems to produce the most competitive performance
– In cases where RMA cannot outperform Rosetta, native-like conformations tend to be associated with higher energies, which makes them more difficult for RMA to retain
– When energy and RMSD are well correlated, the stochastic strategy still gives competitive results
22
Experimental Results
– Fragment-assembly methods rely on the existence of native-like configurations in the conformational space defined by the fragment libraries employed
– For some targets, for instance 1tul and 1dhn, no native-like structures were sampled
– This may mean the libraries used for this study are lacking, and deserves further investigation
23
Diversity Generation and Preservation
Next, we examine the effect of the genetic operators and the survival selection strategy on the diversity generation and preservation.
24
Diversity Generation and Preservation
Without genetic operators, the energy-driven RMA (i) produces compact, well-defined solution clusters. The lack of mechanisms boosting exploration and high selection pressure can lead to premature convergence. Adding recombination and mutation (ii), and using stochastic selection (iii) both increase diversity. Combining these (iv), however, gives the best results.
25
Diversity Generation and Preservation
Having two criteria causes a drop in offspring survival. This slows the convergence speed and results in higher diversity.
26
Discussion
Cons:
– Accuracy was only tested on known protein structures
Pros:
– Generally, applying GAs to other fields of study leads to new challenges in genetic computation research
– Specifically in this paper: inaccurate fitness function ⇒ solution: selecting for diversity