Molecular modelling José R. Valverde CNB/CSIC © José R. Valverde, 2014 CC-BY-NC-SA.

Molecular modelling José R. Valverde CNB/CSIC jrvalverde@cnb.csic.es © José R. Valverde, 2014 CC-BY-NC-SA

Contents
- Ab initio modelling
  - Ab initio Quantum Mechanics
  - Ab initio Molecular Mechanics
- Homology modelling
  - Homology
  - Threading
- Structure prediction

Ab initio modelling
- Predict 3D structure from chemical formula
- From purely theoretical models
- Maximum quality (given state of the art)
- Maximum cost (nothing is assumed)

Ab initio QM
- The best possible approximation
- Tremendous computational cost
  - Scales as N^3 to N^8 (N = number of elementary particles)
  - Unfeasible for all but the smallest systems
- Modern approaches scale on the order of N (medium-sized systems)
  - Linearly scaling DFT
  - Multipoles and cutoffs
  - MOZYME
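To make the scaling figures above concrete, here is a minimal sketch that compares how the cost grows when a system becomes ten times larger. The function name `relative_cost` and the system sizes are illustrative assumptions, not part of the original slides, and real timings depend heavily on prefactors that this ignores.

```python
# Relative cost of growing a system under different scaling laws.
# Prefactors are ignored: only the exponent of N is considered.
def relative_cost(n_small: int, n_large: int, exponent: float) -> float:
    """Cost ratio when going from n_small to n_large particles at O(N^exponent)."""
    return (n_large / n_small) ** exponent

for label, power in [("O(N), linear scaling", 1), ("O(N^3)", 3), ("O(N^8)", 8)]:
    ratio = relative_cost(100, 1000, power)
    print(f"{label:>22}: 10x more particles costs {ratio:.0e}x more")
```

Under these assumptions the O(N^8) method becomes a hundred million times more expensive for a tenfold larger system, which is why linear-scaling approaches matter for medium-sized systems.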

Ab initio MM/MD
- Classical mechanics treatment
  - E = E_bonded + E_non-bonded
  - E_non-bonded scales as N^2
  - Soft charged spheres joined by springs
  - Ignores bond breaking and formation
- From-scratch approaches
  - Tractable only for small to medium-sized systems
- Large systems
  - Workable if we can start close to the solution
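The energy decomposition above can be sketched in a few lines. This is a toy illustration, not a real force field: it assumes harmonic bonds only (no angles or torsions), a single Lennard-Jones parameter set for all atoms, and made-up constants (`k`, `r0`, `epsilon`, `sigma`). The double loop in `nonbonded_energy` is where the N^2 scaling comes from.

```python
import math

def bonded_energy(coords, bonds, k=300.0, r0=1.5):
    """Harmonic springs: sum of k * (r - r0)^2 over bonded pairs."""
    return sum(k * (math.dist(coords[i], coords[j]) - r0) ** 2 for i, j in bonds)

def nonbonded_energy(coords, charges, bonds, epsilon=0.1, sigma=3.4, coulomb_k=332.06):
    """Lennard-Jones plus Coulomb over all non-bonded pairs (O(N^2) double loop)."""
    bonded = {frozenset(pair) for pair in bonds}
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if frozenset((i, j)) in bonded:
                continue  # directly bonded atoms are handled by the spring term
            r = math.dist(coords[i], coords[j])
            sr6 = (sigma / r) ** 6
            energy += 4 * epsilon * (sr6 * sr6 - sr6)           # van der Waals
            energy += coulomb_k * charges[i] * charges[j] / r   # electrostatics
    return energy

def total_energy(coords, charges, bonds):
    """E = E_bonded + E_non-bonded."""
    return bonded_energy(coords, bonds) + nonbonded_energy(coords, charges, bonds)

# Three atoms in a line, each bond at its rest length.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
charges = [0.0, 0.0, 0.0]
bonds = [(0, 1), (1, 2)]
print(total_energy(coords, charges, bonds))
```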

Ab initio modeling
- Hard problem, solvable for small proteins/fragments (roughly < 200 a.a.)
- Energy-based / fragment-based
  - Model from scratch using physical principles
- Evolutionary covariation to predict contacts
  - Works for hundreds of residues
- Servers
  - Quark: zhanglab.ccmb.med.umich.edu/QUARK/
  - Robetta: http://robetta.bakerlab.org/
  - Evfold: http://evfold.org/evfold-web/evfold.do

Homology modelling
- Conformational search is a combinatorial problem: N!
- In the best case we would know all the factors involved
  - But we don't
- We may assume that similar functions share similar structures and sequences
  - Start from a similar structure
  - Assume it is close to the solution
  - Apply an energy minimization step

Know your problem
- The first step in any simulation is knowing your problem
  - What do you want to know?
  - What is already known?
  - Is the solution already known?
  - Is there an approximate solution?
  - What are the characteristics, properties and constraints of your system?
- Bibliography
- Database searches

Search for the structure
- Is the structure already known?
- Do we know the sequence?
  - Blast/FastA against PDB
  - Look for exact matches
  - A mutation or polymorphism means we need to build a model
- If we do not know the sequence
  - Direct PDB searches
- If we succeed, we can skip the rest: we already have the structure

Search for a template
- Start from a sequence
  - New sequence
  - Old sequence, mutated
  - Use FASTA format in a text editor
- Select a modeling method
  - Generally good similarity (> 40%): homology
  - Twilight zone (30-40%): threading
  - Local similarities: partial models
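As a reminder of what FASTA format looks like, a record is a header line starting with `>` followed by one or more lines of sequence. The sketch below parses such text into a dictionary; the header and the sequence in `example` are invented for illustration, and `read_fasta` is a hypothetical helper, not a function from any particular package.

```python
def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}

example = """>query hypothetical mutant sequence
MKTAYIAKQR
QISFVKSHFS
"""
print(read_fasta(example))
```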

Steps
- Template recognition and alignment
- Alignment correction
- Backbone generation
- Loop modeling
- Side-chain modeling
- Model optimization
- Model validation

Finding templates
- Run Blast/FastA against PDB
- Do we have matches with enough similarity?
  - > 40%: homology model
  - 30-40%: threading
- What is the sequence coverage?
  - Complete or nearly complete: homology
  - Domain of interest: homology
  - Partial domains: threading
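The identity thresholds above can be turned into a simple decision helper. This is only a sketch: it assumes the two sequences are already aligned (gaps written as `-`), it computes identity over aligned, non-gap columns (one of several conventions), and the fallback label for identities below 30% is an assumption, since the slide gives no rule for that range.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent identity over the non-gap columns of two aligned sequences."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_query, aligned_template)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

def suggest_method(identity):
    """Map percent identity to a method using the thresholds on the slide."""
    if identity > 40:
        return "homology"
    if identity >= 30:
        return "threading"
    return "below slide thresholds"  # assumption: the slide gives no rule here

identity = percent_identity("ACDE-F", "ACDQGF")
print(identity, suggest_method(identity))
```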

Align sequence to template
- We will use the known structure(s) as template
- We must pair residues in the sequence of unknown structure with residues in the sequence(s) of known structure
- Start with an automatic alignment program
- Review and correct the alignment
  - e.g. check domain/secondary structure limits
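The slides do not name a specific alignment algorithm. As one illustration of how residues can be paired automatically, here is a minimal Needleman-Wunsch global alignment with a flat match/mismatch score and a linear gap penalty. Real alignment programs use substitution matrices and affine gap penalties; the scoring values below are arbitrary assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

aligned_query, aligned_template, total = needleman_wunsch("ACDEF", "ACEF")
print(aligned_query)
print(aligned_template)
```

The automatic result is only a starting point: as the slide says, the alignment should then be reviewed by hand, for instance so that gaps do not fall inside secondary structure elements.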

Generate backbone
- Using the backbone coordinates of the known structure(s) as template, assign coordinates to the backbone of the problem sequence
  - N, Cα, C, O
- If the residues are the same, the side chain can also be assigned
- Note that there may be discordances between templates
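A minimal sketch of the coordinate transfer described above, assuming a single template and an alignment given as two gapped strings. The data layout (a list of per-residue dictionaries mapping atom names to coordinates) and the function name `build_backbone` are hypothetical. Query residues aligned to a gap get no coordinates and are left for loop modeling.

```python
def build_backbone(aligned_query, aligned_template, template_backbone):
    """Copy template backbone coordinates onto aligned query residues.

    template_backbone: one dict per template residue, e.g. {"CA": (x, y, z), ...}.
    Returns a list of (residue, coords) with coords=None where the query
    residue has no template counterpart (an insertion).
    """
    model = []
    t_index = 0
    for q, t in zip(aligned_query, aligned_template):
        coords = None
        if t != "-":
            coords = template_backbone[t_index]
            t_index += 1
        if q == "-":
            continue  # template residue absent from the query (a deletion)
        model.append((q, dict(coords) if coords is not None else None))
    return model

# Toy template with CA atoms only, spaced 3.8 units apart along x.
template = [{"CA": (3.8 * i, 0.0, 0.0)} for i in range(4)]
model = build_backbone("ACDEF", "AC-EF", template)
print(model)
```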

Loop/turn modeling
- Most discordances will affect loops
  - Different lengths, greater flexibility
- Cut out loops
- Model loops separately
  - Knowledge-based: refer to known PDB structures
    - Loop databases
  - Energy-based (ab initio): minimize an energy function

Side chain modeling
- When side chains are not the same, we cannot use the reference coordinates
- Use available knowledge (gathered from PDB) to select the most appropriate rotamer
  - Dunbrack rotamer library
- Look for a rotamer that favours packing and lower energy
- ~90% accuracy in the hydrophobic core (tightly packed), ~50% for surface residues

Refine initial model
- What next?
- It may be sensible to stop here
  - If similarity is very high (point mutation)
    - After ensuring there are no steric clashes (try rotamers)
  - If we know there are no major changes
  - If validation shows no major conflicts
  - If it agrees with prior knowledge
- Perform additional refinement cycles
  - Minimization (to avoid strong conflicts)
  - MD (to allow for conformational changes)
  - Simulated annealing

Model optimization
- In general, more stable structures will have lower internal energy
  - Compute energy
  - Use energy to optimize the structure
- Molecular Mechanics
  - Force field
  - Various minimization algorithms:
    - Quick and dirty
    - Slow and accurate
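Steepest descent is a typical example of a "quick and dirty" minimizer. The slides do not say which algorithms are meant, so this choice is an assumption. The sketch below minimizes a toy energy made of independent harmonic bonds; the constants `K` and `R0` and the step size are made up, and the step must be small enough (here below 1/K) for the iteration to converge.

```python
def steepest_descent(energy, gradient, x, step=0.01, tol=1e-6, max_iter=10000):
    """Follow the negative gradient until the largest component falls below tol."""
    for _ in range(max_iter):
        g = gradient(x)
        if max(abs(gi) for gi in g) < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, energy(x)

# Toy energy: independent harmonic bonds with rest length R0 (illustrative values).
R0, K = 1.5, 10.0

def bond_energy(lengths):
    return sum(K * (r - R0) ** 2 for r in lengths)

def bond_gradient(lengths):
    return [2 * K * (r - R0) for r in lengths]

minimized, final_energy = steepest_descent(bond_energy, bond_gradient, [2.0, 1.1])
print(minimized, final_energy)
```

"Slow and accurate" alternatives such as conjugate gradient or Newton-type methods use more information per step and converge in fewer iterations near the minimum.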

Further optimization
- Validate your structure
  - Check against prior knowledge
- Optimization might have found a local minimum
  - Look for alternate configurations
  - Simulated Annealing
  - Molecular Dynamics
    - Simulation times are too short (ps-ns)
- Validate and repeat
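To illustrate how simulated annealing can escape a local minimum, here is a minimal Metropolis-style sketch on a one-dimensional double-well energy. The energy function, the temperature schedule and the step size are all illustrative assumptions. The search starts in the shallower well near x = +1 and keeps track of the best point visited.

```python
import math
import random

def simulated_annealing(energy, x0, t_start=2.0, t_end=0.01, cooling=0.99,
                        step=0.5, seed=0):
    """Metropolis moves under a geometrically decreasing temperature."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        candidate = x + rng.uniform(-step, step)
        e_new = energy(candidate)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling
    return best_x, best_e

def double_well(x):
    """Two minima: a local one near x = +1 and a deeper one near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

best_x, best_e = simulated_annealing(double_well, x0=1.0)
print(best_x, best_e)
```

A plain minimizer started at x = 1 would stay in the nearby well. Annealing is not guaranteed to find the global minimum either, which is why the slide recommends validating and repeating.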

Model Validation
- All models tend to contain errors
  - Errors may accumulate after “refinement”!
  - There may be “errors” in the template(s)
- Check for energetic conflicts
- Check for “normality”
  - Requires caution
- Compare with template(s)
- Decide whether the errors are significant (affect relevant portions of the structure)
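One very simple conflict check is to look for alpha carbons that sit implausibly close to each other. The sketch below is only an illustration: it assumes CA-only coordinates in ångströms, skips sequence neighbours (which are normally about 3.8 Å apart), and uses an arbitrary 3.0 Å cutoff. Dedicated validation tools perform far more thorough checks.

```python
import math

def find_ca_clashes(ca_coords, min_distance=3.0, skip_neighbors=1):
    """Return index pairs of non-adjacent CA atoms closer than min_distance."""
    clashes = []
    for i in range(len(ca_coords)):
        for j in range(i + 1 + skip_neighbors, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) < min_distance:
                clashes.append((i, j))
    return clashes

# Toy chain: the fourth CA folds back onto the first one.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (0.5, 0.5, 0.0)]
print(find_ca_clashes(ca))
```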

Homology modelling
- Check CASP results
- Some studies report that automated modeling may be safer than human-curated modeling
  - Fewer subjective decisions
- Start with public servers
  - Automatic: upload a sequence and wait
    - 3D-Jigsaw, CPHmodels, Robetta, EasyPred3D...
  - Semiautomatic: upload a sequence and participate
    - WHATIF, HOMER, SwissModel with DeepView

MetaServers
- GeneSilico: www.genesilico.pl/meta2/
  - Wide variety of predictions
- LOMETS: zhanglab.ccmb.med.umich.edu/LOMETS
  - Local MetaThreading server
- Protein Model Portal: www.proteinmodelportal.org
  - Interactive modeling
  - Access existing models

Servers
- CPHmodels: http://www.cbs.dtu.dk/services/CPHmodels/
- HHpred: http://toolkit.tuebingen.mpg.de/hhpred
- ModBase: http://modbase.compbio.ucsf.edu/
- LOOPP: http://cbsuapps.tc.cornell.edu/loopp.aspx
- Zhang Lab: http://zhanglab.ccmb.med.umich.edu/

Servers (continued)
- Phyre2: http://www.sbg.bio.ic.ac.uk/~phyre2/
- (ps)2: http://ps2.life.nctu.edu.tw/
- PsiPred: http://bioinf.cs.ucl.ac.uk/psipred/
- M4T: http://manaslu.aecom.yu.edu/M4T/
- SwissModel: http://swissmodel.expasy.org/

Threading
- Generalization of homology modeling
  - Homology: align sequence to sequence
  - Threading: align sequence to structures
- Limited number of basic folds in nature
- Amino acids have environmental preferences for specific folds
- Used when sequence identity < 25%

Threading components
- Library of core fold templates (e.g. from SCOP, CATH, FSSP, PDB...)
- Function to evaluate quality of assignment
  - Consider a.a. preferences for location, structure, neighbors...
- Method for aligning sequences to fold templates
- Method for choosing the best template among alignments
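The components listed above can be illustrated with a deliberately tiny example. Everything in it is an assumption made for the sketch: a two-class residue alphabet, a made-up preference table, a two-fold "library" in which each position is only labelled buried or exposed, and ungapped threading (the sequence and the fold must have the same length). Real threading programs use statistical potentials and gapped sequence-to-structure alignment.

```python
# Hypothetical preference scores for (residue class, structural environment).
PREFERENCE = {
    ("hydrophobic", "buried"): 1.0,
    ("hydrophobic", "exposed"): -1.0,
    ("polar", "buried"): -1.0,
    ("polar", "exposed"): 0.5,
}
HYDROPHOBIC = set("AVLIMFWC")

def residue_class(aa):
    return "hydrophobic" if aa in HYDROPHOBIC else "polar"

def thread_score(sequence, environments):
    """Score an ungapped placement of the sequence onto a fold's environments."""
    if len(sequence) != len(environments):
        raise ValueError("ungapped threading needs equal lengths")
    return sum(PREFERENCE[(residue_class(aa), env)]
               for aa, env in zip(sequence, environments))

def best_template(sequence, library):
    """Pick the fold whose environments best match the sequence."""
    return max(library, key=lambda name: thread_score(sequence, library[name]))

library = {
    "fold_a": ["buried", "buried", "exposed", "exposed"],
    "fold_b": ["exposed", "exposed", "buried", "buried"],
}
print(best_template("LLKK", library))
```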

Threading software
- HHpred: http://toolkit.tuebingen.mpg.de/hhpred
- RaptorX: http://raptorx.uchicago.edu/
- Phyre2: http://www.sbg.bio.ic.ac.uk/phyre2
- SPARKS X: sparks-lab.org/yueyang/server/SPARKS-X/
- 3D-JigSaw: http://bmm.cancerresearchuk.org/~3djigsaw/

Questions? Image by DasWortgewand. CC0 http://pixabay.com/en/head-human-head-half-profile-196541/
