Protein Structure Visualisation


Protein Structure Visualisation PyMOL

What is PyMOL? Open-source molecular viewing program http://www.pymol.org

PyMOL Representations: lines, sticks, ribbon, spheres, cartoon(s). Surfaces: transparency, quality. Ray-tracing (rendering): modes.

Selections & Objects Every molecule (PDB file) is an object. Selections refer to (parts of) objects and can be used to make smaller or composite objects. Changes in representation can affect objects or selections.
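The object/selection distinction can be sketched as a short PyMOL command script. The Python helper below only builds the script text; the file name, object name, chain and residue range are hypothetical placeholders, and the selection name `site` is an arbitrary choice. The commands used (`load`, `select`, `hide`, `show`, `create`) are standard PyMOL commands.

```python
def selection_script(pdb_file, object_name, chain, first_res, last_res):
    """Return PyMOL commands that load an object and style a selection of it."""
    # A selection is an expression over atoms of one or more objects.
    expression = f"{object_name} and chain {chain} and resi {first_res}-{last_res}"
    commands = [
        f"load {pdb_file}, {object_name}",   # the loaded PDB file becomes an object
        f"select site, {expression}",        # named selection inside that object
        f"hide everything, {object_name}",   # representation change on the object
        f"show cartoon, {object_name}",
        "show sticks, site",                 # representation change on the selection
        "create site_obj, site",             # smaller object made from the selection
    ]
    return "\n".join(commands)


if __name__ == "__main__":
    # Hypothetical input: residues 10-20 of chain A in a local file.
    print(selection_script("protein.pdb", "prot", "A", 10, 20))
```

The printed text can be saved as a `.pml` file and run from within PyMOL.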

Links
PDB (protein structure database): www.pdb.org/
PyMOL home: http://pymol.sourceforge.net/
PyMOL manual: http://pymol.sourceforge.net/newman/user/toc.html
PyMOL Wiki: http://www.pymolwiki.org/index.php/Main_Page
PyMOL settings (documented): http://cluster.earlham.edu/detail/bazaar/software/pymol/modules/pymol/setting.py

Aesthetics of Molecular Images
- Personal taste
- General guidelines: focus on relevant parts; strip away unnecessary information; find a good viewing angle!
- Choice of representation: no need for excessive graphics
- Good figures are (almost) self-explanatory

Molecular Images Molecular graphics is an example of selective reduction of complexity.


Figures in PyMOL

Representations ray_trace_mode 0

Representations ray_trace_mode 1

Representations ray_trace_mode 2

Representations ray_trace_mode 3
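As a rough illustration of how the four ray-trace modes are switched, the Python helper below writes a small PyMOL rendering script. The mode descriptions in the dictionary follow the PyMOL Wiki as far as I recall them and should be checked against your PyMOL version; the image size and file name are arbitrary assumptions.

```python
# Short descriptions of the modes (assumed from the PyMOL Wiki; verify locally).
RAY_TRACE_MODES = {
    0: "normal colour",
    1: "normal colour with black outline",
    2: "black outline only",
    3: "quantized colour with black outline",
}


def render_script(image_file, mode=1, width=1200, height=900):
    """Return PyMOL commands that ray-trace the current view and save a PNG."""
    if mode not in RAY_TRACE_MODES:
        raise ValueError(f"ray_trace_mode must be one of {sorted(RAY_TRACE_MODES)}")
    commands = [
        "bg_color white",                 # white background for print figures
        f"set ray_trace_mode, {mode}",    # choose the rendering style
        f"ray {width}, {height}",         # ray-trace at the requested pixel size
        f"png {image_file}",              # write the rendered image to disk
    ]
    return "\n".join(commands)


if __name__ == "__main__":
    for mode, description in RAY_TRACE_MODES.items():
        print(f"# mode {mode}: {description}")
        print(render_script(f"figure_mode{mode}.png", mode=mode))
```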


Homology Modelling

No Structure = No Go? Do homology modelling. Use known protein structures to make reliable models. Validate the model. Adjust expectations and use accordingly!

Basic concepts - Evolution preserves important functions - Function is performed by specific structure - Most sequences don't fold

Basic concepts Homologous proteins derive from the same ancestral protein. Orthology occurs with speciation: structure and function very conserved. Paralogy occurs with duplication: structure conserved.

Homology modeling Identify a homolog with a solved structure (structural template) using sequence similarity, and copy the conserved regions. Structure and function are more conserved than sequence.

How Often Can We Do It? Currently ~100,000 structures in the PDB. Approx. 12,000 with sequence identity <30% and resolution <3.0 Å. 25% of all sequences can be modelled. 50% can be assigned to a fold class.

How Well Can We Do It? Sali, A. & Kuriyan, J. Trends Biochem. Sci. 24, M20–M24 (1999)

How Is It Done? Identify template(s), initial alignment, improve alignment, backbone generation, loop modelling, side chains, refinement, validation.

Template Identification Search with sequence: BLAST (>60%), PSI-BLAST (>30%), profile-profile alignment (>15%), fold recognition methods (>10%). Use biological information: functional annotation in databases, active site/motifs.
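The identity thresholds on this slide can be read as a simple lookup from expected sequence identity to the least expensive search method likely to find a template. The sketch below encodes only the four cut-offs given above; the function name and the fallback message are my own, and real searches are usually judged by E-values and coverage as well.

```python
# (minimum % identity, method) pairs taken from the slide, strictest first.
SEARCH_METHODS = [
    (60.0, "BLAST"),
    (30.0, "PSI-BLAST"),
    (15.0, "profile-profile alignment"),
    (10.0, "fold recognition"),
]


def suggest_search_method(percent_identity):
    """Return the first listed method whose identity threshold is exceeded."""
    for threshold, method in SEARCH_METHODS:
        if percent_identity > threshold:
            return method
    return "no listed method (identity too low)"


if __name__ == "__main__":
    for identity in (75, 42, 20, 12, 5):
        print(f"{identity}% identity: {suggest_search_method(identity)}")
```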

Alignment pipelines
- HHpred (profile-profile comparison): 3 PSI-BLAST/HHblits cycles (improve alignment locally with MAC)
- Find homologs with 2 cycles of CS-BLAST
- Realign with MUSCLE/MAFFT
- Check insertions/deletions
- Use the alignment with HHpred

Alignment

1   2   3   4   5   6   7   8   9   10  11  12  13  14
PHE ASP ILE CYS ARG LEU PRO GLY SER ALA GLU ALA VAL CYS
PHE ASN VAL CYS ARG THR PRO --- --- --- GLU ALA ILE CYS
PHE ASN VAL CYS ARG --- --- --- THR PRO GLU ALA ILE CYS
[Figure: excerpt of a residue substitution score matrix]


Improving the Alignment
1   2   3   4   5   6   7   8   9   10  11  12  13  14
PHE ASP ILE CYS ARG LEU PRO GLY SER ALA GLU ALA VAL CYS
PHE ASN VAL CYS ARG THR PRO --- --- --- GLU ALA ILE CYS
PHE ASN VAL CYS ARG --- --- --- THR PRO GLU ALA ILE CYS
From "Professional Gambling" by Gert Vriend http://www.cmbi.kun.nl/gv/articles/text/gambling.html
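To make the comparison concrete, the two alternative gap placements can be scored by simple sequence identity against the first row. This is a minimal sketch: the names `template`, `option_a` and `option_b` are my labels for the three rows, and identity over aligned columns is only one possible score. It says nothing about whether the gap can be closed in three dimensions, which is why the alignment with the higher identity is not automatically the better one for modelling.

```python
GAP = "---"

template = "PHE ASP ILE CYS ARG LEU PRO GLY SER ALA GLU ALA VAL CYS".split()
option_a = "PHE ASN VAL CYS ARG THR PRO --- --- --- GLU ALA ILE CYS".split()
option_b = "PHE ASN VAL CYS ARG --- --- --- THR PRO GLU ALA ILE CYS".split()


def identity(reference, aligned):
    """Return (identical residues, aligned columns), skipping gap columns."""
    if len(reference) != len(aligned):
        raise ValueError("aligned rows must have the same number of columns")
    pairs = [(r, a) for r, a in zip(reference, aligned) if GAP not in (r, a)]
    identical = sum(1 for r, a in pairs if r == a)
    return identical, len(pairs)


if __name__ == "__main__":
    for name, row in (("option_a", option_a), ("option_b", option_b)):
        same, columns = identity(template, row)
        print(f"{name}: {same}/{columns} identical ({100 * same / columns:.0f}%)")
```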

Rules of thumb
- No deletion whose flanking residues are far apart in space
- No insertion in the core
- Check for conserved residues, especially G, P, C
- Check for motif conservation
- Be careful with secondary structure prediction
- Identify disordered/accessory regions
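One of these checks, keeping insertions and deletions out of regular secondary structure, is easy to automate once the template's secondary structure is known. The sketch below uses helix/strand membership as a rough stand-in for "the core", which is my simplification; the sequences and secondary-structure strings are made-up toy data (H = helix, E = strand, C = coil, `-` = gap).

```python
def indels_in_sse(template_ss, template_aln, target_aln):
    """Return (column, kind) for indels that fall inside a template helix or strand.

    All three strings are aligned column by column; columns are 1-based.
    """
    if not (len(template_ss) == len(template_aln) == len(target_aln)):
        raise ValueError("all rows must have the same length")
    n = len(template_aln)
    flagged = []
    for i in range(n):
        if target_aln[i] == "-" and template_ss[i] in ("H", "E"):
            # Target lacks a residue that sits in a template helix/strand.
            flagged.append((i + 1, "deletion"))
        elif template_aln[i] == "-":
            # Target has an extra residue: look at the nearest template
            # residues on both sides of the gap.
            left = next((template_ss[j] for j in range(i - 1, -1, -1)
                         if template_aln[j] != "-"), "C")
            right = next((template_ss[j] for j in range(i + 1, n)
                          if template_aln[j] != "-"), "C")
            if left == right and left in ("H", "E"):
                flagged.append((i + 1, "insertion"))
    return flagged


if __name__ == "__main__":
    # Toy data, not taken from any real structure.
    print(indels_in_sse("CHHHHHCCCC", "ACDEFGHIKL", "AC-EFGHIKL"))
    print(indels_in_sse("CHHH-HHCCCC", "ACDE-FGHIKL", "ACDEWFGHIKL"))
    print(indels_in_sse("CHHHHHC-CCC", "ACDEFGH-IKL", "ACDEFGHWIKL"))
```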

Template Quality Selecting the best template is crucial! The best template may not be the one with the highest % id (best p-value…) Template 1: 93% id, 3.5 Å resolution  Template 2: 90% id, 1.5 Å resolution 
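One way to express this trade-off is to shortlist templates whose identity is close to the best one and then prefer the best resolution among them. This is an illustrative heuristic of my own, not a rule from the slides; the 5-percentage-point tolerance is an arbitrary assumption, and the dictionary keys are hypothetical.

```python
def pick_template(templates, identity_tolerance=5.0):
    """Pick the best-resolved template among those close to the top identity.

    `templates` is a list of dicts with keys "name", "identity" (%) and
    "resolution" (Å; lower is better).
    """
    if not templates:
        raise ValueError("no templates given")
    best_identity = max(t["identity"] for t in templates)
    shortlist = [t for t in templates
                 if best_identity - t["identity"] <= identity_tolerance]
    return min(shortlist, key=lambda t: t["resolution"])


if __name__ == "__main__":
    candidates = [
        {"name": "Template 1", "identity": 93.0, "resolution": 3.5},
        {"name": "Template 2", "identity": 90.0, "resolution": 1.5},
    ]
    print(pick_template(candidates)["name"])
```

With the two templates from the slide, the 3-point loss in identity is outweighed by the much better resolution, so Template 2 is chosen.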

Backbone Generation Generate the backbone coordinates from the template for the aligned regions. Several programs can do this; most of the groups at CASP6 used Modeller: http://www.salilab.org/modeller/

Side Chains Side chain rotamers are dependent on backbone conformation. The most successful method in CASP6 was SCWRL by Dunbrack et al.: a graph-theory, knowledge-based method to solve the combinatorial problem of side chain modelling. http://dunbrack.fccc.edu/scwrl4/index.php

Side Chains Prediction accuracy: high for buried residues, low for surface residues. Experimental reasons: side chains at the surface are more flexible. Theoretical reasons: it is much easier to handle hydrophobic packing in the core than the electrostatic interactions, including H-bonds to waters. [Figure: myoglobin]

Side Chains If the seq. id is high, the networks of side chain contacts may be conserved. Keeping the side chain rotamers from the template may be better than predicting new ones.
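The idea of keeping template rotamers can be sketched as a per-column decision: where the target and template residues are identical, copy the template side chain; elsewhere, rebuild it with a side-chain placement program. The sequences below are the one-letter forms of the example alignment used earlier in this deck; the action labels are my own, and in practice this would only be applied when overall identity is high.

```python
def side_chain_plan(template_aln, target_aln):
    """Return (column, target residue, action) for every target residue."""
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned rows must have the same length")
    plan = []
    for column, (t, q) in enumerate(zip(template_aln, target_aln), start=1):
        if q == "-":
            continue  # nothing to build: the target has no residue here
        if t == q:
            plan.append((column, q, "keep template rotamer"))
        else:
            # Substituted residue, or insertion with no template residue.
            plan.append((column, q, "rebuild"))
    return plan


if __name__ == "__main__":
    template = "FDICRLPGSAEAVC"
    target = "FNVCR---TPEAIC"
    for column, residue, action in side_chain_plan(template, target):
        print(column, residue, action)
```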

Error Recovery Errors in the model cannot be recovered at a later step. The alignment cannot make up for a bad choice of template. Loop modelling cannot make up for a poor alignment. The step where the errors were introduced should be redone.

Model refinement
- Close loops with Rosetta, Modeller, FREAD
- SCWRL4 for side chains
- Check for errors/clashes before refining
- Don't over-refine (KoBaMIN)

Validation Most programs will get the bond lengths and angles right. The model's Ramachandran plot will resemble the template's Ramachandran plot, so select a high-quality template! Check the inside/outside distributions of polar and apolar residues.
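A Ramachandran plot is built from the backbone dihedral angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). The sketch below shows one common way to compute a dihedral angle from four atom positions using only the standard library. Reading the coordinates from a PDB file is left out, and the test points are made-up geometry, not real atoms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Return the dihedral angle in degrees defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy geometry: the fourth point is rotated a quarter turn about the z axis.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a residue, phi would be `dihedral(C_prev, N, CA, C)` and psi `dihedral(N, CA, C, N_next)`, with each argument an (x, y, z) tuple.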

Model Validation: ProQ. ProQ is a neural network-based predictor: structural features → quality of a protein model. ProQ is optimized to find correct models… …NOT (necessarily) native structures. Two quality measures: MaxSub & LGscore. Arne Elofsson's group: http://www.sbc.su.se/~bjorn/ProQ/

Structure Validation Program Suites
PROCHECK: http://www.ebi.ac.uk/thornton-srv/software/PROCHECK/
WHAT IF server: http://swift.cmbi.ru.nl/whatif/