Comparison of Exemplars of Rotamer Clusters Across the Proteinogenic Amino Acids

Slides:



Advertisements
Similar presentations
Original Figures for "Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring"
Advertisements

1 Introduction to Sequence Analysis Utah State University – Spring 2012 STAT 5570: Statistical Bioinformatics Notes 6.1.
Phylogenetic reconstruction
Prediction to Protein Structure Fall 2005 CSC 487/687 Computing for Bioinformatics.
Structural bioinformatics
Determination of alpha-helix propensities within the context of a folded protein Blaber et al. J. Mol. Biol 1994.
Multiple alignment June 29, 2007 Learning objectives- Review sequence alignment answer and answer questions you may have. Understand how the E value may.
Determination of alpha-helix propensities within the context of a folded protein Blaber et al. J. Mol. Biol 1994.
Molecular modelling / structure prediction (A computational approach to protein structure) Today: Why bother about proteins/prediction Concepts of molecular.
Sequence comparisons June 23, 2009 Learning objectives-Understand the concept of sliding window programs. Understand difference between identity, similarity.
Computational Biology, Part 2 Sequence Comparison with Dot Matrices Robert F. Murphy Copyright  1996, All rights reserved.
Sequence Alignments Revisited
Basics of protein structure and stability III: Anatomy of protein structure Biochem 565, Fall /29/08 Cordes.
Protein Tertiary Structure Prediction Structural Bioinformatics.
Protein Structures.
Proteins: Levels of Protein Structure Conformation of Peptide Group
Computational Structure Prediction Kevin Drew BCH364C/391L Systems Biology/Bioinformatics 2/12/15.
Comparative Protein Modeling Jason Wiscarson ( Lloyd Spaine ( Comparative or homology modeling, is a computational.
What are proteins? Proteins are important; e.g. for catalyzing and regulating biochemical reactions, transporting molecules, … Linear polymer chain composed.
COMPARATIVE or HOMOLOGY MODELING
CRB Journal Club February 13, 2006 Jenny Gu. Selected for a Reason Residues selected by evolution for a reason, but conservation is not distinguished.
Proteins: Secondary Structure Alpha Helix
Proteins. Proteins? What is its How does it How is its How does it How is it Where is it What are its.
1 P9 Extra Discussion Slides. Sequence-Structure-Function Relationships Proteins of similar sequences fold into similar structures and perform similar.
Protein Folding Programs By Asım OKUR CSE 549 November 14, 2002.
Doug Raiford Lesson 19.  Framework model  Secondary structure first  Assemble secondary structure segments  Hydrophobic collapse  Molten: compact.
Construction of Substitution Matrices
Protein Structure 1 Primary and Secondary Structure.
Multiple Mapping Method with Multiple Templates (M4T): optimizing sequence-to-structure alignments and combining unique information from multiple templates.
Protein secondary structure Prediction Why 2 nd Structure prediction? The problem Seq: RPLQGLVLDTQLYGFPGAFDDWERFMRE Pred:CCCCCHHHHHCCCCEEEECCHHHHHHCC.
Multiple Alignment and Phylogenetic Trees Csc 487/687 Computing for Bioinformatics.
Protein Structure Prediction: Homology Modeling & Threading/Fold Recognition D. Mohanty NII, New Delhi.
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Bioinformatics 2 -- lecture 9
Protein Structure and Bioinformatics. Chapter 2 What is protein structure? What are proteins made of? What forces determines protein structure? What is.
Lab Meeting 10/08/20041 SuperPose: A Web Server for Automated Protein Structure Superposition Gary Van Domselaar October.
Automated Refinement (distinct from manual building) Two TERMS: E total = E data ( w data ) + E stereochemistry E data describes the difference between.
Introduction to Sequence Alignment. Why Align Sequences? Find homology within the same species Find clues to gene function Practical issues in experiments.
We propose an accurate potential which combines useful features HP, HH and PP interactions among the amino acids Sequence based accessibility obtained.
Protein Structure BL
Linear Algebra Review.
Computational Structure Prediction
Fig. 9. Scatter plots of 500 inferred rates versus their simulated values with a model tree with six sequences and d = 0.1 for (a) ML and (b) EB-EXP. The.
Comparing hydropathic interactions between Phenylalanine and Tyrosine
Protein Structure Prediction and Protein Homology modeling
Protein Sequence Alignments

Hierarchical Structure of Proteins
CSE 4705 Artificial Intelligence
courtesy of C. Chothia Most proteins in biology have been produced by the duplication, divergence and recombination of the members of a small.
Comparison of Phenylalanine v
Volume 84, Issue 5, Pages (May 2003)
NSAF and GeneChip data have similar distribution properties.
Phosphorylation and sequence disorder in microtubule-associated protein Tau.A, schematic illustration of the domain profile of Tau with all known phosphorylation.
Ligand Docking to MHC Class I Molecules
Protein Structures.
Prediction of protein structure
Homology Modeling.
Protein structure prediction.
Accounting for side-chain flexibility in protein-ligand docking: 3D Interaction Homology as an approach of quantifying side-chain flexibility of Tyrosine.
Plot of the deviation of the predicted pI value of every peptide spectrum from the average pI calculated for each fraction for validated (a) and non-validated.
Analysis of the phosphomimetic ProP‐PD selection results
Protein Structure Prediction by A Data-level Parallel Proceedings of the 1989 ACM/IEEE conference on Supercomputing Speaker : Chuan-Cheng Lin Advisor.
Accounting for side-chain flexibility in protein-ligand docking: 3D Interaction Homology as an approach of quantifying side-chain flexibility of Tyrosine.
Structure of the BRCT Repeats of BRCA1 Bound to a BACH1 Phosphopeptide
Chloe Jones, Isabel Gonzaga, and Nicole Anguiano
Protein structure prediction
Nat. Rev. Clin. Oncol. doi: /nrclinonc
A, protein structure and functional domains of hdbpB/YB-1, hdbpA, hdbpC, NF-YA, NF-YB, and NF-YC. A, protein structure and functional domains of hdbpB/YB-1,
Amino acid sequence of antigenic fragments 52/3 and UL57/3 of pUL44 and pUL57, respectively, and sequence comparison by dot plot analysis. Amino acid sequence.
Presentation transcript:

Comparison of Exemplars of Rotamer Clusters Across the Proteinogenic Amino Acids

Comparison of Exemplars of Rotamer Clusters Across the Proteinogenic Amino Acids Proteins: Drug targets Pharmaceutical agents Methods of determining similarity Sequence alignment Homology Modelling Threading

MVHFTAEEKAAVTSLWSKMNVEEAG...

Figure adapted from Lovell et al (2003). Depicts the phi-psi angles on a proline.

Figure adapted from Lovell et al (2003).

Figure adapted from Ahmed et al (2015)

The Question Do there exist highly-similar groups of rotamers among disparate types of amino acids?

Rotamers in silico Each rotamer is represented as a set of four maps, each representing a class of forces affecting the conformation of the protein.

Using known clusters of rotamers, we can find their exemplars* Procedure Using known clusters of rotamers, we can find their exemplars* and use those to represent their respective clusters By repeating the clustering procedure used in Ahmed et al (2015), we can determine whether these exemplars cluster as well as they do in Ahmed et al (2015). In brief: Calculate the similarity between each pair of rotamer cluster exemplars, ...sort them into groups based on those similarity scores, ...and then validate the results. *Exemplars - those rotamers most similar to the map quartet derived from averaging the cluster members.

L(Gt) transforms individual points in the map to a form more appropriate to our calculations D(I,J) finds the similarity between two rotamer maps. We repeat this four times, once for each map in the exemplar quartet. C(m,n) calculates a weighted average of the maps, creating an overall score of similarity. The calculated similarity of each pair of exemplars is stored in a matrix

Clustering We use k-means clustering. K-means clustering involves splitting a set of datapoints into k groups. To determine a good value of k, we use the “gap statistic”.

Perform clustering for a range of cluster counts. Validation In brief: Scramble matrix of similarity scores 250 times to generate 250 new similarity matrices. Perform clustering for a range of cluster counts. Calculate sum of squared error (SSE) for the actual data and the generated data Plot SSE vs number of clusters. Figure adapted from http://www.mattpeeples.net/kmeans.html

Figure adapted from Ahmed et al (2015)

Figure adapted from Ahmed et al (2015)

Figure adapted from Ahmed et al (2015)

Figure adapted from Ahmed et al (2015)

References 1. Ahmed, M. et al (2015). 3D interaction homology: The structurally known rotamers of tyrosine derive from a surprisingly limited set of information-rich hydropathic interaction environments described by maps. Proteins 83: 1118-1136. 2. Ramachandran, G.N., Ramakrishnan, C. & Sasisekharan, V. (1962). Stereochemistry of Polypeptide Chain Configurations. J. Mol. Biol. 7: 95-99. 3. Lovell, S. et al (2003). Structure Validation by C-alpha Geometry: Phi, Psi, and C-beta Deviation. Proteins 50: 437-450. 4. Forster, M.J. (2002). Molecular modelling in structural biology. Micron 33: 365-384. 5. Ho, B.K. & Brasseur, R. (2005). The Ramachandran plots of glycine and pre-proline. BMC Structural Biology 5:14.