
Protein Structure Alignment
Human Myoglobin (pdb:2mm1) vs. Human Hemoglobin alpha-chain (pdb:1jebA): sequence identity 27%, structural identity 90%.
Another example, G-proteins: 1c1y:A vs. 1kk1:A6-200: sequence identity 18%, structural identity 72%.

Transformations
Translation
Translation and Rotation: Rigid Motion (Euclidean transformation)
Translation, Rotation + Scaling

Inexact Alignment
Simple case: two closely related proteins with the same number of amino acids.
Question: how do we measure the alignment error?

Distance Functions
Two point sets: $A = \{a_i\}_{i=1 \ldots n}$, $B = \{b_j\}_{j=1 \ldots m}$
Pairwise correspondence: $(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \ldots, (a_{k_N}, b_{t_N})$
(1) Exact matching: $\|a_{k_i} - b_{t_i}\| = 0$
(2) Bottleneck: $\max_i \|a_{k_i} - b_{t_i}\|$
(3) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N}$
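The three distance functions above can be sketched in a few lines of Python. This is only an illustration: the function names, the `pairs` list of index pairs (k_i, t_i), and the sample coordinates are assumptions, not part of the slides.

```python
import math

def pair_distances(A, B, pairs):
    """Euclidean distance ||a_k - b_t|| for every corresponding pair (k, t)."""
    return [math.dist(A[k], B[t]) for k, t in pairs]

def is_exact_match(A, B, pairs, tol=0.0):
    """(1) Exact matching: every paired distance is zero (or within tol)."""
    return all(d <= tol for d in pair_distances(A, B, pairs))

def bottleneck(A, B, pairs):
    """(2) Bottleneck: the largest paired distance."""
    return max(pair_distances(A, B, pairs))

def rmsd(A, B, pairs):
    """(3) RMSD: square root of the mean squared paired distance."""
    d = pair_distances(A, B, pairs)
    return math.sqrt(sum(x * x for x in d) / len(d))

# Hypothetical toy data: three paired points, one of them 1 unit apart.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
B = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
pairs = [(0, 0), (1, 1), (2, 2)]

print(is_exact_match(A, B, pairs), bottleneck(A, B, pairs), rmsd(A, B, pairs))
```

The bottleneck measure is dominated by the single worst pair, while RMSD averages the error over all N pairs.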

Superposition: best least squares (RMSD, Root Mean Square Deviation)
Given two sets of 3-D points $P = \{p_i\}$, $Q = \{q_i\}$, $i = 1, \ldots, n$:
$\mathrm{rmsd}(P, Q) = \sqrt{\sum_i |p_i - q_i|^2 / n}$
Find a 3-D rigid transformation $T^*$ such that:
$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\sum_i |T(p_i) - q_i|^2 / n}$
A closed-form solution exists for this task. It can be computed in O(n) time.
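One common closed-form solution is the SVD-based Kabsch method; the slides do not say which method they have in mind, so the sketch below is an assumption. It uses NumPy, and the function name `superpose` and the test coordinates are illustrative only.

```python
import numpy as np

def superpose(P, Q):
    """Return (R, t, rmsd) minimising rmsd(R p_i + t, q_i) over rigid motions.

    P and Q are (n, 3) arrays whose rows correspond one to one.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)      # centroids, O(n)
    H = (P - cp).T @ (Q - cq)                    # 3x3 cross-covariance, O(n)
    U, S, Vt = np.linalg.svd(H)                  # constant-size problem
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    diff = (P @ R.T + t) - Q
    return R, t, float(np.sqrt((diff ** 2).sum() / len(P)))

# Hypothetical check: Q is P rotated 90 degrees about z and then shifted.
P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Q = P @ R_true.T + np.array([4.0, 5.0, 6.0])

R, t, err = superpose(P, Q)
print(np.round(R, 6), np.round(t, 6), err)
```

Only the centroid and covariance sums depend on n, which is consistent with the O(n) running time stated above.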

Correspondence is Unknown
Given two configurations of points in three-dimensional space, find those rotations and translations of one of the point sets which produce "large" superimpositions of corresponding 3-D points.

A 3-D reference frame can be uniquely defined by the ordered vertices of a non-degenerate triangle (p1, p2, p3).
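A minimal sketch of building such a frame follows. The particular convention (origin at p1, x-axis along p1→p2, z-axis normal to the triangle) is an assumption chosen for illustration; the slide does not specify one, and all names here are hypothetical.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(dot(u, u))
    if n < 1e-12:
        raise ValueError("degenerate triangle: vertices coincide or are collinear")
    return tuple(a / n for a in u)

def frame_from_triangle(p1, p2, p3):
    """Right-handed orthonormal frame: origin p1, x along p1->p2, z normal to the triangle."""
    x = unit(sub(p2, p1))
    z = unit(cross(x, sub(p3, p1)))
    y = cross(z, x)
    return p1, (x, y, z)

def to_local(point, frame):
    """Coordinates of point expressed in the triangle's frame."""
    origin, axes = frame
    d = sub(point, origin)
    return tuple(dot(d, axis) for axis in axes)

frame = frame_from_triangle((1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (1.0, 4.0, 1.0))
print(to_local((2.0, 3.0, 5.0), frame))
```

Because the vertices are ordered, two triangles that are congruent under a rigid motion yield frames related by that same motion, so coordinates expressed in the frames can be compared directly.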

Sequence Based Structure Alignment
1. Run pairwise sequence alignment.
2. Based on the sequence correspondence, compute a 3D transformation (a least-squares fit can be applied).
3. Iteratively improve the structural superposition.
Not a good approach: the sequence alignment can be incorrect.

Structure Alignment (Straightforward Algorithm)
For each pair of triplets, one from each molecule, which define "almost" congruent triangles, compute the rigid transformation that superimposes them. Count the number of aligned point pairs and sort the hypotheses by this number.
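The first step, finding pairs of "almost" congruent triangles, might look like the brute-force sketch below. Comparing corresponding side lengths within a tolerance `tol` is an assumed test for near-congruence; the function names and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations, permutations

def side_lengths(p1, p2, p3):
    """Side lengths of the ordered triangle (p1, p2, p3)."""
    return (math.dist(p1, p2), math.dist(p2, p3), math.dist(p3, p1))

def congruent_triplet_pairs(A, B, tol=0.5):
    """Yield (ia, jb): index triplets whose corresponding sides differ by at most tol.

    Triplets of A are unordered; triplets of B are ordered, so each
    possible vertex correspondence is tested.
    """
    for ia in combinations(range(len(A)), 3):
        sa = side_lengths(*(A[i] for i in ia))
        for jb in permutations(range(len(B)), 3):
            sb = side_lengths(*(B[j] for j in jb))
            if all(abs(x - y) <= tol for x, y in zip(sa, sb)):
                yield ia, jb

# Hypothetical toy data: B contains a shifted, re-indexed copy of A plus one far point.
A = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
B = [(13.0, 4.0, 0.0), (10.0, 0.0, 0.0), (13.0, 0.0, 0.0), (50.0, 50.0, 50.0)]

print(list(congruent_triplet_pairs(A, B, tol=0.1)))
```

Each yielded pair is one hypothesis: superimposing the two triangles gives a candidate rigid transformation, which is then scored by the number of point pairs it aligns.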

For the highest-ranking hypotheses, improve the transformation by replacing it with the best-RMSD transformation for all the matching pairs.
Complexity: $O(n^3 m^3) \cdot O(nm)$. Applying a 3D grid gives practically $O(n^3 m^3) \cdot O(n)$. If one exploits protein backbone geometry plus a 3D grid: $O(nm) \cdot O(n)$.
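The 3D grid mentioned above can be sketched as a spatial hash: the points of one molecule are binned into cubic cells, so scoring a transformation only inspects the 27 cells around each transformed point rather than all m points. The cell size (equal to the matching threshold `eps`) and all names are assumptions; the sketch counts transformed points that have some neighbour within `eps`, which is a simplification of a one-to-one matching.

```python
import math
from collections import defaultdict

def build_grid(points, cell):
    """Hash each point index into the cubic cell that contains the point."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[tuple(math.floor(c / cell) for c in p)].append(idx)
    return grid

def count_matches(A_transformed, B, eps):
    """Count points of A_transformed that have a point of B within eps."""
    grid = build_grid(B, eps)
    offsets = (-1, 0, 1)
    count = 0
    for p in A_transformed:
        cx, cy, cz = (math.floor(c / eps) for c in p)
        # With cell size eps, any neighbour within eps lies in an adjacent cell.
        if any(math.dist(p, B[j]) <= eps
               for dx in offsets for dy in offsets for dz in offsets
               for j in grid.get((cx + dx, cy + dy, cz + dz), ())):
            count += 1
    return count

# Hypothetical toy data: two of the three transformed points land near a point of B.
B = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
A_t = [(0.1, 0.0, 0.0), (1.0, 0.2, 0.0), (9.0, 9.0, 9.0)]

print(count_matches(A_t, B, eps=0.5))
```

If the points are spread so that each cell holds a bounded number of them, which is plausible for atoms in a protein, each lookup takes constant time and scoring one hypothesis takes O(n).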

Structural Alignment Approaches
Two interrelated problems: 3D transformation and point correspondence (matching, alignment).
Some methods:
Approach A: 1. Generate a set of 3D transformations. 2. Compute a 3D alignment for each transformation.
Approach B: 1. Generate a set of 3D transformations. 2. Cluster similar transformations. 3. Compute a 3D alignment for each cluster representative.
Geometric Hashing: combines transformation and correspondence detection in one scheme.

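Accuracy improvement during detection of the 3D transformation
Instead of 3 points, use more. How many?
Align every possible pair of fragments $F_{ij}(k)$: $k$ consecutive residues starting at position $i$ in one protein (positions $i, \ldots, i+k-1$) paired with $k$ consecutive residues starting at position $j$ in the other (positions $j, \ldots, j+k-1$).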

Accept $F_{ij}(k)$ if $\mathrm{rmsd}(F_{ij}(k)) < \varepsilon$.
Complexity: $O(n^3 \cdot n) \cdot O(n)$, assuming $n \sim m$ (for each $F_{ij}(k)$ we need to compute its rmsd); this can be reduced to $O(n^3) \cdot O(n)$.

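Improvement: BLAST idea
Detect short similar fragments, then extend them as much as possible.
Extend while: $\mathrm{rmsd}(F_{ij}(k)) < \varepsilon$
Complexity: $O(n^2) \cdot O(n)$
[Figure: a seed pairing residues $a_{i-1}, a_i, a_{i+1}$ with $b_{j-1}, b_j, b_{j+1}$, and an extended fragment pair running from positions $k$ and $t$ to $k+l-1$ and $t+l-1$.]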
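The seed-and-extend idea can be sketched as follows, using NumPy. This is an illustration under stated assumptions, not the method's actual implementation: the fragment RMSD is computed from the SVD singular values (the Kabsch minimum), the seed is grown one residue at a time first to the right and then to the left, and the RMSD is recomputed from scratch at every step rather than updated incrementally. All names and the test data are hypothetical.

```python
import numpy as np

def fragment_rmsd(P, Q):
    """RMSD of two equal-length fragments after optimal rigid superposition."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    S[-1] *= np.sign(np.linalg.det(Vt.T @ U.T))   # exclude reflections
    e = (P ** 2).sum() + (Q ** 2).sum() - 2.0 * S.sum()
    return float(np.sqrt(max(e, 0.0) / len(P)))

def extend_seed(A, B, i, j, k, eps):
    """Grow the fragment pair F_ij(k) while its rmsd stays below eps.

    Returns the extended (i, j, k), or None if the seed itself is rejected.
    """
    if fragment_rmsd(A[i:i + k], B[j:j + k]) >= eps:
        return None
    while (i + k < len(A) and j + k < len(B)
           and fragment_rmsd(A[i:i + k + 1], B[j:j + k + 1]) < eps):
        k += 1                                     # extend to the right
    while (i > 0 and j > 0
           and fragment_rmsd(A[i - 1:i + k], B[j - 1:j + k]) < eps):
        i, j, k = i - 1, j - 1, k + 1              # extend to the left
    return i, j, k

# Hypothetical data: B is a rigidly moved copy of a helix-like chain A,
# except that residues 0, 1, 8 and 9 are displaced by 3 units.
s = np.arange(10.0)
A = np.column_stack([np.cos(s), np.sin(s), 0.5 * s])
c, sn = np.cos(0.5), np.sin(0.5)
R = np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])
B = A @ R.T + np.array([5.0, -2.0, 3.0])
B[[0, 1, 8, 9]] += np.array([3.0, 0.0, 0.0])

print(extend_seed(A, B, 4, 4, 3, eps=0.1))
```

Starting from a 3-residue seed at positions (4, 4), the extension stops as soon as a displaced residue would enter the fragment, leaving the conserved stretch of residues 2 to 7.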

Sequence-order Independent Alignment
[Figure: structures P and Q.]

4-helix bundle: 2cbl:A, 1f4n:A, 1b3q, 1rhg:A

Sequence Order Independent Alignment

[Figure labels: 2cbl:A, 1f4n, 1rhg:A, 1b3q, chain A, chain B.]

The C2 domain calcium-binding motif
Source: E. A. Nalefski and J. J. Falke, "The C2 domain calcium-binding motif: Structural and functional diversity," Protein Sci.

TRAF-Immunoglobulin Ensemble
Ensemble: 8 proteins from 2 folds.
Core: sandwich of 6 strands.
Runtime: 21 seconds.
[Figure legend: helices; strands (E: strand).]

Some Links
Rasmol: Molecular Visualization
SCOP: Structural Classification of Proteins
MultiProt: Protein Structural (pairwise/multiple) Alignment
MASS: Secondary Structure Based (pairwise/multiple) Alignment