SuperPose: A Web Server for Automated Protein Structure Superposition. Gary Van Domselaar. Lab Meeting, 10/08/2004.


Lab Meeting 10/08/2004
SuperPose: A Web Server for Automated Protein Structure Superposition
Gary Van Domselaar
October 08, 2004

Introduction
● Who Cares?
● Review of Superposition
● Identifying Corresponding Points Between Structures
● Multiple Structure Superposition
● RMSD Calculation
● The SuperPose Web Site

Who Cares? NMR Spectroscopists
[Figure: 1YUA, 26 chains]

Who Cares? Structural Biologists
[Figure: 1LUC:A, 1CNV]

Who Cares? Evolutionary Biologists
[Figure: 1MYK & 1MYN - 1BK8]

Principles of Superposition
● How do we superimpose these two cubes?

Principles of Superposition
1. Identify corresponding points.

Principles of Superposition
2. Identify the center and the principal axes of each structure.

Principles of Superposition
3. Translate the two structures so their centers overlap.

Principles of Superposition
4. Rotate the two structures so the average distance between corresponding points is minimized, and their principal axes overlap.
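The translation step (step 3) can be sketched in a few lines. This is a minimal illustration, not SuperPose's code: it assumes the "center" is the unweighted centroid of the corresponding points, and the function names `centroid` and `center_on_origin` are made up for this example.

```python
def centroid(points):
    """Unweighted mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def center_on_origin(points):
    """Translate points so that their centroid sits at the origin."""
    cx, cy, cz = centroid(points)
    return [(x - cx, y - cy, z - cz) for x, y, z in points]


# Two small point sets standing in for the two structures (made-up data).
structure_a = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (1.0, 3.0, 1.0), (3.0, 3.0, 1.0)]
structure_b = [(10.0, 0.0, 5.0), (12.0, 0.0, 5.0), (10.0, 2.0, 5.0), (12.0, 2.0, 5.0)]

# After centering, both structures share the same center (the origin).
centered_a = center_on_origin(structure_a)
centered_b = center_on_origin(structure_b)
```

Centering both structures on the origin means the remaining problem (step 4) is a pure rotation.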

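Principles of Superposition
Rotations can be accomplished by multiplying each atom coordinate by an appropriate rotation matrix, but this is slow:

Clockwise about x:
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \]

Clockwise about z:
\[ \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

Counterclockwise about x:
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix} \]

Counterclockwise about z:
\[ \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \]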
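As a small illustration of the matrix approach, the sketch below builds the counterclockwise matrices about x and z and applies one to a single coordinate. The function names are hypothetical. The clockwise versions are obtained here by passing a negative angle, which is equivalent to the clockwise matrices on the slide.

```python
import math


def rot_x_ccw(theta):
    """Counterclockwise rotation about the x axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z_ccw(theta):
    """Counterclockwise rotation about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rotation(matrix, point):
    """Multiply a 3x3 matrix by an (x, y, z) column vector."""
    return tuple(sum(matrix[r][k] * point[k] for k in range(3)) for r in range(3))


# Rotating the x unit vector a quarter turn counterclockwise about z
# moves it onto the y axis; a clockwise quarter turn sends it to -y.
quarter_ccw = apply_rotation(rot_z_ccw(math.pi / 2), (1.0, 0.0, 0.0))
quarter_cw = apply_rotation(rot_z_ccw(-math.pi / 2), (1.0, 0.0, 0.0))
```

Applying a known rotation this way is cheap. The expensive part is finding the best angles by trial, which is what the closed-form method on the next slide avoids.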

Principles of Superposition
A faster way is to use quaternion-based superposition, which finds the rotation and minimizes the sum of squared residuals in a single step.
S.K. Kearsley, On the orthogonal transformation used for structural comparisons, Acta Cryst. A45, 208 (1989)
http://www-structure.llnl.gov/xray/comp/suptext.htm
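A minimal sketch of a quaternion-based fit in the spirit of Kearsley's method is shown below. It is not the SuperPose implementation: it assumes NumPy is available, the function name `quaternion_superpose` is invented for this example, and the 4x4 matrix is built from the sums and differences of the centered coordinates. The eigenvector with the smallest eigenvalue is taken as the unit quaternion, and that eigenvalue is the residual sum of squares.

```python
import numpy as np


def quaternion_superpose(moving, reference):
    """Fit `moving` onto `reference` (both N x 3, points in corresponding order).

    Returns (fitted coordinates, 3x3 rotation matrix, RMSD after the fit).
    """
    a = np.asarray(moving, dtype=float)
    b = np.asarray(reference, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two N x 3 coordinate arrays of equal size")

    # Remove the translation by centering both sets on their centroids.
    a_center, b_center = a.mean(axis=0), b.mean(axis=0)
    a0, b0 = a - a_center, b - b_center

    # Each point pair contributes A^T A, where A q is the residual
    # (as a quaternion) produced by the candidate unit quaternion q.
    k = np.zeros((4, 4))
    for (mx, my, mz), (px, py, pz) in zip(b0 - a0, b0 + a0):
        A = np.array([
            [0.0, mx, my, mz],
            [-mx, 0.0, pz, -py],
            [-my, -pz, 0.0, px],
            [-mz, py, -px, 0.0],
        ])
        k += A.T @ A

    # eigh returns eigenvalues in ascending order; column 0 is the best q.
    eigenvalues, eigenvectors = np.linalg.eigh(k)
    q0, q1, q2, q3 = eigenvectors[:, 0]

    rotation = np.array([
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2), 2*(q2*q3 + q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ])

    fitted = a0 @ rotation.T + b_center
    rmsd = float(np.sqrt(max(eigenvalues[0], 0.0) / len(a)))
    return fitted, rotation, rmsd
```

A single symmetric eigenvalue problem replaces any search over rotation angles, and the smallest eigenvalue yields the RMSD without a second pass over the coordinates.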

Identifying Corresponding Points Between Protein Structures
Sequence Alignment: 2TRX:A - 3TRX:A

PDB_Entry_A   1 SDKIIHLTDDSFDTDVLKA--DGAILVDFWAEWCGPCKMIAPILDEIADE  48
                  .:..:...:...:.|.|  |..::|||.|.||||||||.|....::::
PDB_Entry_B   1   MVKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEK  48

PDB_Entry_A  49 YQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQ  98
                |...:.: ::::|.....|.:..::..||...||.|:    |||..| |.
PDB_Entry_B  49 YSNVIFL-EVDVDDCQDVASECEVKCTPTFQFFKKGQ----KVGEFS-GA  92

PDB_Entry_A  99 LKEFLDANLA     108
                .||.|:|.:.
PDB_Entry_B  93 NKEKLEATINELV  105

Identifying Corresponding Points Between Protein Structures
Problem: Low Homology (3TRX - 3GRX:1)

# Length: 163
# Identity: 11/163 ( 6.7%)
# Similarity: 14/163 ( 8.6%)
# Gaps: 139/163 (85.3%)
# Score: 16.0

3TRX_model_de 1 MVKQIESK 8
|:|:....
3GRX_model_1_ 1 ANVEIYTKETCPYSHRAKALLSSKGVSFQELPIDGNAAKREEMIKRSGRT 50

3TRX_model_de 9 TAFQ EALDAAG--DKLVVVDFSATWCGPCKMIKPFF 42
|..|.||||.| |.|:.
3GRX_model_1_ 51 TVPQIFIDAQHIGGYDDLYALDARGGLDPLLK 82

3TRX_model_de 43 HSLSEKYSNVIFLEVDVDDCQDVASECEVKCTPTFQFFKKGQKVGEFSGA 92
3GRX_model_1_

3TRX_model_de 93 NKEKLEATINELV 105
3GRX_model_1_

Identifying Corresponding Points Between Protein Structures
Solution: Secondary Structural Alignment (3TRX - 3GRX:1)

Sequence1: 3TRX_model_default_chain_default
Sequence2: 3GRX_model_1_chain_default
Score....: 600
Test Stat: 5.31
Matches..: 64

Sequence1: CEEEECCHHHHHHHHHHHCCEEEEEEEEECCCHHHHHCCCCCCHHHHHCC
Matching.: ||||||||||||||| |||||||
Sequence2: CEEEEEEECCCHHHHHHHH HHHHHCC
Structure: CBBBBBBBCCCHHHHHHHH HHHHHCC

Sequence1: CEEEEEEEECCCHHHHHHHCCCCEEEEEEEECCCCCEEECCCCHHHHHHH
Matching.: ||||||| || ||||||| ||| |||||||||| | |||||||
Sequence2: CEEEEEECCCCHHHHHHHHHCCCCCCEEEEECCCCC CHHHHHHHH
Structure: CBBBBBBCCCCHHHHHHHHHCCCCCCBBBBBCCCCC CHHHHHHHH

Sequence1: HHHCC
Matching.: |||||
Sequence2: HHHCCCCCCCC
Structure: HHHCCCCCCCC

Identifying Corresponding Points Between Protein Structures
Problem: Multiple Structural Forms (1A29 - 1CLL)

# Length: 145
# Identity: 143/145 (98.6%)
# Similarity: 143/145 (98.6%)
# Gaps: 2/145 ( 1.4%)
# Score:

1A29_model_de   1 QLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMI  50
                   |||||||||||||||||||||||||||||||||||||||||||||||||
1CLL_model_de   1  LTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMI  49

1A29_model_de  51 NEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISA  100
                  ||||||||||||||||||||||||||||||||||||||||||||||||||
1CLL_model_de  50 NEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISA  99

1A29_model_de 101 AELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMT   144
                  ||||||||||||||||||||||||||||||||||||||||||||
1CLL_model_de 100 AELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTA  144

Identifying Corresponding Points Between Protein Structures
Solution: Subdomain Alignment
[Figure: 1A29 - 1CLL]

Identifying Corresponding Points Between Protein Structures
The Difference Distance Matrix
Make a distance matrix for each structure.

Identifying Corresponding Points Between Protein Structures
The Difference Distance Matrix
Subtract the two distance matrices to make a difference distance (DD) matrix.
Plot the magnitude of each difference as a color shade.
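The two steps above can be sketched as follows. This is an illustrative example only: it assumes one representative point per residue (for instance the alpha carbon) and that the two coordinate lists are already in corresponding order. The function names are hypothetical.

```python
import math


def distance_matrix(coords):
    """All-against-all distances within one structure."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def difference_distance_matrix(coords_a, coords_b):
    """Absolute element-wise difference of the two distance matrices."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    return [[abs(x - y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(da, db)]


# Made-up data: the third point moves relative to the first two.
form_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
form_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
dd = difference_distance_matrix(form_1, form_2)
```

Because only distances within each structure are used, the DD matrix does not change when either structure is rotated or translated, so it can be computed before any superposition is done.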

Identifying Corresponding Points Between Protein Structures
Analyze the difference distance matrix for similar subdomains. The DD matrix will have regions that are similar and regions that are different.

Identifying Corresponding Points Between Protein Structures
Superposition restricted to residues 5-74


Multiple Structure Superposition
How do you optimally superimpose more than two structures?

Multiple Structure Superposition
Superimpose to an average structure.
[Figure: initial 2-structure superposition of structures 1 and 2; average structure; structure 3; 3-structure superposition]
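A minimal sketch of the averaging step is given below. It assumes the input structures have already been superposed and that every structure lists the same atoms in the same order. The function name `average_structure` is made up for this example.

```python
def average_structure(structures):
    """Coordinate-wise mean of already-superposed structures.

    `structures` is a list of structures; each structure is a list of
    (x, y, z) tuples with atoms in the same order.
    """
    n_structures = len(structures)
    n_atoms = len(structures[0])
    if any(len(s) != n_atoms for s in structures):
        raise ValueError("all structures must have the same number of atoms")
    return [
        tuple(sum(s[i][k] for s in structures) / n_structures for k in range(3))
        for i in range(n_atoms)
    ]


# Made-up coordinates for two superposed structures.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.2, 0.0), (1.0, 0.4, 0.0)]
average = average_structure([structure_1, structure_2])
```

A third structure would then be superposed onto `average` rather than onto structure 1 or structure 2 individually.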

Multiple Structure Superposition
Superposition ordering is important.
– Structures should be superposed in order of their pairwise structural similarity.
– An 'all-against-all' DD Matrix analysis can be used to quickly determine overall relative similarity between every pair of structures.
[Figure: Avg RMSD for 3TRX chains A & C = 1.75 Å; Avg RMSD for 3TRX chains A & B = 1.5 Å]

Multiple Structure Superposition
A structure 'pileup' is created from the DD Matrix analysis to determine the superposition order.

All pairs, ranked:
3TRX_A,D: 0.5 Å
3TRX_A,B: 0.6 Å
3TRX_B,D: 0.7 Å
3TRX_B,C: 0.8 Å
3TRX_A,C: 0.9 Å
3TRX_C,D: 1.0 Å

Pileup:
3TRX_A,D: 0.5 Å
3TRX_A,B: 0.6 Å
3TRX_B,C: 0.8 Å
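One way to derive such a pileup from the ranked pair list is sketched below. The selection rule is an assumption inferred from the example on the slide, not a description of SuperPose's code: start from the most similar pair, then repeatedly take the most similar pair that links exactly one new structure to those already placed. The function name `build_pileup` is hypothetical.

```python
def build_pileup(pairwise_rmsd):
    """Order structures for progressive superposition.

    `pairwise_rmsd` maps (name_a, name_b) to the average RMSD of that pair.
    Returns a list of (name_a, name_b, rmsd) in superposition order.
    """
    ranked = sorted(pairwise_rmsd.items(), key=lambda item: item[1])
    all_names = {name for pair in pairwise_rmsd for name in pair}

    (first_a, first_b), first_rmsd = ranked[0]
    placed = {first_a, first_b}
    pileup = [(first_a, first_b, first_rmsd)]

    while placed != all_names:
        for (a, b), rmsd in ranked:
            # Accept the best pair that adds exactly one new structure.
            if (a in placed) != (b in placed):
                pileup.append((a, b, rmsd))
                placed.update((a, b))
                break
        else:
            raise ValueError("some structures are not linked to the others")
    return pileup


# Values taken from the slide (3TRX chains A-D, RMSD in angstroms).
rmsd_table = {
    ("A", "D"): 0.5, ("A", "B"): 0.6, ("B", "D"): 0.7,
    ("B", "C"): 0.8, ("A", "C"): 0.9, ("C", "D"): 1.0,
}
order = build_pileup(rmsd_table)
```

With the slide's values, the B,D pair is skipped because both chains are already placed when it is reached, which reproduces the three-entry pileup shown above.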

Multiple Structure Superposition
● Average structures can be sensibly generated only from a collection of structures with identical sequences.
● How do you superimpose a collection of structures with non-identical sequences?
● Progressive pairwise buildup using the pileup as a guide.

Pileup: 3TRX_A,D: 0.5 Å; 3TRX_A,B: 0.6 Å; 3TRX_B,C: 0.8 Å
1. Superpose structures A and D.
2. 'Anchor' structure A, translate/rotate B, add B to A,D.
3. 'Anchor' structure B, translate/rotate C, add C to A,B,D.

Multiple Structure Superposition
CLUSTAL W (1.83) multiple sequence alignment

2TRX_model_default_chain_A      SDKIIHLTDDSFDTDVLKA--DGAILVDFWAEWCGPCKMIAPILDEIADE
2TRX_model_default_chain_B      SDKIIHLTDDSFDTDVLKA--DGAILVDFWAEWCGPCKMIAPILDEIADE
3TRX_model_default_chain_defau  --MVKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEK
                                : ::..: :.*.* * ::*** * ******** *::..::::

2TRX_model_default_chain_A      YQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQ
2TRX_model_default_chain_B      YQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQ
3TRX_model_default_chain_defau  YSNVIFL-EVDVDDCQDVASECEVKCTPTFQFFKKGQ----KVGEFS-GA
                                *.. : : ::::*:..*.: :: **: :**:*: *** :* *

2TRX_model_default_chain_A      LKEFLDANLA---
2TRX_model_default_chain_B      LKEFLDANLA---
3TRX_model_default_chain_defau  NKEKLEATINELV
                                ** *:*.:

RMSD Calculation
The degree of similarity between two or more structures is described by their average root mean square deviation (RMSD).
[Figure: corresponding points x1 to x5 and y1 to y5]
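For two structures with N corresponding points x_i and y_i, the standard definition is RMSD = sqrt((1/N) * sum over i of |x_i - y_i|^2). The sketch below computes it directly. It assumes the structures are already superposed and listed in corresponding order, and the function name `rmsd` is chosen for this example.

```python
import math


def rmsd(xs, ys):
    """Root mean square deviation between corresponding (x, y, z) points."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty point lists of equal length")
    squared = sum(math.dist(x, y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared / len(xs))


# Made-up example: every point is displaced by exactly 1 unit along z.
points_x = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
points_y = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
value = rmsd(points_x, points_y)
```

For more than two structures, one common choice is to average this value over all pairs, which is presumably what "average RMSD" refers to on the slide.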

SuperPose
● Superposition for 2 chains and for multiple chains
● Subdomain superposition
● Superposition of structures with low sequence identity