Proteins: Proteins control the biological functions of cellular organisms (e.g. metabolism, blood clotting, immune system). Building blocks: amino acids.

Presentation transcript:

Proteins
• Proteins control the biological functions of cellular organisms
  - e.g. metabolism, blood clotting, immune system
• Building blocks: amino acids
  - amino group (NH2), carboxyl group (COOH), side chain R

The Protein Data Bank

Protein sequence and structure
• The protein alphabet consists of 20 amino acids
• Sequence view: ADKELKFLVVDDFSTMRRIV.....
• Structure view (figure)

Protein structure and function
• Function is determined by 3D shape/structure
  - Thrombin: facilitates blood clotting
  - Hirudin: anticoagulant (blocks the active site)

Protein structure and function
• Structure conserves evolutionary information better than sequence does
• Myoglobin family:
  - 1MBC: VLSEGEWQLVLHVWAKVE
  - FAL: XSLSAAEADLAGKSWAPV.....

Structural Bioinformatics
• Pairwise alignment algorithms
  - DALI (Holm and Sander, Journal of Molecular Biology, 1993)
  - LOCK (Singh and Brutlag, ISMB, 1997)
  - CE (Shindyalov and Bourne, Protein Engineering, 1998)
  - SSM (Krissinel and Henrick, Acta Cryst., 2004)
  - Ye et al., JBCB, 2004
• Multiple alignment algorithms
  - Gerstein and Levitt, ISMB, 1996: iterative dynamic programming
  - SSAP (Orengo and Taylor, Methods Enzymol., 1996): two-level DP
  - Leibowitz et al., ISMB, 1999: geometric hashing
  - CE-MC (Guda et al., PSB, 2001)
  - MAMMOTH (Lupyan et al., Bioinformatics, 2005)
  - MAPSCI (Ye et al., WABI, 2006)

Structural Bioinformatics
• Homology detection
  - Hidden Markov models (Jaakkola et al., JCB, 2000)
  - Spectrum, mismatch kernel (Leslie et al., Bioinformatics, 2002)
  - Structure kernel (Qiu et al., Bioinformatics, 2007)
• Protein structure prediction
  - Jones and Hadley, Bioinformatics: Sequence, structure and databanks
  - FUGUE (Shi et al., J. Mol. Biol., 2001)
  - SCOP (Andreeva et al., Nucleic Acids Res., 2004)
• Protein docking
  - Shoichet et al., J. Comput. Chem.
  - Choi et al., WABI
  - Wang et al., PSB
  - Sousa et al., Proteins, 2006

Pairwise Structure Alignment
• Given two proteins represented by their Cα atoms (backbone):
  - find a 3D transformation that superimposes a large number of the Cα atoms
  - ensure that the overall distance between matched pairs is as small as possible
• There is a trade-off between the number of matches and the total distance between matched pairs
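As an illustration of the superposition sub-problem, the sketch below assumes the matched Cα pairs are already known and finds the rigid transformation that minimises their RMSD using the Kabsch algorithm. This is a standard textbook technique, not necessarily the method used by any of the algorithms cited in these slides; the function name `kabsch_superpose` and the toy coordinates are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def kabsch_superpose(P, Q):
    """Least-squares rigid superposition of matched points P onto Q.

    P, Q: (n, 3) arrays where row i of P is matched to row i of Q.
    Returns (R, t, rmsd) such that P @ R.T + t approximates Q.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)      # centroids
    H = (P - cp).T @ (Q - cq)                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    moved = P @ R.T + t
    rmsd = float(np.sqrt(((moved - Q) ** 2).sum(axis=1).mean()))
    return R, t, rmsd

# Toy example: Q is a rotated and shifted copy of P (made-up coordinates).
P = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [5.0, 3.6, 0.0],
              [8.5, 4.0, 1.2], [9.0, 7.5, 2.0]])
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([1.0, -2.0, 0.5])
R, t, rmsd = kabsch_superpose(P, Q)
```

The harder part of structure alignment is choosing which pairs to match; dropping badly fitting pairs lowers the RMSD but also lowers the number of matches, which is the trade-off noted above.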

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• Uses an orientation-independent representation of proteins, based on the fact that consecutive Cα atoms are ~4 Å apart

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• The protein is represented as a sequence of angle triplets {(α₁, β₁, γ₁), (α₂, β₂, γ₂), …, (αₙ, βₙ, γₙ)}
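The slide does not define the three angles, so the sketch below does not reproduce them. As a stand-in, it computes two standard internal coordinates of a Cα trace (the bond angle at each atom and the torsion angle over four consecutive atoms) to show why an angle-based encoding does not depend on how the protein is placed in space. All function names and coordinates are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bond_angle(a, b, c):
    """Angle a-b-c at vertex b, in degrees."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def torsion(a, b, c, d):
    """Signed torsion angle around the b-c axis, in degrees."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def internal_angles(ca):
    """(bond angle, torsion) for every window of four consecutive atoms."""
    return [(bond_angle(ca[i], ca[i + 1], ca[i + 2]),
             torsion(ca[i], ca[i + 1], ca[i + 2], ca[i + 3]))
            for i in range(len(ca) - 3)]

def rotate_z_and_shift(p, degrees, shift):
    """Rigid motion used only to demonstrate invariance."""
    t = math.radians(degrees)
    x, y, z = p
    return (x * math.cos(t) - y * math.sin(t) + shift[0],
            x * math.sin(t) + y * math.cos(t) + shift[1],
            z + shift[2])

# Made-up Calpha trace and a rotated, translated copy of it.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
         (8.5, 4.0, 1.2), (9.0, 7.5, 2.0)]
moved = [rotate_z_and_shift(p, 40.0, (10.0, -3.0, 2.5)) for p in trace]
original_angles = internal_angles(trace)
moved_angles = internal_angles(moved)
```

Both traces yield the same angle sequence up to rounding, so two proteins can be compared through their angle sequences without first superimposing them.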

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• Compute a local alignment based on the angle representation
• Find a maximal subset of runs with similar transformation matrices

Pairwise Structure Alignment (Ye et al., JBCB, 2004)
• The main algorithm
  1. Compute the angle-based representation
  2. Align the angle-based representations
  3. Identify runs with similar transformation matrices
  4. Compute the initial structural alignment
  5. Refine the alignment iteratively
• Running time is ~(m+n)^2, where m and n are the protein lengths
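The alignment step of the algorithm above can be illustrated with a Smith-Waterman style local alignment over two sequences of angle triplets. This is only a sketch: the scoring function (`triplet_score`, with a tolerance of 30 degrees) and the gap penalty are assumptions chosen for illustration, not the scoring used in Ye et al.

```python
def angle_diff(x, y):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(x - y) % 360.0
    return min(d, 360.0 - d)

def triplet_score(a, b, tol=30.0):
    """Assumed score: 1 for identical triplets, falling to -1 as the worst
    of the three angular differences grows past 2 * tol."""
    worst = max(angle_diff(x, y) for x, y in zip(a, b))
    return max(-1.0, 1.0 - worst / tol)

def local_align(s, t, gap=-1.0, tol=30.0):
    """Local alignment of two triplet sequences.

    Returns (best score, list of matched index pairs (i, j))."""
    m, n = len(s), len(t)
    H = [[0.0] * (n + 1) for _ in range(m + 1)]
    best, bi, bj = 0.0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(0.0,
                          H[i - 1][j - 1] + triplet_score(s[i - 1], t[j - 1], tol),
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j
    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = bi, bj
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + triplet_score(s[i - 1], t[j - 1], tol):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs

# Made-up triplets: s matches positions 1..3 of t exactly.
s = [(100, 60, 20), (110, -60, 30), (95, 55, 15)]
t = [(10, 170, -100), (100, 60, 20), (110, -60, 30), (95, 55, 15), (20, -170, 100)]
score, pairs = local_align(s, t)
```

Each maximal diagonal stretch of `pairs` corresponds to a run of matched residues; the later steps would compare the transformations implied by such runs. The dynamic programming table has (m+1)(n+1) cells, so this step alone takes time proportional to m·n.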

Multiple Structure Alignment
• Given a set of proteins represented by their Cα atoms (backbone):
  - find a simultaneous alignment of all structures
  - find a consensus structure that represents all of them

Multiple Structure Alignment
• The main algorithm
  1. Find an initial consensus structure (one of the given proteins)
  2. Pairwise align the consensus with each of the proteins
  3. Merge the pairwise alignments from the previous step
  4. Recompute the consensus protein; repeat from step 2
• Merging the pairwise alignments is similar to the sequence case, e.g. for P1 = BBCA, P2 = CBBA, P3 = BCCA:

  Pairwise (P1, P2)    Pairwise (P1, P3)    Merged
  P1: -BBCA            P1: BBCA             P1: -BBCA
  P2: CBB-A            P3: BCCA             P2: CBB-A
                                            P3: -BCCA
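The merging step in the example above can be sketched as the usual "merge through a common centre" procedure from multiple sequence alignment: every pairwise alignment contains the consensus (here P1), and gaps are added so that all copies of the consensus line up. The function name `merge_through_center` is hypothetical, and the sketch works on letter strings only, as in the slide's example.

```python
def merge_through_center(center, pairwise):
    """Merge pairwise alignments that all share the same centre sequence.

    center:   ungapped centre string, e.g. "BBCA"
    pairwise: list of (center_row, other_row) gapped strings of equal length
    Returns the merged rows: centre first, then the others in input order.
    """
    L = len(center)
    max_gap = [0] * (L + 1)      # gap columns needed before centre position k
    parsed = []
    for c_row, o_row in pairwise:
        inserted = [[] for _ in range(L + 1)]   # other's letters facing centre gaps
        matched = []                             # other's letter (or '-') per centre position
        k = 0
        for c, o in zip(c_row, o_row):
            if c == "-":
                inserted[k].append(o)
            else:
                matched.append(o)
                k += 1
        if k != L:
            raise ValueError("centre row does not spell the centre sequence")
        parsed.append((inserted, matched))
        for pos in range(L + 1):
            max_gap[pos] = max(max_gap[pos], len(inserted[pos]))

    center_row = "".join("-" * max_gap[k] + center[k] for k in range(L))
    center_row += "-" * max_gap[L]
    rows = [center_row]
    for inserted, matched in parsed:
        parts = []
        for k in range(L + 1):
            block = "".join(inserted[k])
            parts.append(block + "-" * (max_gap[k] - len(block)))  # pad to common width
            if k < L:
                parts.append(matched[k])
        rows.append("".join(parts))
    return rows

merged = merge_through_center("BBCA", [("-BBCA", "CBB-A"), ("BBCA", "BCCA")])
```

For the slide's example, `merged` holds the three rows of the merged alignment (P1, P2, P3). In the structural setting each letter would stand for a residue with its Cα coordinates.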

Multiple Structure Alignment  Computation of consensus structure (after merging alignments)
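The figure for this slide is not part of the transcript, so the exact procedure is not recoverable here. One simple possibility, shown purely as an assumption, is to average the superimposed Cα coordinates column by column, skipping columns where too few structures have a residue. The function name `consensus_structure` and the `min_occupancy` threshold are hypothetical.

```python
def consensus_structure(rows, min_occupancy=0.5):
    """Column-wise average of already superimposed, aligned structures.

    rows: one list per protein; entry j is an (x, y, z) tuple for the residue
          in alignment column j, or None for a gap.
    Columns occupied by fewer than min_occupancy of the proteins are skipped.
    """
    n_rows = len(rows)
    n_cols = len(rows[0])
    consensus = []
    for col in range(n_cols):
        present = [row[col] for row in rows if row[col] is not None]
        if not present or len(present) / n_rows < min_occupancy:
            continue
        consensus.append(tuple(sum(p[axis] for p in present) / len(present)
                               for axis in range(3)))
    return consensus

# Made-up coordinates for three proteins over three alignment columns.
aligned = [
    [None,            (0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(1.0, 1.0, 1.0), (2.0, 0.0, 0.0), (4.0, 2.0, 0.0)],
    [None,            (1.0, 3.0, 0.0), (4.0, 1.0, 3.0)],
]
consensus = consensus_structure(aligned)
```

In the iterative algorithm, a consensus computed this way would replace the previous one before the next round of pairwise alignments.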

Multiple Structure Alignment  Algorithm flowchart