Daniel E. Almonacid and Patricia C. Babbitt


Classification of Mechanistically Diverse Enzyme Superfamilies According to Similarities in Reaction Mechanism. Daniel E. Almonacid and Patricia C. Babbitt, 18th July 2008

Overview
Introduction: Motivation; E.C. Classification; Enzyme Catalysis Databases; Structure-Function Linkage Database
Methods: Enolase Superfamily; Computing Similarity of Mechanisms and Overall Reactions
Results: Overall vs Mechanism Similarity; Complete Linkage Clustering; Applications
Conclusions

Motivation: Enzymes catalyse almost all the reactions in the metabolism of all organisms. Knowledge about the evolution of structure-function relationships in enzymes allows the function of newly obtained sequences and structures to be predicted, and helps direct enzyme engineering efforts.

E.C. Classification: Enzyme Commission (EC) Nomenclature, 1992, Academic Press, 6th Ed.

Enzyme Catalysis Databases: Pegg, S. C.-H., et al., Biochemistry, 2006, 45, 2545; Holliday, G. L., et al., Nucleic Acids Res., 2007, 35, D515.

SFLD (http://sfld.rbvi.ucsf.edu/)

Enolase Superfamily


Dataset Labeling E1 GD1 MR1 .

Computing Mechanism Similarity: chloromuconate cycloisomerase (MC6) vs dipeptide epimerase (MC2)

Similarity between Reaction Steps
chloromuconate cycloisomerase (MC6), Step 3. Bonds formed: None; Bonds cleaved: C-O; Bond order changes: C-O → C=O, C=C → C-C, C-C → C=C
dipeptide epimerase (MC2), Step 2. Bonds formed: C-H; Bonds cleaved: Base-H; Bond order changes: C-O → C=O, C=C → C-C
Step similarity (Tanimoto coeff) = intersection / union = 2/(4+4-2) = 0.3333
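The step comparison above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: each step is treated as a set of bond changes, and the string labels used to encode those changes (`cleaved:C-O`, `order:C-O>C=O`, and so on) are an assumed encoding chosen for readability.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: shared features / all distinct features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Bond changes transcribed from the slide; the string encoding is an assumption.
mc6_step3 = {"cleaved:C-O", "order:C-O>C=O", "order:C=C>C-C", "order:C-C>C=C"}
mc2_step2 = {"formed:C-H", "cleaved:Base-H", "order:C-O>C=O", "order:C=C>C-C"}

print(round(tanimoto(mc6_step3, mc2_step2), 4))  # 0.3333
```

Each step contributes four features and two are shared, so the union has 4 + 4 - 2 = 6 members, matching the 2/6 on the slide.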

Global Alignment of Reaction Sequences
a) Similarity Matrix: MC2.stg01 MC2.stg02 MC6.stg01 0.0000 0.3333 MC6.stg02 1.0000 MC6.stg03 0.1429
b) Needleman-Wunsch Maximum-Match Matrix: MC2.stg01 MC2.stg02 MC6.stg01 0.0000 0.3333 MC6.stg02 1.0000 MC6.stg03 0.1429 1.3333
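A rough sketch of how a maximum-match score can be obtained from a step-similarity matrix by Needleman-Wunsch dynamic programming is given below. Several things here are assumptions: the gap penalty is taken to be zero, the function name `max_match_score` is hypothetical, and the matrix is only illustrative. The pairings scored 1.0 and 0.3333 follow the aligned steps reported in the talk, while the placement of the remaining entries is a placeholder, because the slide's matrix did not survive transcription intact.

```python
def max_match_score(sim):
    """Global alignment score over a step-similarity matrix, assuming a zero gap penalty."""
    n, m = len(sim), len(sim[0])
    # h[i][j] holds the best score for aligning the first i rows with the first j columns.
    h = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            h[i][j] = max(
                h[i - 1][j - 1] + sim[i - 1][j - 1],  # align step i with step j
                h[i - 1][j],                          # leave step i unmatched
                h[i][j - 1],                          # leave step j unmatched
            )
    return h[n][m]

# Rows: MC6 steps 1-3; columns: MC2 steps 1-2. Illustrative values (see the note above).
step_sim = [
    [0.0, 0.1429],
    [1.0, 0.0],
    [0.0, 0.3333],
]

print(round(max_match_score(step_sim), 4))  # 1.3333
```

With these values the best path pairs MC6 step 2 with MC2 step 1 and MC6 step 3 with MC2 step 2, giving 1.0 + 0.3333 = 1.3333.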

Similarity between Reaction Mechanisms
chloromuconate cycloisomerase (MC6): Step 1, Step 2, Step 3; dipeptide epimerase (MC2): Step 1, Step 2. Aligned step similarities: 1.0 and 0.3333.
Alignment score, Axy, of 1.3333
Normalised similarity, Sxy = Axy / (Axx + Ayy - Axy) = 1.3333 / (3 + 2 - 1.3333) = 0.3636
N. M. O'Boyle, et al., J. Mol. Biol., 2007, 368, 1484.
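The normalisation can be checked with a short Python sketch. The function name is hypothetical. The self-alignment scores Axx = 3 and Ayy = 2 are the values used on the slide; they equal the number of steps in each mechanism, since every step has a Tanimoto similarity of 1 with itself.

```python
def normalised_similarity(a_xy, a_xx, a_yy):
    """Tanimoto-style normalisation of an alignment score by the two self-alignment scores."""
    return a_xy / (a_xx + a_yy - a_xy)

# MC6 has 3 steps and MC2 has 2 steps; their alignment score is 1.3333.
print(round(normalised_similarity(1.3333, 3, 2), 4))  # 0.3636
```

The result is 1 when a mechanism is compared with itself and falls towards 0 as fewer steps can be matched.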

Overall vs Mechanistic Similarity: A total of 190 pairs are compared. The size of each sphere is proportional to the number of data points at that position. Significance levels are shown in red.

Similarity between Overall Reactions: chloromuconate cycloisomerase (MC6) vs dipeptide epimerase (MC2)

Similarity between Overall Reactions
chloromuconate cycloisomerase (MC6), Overall. Bonds formed: C-Cl; Bonds cleaved: C-O; Bond order changes: None
dipeptide epimerase (MC2), Overall. Bonds formed: C-H; Bonds cleaved: ; Bond order changes: None
Overall similarity (Tanimoto coeff) = intersection / union = 0/(4+4-0) = 0


Complete Linkage Clustering of Mechanisms. Common partial reaction: chloromuconate cycloisomerase (SFLD10), dipeptide epimerase (SFLD12)
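For readers unfamiliar with complete linkage, a minimal pure-Python sketch follows. It is not the authors' code: the function name, the stopping threshold, and the toy similarity values for the placeholder items A to D are all assumptions made for illustration. The defining feature of complete linkage is that two clusters are scored by their least similar pair of members.

```python
from itertools import combinations

def complete_linkage(labels, sim, threshold):
    """Agglomerative clustering; clusters are scored by their LEAST similar member pair."""
    clusters = [frozenset([label]) for label in labels]

    def pair_sim(a, b):
        return sim[(a, b)] if (a, b) in sim else sim[(b, a)]

    def linkage(c1, c2):
        return min(pair_sim(a, b) for a in c1 for b in c2)

    while len(clusters) > 1:
        # Pick the two clusters whose worst cross-pair is the most similar.
        best = max(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        if linkage(*best) < threshold:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters

# Toy similarities between four placeholder mechanisms (hypothetical values).
toy_sim = {
    ("A", "B"): 0.9, ("A", "C"): 0.2, ("A", "D"): 0.1,
    ("B", "C"): 0.3, ("B", "D"): 0.2, ("C", "D"): 0.8,
}

result = complete_linkage("ABCD", toy_sim, threshold=0.5)
print(sorted(sorted(c) for c in result))  # [['A', 'B'], ['C', 'D']]
```

Because the weakest cross-pair decides each merge, every member of a resulting cluster is at least `threshold`-similar to every other member, which suits grouping mechanisms that share a common partial reaction.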

Same Subgroup, Different Mechanism. Common partial reaction: chloromuconate cycloisomerase (SFLD10), dipeptide epimerase (SFLD12)

Different Subgroup, Same Mechanism. Common partial reaction: o-succinylbenzoate synthase, enolase, D-tartrate dehydratase

Different Subgroup, Same Mechanism: D-tartrate dehydratase (MR1), enolase (E1), o-succinylbenzoate synthase (MC1)

Preliminary Anecdotal Observation. Common partial reaction: muconate cycloisomerase, o-succinylbenzoate synthase, dipeptide epimerase

Preliminary Anecdotal Observation
Target: o-succinylbenzoate synthase (MC1)
Mechanism similarity to MC1: dipeptide epimerase (MC2) 0.5556; muconate cycloisomerase (MC7) 0.7143
kcat/KM (M^-1 s^-1): E. coli OSBS (MC1) 3.1 x 10^6; D297G AEE (MC2) 12.5; E323G MLE (MC7) 1.9 x 10^3
Schmidt, et al., Biochemistry, 2003, 42, 8387.

Conclusions
Compared to the traditional approach of classifying enzymes according to overall reaction similarity (such as that of the Enzyme Commission), the method based on step similarity is better able to capture elements of functional conservation.
The relationship between sequence/structure and function is yet more complicated than previously envisaged.
We expect our study to be useful for guiding functional annotation of new homologues of enzyme superfamilies, and to help guide engineering of enzyme functions by identifying enzyme templates capable of catalysing the key mechanistic step of a transformation.

Acknowledgements: Margy Glasner, Sunil Ojha, Shoshana Brown, Patricia Babbitt, Noel O’Boyle, John Mitchell, Gemma Holliday, Janet Thornton. Funding: NIH, NSF, ISCB

Questions? daniel.almonacid@ucsf.edu Structure-Function Linkage Database: http://sfld.rbvi.ucsf.edu/