The Chemistry of Protein Catalysis

The Chemistry of Protein Catalysis John Mitchell

The MACiE Database: Mechanism, Annotation and Classification in Enzymes. http://www.ebi.ac.uk/thornton-srv/databases/MACiE/ Gemma Holliday, Daniel Almonacid, Noel O’Boyle, Janet Thornton (EBI), Peter Murray-Rust, Gail Bartlett (EBI), James Torrance, John Mitchell. G.L. Holliday et al., Nucl. Acids Res., 35, D515-D520 (2007)

Enzyme Nomenclature and Classification. An EC classification has four fields: Class, Subclass, Sub-subclass, Serial number.
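The four-level hierarchy can be illustrated by splitting an EC number into its fields. This is a minimal sketch only: the `ECNumber` type and the `parse_ec` helper are hypothetical names introduced here, not part of MACiE or of any official EC tooling.

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    """The four EC fields; None marks an unspecified ('-') level."""
    ec_class: Optional[int]
    subclass: Optional[int]
    sub_subclass: Optional[int]
    serial: Optional[int]


def parse_ec(text: str) -> ECNumber:
    """Parse a string such as '1.2.3.4' or '1.2.-.-' (hypothetical helper)."""
    fields = text.strip().split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    # A '-' field means the level is left unspecified.
    return ECNumber(*(None if f == "-" else int(f) for f in fields))


ec = parse_ec("1.2.3.4")
print(ec.ec_class, ec.subclass, ec.sub_subclass, ec.serial)
```

A partially specified number such as `parse_ec("1.2.-.-")` keeps the class and subclass and leaves the remaining fields as `None`.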

The EC Classification: only deals with the overall reaction; reaction direction is arbitrary; cofactors and active site residues are ignored; doesn’t deal with structural and sequence information. However, it was never intended to do so.

A New Representation of Enzyme Reactions? Should be complementary to, but distinct from, the EC system. Should take into account: reaction mechanism; structure; sequence; active site residues; cofactors. Need a database of enzyme mechanisms.

MACiE Database: Mechanism, Annotation and Classification in Enzymes. http://www.ebi.ac.uk/thornton-srv/databases/MACiE/

Coverage of MACiE
Representative: based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.
Structures exist for: 6 EC 1.-.-.-; 56 EC 1.2.-.-
MACiE covers: 6 EC 1.-.-.-; 53 EC 1.2.-.-; 156 EC 1.2.3.-; 199 EC 1.2.3.4
1312/184 ~ 7
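Coverage counts of this kind can be obtained by counting distinct EC prefixes at each level. The sketch below uses invented toy entries, not MACiE's data, and `coverage_by_level` is a hypothetical helper name.

```python
def coverage_by_level(ec_numbers):
    """Count distinct EC prefixes at each of the four levels.

    Returns [classes, subclasses, sub-subclasses, full EC numbers].
    """
    counts = []
    for depth in range(1, 5):
        # Truncate every EC number to its first `depth` fields.
        prefixes = {tuple(ec.split(".")[:depth]) for ec in ec_numbers}
        counts.append(len(prefixes))
    return counts


# Toy entries for illustration only (not taken from MACiE).
entries = ["1.1.1.1", "1.1.1.2", "1.2.3.4", "3.4.21.4"]
print(coverage_by_level(entries))
```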

Repertoire of Enzyme Catalysis. G.L. Holliday et al., J. Molec. Biol., 372, 1261-1277 (2007); G.L. Holliday et al., J. Molec. Biol., accepted (2009)

Repertoire of Enzyme Catalysis Enzyme chemistry is largely nucleophilic

Repertoire of Enzyme Catalysis. Reaction types: proton transfer, AdN2, E1, SN2, E2, radical reaction, tautomerisation, others.

Residue Catalytic Propensities

Residue Catalytic Functions

We use a combination of bioinformatics & chemoinformatics to identify similarities between enzyme-catalysed reaction mechanisms

We align the steps of chemical reactions, just like sequence alignment, and so we can measure their similarity.
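The analogy with sequence alignment can be sketched with a Needleman-Wunsch-style global alignment over mechanism steps. This is an illustration under stated assumptions, not the scoring actually used for MACiE: each step is reduced to a plain label, and the match, mismatch and gap values are arbitrary choices made here.

```python
def align_score(steps_a, steps_b, match=1.0, mismatch=-1.0, gap=-1.0):
    """Global (Needleman-Wunsch) alignment score of two step sequences.

    Assumption: two steps are 'similar' only if their labels are equal.
    """
    n, m = len(steps_a), len(steps_b)
    # score[i][j] = best score aligning the first i steps of a
    # with the first j steps of b.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if steps_a[i - 1] == steps_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two steps
                score[i - 1][j] + gap,       # skip a step in a
                score[i][j - 1] + gap,       # skip a step in b
            )
    return score[n][m]


# Toy mechanisms written as lists of step labels (invented for illustration).
mech_1 = ["proton transfer", "AdN2", "E1", "proton transfer"]
mech_2 = ["proton transfer", "AdN2", "proton transfer"]
print(align_score(mech_1, mech_2))
```

Replacing the equality test with a graded step-to-step similarity would turn the score into a finer measure, in the same way that substitution matrices refine sequence alignment.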

Find only a few similar pairs

Identify convergent evolution

Check MACiE for duplicates

Mechanistic similarity is only weakly related to proximity in the EC classification

EC levels in common: 0 = -.-.-.-; 1 = c.-.-.-; 2 = c.s.-.-; 3 = c.s.ss.-
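The number of EC levels two enzymes share can be computed by comparing their EC numbers field by field from the left. A minimal sketch follows; `ec_levels_in_common` is a hypothetical helper name.

```python
def ec_levels_in_common(ec_a, ec_b):
    """Count leading EC fields shared by two EC numbers.

    0 = nothing shared, 1 = same class, 2 = same subclass,
    3 = same sub-subclass, 4 = identical EC number.
    """
    shared = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        # Stop at the first differing or unspecified ('-') field.
        if a != b or a == "-":
            break
        shared += 1
    return shared


print(ec_levels_in_common("1.2.3.4", "1.2.5.1"))
```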

Evolution of Enzyme Function D.E. Almonacid et al., to be published

EC is our Functional Classification (chemical reaction). Enzyme Commission (EC) Nomenclature, 1992, Academic Press, San Diego, 6th Edition

Enzyme catalysis databases. G.L. Holliday et al., Nucleic Acids Res., 35, D515 (2007); S.C. Pegg et al., Biochemistry, 45, 2545 (2006); N. Nagano, Nucleic Acids Res., 33, D407 (2005)

Coverage of MACiE Representative – based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.

Coverage of SFLD: based on a few evolutionarily related families.

Coverage of EzCatDB: but without mechanisms.

Work with domains - evolutionary & structural units of proteins. Map enzyme catalytic mechanisms to domains to quantify convergent and divergent functional evolution of enzymes.

CATH is our Structural Classification. Orengo, C. A., et al. Structure, 1997, 5, 1093

Results: Convergent Evolution
Numbers of CATH code occurrences per EC number
CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C               3.17      1.73       1.38        1.11
A              11.00      3.27       1.93        1.60
T              28.00      4.89       2.24        1.19
H              38.33      5.80       2.46        1.22
CATH/EC reaction: 2.46. An average reaction has evolved independently in 2.46 superfamilies.
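An average of this kind (CATH codes per EC number) can be computed by grouping annotations by reaction and counting the distinct superfamilies in each group. The sketch below uses invented placeholder labels, and `mean_superfamilies_per_reaction` is a hypothetical helper, so the printed value is not one of the figures above.

```python
from collections import defaultdict


def mean_superfamilies_per_reaction(pairs):
    """Mean number of distinct superfamilies annotated with each reaction.

    `pairs` is an iterable of (ec_number, superfamily) tuples.
    """
    superfamilies = defaultdict(set)
    for ec_number, superfamily in pairs:
        superfamilies[ec_number].add(superfamily)
    return sum(len(s) for s in superfamilies.values()) / len(superfamilies)


# Toy annotations: reaction R1 occurs in two unrelated superfamilies
# (convergent evolution), reaction R2 in only one. Labels are placeholders.
toy_pairs = [("R1", "H1"), ("R1", "H2"), ("R2", "H3"), ("R2", "H3")]
print(mean_superfamilies_per_reaction(toy_pairs))
```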

Results: Divergent Evolution
EC reactions/CATH
CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C               4.75     19.50      39.25       90.00
A               3.14      7.00      10.48       17.90
T               1.36      1.79       2.08        3.05
H               1.20      1.36       1.46        2.05
EC reactions/CATH: 1.46. An average superfamily has evolved 1.46 different reactions.
Database entries/CATH: 2.18.
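The reverse count (EC reactions per CATH code) can be sketched by truncating each EC number to a chosen level and counting the distinct values per superfamily. The data and the `mean_ec_per_superfamily` helper are illustrative assumptions, not the analysis pipeline behind the numbers above.

```python
from collections import defaultdict


def mean_ec_per_superfamily(pairs, depth):
    """Mean number of distinct EC prefixes per superfamily.

    `pairs` holds (superfamily, ec_number) tuples; `depth` is the number
    of EC fields kept (1 = class ... 4 = full EC number).
    """
    reactions = defaultdict(set)
    for superfamily, ec_number in pairs:
        prefix = ".".join(ec_number.split(".")[:depth])
        reactions[superfamily].add(prefix)
    return sum(len(r) for r in reactions.values()) / len(reactions)


# Toy data: superfamily H1 has diverged to three reactions, H2 has one.
toy = [
    ("H1", "1.1.1.1"),
    ("H1", "1.1.1.2"),
    ("H1", "4.2.1.1"),
    ("H2", "3.4.21.4"),
]
for depth in range(1, 5):
    print(depth, mean_ec_per_superfamily(toy, depth))
```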

The Future …

(1) Molecular Evolution

Now we want to evolve chemical reactions in silico across chemical, or EC, space. 1. To understand and rationalise convergent and divergent biochemical evolution; 2. To better relate protein structure and function; 3. To understand the influence on networks of coupled reactions.

(2) Understanding Protein Structure We seek to understand the influence of folding pathway on protein structure over all time scales (including the evolutionary one).

5788 (~12%) of PDB come from Structural Genomics

Protein Folding Funnel Energy Landscape

ACKNOWLEDGEMENTS Dr Gemma Holliday, Dr Daniel Almonacid, Dr Noel O’Boyle, Prof. Janet Thornton (EBI), Dr Peter Murray-Rust, Dr Florian Nigsch

ACKNOWLEDGEMENTS Cambridge Overseas Trust