
MED260 Modeling Protein Function - October 11, 2006 1 Modeling Protein Function MED260 Philip E. Bourne Department of Pharmacology, UCSD


1 MED260 Modeling Protein Function - October 11, 2006
Modeling Protein Function (MED260)
Philip E. Bourne, Department of Pharmacology, UCSD
pbourne@ucsd.edu
http://www.sdsc.edu/pb
Slides on-line at: http://www.sdsc.edu/pb/edu/med260/med260.ppt

2 Agenda
Why model protein function?
Where does it fit as a technique in modern medical research?
The data deluge as a motivator
The extent of what can be modeled
Ontologies – establishing order from chaos
Examples of what can be learnt
Accuracy – a word of caution

3 Why Model Protein Function
The rate of discovery of new proteins far outpaces our ability to functionally characterize them
Functional discovery of new proteins has implications in:
–Drug discovery
–Biomarker identification
–Understanding of biological processes
–Identification of disease states and treatment regimes
Section: Why model protein function?

4 Scientific Research & Discovery (diagram labels)
Representative disciplines: Cell Biology, Anatomy, Physiology, Proteomics, Genomics, Medicinal Chemistry, Chemistry
Example units: Organisms, Organs, Cells, Macromolecules, Biopolymers, Atoms & Molecules
Examples and representative technologies: MRI, Heart, Neuron, Structure, Sequence, Protease Inhibitor, Electron Microscopy, Migratory Sensors, Ventricular Modeling, X-ray Crystallography, Protein Docking
Section: Where does it fit as a technique in modern medical research?

5 Scientific Research & Discovery (diagram labels, with Translational Medicine added)
Representative disciplines: Cell Biology, Anatomy, Physiology, Proteomics, Genomics, Medicinal Chemistry, Chemistry
Example units: Organisms, Organs, Cells, Macromolecules, Biopolymers, Atoms & Molecules
Examples and representative technologies: MRI, Heart, Neuron, Structure, Sequence, Protease Inhibitor, Electron Microscopy, Migratory Sensors, Ventricular Modeling, X-ray Crystallography, Protein Docking
Overlay: Translational Medicine
Section: Where does it fit as a technique in modern medical research?

6 The Ability to Model Protein Function Influences, and Can Be Influenced by, Any Level of Biological Complexity - Examples
Genome – rapid increase in sequenced genomes provides new raw material
Proteome – large increase in the number of 3D structures highlights new functions
Interactome – identification of a binding partner points to a new function
Metabolome – isolation of a protein within a metabolic pathway
Cell – localization points to function
Organ – gene expression in heart tissue points to function
Organism – different physiology observed in species can be related to protein functions
Section: Where does it fit as a technique in modern medical research?

7 Scientific Research & Discovery (diagram labels, with focus marker)
Representative disciplines: Cell Biology, Anatomy, Physiology, Proteomics, Genomics, Medicinal Chemistry, Chemistry
Example units: Organisms, Organs, Cells, Macromolecules, Biopolymers, Atoms & Molecules
Examples and representative technologies: MRI, Heart, Neuron, Structure, Sequence, Protease Inhibitor, Electron Microscopy, Migratory Sensors, Ventricular Modeling, X-ray Crystallography, Protein Docking
Marker: We will focus here

8 At All Levels We Are Being Driven By Data (figure labels)
Flow: Biological Experiment, Data, Information, Knowledge, Discovery
Steps: Collect, Characterize, Compare, Model, Infer
Levels: Sequence, Structure, Assembly, Sub-cellular, Cellular, Organ, Higher-life
Axes: Year (90, 95, 00, 05); Data (1, 10, 100, 1000, 100000); Computing Power; Sequencing Technology; Complexity; Technology
Milestones: Human Genome Project, E. coli Genome, C. elegans Genome, 1 Small Genome/Mo., ESTs, Yeast Genome, Gene Chips, Virus Structure, Ribosome, Model Metabolic Pathway of E. coli, Brain Mapping, Genetic Circuits, Neuronal Modeling, Cardiac Modeling, Human Genome
# People/Web Site: 10^6, 10^2, 1; Virtual Communities
Section: The Data Deluge

9 Metagenomics: A First Look
New type of genomics
New data (and lots of it) and new types of data
–17M new (predicted!) proteins: 4-5x growth in just a few months, and much more coming
–New challenges and exacerbation of old challenges
Section: The Data Deluge

10 Metagenomics: First Results
More than 99.5% of DNA in every environment studied represents unknown organisms
–Culturable organisms are exceptions, not the rule
Most genes represent distant homologs of known genes, but there are thousands of new families
Everything we touch turns out to be a gold mine
Environments studied:
–Water (ocean, lakes)
–Soil
–Human body (gut, oral cavity, human microbiome)
Section: The Data Deluge

11 Metagenomics: New Discoveries
[Figure: Environmental (red) vs. currently known PTPases (blue); labels: Higher eukaryotes, 1, 2, 3, 4]
Section: The Data Deluge

12 The Good News and the Bad News
Good news
–Data pointing towards function are growing at near exponential rates
–IT can handle it on a per dollar basis
Bad news
–Data are growing at near exponential rates
–Quality is highly variable
–Accurate functional annotation is sparse
Section: The Data Deluge

13 Genomes - 2004
We all know about the human genome – what is not so well known is:
–191 completed microbial genomes
–44 archaea
–727 bacteria
–785 eukaryotes (complete or in progress)
–Viroids ….
Section: The Data Deluge

14 Proteome
We are reasonably good, but not perfect, at finding proteins in genomes with intergenic regions – e.g., alternative initiation codons are a problem
Regulatory elements provide a different set of challenges
We are not so good at assigning functions to those proteins
Moreover, the devil is in the details
Section: The Extent of What Can Be Modeled

15 Estimated Functional Roles (by % of Proteins) of the Proteome in a Complex Organism
Section: The Extent of What Can Be Modeled

16 Functional Nomenclature Needs to be Consistent for Orderly Progress – Enter EC and GO
EC classifies all enzymes - http://www.chem.qmul.ac.uk/iubmb/enzyme/
The Gene Ontology Consortium characterizes gene products by molecular function, biological process and cellular location (cellular component) - http://www.geneontology.org/
Section: Ontologies – establishing order from chaos
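
To make the EC hierarchy concrete, here is a minimal sketch (not from the original slides) that splits an EC number into its four levels and counts how many leading levels two enzymes share. The function names `parse_ec` and `shared_levels` are hypothetical helpers introduced for illustration; the class names are the standard top-level EC classes.

```python
# Top-level EC classes. Class 7 (Translocases) was added in 2018,
# after this lecture was given.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec(ec):
    """Split an EC number such as 'EC 2.7.11.11' into its four levels.

    A '-' marks an unspecified level and is returned as None.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four levels, got {ec!r}")
    levels = [None if f == "-" else int(f) for f in fields]
    if levels[0] not in EC_CLASSES:
        raise ValueError(f"unknown EC class in {ec!r}")
    return {"levels": levels, "class_name": EC_CLASSES[levels[0]]}


def shared_levels(ec_a, ec_b):
    """Count how many leading EC levels two enzymes have in common."""
    count = 0
    for a, b in zip(parse_ec(ec_a)["levels"], parse_ec(ec_b)["levels"]):
        if a is None or b is None or a != b:
            break
        count += 1
    return count


if __name__ == "__main__":
    # EC 2.7.11.11 is cAMP-dependent protein kinase (Protein Kinase A).
    print(parse_ec("EC 2.7.11.11"))
    print(shared_levels("2.7.11.11", "2.7.11.1"))
```

The more leading levels two EC numbers share, the more closely related the catalyzed reactions; a consistent scheme like this is what makes functional annotations comparable across databases.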

17 Functional Coverage of the Human Genome
http://function.rcsb.org:8080/pdb/function_distribution/index.html
40% covered
Section: The Extent of What Can Be Modeled

18 Step 1. Learn What You Can from the Protein Sequence
Find it
Pay attention to the quality of the functional annotation – errors are transitive
Understand its 1-D structure – domain organization, {signatures, fingerprints}
Section: Examples of what can be learnt
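
A sequence "signature" can be as simple as a short residue pattern. The sketch below (an illustration added here, not part of the lecture) converts a PROSITE-style pattern into a Python regular expression and scans a sequence with it. The helper names `prosite_to_regex` and `find_motif` are hypothetical, only a subset of the PROSITE syntax is handled, and the input sequence is a short toy fragment. The pattern used is the well-known P-loop (ATP/GTP-binding motif A), `[AG]-x(4)-G-K-[ST]`.

```python
import re

# One pattern element: a residue (or x), an allowed set [..], or a
# forbidden set {..}, optionally followed by a repeat such as (4) or (2,4).
_ELEMENT = re.compile(
    r"([A-Za-z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+(?:,\d+)?)\))?"
)


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex.

    Anchors (< and >) are deliberately not handled in this sketch.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = m.groups()
        if core == "x":
            core = "."                       # x means any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # {..} means "none of these"
        if repeat:
            core += "{" + repeat + "}"
        parts.append(core)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    p_loop = "[AG]-x(4)-G-K-[ST]"
    toy_sequence = "MTEYKLVVVGAGGVGKSALT"  # toy input for illustration
    print(prosite_to_regex(p_loop))
    print(find_motif(toy_sequence, p_loop))
```

A hit only suggests nucleotide binding; short patterns match by chance, which is one reason the quality of the resulting annotation needs attention.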

19 Step 2. Is there a 3D Structure? If So, What Can You Learn from That?
Find it
Understand it
Characterize it
Understand its function(s) – these follow a power law at the fold level – some folds are promiscuous (many functions), others are solitary or of unknown function
Section: Examples of what can be learnt

20 (a) myoglobin (b) hemoglobin (c) lysozyme (d) transfer RNA (e) antibodies (f) viruses (g) actin (h) the nucleosome (i) myosin (j) ribosome Courtesy of David Goodsell, TSRI

21 First, Why Bother with Structure? An Example: Protein Kinase A
This “molecular scene” for cAMP-dependent protein kinase depicts years of collective knowledge.
Beyond basics, only the atomic coordinates are captured by the PDB.
Functional annotation requires the literature
Section: Examples of what can be learnt

22 What Did that Picture Tell Us?
Two domains with associated functions: ATP binding and substrate binding
Through conserved residues and their spatial location: details of ATP and substrate binding and the mechanism of the phosphotransfer reaction
So is structure the answer to functional modeling?
Section: Examples of what can be learnt

23 Question: So is structure the answer to functional modeling?
Answer: Partly - the number of unique protein sequences still outnumbers the number of unique structures by 100:1
Enter Structural Genomics
Enter Structure Prediction
Section: Examples of what can be learnt

24 The Structural Genomics Pipeline (X-ray Crystallography)
Basic steps:
Target Selection
Crystallomics: Isolation, Expression, Purification, Crystallization
Data Collection
Structure Solution
Structure Refinement
Functional Annotation
Publish
Section: Examples of what can be learnt

25 Structural Genomics Will Give Us...
Good news
–More structures (definitely)
–New folds (some but not as anticipated)
–New understanding of specific diseases and pathways (maybe)
–Representatives from each major protein family (maybe)
Bad news
–Many new structures that are functionally unclassified (definitely)
Section: Examples of what can be learnt

26 What About Structure Prediction?
Current rule: we will be able to predict a structure when we know all the structures
Section: Examples of what can be learnt

27 Why is Structure Prediction so Hard?
[Figure: 1000 random structurally similar PDB polypeptide chains with z > 4.5, plotted as % sequence identity vs. alignment length; regions labeled Twilight Zone and Midnight Zone]
Section: Examples of what can be learnt
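
The two quantities on that plot are easy to compute from a pairwise alignment. The following is a minimal sketch, added for illustration, that measures percent identity and aligned length and then assigns a zone. The cut-offs (35% and 20%) are assumed round numbers, not values taken from the slide, and a realistic threshold would also depend on alignment length, as the plot itself indicates. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (% identity, aligned length) for two gapped, equal-length rows.

    Columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0, 0
    matches = sum(a == b for a, b in columns)
    return 100 * matches / len(columns), len(columns)


def identity_zone(identity, twilight_upper=35.0, midnight_upper=20.0):
    """Classify a % identity value using assumed, illustrative cut-offs."""
    if identity >= twilight_upper:
        return "above twilight zone"
    if identity >= midnight_upper:
        return "twilight zone"
    return "midnight zone"


if __name__ == "__main__":
    identity, length = percent_identity("AAAA-A", "AAGGAA")
    print(identity, length, identity_zone(identity))
```

Pairs that fall in the lower zones can still share a fold, which is why sequence comparison alone often fails to find a usable template.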

28 Approaches to Structure Prediction
Homology modeling
Threading (aka fold recognition)
Ab initio
How well do we do? – see CASP
Consensus servers
–Eva - http://cubic.bioc.columbia.edu/eva//
–LiveBench - http://bioinfo.pl/meta/
Section: Examples of what can be learnt

29 Step 3. What Can Be Got from Structure When You Have it?
From: Structural Bioinformatics, eds. Bourne and Weissig, p. 394, Wiley 2002
Section: Examples of what can be learnt

30 Specific Example: Mj0577 – putative ATP molecular switch
Mj0577 is an open reading frame (ORF) of previously unknown function from Methanococcus jannaschii. Its structure was determined at 1.7 Å (Figure 7a) (Zarembinski et al., 1998). The structure contains a bound ATP molecule, picked up from the E. coli host. The presence of bound ATP led to the proposition that Mj0577 is either an ATPase or an ATP-binding molecular switch. Further experimental work showed that Mj0577 cannot hydrolyse ATP by itself, and can only do so in the presence of M. jannaschii crude cell extract. Therefore it is more likely to act as a molecular switch, in a process analogous to ras-GTP hydrolysis in the presence of GTPase activating protein.
From: Structural Bioinformatics, eds. Bourne and Weissig, p. 402, Wiley 2002
Section: Examples of what can be learnt

31 Step 4. Proteins Do Not Function in Isolation But are Part of Complex Interaction Networks
http://www.genome.jp/kegg/
Section: Examples of what can be learnt
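
One simple way to use network context is "guilt by association": guess the function of an uncharacterized protein from the annotations of its direct interaction partners. The sketch below is an assumption-laden illustration, not a method described in the lecture. The protein names, interactions and annotations are invented toy data, and `predict_by_neighbors` is a hypothetical helper that takes a plain majority vote.

```python
from collections import Counter


def predict_by_neighbors(protein, interactions, annotations):
    """Guess a function for `protein` by majority vote over its partners.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    annotations:  dict mapping protein name to a known function label.
    Returns None when no annotated partner exists.
    """
    neighbors = set()
    for a, b in interactions:
        if a == protein and b != protein:
            neighbors.add(b)
        elif b == protein and a != protein:
            neighbors.add(a)
    # Sort so that ties are broken reproducibly (alphabetical partner order).
    votes = Counter(annotations[n] for n in sorted(neighbors) if n in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Invented toy network, for illustration only.
    toy_interactions = [("P1", "P2"), ("P1", "P3"), ("P4", "P1"), ("P2", "P5")]
    toy_annotations = {
        "P2": "kinase",
        "P3": "kinase",
        "P4": "phosphatase",
        "P5": "kinase",
    }
    print(predict_by_neighbors("P1", toy_interactions, toy_annotations))
```

A vote like this yields a hypothesis to test, not an annotation; interaction partners frequently have different molecular functions within the same pathway.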

32 Accuracy - A Word of Caution
Errors are transitive
–Proteins A and B are observed to have similar functions through sequence homology
–Proteins B and C are observed to have similar functions through sequence homology
–Is protein A related to protein C?
–Up to 30% of current annotation may be wrong
Section: Accuracy - A Word of Caution
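
A toy example shows why the A-B-C chain can mislead. In the sketch below (an illustration under stated assumptions, not taken from the slides), two proteins count as "similar" if they share at least one domain. The domain assignments are invented, and the independence assumption in `chain_accuracy` is a deliberate simplification.

```python
def related(p, q, domains):
    """Treat two proteins as similar if they share at least one domain."""
    return bool(domains[p] & domains[q])


def chain_accuracy(per_step_accuracy, steps):
    """Chance that an annotation survives `steps` transfers intact.

    Assumes each transfer is independently correct with the same
    probability, which is a simplification.
    """
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # Invented toy domain architectures: B is a two-domain protein.
    toy_domains = {
        "A": {"kinase"},
        "B": {"kinase", "SH2"},
        "C": {"SH2"},
    }
    print(related("A", "B", toy_domains))  # A and B share a domain
    print(related("B", "C", toy_domains))  # B and C share a different domain
    print(related("A", "C", toy_domains))  # A and C share nothing
    print(chain_accuracy(0.9, 3))
```

Here A matches B and B matches C, yet A and C have nothing in common, so copying A's annotation to C through B would be an error. Each further transfer compounds the risk.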

33 Questions?

34 Demo of Steps 1-4
Step 1. Learn What You Can from the Protein Sequence
Step 2. Is there a 3D Structure? If So, What Can You Learn from That?
Step 3. What Can Be Got from Structure When You Have it?
Step 4. Proteins Do Not Function in Isolation But are Part of Complex Interaction Networks

