Lecture 4.3 Protein Pathways and Pathway Databases Shan Sundararaj University of Alberta Edmonton, AB


1 Lecture 4.3 Protein Pathways and Pathway Databases
Shan Sundararaj, University of Alberta, Edmonton, AB
ss23@ualberta.ca

2 Interactions → Networks → Pathways
A collection of interactions defines a network.
Pathways are a subset of networks:
– All pathways are networks of interactions, but not all networks are pathways.
– The difference is in the level of annotation/understanding.
We can define a pathway as a biological network that relates to a known physiological process or phenotype.
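The subset relationship on this slide can be sketched with plain sets. This is only an illustration: the pathway label and the unannotated `protein_X`/`protein_Y` interaction are hypothetical, and real databases store far richer records.

```python
# A network is a set of interactions; each interaction is a pair of molecules.
network = {
    ("glucose", "glucose-6-phosphate"),
    ("glucose-6-phosphate", "fructose-6-phosphate"),
    ("protein_X", "protein_Y"),  # hypothetical interaction with no known role
}

# A pathway is a subset of the network tied to a known physiological process.
pathways = {
    "glycolysis (first steps)": {
        ("glucose", "glucose-6-phosphate"),
        ("glucose-6-phosphate", "fructose-6-phosphate"),
    },
}


def is_pathway_of(pathway, net):
    """Every interaction of a pathway must also belong to the network."""
    return pathway <= net


def unassigned(net, known_pathways):
    """Interactions in the network that no annotated pathway covers."""
    covered = set().union(*known_pathways.values())
    return net - covered
```

Here `unassigned(network, pathways)` returns the interactions that are part of the network but not yet part of any pathway, which is the "not all networks are pathways" case.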

3 Pathways
However, there is no precise biological definition of a pathway.
Our partitioning of networks into pathways is somewhat arbitrary:
– We choose the start and end points based on “important” or easily understood compounds.
– This lets us conceptualize the mapping of genotype → phenotype.

4 Biological pathways
There are 3 types of interactions that can be mapped to pathways:
1) enzyme – ligand: metabolic pathways
2) protein – protein: cell signaling pathways, complexes for cell processes
3) gene regulatory elements – gene products: genetic networks

5 Pathways are inter-linked
[Figure labels: STIMULUS, Signalling pathway, Genetic network, Metabolic pathway]

6 Metabolic Pathways
[Figure credit: 1993 Boehringer Mannheim GmbH - Biochemica]

7 What the pathway represents
Metabolites involved
Enzymes/transport proteins
Order of reactions
General biological function
Reaction rates
Expression data
Inhibitors, activators, alternate pathways
Genetic regulatory information

8 Describing metabolic networks
Classical biochemical pathways
– glycolysis, TCA cycle, etc.
Stoichiometric modeling
– flux balance analysis, extreme pathways
Kinetic modeling (CyberCell, E-cell, …)
– Need to accumulate comprehensive kinetic information
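Stoichiometric modeling rests on the steady-state condition S·v = 0, where S is the stoichiometric matrix and v the vector of reaction fluxes. The sketch below only checks that condition for a hypothetical three-reaction toy network (uptake → A → B → export). Flux balance analysis would go further and optimize an objective under this constraint with linear programming, which is not shown here.

```python
# Rows are metabolites (A, B); columns are reactions:
#   v1: uptake -> A,   v2: A -> B,   v3: B -> export
# Toy network invented for illustration.
S = [
    [1, -1, 0],   # A: produced by v1, consumed by v2
    [0, 1, -1],   # B: produced by v2, consumed by v3
]


def net_production(stoich, fluxes):
    """Compute S·v: the net production rate of each metabolite."""
    return [sum(s * f for s, f in zip(row, fluxes)) for row in stoich]


def is_steady_state(stoich, fluxes, tol=1e-9):
    """A flux vector is at steady state when no metabolite accumulates."""
    return all(abs(x) <= tol for x in net_production(stoich, fluxes))
```

With these definitions, `is_steady_state(S, [2, 2, 2])` holds, while `[2, 1, 1]` leaves metabolite A accumulating.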

9 Complexity
Pathways involve multiple enzymes, which may have multiple subunits, alternate forms, and alternate specificities.
Enzymes may be involved in multiple pathways.
Malate dehydrogenase appears in 6 different metabolic pathways in some databases.
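One way to see this many-to-many relationship is to invert a pathway → enzymes table into an enzyme → pathways index. The table below is a small illustrative excerpt chosen for this sketch, not the content of any particular database.

```python
from collections import defaultdict

# Illustrative excerpt only; real pathway definitions list many more enzymes.
pathway_enzymes = {
    "TCA cycle": {"citrate synthase", "malate dehydrogenase"},
    "glyoxylate cycle": {"citrate synthase", "malate dehydrogenase", "isocitrate lyase"},
    "glycolysis": {"hexokinase"},
}


def enzymes_to_pathways(table):
    """Invert pathway -> enzymes into enzyme -> set of pathways."""
    index = defaultdict(set)
    for pathway, enzymes in table.items():
        for enzyme in enzymes:
            index[enzyme].add(pathway)
    return dict(index)


def shared_enzymes(table):
    """Enzymes that appear in more than one pathway, with their pathways sorted."""
    return {
        enzyme: sorted(found_in)
        for enzyme, found_in in enzymes_to_pathways(table).items()
        if len(found_in) > 1
    }
```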

10 Metabolic Pathway Reconstruction
Given a genomic sequence, we can infer what metabolic pathways are available to an organism.
This approach was used to design a culture medium for Tropheryma whipplei by identifying which nutrients were essential for growth (Renesto et al., Lancet, 362, 447-449, 2003).

11 Co-expression within pathways
Tempting thought: genes that occur within the same pathway will show similar expression profiles.
Reality: it depends greatly on how you identify your pathways. KEGG pathways show at best 50% co-expression in a survey of available yeast expression data (Ihmels et al., Nat Biotechnol. 22, 86-92, 2004).
Expression levels do not correlate very well with protein interactions (unless they are “stable” complexes, maintained in many different conditions).
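A rough way to quantify co-expression within a pathway is the fraction of gene pairs whose expression profiles are strongly correlated. The sketch below uses Pearson correlation with an arbitrary 0.8 cutoff. Both the metric and the cutoff are assumptions made for illustration and are not the method used in the cited survey.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_fraction(profiles, threshold=0.8):
    """Fraction of gene pairs in a pathway with correlation >= threshold."""
    pairs = list(combinations(sorted(profiles), 2))
    hits = sum(
        1 for g, h in pairs if pearson(profiles[g], profiles[h]) >= threshold
    )
    return hits / len(pairs)


# Hypothetical expression profiles for three genes of one pathway.
example_pathway = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
}
```

In this toy case only the geneA/geneB pair passes the cutoff, so one of the three pairs counts as co-expressed.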

12 Pathway Databases
KEGG
BioCyc
Reactome
GenMAPP
BioCarta
TransPATH
…175 more at the Pathway Resource List: http://www.cbio.mskcc.org/prl/index.php

13 BioPAX (www.biopax.org)
Collaborative effort to create a data exchange format for biological pathway data

14 KEGG http://www.genome.ad.jp/kegg/
5904 chemical reactions
15,037 pathways
229 reference pathways
85 ortholog tables
181 organisms

15 KEGG
GENES Database
– The universe of genes and proteins in complete genomes
LIGAND Database
– The universe of chemical reactions involving metabolites and other biochemical compounds
Pathway Database
– Molecular interaction networks, metabolic and regulatory pathways, and molecular complexes

16 Connection between KEGG and other Databases

17 Pathways
Represented as diagrams, manually created, stored as GIFs
Easy to link to, highlight genes of interest
Generate orthologous pathways in other organisms
[Figure labels (EC numbers): 2.7.2.4, 1.2.1.11, 1.1.1.3, 2.3.1.46, 2.5.1.48, 4.4.1.8, 2.1.1.13, 2.5.1.6]

18 http://www.biocyc.org/

19 BioCyc
The primary database was EcoCyc (E. coli).
21 more curated pathway/genome databases (PGDBs), each focusing on one organism (e.g. HumanCyc)
– Also 142 more non-curated (computationally generated) pathways
The MetaCyc database contains non-redundant reference pathways from more than 240 organisms.
Supports the “Pathway Tools” software suite to analyze PGDBs, and the “PathoLogic” pathway prediction program for new genomes.

20 BioCyc
[Figure labels: Chromosomes, Plasmids; Genes; Proteins; Reactions; Pathways; Compounds; Operons, Promoters, DNA Binding Sites]
Each PGDB includes info about:
– Pathways, reactions, substrates
– Enzymes, transporters
– Genes, replicons
– Transcription factors, promoters, operons, DNA binding sites
MetaCyc and EcoCyc are literature-based; the others are computationally derived.

21
164 datasets
Query by protein, gene, compound, reaction, pathway
BLAST a sequence if the protein name is unknown

22 MetaCyc Statistics

23 EcoCyc Statistics

24 BioCyc: Pathway Tools
Full Metabolic Map
– Paint gene expression data on metabolic network; compare metabolic networks
Pathways
– Pathway prediction (PathoLogic)
Reactions
– Balance checker
Compounds
– Chemical substructure comparison
Enzymes, Transcription Factors
Genes: BLAST search
Operons
– Operon prediction
(Adapted from Pathway Tools tutorial, http://bioinformatics.ai.sri.com/ptools/)
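The idea behind a reaction balance checker can be sketched as counting atoms on each side of a reaction. This is a generic illustration, not the Pathway Tools implementation: it assumes simple formulas without parentheses and ignores charge.

```python
import re
from collections import Counter


def atoms(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no parentheses)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_atoms(side):
    """Total atoms on one side, given (coefficient, formula) pairs."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in atoms(formula).items():
            total[element] += coefficient * n
    return total


def is_balanced(reactants, products):
    """A reaction is mass-balanced when both sides hold the same atoms."""
    return side_atoms(reactants) == side_atoms(products)


# Complete oxidation of glucose: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
reactants = [(1, "C6H12O6"), (6, "O2")]
products = [(6, "CO2"), (6, "H2O")]
```

Calling `is_balanced(reactants, products)` succeeds for the equation above and fails if the coefficients are dropped.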

25 PathoLogic – Making PGDBs

26 Completeness of Pathways

27 Completeness of Pathways

28 Issues with predicting pathways
Predicting metabolic pathways from genome:
– Predict genes
– Assign enzymatic function to genes
– Look for enzymes unique to pathway
– Check if pathway is “balanced” (no holes)
– Try to fill holes by re-searching genome
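The "no holes" check in the steps above can be sketched as a comparison between the enzymes a reference pathway requires and the enzyme functions assigned to the genome. The EC numbers are borrowed from the KEGG slide purely as sample identifiers; grouping them into one pathway and the annotated set are hypothetical, and this is not how PathoLogic itself scores pathways.

```python
def find_holes(pathway_ecs, annotated_ecs):
    """EC numbers the pathway needs that no gene in the genome was assigned."""
    return [ec for ec in pathway_ecs if ec not in annotated_ecs]


def completeness(pathway_ecs, annotated_ecs):
    """Fraction of the pathway's reactions that have an assigned enzyme."""
    missing = find_holes(pathway_ecs, annotated_ecs)
    return 1 - len(missing) / len(pathway_ecs)


# Hypothetical reference pathway, in reaction order.
example_pathway = ["2.7.2.4", "1.2.1.11", "1.1.1.3"]

# Hypothetical set of EC numbers assigned to genes of a new genome.
genome_annotation = {"2.7.2.4", "1.1.1.3", "2.5.1.6"}

# Each hole is a candidate for re-searching the genome for a missed gene.
holes = find_holes(example_pathway, genome_annotation)
```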

29 Reactome http://www.reactome.org/

30 Reactome
Joint venture of CSHL and EBI (supersedes the Genome Knowledgebase project)
Curated database of biological processes in humans
– Also rat, mouse, fugu, zebrafish, chicken
Every entry is referenced by curators to a literature citation or to an inference based on sequence similarity

31 Reactome model
Model reactions: (input_entities) → (output_entities)
Distinguishes between modified/unmodified proteins (modification is an explicit reaction)
Highly annotated at every step, very micromanaged; the hope is to find interesting links between reactions
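The reaction model on this slide can be sketched as a record with a set of input entities and a set of output entities, where a modified protein is a separate entity from its unmodified form. The class, field, and entity names below are hypothetical and do not reflect Reactome's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    name: str
    input_entities: frozenset
    output_entities: frozenset


# The modification is itself a reaction: unmodified P goes in,
# phosphorylated P comes out as a different entity.
phosphorylation = Reaction(
    name="phosphorylation of protein P",
    input_entities=frozenset({"P", "ATP"}),
    output_entities=frozenset({"P (phosphorylated)", "ADP"}),
)

# A downstream reaction that accepts only the modified form.
binding = Reaction(
    name="phosphorylated P binds partner Q",
    input_entities=frozenset({"P (phosphorylated)", "Q"}),
    output_entities=frozenset({"P (phosphorylated):Q complex"}),
)


def reactions_consuming(entity, reactions):
    """Names of the reactions that take the given entity as an input."""
    return [r.name for r in reactions if entity in r.input_entities]
```

Because the two forms of P are distinct entities, a query for the unmodified protein does not return the binding reaction.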

32 Reactome: PathFinder
Pathfinding between distant processes
Enter two molecules or events and see if they can be joined together by reactions
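Joining two molecules by reactions can be sketched as a breadth-first search over a reaction table. This is a generic graph search and not PathFinder's actual algorithm. The reactions are hypothetical, and the sketch assumes a reaction can be followed from any single input, which ignores co-substrates.

```python
from collections import deque

# Hypothetical reactions: name -> (input molecules, output molecules)
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"X"}, {"Y"}),
}


def find_route(table, start, goal):
    """Return the shortest list of reaction names linking start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        molecule, route = queue.popleft()
        if molecule == goal:
            return route
        for name, (inputs, outputs) in table.items():
            if molecule in inputs:
                for product in outputs:
                    if product not in seen:
                        seen.add(product)
                        queue.append((product, route + [name]))
    return None  # the two molecules cannot be joined by these reactions
```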

33 Reactome: SkyPainter
Find all reactions that contain a molecule or event
– Very flexible input, any one or more of: protein/gene ID (UniProt, GenBank or others); protein/gene sequence; GO or OMIM identifier; time series from a gene expression study

34 Reactome: SkyPainter
Starry sky output
If expression data are used, you get different colours for different levels of expression
If a time series is available, you can make an animation

35 GenMAPP (www.genmapp.org)
Designed to rapidly analyze gene profiling data in the context of known biochemical pathways
Pathways (MAPPs) are authored by experts; several are also adapted from KEGG
Pathways easily web-queryable
Free for all users
But… Windows platform only

36 GenMAPP
Easy to draw/edit pathways
Color genes from user-imported expression data

37 MAPPFinder – maps to the Gene Ontology (GO)

38 BioCarta (www.biocarta.com)

39 BioCarta
Not a public database, but offers a free, clickable, graphics-rich pathway database and gene information
– Community annotation
Easy-to-use glyph system for genes
355 pathways
– mostly human/mouse metabolic and signaling pathways

40 TransPATH

41 TransPATH
Part of larger BioBase package (commercial)
PathwayBuilder package for network visualization
Highly integrated with signaling networks and transcription factor networks (TransFAC)
Linked to extensive enzyme information in BRENDA (www.brenda.uni-koeln.de/)
28,456 molecules; 52,007 reactions; 54 hand-drawn pathways

42 Pathway Database Comparison
Databases: KEGG; BioCyc; GenMAPP; Reactome; BioCarta; TransPATH
Organisms: 181 (varied); E. coli, human (20 others); Human, mouse, rat, fly, yeast; Human, rat, mouse, chicken, fugu, zebrafish; Human, mouse
Pathway types: Metabolic, genetic, signaling, complexes; Metabolic, complexes; Metabolic, signaling, complexes; Signaling, genetic
Tools/visualization: linked to from many; Pathway Tools; GenMAPP; PathView applets; none; Pathway Builder
Images: Static box flow diagrams; Detailed flow diagrams; Static box flow diagrams; “starry sky”; “Graphics rich” cell diagrams; Graphics rich cell diagrams
Download formats: KGML XML; BioPAX SBML; MAPP format; SBML MySQL; Just images; Proprietary XML files

43 Conclusion
Pathway databases are continually evolving, and are an important abstract mid-level for expressing data: between genes/proteins and observable phenotypes.
Metabolic pathways are the best studied/modeled.
There are many different formats for storage and display, but the field is moving towards standards (PSI-MI, BioPAX).

