MetaCyc and AraCyc: Plant Metabolic Databases Hartmut Foerster Carnegie Institution.

Slides:



Advertisements
Similar presentations
How pathway databases were created and curated Peifen Zhang Plant Metabolic Network (PMN)
Advertisements

The Arabidopsis Information Resource (TAIR)
Kate Dreher AraCyc, TAIR, PMN Carnegie Institution for Science
Part I: Tips and techniques from curators Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science.
SRI International Bioinformatics Comparative Analysis Q
SRI International Bioinformatics 1 Genome Browser Markus Krummenacker Bioinformatics Research Group SRI, International Q
Overview of the Pathway Tools Software and Pathway/Genome Databases.
Overviews and Omics Viewers. SRI International Bioinformatics Introduction Each overview is a genome-scale diagram of a different aspect of the cellular.
The Plant Metabolic Network: PlantCyc, AraCyc, and NEW Metabolic Pathway Databases for Plant Research *K. Dreher, P. Zhang, L. Chae, R.A. Nilo Poyanco,
Overview of the Pathway Tools Software and Pathway/Genome Databases.
SRI International Bioinformatics 1 The consistency Checker, or Overhauling a PGDB By Ron Caspi.
Curation of the EcoCyc Database: The EcoCyc Update Project Martha Arnaud Scientific Database Curator Bioinformatics Research Group SRI International
Introduction to the Pathway Tools Software David Walsh and Simon Eng bigDATA Workshop—May 29, 2010.
Pathway Tools User Group Meeting Introduction Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.
陳虹瑋 國立陽明大學 生物資訊學程 Genome Engineering Lab. Genome Engineering Lab The Newest.
Update on The Pathway Tools Software Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org MetaCyc.org.
Creating a … Community Database Organism-Specific Database Model-Organism Database.
Accessing the Data You Need at the Plant Metabolic Network kate dreher biocurator PMN The Carnegie Institution for Science Stanford, CA.
1 SRI International Bioinformatics BioCyc Tutorial Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org,
1 SRI International Bioinformatics The Pathway Tools Software and BioCyc Database Collection Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International.
SRI International Bioinformatics 1 Pathway Tools: Recent Developments GMOD Meeting, June 2006.
Overviews, Omics Viewers, and Object Groups. SRI International Bioinformatics Introduction Each overview is a genome-scale diagram of cellular machinery.
Overviews and Omics Viewers. SRI International Bioinformatics Introduction Each overview is a genome-scale diagram of cellular machinery l Cellular Overview.
Data Content of the BioCyc Databases. BioCyc Tier 1 Databases.
Pathway Assignments. The assignment – Annotating Pathways KEGG Pathway Database.
Copyright OpenHelix. No use or reproduction without express written consent1.
Copyright OpenHelix. No use or reproduction without express written consent1.
Accessing information in plant metabolic pathway databases at the PMN, Gramene, and SGN Part I: Contents, Search Strategies, and Data Sharing Opportunities.
TAIR/Gramene/SGN Workshop I ASPB Meeting July 08, 2007 Chicago, IL Metabolic Databases.
The BioCyc Collection of Pathway/Genome Databases Alexander Shearer Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.
SRI International Bioinformatics 1 Recent Developments in Pathway Tools GMOD Workshop November ‘07 Suzanne Paley Bioinformatics Research Group SRI International.
Use cases for Tools at the Bovine Genome Database Apollo and Bovine QTL viewer.
SRI International Bioinformatics 1 Advanced Editing of Pathway/Genome Databases Ron Caspi.
SRI International Bioinformatics 1 Object Groups & Enrichment Analysis Suzanne Paley Pathway Tools Workshop 2010.
PlantCyc, AraCyc, PoplarCyc and more... Building databases and connecting to researchers at the Plant Metabolic Network kate dreher curator PMN/TAIR.
The consistency Checker, or Overhauling a PGDB By Ron Caspi.
1 SRI International Bioinformatics GO Term Integration and Curation in Pathway Tools and EcoCyc Ingrid M. Keseler Bioinformatics Research Group SRI International.
Top Four Essential TAIR Resources Debbie Alexander Metabolic Pathway Databases for Arabidopsis and Other Plants Peifen Zhang.
SRI International Bioinformatics 1 Submitting pathway to MetaCyc Ron Caspi.
1 SRI International Bioinformatics And now for our ‘Feature’ presentation: Automatic Loading of Protein Sequence Annotation Data from UniProt to Pathway.
Cellular Overview and Omics Viewer. SRI International Bioinformatics The Cellular Overview Diagram A way to quickly visualize an organism’s metabolism.
SRI International Bioinformatics 1 SmartTables & Enrichment Analysis Peter Karp SRI Bioinformatics Research Group September 2015.
© 2014 SRI International About OMICS Group OMICS Group International is an amalgamation of Open Access publications and worldwide international science.
Pathway Database Pathway Comparison Expression Viewer Discovery. Pankaj Jaiswal Oregon State University 1.
Combining Computational Prediction and Manual Curation to Create Plant Metabolic Pathway Databases Peifen Zhang Carnegie Institution For Science Department.
Metabolic Pathway Databases and Tools Speaker and Schedule Update PMN (Peifen Zhang) KEGG (auto-slide show) MetaCrop (cancelled)
SRI International Bioinformatics 1 Genome Browser Markus Krummenacker Bioinformatics Research Group SRI, International Q
SRI International Bioinformatics 1 The Structured Advanced Query Page Mario Latendresse Tomer Altman Bioinformatics Research Group SRI International March,
SRI International Bioinformatics 1 Editing Pathway/Genome Databases Ron Caspi.
Building and Refining AraCyc: Data Content, Sources, and Methodologies Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science.
Welcome to Gramene’s RiceCyc (Pathways) Tutorial RiceCyc allows biochemical pathways to be analyzed and visualized. This tutorial has been developed for.
1 AraCyc Metabolic Pathway Annotation. 2 AraCyc – An overview  AraCyc is a metabolic pathway database for Arabidopsis thaliana;  Computational prediction.
Copyright OpenHelix. No use or reproduction without express written consent1 1.
2006 ICAR: TAIR workshop Organizers: Katica Ilic and Peifen Zhang Location: Reception Room, 4th floor A general overview of TAIR website and demonstration.
SRI International Bioinformatics 1 Pathway Tools Features Available Only in the Desktop Version PathoLogic.
SRI International Bioinformatics Selected PathoLogic Refining Tasks Creation of Protein Complexes Assignment of Modified Proteins Operon Prediction.
Recent Developments and Future Directions in Pathway Tools Peter D. Karp SRI International.
The Bovine Genome Database Abstract The Bovine Genome Database (BGD, facilitates the integration of bovine genomic data. BGD is.
Overviews, Omics Viewers, Pathway Collages
Comparative Analysis in BioCyc
Why Create a PGDB? Perform pathway analyses as part of a genome project Analyze omics data Create a central public information resource for the organism,
An Advanced Web Query Interface for Biological Databases
The Pathway Tools FBA Module
The Pathway Tools Software and BioCyc Database Collection
A Community Effort to Model the Human Microbiome
Propagating Changed Annotation and Pathway Information
Welcome to Gramene’s RiceCyc (Pathways) Tutorial
SRI Bioinformatics Research Group
Part II SeqViewer AraCyc Help
Overview of the Pathway Tools Software and Pathway/Genome Databases
Presentation transcript:

MetaCyc and AraCyc: Plant Metabolic Databases Hartmut Foerster Carnegie Institution

Overview  MetaCyc: Explore the Database  AraCyc: The Metabolic Network of Arabidopsis

MetaCyc

What is MetaCyc ?  MetaCyc is a multi-organism database that collects any known pathways across all kingdoms  MetaCyc is a curated, literature-based biochemical pathways database  Collaboration between SRI International and Carnegie Institution

Goal and Applications Goal Universal repository of metabolic pathways  Up-to-date, literature-curated catalogue of commented enzymes and pathways for use in research, metabolic engineering and education Applications Database of reference used to generate predicted Pathway/Genome DataBases (PGDBs)

The Content of MetaCyc  Pathways from primary and secondary (specialized) metabolism Reactions with compound structures  Proteins  Genes Mostly from microorganism and plant kingdoms, and several animal pathways Does not contain sequence information

Curation Team  PhD-level curators  Extract data from literature o NLY experimentally verified data ideally protein information  Follow MetaCyc Curator’s guide

MetaCyc Version 10.5 to be released in August, 2006 Note: The statistics for each year pertain to the last MetaCyc version released in that year. Note: The statistics for each year pertain to the last MetaCyc version released in that year

MetaCyc – Browse the Database Explore class hierarchy of pathways, compounds, reactions, genes and cell components

MetaCyc – Browse the Database (cont’d) Explore class hierarchy of pathways, compounds, reactions, genes and cell components

MetaCyc – Explore the Metabolic Universe Query the database (pathways, reactions, compounds, genes)

MetaCyc – Query Page Type your search term and click submit Luteolin

MetaCyc – Query Result Page

MetaCyc – Pathway Detail Page Part I The pathway diagram shows compounds, reactions and metabolic links

MetaCyc – Pathway Detail Page Part II Pathway commentary comprises general and specific information about the pathway

MetaCyc – More Detail Extend or collapse the detail level of the pathway detail page

MetaCyc – More Detail (cont’d) In depth information about reaction, EC number, enzymes, genes, regulatory aspects, and metabolic links to related pathways

MetaCyc – More Detail (cont’d) More detail reveals structural information about compounds

MetaCyc – Reaction Detail Page

MetaCyc – Enzymes and Genes Contains enzyme commentary, references, and physico-chemical properties of the enzyme

MetaCyc – Enzymes and Genes (cont’d)

MetaCyc – Evidence Codes Intuitive icons Pathway Level Evidence codes provide assessment of data quality, i.e. the affirmation for the existence of an pathway Evidence codes provide assessment of data quality, i.e. the affirmation for the catalytic activity of an enzyme Enzyme Level

MetaCyc – Evidence Codes (cont’d)

Variants, Related Pathways and Links  Pathway variants are created as separate pathways  IAA biosynthesis I (tryptophane-dependent) IAA biosynthesis II (tryptophane-independent)  Links are added between interconnected pathways  Related pathways are grouped into superpathways  e.g.superpathway of choline biosynthesis

Creation of a Superpathway L-serine ethanolamine phosphoryl-ethanolamine N-methylethanolamine phosphate N-dimethylethanolamine phosphate phosphoryl-choline choline biosynthesis III Choline biosynthesis II ethanolamine phosphoryl-choline CDP-choline-choline a phosphatidylcholine choline biosynthesis IIIcholine biosynthesis IIcholine biosynthesis I N-monomethylethanolamine N-dimethylethanolamine choline Superpathway choline biosynthesis

Applications: Pathways Prediction Goal Universal repository of metabolic pathways  Up-to-date, literature-curated catalogue of commented enzymes and pathways for use in research, metabolic engineering and education Applications Database of reference used to generate predicted Pathway/Genome DataBases (PGDBs)

Pathway Tools Software Suite  Software for generating, curating, querying, displaying PGDBs  Developed by Peter Karp and team PathoLogic – Infers pathways from genome or transcripts sequencing Pathway/Genome Editors – Curation interface Pathway/Genome Navigator – Query, visualization, analysis and Web publishing OMICS Viewer

The PGDB Family Annotated Genome Arabidopsis thaliana PathoLogic Software Reference Pathway Database (MetaCyc) Reactions Pathways compounds Gene products genes Pathway/Genome Database (AraCyc)

AraCyc: The Arabidopsis thaliana specific metabolic database

The Computational Build of AraCyc  In 2004, the Arabidopsis genome contained 7900 genes annotated to the GO term ‘catalytic activity’  4900 enzymes in small molecule metabolism (19% of the total genome)  PathoLogic inferred 219 pathways and mapped 940 (19% enzyme-coding) genes to the pathways

Cleaning of a Newborn Database  PathoLogic errs on the side of over-prediction  First round of curation to remove false-positives  Add missing pathways  Improve the quality of information  Introduce new pathways  Increase number of pathway and protein comments  Refine computational assignment of protein

Pathway Validation Criteria  A pathway that is described in the Arabidopsis literature  A pathway whose critical metabolites are described in the Arabidopsis literature  A pathway that contains unique reactions and having genes assigned to those unique reactions

Validation Procedure  Delete non-plant pathways: Pathway variants of bacteria-origin Pathways not operating in plants at all (e.g. glycogen biosynthesis)  Add new plant-specific pathways: Pathway variants of plant-origin Plant-specific metabolites (e.g. plant hormones) Plant-specific metabolism (e.g. xanthophyll cycle)

AraCyc - Curation Progress AraCyc (2.1) Apr 2005 AraCyc (2.5) Oct 2005 AraCyc (2.6) Feb 2006 Total pathways New-3735 Updated-04 Deleted-616 Pathways manually reviewed with literature evidence 71 (32%)170 (86%)201 (88%) Additional 66 new pathways were added to MetaCyc

How to Link to AraCyc From the TAIR home page click on the link to AraCyc pathways

AraCyc – The Home Page Browse pathways, enzymes, genes, compounds Display of the Arabidopsis metabolic network Paint data from high-throughput experiments on the metabolic map User submission form

AraCyc – All the Help you can get

AraCyc’s Content

Computationally Predicted Enzymes

Experimentally Verified Enzymes

AraCyc: Pathway Detail Page (cont’d)

AraCyc: OmicsViewer

OMICS Viewer  Part of the Pathway Tools Software Suite  Displays bird-eye view of the Metabolic Overview diagram  Allows to paint data values from the user's high- throughput onto the Metabolic Overview diagram  Microarray Expression Data  Proteomics Data  Metabolomics Data

OmicsViewer Submission Page Step 1 Step 2 Load sample file and provide information about your data Sample data file (text tab-delimited)

OmicsViewer Submission Page (cont’d) Step 3 Step 6 Step 4 Step 5 Choose relative or absolute values Check the box if you have log values or negative fold change numbers Choose to display a single or multiple step experiment Select the type of data you want to display (refers to your loading file)

OmicsViewer Submission Page (cont’d) Step 7 For single or multiple time points add the corresponding column number(s)

Step 8 Step 9 OmicsViewer Submission Page (cont’d) Choose your cutoff to visualize your expression values

The Omics Viewer Result Page reactions (lines) are color-coded according to the gene expression level compounds (icons) are color-coded according to the concentration of compounds

The Omics Viewer Result Page (cont’d) The statistics for the expression map (single time points only) is provided at the bottom of the page

Saving Results

Curation Flow

Acknowledgements  TAIR - Sue Rhee (PI) - Peifen Zhang (curator) - Christophe Tissier (curator) - Hartmut Foerster (curator) - Mary Montoya (software developer) - Joe Filla (sysadmin)  SRI - Peter Karp (PI) - Ron Caspi (curator) - Carol Fulcher (curator) - Suzanne Paley (software developer) - Pallavi Kaipa (programmer) - Markus Krummenacker (programmer) - Mario Latendresse (programmer) NIH,NSF,Pioneer Hi-Bred  Previous contributers - Lukas Müller - Aleksey Kleytman (curator assistant – TAIR) - Thomas Yan (programmer – TAIR) - John Pick (software developer - SRI)