Pathway Tools User Group Meeting Introduction Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.

Slides:



Advertisements
Similar presentations
How pathway databases were created and curated Peifen Zhang Plant Metabolic Network (PMN)
Advertisements

SRI International Bioinformatics Comparative Analysis Q
Overview of the Pathway Tools Software and Pathway/Genome Databases.
Overview of the Pathway Tools Software and Pathway/Genome Databases.
SRI International Bioinformatics 1 The consistency Checker, or Overhauling a PGDB By Ron Caspi.
Curation of the EcoCyc Database: The EcoCyc Update Project Martha Arnaud Scientific Database Curator Bioinformatics Research Group SRI International
Overview of Genome Databases Peter D. Karp, Ph.D. SRI International www-db.stanford.edu/dbseminar/seminar.html.
New Developments in the Pathway Tools Software and EcoCyc Database Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International
Contents of this Talk [Used as intro to Genome Databases Seminar, 2002] Overview of bioinformatics Motivations for genome databases Analogy of virus reverse-eng.
Bioinformatics for biomedicine Summary and conclusions. Further analysis of a favorite gene Lecture 8, Per Kraulis
The EcoCyc and MetaCyc Pathway/Genome Databases
Interoperation of Molecular Biology Databases Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International Menlo Park, CA
Overview of the Pathway Tools Software and Pathway/Genome Databases.
Introduction to the Pathway Tools Software David Walsh and Simon Eng bigDATA Workshop—May 29, 2010.
Sequence-Structure-Function Sequence Structure Function Threading Ab initio BLAST Folding: impossible but for the smallest structures Function prediction.
陳虹瑋 國立陽明大學 生物資訊學程 Genome Engineering Lab. Genome Engineering Lab The Newest.
Pathway/Genome Databases and Software Tools Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International
Update on The Pathway Tools Software Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org MetaCyc.org.
CalbiCyc, Metabolic Pathways at the Candida Genome Database Martha Arnaud
Creating a … Community Database Organism-Specific Database Model-Organism Database.
Computational Exploration of Metabolic Networks with Pathway Tools Part 1: Overview & Representations Suzanne Paley Bioinformatics Research Group SRI International.
Genome database & information system for Daphnia Don Gilbert, October 2002 Talk doc at
Integration of E. Coli Data (E. coli Pathway and Genomic Data from BioCyc) Jesse Walsh.
1 SRI International Bioinformatics Introduction to Lisp Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International
1 SRI International Bioinformatics BioCyc Tutorial Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org,
1 SRI International Bioinformatics The BioCyc Ontologies Markus Krummenacker Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org,
1 SRI International Bioinformatics The Pathway Tools Software and BioCyc Database Collection Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International.
SRI International Bioinformatics 1 Pathway Tools: Recent Developments GMOD Meeting, June 2006.
Overviews, Omics Viewers, and Object Groups. SRI International Bioinformatics Introduction Each overview is a genome-scale diagram of cellular machinery.
Computational Exploration of Metabolic Networks with Pathway Tools Part 2: APIs & Examples Randy Gobbel, Ph.D. Bioinformatics Research Group SRI International.
CPS120: Introduction to Computer Science The World Wide Web Nell Dale John Lewis.
1 SRI International Bioinformatics EcoCyc, MetaCyc, and the Pathway Tools Software Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International.
Data Content of the BioCyc Databases. BioCyc Tier 1 Databases.
The aims of the Gene Ontology project are threefold: - to compile vocabularies to describe components, functions and processes - to produce tools to query.
Copyright OpenHelix. No use or reproduction without express written consent1.
Copyright OpenHelix. No use or reproduction without express written consent1.
The Pathway Tools Ontology and Inferencing Layer Peter D. Karp, Ph.D. SRI International.
TAIR/Gramene/SGN Workshop I ASPB Meeting July 08, 2007 Chicago, IL Metabolic Databases.
The BioCyc Collection of Pathway/Genome Databases Alexander Shearer Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.
SRI International Bioinformatics 1 Recent Developments in Pathway Tools GMOD Workshop November ‘07 Suzanne Paley Bioinformatics Research Group SRI International.
SRI International Bioinformatics 1 Advanced Editing of Pathway/Genome Databases Ron Caspi.
MetaCyc and AraCyc: Plant Metabolic Databases Hartmut Foerster Carnegie Institution.
1 SRI International Bioinformatics GO Term Integration and Curation in Pathway Tools and EcoCyc Ingrid M. Keseler Bioinformatics Research Group SRI International.
Top Four Essential TAIR Resources Debbie Alexander Metabolic Pathway Databases for Arabidopsis and Other Plants Peifen Zhang.
SRI International Bioinformatics 1 Submitting pathway to MetaCyc Ron Caspi.
1 SRI International Bioinformatics And now for our ‘Feature’ presentation: Automatic Loading of Protein Sequence Annotation Data from UniProt to Pathway.
SRI International Bioinformatics 1 SmartTables & Enrichment Analysis Peter Karp SRI Bioinformatics Research Group September 2015.
© 2014 SRI International About OMICS Group OMICS Group International is an amalgamation of Open Access publications and worldwide international science.
Writing Programs that Analyze Pathway/Genome Databases Markus Krummenacker Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.
SRI International Bioinformatics Update your computers! To install a patch: Tools => Instant Patch => Download and Activate All Patches.
SRI International Bioinformatics 1 Editing Pathway/Genome Databases Ron Caspi.
Building and Refining AraCyc: Data Content, Sources, and Methodologies Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science.
1 AraCyc Metabolic Pathway Annotation. 2 AraCyc – An overview  AraCyc is a metabolic pathway database for Arabidopsis thaliana;  Computational prediction.
Copyright OpenHelix. No use or reproduction without express written consent1 1.
SRI International Bioinformatics 1 Pathway Tools Features Available Only in the Desktop Version PathoLogic.
SRI International Bioinformatics 1 The Structured Advanced Query Page Tomer Altman Mario Latendresse Bioinformatics Research Group SRI International April.
Recent Developments and Future Directions in Pathway Tools Peter D. Karp SRI International.
` Comparison of Gene Ontology Term Annotations Between E.coli K12 Databases REDDYSAILAJA MARPURI WESTERN KENTUCKY UNIVERSITY.
Sequence-Structure-Function Sequence Structure Function Threading Ab initio BLAST Folding: impossible but for the smallest structures Function prediction.
Annotating with GO: an overview
Why Create a PGDB? Perform pathway analyses as part of a genome project Analyze omics data Create a central public information resource for the organism,
An Advanced Web Query Interface for Biological Databases
The Pathway Tools FBA Module
The Pathway Tools Schema
Bioinformatics Capstone Project
The Pathway Tools Software and BioCyc Database Collection
A Community Effort to Model the Human Microbiome
Overview of Microbial Pathway and Genome Databases
Overview of the Pathway Tools FBA Module
Overview of the Pathway Tools Software and Pathway/Genome Databases
Presentation transcript:

Pathway Tools User Group Meeting Introduction Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org MetaCyc.org HumanCyc.org

SRI International Bioinformatics Overview Goals of meeting Terminology Pathway Tools and BioCyc – The Big Picture Updates to EcoCyc and MetaCyc More information Optional: Speakers contribute talks to web site

SRI International Bioinformatics Meeting Goals Share experiences on how to make optimal use of Pathway Tools and BioCyc What new add-on tools are people developing that others might want to use? Coordinate future software development by SRI and other groups l What software enhancements are needed? l Example: New inference modules – GO terms, cell location Give us feedback on how we can better serve you

SRI International Bioinformatics Terminology Databases vs Software xCyc’s vs Pathway Tools

SRI International Bioinformatics BioCyc Collection of Pathway/Genome Databases Pathway/Genome Database (PGDB) – combines information about l Pathways, reactions, substrates l Enzymes, transporters l Genes, replicons l Transcription factors/sites, promoters, operons Tier 1: Literature-Derived PGDBs l MetaCyc l EcoCyc -- Escherichia coli K-12 l BioCyc Open Chemical Database Tier 2: Computationally-derived DBs, Some Curation PGDBs l HumanCyc l Mycobacterium tuberculosis Tier 3: Computationally-derived DBs, No Curation DBs

SRI International Bioinformatics Terminology – Pathway Tools Software PathoLogic l Predicts operons, metabolic network, pathway hole fillers, from genome l Computational creation of new Pathway/Genome Databases Pathway/Genome Editors l Distributed curation of PGDBs l Distributed object database system, interactive editing tools Pathway/Genome Navigator l WWW publishing of PGDBs l Querying, visualization of pathways, chromosomes, operons l Analysis operations u Pathway visualization of gene-expression data u Global comparisons of metabolic networks Bioinformatics 18:S

SRI International Bioinformatics BioCyc Tier PGDBs l 130 prokaryotic PGDBs created by SRI u Source: CMR database l 15 prokaryotic and eukaryotic PGDBs created by EBI u Source: UniProt Automated processing by PathoLogic l Pathway prediction l Operon prediction (bacteria) l Pathway hole filler predictions All PGDBs available for adoption

SRI International Bioinformatics Family of Pathway/Genome Databases MetaCyc EcoCyc CauloCyc AraCyc MtbRvCyc HumanCyc

SRI International Bioinformatics Pathway/Genome DBs Created by External Users More than 500 licensees of Pathway Tools 50 groups applying the software to more than 80 organisms Software freely available to academics; Each PGDB owned by its creator Saccharomyces cerevisiae, SGD project, Stanford University l pathway.yeastgenome.org/biocyc / TAIR, Carnegie Institution of Washington Arabidopsis.org:1555 dictyBase, Northwestern University GrameneDB, Cold Spring Harbor Laboratory Planned: l CGD (Candida albicans), Stanford University l MGD (Mouse), Jackson Laboratory l RGD (Rat), Medical College of Wisconsin l WormBase ( C. elegans ), Caltech DOE Genomes to Life contractors: l G. Church, Harvard, Prochlorococcus marinus MED4 l E. Kolker, BIATECH, Shewanella onedensis l J. Keasling, UC Berkeley, Desulfovibrio vulgaris Plasmodium falciparum, Stanford University l plasmocyc.stanford.edu Fiona Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa Methanococcus janaschii, EBI maine.ebi.ac.uk:1555

SRI International Bioinformatics EcoCyc Project – EcoCyc.org E. coli Encyclopedia l Model-Organism Database for E. coli l Computational symbolic theory of E. coli l Electronic review article for E. coli u 10,500 literature citations u 3600 protein comments l Tracks the evolving annotation of the E. coli genome l Resource for microbial genome annotation Collaborative development via Internet l John Ingraham (UC Davis) l Paulsen (TIGR) – Transport, flagella, DNA repair l Collado (UNAM) -- Regulation of gene expression l Keseler, Shearer (SRI) -- Metabolic pathways, cell division, proteases l Karp (SRI) -- Bioinformatics Nuc. Acids. Res. 33:D ASM News 70: Science 293:2040

SRI International Bioinformatics Comments in Proteins, Pathways, Operons, etc.

SRI International Bioinformatics EcoCyc Accelerates Science Experimentalists l E. coli experimentalists l Experimentalists working with other microbes l Analysis of expression data Computational biologists l Biological research using computational methods l Genome annotation l Study connectivity of E. coli metabolic network l Study organization of E. coli metabolic enzymes into structural protein families l Study phylogentic extent of metabolic pathways and enzymes in all domains of life Bioinformaticists l Training and validation of new bioinformatics algorithms – predict operons, promoters, protein functional linkages, protein-protein interactions, Metabolic engineers l “Design of organisms for the production of organic acids, amino acids, ethanol, hydrogen, and solvents “ Educators

SRI International Bioinformatics MetaCyc: Metabolic Encyclopedia Nonredundant metabolic pathway database Describe a representative sample of every experimentally determined metabolic pathway Literature-based DB with extensive references and commentary Pathways, reactions, enzymes, substrates Jointly developed by SRI and Carnegie Institution Nucleic Acids Research 32:D

SRI International Bioinformatics MetaCyc Curation DB updates by 5 staff curators l Information gathered from biomedical literature l Emphasis on microbial and plant pathways l More prevalent pathways given higher priority l Curator’s Guide lists curation conventions Review-level database Four releases per year Quality assurance of data and software: l Evaluate database consistency constraints l Perform element balancing of reactions l Run other checking programs l Display every DB object

SRI International Bioinformatics MetaCyc Curation Ontologies guide querying l Pathways (recently revised), compounds, enzymatic reactions l Example: Coenzyme M biosynthesis Extensive citations and commentary Evidence codes l Controlled vocabulary of evidence types l Attach to pathways and enzymes: u Code : Citation : Curator : date Release notes explain recent updates l

SRI International Bioinformatics MetaCyc Data

SRI International Bioinformatics MetaCyc Pathway Variants Pathways that accomplish similar biochemical functions using different biochemical routes l Alanine biosynthesis I – E. coli l Alanine biosynthesis II – H. sapiens Pathways that accomplish similar biochemical functions using similar sets of reactions l Several variants of TCA Cycle

SRI International Bioinformatics MetaCyc Super-Pathways Groups of pathways linked by common substrates Example: Super-pathway containing l Chorismate biosynthesis l Tryptophan biosynthesis l Phenylalanine biosynthesis l Tyrosine biosynthesis Super-pathways defined by listing their component pathways Multiple levels of super-pathways can be defined Pathway layout algorithms accommodate super-pathways

SRI International Bioinformatics More Information 200+ pages of documentation available: User’s Guide, Schema Guide, Curator’s Guide Pathway Tools source code available Active community of contributors Read the release notes!

SRI International Bioinformatics Behind the Scenes 330,000 lines of code, mostly Common Lisp 4.5 programmers Extensive QA on each release Bug tracking using Bugzilla

SRI International Bioinformatics The Common Lisp Programming Environment Gatt studied Lisp and Java implementation of 16 programs by 14 programmers (Intelligence 11: )

SRI International Bioinformatics Peter Norvig’s Solution “I wrote my version in Lisp. It took me about 2 hours (compared to a range of hours for the other Lisp programmers in the study, 3-25 for C/C++ and 4-63 for Java) and I ended up with 45 non-comment non-blank lines (compared with a range of for Lisp, and for the other languages). (That means that some Java programmer was spending 13 lines and 84 minutes to provide the functionality of each line of my Lisp program.)”

SRI International Bioinformatics Common Lisp Programming Environment General-purpose language, not just for recursive or functional programming Interpreted and/or compiled execution Fabulous debugging environment High-level language Interactive data exploration Extensive built-in libraries Dynamic redefinition Find out more! l See ALU.org or l

SRI International Bioinformatics Pathway Tools WWW Server

SRI International Bioinformatics Summary Pathway/Genome Databases l MetaCyc non-redundant DB of literature-derived pathways l 165 organism-specific PGDBs available through SRI at BioCyc.org l Computational theories of biochemical machinery Pathway Tools software l Extract pathways from genomes l Morph annotated genome into structured ontology l Distributed curation tools for MODs l Query, visualization, WWW publishing

SRI International Bioinformatics BioCyc and Pathway Tools Availability WWW BioCyc freely available to all l BioCyc.org BioCyc DBs freely available to non-profits l Flatfiles downloadable from BioCyc.org Pathway Tools freely available to non-profits l PC/Windows, PC/Linux, SUN

SRI International Bioinformatics Acknowledgements SRI l Suzanne Paley, Michelle Green, Ron Caspi, Ingrid Keseler, John Pick, Carol Fulcher, Markus Krummenacker, Alex Shearer EcoCyc Project Collaborators l Julio Collado-Vides, John Ingraham, Ian Paulsen MetaCyc Project Collaborators l Sue Rhee, Peifen Zhang, Hartmut Foerster And l Harley McAdams Funding sources: l NIH National Center for Research Resources l NIH National Institute of General Medical Sciences l NIH National Human Genome Research Institute l Department of Energy Microbial Cell Project l DARPA BioSpice, UPC BioCyc.org