Writing Programs that Analyze Pathway/Genome Databases Markus Krummenacker Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.

Slides:



Advertisements
Similar presentations
SRI International Bioinformatics 1 Navigation to Related Objects Bioinformatics Research Group SRI International Mario Latendresse.
Advertisements

Editing Pathway/Genome Databases. SRI International Bioinformatics Pathway Tools Paradigm Separate database from user interface Navigator provides one.
1 SRI International Bioinformatics The Ocelot Frame Knowledge Representation System Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International.
Computing with Pathway/Genome Databases
SRI International Bioinformatics 1 Web Services. SRI International Bioinformatics 2 Kinds of Web Services Data retrieval Web Services l PTools-XML l BioPAX.
SRI International Bioinformatics Data Import / Export Markus Krummenacker Bioinformatics Research Group SRI, International Q
SRI International Bioinformatics Comparative Analysis Q
SRI International Bioinformatics 1 Genome Browser Markus Krummenacker Bioinformatics Research Group SRI, International Q
SRI International Bioinformatics 1 Orthology-Based Multi-PGDB Curation Tools Suzanne Paley Pathway Tools Workshop 2010.
Pathway Bioinformatics Peter D. Karp, PhD Bioinformatics Research Group SRI International Menlo Park, CA BioCyc.org.
SRI International Bioinformatics 1 The consistency Checker, or Overhauling a PGDB By Ron Caspi.
Curation of the EcoCyc Database: The EcoCyc Update Project Martha Arnaud Scientific Database Curator Bioinformatics Research Group SRI International
The Pathway Tools Schema. SRI International Bioinformatics Motivations for Understanding Schema Pathway Tools visualizations and analyses depend upon.
New Developments in the Pathway Tools Software and EcoCyc Database Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International
The EcoCyc and MetaCyc Pathway/Genome Databases
Interoperation of Molecular Biology Databases Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International Menlo Park, CA
Overview of the Pathway Tools Software and Pathway/Genome Databases.
Pathway Tools User Group Meeting Introduction Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.
Pathway/Genome Databases and Software Tools Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International
Update on The Pathway Tools Software Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org MetaCyc.org.
Pathway Bioinformatics (2) Peter D. Karp, PhD Bioinformatics Research Group SRI International Menlo Park, CA BioCyc.org.
Creating a … Community Database Organism-Specific Database Model-Organism Database.
Computational Exploration of Metabolic Networks with Pathway Tools Part 1: Overview & Representations Suzanne Paley Bioinformatics Research Group SRI International.
Integration of E. Coli Data (E. coli Pathway and Genomic Data from BioCyc) Jesse Walsh.
1 SRI International Bioinformatics BioCyc Tutorial Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org,
1 SRI International Bioinformatics The BioCyc Ontologies Markus Krummenacker Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org,
1 SRI International Bioinformatics The Pathway Tools Software and BioCyc Database Collection Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International.
SRI International Bioinformatics 1 Pathway Tools: Recent Developments GMOD Meeting, June 2006.
Computational Exploration of Metabolic Networks with Pathway Tools Part 2: APIs & Examples Randy Gobbel, Ph.D. Bioinformatics Research Group SRI International.
1 SRI International Bioinformatics EcoCyc, MetaCyc, and the Pathway Tools Software Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International.
Data Content of the BioCyc Databases. BioCyc Tier 1 Databases.
The Pathway Tools Ontology and Inferencing Layer Peter D. Karp, Ph.D. SRI International.
TAIR/Gramene/SGN Workshop I ASPB Meeting July 08, 2007 Chicago, IL Metabolic Databases.
SRI International Bioinformatics 1 The PerlCyc and JavaCyc APIs.
The BioCyc Collection of Pathway/Genome Databases Alexander Shearer Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org.
SRI International Bioinformatics 1 Recent Developments in Pathway Tools GMOD Workshop November ‘07 Suzanne Paley Bioinformatics Research Group SRI International.
SRI International Bioinformatics 1 Computing with Pathway/Genome Databases.
SRI International Bioinformatics 1 Advanced Editing of Pathway/Genome Databases Ron Caspi.
SRI International Bioinformatics 1 Object Groups & Enrichment Analysis Suzanne Paley Pathway Tools Workshop 2010.
MetaCyc and AraCyc: Plant Metabolic Databases Hartmut Foerster Carnegie Institution.
1 SRI International Bioinformatics GO Term Integration and Curation in Pathway Tools and EcoCyc Ingrid M. Keseler Bioinformatics Research Group SRI International.
SRI International Bioinformatics 1 Submitting pathway to MetaCyc Ron Caspi.
1 SRI International Bioinformatics And now for our ‘Feature’ presentation: Automatic Loading of Protein Sequence Annotation Data from UniProt to Pathway.
The Pathway Tools Schema. SRI International Bioinformatics Motivations for Understanding Schema Pathway Tools visualizations and analyses depend upon.
SRI International Bioinformatics 1 SmartTables & Enrichment Analysis Peter Karp SRI Bioinformatics Research Group September 2015.
© 2014 SRI International About OMICS Group OMICS Group International is an amalgamation of Open Access publications and worldwide international science.
SRI International Bioinformatics 1 Computing with Pathway/Genome Databases.
Overview of the Pathway Tools Software and Pathway/Genome Databases Peter D. Karp Bioinformatics Research Group SRI International
SRI International Bioinformatics Update your computers! To install a patch: Tools => Instant Patch => Download and Activate All Patches.
SRI International Bioinformatics 1 Editing Pathway/Genome Databases Ron Caspi.
SRI International Bioinformatics 1 Pathway Tools Features Available Only in the Desktop Version PathoLogic.
SRI International Bioinformatics Selected PathoLogic Refining Tasks Creation of Protein Complexes Assignment of Modified Proteins Operon Prediction.
Recent Developments and Future Directions in Pathway Tools Peter D. Karp SRI International.
PythonCyc and other APIs A Python package to access Pathway Tools and its data using the Python programming language Mario Latendresse March 2016.
SRI International Bioinformatics 1 Computing with Pathway/Genome Databases.
Constructing Complex Queries in Pathway Tools using Emacs, Lisp, and Perl Randy Gobbel, Ph.D. May 14, 2003
Editing Pathway/Genome Databases
Comparative Analysis in BioCyc
Why Create a PGDB? Perform pathway analyses as part of a genome project Analyze omics data Create a central public information resource for the organism,
An Advanced Web Query Interface for Biological Databases
The Pathway Tools Schema
How to Administer a PGDB
Bioinformatics Research Group
The Pathway Tools Software and BioCyc Database Collection
Data Exchange Java API and Perl API : read & modify
A Community Effort to Model the Human Microbiome
Maintaining the EcoCyc
Comparative Analysis Q
SRI Bioinformatics Research Group
Overview of the Pathway Tools Software and Pathway/Genome Databases
Presentation transcript:

Writing Programs that Analyze Pathway/Genome Databases Markus Krummenacker Bioinformatics Research Group SRI International BioCyc.org EcoCyc.org MetaCyc.org HumanCyc.org

SRI International Bioinformatics Recent Projects Genome Browser, including Comparative Mode Horizontal: uses screen real estate better Semantic zooming: increasing levels of details Author-Crediting System For Pathways & Enzymes Types of Credit: Created, Reviewed, Revised Home Pages for Authors and Organizations EcoCyc Computability Steven Eker: Minimal Nutrient Prediction Palsson Group: Reconciliation of Models Compound Class Representation

SRI International Bioinformatics Data Exchange Java API and Perl API : read & modify BioPAX Export: since Pathway Tools 9.0 Biopax.org Export of entire PGDB as Flatfiles Export of Reactions as SBML -- sbml.org Import/Export of Pathways: between PGDBs Import/Export of Selected Frames, for Spreadsheets Import/Export of Compounds as Molfile, CML Registering/Publishing PGDBs on WWW BioWarehouse : Loader for Flatfiles, SQL access

SRI International Bioinformatics Programmatic Access to BioCyc Common LISP Native language of Pathway Tools Interactive & Mature Environment Full Access to the Data & Many Utility Functions Source code is available for academics PerlCyc API of Functions, Exposed to Perl Communication through UNIX Socket JavaCyc API of Functions, Exposed to Java Communication through UNIX Socket

SRI International Bioinformatics BioCyc Schema Basics Object-Oriented, Class & Instance Frames Frame Representation System, named Ocelot Query API: GFP (Generic Frame Protocol) one KB per PGDB, persistently stored as: Preloaded into Runtimes Ocelot file: single-user RDBMS: MySQL-4 or Oracle-10 : multi-user, change-logging Frames have named Slots Slots Single or Multiple Values Numbers, Strings, or Pointers to other Frames (pot. w/ Inverse) Slot Units define the Properties of a Slot Annotations on Slot Values used for Stoichiometry Coefficients, Compartments, Citations, etc.

SRI International Bioinformatics Principal Classes 1 Class names are capitalized, plural, separated by dashes Genetic-Elements, with subclasses: Chromosomes Plasmids Genes Transcription-Units RNAs rRNAs, snRNAs, tRNAs, Charged-tRNAs Proteins, with subclasses: Polypeptides Protein-Complexes

SRI International Bioinformatics Principal Classes 2 Reactions, with subclasses: Transport-Reactions Enzymatic-Reactions Pathways Compounds-And-Elements

SRI International Bioinformatics Slot Links Sdh-flavoSdh-Fe-SSdh-membrane-1Sdh-membrane-2 sdhA sdhB sdhCsdhD Succinate + FAD = fumarate + FADH 2 Enzymatic-reaction Succinate dehydrogenase TCA Cycle product component-of catalyzes reaction in-pathway

SRI International Bioinformatics Example of a Single GFP Call The General Pattern: (gfp-call frame-ID slot-ID value...) LISP (get-slot-values 'TRYPSYN-RXN 'LEFT) ==> (INDOLE-3-GLYCEROL-P SER) Perl my $cyc = perlcyc -> new("ECOLI"); = $cyc -> get_slot_all_values("Trypsyn-Rxn", “Left”);

SRI International Bioinformatics More Information Pathway Tools WWW Site, Tutorial Slides PerlCyc & JavaCyc API, includes some relationships genopath/9.5/lisp/relationships.lisp Pathway Tools User’s Guide, Volume I Appendix A: Guide to the Pathway Tools Schema Curator's Guide: Compound Classes, Polymerization

SRI International Bioinformatics Simple Query Example Perl use perlcyc; my $cyc = perlcyc -> new("ECOLI"); = $cyc -> get_class_all_instances("|Enzymatic-Reactions|"); ## We check every instance of the class foreach my $er { ## We test for whether the INHIBITORS-ALL ## slot contains the compound frame ATP my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp"); if ($bool) { ## Whenever the test is positive, we collect the value of the slot ENZYME. ## The results are printed in the terminal. my $enz = $cyc -> get_slot_value($er, "Enzyme"); print STDOUT "$enz\n"; }

SRI International Bioinformatics Simple Query Example LISP (defun atp-inhibits () ;; We check every instance of the class (loop for x in (get-class-all-instances '|Enzymatic-Reactions|) ;; We test for whether the INHIBITORS-ALL ;; slot contains the compound frame ATP when (member-slot-value-p x 'INHIBITORS-ALL 'ATP) ;; Whenever the test is positive, we collect the value of the slot ENZYME. ;; The collected values are returned as a list, once the loop terminates. collect (get-slot-value x 'ENZYME) ) ) ;;; invoking the query: (select-organism :org-id 'ECOLI) (atp-inhibits) (get-slot-values 'TRYPSYN-RXN 'LEFT) ==> (INDOLE-3-GLYCEROL-P SER)

SRI International Bioinformatics relationships.lisp File contains many useful relations. They encapsulate common query building blocks and intricacies of navigating the schema. enzymes-of-gene reactions-of-gene pathways-of-gene genes-of-pathway pathway-hole-p reactions-of-compound cpd :non-specific-too? top-containers protein all-rxns type (:metab-smm :metab-all :metab-pathways :enzyme :transport etc.)

SRI International Bioinformatics Chokepoint Example For Antibiotic Target Development Find Strategic Essential Weak Links in Metabolism Many Compounds have just 1 Producing and consuming reaction (defun chokepoint-1 () (remove-duplicates (loop for cpd in (remove-if-not #'coercible-to-frame-p (all-substrates (all-rxns))) when (= 1 (length (get-slot-values cpd 'APPEARS-IN-LEFT-SIDE-OF)) (length (get-slot-values cpd 'APPEARS-IN-RIGHT-SIDE-OF))) collect (get-slot-value cpd 'APPEARS-IN-LEFT-SIDE-OF) and collect (get-slot-value cpd 'APPEARS-IN-RIGHT-SIDE-OF) ) :test #'fequal) ) ;;; invoking the query: (length (chokepoint-1)) ==> 348

SRI International Bioinformatics Substring Search Example Find all that genes that contain a given substring within their common name or synonym list. (defun find-gene-by-substring (substring) (let (result) (loop for g in (get-class-all-instances '|Genes|) do (loop for name in (get-slot-values g 'names) when (search substring name :test #'string-equal) do (pushnew g result) ) ) result ) )

SRI International Bioinformatics Acknowledgements SRI Peter Karp, Suzanne Paley, John Pick, Ingrid Keseler, Ron Caspi, Michelle Green, Carol Fulcher EcoCyc Project J. Collado-Vides, J. Ingraham, I. Paulsen, M. Saier MetaCyc Project Sue Rhee, Lukas Mueller, Peifen Zhang, Hartmut Foerster Funding sources: NIH National Center for Research Resources NIH National Institute of General Medical Sciences NIH National Human Genome Research Institute Department of Energy Microbial Cell Project DARPA BioSpice, UPC BioCyc.org