The Pathway Tools Schema

SRI International Bioinformatics

Motivations for Understanding Schema
- Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB
- When writing complex queries to PGDBs, those queries must name classes and slots within the schema
- A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity

Reference
- Pathway Tools User’s Guide, Volume I
  - Appendix A: Guide to the Pathway Tools Schema

Web of Relationships for One Enzyme
[Diagram: genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; complex Succinate dehydrogenase; Enzymatic-reaction; reaction Succinate + FAD = fumarate + FADH2; pathway TCA Cycle]

Frame Data Model
- Frame Data Model: organizational structure for a PGDB
- Knowledge base (KB, Database, DB)
- Frames
- Slots
- Facets
- Annotations

Knowledge Base
- Collection of frames and their associated slots, values, facets, and annotations
- AKA: Database, PGDB
- Can be stored within
  - An Oracle DB
  - A disk file
  - A Pathway Tools binary program

Frames
- Entities with which facts are associated
- Kinds of frames:
  - Classes: Genes, Pathways, Biosynthetic Pathways
  - Instances (objects): trpA, TCA cycle
- Classes:
  - Superclass(es)
  - Subclass(es)
  - Instance(s)
- A symbolic frame name (id, key) uniquely identifies each frame

Frame IDs
- Naming conventions for frame IDs
- Uniqueness of frame IDs
  - Frame IDs must be unique within a PGDB
  - Goal: Same frame ID within different PGDBs should refer to the same biological entity
  - Because many frames are imported from MetaCyc, this helps ensure consistency of frame names
  - Frame IDs for newly created frames (not imported) are generated by Pathway Tools
    - Those frame IDs contain a PGDB-specific identifier
    - Example: the pattern CPLXzz-nnnn, as in CPLXB3-0035

Slots
- Encode attributes/properties of a frame
  - Integer, real number, string, symbols
- Represent relationships between frames
  - The value of a slot is the identifier of another frame
- Every slot is described by a “slot frame” in a KB that defines meta information about that slot

Slot Links
[Diagram: genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; complex Succinate dehydrogenase; Enzymatic-reaction; reaction Succinate + FAD = fumarate + FADH2; pathway TCA Cycle. Links are labeled with the slots product, component-of, catalyzes, reaction, in-pathway]

Slots
- Number of values
  - Single valued
  - Multivalued: sets, bags
- Slot values
  - Any LISP object: integer, real, string, symbol (frame name)
- Slotunits define properties of slots: datatypes, classes, constraints
- Two slots are inverses if they encode opposite relationships
  - Slot Product in class Genes
  - Slot Gene in class Polypeptides
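The inverse-slot idea can be illustrated with a minimal sketch. This is not Ocelot or any Pathway Tools API: the dictionary-based `kb`, the `INVERSES` table, and `add_link` are hypothetical names used only for illustration. The slot names Product and Gene come from the slide, and the frame IDs are borrowed from the succinate dehydrogenase example.

```python
from collections import defaultdict

# Hypothetical in-memory KB: frame ID -> slot name -> list of values.
kb = defaultdict(lambda: defaultdict(list))

# Inverse slot pair named on the slide: Genes.Product <-> Polypeptides.Gene.
INVERSES = {"Product": "Gene", "Gene": "Product"}

def add_link(frame, slot, value):
    """Store value in frame.slot and mirror the link in the inverse slot, if any."""
    if value not in kb[frame][slot]:
        kb[frame][slot].append(value)
    inverse = INVERSES.get(slot)
    if inverse and frame not in kb[value][inverse]:
        kb[value][inverse].append(frame)

add_link("sdhA", "Product", "Sdh-flavo")
print(kb["Sdh-flavo"]["Gene"])  # the gene is reachable from its product
```

Writing one direction and deriving the other keeps the two slots from drifting apart, which is the practical point of declaring them inverses.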

Representation of Function
[Diagram: genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; complex Succinate dehydrogenase; Enzymatic-reaction; reaction Succinate + FAD = fumarate + FADH2; pathway TCA Cycle. Attribute labels shown: EC#, Keq, Cofactors, Inhibitors, Molecular wt, pI, Left-end-position]

Monofunctional Monomer
[Diagram: Gene, Monomer, Enzymatic-reaction, Reaction, Pathway]

Bifunctional Monomer
[Diagram: Gene, Monomer, two Enzymatic-reaction frames, two Reaction frames, Pathway]

Monofunctional Multimer
[Diagram: Gene, Monomer, Multimer, Enzymatic-reaction, Reaction, Pathway]
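The three cases above share one chain of links, from gene to product to enzymatic-reaction to reaction to pathway. The sketch below follows that chain over a toy dictionary. It is an illustration only: the slot names are taken from the Slot Links slide, while every frame ID and the helper `pathways_of_gene` are hypothetical, and only one level of complex membership is followed.

```python
# Hypothetical KB: frame ID -> slot name -> list of frame IDs.
kb = {
    "geneX":   {"Product": ["monoX"]},
    "monoX":   {"Component-Of": ["cplxX"]},            # multimer case
    "cplxX":   {"Catalyzes": ["enzrxn1", "enzrxn2"]},  # bifunctional case
    "enzrxn1": {"Reaction": ["rxn1"]},
    "enzrxn2": {"Reaction": ["rxn2"]},
    "rxn1":    {"In-Pathway": ["pwyA"]},
    "rxn2":    {"In-Pathway": ["pwyA", "pwyB"]},
}

def values(frame, slot):
    """Return all values of frame.slot, or an empty list if absent."""
    return kb.get(frame, {}).get(slot, [])

def pathways_of_gene(gene):
    """Collect the pathways reachable from a gene through its products."""
    pathways = set()
    for protein in values(gene, "Product"):
        # The monomer may catalyze directly, or through a complex containing it.
        enzymes = [protein] + values(protein, "Component-Of")
        for enzyme in enzymes:
            for enzrxn in values(enzyme, "Catalyzes"):
                for rxn in values(enzrxn, "Reaction"):
                    pathways.update(values(rxn, "In-Pathway"))
    return sorted(pathways)

print(pathways_of_gene("geneX"))
```

Because the enzymatic-reaction sits between the enzyme and the reaction, a bifunctional enzyme simply has two values in its Catalyzes slot and the traversal code does not change.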

Pathway and Substrates
[Diagram: Pathway; two Reaction frames; Reactant-1, Reactant-2, Product-1, Product-2. Links are labeled with the slots in-pathway, left, right]

Transcriptional Regulation
[Diagram labels: site001, pro001, trpL, trpE, trpD, trpC, trpB, trpA, trpLEDCBA, Int001, Int003, Int005, RpoSig70, apoTrpR, TrpR*trp, trp]

Annotations
- Encode information about individual slot values
- Used to attach comments and citations to slot values
- Example:
  - Frame tryptophan-synthetase has a slot called Molecular-Weight with a value of 28
  - Attached to that value is an annotation whose label is Citation and whose value is “[ ]”
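One way to picture an annotation is as a small label-to-value table hanging off a single slot value rather than off the slot as a whole. The sketch below models the slide's example that way. The `SlotValue` class and `get_value_annotation` are hypothetical names, not the Ocelot API, and the citation text is left as the empty placeholder that appears in the slide.

```python
from dataclasses import dataclass, field

@dataclass
class SlotValue:
    """One value of a slot, plus annotations keyed by label (hypothetical model)."""
    value: object
    annotations: dict = field(default_factory=dict)

# The frame tryptophan-synthetase with one Molecular-Weight value, as on the slide.
tryptophan_synthetase = {"Molecular-Weight": [SlotValue(28)]}
tryptophan_synthetase["Molecular-Weight"][0].annotations["Citation"] = "[ ]"

def get_value_annotation(frame, slot, value, label):
    """Return the annotation stored under label for one specific slot value."""
    for slot_value in frame.get(slot, []):
        if slot_value.value == value:
            return slot_value.annotations.get(label)
    return None

print(get_value_annotation(tryptophan_synthetase, "Molecular-Weight", 28, "Citation"))
```

A second Molecular-Weight value in the same slot could carry a different citation, which a slot-level comment could not express.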

Facets
- Encode information about slots
- Allow association between a slot and:
  - comments
  - citations
- Example: Comment attached to Inhibitors of EnzRxn
- Allow access to schema information

Principal Classes
- Class names are capitalized, plural, separated by dashes
- Genetic-Elements, with subclasses:
  - Chromosomes
  - Plasmids
- Genes
- Transcription-Units
- RNAs
  - rRNAs, snRNAs, tRNAs, Charged-tRNAs
- Proteins, with subclasses:
  - Polypeptides
  - Protein-Complexes

Principal Classes
- Reactions, with subclasses:
  - Transport-Reactions
- Enzymatic-Reactions
- Pathways
- Compounds-And-Elements

Slots in Multiple Classes
- Common-Name
- Synonyms
- Comment
- Citations
- DB-Links

Genes Slots
- Component-Of (links to replicon, transcription unit)
- Left-End-Position
- Right-End-Position
- Centisome-Position
- Transcription-Direction
- Product

Proteins Slots
- Molecular-Weight-Seq
- Molecular-Weight-Exp
- pI
- Locations
- Modified-Form
- Unmodified-Form
- Component-Of

Polypeptides Slots
- Gene

Protein-Complexes Slots
- Components

Reactions Slots
- EC-Number
- Left, Right
- DeltaG0
- Keq
- Spontaneous?

Enzymatic-Reactions Slots
- Enzyme
- Reaction
- Activators
- Inhibitors
- Physiologically-Relevant
- Cofactors
- Prosthetic-Groups
- Alternative-Substrates
- Alternative-Cofactors

Pathways Slots
- Reaction-List
- Predecessors
- Primaries

GKB Editor
- Browse class hierarchy and slot definitions
- Tools -> Ontology Browser
- GKB Editor described at [URL not preserved]

Pathway Tools Data Access Mechanisms

Introduction
- MANY ways to access and update PGDBs
- APIs in Java, Perl, and Lisp
- Import/export of files in many formats
- Registry of Pathway/Genome Databases
- Import PGDB data into BioWarehouse
- Updating a PGDB from an external genome DB

Pathway Tools APIs
- Support programmatic queries and updates to PGDBs
- APIs in Java, Perl, and Lisp all provide access to a common set of procedures:
  - Generic Frame Protocol: Ocelot object database API
  - Additional Pathway Tools functions
- For more information see [URL not preserved]

Generic Frame Protocol (GFP)
- A library of procedures for accessing Ocelot DBs
- GFP specification: [URL not preserved]
- A small number of GFP functions are sufficient for most complex queries
- Knowledge of Pathway Tools schema is critical for using the APIs:
  - Appendix I of Pathway Tools User’s Guide, Vol I

Generic Frame Protocol
- get-class-all-instances (Class)
  - Returns the instances of Class
- Key Pathway Tools classes:
  - Genetic-Elements
  - Genes
  - Proteins
  - Polypeptides (a subclass of Proteins)
  - Protein-Complexes (a subclass of Proteins)
  - Pathways
  - Reactions
  - Compounds-And-Elements
  - Enzymatic-Reactions
  - Transcription-Units
  - Promoters
  - DNA-Binding-Sites
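A toy model may help show why the class hierarchy matters for this call. The sketch assumes that "all instances" means instances of the class and of every subclass beneath it; that reading, the two lookup tables, and the frame ID Succinate-dehydrogenase are assumptions made for illustration, and the Python function is a stand-in rather than the real GFP procedure.

```python
# Hypothetical class hierarchy: class name -> direct subclasses (from the slides).
SUBCLASSES = {
    "Proteins": ["Polypeptides", "Protein-Complexes"],
}

# Hypothetical direct instances of each class.
INSTANCES = {
    "Polypeptides": ["Sdh-flavo", "Sdh-Fe-S"],
    "Protein-Complexes": ["Succinate-dehydrogenase"],
}

def get_class_all_instances(cls):
    """Return direct instances of cls plus instances of all its subclasses."""
    found = list(INSTANCES.get(cls, []))
    for subclass in SUBCLASSES.get(cls, []):
        found.extend(get_class_all_instances(subclass))
    return found

print(get_class_all_instances("Proteins"))
```

Under this reading, querying Proteins returns both polypeptides and complexes, so a query written against a superclass does not need to enumerate its subclasses.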

Generic Frame Protocol
- Notation Frame.Slot means a specified slot of a specified frame
- get-slot-value (Frame Slot)
  - Returns first value of Frame.Slot
- get-slot-values (Frame Slot)
  - Returns all values of Frame.Slot as a list
- slot-has-value-p (Frame Slot)
  - Returns T if Frame.Slot has at least one value
- member-slot-value-p (Frame Slot Value)
  - Returns T if Value is one of the values of Frame.Slot
- print-frame (Frame)
  - Prints the contents of Frame
- Note: Frame and Slot must be symbols!
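The read operations listed above can be mimicked over a plain dictionary to make their differences concrete. This is a sketch of the described behavior, not the Lisp, Perl, or Java API: the function names are Python transliterations, the `kb` contents are made up for illustration, and returning `None` for an empty slot is an assumption.

```python
# Hypothetical KB: frame ID -> slot name -> list of values (illustrative data).
kb = {
    "sdhA": {
        "Product": ["Sdh-flavo"],
        "Synonyms": ["syn-1", "syn-2"],  # made-up values
        "Comment": [],
    }
}

def get_slot_values(frame, slot):
    """All values of Frame.Slot as a list (empty if the slot is unset)."""
    return list(kb.get(frame, {}).get(slot, []))

def get_slot_value(frame, slot):
    """First value of Frame.Slot, or None when there is no value (assumption)."""
    vals = get_slot_values(frame, slot)
    return vals[0] if vals else None

def slot_has_value_p(frame, slot):
    """True if Frame.Slot has at least one value."""
    return bool(get_slot_values(frame, slot))

def member_slot_value_p(frame, slot, value):
    """True if value is one of the values of Frame.Slot."""
    return value in get_slot_values(frame, slot)

print(get_slot_value("sdhA", "Synonyms"))
print(get_slot_values("sdhA", "Synonyms"))
print(slot_has_value_p("sdhA", "Comment"))
```

The single-value form is convenient for slots such as Product on a monofunctional gene, while the list form is the safe default for multivalued slots.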

Generic Frame Protocol
- coercible-to-frame-p (Thing)
  - Returns T if Thing is the name of a frame, or a frame object
- save-kb
  - Saves the current KB

Generic Frame Protocol: Update Operations
- put-slot-value (Frame Slot Value)
  - Replace the current value(s) of Frame.Slot with Value
- put-slot-values (Frame Slot Value-List)
  - Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values
- add-slot-value (Frame Slot Value)
  - Add Value to the current value(s) of Frame.Slot, if any
- remove-slot-value (Frame Slot Value)
  - Remove Value from the current value(s) of Frame.Slot
- replace-slot-value (Frame Slot Old-Value New-Value)
  - In Frame.Slot, replace Old-Value with New-Value
- remove-local-slot-values (Frame Slot)
  - Remove all of the values of Frame.Slot
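The update operations differ mainly in whether they overwrite, extend, or edit the existing value list. The sketch below reproduces those semantics on a dictionary so the differences can be tried out. The functions are Python stand-ins for the listed procedures, the frame and value names are hypothetical, and treating a slot as a set (no duplicate values) is an assumption.

```python
# Hypothetical KB: frame ID -> slot name -> list of values.
kb = {}

def _values(frame, slot):
    """Return the mutable value list for Frame.Slot, creating it if needed."""
    return kb.setdefault(frame, {}).setdefault(slot, [])

def put_slot_values(frame, slot, value_list):
    """Replace the current value(s) of Frame.Slot with value_list."""
    kb.setdefault(frame, {})[slot] = list(value_list)

def put_slot_value(frame, slot, value):
    """Replace the current value(s) of Frame.Slot with a single value."""
    put_slot_values(frame, slot, [value])

def add_slot_value(frame, slot, value):
    """Add value to Frame.Slot, skipping duplicates (set semantics assumed)."""
    vals = _values(frame, slot)
    if value not in vals:
        vals.append(value)

def remove_slot_value(frame, slot, value):
    """Remove value from Frame.Slot if it is present."""
    vals = _values(frame, slot)
    if value in vals:
        vals.remove(value)

def replace_slot_value(frame, slot, old_value, new_value):
    """In Frame.Slot, replace old_value with new_value, keeping its position."""
    vals = _values(frame, slot)
    if old_value in vals:
        vals[vals.index(old_value)] = new_value

def remove_local_slot_values(frame, slot):
    """Remove all of the values of Frame.Slot."""
    kb.setdefault(frame, {})[slot] = []

put_slot_values("cplx1", "Components", ["m1", "m2"])
add_slot_value("cplx1", "Components", "m3")
replace_slot_value("cplx1", "Components", "m2", "m2b")
print(kb["cplx1"]["Components"])
```

In a real PGDB these changes would live only in memory until the KB is saved, which is why the deck lists save-kb alongside the accessors.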

Additional Pathway Tools Functions: Semantic Inference Layer
- Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB
- fns.html

Internal note
- Note: Refer to local copy of ptools-fns.html to go through the semantic inference layer functions

File Import/Export Capabilities
- PGDBs can be exported in whole or part to:
  - SBML (Systems Biology Markup Language), sbml.org
    - Import supported by many simulation packages
    - File -> Export -> Selected Reactions to SBML File
  - Pathway Tools Attribute-Value format and column-delimited format files
    - Dump entire PGDB to a suite of files: File -> Export -> Entire DB to Flat Files
    - Dump selected frames to a single file: File -> Export -> Selected Frames to File

Import/Export
- Import from attribute-value or column-delimited files
  - File -> Import -> Frames From File
- Import/Export to/from internal Pathway Tools format that allows pathways, reactions, enzymes, and compounds to be easily moved between Pathway Tools installations
  - Edit -> Add Pathway to File Export List
  - File -> Export -> Selected Pathways to File
  - File -> Import -> Pathways from File
- Import/Export to/from MDL molfile format
  - Edit -> Import compound structure from molfile
  - Edit -> Export compound structure to molfile

Miscellaneous Exports
- Overview -> Highlight -> Save to File
- Overview -> Highlight -> Load from File
- Gene / Protein Sequence / Save to file
- Chromosome -> Show Sequence of a Segment of Replicon

Napster Comes to Bioinformatics
- Public sharing of Pathway/Genome Databases
  - PGDB registry maintained by SRI at [URL not preserved]
- Registry operations
  - List contents of registry
  - Download PGDBs listed in the registry
  - Register PGDBs you have created

Registry Details
- Why register your PGDB?
  - Declare existence of your PGDB in a central location
  - Facilitate download by other scientists
- Why download a PGDB?
  - Desktop Navigator provides more functionality than Web
  - Comparative operations
  - Programmatic querying and processing of PGDB
- Registration process
  - Registered PGDBs have open availability by default
  - Authors can provide their own license agreements
  - Registered PGDBs reside on authors’ FTP site

BioWarehouse
- Biospice.org

New Import/Export Tools
- Suggestions?
- Volunteers?

Updating a PGDB From an External Genome DB
- Example: AraCyc forms a pathway module to the TAIR DB
- TAIR is authoritative source for gene and gene-product information
- Update AraCyc to reflect updates in TAIR

Proposed Approach
- Export TAIR to PathoLogic files
- Build AraCyc2 from those PathoLogic files (automated PathoLogic only)
- Compare AraCyc1 (A1) to AraCyc2 (A2)
  A. Import new genes/proteins from A2 to A1
  B. Delete from A1 genes/proteins not found in A2
  C. Rename genes/proteins in A1 whose names changed in A2
  - Run name matcher on A1’
  - Check for pathways with no enzymes and report them so the user can keep any that PathoLogic would otherwise delete
    - What about enzymes that were assigned to a pathway by the hole filler?
  - Re-run pathway predictor
  - Remember which pathways the user deletes so they are not re-predicted by PathoLogic
  - Consider movement of genes from contig to chromosome
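Steps A to C of the comparison amount to a three-way diff between the two builds. The sketch below shows one way that diff could be computed. It assumes each build can be reduced to a mapping from a stable gene ID to a gene name, which the slide does not state, and the function `plan_gene_updates` and all IDs and names in the example are hypothetical.

```python
def plan_gene_updates(a1, a2):
    """Compare two builds, each a dict of gene ID -> gene name.

    Returns (to_import, to_delete, to_rename):
      to_import: IDs present in a2 but not in a1 (step A)
      to_delete: IDs present in a1 but not in a2 (step B)
      to_rename: ID -> new name, where the name differs in a2 (step C)
    """
    ids1, ids2 = set(a1), set(a2)
    to_import = sorted(ids2 - ids1)
    to_delete = sorted(ids1 - ids2)
    to_rename = {g: a2[g] for g in sorted(ids1 & ids2) if a1[g] != a2[g]}
    return to_import, to_delete, to_rename

# Made-up example data for the old build (A1) and the rebuilt one (A2).
aracyc1 = {"g1": "abc1", "g2": "abc2", "g3": "abc3"}
aracyc2 = {"g1": "abc1", "g2": "xyz2", "g4": "abc4"}

to_import, to_delete, to_rename = plan_gene_updates(aracyc1, aracyc2)
print(to_import, to_delete, to_rename)
```

Computing the plan separately from applying it leaves room for the review steps listed afterwards, such as reporting pathways that would lose all their enzymes before anything is deleted.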