The Pathway Tools Schema
Motivations for Understanding Schema
  Pathway Tools visualizations and analyses depend on the software finding precise information in precise places within a Pathway/Genome Database (PGDB).
  A PGDB is a web of interconnected objects; each object represents a biological entity.
Motivations for Understanding Pathway Tools Schema
  Complex queries to PGDBs must refer to classes and slots within the schema. This applies to:
    Queries using the Lisp, Perl, and Java APIs
    Queries using the Query Page
    Queries using the Structured Advanced Query Form
References
  Pathway Tools User’s Guide, Appendix A: Guide to the Pathway Tools Schema
  Ontology Papers section of http://biocyc.org/publications.shtml
    "The outcomes of pathway database computations depend on pathway ontology"
    "An Evidence Ontology for use in Pathway/Genome Databases"
    "An ontology for biological function based on molecular interactions"
    "Representations of metabolic knowledge: Pathways"
    "Representations of metabolic knowledge"
Pathway Tools Ontology / Schema
  Ontology classes: 1621
  Many datatypes, from genomes to pathways
  Classification schemes for pathways, chemical compounds, enzymatic reactions (EC system)
  Cell Component Ontology
  Protein Feature ontology
  Comprehensive set of 221 attributes and relationships
  Evidence codes, supporting citations
Root Classes in the Pathway Tools Ontology
  Chemicals -- All molecules
  Polymer-Segments -- Regions of polymers
  Protein-Features -- Features on proteins
  Paralogous-Gene-Groups
  Organisms
  Enzymatic-Reactions -- Link enzymes to reactions they catalyze
  Generalized-Reactions -- Reactions and pathways
  Regulation -- Defines regulatory interactions
  CCO -- Cell Component Ontology
  Evidence -- Evidence ontology
  Notes -- Timestamped, person-stamped notes
  Organizations
  People
  Publications
Principal Classes
  Class names are capitalized, plural, separated by dashes
  Genetic-Elements, with subclasses:
    Chromosomes
    Plasmids
  Genes
  Transcription-Units
  RNAs
    rRNAs, snRNAs, tRNAs, Charged-tRNAs, Regulatory-RNAs
  Proteins, with subclasses:
    Polypeptides
    Protein-Complexes
Principal Classes
  Reactions, with subclasses:
    Transport-Reactions
  Enzymatic-Reactions
  Pathways
  Compounds-And-Elements
Web of Relationships for One Enzyme
  [Figure: linked objects for one enzyme]
  Pathway: TCA Cycle
  Reaction: Succinate + FAD = fumarate + FADH2
  Enzymatic-reaction
  Enzyme: Succinate dehydrogenase
  Subunits: Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2
  Genes: sdhA, sdhB, sdhC, sdhD
Representation of Function
  [Figure: the same linked objects, annotated with slots]
  Pathway: TCA Cycle
  Reaction: Succinate + FAD = fumarate + FADH2
  Enzymatic-reaction
  Enzyme: Succinate dehydrogenase
  Subunits: Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2
  Genes: sdhA, sdhB, sdhC, sdhD
  Slot labels shown: EC#, Keq, Cofactors, Molecular-Weight-Seq, Molecular-Weight-Exp, pI, Left-end-position
Monofunctional Monomer
  [Figure: linked objects Pathway, Reaction, Enzymatic-reaction, Monomer, Gene]
Bifunctional Monomer
  [Figure: linked objects Pathway, Reaction (x2), Enzymatic-reaction, Gene]
Monofunctional Multimer
  [Figure: linked objects Pathway, Reaction, Enzymatic-reaction, Multimer, Monomer (x4), Gene (x4)]
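The chain of objects in these diagrams can be walked programmatically. The following is a minimal sketch in plain Python, not the Pathway Tools API: frames are modeled as dictionaries, the frame IDs are illustrative placeholders, and only slot names listed in this deck (Reaction-List, Reaction, Enzyme, Components, Gene) are used.

```python
# Toy frames: frame ID -> {slot name: list of values}.
# Frame IDs are hypothetical placeholders, not real PGDB identifiers.
FRAMES = {
    "TCA-CYCLE": {"Reaction-List": ["SUCC-DEHYD-RXN"]},
    "SUCC-DEHYD-RXN": {"Left": ["succinate", "FAD"],
                       "Right": ["fumarate", "FADH2"]},
    "SDH-ENZRXN": {"Enzyme": ["SDH"], "Reaction": ["SUCC-DEHYD-RXN"]},
    "SDH": {"Components": ["SDH-FLAVO", "SDH-FE-S",
                           "SDH-MEMBRANE-1", "SDH-MEMBRANE-2"]},
    "SDH-FLAVO": {"Gene": ["sdhA"]},
    "SDH-FE-S": {"Gene": ["sdhB"]},
    "SDH-MEMBRANE-1": {"Gene": ["sdhC"]},
    "SDH-MEMBRANE-2": {"Gene": ["sdhD"]},
}


def genes_of_protein(frames, protein_id):
    """Return the genes of a protein, descending into complex components."""
    frame = frames[protein_id]
    if "Components" in frame:  # a multimer: recurse into each subunit
        genes = []
        for component in frame["Components"]:
            genes.extend(genes_of_protein(frames, component))
        return genes
    return list(frame.get("Gene", []))  # a monomer: read its Gene slot


def genes_for_pathway(frames, pathway_id):
    """Follow pathway -> reaction -> enzymatic reaction -> enzyme -> gene."""
    genes = []
    for reaction in frames[pathway_id]["Reaction-List"]:
        for frame in frames.values():
            # An enzymatic reaction points at its reaction via the Reaction slot.
            if reaction in frame.get("Reaction", []):
                for enzyme in frame.get("Enzyme", []):
                    genes.extend(genes_of_protein(frames, enzyme))
    return genes


print(genes_for_pathway(FRAMES, "TCA-CYCLE"))
```

The same traversal handles the monomer case unchanged, because a frame without a Components slot is read directly through its Gene slot.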
Pathway and Substrates
  [Figure: a Pathway linked to four Reaction objects; one Reaction linked to Reactant-1 and Reactant-2 and to Product-1 and Product-2]
  Link labels shown: left, right, in-pathway
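As a rough illustration of how the left, right, and in-pathway links fit together, here is a small sketch using plain Python dictionaries. It is not the Pathway Tools API; the frame IDs PWY-1, RXN-1, RXN-2, and the compound Product-3 are hypothetical, and In-Pathway is treated as the inverse of Reaction-List.

```python
# Toy frames: frame ID -> {slot name: list of values}. All IDs are hypothetical.
PGDB = {
    "PWY-1": {"Reaction-List": ["RXN-1", "RXN-2"]},
    "RXN-1": {"Left": ["Reactant-1", "Reactant-2"],
              "Right": ["Product-1", "Product-2"],
              "In-Pathway": ["PWY-1"]},
    "RXN-2": {"Left": ["Product-1"],
              "Right": ["Product-3"],
              "In-Pathway": ["PWY-1"]},
}


def pathway_substrates(frames, pathway_id):
    """Collect every compound on either side of any reaction in the pathway."""
    substrates = set()
    for reaction_id in frames[pathway_id]["Reaction-List"]:
        reaction = frames[reaction_id]
        substrates.update(reaction.get("Left", []))
        substrates.update(reaction.get("Right", []))
    return substrates


def pathways_of_reaction(frames, reaction_id):
    """Follow the inverse link from a reaction back to its pathways."""
    return list(frames[reaction_id].get("In-Pathway", []))


print(sorted(pathway_substrates(PGDB, "PWY-1")))
print(pathways_of_reaction(PGDB, "RXN-2"))
```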
Regulation
  Reorganization and expansion of regulation under way in Pathway Tools
  Initial application to EcoCyc
  Class Regulation, with subclasses that describe different biochemical mechanisms of regulation
  Slots:
    Regulator
    Regulated-Entity
    Mode
    Mechanism
Regulation of Enzyme Activity
  Class Regulation-of-Enzyme-Activity
  Each instance of the class describes one regulatory interaction
  Slots:
    Regulator -- usually a small molecule
    Regulated-Entity -- an Enzymatic-Reaction
    Mechanism -- One of: Competitive, Uncompetitive, Noncompetitive, Irreversible, Allosteric, Unkmech, Other
    Mode -- One of: + , -
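The slot constraints above can be expressed as a small validated record. This is a sketch only, assuming a Python dataclass as a stand-in for a frame; the class name, field names, and the frame IDs in the example are illustrative and do not come from Pathway Tools.

```python
from dataclasses import dataclass

# Allowed values, taken from the slot descriptions above.
MECHANISMS = {"Competitive", "Uncompetitive", "Noncompetitive",
              "Irreversible", "Allosteric", "Unkmech", "Other"}
MODES = {"+", "-"}


@dataclass(frozen=True)
class RegulationOfEnzymeActivity:
    """One regulatory interaction (a hypothetical stand-in for a frame)."""
    regulator: str         # usually a small molecule
    regulated_entity: str  # an Enzymatic-Reaction
    mechanism: str
    mode: str

    def __post_init__(self):
        # Reject values outside the enumerations the schema allows.
        if self.mechanism not in MECHANISMS:
            raise ValueError(f"unknown Mechanism: {self.mechanism!r}")
        if self.mode not in MODES:
            raise ValueError(f"unknown Mode: {self.mode!r}")


# Example with placeholder frame IDs: a competitive inhibitor of an
# enzymatic reaction, so Mode is "-".
inhibition = RegulationOfEnzymeActivity(
    regulator="MALONATE",
    regulated_entity="SDH-ENZRXN",
    mechanism="Competitive",
    mode="-",
)
print(inhibition)
```

Constructing a record with a Mechanism or Mode outside the listed values raises ValueError, which mirrors the idea that software reading a PGDB expects precise values in precise slots.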
Transcription Initiation
  Class Regulation-of-Transcription-Initiation
  Slots:
    Regulator -- instance of Proteins or Complexes (a transcription factor)
    Regulated-Entity -- instance of Promoters or Transcription-Units or Genes
    Mode -- One of: + , -
Attenuation
  Class Transcriptional-Attenuation
  Several subclasses depending on type of attenuation
  Slots common to all:
    Regulator -- Depends on subtype of attenuation
    Regulated-Entity -- instance of Terminators or Genes or Transcription-Units
    Mode -- One of: + , -
Attenuation Subtypes
  Small-Molecule-Mediated-Attenuation
    Regulator = a small molecule
    Leader transcript binds small molecule and determines formation of terminator or antiterminator
  RNA-Polymerase-Modification
    Regulator = instance of Proteins or Complexes
    Regulatory protein binds to site in transcription unit and interacts with RNA polymerase to determine termination
  RNA-Mediated-Attenuation
  Ribosome-Mediated-Attenuation
  Rho-Blocking-Antitermination
  Protein-Mediated-Attenuation
Frame IDs of Instances
  Instance frame ID conventions have evolved over time
  Examples:
    Pathways: TRPSYN-PWY, P23-PWY
    Genes: AG10045
    Monomers: TRPA-MONOMER, AG10045-MONOMER
Slots in Multiple Classes
  Common-Name
  Synonyms
  Names (computed as union of Common-Name, Synonyms)
  Comment
  Citations
  DB-Links
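The computed Names slot can be illustrated with a short sketch. It assumes frames are plain Python dictionaries and that the union keeps first-seen order and drops duplicates; the function name and the example frame are hypothetical, and how Pathway Tools itself orders the values is not stated here.

```python
def names(frame):
    """Compute Names as the union of Common-Name and Synonyms.

    Values are kept in first-seen order, with duplicates removed.
    """
    result = []
    for slot in ("Common-Name", "Synonyms"):
        for value in frame.get(slot, []):
            if value not in result:
                result.append(value)
    return result


# Hypothetical frame with an overlapping synonym.
enzyme_frame = {
    "Common-Name": ["succinate dehydrogenase"],
    "Synonyms": ["SDH", "succinate dehydrogenase"],
}
print(names(enzyme_frame))
```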
Genes Slots
  Component-Of (links to replicon, transcription unit)
  Left-End-Position
  Right-End-Position
  Transcription-Direction
  Product
Proteins Slots
  Molecular-Weight-Seq
  Molecular-Weight-Exp
  pI
  Locations
  Modified-Form
  Unmodified-Form
  Component-Of
Polypeptides Slots
  Slots inherited from Proteins
  Gene
Protein-Complexes Slots
  Slots inherited from Proteins
  Components
Reactions Slots
  EC-Number
  Left, Right
  DeltaG0
  Keq
  Spontaneous?
Enzymatic-Reactions Slots
  Enzyme
  Reaction
  Cofactors
  Prosthetic-Groups
  Alternative-Substrates
Pathways Slots
  Reaction-List
  Predecessors
  Primaries
Inspecting PGDB Instance Frames
  Right-click on object handles to find the frame ID
  The Show menu allows printing of frames
Inspecting PGDB Schema
  Invoke GKB Editor Taxonomy Browser: (gkb) or Right-click: Edit Ontology Editor
  Invoke GKB Editor frame editor: Right-click: Edit Frame Editor
  Information about GKB Editor:
    User Guide: http://www.ai.sri.com/~gkb/user-man.html
    Publication: http://www.ai.sri.com/pkarp/pubs/97gkb.ps