Download presentation
Presentation is loading. Please wait.
Published bySherilyn Stokes Modified over 8 years ago
1
The Pathway Tools Schema
2
SRI International Bioinformatics Motivations for Understanding Schema Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB When writing complex queries to PGDBs, those queries must name classes and slots within the schema A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity
3
SRI International Bioinformatics Reference Pathway Tools User’s Guide, Volume I l Appendix A: Guide to the Pathway Tools Schema
4
SRI International Bioinformatics Web of Relationships for One Enzyme Sdh-flavoSdh-Fe-SSdh-membrane-1Sdh-membrane-2 sdhAsdhB sdhCsdhD Succinate + FAD = fumarate + FADH 2 Enzymatic-reaction Succinate dehydrogenase TCA Cycle
5
SRI International Bioinformatics Frame Data Model Frame Data Model -- organizational structure for a PGDB Knowledge base (KB, Database, DB) Frames Slots Facets Annotations
6
SRI International Bioinformatics Knowledge Base Collection of frames and their associated slots, values, facets, and annotations AKA: Database, PGDB Can be stored within l An Oracle DB l A disk file l A Pathway Tools binary program
7
SRI International Bioinformatics Frames Entities with which facts are associated Kinds of frames: l Classes: Genes, Pathways, Biosynthetic Pathways l Instances (objects): trpA, TCA cycle Classes: l Superclass(es) l Subclass(es) l Instance(s) A symbolic frame name (id, key) uniquely identifies each frame
8
SRI International Bioinformatics Frame IDs Naming conventions for frame IDs Uniqueness of frame IDs l Frame IDs must be unique within a PGDB l Goal: Same frame ID within different PGDBs should refer to the same biological entity l Because many frames are imported from MetaCyc, this helps ensure consistency of frame names l Frame IDs for newly created frames (not imported) are generated by Pathway Tools u Those frame IDs contain a PGDB-specific identifier u Example: CPLXzz-nnnn CPLXB3-0035
9
SRI International Bioinformatics Slots Encode attributes/properties of a frame l Integer, real number, string, symbols Represent relationships between frames l The value of a slot is the identifier of another frame Every slot is described by a “slot frame” in a KB that defines meta information about that slot
10
SRI International Bioinformatics Slot Links Sdh-flavoSdh-Fe-SSdh-membrane-1Sdh-membrane-2 sdhAsdhB sdhCsdhD Succinate + FAD = fumarate + FADH 2 Enzymatic-reaction Succinate dehydrogenase TCA Cycle product component-of catalyzes reaction in-pathway
11
SRI International Bioinformatics Slots Number of values l Single valued l Multivalued: sets, bags Slot values l Any LISP object: Integer, real, string, symbol (frame name) Slotunits define properties of slots: datatypes, classes, constraints Two slots are inverses if they encode opposite relationships l Slot Product in class Genes l Slot Gene in class Polypeptides
12
SRI International Bioinformatics Representation of Function Sdh-flavoSdh-Fe-SSdh-membrane-1Sdh-membrane-2 sdhAsdhB sdhCsdhD Succinate + FAD = fumarate + FADH 2 Enzymatic-reaction Succinate dehydrogenase TCA Cycle EC# K eq Cofactors Inhibitors Molecular wt pI Left-end-position
13
SRI International Bioinformatics Monofunctional Monomer Gene Reaction Enzymatic-reaction Monomer Pathway
14
SRI International Bioinformatics Bifunctional Monomer Gene Reaction Enzymatic-reaction Monomer Pathway Reaction Enzymatic-reaction
15
SRI International Bioinformatics Monofunctional Multimer Monomer Gene Reaction Enzymatic-reaction Multimer Pathway
16
SRI International Bioinformatics Pathway and Substrates Reactant-1 Reaction Pathway Reaction Reactant-2 Product-2 Product-1 in-pathway left right
17
SRI International Bioinformatics Transcriptional Regulation site001 pro001 trpE trpD trpC trpB trpA trpL Int003RpoSig70 TrpR*trpInt001 trpLEDCBA trp apoTrpR Int005
18
SRI International Bioinformatics Annotations Encode information about individual slot values Used to attach comments and citations to slot values Example: l Frame tryptophan-synthetase has a slot called Molecular- Weight with a value of 28 l Attached to that value is an annotation whose label is Citation and whose value is “[3444332]”
19
SRI International Bioinformatics Facets Encode information about slots Allow association between a slot and: l comments l citations Example: Comment attached to Inhibitors of EnzRxn Allow access to schema information
20
SRI International Bioinformatics Principle Classes Class names are capitalized, plural, separated by dashes Genetic-Elements, with subclasses: l Chromosomes l Plasmids Genes Transcription-Units RNAs l rRNAs, snRNAs, tRNAs, Charged-tRNAs Proteins, with subclasses: l Polypeptides l Protein-Complexes
21
SRI International Bioinformatics Principle Classes Reactions, with subclasses: l Transport-Reactions Enzymatic-Reactions Pathways Compounds-And-Elements
22
SRI International Bioinformatics Slots in Multiple Classes Common-Name Synonyms Comment Citations DB-Links
23
SRI International Bioinformatics Genes Slots Component-Of (links to replicon, transcription unit) Left-End-Position Right-End-Position Centisome-Position Transcription-Direction Product
24
SRI International Bioinformatics Proteins Slots Molecular-Weight-Seq Molecular-Weight-Exp pI Locations Modified-Form Unmodified-Form Component-Of
25
SRI International Bioinformatics Polypeptides Slots Gene
26
SRI International Bioinformatics Protein-Complexes Slots Components
27
SRI International Bioinformatics Reactions Slots EC-Number Left, Right DeltaG0 Keq Spontaneous?
28
SRI International Bioinformatics Enzymatic-Reactions Slots Enzyme Reaction Activators Inhibitors Physiologically-Relevant Cofactors Prosthetic-Groups Alternative-Substrates Alternative-Cofactors
29
SRI International Bioinformatics Pathways Slots Reaction-List Predecessors Primaries
30
SRI International Bioinformatics GKB Editor Browse class hierarchy and slot definitions Tools -> Ontology Browser GKB Editor described at l http://www.ai.sri.com/~gkb/user-man.html http://www.ai.sri.com/~gkb/user-man.html
31
Pathway Tools Data Access Mechanisms
32
SRI International Bioinformatics Introduction MANY ways to access and update PGDBs APIs in Java, Perl, and Lisp Import/export of files in many formats Registry of Pathway/Genome Databases Import PGDB data into BioWarehouse Updating a PGDB from an external genome DB
33
SRI International Bioinformatics Pathway Tools APIs Support programmatic queries and updates to PGDBs APIs in Java, Perl, and Lisp all provide access to a common set of procedures: l Generic Frame Protocol -- Ocelot object database API l Additional Pathway Tools functions For more information see l http://bioinformatics.ai.sri.com/ptools/ptools-resources.html http://bioinformatics.ai.sri.com/ptools/ptools-resources.html
34
SRI International Bioinformatics Generic Frame Protocol (GFP) A library of procedures for accessing Ocelot DBs GFP specification: l http://www.ai.sri.com/~gfp/spec/paper/paper.html A small number of GFP functions are sufficient for most complex queries Knowledge of Pathway Tools schema is critical for using the APIs: l Appendix I of Pathway Tools User’s Guide, Vol I
35
SRI International Bioinformatics Generic Frame Protocol get-class-all-instances (Class) l Returns the instances of Class Key Pathway Tools classes: l Genetic-Elements l Genes l Proteins l Polypeptides (a subclass of Proteins) l Protein-Complexes (a subclass of Proteins) l Pathways l Reactions l Compounds-And-Elements l Enzymatic-Reactions l Transcription-Units l Promoters l DNA-Binding-Sites
36
SRI International Bioinformatics Generic Frame Protocol Notation Frame.Slot means a specified slot of a specified frame get-slot-value(Frame Slot) l Returns first value of Frame.Slot get-slot-values(Frame Slot) l Returns all values of Frame.Slot as a list slot-has-value-p(Frame Slot) l Returns T if Frame.Slot has at least one value member-slot-value-p(Frame Slot Value) l Returns T if Value is one of the values of Frame.Slot print-frame(Frame) l Prints the contents of Frame Note: Frame and Slot must be symbols!
37
SRI International Bioinformatics Generic Frame Protocol coercible-to-frame-p (Thing) l Returns T if Thing is the name of a frame, or a frame object save-kb l Saves the current KB
38
SRI International Bioinformatics Generic Frame Protocol – Update Operations put-slot-value(Frame Slot Value) l Replace the current value(s) of Frame.Slot with Value put-slot-values(Frame Slot Value-List) l Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values add-slot-value(Frame Slot Value) l Add Value to the current value(s) of Frame.Slot, if any remove-slot-value(Frame Slot Value) l Remove Value from the current value(s) of Frame.slot replace-slot-value(Frame Slot Old-Value New-Value) l In Frame.Slot, replace Old-Value with New-Value remove-local-slot-values(Frame Slot) l Remove all of the values of Frame.Slot
39
SRI International Bioinformatics Additional Pathway Tools Functions – Semantic Inference Layer Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB http://bioinformatics.ai.sri.com/ptools/ptools- fns.html http://bioinformatics.ai.sri.com/ptools/ptools- fns.html
40
SRI International Bioinformatics Internal note Note: Refer to local copy of ptools-fns.html to go through the semantic inference layer fns
41
SRI International Bioinformatics File Import/Export Capabilities PGDBs can be exported in whole or part to: l SBML – Systems Biology Markup Language – sbml.org u Import supported by many simulation packages u File -> Export -> Selected Reactions to SBML File l Pathway Tools Attribute-Value format and column-delimited format files u http://brg.ai.sri.com/ptools/flatfile-format.shtml http://brg.ai.sri.com/ptools/flatfile-format.shtml u Dump entire PGDB to a suite of files: File -> Export -> Entire DB to Flat Files u Dump selected frames to a single file: File -> Export -> Selected Frames to File
42
SRI International Bioinformatics Import/Export Import from attribute-value or column-delimited files l File -> Import -> Frames From File Import/Export to/from internal Pathway Tools format that allows pathways, reactions, enzymes, and compounds to be easily moved between Pathway Tools installations u Edit -> Add Pathway to File Export List u File -> Export -> Selected Pathways to File u File -> Import -> Pathways from File Import/Export to/from MDL molfile format l Edit -> Import compound structure from molfile l Edit -> Export compound structure to molfile
43
SRI International Bioinformatics Miscellaneous Exports Overview -> Highlight -> Save to File Overview -> Highlight -> Load from File Gene / Protein Sequence / Save to file Chromosome -> Show Sequence of a Segment of Replicon
44
SRI International Bioinformatics Napster Comes to Bioinformatics Public sharing of Pathway/Genome Databases l PGDB registry maintained by SRI at URL http://biocyc.org/registry.html http://biocyc.org/registry.html Registry operations l List contents of registry l Download PGDBs listed in the registry l Register PGDBs you have created
45
SRI International Bioinformatics Registry Details Why register your PGDB? l Declare existence of your PGDB in a central location l Facilitate download by other scientists Why download a PGDB? l Desktop Navigator provides more functionality than Web l Comparative operations l Programmatic querying and processing of PGDB Registration process l Registered PGDBs have open availability by default l Authors can provide their own license agreements l Registered PGDBs reside on authors’ FTP site
46
SRI International Bioinformatics BioWarehouse Biospice.org
47
SRI International Bioinformatics New Import/Export Tools Suggestions? Volunteers?
48
SRI International Bioinformatics Updating a PGDB From an External Genome DB Example: AraCyc forms a pathway module to the TAIR DB TAIR is authoritative source for gene and gene- product information Update AraCyc to reflect updates in TAIR
49
SRI International Bioinformatics Proposed Approach Export TAIR to PathoLogic files Build AraCyc2 from those PathoLogic files – automated PathoLogic only Compare AraCyc1 (A1) to AraCyc2 (A2) A. Import new genes/proteins from A2 to A1 B. Delete from A1 genes/proteins not found in A2 C. Rename genes/proteins whose names changed from A2 to A1 l Run name matcher on A1’ l Check for pathways with no enzymes and report them so user can keep any that otherwise PathoLogic will delete u What about enzymes that were assigned to a pathway by the hole filler? l Re-run pathway predictor l Remember what pathways user deletes so they are not re-predicted by PathoLogic l Consider movement of genes from contig to chromosome
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.