1 The Pathway Tools Schema

2 SRI International Bioinformatics Motivations for Understanding Schema
- Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB
- When writing complex queries to PGDBs, those queries must name classes and slots within the schema
- A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity

3 Reference
Pathway Tools User’s Guide, Volume I
- Appendix A: Guide to the Pathway Tools Schema

4 Web of Relationships for One Enzyme
[Diagram: genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; protein complex Succinate dehydrogenase; Enzymatic-reaction; reaction Succinate + FAD = fumarate + FADH2; pathway TCA Cycle]

5 Frame Data Model
Frame Data Model: organizational structure for a PGDB
- Knowledge base (KB, Database, DB)
- Frames
- Slots
- Facets
- Annotations

6 Knowledge Base
- Collection of frames and their associated slots, values, facets, and annotations
- AKA: Database, PGDB
- Can be stored within
  - An Oracle DB
  - A disk file
  - A Pathway Tools binary program

7 Frames
- Entities with which facts are associated
- Kinds of frames:
  - Classes: Genes, Pathways, Biosynthetic Pathways
  - Instances (objects): trpA, TCA cycle
- Classes:
  - Superclass(es)
  - Subclass(es)
  - Instance(s)
- A symbolic frame name (id, key) uniquely identifies each frame

8 Frame IDs
- Naming conventions for frame IDs
- Uniqueness of frame IDs
  - Frame IDs must be unique within a PGDB
  - Goal: Same frame ID within different PGDBs should refer to the same biological entity
  - Because many frames are imported from MetaCyc, this helps ensure consistency of frame names
  - Frame IDs for newly created frames (not imported) are generated by Pathway Tools
    - Those frame IDs contain a PGDB-specific identifier
    - Example: CPLXzz-nnnn, e.g. CPLXB3-0035

9 Slots
- Encode attributes/properties of a frame
  - Integer, real number, string, symbols
- Represent relationships between frames
  - The value of a slot is the identifier of another frame
- Every slot is described by a “slot frame” in a KB that defines meta information about that slot

10 Slot Links
[Diagram: genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; protein complex Succinate dehydrogenase; Enzymatic-reaction; reaction Succinate + FAD = fumarate + FADH2; pathway TCA Cycle. Link labels: product, component-of, catalyzes, reaction, in-pathway]
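
The slot links named on this slide can be followed one after another to get from a gene to its pathway. The following is a minimal sketch over a toy dictionary, not the Pathway Tools API; the frame IDs EnzRxn-1 and Rxn-1 are placeholders, and the direction of each link is an assumption read off the slot names.

```python
# Toy KB: frame ID -> slot name -> list of frame IDs.
# "EnzRxn-1" and "Rxn-1" are placeholder IDs, not real PGDB frames.
kb = {
    "sdhA": {"product": ["Sdh-flavo"]},
    "Sdh-flavo": {"component-of": ["Succinate dehydrogenase"]},
    "Succinate dehydrogenase": {"catalyzes": ["EnzRxn-1"]},
    "EnzRxn-1": {"reaction": ["Rxn-1"]},
    "Rxn-1": {"in-pathway": ["TCA Cycle"]},
}


def follow(frames, slot):
    """Return the frames reached from any of `frames` via `slot`, without duplicates."""
    reached = []
    for frame in frames:
        for value in kb.get(frame, {}).get(slot, []):
            if value not in reached:
                reached.append(value)
    return reached


def pathways_of_gene(gene):
    """Walk gene -> polypeptide -> complex -> enzymatic reaction -> reaction -> pathway."""
    frames = [gene]
    for slot in ("product", "component-of", "catalyzes", "reaction", "in-pathway"):
        frames = follow(frames, slot)
    return frames


print(pathways_of_gene("sdhA"))  # ['TCA Cycle']
```

This fixed chain only fits an enzyme that is a complex; a monomeric enzyme would need the component-of hop skipped.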

11 Slots
- Number of values
  - Single valued
  - Multivalued: sets, bags
- Slot values
  - Any LISP object: Integer, real, string, symbol (frame name)
- Slotunits define properties of slots: datatypes, classes, constraints
- Two slots are inverses if they encode opposite relationships
  - Slot Product in class Genes
  - Slot Gene in class Polypeptides
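
Inverse slots can be illustrated with a small sketch in which writing one slot also records the opposite relationship. This is a toy model under stated assumptions, not how Pathway Tools implements it: the `INVERSES` table and the dictionary-based KB are hypothetical, and only the Product/Gene pair comes from the slide.

```python
from collections import defaultdict

# Hypothetical inverse-slot table; the slide names Product (Genes) and Gene (Polypeptides).
INVERSES = {"Product": "Gene", "Gene": "Product"}

# Toy KB: frame ID -> slot name -> list of values (multivalued slots).
kb = defaultdict(lambda: defaultdict(list))


def add_slot_value(frame, slot, value):
    """Add value to frame.slot and mirror it on the inverse slot, if one is defined."""
    if value not in kb[frame][slot]:
        kb[frame][slot].append(value)
    inverse = INVERSES.get(slot)
    if inverse and frame not in kb[value][inverse]:
        kb[value][inverse].append(frame)


add_slot_value("sdhA", "Product", "Sdh-flavo")
print(kb["Sdh-flavo"]["Gene"])  # ['sdhA']
```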

12 Representation of Function
[Diagram: genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; protein complex Succinate dehydrogenase; Enzymatic-reaction; reaction Succinate + FAD = fumarate + FADH2; pathway TCA Cycle. Attribute labels: EC#, Keq, Cofactors, Inhibitors, Molecular wt, pI, Left-end-position]

13 Monofunctional Monomer
[Diagram nodes: Gene, Monomer, Enzymatic-reaction, Reaction, Pathway]

14 Bifunctional Monomer
[Diagram nodes: Gene, Monomer, two Enzymatic-reaction nodes, two Reaction nodes, Pathway]

15 Monofunctional Multimer
[Diagram nodes: Gene, Monomer, Multimer, Enzymatic-reaction, Reaction, Pathway]

16 Pathway and Substrates
[Diagram nodes: Pathway, two Reaction nodes, Reactant-1, Reactant-2, Product-1, Product-2. Link labels: in-pathway, left, right]
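
The relationship between a pathway and its substrates can be sketched by collecting the left and right sides of each reaction in the pathway. This is an illustrative toy, assuming the slot names Reaction-List, Left, and Right from the later slot slides; the frame IDs and the way the two reactions are wired together are made up for the example.

```python
# Toy KB: frame ID -> slot name -> list of values.
# "Pathway-1", "Rxn-1", and "Rxn-2" are placeholder IDs; the wiring is an assumption.
kb = {
    "Pathway-1": {"Reaction-List": ["Rxn-1", "Rxn-2"]},
    "Rxn-1": {"Left": ["Reactant-1"], "Right": ["Product-1"]},
    "Rxn-2": {"Left": ["Product-1", "Reactant-2"], "Right": ["Product-2"]},
}


def pathway_compounds(pathway):
    """Return every compound on either side of any reaction in the pathway, in first-seen order."""
    compounds = []
    for rxn in kb[pathway].get("Reaction-List", []):
        for side in ("Left", "Right"):
            for compound in kb[rxn].get(side, []):
                if compound not in compounds:
                    compounds.append(compound)
    return compounds


print(pathway_compounds("Pathway-1"))
# ['Reactant-1', 'Product-1', 'Reactant-2', 'Product-2']
```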

17 Transcriptional Regulation
[Diagram labels: site001, pro001, trpL, trpE, trpD, trpC, trpB, trpA, trpLEDCBA, Int001, Int003, Int005, RpoSig70, TrpR*trp, apoTrpR, trp]

18 Annotations
- Encode information about individual slot values
- Used to attach comments and citations to slot values
- Example:
  - Frame tryptophan-synthetase has a slot called Molecular-Weight with a value of 28
  - Attached to that value is an annotation whose label is Citation and whose value is “[3444332]”
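
One way to picture an annotation is as a labeled note hanging off a single slot value rather than off the slot as a whole. The sketch below models the slide's example with a hypothetical `SlotValue` class; it is not the Ocelot representation.

```python
from dataclasses import dataclass, field


@dataclass
class SlotValue:
    """A slot value plus its annotations (label -> list of annotation values)."""
    value: object
    annotations: dict = field(default_factory=dict)


# Frame tryptophan-synthetase, slot Molecular-Weight, value 28 (from the slide).
frame = {"Molecular-Weight": [SlotValue(28)]}
frame["Molecular-Weight"][0].annotations.setdefault("Citation", []).append("[3444332]")


def annotations_for(frame, slot, value, label):
    """Return the annotation values with `label` attached to `value` in `slot`."""
    for slot_value in frame.get(slot, []):
        if slot_value.value == value:
            return slot_value.annotations.get(label, [])
    return []


print(annotations_for(frame, "Molecular-Weight", 28, "Citation"))  # ['[3444332]']
```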

19 Facets
- Encode information about slots
- Allow association between a slot and:
  - comments
  - citations
- Example: Comment attached to Inhibitors of EnzRxn
- Allow access to schema information

20 Principal Classes
- Class names are capitalized, plural, separated by dashes
- Genetic-Elements, with subclasses:
  - Chromosomes
  - Plasmids
- Genes
- Transcription-Units
- RNAs
  - rRNAs, snRNAs, tRNAs, Charged-tRNAs
- Proteins, with subclasses:
  - Polypeptides
  - Protein-Complexes

21 Principal Classes
- Reactions, with subclasses:
  - Transport-Reactions
- Enzymatic-Reactions
- Pathways
- Compounds-And-Elements

22 Slots in Multiple Classes
- Common-Name
- Synonyms
- Comment
- Citations
- DB-Links

23 Genes Slots
- Component-Of (links to replicon, transcription unit)
- Left-End-Position
- Right-End-Position
- Centisome-Position
- Transcription-Direction
- Product

24 Proteins Slots
- Molecular-Weight-Seq
- Molecular-Weight-Exp
- pI
- Locations
- Modified-Form
- Unmodified-Form
- Component-Of

25 Polypeptides Slots
- Gene

26 Protein-Complexes Slots
- Components

27 Reactions Slots
- EC-Number
- Left, Right
- DeltaG0
- Keq
- Spontaneous?

28 Enzymatic-Reactions Slots
- Enzyme
- Reaction
- Activators
- Inhibitors
- Physiologically-Relevant
- Cofactors
- Prosthetic-Groups
- Alternative-Substrates
- Alternative-Cofactors

29 Pathways Slots
- Reaction-List
- Predecessors
- Primaries

30 GKB Editor
- Browse class hierarchy and slot definitions
- Tools -> Ontology Browser
- GKB Editor described at
  - http://www.ai.sri.com/~gkb/user-man.html

31 Pathway Tools Data Access Mechanisms

32 Introduction
- MANY ways to access and update PGDBs
- APIs in Java, Perl, and Lisp
- Import/export of files in many formats
- Registry of Pathway/Genome Databases
- Import PGDB data into BioWarehouse
- Updating a PGDB from an external genome DB

33 Pathway Tools APIs
- Support programmatic queries and updates to PGDBs
- APIs in Java, Perl, and Lisp all provide access to a common set of procedures:
  - Generic Frame Protocol: Ocelot object database API
  - Additional Pathway Tools functions
- For more information see
  - http://bioinformatics.ai.sri.com/ptools/ptools-resources.html

34 Generic Frame Protocol (GFP)
- A library of procedures for accessing Ocelot DBs
- GFP specification:
  - http://www.ai.sri.com/~gfp/spec/paper/paper.html
- A small number of GFP functions are sufficient for most complex queries
- Knowledge of Pathway Tools schema is critical for using the APIs:
  - Appendix I of Pathway Tools User’s Guide, Vol I

35 Generic Frame Protocol
- get-class-all-instances (Class)
  - Returns the instances of Class
- Key Pathway Tools classes:
  - Genetic-Elements
  - Genes
  - Proteins
  - Polypeptides (a subclass of Proteins)
  - Protein-Complexes (a subclass of Proteins)
  - Pathways
  - Reactions
  - Compounds-And-Elements
  - Enzymatic-Reactions
  - Transcription-Units
  - Promoters
  - DNA-Binding-Sites
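
The effect of asking for the instances of a class that has subclasses can be sketched as follows. This is a Python toy, not the Lisp function itself: the two lookup tables are hypothetical, and the sketch assumes that instances of subclasses (Polypeptides, Protein-Complexes) count as instances of the parent class (Proteins).

```python
# Hypothetical schema tables for the sketch; a real PGDB stores these as frames.
SUBCLASSES = {"Proteins": ["Polypeptides", "Protein-Complexes"]}
DIRECT_INSTANCES = {
    "Polypeptides": ["Sdh-flavo", "Sdh-Fe-S"],
    "Protein-Complexes": ["Succinate dehydrogenase"],
}


def get_class_all_instances(cls):
    """Return direct instances of cls followed by the instances of all its subclasses."""
    instances = list(DIRECT_INSTANCES.get(cls, []))
    for subclass in SUBCLASSES.get(cls, []):
        instances.extend(get_class_all_instances(subclass))
    return instances


print(get_class_all_instances("Proteins"))
# ['Sdh-flavo', 'Sdh-Fe-S', 'Succinate dehydrogenase']
```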

36 Generic Frame Protocol
- Notation Frame.Slot means a specified slot of a specified frame
- get-slot-value(Frame Slot)
  - Returns first value of Frame.Slot
- get-slot-values(Frame Slot)
  - Returns all values of Frame.Slot as a list
- slot-has-value-p(Frame Slot)
  - Returns T if Frame.Slot has at least one value
- member-slot-value-p(Frame Slot Value)
  - Returns T if Value is one of the values of Frame.Slot
- print-frame(Frame)
  - Prints the contents of Frame
- Note: Frame and Slot must be symbols!
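
The behavior of the four read operations can be mimicked over a plain dictionary. The functions below are Python stand-ins named after the GFP procedures, written only to show what each one returns; they are not the Lisp, Java, or Perl API, and the synonym values are placeholders.

```python
# Toy KB: frame ID -> slot name -> list of values. "syn-1"/"syn-2" are placeholders.
kb = {"sdhA": {"Product": ["Sdh-flavo"], "Synonyms": ["syn-1", "syn-2"]}}


def get_slot_values(frame, slot):
    """All values of frame.slot as a list (empty if the slot has none)."""
    return list(kb.get(frame, {}).get(slot, []))


def get_slot_value(frame, slot):
    """First value of frame.slot, or None if the slot has no value."""
    values = get_slot_values(frame, slot)
    return values[0] if values else None


def slot_has_value_p(frame, slot):
    """True if frame.slot has at least one value."""
    return bool(get_slot_values(frame, slot))


def member_slot_value_p(frame, slot, value):
    """True if value is one of the values of frame.slot."""
    return value in get_slot_values(frame, slot)


print(get_slot_value("sdhA", "Synonyms"))                  # syn-1
print(get_slot_values("sdhA", "Synonyms"))                 # ['syn-1', 'syn-2']
print(slot_has_value_p("sdhA", "Comment"))                 # False
print(member_slot_value_p("sdhA", "Product", "Sdh-flavo")) # True
```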

37 Generic Frame Protocol
- coercible-to-frame-p (Thing)
  - Returns T if Thing is the name of a frame, or a frame object
- save-kb
  - Saves the current KB

38 Generic Frame Protocol – Update Operations
- put-slot-value(Frame Slot Value)
  - Replace the current value(s) of Frame.Slot with Value
- put-slot-values(Frame Slot Value-List)
  - Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values
- add-slot-value(Frame Slot Value)
  - Add Value to the current value(s) of Frame.Slot, if any
- remove-slot-value(Frame Slot Value)
  - Remove Value from the current value(s) of Frame.Slot
- replace-slot-value(Frame Slot Old-Value New-Value)
  - In Frame.Slot, replace Old-Value with New-Value
- remove-local-slot-values(Frame Slot)
  - Remove all of the values of Frame.Slot
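
The difference between the update operations is easiest to see on a small example. The functions below are Python stand-ins that follow the slide's descriptions over a toy dictionary; they are not the real API, the helper `_values` is hypothetical, and the frame ID and compound names are placeholders.

```python
# Toy KB: frame ID -> slot name -> list of values. IDs and values are placeholders.
kb = {"EnzRxn-1": {"Inhibitors": ["cmpd-a"]}}


def _values(frame, slot):
    """Hypothetical helper: the mutable value list of frame.slot, created on demand."""
    return kb.setdefault(frame, {}).setdefault(slot, [])


def put_slot_values(frame, slot, value_list):
    """Replace the current values of frame.slot with value_list."""
    kb.setdefault(frame, {})[slot] = list(value_list)


def put_slot_value(frame, slot, value):
    """Replace the current values of frame.slot with the single value."""
    put_slot_values(frame, slot, [value])


def add_slot_value(frame, slot, value):
    """Add value to the current values of frame.slot."""
    _values(frame, slot).append(value)


def remove_slot_value(frame, slot, value):
    """Remove value from the current values of frame.slot."""
    kb[frame][slot] = [v for v in _values(frame, slot) if v != value]


def replace_slot_value(frame, slot, old_value, new_value):
    """In frame.slot, replace old_value with new_value."""
    kb[frame][slot] = [new_value if v == old_value else v for v in _values(frame, slot)]


def remove_local_slot_values(frame, slot):
    """Remove all values of frame.slot."""
    put_slot_values(frame, slot, [])


add_slot_value("EnzRxn-1", "Inhibitors", "cmpd-b")               # cmpd-a, cmpd-b
replace_slot_value("EnzRxn-1", "Inhibitors", "cmpd-a", "cmpd-c")  # cmpd-c, cmpd-b
remove_slot_value("EnzRxn-1", "Inhibitors", "cmpd-b")             # cmpd-c
print(kb["EnzRxn-1"]["Inhibitors"])  # ['cmpd-c']
```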

39 Additional Pathway Tools Functions – Semantic Inference Layer
- Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB
- http://bioinformatics.ai.sri.com/ptools/ptools-fns.html

40 Internal note
Note: Refer to local copy of ptools-fns.html to go through the semantic inference layer functions

41 File Import/Export Capabilities
PGDBs can be exported in whole or part to:
- SBML – Systems Biology Markup Language – sbml.org
  - Import supported by many simulation packages
  - File -> Export -> Selected Reactions to SBML File
- Pathway Tools Attribute-Value format and column-delimited format files
  - http://brg.ai.sri.com/ptools/flatfile-format.shtml
  - Dump entire PGDB to a suite of files: File -> Export -> Entire DB to Flat Files
  - Dump selected frames to a single file: File -> Export -> Selected Frames to File
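
An exported attribute-value file could be read back with a few lines of code. The sketch below rests on an assumption about the layout that the slide does not show: one `ATTRIBUTE - value` pair per line, `#` comment lines, and `//` closing each record. Check that against the flat-file format page before relying on it. Continuation lines are not handled, and the sample record is invented.

```python
def parse_attribute_value(text):
    """Parse records of 'ATTRIBUTE - value' lines, each record closed by '//'.

    The layout is an assumption for this sketch; repeated attributes are
    collected into a list so multivalued slots survive.
    """
    records, current = [], {}
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "//":
            if current:
                records.append(current)
            current = {}
            continue
        if " - " in line:
            attribute, value = line.split(" - ", 1)
            current.setdefault(attribute, []).append(value)
    if current:  # tolerate a final record without a closing '//'
        records.append(current)
    return records


# Invented sample data, for illustration only.
sample = """# example export
UNIQUE-ID - FRAME-1
COMMON-NAME - example frame
SYNONYMS - alias one
SYNONYMS - alias two
//
UNIQUE-ID - FRAME-2
//
"""

for record in parse_attribute_value(sample):
    print(record["UNIQUE-ID"][0], record.get("SYNONYMS", []))
```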

42 Import/Export
- Import from attribute-value or column-delimited files
  - File -> Import -> Frames From File
- Import/Export to/from internal Pathway Tools format that allows pathways, reactions, enzymes, and compounds to be easily moved between Pathway Tools installations
  - Edit -> Add Pathway to File Export List
  - File -> Export -> Selected Pathways to File
  - File -> Import -> Pathways from File
- Import/Export to/from MDL molfile format
  - Edit -> Import compound structure from molfile
  - Edit -> Export compound structure to molfile

43 Miscellaneous Exports
- Overview -> Highlight -> Save to File
- Overview -> Highlight -> Load from File
- Gene / Protein Sequence / Save to file
- Chromosome -> Show Sequence of a Segment of Replicon

44 Napster Comes to Bioinformatics
- Public sharing of Pathway/Genome Databases
  - PGDB registry maintained by SRI at URL http://biocyc.org/registry.html
- Registry operations
  - List contents of registry
  - Download PGDBs listed in the registry
  - Register PGDBs you have created

45 Registry Details
- Why register your PGDB?
  - Declare existence of your PGDB in a central location
  - Facilitate download by other scientists
- Why download a PGDB?
  - Desktop Navigator provides more functionality than Web
  - Comparative operations
  - Programmatic querying and processing of PGDB
- Registration process
  - Registered PGDBs have open availability by default
  - Authors can provide their own license agreements
  - Registered PGDBs reside on authors’ FTP site

46 BioWarehouse
Biospice.org

47 New Import/Export Tools
- Suggestions?
- Volunteers?

48 Updating a PGDB From an External Genome DB
- Example: AraCyc forms a pathway module to the TAIR DB
- TAIR is authoritative source for gene and gene-product information
- Update AraCyc to reflect updates in TAIR

49 Proposed Approach
- Export TAIR to PathoLogic files
- Build AraCyc2 from those PathoLogic files (automated PathoLogic only)
- Compare AraCyc1 (A1) to AraCyc2 (A2)
  A. Import new genes/proteins from A2 to A1
  B. Delete from A1 genes/proteins not found in A2
  C. Rename genes/proteins whose names changed from A2 to A1
  - Run name matcher on A1’
  - Check for pathways with no enzymes and report them so user can keep any that otherwise PathoLogic will delete
    - What about enzymes that were assigned to a pathway by the hole filler?
  - Re-run pathway predictor
  - Remember what pathways user deletes so they are not re-predicted by PathoLogic
  - Consider movement of genes from contig to chromosome
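
Steps A to C amount to a set comparison between the two builds. The sketch below assumes that genes keep a stable ID across A1 and A2 so that they can be matched, which the slide does not state; the function name `plan_update` and all IDs and names are hypothetical.

```python
def plan_update(a1, a2):
    """Compare two builds given as dicts of gene ID -> gene name.

    Returns (to_import, to_delete, to_rename):
      A. IDs present only in A2, to import into A1
      B. IDs present only in A1, to delete from A1
      C. IDs in both whose name differs, mapped to the name found in A2
    """
    to_import = sorted(set(a2) - set(a1))
    to_delete = sorted(set(a1) - set(a2))
    to_rename = {
        gene: a2[gene]
        for gene in sorted(set(a1) & set(a2))
        if a1[gene] != a2[gene]
    }
    return to_import, to_delete, to_rename


# Invented example data.
a1 = {"G1": "abc1", "G2": "def2", "G3": "ghi3"}
a2 = {"G1": "abc1", "G2": "def2b", "G4": "jkl4"}

print(plan_update(a1, a2))  # (['G4'], ['G3'], {'G2': 'def2b'})
```

The later steps (name matcher, pathway checks, re-running the pathway predictor) depend on Pathway Tools itself and are not sketched here.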

