Running the Transport Inference Parser

Transport Inference Parser: Inferring Transport Reactions from Protein Data for PGDBs

Running the Transport Inference Parser
1. Run Pathway Tools.
2. Make the organism of interest the current organism.
3. [Run operon predictor].
4. Select Tools/Pathologic.
5. From Pathologic, select Refine/Transport Inference Parser.
6. If running TIP for the first time on the organism, optionally provide its aerobicity.
7. Wait and observe progress.
8. When complete, Probable Transporter Table window appears.
9. You may now review and modify the inferred transporters.
TIP can take significant time to run, so we start it and let it run during the first portion of the talk. The operon predictor is under the Pathologic/Refine menu (Transcription Unit Prediction).

Background
Implemented in consultation with Ian Paulsen.
Reference: Annotation-based inference of transporter function. Thomas J. Lee, Ian Paulsen and Peter Karp. Bioinformatics, vol. 24, pp. 259-67, 2008.

Purpose of TIP
Infer transport reactions from protein data and construct them in BioCyc PGDBs.
Present the predictions for review so that each can be accepted, rejected, or modified.

Results of running TIP
Add the following to the PGDB for each inferred transported substrate:
  Transport-Reaction frame of correct subclass
  Assign compartments, using simple assumptions
  Enzymatic-Reaction frame linking protein to reaction
  Construct Protein-Complexes as required
  Evidence codes and provenance data added to these

Sequence of internal operations
1. Find candidate transporter proteins.
2. Filter out candidates.
3. Identify substrate(s).
4. Assign an energy coupling to transporter.
5. Identify compartment of each substrate.
6. Group subunits of transporter complexes.
7. Construct full compartmental reaction from substrate and coupling.
8. Construct enzymatic reaction linking each reaction with protein.

1. Find candidate transporter proteins
Input: all protein frames of organism
Output: internal data structures for each candidate
Annotation must contain an indicator. Exs: “transport”, “export”, “permease”, “channel”
Exclude proteins with long annotations (default: 12 words)
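The candidate test above can be sketched in a few lines of Python. This is only an illustration, not TIP's actual code: the function name is hypothetical, indicators are matched as substrings, and "long" is assumed to mean more than 12 words.

```python
# Indicator words taken from the slide; substring matching is an assumption.
INDICATORS = ("transport", "export", "permease", "channel")
MAX_WORDS = 12  # default limit from the slide; "more than 12" is an assumption


def is_candidate(annotation: str, max_words: int = MAX_WORDS) -> bool:
    """Return True if the annotation looks like a transporter candidate."""
    text = annotation.lower()
    if len(text.split()) > max_words:
        return False  # long annotations are excluded
    return any(indicator in text for indicator in INDICATORS)
```

For example, `is_candidate("sodium channel")` is true, while an annotation with no indicator word, such as "dna polymerase", is rejected.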

2. Filter candidates
Exclude if annotation matches a list of regular expressions of counterindicator phrases and patterns. Ex: “transport associated domain”
Exclude if annotation contains counterindicator word. Exs: “regulator”, “nuclear-export”
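A minimal sketch of the two exclusion rules, assuming the counterindicator lists are small in-memory collections. The regular expression and the tokenization below are illustrative guesses; the real lists used by TIP are not shown in the slides.

```python
import re

# Hypothetical pattern built from the slide's single example phrase.
COUNTER_PATTERNS = [re.compile(r"transport[- ]associated domain")]
# Counterindicator words from the slide.
COUNTER_WORDS = {"regulator", "nuclear-export"}


def passes_filter(annotation: str) -> bool:
    """Return False if the annotation hits any counterindicator."""
    text = annotation.lower()
    if any(pattern.search(text) for pattern in COUNTER_PATTERNS):
        return False
    # Split on whitespace and common separators; hyphenated words stay whole.
    words = set(re.split(r"[\s,;/]+", text))
    return not (words & COUNTER_WORDS)
```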

3. Identify substrate(s)
Search annotation for names of MetaCyc compounds.
Details:
  Multiple substrates indicate multiple reactions, symport/antiport pair, or both. Exs:
    “cytosine/purines/uracil/thiamine/allantoin permease family protein”
    “magnesium and cobalt transport protein cora, putative”
    “sodium:sulfate symporter transmembrane domain protein”
    “probable agcs sodium/alanine/glycine symporter”
  Exclude non-substrates that look like compounds via an exception list. Exs: “as” “be” “c” “i”

3. Identify substrate(s) (cont.)
  Name canonicalization. Ex: strip plurals.
  Affixed substrates. Exs: “-transporting” “-specific”
  Lookup special ionic forms. Exs: “cuprous” “ferric” “hydrogen”
  Resolve multivalent options using aerobicity. Exs: “FE” “CR” “MN”
  Two-word substrates, substrate classes (no 3+ word substrates). Ex: “amino acid”
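The substrate search can be illustrated with a toy scanner. Everything here is an assumption made for the sketch: the compound set is a tiny stand-in for the MetaCyc name list, the delimiters are guessed from the example annotations, and ionic forms and aerobicity are left out.

```python
import re

# Tiny stand-in for the MetaCyc compound names (illustration only).
KNOWN_COMPOUNDS = {"sodium", "sulfate", "alanine", "glycine",
                   "magnesium", "cobalt", "amino acid"}
EXCEPTIONS = {"as", "be", "c", "i"}       # exception list from the slide
AFFIXES = ("-transporting", "-specific")  # affixes from the slide


def strip_affix(token):
    """Remove a trailing affix such as '-transporting'."""
    for affix in AFFIXES:
        if token.endswith(affix):
            return token[: -len(affix)]
    return token


def lookup(name):
    """Return the canonical compound name, or None if there is no match."""
    if name in EXCEPTIONS:
        return None
    if name in KNOWN_COMPOUNDS:
        return name
    if name.endswith("s") and name[:-1] in KNOWN_COMPOUNDS:
        return name[:-1]  # strip a plural
    return None


def find_substrates(annotation):
    """Scan an annotation for one- and two-word substrate names."""
    tokens = [strip_affix(t)
              for t in re.split(r"[\s/:,]+", annotation.lower()) if t]
    found = []
    i = 0
    while i < len(tokens):
        # Try a two-word name first, then fall back to a single word.
        pair = lookup(" ".join(tokens[i:i + 2])) if i + 1 < len(tokens) else None
        hit = pair or lookup(tokens[i])
        if hit and hit not in found:
            found.append(hit)
        i += 2 if pair else 1
    return found
```

Trying two-word names before single words keeps “amino acid” from being split, and no window longer than two words is ever tried, matching the "no 3+ word substrates" rule.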

4. Assign an energy coupling.
Couplings: Channel, Secondary, ATP, PTS, Unknown
Search annotation for prioritized list of indicators. Exs:
  "atp-binding" => ATP
  "mfs" => SECONDARY
  "pts" => PTS
  "phosphotransferase" => PTS
  "carrier" => SECONDARY
  "channel" => CHANNEL
Some substrates imply a coupling. Ex: protoheme => ATP
Absence of indicator => UNKNOWN
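A prioritized indicator search like the one above might look as follows. This is a sketch under assumptions: the slide order is taken as the priority order, indicators are matched as substrings (which can over-match short indicators such as "pts"), and substrate-implied couplings are consulted only when no indicator is found.

```python
# (indicator, coupling) pairs in the order shown on the slide.
COUPLING_INDICATORS = [
    ("atp-binding", "ATP"),
    ("mfs", "SECONDARY"),
    ("pts", "PTS"),
    ("phosphotransferase", "PTS"),
    ("carrier", "SECONDARY"),
    ("channel", "CHANNEL"),
]
# Substrates that imply a coupling; only the slide's example is listed.
SUBSTRATE_COUPLINGS = {"protoheme": "ATP"}


def assign_coupling(annotation, substrates=()):
    """Return the first matching coupling, or 'UNKNOWN' if none matches."""
    text = annotation.lower()
    for indicator, coupling in COUPLING_INDICATORS:
        if indicator in text:
            return coupling
    for substrate in substrates:
        if substrate in SUBSTRATE_COUPLINGS:
            return SUBSTRATE_COUPLINGS[substrate]
    return "UNKNOWN"
```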

5. Identify compartment of each substrate.
Use keywords to determine compartment of primary substrate (Exs: “export”, “antiporter”)
Otherwise assume primary substrate is transported into cell (periplasm => cytoplasm)
Deferred complex compartment analysis: Assume E. coli-like cellular structure
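As a rough sketch of this rule, the direction of the primary substrate could be chosen as below. The function name is hypothetical, and treating both keywords as "outward" is an assumption read off the notional calcium/proton antiporter example later in the talk, where Ca+2 moves from cytoplasm to periplasm.

```python
# Keywords from the slide, assumed here to indicate outward transport.
OUTWARD_KEYWORDS = ("export", "antiporter")


def primary_direction(annotation):
    """Return (source, destination) compartments for the primary substrate."""
    text = annotation.lower()
    if any(keyword in text for keyword in OUTWARD_KEYWORDS):
        return ("cytoplasm", "periplasm")
    # Default from the slide: transported into the cell.
    return ("periplasm", "cytoplasm")
```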

6. Group subunits of transporter complexes.
Many transporters are systems of several proteins. These are grouped into complexes.
Grouping criteria; all must be met:
  Predicted coupling is ATP or PEP
  Predicted substrates are identical
  Genes of proteins have a common operon (NOTE requirement on operon availability)
Resulting complex is added to the PGDB as a Protein-Complexes frame.
Complexes may be created manually prior to running or re-running TIP. This may be useful if the organism has no operons.
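The three grouping criteria amount to bucketing predictions by a shared key. The sketch below is illustrative only: the `Prediction` record is hypothetical, and "PEP" is assumed to correspond to the PTS coupling label used earlier.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumption: "PEP" on the slide corresponds to the PTS coupling label.
COMPLEX_COUPLINGS = {"ATP", "PTS"}


@dataclass(frozen=True)
class Prediction:  # hypothetical record, not a Pathway Tools type
    gene: str
    coupling: str
    substrates: FrozenSet[str]
    operon: Optional[str]  # None when no operon is known for the gene


def group_complexes(predictions):
    """Group genes that share coupling, substrates, and operon."""
    buckets = defaultdict(list)
    for p in predictions:
        if p.coupling in COMPLEX_COUPLINGS and p.operon is not None:
            buckets[(p.coupling, p.substrates, p.operon)].append(p.gene)
    # Only buckets with at least two proteins form a complex.
    return [sorted(genes) for genes in buckets.values() if len(genes) > 1]
```

Predictions without an operon never join a bucket, which reflects the note that grouping depends on operon availability.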

7. Construct full compartmental reaction from substrate and coupling.
Determine set of transported substrates for this transporter:
  For SECONDARY coupling:
    Identify auxiliary substrate providing ion gradient (H+, Na+)
    Remove from transported substrate list
    Place on side of reaction indicated by symport/antiport clues
  For other couplings:
    Determined previously in substrate analysis
Strictly a mechanical step, except some additional inference is done for secondary transporters.

7. Construct full compartmental reaction from substrate and coupling (cont).
For each transported substrate of this transporter, either import a reaction (from E. coli) or create a new one.
Search import KB for reaction with matching substrates: (find-rxn-by-substrates)
  Transported substrate added with indicated compartment
  Auxiliary substrates determined by coupling. Ex:
    CHANNEL have none
    ATP have ATP/H2O => ADP/phosphate
If one reaction is found, import: (import-reactions trxns src-kb dst-kb …)
If multiple reactions found, retain all.
Else if reaction is not present in PGDB, create new rxn

7. Construct full compartmental reaction from substrate and coupling (cont).
Create new reaction:
  Create reaction frame, subclass determined by coupling: (create-instance-w-generated-id rxn-class)
  Add transported and auxiliary substrates to appropriate sides of reaction
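To make the assembly step concrete, here is a small Python sketch that builds a reaction equation string from a substrate and a coupling. It is not the Lisp code used by TIP: the function and its parameters are hypothetical, compartments are abbreviated [p] and [c] as in the notional table, and only CHANNEL, ATP, and SECONDARY couplings are modeled.

```python
def build_reaction(substrate, coupling, src="p", dst="c",
                   aux_ion=None, antiport=False):
    """Return a reaction equation string for one transported substrate."""
    left = [f"{substrate}[{src}]"]
    right = [f"{substrate}[{dst}]"]
    if coupling == "CHANNEL":
        pass  # channels have no auxiliary substrates
    elif coupling == "ATP":
        left += ["ATP", "H2O"]
        right += ["ADP", "phosphate"]
    elif coupling == "SECONDARY":
        if aux_ion is None:
            raise ValueError("SECONDARY coupling needs an auxiliary ion")
        # Symport: ion travels with the substrate; antiport: against it.
        ion_src, ion_dst = (dst, src) if antiport else (src, dst)
        left.append(f"{aux_ion}[{ion_src}]")
        right.append(f"{aux_ion}[{ion_dst}]")
    else:
        raise ValueError(f"coupling not modeled in this sketch: {coupling}")
    return " + ".join(left) + " = " + " + ".join(right)
```

Calling `build_reaction("Ca+2", "SECONDARY", src="c", dst="p", aux_ion="H+", antiport=True)` gives the calcium/proton antiporter equation in the form used by the notional table.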

8. Construct enzymatic reaction linking each reaction with protein.
For each created reaction: (add-reactions-to-protein …)
  Added evidence code, history string arguments
  Subordinates new [(import-reactions) handles import of enzymatic-reactions]

Running the Transport Inference Parser
1. Run Pathway Tools.
2. Make the organism of interest the current organism.
3. [Run operon predictor].
4. Select Tools/Pathologic.
5. From Pathologic, select Refine/Transport Inference Parser.
6. If running TIP for the first time on the organism, optionally provide its aerobicity.
7. Wait and observe progress.
8. When complete, Probable Transporter Table window appears.
9. You may now review and modify the inferred transporters.
Any attendees who are running TIP are advised to check that it has completed successfully. If not, identify any problems that occurred.

GUI Overview
1. Window is titled: Probable Transporter Table for Organism
2. Table of inferred transporters is organized into columns:
   Status, Gene, Substrate, Coupling, Reaction / Function
3. Each row contains a transport reaction description:
   Multiple reactions per transport protein are possible
   Sort by Gene (the default) to keep them together visually
4. Aggregate pane shows counts by status.
5. Mousing over a reaction shows details in bottom pane.

Probable Transporter Table

Notional Probable Transporter Table
Status      | Gene  | Substrate | Coupling  | Reaction / Annotation
Un-reviewed | T0059 | Ca2+      | SECONDARY | Ca+2[c] + H+[p] = Ca+2[p] + H+[c]
            |       |           |           | calcium/proton antiporter
Rejected    | T3669 | phosphate | ATP       | H2O + ATP + phosphate[p] = ADP + 2 phosphate[c]
            |       |           |           | phosphate transport atp-binding protein
Accepted    | T0080 | Na+       | CHANNEL   | Na+[p] = Na+[c]
            |       |           |           | sodium channel

Reviewing and Editing
Left-click on a row
Dialog box appears
May edit:
  Function (name)
  Energy coupling
May invoke Reaction Editor on reaction
May retract reaction
May update status

TIP Dialog

Transporter Status
Unreviewed: Initial value of status
Accepted:
  Preserves edits
  Incorporates transporter into PGDB upon save
Rejected: Discards transporter upon save
Accept and Reject are undoable

Table row after rejection 

Dialog after rejection
There is an analogous dialog when a transporter is accepted.

Filtering and Sorting
Filtering excludes transporters from display:
  Filter low- or high-confidence transporters (low-confidence usually means ‘no substrate’)
  Filter by status
  Filter by number of reactions per substrate
Sort transporters by columns like a spreadsheet:
  Gene
  Energy Coupling
  Substrate number/name
  Status (e.g., Accepted, Rejected)
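The filter-then-sort behavior can be mimicked on plain records. This is only a sketch of the idea, not the GUI code: the rows are hypothetical dictionaries modeled on the notional table, and only status filtering is shown.

```python
# Rows modeled on the notional table (illustrative data).
rows = [
    {"status": "Unreviewed", "gene": "T0059", "substrate": "Ca2+", "coupling": "SECONDARY"},
    {"status": "Rejected", "gene": "T3669", "substrate": "phosphate", "coupling": "ATP"},
    {"status": "Accepted", "gene": "T0080", "substrate": "Na+", "coupling": "CHANNEL"},
]


def visible_rows(rows, hide_statuses=(), sort_by="gene"):
    """Drop rows with a hidden status, then sort by one column."""
    shown = [row for row in rows if row["status"] not in hide_statuses]
    return sorted(shown, key=lambda row: row[sort_by])
```

With the default arguments, the rows come back ordered by gene, which keeps multiple reactions of one protein next to each other.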

Group Operations
TIP permits en masse acceptance or rejection of remaining predictions being shown:
  Edit / Accept all Unreviewed predictions being shown
  Edit / Reject all Unreviewed predictions being shown
Emphasize that transporters that are currently filtered out are not affected.
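The key point, that a group operation touches only the Unreviewed rows currently displayed, can be shown with a short sketch. The function and the row dictionaries are hypothetical; they are not part of Pathway Tools.

```python
def set_status_of_shown(shown_rows, new_status):
    """Apply new_status to every Unreviewed row in the displayed subset.

    Rows that are filtered out are simply not in shown_rows, so they
    are never modified. Returns the number of rows changed.
    """
    changed = 0
    for row in shown_rows:
        if row["status"] == "Unreviewed":
            row["status"] = new_status
            changed += 1
    return changed
```

A caller would first compute the displayed subset with whatever filters are active and then pass only that subset in.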

Saving Your Work
TIP has made in-memory modifications to the PGDB; nothing is saved until exit from TIP.
  Exit / Save saves all predictions & edits.
  Exit / Cancel reverts to most recent save.
Must exit to save work!
TIP is inconvenient in that it does not support the conventional change/save/change/save paradigm for saving work. One must change/exit/restart/change/exit/restart… The reason is that TIP makes in-memory changes to the PGDB, and allowing incremental changes would leave the PGDB in an unreviewed and potentially incorrect state.

Multisession Workflow
TIP remembers accepted predictions in the KB.
TIP remembers rejected transporters in a file under the organism directory.
To continue, re-run TIP and resume session.
If you don’t resume (i.e., start from scratch):
  Will not re-predict Accepteds (they are in KB)
  Will re-predict Rejecteds

Batch Mode
TIP supports batch mode operation as well as interactive operation.
Run by BRG for all Tier 3 PGDBs (>3000 KBs)
To support both automated and user-controlled operation:
  Distinguish high- and low-confidence inferences
  Automated mode accepts all high-confidence inferences