
1. SRI International Bioinformatics
And now for our ‘Feature’ presentation: Automatic Loading of Protein Sequence Annotation Data from UniProt to Pathway/Genome PGDBs
Tomer Altman
Bioinformatics Research Group, SRI International
taltman@ai.sri.com

2. Protein Features in Pathway Tools
- Represent annotations along a polypeptide sequence
- Can represent anything from active sites to secondary structure
- Are defined by a set of classes rooted at ‘|Protein-Features|
- Are found in the ‘FEATURES slot of ‘|Proteins| instances

3. Protein Features Displayed

4. BioWarehouse UniProt Loader
- Parses the XML versions of the Swiss-Prot and TrEMBL databases
- Loads the Feature table with the corresponding sequence annotation entries
- BioWarehouse is open-source software
- Currently being extended to support alternate sequences and sequence annotation citations
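The parsing step can be illustrated with a minimal Python sketch. This is not the BioWarehouse loader itself; it only shows how sequence annotation rows might be pulled out of UniProt XML using the standard library. The sample entry, its accession, and the function name `parse_features` are made up for illustration, and the sketch assumes every feature location carries known numeric positions.

```python
import xml.etree.ElementTree as ET

# XML namespace used by UniProt entry files.
NS = {"up": "http://uniprot.org/uniprot"}

# Made-up entry for illustration only; not real UniProt data.
SAMPLE_XML = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry dataset="Swiss-Prot">
    <accession>P00001</accession>
    <feature type="transmembrane region" description="Helical">
      <location><begin position="12"/><end position="32"/></location>
    </feature>
    <feature type="metal ion-binding site" description="Zinc">
      <location><position position="45"/></location>
    </feature>
  </entry>
</uniprot>"""


def parse_features(xml_text):
    """Return one dict per <feature>, tagged with the entry's first accession."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.findall("up:entry", NS):
        accession = entry.findtext("up:accession", namespaces=NS)
        for feat in entry.findall("up:feature", NS):
            loc = feat.find("up:location", NS)
            point = loc.find("up:position", NS)
            if point is not None:
                # Single-residue feature: begin and end coincide.
                begin = end = int(point.get("position"))
            else:
                # Range feature; assumes both positions are known integers.
                begin = int(loc.find("up:begin", NS).get("position"))
                end = int(loc.find("up:end", NS).get("position"))
            rows.append({
                "accession": accession,
                "type": feat.get("type"),
                "description": feat.get("description"),
                "begin": begin,
                "end": end,
            })
    return rows


for row in parse_features(SAMPLE_XML):
    print(row)
```

Each resulting dict corresponds roughly to one row of a feature table; a real loader would also have to handle uncertain or missing positions, which this sketch ignores.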

5. Extensions to the Pathway Tools Schema
- Rooted as a sub-class under ‘|Protein-Segments|
- Mirrors protein features available from the UniProt controlled vocabulary
- Makes distinctions between variants due to human activity, variants within an organism, and variants across a strain population

6. UniProt Feature Importer
- PGDB proteins are mapped to entries in UniProt via UniProt accession numbers
- If a protein feature does not already exist in the PGDB, it is imported from UniProt
- Identity is based on the associated protein object, the ‘|Protein-Feature| sub-class, and the location along the protein
- If a previously imported protein feature has been deleted from UniProt, it is removed from the PGDB
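The import and removal rules above amount to a set comparison on feature identity. The following Python sketch shows one way that comparison could look; it is not the importer's actual code. `FeatureKey`, `plan_sync`, and the frame ID `PROT-A` are hypothetical names, and the sketch assumes the set on the PGDB side contains only features that were previously imported, so manually created features are never candidates for removal.

```python
from typing import List, NamedTuple, Set, Tuple


class FeatureKey(NamedTuple):
    """Identity of a feature: protein object, feature sub-class, location."""
    protein: str        # hypothetical PGDB protein frame ID
    feature_class: str  # name of the feature sub-class
    begin: int
    end: int


def plan_sync(
    imported_in_pgdb: Set[FeatureKey],
    in_uniprot: Set[FeatureKey],
) -> Tuple[List[FeatureKey], List[FeatureKey]]:
    """Return (features to import, previously imported features to remove)."""
    to_import = sorted(in_uniprot - imported_in_pgdb)
    to_remove = sorted(imported_in_pgdb - in_uniprot)
    return to_import, to_remove


# Illustrative data only.
pgdb = {
    FeatureKey("PROT-A", "Transmembrane-Regions", 12, 32),
    FeatureKey("PROT-A", "Metal-Binding-Sites", 45, 45),
}
uniprot = {
    FeatureKey("PROT-A", "Transmembrane-Regions", 12, 32),
    FeatureKey("PROT-A", "Mutagenesis-Variants", 70, 70),
}

to_import, to_remove = plan_sync(pgdb, uniprot)
print("import:", to_import)
print("remove:", to_remove)
```

Because identity includes the location, a feature whose coordinates change in UniProt would show up here as one removal plus one import rather than as an update.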

7. Current Statistics for EcoCyc
- 19032 total ‘|Protein-Features| instances (out of 75537 total frames in EcoCyc)
- 2130 manually created instances
- 16902 imported from UniProt
- 5586 ‘|Transmembrane-Regions|
- 1939 ‘|Metal-Binding-Sites|
- 1647 ‘|Mutagenesis-Variants|
- 1146 ‘|Conserved-Regions|
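As a quick consistency check on these counts, the short Python sketch below redoes the arithmetic. It uses only the numbers on the slide; the variable names are illustrative, and the four listed classes are treated simply as a subset of all feature instances, since the slide does not say whether they are exclusively imported ones.

```python
# Counts taken from the slide.
total_frames = 75537
total_features = 19032
manual = 2130
imported = 16902
listed_classes = {
    "Transmembrane-Regions": 5586,
    "Metal-Binding-Sites": 1939,
    "Mutagenesis-Variants": 1647,
    "Conserved-Regions": 1146,
}

# Manual and imported instances should add up to the stated total.
assert manual + imported == total_features

share_of_frames = total_features / total_frames
share_imported = imported / total_features
listed_total = sum(listed_classes.values())

print(f"Features as a share of all frames: {share_of_frames:.1%}")
print(f"Imported share of features: {share_imported:.1%}")
print(f"Instances in the four listed classes: {listed_total}")
```

The manual and imported counts sum exactly to the stated total, and imported features make up roughly 89% of all protein feature instances.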

8. Current Work
- Extending the UniProt Loader to import variant sequence information and citations
- Adding an interface to the UniProt Feature Importer from Pathway Tools
- Creating databases on PublicHouse (a publicly accessible BioWarehouse instance) to allow our users to import protein features into their own PGDBs

9. Acknowledgements
- Alex Shearer
- Suzanne Paley
- Ingrid Keseler
- Valerie Wagner
- EcoCyc.org

