Protein Ontology (PRO)
PRO-PO-GO Meeting, Amherst, NY, May 15, 2013
Cathy H. Wu, Ph.D.
Director, Protein Information Resource (PIR)
Edward G. Jefferson Chair and Director, Center for Bioinformatics & Computational Biology, University of Delaware
Professor of Biochemistry & Molecular Biology, Georgetown University
Protein Ontology (PRO)
Reference Ontology for Proteins
PRO in OBO Foundry: one of the first set of six OBO Foundry ontologies
The Protein Ontology: a structured representation of protein forms and complexes. Natale DA, Arighi CN, Barker WC, Blake JA, Bult CJ, Caudy M, Drabkin HJ, D'Eustachio P, Evsikov AV, Huang H, Nchoutmboube J, Roberts NV, Smith B, Zhang J, Wu CH. (2011) Nucleic Acids Res. 39, D [PMID: ]
Protein Ontology (PRO)
PRO in OBO Foundry
– Reference Ontology for Proteins
– Capture knowledge about proteins to model biology
Three sub-ontologies: connect protein types necessary to model biology
– Ontology for Protein Evolution (ProEvo): Captures protein classes reflecting evolutionary relatedness of whole proteins
– Ontology for Protein Forms (ProForm): Captures different protein forms of a given gene locus from genetic variations, alternative splicing, proteolytic cleavage, post-translational modifications (PTMs)
– Ontology for Protein Complexes (ProComp): Captures distinct complexes as they exist in different species; defines complexes through component proteins
PRO Overview: ProEvo
Ontology for Protein Evolution (ProEvo): Captures protein classes reflecting evolutionary relatedness of whole proteins
PRO Overview: ProForm
Ontology for Protein Forms (ProForm): Captures different protein forms of a given gene locus from genetic variations, alternative splicing, proteolytic cleavage, post-translational modifications (PTMs) => “proteoforms”
PRO Overview: ProComp
Ontology for Protein Complexes (ProComp): Captures distinct complexes as they exist in different species; defines complexes through component proteins
Serine Palmitoyltransferase (SPT) Complex
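A minimal sketch of how a complex defined through its component proteins, and specific to one species, might be represented as data. The class `ComplexTerm`, the helper `same_complex`, and every label and identifier below are hypothetical placeholders chosen for illustration; they are not actual PRO terms and do not describe the PRO data model.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ComplexTerm:
    """Hypothetical record: a complex defined by species and component proteins."""
    label: str
    species: str
    components: FrozenSet[str]  # placeholder identifiers of component proteins


def same_complex(a: ComplexTerm, b: ComplexTerm) -> bool:
    """Treat two records as the same complex only if species and components match."""
    return a.species == b.species and a.components == b.components


# Invented example data (placeholders, not real PRO identifiers)
human_cplx = ComplexTerm(
    "complex C (human)", "human", frozenset({"subunit1_human", "subunit2_human"})
)
mouse_cplx = ComplexTerm(
    "complex C (mouse)", "mouse", frozenset({"subunit1_mouse", "subunit2_mouse"})
)
human_variant = ComplexTerm(
    "complex C variant (human)", "human", frozenset({"subunit1_human", "subunit3_human"})
)

# Same name in two species: distinct records under this sketch
print(same_complex(human_cplx, mouse_cplx))
# Same species but a different component set: also distinct
print(same_complex(human_cplx, human_variant))
```

Using a frozen set for the components makes the comparison independent of the order in which subunits are listed.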
Why PRO
– Provides formalization to support precise annotation of specific protein classes/forms/complexes, allowing accurate and consistent data mapping, integration and analysis
– Allows specification of relationships between PRO and other ontologies, such as GO, SO (Sequence Ontology), PSI-MOD, ChEBI, MIM/Disease Ontology, CL (Cell Ontology), PO (Plant Ontology)/cROP
– Provides stable unique identifiers to distinct protein types
– Provides a formal structure to support computer-based reasoning based on homology and shared protein attributes, including “ortho-isoform” and “ortho-modified form”
PRO Framework
– PRO (ProForm, ProEvo, ProComp) is aligned with other OBO Foundry ontologies under the umbrella of the Basic Formal Ontology (BFO)
– PRO terms are defined/annotated using other ontologies and resources via definition of relations or mappings when appropriate
PRO in Biological Context
Representation of protein forms & complexes in pathways/networks
TGF-β Signaling Pathway
Plant PRO Terms
Arabidopsis thaliana: Organism-Gene level terms (UniProtKB) mapped to Gene level terms (PRO) based on PANTHER mapping of 12 reference genomes
Cullin-Related PRO Terms
Plant PRO Terms
Arabidopsis thaliana: Protein classes, forms and complexes related to cullin-containing complexes and brassinosteroid (BR) signaling
Hierarchical View: Cullin-Related Terms
Cytoscape View: Cullin-Related Terms
Cytoscape & Entry Views: Cullin-Related Terms
Cullin-1 in SCF complexes
– SCF complexes formed in response to auxin and jasmonate signaling
– Link to ChEBI for small molecule-containing complexes
– Cullin-1 Rubylated
BR Signaling Pathway
Core proteins and other associated proteins annotated with GO related to BR signaling pathway (blue)
Connecting protein forms and complexes with annotation => Modeling biology
Exploring Relations
iPTM Network
– Substrate-centric: What PTM forms of a protein and their modifying enzymes are known?
– Enzyme-centric: What substrates are known for a given PTM enzyme?
– Interaction: What interacting partners are known for each PTM form of a given protein?
– Pathway: What protein modifications and enzymes are known in a given signaling pathway?
Coupled with functional annotation and biological context (homology, disease, tissue/cell, ...) => Hypothesis generation and discovery
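The four question types above can be sketched as simple queries over a table of PTM events. This is only an illustration under stated assumptions: the record layout (`PTMEvent`), the function names, and all enzyme, protein, and pathway names are invented placeholders, not the actual iPTM network schema or data.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class PTMEvent(NamedTuple):
    """Hypothetical record: an enzyme converts a substrate into a PTM form."""
    enzyme: str
    substrate: str
    ptm_form: str
    pathway: str


# Invented toy data (placeholders only)
EVENTS: List[PTMEvent] = [
    PTMEvent("KinaseA", "ProteinX", "ProteinX-phospho1", "PathwayP"),
    PTMEvent("KinaseB", "ProteinX", "ProteinX-phospho2", "PathwayP"),
    PTMEvent("KinaseA", "ProteinY", "ProteinY-phospho1", "PathwayQ"),
]
# (PTM form, interacting partner) pairs, also invented
INTERACTIONS: List[Tuple[str, str]] = [
    ("ProteinX-phospho1", "ProteinZ"),
    ("ProteinX-phospho2", "ProteinW"),
]


def forms_and_enzymes(substrate: str) -> Set[Tuple[str, str]]:
    """Substrate-centric: PTM forms of a protein and their modifying enzymes."""
    return {(e.ptm_form, e.enzyme) for e in EVENTS if e.substrate == substrate}


def substrates_of(enzyme: str) -> Set[str]:
    """Enzyme-centric: substrates known for a given PTM enzyme."""
    return {e.substrate for e in EVENTS if e.enzyme == enzyme}


def partners_by_form(substrate: str) -> Dict[str, Set[str]]:
    """Interaction: partners known for each PTM form of a given protein."""
    forms = {e.ptm_form for e in EVENTS if e.substrate == substrate}
    return {f: {p for form, p in INTERACTIONS if form == f} for f in forms}


def pathway_events(pathway: str) -> Set[Tuple[str, str]]:
    """Pathway: modified forms and enzymes annotated to a given pathway."""
    return {(e.ptm_form, e.enzyme) for e in EVENTS if e.pathway == pathway}


print(forms_and_enzymes("ProteinX"))
print(substrates_of("KinaseA"))
print(partners_by_form("ProteinX"))
print(pathway_events("PathwayQ"))
```

A flat event table keeps each query a one-line filter; a real network resource would more likely use a graph or relational store.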
PRO Dissemination
– PRO Website: searching, browsing, downloading
– PRO Views: entry view, table summary, OBO stanza, OWL, ontology hierarchy, Cytoscape network
– PRO Link: persistent URL
– OBO Foundry
– NCBO BioPortal
– Reciprocal links from/to collaborative resources
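As a rough illustration of what consuming an OBO stanza view could look like, the sketch below reads one `[Term]` stanza written in the OBO flat-file tag-value style into a dictionary. The stanza text, including the `PR:EXAMPLE…` identifiers and names, is invented for this example and is not a real PRO entry; `parse_stanza` is a hypothetical helper that ignores escapes and would mishandle quoted values containing " ! ".

```python
from collections import defaultdict
from typing import Dict, List

# Invented stanza in OBO flat-file style; IDs and names are placeholders.
EXAMPLE_STANZA = """\
[Term]
id: PR:EXAMPLE1
name: example protein form
is_a: PR:EXAMPLE0 ! example parent protein
"""


def parse_stanza(text: str) -> Dict[str, List[str]]:
    """Collect 'tag: value' lines of a single stanza into tag -> list of values."""
    tags: Dict[str, List[str]] = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the stanza header such as [Term], and comment lines
        if not line or line.startswith("[") or line.startswith("!"):
            continue
        tag, _, value = line.partition(":")
        # Drop a trailing ' ! comment' (simplification, see the note above)
        value = value.split(" ! ", 1)[0].strip()
        tags[tag.strip()].append(value)
    return dict(tags)


parsed = parse_stanza(EXAMPLE_STANZA)
print(parsed)
```

Values are kept as lists because tags such as `is_a` may legitimately appear more than once in a stanza.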