The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

1 The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

2 Project Aim
- Design an online portal to search and visualise protein complexes
- Include cross-references to source databases and beyond
- Export to interested parties in a format of their choice
- Incorporate the data into network analysis tools
- Emphasis on major model organisms, chosen to span the taxonomic range: Homo sapiens, Saccharomyces cerevisiae, Escherichia coli, Mus musculus, Caenorhabditis elegans, Drosophila melanogaster, Schizosaccharomyces pombe, Arabidopsis thaliana
- All data held in the IntAct DB: shared editor, protein update mechanism, QC procedures
- Separate search and visualisation facility: wwwdev.ebi.ac.uk/intact/complex/

3 The complex portal
- Create stable complex identifiers
- Joint curation effort
- Benefit to all collaborating databases: resource sharing, elimination of redundancies
- Benefit to user: one central resource that links to all source databases

4 Definition: stable protein complexes
A stable set (2 or more) of interacting protein molecules which can be co-purified and have been shown to exist as a functional unit in vivo. Non-protein molecules (e.g. small molecules, nucleic acids) may also be present in the complex.
What is not a stable complex?
- Two proteins associated in a pulldown / coimmunoprecipitation with no functional link
- Enzyme/substrate, receptor/ligand or similar transient interactions
- Exception: an obligate complex that requires substrate/ligand, e.g. PDGF receptors

5 Source Databases
- PDBe (EBI): almost 1000 complexes imported
- ChEMBL (EBI): 81 complexes imported, more to come with each release
- MatrixDB (Sylvie Ricard-Blum, Univ. of Lyon)
- Mining UniProt: yeast (Bernd Roechert, SIB, manually)
- Reactome: human (EBI)
- Manual curation from IMEx DBs & the literature
- Gramene: Arabidopsis
- Unmaintained web resources: CYGD (yeast), CORUM (human), E. coli website, 3D Complexes (Sarah Teichmann, EBI)

6 Data captured currently for IntAct complexes
- Participants: proteins (UniProt), small molecules (ChEBI), nucleic acids (Ensembl, ChEBI, RNACentral?)
- Species
- Stoichiometry, when known
- Topology (= binding sites), when known

7 Data captured currently for IntAct complexes
Complex-specific, free-text annotation fields:
- Function and context: UniProt-style (visible in search results)
- Assembly, e.g. homodimer, heterotetramer
- Physical properties, e.g. MW, size, topology/assembly
- Ligands
- Disease

8 Data captured currently for IntAct complexes
Complex names:
- Recommended name: most recognisable name from the literature; use the GO component name if the specific complex exists in GO
- Systematic name: based on Reactome’s new CV names, i.e. a ‘string of gene names with stoichiometry’
- Synonyms: all other names the complex may be known as
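The ‘string of gene names with stoichiometry’ idea can be illustrated with a short sketch. The slide does not give the actual naming syntax, so the `2xGENE` prefix, the `:` separator, the alphabetical ordering and the function name `systematic_name` are all assumptions made for illustration, not the Reactome or Complex Portal convention.

```python
from typing import Mapping


def systematic_name(stoichiometry: Mapping[str, int]) -> str:
    """Build a name from gene names and copy numbers.

    Hypothetical format: genes sorted alphabetically, joined by ':',
    each prefixed with '<n>x' when the copy number is known.
    A copy number of 0 stands for 'stoichiometry unknown'.
    """
    parts = []
    for gene in sorted(stoichiometry):
        copies = stoichiometry[gene]
        parts.append(f"{copies}x{gene}" if copies > 0 else gene)
    return ":".join(parts)


# Illustrative input only; the copy numbers are not taken from the slides.
print(systematic_name({"PDGFRB": 2, "PDGFB": 2}))
```

Sorting the gene names makes the result independent of the order in which participants were entered, which is one plausible way to keep such names stable.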

9 Data captured currently for IntAct complexes
- Structured annotation using GO (BP, MF, CC)
- Cross references to experimental evidence: IMEx (+ non-IMEx IntAct & DIP), PDB, EMDB
- Cross references to related complex data: Reactome (human), ChEMBL
- Further cross references: PubMed (for further information), IntEnz (enzyme EC numbers), OMIM (disease), ECO (Evidence Code Ontology)

10 Parallel Annotation of complexes in GO
- Project start: > 400 complex terms in GO CC, mostly children of GO:0043234 protein complex, lacking hierarchical structure
- Good collaboration with GO to provide structured annotation
- Parent terms mainly based on complex function
- TermGenie (TG) Standard Form where possible; otherwise use TG Free Form
- Some complexes are still direct children of GO:0043234 protein complex
- Adding “logical definitions” / “cross-products” / “extensions”, e.g. “capable_of x activity”

11 ECO – Evidence Code Ontology
- ECO:0000353: physical interaction evidence used in manual assertion (= IPI). Used when full experimental evidence for the complex is present.
- ECO:0000266: sequence orthology evidence used in manual assertion (= ISO). Used when only limited experimental evidence exists for a complex in one species (e.g. mouse), but it is desirable to curate the complex, it has been curated in another species (e.g. human), and orthologous gene products exist in the former species, e.g. PDGFs.
- ECO:0000306: inference from background scientific knowledge used in manual assertion. Used when no or only partial experimental evidence can be found but the complex is generally assumed to exist, e.g. GABA receptors, which exist in ChEMBL.
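The three rules above amount to a small decision procedure, sketched below. The ECO identifiers are the ones on the slide; the function name `choose_eco` and its boolean parameters are hypothetical simplifications of what is in practice a curator's judgement.

```python
ECO_IPI = "ECO:0000353"         # physical interaction evidence (manual assertion)
ECO_ISO = "ECO:0000266"         # sequence orthology evidence (manual assertion)
ECO_BACKGROUND = "ECO:0000306"  # inference from background scientific knowledge


def choose_eco(full_experimental_evidence: bool,
               curated_in_orthologous_species: bool,
               generally_assumed_to_exist: bool) -> str:
    """Pick an evidence code following the order of the rules on the slide."""
    if full_experimental_evidence:
        return ECO_IPI
    if curated_in_orthologous_species:
        # Limited evidence here, but the complex is curated in another
        # species and orthologous gene products exist.
        return ECO_ISO
    if generally_assumed_to_exist:
        return ECO_BACKGROUND
    raise ValueError("no evidence code applies; the complex should not be curated yet")


# A mouse complex inferred from its curated human counterpart:
print(choose_eco(False, True, False))
```

Checking the strongest evidence first means a complex with full experimental support is never downgraded to an inferred code.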

12 Download
- At present: one PSI-MI XML 2.5.4 file for all complexes on the FTP site
- From the next IntAct release: one file per complex within a folder per species on the FTP site, plus a zip file per species
- Future:
  - Separate files for each complex, accessible on each complex details page
  - List of files for complexes from the search results list
  - Database-specific dumps
  - Network-analysis-appropriate format (as developed by MIPS)
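As a rough illustration of consuming such a download, the sketch below lists the short labels of the `interaction` elements in a PSI-MI XML 2.5 document using only the Python standard library. It assumes each complex is exported as an `interaction` element with a `names/shortLabel` child, which should be checked against the PSI-MI schema and the actual file; the sample document and its labels are invented.

```python
import xml.etree.ElementTree as ET
from typing import List


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def complex_labels(xml_text: str) -> List[str]:
    """Return the shortLabel of every <interaction> in the document."""
    labels = []
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != "interaction":
            continue
        # Only look at the interaction's own <names>, not its participants'.
        for names in element:
            if local_name(names.tag) != "names":
                continue
            for child in names:
                if local_name(child.tag) == "shortLabel" and child.text:
                    labels.append(child.text.strip())
    return labels


# Invented miniature document, for illustration only.
SAMPLE = """<entrySet xmlns="net:sf:psidev:mi" level="2" version="5">
  <entry><interactionList>
    <interaction id="1"><names><shortLabel>complex-a</shortLabel></names></interaction>
    <interaction id="2"><names><shortLabel>complex-b</shortLabel></names></interaction>
  </interactionList></entry>
</entrySet>"""

print(complex_labels(SAMPLE))
```

Matching on local names rather than a hard-coded namespace keeps the sketch usable if the namespace URI differs from the one assumed in the sample.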

13 Project status
- Website will move to the production site at the end of March
- Further development (particularly graphics) will be made public over the next 6 months
- Curation priorities: human (mouse), yeast, E. coli, user requests
- Exports to GOA (process and component) and UniProt under discussion

14 Future Plans - Display
- Add search filters, e.g. species (almost done), GO terms, ECO
- Advanced Search
- Links to ‘experimental evidence’ and ‘related complexes’ searches
- Schematic view of complex
- Add existing widgets/BioJS components to show content from other databases directly in the Complex Portal: crystal structure, pathway, enzyme reactions etc.

15 Future Plans - Functionality
- Concept of ‘sets’: important for Reactome import
- Hierarchy of complex sets -> specific complex -> sub-complex
- Introducing features to indicate, e.g., complex-drug binding sites
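One way to picture the planned set -> complex -> sub-complex hierarchy is as a simple tree. This is a sketch of a possible data structure only, not the IntAct data model; the class `ComplexNode`, the `kind` labels and the example names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ComplexNode:
    name: str
    kind: str  # one of "set", "complex", "sub-complex" (assumed labels)
    children: List["ComplexNode"] = field(default_factory=list)


def walk(node: ComplexNode, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, kind, name) for the node and its descendants, depth first."""
    yield depth, node.kind, node.name
    for child in node.children:
        yield from walk(child, depth + 1)


# Placeholder names, for illustration only.
tree = ComplexNode("example set", "set", [
    ComplexNode("example complex", "complex", [
        ComplexNode("example core", "sub-complex"),
    ]),
])

for depth, kind, name in walk(tree):
    print("  " * depth + f"{kind}: {name}")
```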

16 Complexes on demand
1. Request via ‘Contact us’ button
   1. Name & components
   2. Experimental paper
   3. Full details including function, stoichiometry and topology
... or we give you access to the editor to create your own

18 Summary of ‘User Survey’ and own goals

19 Summary of ‘User Survey’ - Search

20 Summary of ‘User Survey’ - Display

21 Summary of ‘User Survey’ - Features Expression Atlas?

22 Summary of ‘User Survey’ - Features Manually for mouse ECOxref to exp-evidence

23 Summary of ‘User Survey’ - Features Definition??? Reactome

24 Summary of ‘User Survey’ - Features

25 Summary of ‘User Survey’ - Downloads

26 IntAct and Complex Portal homepage

27 Complex Portal UniProt-style display

28 Complex Portal tab-style display

