EBI Proteomics Services Team – Standards, Data, and Tools for Proteomics
Henning Hermjakob, European Bioinformatics Institute
SME forum 2009, Vienna
Context
Annotation / Archive
UniProt (proteins), PRIDE (mass spec), IntAct (interactions), Reactome (pathways)
Integration and dissemination: EnVision, DAS
The Proteomics Identifications Database (PRIDE)
Centralized, standards-compliant, public data repository for proteomics identifications
Open source, open data
50,287,408 spectra; 2,555,194 protein identifications
Detailed annotation of metadata
Jones, P, et al: PRIDE: new developments and new datasets. Nucleic Acids Res Jan;36(Database issue):D
PRIDE data content
PRIDE web interface – overview
PRIDE web interface – experiment and protein
PRIDE web interface – mass spectra
PRIDE web interface – project comparison
PRIDE BioMart
The spectacular bit: across-BioMart queries!
Question: “Which proteins identified in PRIDE experiment 2 are involved in nucleotide metabolism?”
Datasets combined: PRIDE, Reactome
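The question on this slide amounts to a join between two datasets on a shared protein identifier. A minimal sketch in plain Python, assuming both datasets expose UniProt accessions. The records, accessions, and pathway names below are made up for illustration, and the sketch does not use the BioMart interface itself:

```python
# Hypothetical records: (experiment id, UniProt accession) pairs
# standing in for PRIDE protein identifications.
pride_identifications = [
    (2, "P00001"),
    (2, "P00002"),
    (2, "Q00003"),
    (3, "P00001"),
]

# Hypothetical pathway annotation standing in for Reactome:
# accession -> set of pathway names.
reactome_pathways = {
    "P00001": {"Nucleotide metabolism"},
    "P00002": {"Apoptosis"},
    "Q00003": {"Nucleotide metabolism", "DNA repair"},
}


def proteins_in_pathway(identifications, pathways, experiment, pathway):
    """Return sorted accessions identified in `experiment` and annotated to `pathway`."""
    identified = {acc for exp, acc in identifications if exp == experiment}
    return sorted(acc for acc in identified if pathway in pathways.get(acc, set()))


result = proteins_in_pathway(
    pride_identifications, reactome_pathways, 2, "Nucleotide metabolism"
)
print(result)
```

The shared accession plays the role of the join key; a federated query does the same matching across two separately hosted datasets.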
The IntAct Molecular Interaction Database
Centralized, standards-compliant, public data repository for protein interactions
Open source, open data
binary interaction reports
Kerrien, S, et al: IntAct – Open Source Resource for Molecular Interaction Data. Nucleic Acids Res Jan;35(Database issue):D
Orchard, S, et al: The minimum information required for reporting a molecular interaction experiment (MIMIx). Nat Biotechnol Aug;25(8):
The IntAct Molecular Interaction Database
Reactome
Human pathway knowledgebase
Manually curated
Open source, open data
Collaboration between EBI, OICR and NYU
Online since 2003
Matthews, L, et al: Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Res Nov 3.
Reactome content
Pathways: 870
Reactions: 2900
Proteins: 2900
Complexes: 2250
References: 4200
Pathway description: authors, summary, species, GO term, other species
Pathway participants: UniProt, Ensembl, MIM, KEGG Compound, ChEBI, Entrez Gene, HapMap, UCSC, RefSeq, PubChem
SkyPainter
‘Painting’ the reaction map with user-supplied data, e.g. over- and under-expressed genes from a microarray analysis
Animation for time series experiments
Overrepresentation analysis, e.g. disease candidate genes concentrated in a pathway
SkyPainter
Example input: header line "#ID value1", then one identifier per line with its value (the example lists UniProt accessions) …
Usable identifiers: UniProt, RefSeq, Ensembl, MIM, Entrez Gene, KEGG COMPOUND, ChEBI, Affymetrix, GO
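The two-column layout on this slide (an identifier followed by a numeric value, under a header starting with "#") can be read with a few lines of Python. This is a sketch under assumptions: the columns are taken to be whitespace-separated, lines starting with "#" are treated as headers, and the accessions and values in the example are invented, not taken from the slide:

```python
def parse_skypainter_input(text):
    """Parse identifier/value rows into a dict mapping identifier -> float.

    Blank lines, lines starting with '#', and rows with fewer than
    two columns are skipped.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        values[parts[0]] = float(parts[1])
    return values


# Invented example data in the assumed layout.
example = """#ID\tvalue1
P00001\t2.5
Q00002\t-1.2
O00003\t0.0
"""

parsed = parse_skypainter_input(example)
print(parsed)
```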
SkyPainter coloring according to the numeric values provided
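Coloring by numeric value can be pictured as mapping each value onto a gradient between two colors. The sketch below assumes a linear blue-to-red scale with clamping; the slide does not state which color scheme or scaling SkyPainter actually uses, so the function name and defaults are illustrative only:

```python
def value_to_rgb(value, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Linearly interpolate between two RGB colors.

    Values outside [vmin, vmax] are clamped to the nearest end of the scale.
    """
    if vmax == vmin:
        t = 0.5  # degenerate range: use the middle of the gradient
    else:
        t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))


# Color a few hypothetical expression values on a scale from -2 to 2.
for v in (-2.0, 0.0, 2.0):
    print(v, value_to_rgb(v, -2.0, 2.0))
```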
SkyPainter Overrepresentation analysis
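Overrepresentation analysis asks whether a pathway contains more of the submitted genes than chance would predict. A common way to quantify this is a one-sided hypergeometric test, sketched below. This is an assumption about the general technique, not a statement of the statistic SkyPainter computes, and the parameter names are illustrative:

```python
from math import comb


def overrepresentation_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: total number of genes considered
    annotated:  genes in the population that belong to the pathway
    sample:     size of the submitted gene list
    hits:       submitted genes that belong to the pathway
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 5 of 20 submitted genes fall in a 50-gene pathway
# out of 1000 genes overall.
p = overrepresentation_p_value(1000, 50, 20, 5)
print(p)
```

A small p-value suggests the submitted list is concentrated in the pathway; when many pathways are tested at once, a multiple-testing correction would normally follow.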
The Team
The Funding
EU: ProDaC (to 03/2009), ProteomeBinders, BioSapiens, Felics, LipidomicNet, APO-SYS, PSIMEx (since 03/2009)
EMBL
Wellcome Trust
NIH
?
Private Data in PRIDE
Diagram labels: Lab A, Lab B, Lab C, “Collaboration”, Comparison, Reviewer, PRIDE private mode, Publicly available data
Private mode allows data analysis within a collaboration
PRIDE tools are already accessible in private mode, in particular experiment comparison (alpha)
On manuscript submission, reviewers can access the data in standard format
On manuscript publication, the data becomes public
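The data lifecycle described on this slide (private within a collaboration, opened to reviewers on submission, public on publication) can be sketched as a small state machine. The state names, event names, and roles below are hypothetical and only illustrate the workflow; they are not taken from PRIDE itself:

```python
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"        # visible to the collaboration only
    UNDER_REVIEW = "review"    # additionally visible to reviewers
    PUBLIC = "public"          # visible to everyone


# Allowed transitions, keyed by (current state, triggering event).
TRANSITIONS = {
    (Visibility.PRIVATE, "manuscript_submitted"): Visibility.UNDER_REVIEW,
    (Visibility.UNDER_REVIEW, "manuscript_published"): Visibility.PUBLIC,
}

# Roles allowed to view the data in each state.
VIEWERS = {
    Visibility.PRIVATE: {"collaborator"},
    Visibility.UNDER_REVIEW: {"collaborator", "reviewer"},
    Visibility.PUBLIC: {"collaborator", "reviewer", "public"},
}


def advance(state, event):
    """Return the next state, or raise ValueError if the event does not apply."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


def can_view(state, role):
    """Return True if `role` may see the data in `state`."""
    return role in VIEWERS[state]


state = Visibility.PRIVATE
state = advance(state, "manuscript_submitted")
print(state.name, can_view(state, "reviewer"), can_view(state, "public"))
```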