Class 3 2009 European Resources Protein Focused. Protein Databases EBI – European Bioinformatics Institute


1 Class 3 2009 European Resources Protein Focused

2 Protein Databases EBI – European Bioinformatics Institute http://www.ebi.ac.uk/

3 What is the difference between dealing with nucleotide DBs and protein DBs?

4 Protein information: name & description; gene encoded from; organism; function (only one?); enzyme? ligands? PTMs? interactions?; biological processes; structure; sequence; localization; more...

5 Protein DB: short history (pre-UniProt). Swiss-Prot: created in July 1986; since 1987, a collaboration of the SIB and the EMBL/EBI. TrEMBL: created at the EBI in 1996 as a computer-annotated protein sequence database supplementing Swiss-Prot. It was introduced to deal with the increased data flow from genome projects.

6 PIR EBI SIB

7 The three-layered approach. The UniProt Archive (UniParc): UniProtKB + all other publicly available protein sequences (completeness). The UniProt Reference Clusters (UniRef): non-redundant views of UniProtKB + selected UniParc sets (speed). The UniProt Knowledgebase (UniProtKB): central database of annotated protein sequences and functional information, made up of UniProtKB/Swiss-Prot + UniProtKB/TrEMBL.
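
One way to picture the archive layer is a store in which every distinct sequence is kept once and tagged with all the places it was seen. The sketch below is only an illustration of that idea, not the actual UniParc implementation; the function name `build_archive` and the record layout are assumptions made for this example, and the identifiers are made-up placeholders.

```python
from collections import defaultdict


def build_archive(records):
    """Group (source_db, source_id, sequence) records by sequence.

    Hypothetical helper: each distinct sequence is stored once and
    mapped to the list of (source_db, source_id) pairs it came from.
    """
    archive = defaultdict(list)
    for source_db, source_id, sequence in records:
        # Normalise case so "mkt" and "MKT" count as the same sequence.
        archive[sequence.upper()].append((source_db, source_id))
    return dict(archive)


# Placeholder records, not real database entries.
records = [
    ("Swiss-Prot", "P12345", "MKT"),
    ("RefSeq", "NP_123456", "mkt"),
    ("Ensembl", "ENSP00000123456", "MAA"),
]
archive = build_archive(records)
```

Here the first two records collapse into a single archive entry with two cross-references, which is the sense in which an archive can be both complete and free of repeated sequences.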

8 Protein DBs: Swiss-Prot – manually annotated. TrEMBL – translated EMBL, automatically annotated. UniProtKB – The UniProt Knowledgebase. UniParc – The UniProt Archive. PIR – Protein Information Resource. UniRef – The UniProt Reference Clusters. PDB – Protein Data Bank (structure). PRIDE – resource for experimental proteomics (not in this class).

9 Databases growth www.genome.jp/en/db_growth.html

10 Protein DBs: Swiss-Prot – manually annotated. 2005: ~100,000; 2009: ~400,000

11 TrEMBL – translated EMBL, automatically annotated.
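
To make "translated EMBL" concrete, here is a minimal sketch of translating a coding nucleotide sequence into protein with the standard genetic code. It is an illustration only: the function name `translate_cds` is made up for this example, and the real TrEMBL pipeline works from annotated coding regions and is not reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate_cds(cds):
    """Translate a coding sequence, stopping at the first stop codon.

    Hypothetical helper: assumes the sequence starts in frame.
    Unknown codons (e.g. containing N) become 'X'.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


protein = translate_cds("ATGGCCAAATAA")  # ATG GCC AAA TAA
```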

12

13 Protein Names. Different DBs – different accessions
Accessions: DB
P12345: TrEMBL
MAPK_HUMAN: Swiss-Prot (to be changed..)
NP_123456, XP_123456: RefSeq
UniRef100_P99999, UniRef90_P99999, UniRef50_P99999: UniRef
ENSP00000123456: Ensembl
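
Because each database uses its own identifier style, a small pattern matcher can often guess where an identifier comes from. The sketch below is a rough illustration: the function name `guess_database` is hypothetical, the regular expressions are loose approximations of the published formats, and the identifiers are the placeholder examples from the slide. Note that a six-character accession such as P12345 and a name such as MAPK_HUMAN are both UniProtKB identifier styles, so the code labels them by style and not by Swiss-Prot or TrEMBL.

```python
import re

# Order matters: the more specific patterns are tried first.
PATTERNS = [
    ("UniRef", re.compile(r"UniRef(?:100|90|50)_\w+")),
    ("RefSeq", re.compile(r"(?:NP|XP)_\d+(?:\.\d+)?")),
    ("Ensembl", re.compile(r"ENS[A-Z]*P\d{11}(?:\.\d+)?")),
    (
        "UniProtKB accession",
        re.compile(
            r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
            r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
        ),
    ),
    ("UniProtKB entry name", re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")),
]


def guess_database(identifier):
    """Return a best-guess label for a protein identifier (hypothetical helper)."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


examples = ["P12345", "MAPK_HUMAN", "NP_123456", "UniRef90_P99999", "ENSP00000123456"]
guesses = {identifier: guess_database(identifier) for identifier in examples}
```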

14 Protein DBs: Swiss-Prot – manually annotated. TrEMBL – translated EMBL, automatically annotated. UniProtKB – The UniProt Knowledgebase. UniParc – The UniProt Archive. PIR – Protein Information Resource. UniRef – The UniProt Reference Clusters. PDB – Protein Data Bank (structure). PRIDE – resource for experimental proteomics (not in this class).

15 Principles

16 More in UniProt. UniProt: the Universal Protein Resource, a complete annotated protein sequence database. UniProt Archive: a non-redundant archive of protein sequences extracted from public databases; it contains only protein sequences. UniProt/UniRef: features clustering of similar sequences to yield a representative subset of sequences; this produces very fast search times. UniProt/UniMES: a repository specifically developed for metagenomic and environmental data.
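
The idea of clustering similar sequences and keeping one representative per cluster can be sketched with a toy greedy procedure. This is not the method UniRef actually uses: the function name `greedy_cluster` is hypothetical, and the similarity ratio from Python's `difflib` stands in for a proper alignment-based sequence identity. The sequences are invented for the example.

```python
from difflib import SequenceMatcher


def greedy_cluster(sequences, threshold=0.9):
    """Toy greedy clustering of {id: sequence} into {representative_id: [member_ids]}.

    Sequences are visited longest first. Each one joins the first cluster
    whose representative is at least `threshold` similar; otherwise it
    starts a new cluster and becomes its representative.
    """
    clusters = []  # list of (representative_id, member_ids)
    ordered = sorted(sequences.items(), key=lambda item: len(item[1]), reverse=True)
    for seq_id, seq in ordered:
        for rep_id, members in clusters:
            matcher = SequenceMatcher(None, sequences[rep_id], seq, autojunk=False)
            if matcher.ratio() >= threshold:
                members.append(seq_id)
                break
        else:
            # No existing representative was similar enough.
            clusters.append((seq_id, [seq_id]))
    return dict(clusters)


# Invented sequences: "a" and "b" differ only in the last residue.
toy = {
    "a": "MKTAYIAKQRQISFVKSHFSRQ",
    "b": "MKTAYIAKQRQISFVKSHFSRA",
    "c": "GGGGGGGGGGPPPPPPPPPP",
}
clusters = greedy_cluster(toy, threshold=0.9)
```

Searching only the representatives ("a" and "c" here) is what makes a clustered view faster to search than the full set.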

17 Protein DBs: Swiss-Prot – manually annotated. TrEMBL – translated EMBL, automatically annotated. UniProtKB – The UniProt Knowledgebase. UniParc – The UniProt Archive. PIR – Protein Information Resource. UniRef – The UniProt Reference Clusters. PDB – Protein Data Bank (structure). PRIDE – resource for experimental proteomics (not in this class).

18 How is it built?

19

20 What’s in UniProt? http://beta.uniprot.org/

21 EBI interface

22 PIR – Protein Information Resource: Protein Family Classification System (PIRSF); Integrated Protein Knowledgebase (iProClass); Integrated Protein Literature, Information and Knowledge (iProLINK)

23 END. If you got lost… (class exercise), some more slides…

24 EB-eye search

25

26

27 NCBI - Entrez

